Site Reliability Engineer - Linux, Observability, Containers

5 - 8 years

20 - 25 Lacs

Posted: 3 days ago | Platform: Naukri


Work Mode

Work from Office

Job Type

Full Time

Job Description

We are seeking a motivated Site Reliability Engineer (SRE) to join our Observability team. In this role, you will support the team in maintaining and improving the reliability, security, and performance of our systems. You will learn from experienced engineers while gaining hands-on experience with modern monitoring, logging, and automation tools.
As an SRE I, you will assist in day-to-day operational tasks, help monitor system health, and participate in basic troubleshooting. You will also contribute to the maintenance of documentation and develop your technical skills through training and on-the-job experience.
This is a hybrid position, requiring 2-3 days per week in the office, as determined by leadership.
Responsibilities
- Assist in maintaining system security by applying hotfixes and operating system patches under guidance to protect against cybersecurity threats.
- Support the deployment and configuration of monitoring and logging tools.
- Help automate routine operational tasks to improve efficiency and support system integration.
- Assist with the maintenance and basic management of observability tools such as Splunk, ClickHouse, Grafana, Prometheus, OpenTelemetry, Fluent Bit, Elasticsearch, OpenSearch, and CloudWatch.
- Work with team members to help implement and maintain monitoring solutions in development, staging, and production environments.
- Learn and apply DevOps and SRE best practices as directed by senior engineers.
- Contribute to the setup and maintenance of CI/CD pipelines to support automated build, test, and deployment processes.
- Provide support in managing cloud infrastructure (AWS, GCP) to help ensure availability and security.
- Learn to use infrastructure as code tools such as Terraform, Ansible, or CloudFormation to support environment configuration.
- Monitor system performance and assist in identifying and escalating issues for resolution.
- Support the implementation and management of containerization technologies like Docker and Kubernetes.
- Participate in basic troubleshooting and assist with root cause analysis for production incidents.
- Help create and update documentation for infrastructure, processes, and operational procedures.
- Provide first-level support for routine infrastructure and deployment issues, escalating complex problems as needed.
- Look for opportunities to automate repetitive tasks and suggest improvements to workflows.
Justification
Visa's Observability ecosystem includes over 2,000 platform nodes, utilizing approximately 15 different tools for logging, monitoring, and tracing, alongside 80,000 client agents. The system handles daily log ingestion exceeding 100 TB and oversees hundreds of critical applications, supporting vital alerts, dashboards, and reports. To maintain this high level of performance and reliability, we need a Site Reliability Engineer (SRE) with comprehensive knowledge and practical experience. This position requires an I4-level engineer who can operate independently with minimal supervision.
About Visa's PRE Observability Team
Visa's Product Reliability Engineering (PRE) Observability team partners with Product Development as well as Operations & Infrastructure teams to build and manage innovative, reliable, scalable, secure, and cost-effective observability platform solutions. We are looking for talented Senior Site Reliability Engineers to join our driven team, with a focus on maximizing system availability, performance, security, and reliability. This dynamic role requires technical leadership, strong problem-solving skills, and expertise in coding, testing, and debugging.

Basic Qualifications:
- Bachelor's degree with at least 2 years of relevant work experience, OR
- Advanced degree (e.g., Master's, MBA, JD, MD) with no required work experience, OR
- 5+ years of relevant professional experience.

Visa

IT Services and IT Consulting

Foster City, California

