Lead SRE Engineer

5 - 7 years

20 - 22 Lacs

Posted: 2 months ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

We are seeking a Site Reliability Engineer (SRE) with strong DevOps expertise to ensure the reliability, availability, and performance of critical systems and services. This role bridges the gap between development and operations teams, using automation, monitoring, and best practices to enhance system scalability, reduce downtime, and improve overall operational efficiency. The SRE will focus on optimizing development pipelines, managing infrastructure, and implementing proactive monitoring and alerting systems while upholding the principles of DevOps and reliability engineering.

Key Responsibilities:

1. Reliability Engineering:
  • Design, implement, and maintain high-availability systems.
  • Create and enforce Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs).
  • Conduct root cause analysis for system failures and implement post-mortem processes to prevent recurrence.

2. DevOps Automation:
  • Automate infrastructure provisioning, deployment pipelines, and operational processes.
  • Build and maintain CI/CD pipelines using tools like Jenkins, GitHub Actions, or GitLab CI/CD.
  • Develop Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, or Ansible.

3. Monitoring and Incident Management:
  • Implement robust monitoring, logging, and alerting solutions using tools like Datadog or Splunk.
  • Establish proactive incident response processes and manage on-call rotations.
  • Ensure effective documentation for incident handling and resolution.

4. Performance and Scalability:
  • Optimize system performance through capacity planning and resource management.
  • Enable horizontal scaling of services to handle increasing loads.
  • Work closely with development teams to improve application resilience and performance.

5. Security and Compliance:
  • Enforce security best practices in infrastructure and application development.
  • Conduct vulnerability assessments and implement remediation measures.
  • Ensure compliance with organizational and industry standards.

6. Collaboration and Culture:
  • Act as a bridge between development and operations teams to foster a DevOps culture.
  • Coach teams on best practices in reliability, automation, and DevOps.
  • Advocate for a culture of ownership and continuous improvement.

Key Skills and Competencies:

Technical Skills:
  • Expertise in cloud platforms like AWS, Azure, or GCP.
  • Proficiency in Linux system administration and networking concepts.
  • Strong programming/scripting skills (e.g., Python, Go, Bash).
  • Understanding of creating and managing Terraform configurations.
  • Familiarity with containerization and orchestration tools like Docker and Kubernetes.
  • Knowledge of database management (SQL and NoSQL).

UST

IT Services and IT Consulting

Aliso Viejo CA

10001 Employees

1845 Jobs

    Key People

  • Kris Canekeratne

    Co-Founder & CEO
  • Sandeep Reddy

    President
