SRE AWS Operations Lead

10 - 14 years

0 Lacs

Posted: 1 week ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

Job Title: Site Reliability Engineering (SRE) Lead
Location: Hyderabad / Bengaluru
Job Type: Full-time
Experience Level: 10+ years

Job Overview:
We are seeking a seasoned Site Reliability Engineering (SRE) Lead with a strong background in cloud operations, production systems, and automation. This is a senior-level hands-on role that combines leadership with deep technical expertise in AWS, DevOps, and infrastructure reliability. You will lead a team focused on ensuring availability, scalability, and operational excellence for our cloud-native product environments.

Key Responsibilities:

Leadership & Operations Management
- Lead and mentor a team of SREs and Cloud Operations Engineers.
- Define and enforce reliability standards, SLOs/SLIs, and incident response practices.
- Drive reliability, observability, and automation improvements across cloud-based platforms.
- Act as the bridge between product engineering, DevOps, and support teams for operational readiness.

Cloud & Infrastructure Reliability
- Manage production-grade environments hosted on AWS with a focus on high availability and performance.
- Lead incident management processes, perform root cause analysis, and implement corrective actions.
- Own and evolve monitoring, alerting, and observability using tools like CloudWatch, Prometheus, Grafana, ELK.
- Ensure compliance with security and regulatory standards (e.g., HIPAA, SOC2, GDPR).

DevOps & Automation
- Design and improve CI/CD pipelines using tools like Jenkins, GitHub Actions, or Azure DevOps.
- Implement Infrastructure as Code (IaC) using CloudFormation.
- Experience with Packer and Ansible.
- Automate manual operational tasks and production workflows.
- Support containerized workloads using Docker, ECS, or Kubernetes (EKS).

Stakeholder Communication
- Present technical issues, incident reports, and performance metrics to business and technical stakeholders.
- Collaborate with Engineering, Product, and Security teams to embed reliability across the software lifecycle.
- Provide guidance on cloud cost optimization, performance tuning, and capacity planning.

Required Qualifications:
- 10+ years of overall IT experience, including:
  - At least 5 years in AWS cloud operations or SRE.
  - Minimum 3 years in production-grade environments and incident response.
- Strong leadership experience managing high-performing technical teams.
- Deep understanding of SRE principles, DevOps practices, and cloud-native architecture.
- Proven experience in:
  - AWS core services (VPC, EC2, RDS, ECS, EKS, IAM, S3)
  - Container orchestration and microservices
  - Infrastructure as Code (Terraform / CloudFormation)
  - Monitoring & observability tools (ELK, Prometheus, CloudWatch)

Preferred Qualifications:
- AWS Certified Solutions Architect or DevOps Engineer.
- Experience working on SaaS or multi-tenant platforms.
- Familiarity with multi-cloud and hybrid cloud strategies.
