Lead Site Reliability Engineer

Augusta Infotech

10 - 13 years

INR 18 - 25 Lacs

Bengaluru

Posted: 7 months ago

Skills Required

DevOps, Cloud, Site Reliability Engineering, Microsoft Azure, Observability, SRE, Prometheus, CI/CD, Load Balancing, Grafana, Terraform, Power BI, GCP, On-premises, AWS

Work Mode

Hybrid

Job Type

Full Time

Job Description

Hiring: Lead Site Reliability Engineer with the following skills and expertise.

What will this person do?

Provide leadership in designing and implementing reliable, scalable, and secure infrastructure solutions.
Develop and maintain observability solutions, ensuring visibility into system performance using native Azure Cloud solutions.
Define and track SLIs, ensuring compliance with SLOs and SLAs.
Lead incident response efforts, conduct root cause analysis, and implement preventive measures to minimize downtime.
Automate infrastructure provisioning, configuration, and management using Terraform and Ansible.
Build and maintain robust observability pipelines to support automated deployments and continuous monitoring practices.
Continuously analyze system health and optimize performance by identifying and resolving bottlenecks.
Work with our business continuity and disaster recovery (BCDR) team to minimize business impact during failures and measure the quality of services.
Work with the Cloud Governance team to monitor cloud infrastructure spending and implement cost-saving strategies.
Implement centralized logging, metric collection, and distributed tracing for troubleshooting and debugging.
Deploy, manage, and monitor containerized workloads.
Maintain configuration consistency and compliance across cloud environments using tools like Ansible.
Partner with software development teams to integrate reliability best practices into the application development lifecycle.
Conduct detailed post-mortems, document learnings, and drive improvements to reduce future incidents.
Develop automation scripts in Python, Bash, or other languages to reduce manual effort and improve efficiency.
Provide mentorship to junior engineers, fostering a culture of learning and continuous technical growth.
Research and evaluate new technologies, tools, and methodologies to improve system reliability and efficiency.
Maintain detailed documentation on infrastructure, monitoring setups, incident responses, and best practices.

Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field.
10+ years of experience in observability, DevOps, and Site Reliability Engineering (SRE).
At least 2 years of experience in defining observability KPIs for both on-premises and cloud environments.
Strong experience with cloud platforms (AWS, Azure, GCP) and cloud-native technologies.
Passion for automation, reducing toil, and implementing reliability-focused best practices.
Deep knowledge of services and tools like Grafana, Power BI, Prometheus, Azure Monitor, Application Insights, and Azure Metrics.
Expertise in Terraform, Ansible, and Chef; in CI/CD pipeline tools such as GitHub Actions and Jenkins; and in GitOps methodologies.
Working understanding of load balancing, authentication (AAA), encryption, and network parameter monitoring.
Strong troubleshooting skills and experience handling on-call incidents and post-mortem analysis.
Ability to work cross-functionally, drive technical discussions, and mentor junior engineers.
Ability to work in a dynamic team environment, with the time management skills to meet deadlines.
Sense of ownership and pride in your performance and its impact on the company's success.
Critical thinker with problem-solving skills.
Good interpersonal and communication skills.

More Jobs at Augusta Infotech

Senior Software Developer

Bengaluru

4 - 9 yrs

INR 15 - 30 Lacs

Relativity Infrastructure Engineer

Bengaluru

5 - 8 yrs

INR 8 - 15 Lacs

Lead Site Reliability Engineer

Bengaluru

10 - 13 yrs

INR 18 - 25 Lacs

Associate ServiceDesk Analyst

Bengaluru

1 - 2 yrs

INR 1 - 4 Lacs

Sustainable, Client and Regulatory Reporting Data Product Owner

Bengaluru

15 - 20 yrs

INR 40 - 100 Lacs

Augusta Infotech

Information Technology

New Delhi
