SRE lead

Advaiya Solutions

9 - 14 years

12 - 19 Lacs

Gurugram

Skills Required

Site reliability engineering, Jenkins, Terraform, cloud, CI/CD, Kubernetes

Work Mode

Work from Office

Job Type

Full Time

Job Description

Role overview

The SRE lead will oversee the reliability, performance, and operational excellence of cloud and on-premises infrastructure, applications, and services. This role combines deep technical expertise with leadership responsibility, ensuring stability, security, and scalability across environments. The SRE lead will manage a team of support engineers and lead initiatives around automation, patching, monitoring, and FinOps optimization to ensure high availability and efficiency.

Key responsibilities

1. Infrastructure and VM management

Oversee VM provisioning, patching, scaling, and performance management.
Automate patching and log maintenance processes to minimize downtime.
Ensure monthly updates, backups, and system health checks are completed.
Coordinate with business teams to schedule patching for minimal impact.

2. Application and CI/CD service reliability

Manage patching and updates for app services and associated components.
Administer Jenkins pipelines, job management, and agent scaling.
Implement secure access controls and perform regular reviews.
Maintain Azure DevOps and pipeline governance for CI/CD stability.

3. Security and compliance

Support CSPM and vulnerability management, prioritizing high-severity remediation.
Respond to SOC/SIEM alerts, conducting incident triage and resolution.
Manage PAM integrations, access controls, and compliance tracking.
Maintain DNS and certificate lifecycle management, including renewals and secure updates.

4. Monitoring and observability

Establish unified monitoring for infrastructure, applications, and performance metrics.
Create dashboards and alerting systems to proactively detect anomalies.
Provide incident response coverage and periodic service health reports.
Conduct post-mortem analyses and implement corrective actions.

5. Cloud and FinOps operations

Optimize cloud resource usage and cost through detailed FinOps reporting.
Identify savings opportunities via rightsizing and unused resource cleanup.
Generate monthly cost reports by application, service, and environment.
Collaborate with business and finance teams for budget forecasting and cost governance.

6. Performance and scalability

Continuously monitor infrastructure utilization and adjust resources dynamically.
Analyze performance data to drive improvements in reliability and efficiency.
Manage scaling of services and compute resources based on consumption trends.

7. Change and release management

Facilitate CAB meetings and manage the end-to-end change lifecycle.
Review and prioritize change requests based on risk and business impact.
Supervise production deployments and implement rollback strategies.
Conduct post-implementation evaluations and report on success metrics.

8. Support and maintenance

Lead the SRE team in providing L3 support for incidents and operational issues.
Maintain documentation, knowledge bases, and troubleshooting guides.
Implement preventive maintenance measures to enhance system stability.

Qualifications and experience

Essential

Bachelor’s degree in computer science, engineering, or equivalent experience.
8+ years of IT operations experience, with at least 3 years in an SRE or DevOps leadership role.
Expertise in cloud environments (Azure preferred), including infrastructure automation, monitoring, and FinOps.
Hands-on experience with CI/CD tools (Jenkins, Azure DevOps).
Strong knowledge of scripting (PowerShell, Python, or Bash).
Deep understanding of networking, security, and system administration principles.

Preferred

Experience with CSPM tools and vulnerability management platforms.
Familiarity with SOC/SIEM tools (e.g., Microsoft Sentinel, Splunk).
Strong communication and stakeholder management skills.
ITIL, Azure Administrator, or DevOps Engineer certification.

Key competencies

Reliability mindset: designs systems for fault tolerance and operational excellence.
Automation-first approach: reduces manual effort through tooling and scripts.
Leadership: mentors engineers and coordinates cross-functional initiatives.
Analytical rigor: uses data-driven insights for optimization and cost control.
Collaboration: works closely with security, development, and infrastructure teams to ensure seamless delivery.

Advaiya Solutions

Information Technology and Services

Richmond