SRE Head

SID Global Solutions

10 years

0 Lacs

Mumbai, Maharashtra, India

Posted: 2 months ago

Skills Required

engineering, scaling, reliability, strategy, model, service, drive, development, metrics, logging, risk, GCP, AWS, Datadog, planning, network, security, compliance, certification

Work Mode

On-site

Job Type

Full Time

Job Description

Job Title: SRE Head

Experience Level:

Role Type:

Role Overview:

The SRE Head is responsible for leading and scaling the Site Reliability Engineering (SRE) function across the organization. This role defines the reliability strategy, standards, and practices to ensure high availability, performance, and resilience of critical systems. The SRE Head partners with engineering, infrastructure, and operations teams to embed reliability, observability, and continuous improvement across all services.

Key Responsibilities:

Lead and define the SRE strategy, operating model, and best practices across the organization.
Establish and maintain SLIs, SLOs, and SLAs to measure and ensure service reliability and performance.
Oversee incident management, post-incident reviews, and root cause analysis for major outages.
Drive resilience engineering, disaster recovery, and chaos engineering initiatives.
Collaborate with development, infrastructure, and operations teams to improve reliability and automation.
Lead efforts to improve observability, including metrics, logging, and tracing frameworks.
Foster a culture of proactive reliability, continuous learning, and blameless postmortems.
Mentor and guide SRE leads and engineers, building high-performing reliability teams.
Track and communicate reliability trends, key metrics, and risk areas to leadership.
Evaluate and adopt emerging tools and practices to enhance platform reliability and scalability.

Required Qualifications & Experience:

10+ years of experience in SRE, reliability engineering, or production operations in large-scale environments.
Proven expertise in availability management, incident response, and service continuity.
Strong technical understanding of cloud platforms (GCP/AWS/Azure), Kubernetes, CI/CD, and automation.
Proficiency in observability tools (e.g., Prometheus, Grafana, Dynatrace, Datadog, ELK, OpenTelemetry).
Experience implementing SLIs/SLOs, error budgets, and capacity planning frameworks.
Strong leadership, strategic thinking, and cross-functional collaboration skills.
Excellent communication, mentoring, and culture-building abilities.

Desirable Skills:

Experience in building and scaling SRE organizations or CoEs.
Exposure to performance engineering, cost optimization, and AIOps practices.
Deep understanding of network reliability, security resiliency, and compliance-driven uptime goals.
Certification in reliability or cloud architecture (e.g., Google SRE, GCP Professional Architect).
