Manager, Site Reliability Engineering (Technical Incidents) - Cortex

Palo Alto Networks

12 - 17 years

8 - 14 Lacs

Bengaluru

Posted: 3 months ago

Skills Required

Cloud platforms, GitLab CI, incident response, MTTR, container orchestration, SRE, predictive AI/ML, anomaly detection, artificial intelligence, DevOps, AWS, ML

Work Mode

Work from Office

Job Type

Full Time

Job Description

Your Career

We're seeking an experienced Cloud SRE lead to drive high-severity incident and problem management across our GCP-centric platforms. This role combines deep technical troubleshooting with process ownership, ensuring rapid recovery, root cause elimination, and long-term reliability improvements. You will own L3 on-call responsibilities, drive post-incident learning, and champion automation and operational excellence.

Your Impact

In your technical and leadership capacity, you will contribute to seamless production site reliability operations, partnering closely with regional and global SRE counterparts, with special attention to the following:
Incident Analysis & Problem Management: Implement and lead post-mortem processes within SLAs, identify root causes, and drive corrective actions to reduce repeat incidents. Establish and maintain a problem backlog, ensuring timely resolution and continuous process improvement.
Troubleshooting: Rapidly diagnose and resolve failures across Kubernetes, Terraform, and GCP using advanced troubleshooting frameworks.
Preventative Measures: Implement automation and enhanced monitoring to proactively detect issues and reduce incident frequency.
Stakeholder Communication: Work with GCP / AWS TAMs and other vendors to request new features and follow up on updates.
Mentorship: Coach and elevate SRE and DevOps teams, promoting best practices in reliability and incident/problem management.
Documentation: Establish and maintain a problem backlog, ensuring timely resolution and continuous process improvement.
Envision the future of SRE with AI/ML: Envision how a modern SRE team should operate by leveraging AI/ML.

Qualifications

Your Experience

12+ years of experience in SRE/DevOps/Infrastructure roles, with a strong foundation in cloud-based environments.
5+ years of proven experience managing SRE/DevOps teams, preferably with a strong focus on Google Cloud Platform (GCP).
Deep hands-on knowledge of Terraform, Kubernetes (GKE), GitLab CI/CD, and modern observability practices (e.g., Prometheus, OpenTelemetry).
Strong experience in managing incident response and postmortems, reducing MTTR, and driving proactive reliability improvements.
Proficiency with cloud platforms such as GCP & AWS.
Solid grasp of Infrastructure as Code, container orchestration, and scalable cloud architectures.
Track record of building tools for system reliability, automated remediation, and performance tuning.
Experience leveraging AI/ML-based operations tools for automation, anomaly detection, and predictive alerting is a plus.
Expertise in SLI/SLO/SLA design and implementation, and driving operational maturity through data.
Strong interpersonal and leadership skills, with a demonstrated ability to coach, mentor, and inspire teams.
Effective communicator, capable of translating complex technical concepts to non-technical stakeholders.
Committed to inclusion, collaboration, and creating a culture where every voice is heard and respected.

Additional Information

The Team

To stay ahead of the curve, it's critical to know where the curve is, and how to anticipate the changes we're facing. For the fastest-growing cybersecurity company, the curve is the evolution of cyberattacks and access technology and the products and services that dedicatedly address them. Our engineering team is at the core of our products, connected directly to the mission of preventing cyberattacks and enabling secure access to all on-prem and cloud applications. They are constantly innovating, challenging the way we, and the industry, think about access and security. These engineers aren't shy about building products to solve problems no one has pursued before. They define the industry, instead of waiting for directions. We need individuals who feel comfortable in ambiguity, excited by the prospect of challenge, and empowered by the unknown risks facing our everyday lives that are only enabled by a secure digital environment.

Our engineering team is provided with an unrivaled chance to create the products and practices that will support our company's growth over the next decade, defining the cybersecurity industry as we know it. If you see the potential of how incredible people and products can transform a business, this is the team for you. If the prospect of affecting tens of millions of people, enabling them to work remotely, securely, and easily in ways never done before, thrills you - you belong with us.



Palo Alto Networks

Cybersecurity

Santa Clara
