Principal Site Reliability Engineer

Cvent

10 - 13 years

0 Lacs

Gurugram, Haryana, India

Posted: 4 days ago

Skills Required

reliability stability collaborative leadership software deployment security risk technology devops scalability timeline ai automation support developer drive design integration containerization docker kubernetes terraform aws monitoring strategies datadog planning optimization engineering learning triage workflow architecture scaling management ml efficiency programming scripting python linux troubleshooting networking analysis unix development database communication

Work Mode

On-site

Job Type

Full Time

Job Description

Cvent is looking for a Principal Site Reliability Engineer to help us scale our systems and ensure the stability, reliability, performance, and rapid deployment of our platform. We build teams that are inclusive, collaborative, and have a strong sense of ownership for the things they build. If you have a passion and a track record for solving problems, along with strong leadership skills, this is a great fit for you.

As a Principal Engineer, you will demonstrate both emerging and current technologies, methods, and processes that contribute to the evolution of software deployment, enhance security, reduce risk, and improve the overall end-user experience. As part of the Technology R&D Team, you will play an integral part in advancing DevOps maturity and be part of a new culture of quality and site reliability. You will continually improve the reliability, resiliency, and scalability of our products, processes, and procedures. In this position, you will also be expected to ramp up to managing and mentoring engineers and ensuring their technical growth.

What You Will Be Doing:

• Set long-term technical direction for complex problems; communicate timeline, scope, risks, and the technical roadmap to leadership and stakeholders.

• Continuously evaluate emerging cloud and AI/automation technologies; run POCs to assess fit and pioneer intelligent copilots for support, incident response, and developer workflows.

• Architect, standardize, and scale SRE frameworks and best practices; drive adoption and continual improvement of SLIs/SLOs/SLAs across business-critical platforms.

• Lead design and integration of CI/CD, containerization (Docker, Kubernetes), and IaC (Terraform, AWS CDK) for large-scale environments; ensure security and regulatory compliance.

• Define and implement observability, monitoring, and alerting strategies; conduct deep-dive RCAs using Datadog, Prometheus, Grafana, and ELK; lead blameless postmortems.

• Lead capacity planning, cost optimization, and disaster recovery to ensure scalability, reliability, and system resilience.

• Translate business risk and product goals into actionable reliability and observability strategies; partner closely with SRE, Product, and Engineering teams.

• Mentor and upskill SRE/DevOps engineers; foster a culture of ownership, continuous learning, and operational excellence.

• Pioneer the use of AI-powered automation and intelligent copilots for alert triage, event grouping, and developer/operations workflow efficiencies.

• Serve as a mentor and organizational leader, influencing technical direction, upskilling teams, and fostering a culture of shared reliability ownership and blameless postmortems.

• Lead capacity planning, cost optimization, and disaster recovery initiatives to ensure seamless scalability and system resilience.

• Bridge business and technology stakeholders, translating business risk and product goals into actionable reliability and observability strategies.

• Represent the technology perspective and priorities to leadership and other stakeholders by continuously communicating timeline, scope, risks, and the technical roadmap.

What You Need for this Position:

• 10-13 years in SRE, cloud engineering, or DevOps with significant time in an architect, staff, or principal role.

• Deep fluency in AWS across multi-account, multi-region, and high-traffic environments; strong foundation in distributed systems architecture and infrastructure as code.

• Demonstrable leadership scaling organizational SRE practices: CI/CD, observability, incident management, RCAs, and blameless postmortems.

• Proven track record driving adoption of AI, automation, and ML to improve reliability, operational efficiency, and developer productivity.

• Expert programming/scripting (Python, Go, or similar) with Linux internals depth and advanced troubleshooting of distributed systems.

• Validated breadth across networking, cloud, databases, and scripting; experience with multi-tier architectures.

• Exceptional ability to influence, coach, and communicate across engineering and product; acts as a pragmatic technical conscience with a strong bias for execution.

• Mastery of incident management, postmortem culture, and root cause analysis for distributed systems.

• Experience with Unix/Linux environments, with a deep grasp of system internals.

• Experience working on large-scale distributed systems, including multi-tiered architectures.

• Validated breadth of understanding and development of solutions based on multiple technologies, including networking, cloud, database, and scripting languages.

• Strong leadership, communication and interpersonal skills geared to getting things done.

Cvent

Software and Technology, Event Management

Tysons Corner
