We are seeking a highly organized and proactive Senior Program Manager, Engineering Operations to lead critical operational programs across our engineering teams. This role will focus on release management, incident response, change control, and engineering metrics to improve the reliability, scalability, and velocity of software delivery. You will also be responsible for scaling operational excellence by supporting engineering best practices, managing on-call escalation programs, and collaborating closely with technical customer support to resolve customer-impacting issues. The ideal candidate thrives in fast-paced environments, communicates with clarity, and has a strong background in engineering or technical operations.
Key Responsibilities
Release Management
- Coordinate software releases across multiple engineering teams.
- Own and maintain a release calendar with consistent communication.
- Ensure all release-related artifacts (e.g., change tickets, checklists) are completed.
- Enforce release readiness policies, including freeze periods and rollback plans.
- Lead post-release retrospectives and drive process improvements.
- Promote engineering standards and best practices to ensure consistent, high-quality deployments across teams.
Incident Management
- Serve as facilitator during high-severity (SEV) incidents.
- Manage and improve incident response templates, tools, and on-call practices.
- Ensure timely and effective stakeholder communication during incidents.
- Lead incident reviews and ensure follow-up actions are completed.
- Analyze incident trends and recommend preventive improvements.
- Oversee PagerDuty configuration and escalation policies to ensure 24/7 operational coverage.
- Manage on-call rotation programs, track escalation health, and continuously optimize team alerting workflows.
Change Management
- Own the change request and approval process, ensuring compliance and audit readiness.
- Partner with engineering teams on planning and reviewing major changes.
- Maintain documentation for change control processes and policies.
- Continuously evolve frameworks for assessing change risk and rollout strategies.
Engineering Metrics
- Define, track, and report key delivery and reliability metrics, including:
  - DORA Metrics: Deployment Frequency, Lead Time for Changes, MTTR, Change Failure Rate
  - Cycle Time: Issue creation to production deployment
- Build visibility into engineering efficiency, throughput, and incident performance.
- Collaborate with engineering and product leaders to ensure metrics drive action and accountability.
- Maintain operational dashboards and lead monthly metrics reviews.
- Identify gaps and support continuous improvement in engineering practices and resource allocation based on metric insights.
Cross-Functional Collaboration
- Partner closely with Technical Customer Support to ensure customer-reported incidents are prioritized, escalated, and resolved effectively.
- Support readiness programs that prepare engineering teams to respond efficiently to live customer issues.
- Collaborate with Product Management and Product Design to ensure operational requirements, scalability considerations, and incident learnings inform product planning and user experience decisions.
Required Skills & Experience
- 5-8 years of experience in Engineering Operations, DevOps, or Site Reliability Engineering (SRE).
- Proven track record managing software releases and high-severity incidents.
- Strong familiarity with tools such as Jira, PagerDuty, LaunchDarkly, GitHub Actions, Confluence, and LinearB.
- Exceptional communication skills to interface across technical and non-technical teams.
- Highly organized with a continuous improvement mindset.
- Demonstrated experience implementing engineering best practices and on-call management programs.
Preferred Qualifications
- Exposure to ITIL or similar operational governance frameworks.
- Experience using incident-related metrics (e.g., MTTA, MTTR) and dashboards for analysis.
- Understanding of Agile/Scrum methodologies and CI/CD pipelines.
- Prior participation in production readiness reviews or Change Advisory Boards (CABs).
- Experience collaborating with customer support or success teams to address technical escalations.
- Background in standardizing operational playbooks and service ownership across engineering teams.