This is a unique opportunity to lead a key part of OCI's Observability stack focused on Telemetry, Monitoring and Alarming systems, which are essential to ensuring the performance, availability, and trustworthiness of all Oracle Cloud services. Our mission is to deliver a world-class Integrated Observability and Management platform that seamlessly supports OCI, hybrid, and multi-cloud environments.
Our platform combines Monitoring, Alarming, Logging, Events, Auditing, and SIEM capabilities to give customers and internal teams a unified, actionable view into their infrastructure and applications. This role specifically focuses on the Monitoring and Alarming platform, which provides the foundation for real-time metric ingestion, scalable alerting, incident detection, and proactive canary-based health verification of services.
We are looking for a Senior Engineering Manager to lead an exceptionally talented team of software engineers in advancing this critical part of OCI's platform. You will drive innovation and scale to ensure our Telemetry systems remain among the most reliable, performant, and intelligent in the modern cloud landscape.
Responsibilities
- Own the design, development, and operation of a high-scale, distributed telemetry platform that processes billions of datapoints and petabytes of time-series data across OCI regions.
- Ensure the reliability, availability, and operational excellence of services responsible for Monitoring, Alarming, and Canary-based health checks, supporting mission-critical infrastructure.
- Provide technical leadership, direction, and strategic vision for a team of senior and principal engineers, fostering a culture of innovation, accountability, and continuous improvement.
- Define and execute a clear, prioritized roadmap of features, platform investments, and operational improvements, delivering on commitments on time and with high quality.
- Collaborate cross-functionally with Product Management, other OCI service teams, and Oracle-wide stakeholders to align goals, manage dependencies, and drive integrated solutions.
- Drive and mature engineering processes, including design reviews, operational readiness reviews, quality standards, and incident postmortems.
- Represent the team in executive-level updates and strategic planning discussions, articulating technical direction, risks, and delivery status.
- Proactively monitor the health and performance of services in the global OCI fleet, identifying trends, mitigating risks, and ensuring fault-tolerant, scalable telemetry infrastructure.
Qualifications
Career Level - M3
About Us
As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's challenges. We've partnered with industry leaders in almost every sector and continue to thrive after 40+ years of change by operating with integrity.
We know that true innovation starts when everyone is empowered to contribute. That's why we're committed to growing an inclusive workforce that promotes opportunities for all.
Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We're committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing [HIDDEN TEXT] or by calling +1 888 404 2494 in the United States.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.