Lead Observability Engineer - Elasticsearch, APM, Azure, ServiceNow

7 - 10 years

12 - 17 Lacs

Posted: 13 hours ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Scope

  • Receives work assignments through the ticketing system or from senior leadership.
  • Provides Tier-4 engineering expertise, platform ownership, and technical leadership for all observability capabilities across hybrid cloud, on-premises, and SaaS environments.
  • Leads the design, architecture, and maturity of the enterprise observability ecosystem with a primary focus on the Elastic Observability Platform, ensuring end-to-end visibility for infrastructure, cloud services, networks, and business-critical applications.
  • Drives the enterprise strategy for logging, metrics, traces, synthetics, and alerting, including governance, standardization, and performance optimization.
  • Partners closely with Cloud, Infrastructure, Security, Enterprise Applications, and SRE leadership to define observability frameworks, drive operational transparency, and strengthen service reliability.
  • Ensures observability platforms meet enterprise requirements for security, performance, availability, compliance, and scalability.
  • Oversees monitoring implementations for key SaaS applications including Workday, Salesforce, ServiceNow, and Microsoft 365, ensuring proactive issue detection and excellent user experience.
  • Provides guidance, mentorship, and direction to observability engineers, SREs, and operational teams to uplift monitoring maturity and promote best-practice adoption.
  • Acts as a strategic advisor during major incidents by providing real-time diagnostics, correlation insights, and driving RCA improvements.
  • Required to provide on-call support during off-hours on weekdays, weekends, and holidays on a rotating basis.

Our Current Technical Environment:

  • Tools & Platforms: Elastic Stack (Elasticsearch, Kibana, APM, Logstash, Beats/Elastic Agent), ServiceNow, Azure Monitor, API-driven integrations, SIEM/SOAR systems.
  • Cloud Platforms: Azure, VMware, Kubernetes/Container platforms, Linux and Windows servers, enterprise network infrastructure.
  • SaaS Applications: Workday, Salesforce, ServiceNow, Microsoft 365 (Teams, Exchange, SharePoint, OneDrive), commercial SaaS telemetry sources.
  • Programming & Scripting: PowerShell, Python, Bash, API automation.
  • Architecture & Engineering: Azure ARM templates, Terraform, Ansible, hybrid cloud architecture, observability governance, ILM, ML-based anomaly detection, synthetics.

What You'll Do:

  • Own and lead the architecture and roadmap for the Elastic Observability platform across the enterprise.
  • Define and enforce governance standards for logs, metrics, traces, data retention, and alerting quality.
  • Lead platform scaling initiatives, including cluster sizing, performance tuning, ILM tiering, and cost optimization.
  • Architect, deploy, and maintain advanced Elastic Observability solutions across hybrid environments.
  • Design executive-grade dashboards, correlation views, analytics boards, anomaly detection, and ML-based detections.
  • Optimize ingestion pipelines, index structures, data flow, and search/query performance at scale.
  • Integrate Elastic Observability with Azure, VMware, Kubernetes, network platforms, ServiceNow, and API sources.
  • Define and lead enterprise monitoring standards across logs, metrics, traces, and synthetics.
  • Drive cloud and on-prem monitoring maturity by improving instrumentation, coverage, and telemetry consistency.
  • Establish alert engineering frameworks that reduce noise and improve detection fidelity.
  • Lead design of synthetic transactions, user-experience monitoring, and availability baselines for SaaS apps.
  • Ensure proactive monitoring of Workday, Salesforce, ServiceNow, and Microsoft 365 integrations.
  • Serve as the observability lead during P1/P0 incidents by delivering real-time visibility and correlation insights.
  • Drive MTTR/MTTD improvements through enhanced observability patterns and RCA alignment.
  • Build and maintain operational runbooks, dashboards, and standard operating procedures.
  • Work with engineering, Cloud, Infrastructure, Applications, and Security leadership to improve observability adoption.
  • Act as the senior technical advisor in major IT projects, shaping observability-by-design principles.
  • Mentor and guide observability engineers, analysts, and SRE teams to uplift operational capabilities.
  • Ensure all monitoring pipelines follow enterprise security, compliance, retention, and logging policies.
  • Validate that new systems adhere to observability onboarding requirements and telemetry standards.

What We Are Looking For:

  • Bachelor's degree in Computer Science, Engineering, MIS, or equivalent experience.
  • 7-10+ years of experience in observability engineering, SRE, monitoring platform ownership, or infrastructure operations.
  • Deep, hands-on expertise with Elastic Stack (Elasticsearch, Kibana, Logstash, Beats/Elastic Agent, APM).
  • Strong architectural knowledge of cloud (Azure/AWS) and hybrid observability patterns.
  • Experience leading observability for infrastructure, cloud platforms, network systems, Kubernetes, and Microsoft 365.
  • Proven experience designing monitoring for SaaS platforms (Workday, Salesforce, ServiceNow).
  • Advanced scripting/automation experience (Python, PowerShell, Bash).
  • Strong knowledge of API integrations, data pipelines, and log-flow engineering.
  • Experience leading incident diagnostics and delivering visibility for RCA and operational improvement.
  • Strong analytical, architectural, and troubleshooting skills with a platform-owner mindset.
  • Demonstrated ability to influence cross-functional teams and drive enterprise observability adoption.
  • Familiarity with Grafana, Prometheus, Splunk, AppDynamics, Dynatrace (preferred).
  • Knowledge of Terraform, Ansible, Kubernetes, and infrastructure-as-code tools (preferred).
  • Knowledge of ITIL processes, SRE principles, and operational governance.
  • Excellent communication, leadership, and stakeholder-management skills.
  • Experience empowering partner IT teams, such as Infrastructure and Apps, to self-serve by creating their own monitors within the unified guidance and framework established by Observability.

Our Values


If you want to know the heart of a company, take a look at its values. Ours unite us. They are what drive our success - and the success of our customers. Does your heart beat like ours? Find out here: Core Values

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.

Blue Yonder

Supply Chain Management/Technology

Scottsdale
