Scope:
- Receive work assignments through the ticketing system or from Lead Administrators or management.
- Provide Level 3/4 support for all observability capabilities across cloud, on-premises, and hybrid environments, with a primary focus on the Elastic Observability Platform.
- Design, implement, and maintain end-to-end monitoring, logging, tracing, and alerting systems using the Elastic Stack (Elasticsearch, Kibana, Elastic Agents/Beats, Logstash, APM).
- Collaborate with internal and external stakeholders to ensure reliable telemetry, high-quality alerting, and seamless integrations with infrastructure, networks, SaaS platforms, and cloud services.
- Ensure security, scalability, performance, and availability of observability systems and enterprise monitoring pipelines.
- Support monitoring maturity initiatives by improving instrumentation, reducing alert noise, and delivering actionable operational insights.
- Develop and maintain synthetic monitoring, user-experience metrics, health checks, and alerting for key SaaS applications including Workday, Salesforce, and ServiceNow.
- Support incident management by providing real-time dashboards, correlation insights, root-cause analysis data, and overall observability-driven improvements to MTTD and MTTR.
- Provide documentation, observability standards, and runbooks while guiding internal teams on dashboards, alert tuning, and best-practice monitoring patterns.
- Work closely with Cloud, Infrastructure, Network, Application, and Security teams to ensure cohesive telemetry coverage and continuous improvement of monitoring across the enterprise.
- Provide on-call support on a rotating basis during off-duty hours on weekdays, weekends, and holidays.
Our Current Technical Environment:
- Tools: Elastic Stack (Elasticsearch, Kibana, APM, Logstash, Beats/Elastic Agents), ServiceNow, Azure Monitor, API integrations.
- Platforms: Azure cloud services, VMware, Linux/Windows servers, enterprise networking, Kubernetes/Containers.
- SaaS: Workday, Salesforce, ServiceNow, Jira, Confluence, Microsoft 365 (Teams, Exchange, SharePoint).
- Programming & Scripting: PowerShell, Python, Bash, API integrations.
- Cloud Architecture: Azure (ARM Templates, Terraform), container environments, hybrid cloud.
- Monitoring Concepts: Logs, metrics, traces, synthetics, anomaly detection, machine-learning-based alerting, RCA dashboards, ILM policies.
What You'll Do:
- Receive work assignments from the ticketing system or leadership and execute observability engineering requests.
- Architect, deploy, and manage Elastic-based observability solutions across hybrid cloud and on-prem environments.
- Configure dashboards, visualizations, alert rules, anomaly detection jobs, and ML-based detections within Elastic.
- Optimize ingestion pipelines, index lifecycle management (ILM), retention policies, and search/query performance.
- Integrate Elastic Observability with Azure, VMware, network devices, Microsoft 365, ServiceNow, and API-based data sources.
- Implement instrumentation for logs, metrics, and traces across infrastructure, cloud workloads, network systems, and containers.
- Build synthetic monitoring checks, baseline performance metrics, and user-experience monitoring for critical SaaS applications.
- Design and refine alerting strategies to reduce false positives and improve detection precision.
- Support incident response by providing real-time monitoring views, RCA data, correlation insights, and post-incident analytics.
- Maintain dashboards, documentation, and runbooks for internal teams.
- Train teams on Elastic usage, dashboard interpretation, and alert tuning for operational excellence.
- Collaborate with cross-functional teams (Cloud, Infrastructure, Enterprise Apps, Security) to define and enforce monitoring standards.
- Participate in major initiatives to embed observability controls into design, deployment, and operations.
- Ensure observability systems meet enterprise requirements for performance, security, scalability, and compliance.
- Provide on-call support for critical incidents on a scheduled rotation.
What We Are Looking For:
- Bachelor's degree in Computer Science, Engineering, MIS, or equivalent work experience.
- 5-8 years of experience in observability engineering, SRE, monitoring, or IT operations.
- Proven hands-on experience with the Elastic Stack (Elasticsearch, Kibana, Beats/Elastic Agent, Logstash, APM).
- Experience monitoring Azure and/or AWS cloud services.
- Strong knowledge of instrumentation for logs, metrics, and traces across hybrid environments.
- Experience monitoring data center systems, networks, VMware, Microsoft 365, and Linux/Windows workloads.
- Hands-on experience configuring monitoring for SaaS apps such as Workday, Salesforce, Jira, Confluence, and ServiceNow.
- Strong scripting skills in PowerShell, Python, Bash, or similar.
- Experience with API integrations, automation, data ingestion pipelines, and monitoring agents.
- Ability to work under pressure and meet deadlines in a fast-paced environment.
- Ability to act independently, prioritize effectively, and drive observability improvements.
- Excellent analytical and troubleshooting skills with a focus on reliability and continuous improvement.
- Strong communication skills and ability to collaborate with engineers, architects, and stakeholders.
- Understanding of monitoring governance, logging standards, and observability best practices.
- Familiarity with Grafana, Prometheus, Splunk, AppDynamics, or Dynatrace (preferred).
- Knowledge of Terraform, Ansible, or Infrastructure-as-Code (preferred).
- Understanding of ITIL processes and incident/problem management methodologies.
- Experience working with hybrid cloud technologies, containers, and automation pipelines.
Our Values
If you want to know the heart of a company, take a look at its values. Ours unite us. They are what drive our success - and the success of our customers. Does your heart beat like ours? Find out here: Core Values
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.