Observability & Monitoring Engineer

4 - 6 years

Posted: 4 days ago | Platform: Glassdoor

Work Mode

Remote

Job Type

Full Time

Job Description

Job Title: Observability & Monitoring Engineer

Location: India (Remote/Hybrid as per company policy)

Experience: 4–6 years

Employment Type: Full-time

Role Summary
While many vendors treat monitoring as a reactive afterthought, we embed Datadog-trained Observability Engineers directly into our engineering and operations teams to deliver real-time visibility, proactive tuning, and smarter incident management. We are looking for a highly capable Observability & Monitoring Engineer with 4–6 years of experience in Datadog and related observability practices. The engineer will be at the forefront of transforming how systems are monitored—reducing noise, accelerating root-cause discovery, and enabling smarter, correlated event flows across cloud-native environments.

Core Responsibilities:

Datadog Ownership:

    Build and maintain Datadog dashboards, monitors, and SLOs with a focus on business and operational relevance.

    Configure and tune alerts to eliminate noise and reduce false positives, enabling focused responses and intelligent routing.

Proactive Monitoring & Alert Tuning:

    Implement proactive alert strategies based on usage patterns and event behavior.

    Continuously optimize thresholds, baselines, and anomaly detection logic to ensure actionable monitoring signals.

Observability & Root-Cause Analysis (RCA):

    Correlate metrics, logs, and traces across distributed systems to facilitate rapid root-cause triangulation.

    Drive investigations that trace symptoms such as high-CPU alerts to underlying middleware issues such as queue overloads, using Datadog APM and tracing.

Integrated Support & Event Correlation:

    Work closely with L2/Smart L3 and platform teams to support event correlation, AWS incident flows, and CI/CD telemetry.

    Participate in day-to-day IT operations, functional system support, and incident escalation workflows.

SAP CPI API Monitoring:

    Build and maintain targeted dashboards for SAP CPI APIs to ensure availability, throughput, and performance visibility.

What Makes This Role Unique:

    You are embedded in the core delivery team, not isolated in a separate monitoring silo.

    You work on proactive monitoring, not just reacting to alerts.

    You support a platform aligned with Smart’s tooling and architecture, including high-frequency CI tracing and real-time AWS integration.

    You help evolve how we define “observability maturity” by integrating it deeply into development and ops workflows.

Required Skills & Experience:

    4–6 years of experience in observability, SRE, or DevOps roles with strong exposure to Datadog.

    Experience configuring and managing Datadog dashboards, monitors, APM, and logs.

    Deep understanding of observability principles: metrics, logs, distributed traces, RUM, and synthetic monitoring.

    Experience tracing infrastructure or application alerts (e.g., CPU, latency) to actual service or middleware-level bottlenecks.

    Familiarity with cloud platforms like AWS (preferred), Azure, or GCP.

    Hands-on experience in event management, incident support, and RCA documentation.

    Exposure to SAP CPI monitoring or other enterprise integration middleware is a plus.

What You’ll Get:

    The opportunity to redefine observability in a modern, fast-paced environment.

    Ownership of critical monitoring pipelines and real-time troubleshooting tools.

    Work with global engineering and platform teams to drive performance and reliability.

    Flexible work environment and access to upskilling resources.

Send us your resume at:

careers@algoworks.com
