Product Manager

Experience: 8 years

Salary: 0 Lacs

Posted: 1 week ago | Platform: LinkedIn


Work Mode: On-site

Job Type: Full Time

Job Description


Key Responsibilities

  • Own the observability product roadmap, with a focus on enabling visibility for data pipelines, distributed compute frameworks (e.g., Spark, Flink), and cloud-native workloads.
  • Define and deliver features for metrics ingestion, distributed tracing, log processing pipelines, alerting, dashboards, and SLO/SLA tooling.
  • Drive integration with cloud platforms (AWS, GCP, Azure), container orchestration systems (Kubernetes), and data infrastructure components (Kafka, Airflow, Snowflake, etc.).
  • Define APIs, data models, and storage strategies for telemetry data at scale.
  • Collaborate with platform, SRE, and data engineering teams to understand pain points, gather requirements, and validate solutions.
  • Contribute to the definition and tracking of service health indicators (SLIs/SLOs), incident response tooling, and automated root cause analysis.
  • Stay current on emerging trends in observability (e.g., eBPF, AI/ML for anomaly detection, continuous profiling), cloud infrastructure, and big data ecosystems.
  • Work with engineering to build scalable systems for telemetry collection, processing, retention, and visualization.
  • Develop product specifications with clear technical detail for engineering execution.

Preferred Experience & Skills

  • 8+ years in technical product management, ideally with products related to observability, infrastructure, or data platforms.
  • Hands-on experience with observability tools such as OpenTelemetry, Prometheus, Grafana, Jaeger, the ELK stack, Datadog, New Relic, or similar.
  • Strong understanding of cloud-native architecture patterns, microservices, containers, and orchestration (especially Kubernetes).
  • Experience with distributed systems and data platforms, e.g., Apache Kafka, Apache Spark, Flink, Airflow, Presto, Snowflake, etc.
  • Familiarity with infrastructure-as-code (e.g., Terraform, Helm) and CI/CD systems.
  • Working knowledge of telemetry data storage and processing at scale (TSDBs, log indexing, event pipelines).
  • Ability to read and communicate technical designs with engineers and stakeholders (e.g., API specs, sequence diagrams, data flows).
  • Experience working with SREs, platform teams, or DevOps roles in production environments.
  • Strong analytical skills; ability to define and monitor KPIs for performance, reliability, and user adoption.

Nice to Have

  • Background in data engineering or site reliability engineering (SRE).
  • Experience with cost optimization and resource utilization tracking in cloud environments.
  • Exposure to AI/ML-based anomaly detection and predictive analytics in observability.
  • Experience contributing to or working with open-source observability communities.



Note: We are hiring for Indore and Bangalore.
