Posted:1 hour ago| Platform:
Work from Office
Full Time
Job Summary We are seeking a seasoned Observability Architect to define and lead our end-to-end observability strategy across highly distributed, cloud-native, and hybrid environments. This role requires a visionary leader with deep hands-on experience ... Observability Architect - CLOUDEQ SOFTWARE INDIA PRIVATE LIMITED Observability Architect Posted 105 weeks ago Job Description Job Summary We are seeking a seasoned Observability Architect to define and lead our end-to-end observability strategy across highly distributed, cloud-native, and hybrid environments. This role requires a visionary leader with deep hands-on experience in New Relic and a strong working knowledge of other modern observability platforms like Datadog, Prometheus/Grafana, Splunk, OpenTelemetry, and more. You will design scalable, resilient, and intelligent observability solutions that empower engineering, SRE, and DevOps teams to proactively detect issues, optimize performance, and ensure system reliability. This is a senior leadership role with significant influence over platform architecture, monitoring practices, and cultural transformation across global teams. Key Responsibilities Architect and implement full-stack observability platforms, covering metrics, logs, traces, synthetics, real user monitoring (RUM), and business-level telemetry using New Relic and other tools like Datadog, Prometheus, ELK, or AppDynamics. Design and enforce observability standards and instrumentation guidelines for microservices, APIs, front-end applications, and legacy systems across hybrid cloud environments. Experience in OpenTelemetry adoption, ensuring vendor-neutral, portable observability implementations where appropriate. Build multi-tool dashboards, health scorecards, SLOs/SLIs, and integrated alerting systems tailored for engineering, operations, and executive consumption. Collaborate with engineering and DevOps teams to integrate observability into CI/CD pipelines, GitOps, and progressive delivery workflows. Partner with platform, cloud, and security teams to provide end-to-end visibility across AWS, Azure, GCP, and on-prem systems. Lead root cause analysis, system-wide incident reviews, and reliability engineering initiatives to reduce MTTR and improve MTBF. Evaluate, pilot, and implement new observability tools/technologies aligned with enterprise architecture and scalability requirements. Deliver technical mentorship and enablement, evangelizing observability best practices and nurturing a culture of ownership and data-driven decision-making. Drive observability governance and maturity models, ensuring compliance, consistency, and alignment with business SLAs and customer experience goals. Required Qualifications 15+ years of overall IT experience, hands-on with application development, system architecture, operations in complex distributed environments, troubleshooting and integration for applications and other cloud technology with observability tools. 5+ years of hands-on experience with observability tools such as New relic, Datadog, Prometeus, etc. including APM, infrastructure monitoring, logs, synthetics, alerting, and dashboard creation. Proven experience and willingness to work with multiple observability stacks, such as: Datadog, Dynatrace, AppDynamics Prometheus, Grafana, etc. Elasticsearch, Fluentd, Kibana (EFK/ELK) Splunk, OpenTelemetry, Solid knowledge of Kubernetes, service mesh (e.g., Istio), containerization (Docker) and orchestration strategies. Strong experience with DevOps and SRE disciplines, including CI/CD, IaC (Terraform, Ansible), and incident response workflows. Fluency in one or more programming/scripting languages: Java, Python, Go, Node.js, Bash. Hands-on expertise in cloud-native observability services (e.g., CloudWatch, Azure Monitor, GCP Operations Suite). Excellent communication and stakeholder management skills, with the ability to align technical strategies with business goals. Preferred Qualifications Architect level Certifications in New Relic, Datadog, Kubernetes, AWS/Azure/GCP, or SRE/DevOps practices. Experience with enterprise observability rollouts, including organizational change management. Understanding of ITIL, TOGAF, or COBIT frameworks as they relate to monitoring and service management. Familiarity with AI/ML-driven observability, anomaly detection, and predictive alerting. Why Join Us? Lead enterprise-scale observability transformations impacting customer experience, reliability, and operational excellence. Work in a tool-diverse environment, solving complex monitoring challenges across multiple platforms. Collaborate with high-performing teams across development, SRE, platform engineering, and security. Influence strategy, tooling, and architecture decisions at the intersection of engineering, operations, and business. Unit #E1J, First Floor, Tower B, Godrej Eternia, Plot #70, Industrial Area, Phase 1, Chandigarh Chandigarh, Chandigarh, 160002 You have already applied for this job with this account.
Upload Resume
Drag or click to upload
Your data is secure with us, protected by advanced encryption.
20.0 - 25.0 Lacs P.A.
Chandigarh, Chandigarh, India
Salary: Not disclosed
Bengaluru
12.0 - 16.0 Lacs P.A.
Bengaluru
45.0 - 50.0 Lacs P.A.
Bengaluru
15.0 - 20.0 Lacs P.A.
20.0 - 25.0 Lacs P.A.
6.0 - 10.0 Lacs P.A.
20.0 - 25.0 Lacs P.A.
16.0 - 20.0 Lacs P.A.
14.0 - 18.0 Lacs P.A.