350 Observability Jobs - Page 14

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

8.0 - 12.0 years

2 - 11 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

This might be a good fit for you if enabling people to do their best resonates with you, you love platform engineering, you want to build cool things with cool people, you love building high-impact tools and software that everyone depends on, and you love automating everything! What Your Responsibilities Will Be: Some areas of work are creating tools that smooth the journey from idea to running in production; learning and evangelizing best practices related to the build, test and deployment of software; and providing tools to our fellow engineers with a high degree of reliability and quality. What You'll Need to be Successful: Qualifications: Software Engineering: Understand...

Posted 3 months ago


8.0 - 13.0 years

3 - 12 Lacs

Hyderabad / Secunderabad, Telangana, India

On-site

This might be a good fit for you if enabling people to do their best resonates with you, you love platform engineering, you want to build cool things with cool people, you love building high-impact tools and software that everyone depends on, and you love automating everything! What Your Responsibilities Will Be: Some areas of work are creating tools that smooth the journey from idea to running in production; learning and evangelizing best practices related to the build, test and deployment of software; and providing tools to our fellow engineers with a high degree of reliability and quality. What You'll Need to be Successful: Qualifications: Software Engineering: Understand...

Posted 3 months ago


8.0 - 13.0 years

3 - 11 Lacs

Delhi, India

On-site

This might be a good fit for you if enabling people to do their best resonates with you, you love platform engineering, you want to build cool things with cool people, you love building high-impact tools and software that everyone depends on, and you love automating everything! What Your Responsibilities Will Be: Some areas of work are creating tools that smooth the journey from idea to running in production; learning and evangelizing best practices related to the build, test and deployment of software; and providing tools to our fellow engineers with a high degree of reliability and quality. What You'll Need to be Successful: Qualifications: Software Engineering: Understand...

Posted 3 months ago


6.0 - 9.0 years

8 - 11 Lacs

Pune

Work from Office

We are hiring a DevOps / Site Reliability Engineer for a 6-month full-time onsite role in Pune (with possible extension). The ideal candidate will have 6-9 years of experience in DevOps/SRE roles with deep expertise in Kubernetes (preferably GKE), Terraform, Helm, and GitOps tools like ArgoCD or Flux. The role involves building and managing cloud-native infrastructure, CI/CD pipelines, and observability systems, while ensuring performance, scalability, and resilience. Experience in infrastructure coding, backend optimization (Node.js, Django, Java, Go), and cloud architecture (IAM, VPC, CloudSQL, Secrets) is essential. Strong communication and hands-on technical ability are musts. Immediate j...

Posted 3 months ago


0.0 years

0 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Ready to shape the future of work? At Genpact, we don't just adapt to change; we drive it. AI and digital innovation are redefining industries, and we're leading the charge. Genpact's AI Gigafactory, our industry-first accelerator, is an example of how we're scaling advanced technology solutions to help global enterprises work smarter, grow faster, and transform at scale. From large-scale models to agentic AI, our breakthrough solutions tackle companies' most complex challenges. If you thrive in a fast-moving, tech-driven environment, love solving real-world problems, and want to be part of a team that's shaping the future, this is your moment. Genpact (NYSE: G) is...

Posted 3 months ago


9.0 - 14.0 years

20 - 35 Lacs

Chennai, Bengaluru

Work from Office

Dynatrace Specialist 9+ Years Location : Bangalore / Chennai Company : HCLTech Experience : 9 to 13 Years Employment Type : Full-Time | Permanent About the Role : HCLTech is seeking an experienced Dynatrace Specialist to join our IT Observability and AIOps team. The ideal candidate will be responsible for implementing, managing, and optimizing Dynatrace-based performance monitoring for enterprise applications. Key Responsibilities : Deploy, configure, and maintain Dynatrace for end-to-end observability. Create custom dashboards, alerts, and synthetic monitoring. Troubleshoot application and infrastructure performance issues using Dynatrace insights. Collaborate with development and DevOps te...

Posted 3 months ago


3.0 - 5.0 years

0 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Candidate is expected to write good-quality C/C++ and Java code and should be able to develop corresponding unit tests and automation. He/She must have hands-on experience with cloud-native technologies like Docker/Kubernetes/monitoring/observability. Additional skill sets include Perl and Python scripting, and DB and XML concepts. Should be familiar with Agile methodology and the CI/CD process, and should have exposure to messaging frameworks like Kafka. Candidate should be able to understand requirements and deliver independently. Experience in the billing domain will be an added advantage. Career Level - IC2

Posted 3 months ago


3.0 - 5.0 years

15 - 18 Lacs

Pune

Work from Office

Experience: 3 to 5 years in cloud infrastructure operations, L1 incident management, automation support, and observability, with team coordination or mentoring experience. Location: Pune. Shift: 24x7 Support (Rotational Shifts). Education: BE/B.Tech (relevant certifications preferred: AWS Cloud Practitioner/Associate, Azure Fundamentals, CKA, Terraform Associate). Job Summary: We are seeking an L1 Lead – Site Reliability Engineer (SRE) to guide and manage the frontline SRE team in ensuring the stability, availability, and efficiency of enterprise-scale cloud infrastructure operations. This role involves supervising incident response, ensuring adherence to runbooks and SOPs, providing technical gu...

Posted 3 months ago


6.0 - 9.0 years

6 - 9 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

The Role: The LeadSquared platform and product suite is 100% on the cloud and currently all on AWS. The product suite comprises a large number of applications, services, and APIs built on various open-source and AWS-native tech stacks and deployed across multiple AWS accounts. The role involves leading the mission-critical responsibility of ensuring that all our online services are available, reliable, performant, and running at optimal costs. We firmly believe in a code- and automation-driven approach to Site Reliability. Key Responsibilities: Build processes and platforms to ensure full observability and automated incident response management of all systems, applications, platforms, and infrastru...

Posted 3 months ago


6.0 - 10.0 years

0 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Oracle Cloud Infrastructure (OCI) is one of the fastest-growing cloud platforms, and we are assembling a world-class team to build the next generation of security products. We're seeking a Principal Software Engineer to drive the design and development of mission-critical systems that protect OCI customers at hyperscale. As a Principal Engineer in the Security Products Group, you will play a key leadership role in: Architecting and delivering complex, distributed systems with a focus on security, resiliency, and scalability. Driving strategic technical decisions and shaping the long-term vision for OCI's security offerings. Mentoring engineers, influencing cross-team engineering practices, a...

Posted 3 months ago


10 - 13 years

18 - 25 Lacs

Bengaluru

Hybrid

Hiring: Lead Site Reliability Engineer with the following skills and expertise. What will this person do? Provide leadership in designing and implementing reliable, scalable, and secure infrastructure solutions. Develop and maintain observability solutions, ensuring visibility into system performance using native Azure Cloud solutions. Define and track SLIs, ensuring compliance with SLOs and SLAs. Lead incident response efforts, conduct root cause analysis, and implement preventive measures to minimize downtime. Automate infrastructure provisioning, configuration, and management using Terraform & Ansible. Build and maintain robust observability pipelines to support automated deployments and conti...

Posted 4 months ago


5 - 8 years

15 - 25 Lacs

Chennai, Bengaluru

Work from Office

We are looking for a Senior Platform Engineer (Airflow & Control-M) with 5-10 years of experience to join our team in Bangalore or Chennai. The ideal candidate will have strong expertise in Airflow, Control-M, Kubernetes, Observability (OpenTelemetry), Python, and Bash scripting. The role involves managing critical data workflows, enhancing platform automation, and ensuring system reliability and scalability. Excellent communication skills and hands-on experience in stabilizing production environments are essential.

Posted 4 months ago


8 - 12 years

16 - 27 Lacs

Kolkata

Work from Office

Role: Observability Engineer (AWS). Experience: 8+ years. Essential Skills (two top skills): AWS Ecosystem (EKS, EC2, DynamoDB, Lambda, etc.); Dynatrace (or similar). Monitoring Site, trend analysis, log analysis. Key Responsibilities: Design, implement, and maintain observability solutions using AWS and Dynatrace to monitor application performance and infrastructure health. Collaborate with development and operations teams to define observability requirements and ensure seamless integration of monitoring tools. Develop and manage dashboards, alerts, and reports to provide insights into system performance and user experience. Troubleshoot complex issues by analyzing logs, metrics, and traces to identify...

Posted 4 months ago


10 - 20 years

25 - 35 Lacs

Pune, Bengaluru, Delhi / NCR

Work from Office

Role & responsibilities: SRE Architect with experience in running large Reliability & Observability Programs for large, complex infrastructure deployments / distributed systems for major Banking customers. Proficiency in using Application Performance Monitoring (APM) tools such as New Relic/Dynatrace for monitoring, logging, and tracing, and Splunk for log monitoring. Should have implemented solutions around Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for services. • Understanding of software delivery life cycles, particularly Agile/Lean & DevOps • Proven experience in handling large-scale and growing infrastructure across Data Centers and heterogeneous Cloud platforms • Expert level hands on knowl...

Posted 4 months ago


8 - 13 years

30 - 45 Lacs

Bengaluru

Work from Office

Drive SRE implementation and DevOps best practices. Reduce technical debt, automate reliability workflows, and ensure performance, scalability, and observability across cloud-based digital platforms. Required Candidate Profile: Experienced SRE with deep knowledge of Azure cloud, CI/CD, observability, automation, and programming. Strong DevOps mindset, troubleshooting ability, and alignment with digital transformation goals.

Posted 4 months ago


7.0 - 12.0 years

12 - 22 Lacs

Pune

Work from Office

Experience: 7+ years. Job Location: Pune. Notice Period: 30 days. Job Description: AWS Ecosystem (EKS, EC2, DynamoDB, Lambda, etc.); Dynatrace (or similar). The Observability team should include some members with Dynatrace experience, while the rest can have experience with similar tools. Monitoring Site, trend analysis, log analysis. Key Responsibilities: Design, implement, and maintain observability solutions using AWS and Dynatrace to monitor application performance and infrastructure health. Collaborate with development and operations teams to define observability requirements and ensure seamless integration of monitoring tools. Develop and manage dashboards, alerts, and reports to provide in...

Posted 4 months ago


7.0 - 12.0 years

13 - 23 Lacs

Bengaluru

Hybrid

Job Description: Drive high levels of stability and availability of services, driving Site Reliability Engineering as a practice across IPE. Grow partnership with Product Engineering owners and drive initiatives which benefit the team in accordance with SRE. Be available 24x7 as an escalation point for the operational teams. Reduce MTTR and service impact. Address technical debt across IPE to remove risk. Reduce recovery time on incidents. Aid in major incidents which are owned by IPE. Validate service communications from a technical perspective during major incidents. Drive standard process and continual improvement for incident recovery, problem management, service resilience and availability. Bring in...

Posted Date not available


10.0 - 17.0 years

30 - 45 Lacs

Pune, Chennai, Bengaluru

Hybrid

Please find the below JD and company portfolio for your reference: Senior Observability Specialist. Location: [Chennai, Pune, Bangalore]. Employment Type: Full-time. Experience Required: [15-18]. Job Summary: We are seeking a highly skilled Senior Observability Specialist to design, implement, and manage end-to-end observability strategies across cloud and on-premises environments. This role requires expertise in modern monitoring, logging, and tracing tools, ensuring system reliability, performance optimization, and proactive incident detection. The ideal candidate will have experience with Dynatrace, Datadog, and various open-source solutions, including Grafana, Loki, Tempo, Mimir, and Prometheus....

Posted Date not available


9.0 - 14.0 years

15 - 30 Lacs

Bengaluru

Work from Office

towards Identify trends and possible opportunities for Service Improvement Programs (cross-domain/divisional), gain support and sponsorship, then track and drive those programs through to conclusion, providing regular service updates on progress. Responsible for oversight and governance of key resilience requirements for applications within IPE, and address technical debt across IPE to remove risk. MINIMUM REQUIREMENTS: Bachelor's degree or equivalent experience in an IT-related discipline preferred. Technical knowledge of SRE areas of focus – implementations with Datadog as an observability focus, capacity management, etc. Outstanding communication and influencing skills. Experience of industry ...

Posted Date not available


5.0 - 9.0 years

15 - 20 Lacs

Pune

Hybrid

Managing the SRE team. Monitor production systems & services using observability tools and respond to incidents. Design, implement & maintain observability solutions (e.g., Prometheus, Grafana, ELK). Technical Operations & Continuous Improvement. Required Candidate Profile: Must have experience in Azure services/AWS. Hands-on with Infrastructure as Code (IaC) tools such as Terraform. Scripting skills in Python/Bash/PowerShell. Must be designated in an SRE role/duties. Notice Period: 1 month or less.

Posted Date not available


7.0 - 12.0 years

14 - 22 Lacs

Hyderabad

Hybrid

Role & responsibilities: SRE mid-senior engineer. Configuring observability tools (Coralogix, Checkly) through Terraform. Implementing/enhancing CI/CD pipelines for multiple products leveraging GitHub, Azure DevOps, Bitbucket & Jenkins. On-call support covering IST general shift or second shift. Ability to understand requirements on new tools quickly and implement them in a short period of time.

Posted Date not available


5.0 - 10.0 years

15 - 30 Lacs

Hyderabad, Pune, Bengaluru

Hybrid

Role Summary: We are seeking a highly skilled professional to lead initiatives at the intersection of Generative AI (GenAI), Agentic AI frameworks, Root Cause Analysis (RCA), and Full Stack Observability. This role is ideal for individuals who can leverage AI-driven insights to enhance incident detection, impact analysis, and system reliability across complex IT environments. Key Responsibilities: Design and implement AI-driven RCA and impact analysis frameworks integrated with observability platforms. Utilize Agentic AI and GenAI technologies to automate and enhance decision-making in incident response and service reliability. Integrate observability tools (e.g., AppDynamics, Dynatrace,...

Posted Date not available


4.0 - 7.0 years

14 - 24 Lacs

Bengaluru

Hybrid

Job Title: Site Reliability Engineer (SRE). Role Overview: You'll lead efforts to instrument and monitor our production environment with deep visibility and proactive issue detection. This includes tracking Core Web Vitals, feature KPIs, funnel conversions, API responsiveness, and broader traffic shifts. Your work will empower WM.com to measure daily change with precision and respond to any anomalies before customers are impacted. Key Responsibilities: Architect and maintain robust monitoring frameworks using LogRocket, Datadog, AppDynamics, LaunchDarkly, BrowserStack, and more. Define and track performance indicators such as Core Web Vitals, feature-specific KPIs, and system throughput metrics. Q...

Posted Date not available


7.0 - 10.0 years

12 - 22 Lacs

Pune, Chennai

Hybrid

Role & responsibilities Architectural Design and Implementation Design and deploy scalable, highly available, and fault-tolerant systems on AWS. Develop and implement cloud infrastructure solutions using AWS services such as EC2, S3, VPC, RDS, Lambda, etc. Utilize Infrastructure as Code (IaC) tools like Terraform, CloudFormation, and AWS CDK to automate the provisioning and management of AWS resources. Kubernetes and Containerization Deploy, manage, and scale Kubernetes clusters on AWS (EKS). Design container orchestration solutions and manage containerized applications. Implement best practices for Kubernetes resource management, networking, and security. Observability and Monitoring Implem...

Posted Date not available
