Senior Operations Engineer

3 - 6 years

6 - 13 Lacs

Posted: 1 week ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Key Skills:

Roles and Responsibilities:

  • Own operational oversight for services running on a Java-based microservices platform.
  • Act as the primary escalation point for production incidents; lead incident response and communication.
  • Drive post-incident reviews (blameless RCAs) and embed learnings through preventive actions.
  • Maintain service dashboards, alerts, and incident tooling (e.g., PagerDuty, Datadog).
  • Guide operational practices across services built using Java (Spring Boot), Kafka, MongoDB, and related technologies.
  • Oversee monitoring, observability, and performance tuning using Datadog, ELK, Prometheus, or similar tooling.
  • Lead proactive and reactive problem management efforts.
  • Identify recurring production issues and collaborate with engineering to design permanent solutions.
  • Track and reduce operational toil via automation and tooling improvements.
  • Partner with development teams to onboard new services with production readiness standards.
  • Ensure all services meet requirements for monitoring, logging, documentation, support, and resilience before go-live.
  • Support safe, rapid change practices including canary releases, feature flags, and progressive delivery.
  • Lead and mentor a team of operations engineers and/or SREs.
  • Manage performance reviews, career development, and day-to-day team workload.
  • Foster a high-performance culture with strong accountability, collaboration, and a learning mindset.
  • Drive automation and self-service initiatives to reduce manual intervention and operational burden.
  • Champion observability best practices (metrics, traces, logs) and error budget tracking.
  • Promote DevOps culture and continuous feedback loops between engineering and operations.
  • Ensure operational processes comply with security, privacy, and regulatory requirements (e.g., SOC 2, ISO 27001).
  • Manage operational risks, service continuity plans, and audit readiness.

Skills Required:

  • Strong experience with cloud platforms and services
  • Knowledge of DevOps practices and SRE principles
  • Expertise in Java-based microservices, Spring Boot, Kafka, MongoDB
  • Experience with monitoring, observability, and incident management tools (Datadog, ELK, Prometheus, PagerDuty)
  • Ability to lead incident response, blameless RCAs, and operational problem management
  • Strong automation, scripting, and process improvement skills
  • Excellent leadership, mentoring, and collaboration abilities
  • Understanding of compliance, security, and regulatory requirements (SOC 2, ISO 27001)

Education:

Careernet

Recruitment & Staffing

Tech City
