10 - 14 years

Posted: 1 month ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As a Lead Site Reliability Engineer (SRE) based in India, you will be embedded into our U.S.-based SRE and development teams. Your role will involve hands-on engineering tasks, with a focus on building automation, writing scripts, analyzing system performance, and ensuring platform observability, uptime, and incident response for high-scale applications. Collaboration with U.S.-based engineers during critical situations is essential.

Key Responsibilities:
- Collaborate with U.S.-based counterparts to define and monitor service SLOs, SLAs, and key performance indicators.
- Lead root cause analysis, blameless postmortems, and reliability improvements across environments.
- Review Java/Spring application code to identify defects and systemic performance issues.
- Automate deployment pipelines, recovery workflows, and runbook processes to minimize manual intervention.
- Build and manage dashboards, alerts, and health checks using tools like DataDog, Azure Monitor, Prometheus, and Grafana.
- Contribute to architectural decisions focusing on performance and operability.
- Guide and mentor offshore team members in incident response and production readiness.
- Participate in 24x7 support rotations aligned with EST coverage expectations.

Required Experience & Skills:
- 10+ years in SRE, DevOps, or platform engineering, supporting U.S. enterprise systems.
- Strong hands-on experience with Java/Spring Boot applications for code-level troubleshooting.
- Cloud infrastructure knowledge (Azure preferred) and container orchestration (Kubernetes).
- Proficiency in logging/monitoring stacks such as DataDog, ELK, Azure Monitor, Dynatrace, Splunk.
- Experience with ServiceNow (SNOW) for ITSM processes.
- Familiarity with Terraform or ARM templates, CI/CD automation, and scripting (Python, Bash).
- Knowledge of Salesforce systems is highly preferred.
- Excellent communication skills and outstanding problem-solving ability in distributed environments.
- Demonstrated history of improving stability, availability, and delivery velocity for large-scale platforms.
