Senior Production Support Engineer

2 - 6 years

0 Lacs

Posted: 3 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

Role Overview: As a Senior Production Support Engineer, you will be responsible for leading the stability, scalability, and performance of production systems and cloud infrastructure. Your role will involve taking ownership of complex technical challenges, driving system reliability initiatives, and collaborating with DevOps, SRE, and product engineering teams to guide the cloud operations strategy.

Key Responsibilities:
- Own the performance, availability, and health of production environments.
- Lead complex incident investigations and drive resolution of application, infrastructure, and network-level issues.
- Perform deep root cause analyses (RCA) for critical and recurring incidents and implement long-term fixes.
- Design, maintain, and enhance observability across systems using tools like Prometheus, Grafana, New Relic, and ELK Stack.
- Establish best practices for monitoring, alerting, and logging to facilitate proactive issue detection and rapid response.
- Lead incident response efforts, coordinate across teams, and ensure swift resolution of high-impact issues.
- Provide Tier 3 support and technical escalation for unresolved complex issues.
- Mentor junior engineers to foster a culture of reliability, accountability, and continuous improvement.
- Collaborate with SRE, DevOps, and product engineering teams to design and build resilient and scalable systems.
- Maintain high-quality documentation standards for incidents, playbooks, and system knowledge.

Qualifications Required:
- 2+ years of hands-on experience in cloud infrastructure, production support, or system operations.
- Proven experience with AWS or GCP and managing distributed cloud-native applications at scale.
- Advanced troubleshooting and performance tuning skills in high-availability environments.
- Expertise in monitoring, observability, and alerting tools such as Prometheus, Grafana, and New Relic.
- Experience with incident management platforms like JIRA and Zendesk.
- Proficiency in scripting or infrastructure-as-code tools like Bash, Python, and Terraform.
- Solid understanding of databases, networking, and security best practices in cloud environments.
- Strong communication and leadership skills with a collaborative mindset.
- Experience mentoring and guiding junior engineers is a plus.
