Site Reliability Engineer

4 - 9 years

5 - 12 Lacs

Posted: 1 day ago | Platform: Naukri


Work Mode

Work from Office

Job Type

Full Time

Job Description

URGENT HIRING

Job Title: Site Reliability Engineer (SRE)

Job Summary:


We are seeking a motivated and detail-oriented SRE TechOps Engineer with 5+ years of experience to join our growing operations team. In this role, you will be the first line of defence, responsible for monitoring our production environments, responding to alerts, and performing initial troubleshooting of our cloud infrastructure. The ideal candidate will have a strong foundation in cloud technologies, a passion for automation, and expert-level knowledge of the ELK stack for monitoring and analysis. This is an opportunity to grow your skills in a modern, cloud-native environment.

Key Responsibilities:


System Monitoring: Proactively monitor the health and performance of our applications and infrastructure using Azure Monitor and the ELK stack.
Incident Response: Serve as the initial responder for all production alerts, following established runbooks and escalation procedures.
Triage and Troubleshooting: Perform initial investigation and triage of incidents, gathering logs and data to identify the root cause.
Issue Escalation: Escalate unresolved and complex issues to the L1/L2 engineering teams, providing detailed ticket information in Jira.
Automation: Assist in the maintenance and improvement of our CI/CD pipelines using GitHub Actions.
Infrastructure Support: Provide basic support for our Kubernetes and Terraform-managed infrastructure.
Documentation: Contribute to the creation and maintenance of runbooks and other operational documentation.
Collaboration: Work closely with development and other operations teams to ensure the stability and reliability of our services.

Required Qualifications:

ELK Stack Expertise: Proven, expert-level experience with the ELK (Elasticsearch, Logstash, Kibana) stack, including creating dashboards, setting up alerts, and writing complex queries for log analysis and troubleshooting.
Cloud Experience: Hands-on experience with Microsoft Azure, including Azure Monitor, virtual machines, and networking basics.
CI/CD Familiarity: Understanding of Continuous Integration and Continuous Deployment (CI/CD) principles, with some experience using tools like GitHub Actions.
Containerization and IaC: Good knowledge of Kubernetes and Infrastructure as Code (IaC) concepts, preferably with some exposure to Terraform.
Ticketing Systems: Proficiency in using Jira for incident tracking and management.
Problem-Solving Skills: Strong analytical and troubleshooting skills with the ability to remain calm and effective under pressure.
Communication: Excellent verbal and written communication skills.

Preferred Skills:

Insightek Global Consulting

Consulting

London
