SRE Engineer

2 - 6 years

5 - 10 Lacs

Posted: 3 days ago | Platform: Naukri

Work Mode

Hybrid

Job Type

Full Time

Job Description

J1125-0744

Project context

As a Site Reliability Engineer (SRE) working in a 24/7 shift rotation, you will be responsible for ensuring the reliability, availability, and performance of critical systems and services. You will combine strong technical skills with operational excellence to proactively monitor, troubleshoot, and resolve issues. Your expertise in observability will help maintain robust monitoring, alerting, and incident response processes, ensuring seamless operations around the clock.

This role requires working 24/7 shifts on a monthly rotation.

Goals and deliverables

24/7 Operations & Incident Management

  • Monitor production systems and services using observability tools (logs, metrics, traces, dashboards).
  • Respond to incidents, alerts, and outages in real time, ensuring rapid resolution and minimal impact.
  • Participate in a rotating on-call schedule, providing support during nights, weekends, and holidays.

Observability & Monitoring

  • Design, implement, and maintain observability solutions (e.g., Prometheus, Grafana, ELK, and similar tools).
  • Develop and refine dashboards, alerts, and automated health checks for critical infrastructure and applications.
  • Analyze system performance and reliability data across the full stack, from infrastructure to application layers, to identify trends and prevent future incidents.

Technical Operations

  • Collaborate with development, infrastructure, application, and security teams to ensure system reliability and scalability.
  • Automate operational tasks and incident response processes using scripting and configuration management tools.
  • Document procedures, runbooks, and incident reports for knowledge sharing and continuous improvement.

Continuous Improvement

  • Conduct post-incident reviews and root cause analysis to drive improvements in reliability and response.
  • Propose and implement enhancements to monitoring, alerting, and operational processes.

Education and experience

  • Bachelor's degree in Information Technology, Computer Science, Business Administration, or a related field. A Master's degree or relevant certifications would be a plus.
  • 2-5 years of experience in cloud engineering and operations engineering
  • Proven experience with Azure services; experience with AWS and GCP is an advantage
  • Hands-on experience with Infrastructure-as-Code (IaC) tools such as Terraform
  • Strong scripting skills in Python, Bash or PowerShell for automation tasks
  • Familiarity with GitLab CI/CD tools and experience integrating them with Azure
  • Proficiency in monitoring and logging tools such as native cloud tools, OpenMetrics, and OpenTelemetry

Skills and behavioral competencies

  • Excellent problem solving and troubleshooting abilities
  • Result orientation, influence & impact
  • Empowerment & accountability with the ability to work independently
  • Team spirit, building relationships, collective accountability
  • Excellent oral and written communication skills for documenting and sharing information with technical and non-technical stakeholders

CGI

Information Technology and Consulting

Montreal
