Site Reliability Engineer

8 - 13 years

22 - 25 Lacs

Posted: 5 hours ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Role & responsibilities

Job Summary:

SRE AI Ops Engineer

Key Responsibilities:

  • Leverage AI Ops tools to automatically detect, analyze, and triage incidents across the infrastructure and application landscape.
  • Perform advanced root cause analysis using AI/ML-driven observability and incident correlation tools.
  • Develop self-healing capabilities using AI Ops platforms or by writing automation scripts (e.g., in Python, Bash, or similar languages).
  • Build and maintain dashboards for cloud infrastructure monitoring, service performance metrics, and operational analytics.
  • Implement intelligent alerting and correlation mechanisms to reduce noise and improve incident detection accuracy.
  • Create and manage a knowledge base for recurring issues and automate their resolution using bots and AI agents.
  • Collaborate with SRE, DevOps, and Application teams to ensure end-to-end observability and operational excellence.

Required Skills & Qualifications:

  • Bachelor's degree in Computer Science, Information Technology, or a related field.
  • 3+ years of experience in SRE, DevOps, or IT Operations, with a focus on observability or AI Ops.
  • Hands-on experience with AI Ops platforms (e.g., Moogsoft, BigPanda, OpsRamp, Dynatrace, IBM Watson AIOps, or similar).
  • Proficiency in scripting languages (Python, Shell, etc.) to automate operational tasks and implement self-healing solutions.
  • Strong understanding of cloud infrastructure monitoring (AWS, Azure, GCP) and performance management.
  • Experience with dashboard and analytics tools (e.g., Grafana, Kibana, Power BI, or cloud-native dashboards).
  • Knowledge of ITIL processes and experience integrating with ITSM platforms (e.g., ServiceNow).
  • Excellent problem-solving, analytical thinking, and communication skills.

Preferred Qualifications:

  • Exposure to machine learning or data analytics frameworks used in operations.
  • Familiarity with incident management bots or ChatOps tools (e.g., Slack bots, MS Teams bots).
  • Experience working in an agile environment with CI/CD pipelines and DevOps best practices.
  • Certification in cloud platforms (AWS/GCP/Azure) or AI Ops tools is a plus.

Techno Facts Solutions

Information Technology Consulting

Tech City
