SRE/DevOps

18 - 24 years

12 - 16 Lacs

Posted: 5 days ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

We are looking for a skilled SRE/DevOps professional with 18 to 24 years of experience to join our team in Noida, Pune, and Bangalore. The ideal candidate will have a strong background in technical leadership and driving continuous improvement of reliability, stability, and performance of digital platforms.
Roles and Responsibilities
  • Provide technical and people leadership to SRE, DevOps, Monitoring, and Database Operations teams.
  • Collaborate with leadership on budgeting, planning, hiring, and managing third-party contracts.
  • Oversee project status, assemble project teams, and define assignments with schedules and milestones.
  • Drive continuous improvement of reliability, stability, and performance of digital platforms.
  • Implement automated telemetry, observability, and applied intelligence systems.
  • Lead efforts to develop automated alerting, self-healing mechanisms, and intelligent response systems.
  • Ensure 24/7 uptime of sites and services, with minimal unplanned downtime.
  • Serve as Escalation Manager/Critical Incident Manager during major incidents, leading teams in rapid service restoration.
  • Provide on-call escalation support based on 24/7/365 schedules and communicate timely updates and incident reports to senior leadership.
  • Partner with administrators, platform engineers, and other stakeholders to achieve highly reliable infrastructure, systems, and integrations.
  • Collaborate with product, application development, QA, and technology teams to enhance service reliability and performance.
  • Provide advanced Incident and Problem Management support to effectively diagnose, remediate, and resolve platform issues.
  • Automate critical workflows across the platform to minimize manual errors and reduce human intervention.
  • Implement ITIL processes like Incident, Problem, and Change Management.
  • Design and implement effective monitoring systems with proper alerting and escalation mechanisms for critical events.
  • Ensure timely capacity planning and infrastructure upgrades for optimal reliability.
  • Develop and refine processes to minimize Mean Time to Recover (MTTR) and extend Mean Time to Failure (MTTF).
  • Create and maintain detailed documentation, including run books, incident response guides, post-mortem reports, RCAs, and mitigation plans.
  • Ensure all changes adhere to established procedures and documentation standards.
  • Understand business workflows and map technology solutions to address problems effectively.
  • Lead conversations and provide technical support to both internal and external customers.
Job Requirements
  • Minimum 18 years of experience in a related field.
  • Strong understanding of technical leadership, automation, and scalability.
  • Experience with monitoring tools and techniques.
  • Excellent communication and collaboration skills.
  • Ability to work in a fast-paced environment and lead cross-functional teams.
  • Strong problem-solving skills and attention to detail.
  • A graduate degree is required.
    Apptad

    IT Services and IT Consulting

    Alpharetta, Georgia
