Site Reliability Engineer Lead

9 - 14 years

30 - 45 Lacs

Posted: 3 hours ago | Platform: Naukri

Work Mode

Hybrid

Job Type

Full Time

Job Description

In one sentence

As the SRE Lead, you will be responsible for the reliability, operational excellence, and release governance of amAIz (Telco Agentic Suite). You will lead a cross-functional team of non-functional testing (NFT), QA, and DevOps engineers, driving best practices in observability, automation, performance optimization, and quality assurance, and orchestrating smooth, predictable releases across environments.

All you need is...

  • Bachelor's degree in Science/IT/Computing or equivalent.
  • 5+ years of experience in SRE, DevOps, or infrastructure engineering roles.
  • Proven leadership experience managing cross-functional engineering teams.
  • Excellent communication and stakeholder management skills.
  • Strong understanding of cloud platforms (AWS, GCP, or Azure).
  • Experience with container orchestration (Kubernetes), CI/CD, and Infrastructure as Code.
  • Knowledge of ArgoCD – an advantage.
  • SaaS experience – an advantage.
  • Proficiency in monitoring tools (Prometheus, Grafana, Datadog, etc.).
  • Solid scripting/coding skills (Python, Go, Bash).
  • Experience with QA methodologies, test automation, and E2E testing frameworks.
  • Experience in Release Management: planning, scheduling, and coordinating releases in complex environments.
  • GenAI experience – an advantage.

What will your job look like?

  • Lead and mentor the DevOps team to build scalable, secure, and automated infrastructure for amAIz (Telco Agentic Suite).
  • Automate CI/CD pipelines to streamline deployments and ensure fast, reliable delivery of features and fixes.
  • Establish and maintain observability systems (monitoring, alerting, logging) to enable proactive issue detection and resolution.
  • Promote and integrate GenAI capabilities into the SDLC, ensuring all R&D teams leverage these tools effectively.
  • Drive FinOps practices to optimize cloud costs and resource utilization.
  • Guide the QA team in designing and executing E2E test cases for Generative AI workflows and platform features.
  • Integrate test automation into CI/CD pipelines to support continuous quality validation.
  • Define and track quality metrics to ensure release readiness and platform stability.
  • Lead NFT efforts for performance, scalability, and reliability testing to validate system behavior under load.
  • Define and maintain non-functional requirements in collaboration with product and architecture teams.
  • Analyze NFT results and drive optimizations to meet enterprise-grade standards and SLAs.
  • Coordinate release planning, scope definition, and risk assessment with stakeholders.
  • Govern release processes, including approvals, documentation, and compliance with change management policies.

Why you will love this job:

  • The chance to serve as a specialist in software and technology.
  • You will take an active role in technical mentoring within the team.
  • We provide stellar benefits from health to dental to paid time off and parental leave!

Amdocs

Software and Services

Chesterfield
