Sr. DevOps Engineer - Infrastructure

7 - 10 years

12 - 18 Lacs

Posted: 1 day ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

As a Sr. DevOps Engineer, you will be part of the Cloud Platform team, which is responsible for architecting, deploying, maintaining, and supporting infrastructure platforms across public, private, and hybrid cloud environments. The team owns core platform services, including Kubernetes, Infrastructure as Code, CI/CD platforms, and GitOps tooling, and supports both data centre and cloud-native deployments.
In addition to core platform responsibilities, this role will focus on intelligent automation and applied AI use cases to reduce operational toil, improve reliability, and accelerate incident response across the SaaS platform.

Here is how, through this exciting role, YOU will contribute to BMC's and your own success:

  • Support, maintain, and optimize private and public cloud deployments across AWS, Google Cloud, Oracle Cloud, and other IaaS platforms.
  • Maintain and support all platform deployment software including Kubernetes, Rancher, OKE, GKE, and EKS.
  • Design, develop, and maintain automation using Terraform, Jenkins, ArgoCD, and Ansible.
  • Write and maintain automation and tooling using Python and/or Go.
  • Deploy and operate data center environments across bare metal, virtualized, and cloud-based infrastructure.
  • Provide architectural input for capacity planning, performance management, data center rollouts, and consolidation initiatives.
  • Design and implement AI-assisted automation to streamline operational workflows, including:
    • Intelligent alert enrichment, correlation, and noise reduction
    • Automated runbooks and self-healing workflows
    • AI-assisted troubleshooting and incident summarization
  • Integrate LLM-based capabilities into existing platform tooling, CI/CD pipelines, and ChatOps workflows (Slack/Teams) where appropriate.
  • Partner with SRE, Ops, and Platform stakeholders to identify high-impact automation opportunities and drive measurable operational improvements.
  • Ensure all automation and AI-enabled solutions are secure, reliable, auditable, and production-ready.

To ensure you're set up for success, you will bring the following skillset & experience:

  • Strong hands-on experience with Terraform, Jenkins, Kubernetes, and ArgoCD.
  • 7-10 years of relevant industry experience (industry domain not critical).
  • Proven experience supporting production platforms with on-call and incident response responsibilities.
  • Experience working with Change and Incident Management tools and processes.
  • Prior experience deploying and operating infrastructure in at least one major cloud provider: AWS, Google Cloud, or Oracle Cloud.
  • Strong Linux fundamentals and system troubleshooting skills.
  • Strong Python skills for building AI-assisted tooling, automation workflows, and integrations.
  • Hands-on experience using AI/LLM APIs (e.g., OpenAI, Azure OpenAI, Anthropic, or equivalent) for practical automation use cases.
  • Experience designing or implementing:
    • Alert triage and enrichment using AI
    • AI-assisted runbooks or operational knowledge bases
    • ChatOps or self-service operational tooling
  • Solid understanding of AI limitations, guardrails, and operational risks, including data privacy and security considerations.
  • Ability to apply AI to augment operational decision-making, not replace engineering judgment.
  • Ability to work independently with minimal supervision and exercise sound technical judgment.
  • Experience solving problems of diverse scope requiring data analysis, evaluation of alternatives, and practical decision-making.
  • Strong verbal and written communication skills for collaboration with a global, cross-functional team.
  • Experience with Spectro Cloud or similar Kubernetes management platforms.
  • Familiarity with observability stacks such as Prometheus, Grafana, ELK/OpenSearch.
  • Exposure to AIOps platforms or event-driven automation systems.
  • Experience with frameworks such as LangChain, LlamaIndex, or vector databases.
  • Relevant certifications (ITIL, AWS, GCP, OCI, VCP, etc.).

BMC Software

IT Services and IT Consulting

Houston, Texas

navi mumbai, mumbai (all areas)