8 - 13 years

10 - 18 Lacs

Posted: 1 day ago | Platform: Naukri

Work Mode

Remote

Job Type

Full Time

Job Description

Senior DevOps Architect

Location:

Shift:

Experience:

Role Summary

As a Senior DevOps Architect, you will architect and evolve our Kubernetes platform, CI/CD ecosystem, observability stack, and cloud infrastructure. You will work closely with senior engineers and cross-functional teams to define standards, build reusable patterns, automate operations, and ensure reliable, secure, and cost-efficient delivery pipelines.

Your role includes shaping platform strategy, mentoring engineers, and ensuring our systems are resilient, observable, and scalable.

Key Responsibilities

Platform Architecture & Leadership

  • Architect, standardize, and improve platform tooling, deployment frameworks, and operational workflows used across product engineering teams.
  • Define and maintain golden paths, reusable templates, and guardrails ensuring consistency, reliability, and compliance.
  • Provide guidance and mentorship to DevOps engineers, supporting adoption of best practices in operations and automation.

Kubernetes & Cloud Infrastructure

  • Own and evolve Kubernetes operations at scale: Helm chart architecture, cluster governance, namespaces/security, rollout strategies, and lifecycle management.
  • Design and optimize AWS infrastructure, including IAM, EKS, EC2, S3, networking, and secrets management.
  • Implement cluster-level resiliency patterns, capacity planning, cost optimization, and runtime performance tuning.

CI/CD Strategy & Automation

  • Architect and extend GitLab CI/CD pipelines with enterprise-grade quality gates, environment strategies, and automated delivery frameworks.
  • Establish advanced deployment patterns (blue/green, canary, progressive delivery) and help teams adopt continuous delivery safely and consistently.
  • Automate build and release workflows, improve developer experience, and reduce manual operational load.

Observability, Monitoring & SRE Practices

  • Lead architecture and operations of the observability stack (Prometheus, Alertmanager, Thanos, OpenTelemetry).
  • Define and refine SLOs/SLIs, alerting strategies, and dashboards to improve signal-to-noise and support high-quality incident response.
  • Guide service instrumentation across metrics, logs, and tracing; work closely with application teams to ensure end-to-end observability.

Operations, Incident Response & Governance

  • Provide escalation-level support during incidents; perform root cause analysis and drive platform-wide corrective actions.
  • Establish operational standards: runbooks, SOPs, troubleshooting guides, and change-management workflows.
  • Introduce automation to reduce toil, improve platform stability, and ensure compliance with security and operational controls.

Continuous Improvement & Technical Stewardship

  • Identify architectural gaps and lead efforts to modernize workflows, reduce complexity, and improve developer productivity.
  • Own platform scalability, cost management, and infrastructure hygiene (resource pruning, right-sizing, lifecycle policies).
  • Partner with security, compliance, and product teams to ensure secure, reliable, compliant platform operations.

Required Qualifications

  • 8-12+ years in DevOps, SRE, platform engineering, or cloud infrastructure roles.
  • Expert-level experience operating Kubernetes in production (Helm, Docker, Ingress NGINX, rollout strategies, cluster operations).
  • Deep experience with GitLab CI/CD: creating reusable pipelines, artifacts, environments, and automated deployment patterns.
  • Strong programming/scripting ability in Bash and Python (Go preferred).
  • Solid AWS expertise: IAM, EKS, EC2, S3, networking, CloudWatch/CloudTrail, and secrets management.
  • Strong observability experience: Prometheus, Alertmanager, Thanos, and OpenTelemetry for metrics/logs/traces.
  • Demonstrated ability to lead platform engineering efforts, mentor engineers, and collaborate across teams.
  • Comfortable working via tickets (Jira/ServiceNow) and adhering to change-management and production control processes.

Highly Preferred

  • Terraform expertise for building and managing AWS infrastructure and Kubernetes platform components.
  • Experience developing lightweight internal tools (API integrations, automation frameworks) in Python/Go/Java.
  • Deep Linux and container-runtime fundamentals for debugging, performance tuning, and security hardening.
  • Experience in regulated industries such as insurance or financial services, including familiarity with compliance and operational risk controls.

What You'll Bring

You bring a builder's mindset, a strong sense of ownership, and the ability to transform complex platform challenges into scalable, automated solutions. You enjoy mentoring, designing systems for long-term sustainability, and enabling product teams to move quickly without compromising reliability.
