Senior Site Reliability Engineer - Kubernetes

3 - 8 years

13 - 18 Lacs

Posted: 1 day ago | Platform: Naukri

Work Mode

Hybrid

Job Type

Full Time

Job Description

Position Overview:

The Site Reliability Engineer (SRE) will play a key role in building and managing Kubernetes platforms that support cloud-native applications and services. This position focuses on architecting and managing reliable, scalable, and secure Kubernetes-based platforms on AWS, ensuring high availability and performance while optimizing costs and automation. The ideal candidate will have hands-on experience with AWS infrastructure, Kubernetes platform creation, Helm charts, Karpenter scaling, and Istio service mesh.

Key Responsibilities:

  • Kubernetes Platform Creation:

    Design, implement, and maintain highly available, scalable, and fault-tolerant Kubernetes platforms. Ensure clusters are optimized for production workloads, providing high resilience and operational efficiency.
  • AWS Infrastructure Management:

    Build, manage, and optimize AWS cloud infrastructure, including EKS, ECS, S3, VPCs, RDS, and IAM. Implement best practices for cost management, scaling, and security within AWS.
  • Helm Management:

    Utilize Helm to automate and streamline the deployment of applications and services to Kubernetes clusters. Create, maintain, and manage Helm charts for production-ready deployments.
  • Karpenter Implementation:

    Implement and manage Karpenter to dynamically scale Kubernetes clusters in response to workload demands.
  • Istio Service Mesh Management:

    Configure and manage Istio to provide service-to-service communication, security, and observability within the Kubernetes clusters. Enable fine-grained traffic management, service discovery, and policy enforcement.
  • Platform Automation & Scaling:

    Automate the deployment, scaling, and management of infrastructure and applications. Work with CI/CD pipelines to ensure a seamless flow from development to production with minimal downtime.
  • Incident Management & Troubleshooting:

    Respond to incidents, troubleshoot, and resolve system issues related to performance, availability, and security in a timely and effective manner.
  • Security & Compliance:

    Design and implement secure cloud infrastructure with appropriate access controls, network security, and compliance frameworks.
  • Documentation & Knowledge Sharing:

    Create and maintain detailed documentation for Kubernetes platform setup, operational procedures, and best practices. Promote knowledge sharing across teams.

Required Qualifications:

  • 3+ years of experience with Kubernetes (K8s), Helm, Karpenter, and Istio.
  • 5+ years of experience with infrastructure-as-code tools like Terraform, Chef, or Ansible.
  • 5+ years of experience with serverless computing (AWS Lambda, API Gateway) and microservices architecture.
  • Experience with multi-region cloud environments.
  • Proven experience with AWS (EC2, RDS, S3, CloudFormation, IAM, etc.) and a solid understanding of cloud-native architectures.
  • Strong expertise in Kubernetes platform creation, management, and optimization (e.g., setting up highly available clusters, networking, and storage).
  • Hands-on experience with Helm for Kubernetes application deployment and management.
  • Practical experience with Karpenter for dynamic scaling of Kubernetes clusters and optimizing resource usage.
  • Expertise in managing and securing Istio for service mesh, including traffic management, security, and observability features.
  • Proficiency in CI/CD pipelines and automation tools (e.g., Jenkins, GitLab, CircleCI, Terraform, Ansible, Spinnaker).
  • Strong scripting and automation skills in Python, Bash, or Go for infrastructure management and platform automation.
  • Experience with monitoring, logging, and alerting tools such as Prometheus, Grafana, CloudWatch, and ELK Stack.

Preferred Qualifications:

  • Understanding of security best practices for cloud platforms and Kubernetes (e.g., role-based access control (RBAC), encryption, and compliance frameworks).
  • Familiarity with Docker and containerization principles.
  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent professional experience).
  • Certifications (Preferred):

    CKA (Certified Kubernetes Administrator), CKAD (Certified Kubernetes Application Developer), or AWS Certified DevOps Engineer are highly desirable.
