Site Reliability Engineer (SRE) – Cloud Infrastructure & Automation

3 years

0 Lacs

Posted: 4 days ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Job Title: Site Reliability Engineer (SRE) – Cloud Infrastructure & Automation

Locations:

  • Munich: Strong preference for GCP and AWS (Azure is a plus)
  • Kirkland: Strong preference for Azure and AWS (GCP is a plus)
  • Pune, India: Open to strong profiles with multi-cloud experience (AWS mandatory; GCP or Azure is a plus)
  • Onsite Requirement: 100% Onsite or Hybrid based on location and role level

Overview:

We are looking for a hands-on Site Reliability Engineer.

You’ll play a key role in ensuring reliability, scalability, and performance of production systems across multi-cloud environments.

Key Responsibilities:

  • Build and manage infrastructure as code using modern tooling (e.g., Terraform, ArgoCD).
  • Design and operate Kubernetes-based production systems using open-source solutions.
  • Develop automation tools and scripts to improve operational efficiency.
  • Contribute to the development of highly available, fault-tolerant distributed systems.
  • Monitor and maintain infrastructure health using Prometheus-compatible observability tools.
  • Collaborate with developers to enable CI/CD pipelines and cloud-native application delivery.

Required Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, or equivalent experience.
  • Minimum 3 years of hands-on experience as a Site Reliability Engineer.
  • Minimum 3 years of Kubernetes production experience, leveraging ecosystem tools.
  • At least 3 years of experience developing production-grade software.
  • Proficiency in at least one OOP language (Go, C++, Python preferred).
  • Strong hands-on experience with AWS and one or more of GCP or Azure.
  • Solid foundation in Linux and networking concepts.
  • Proven experience in building and maintaining high-concurrency, highly available systems.

Nice to Have (Bonus Skills):

  • Knowledge of application security in cloud-native environments.
  • Familiarity with service mesh or multi-cluster mesh infrastructure (e.g., Cilium).
  • Experience with observability stacks (Prometheus, Grafana, etc.).
  • Experience using GitHub Actions or other modern CI/CD tools.
  • Exposure to tools like ArgoCD, Terraform, FoundationDB, Kafka, or Kubernetes operators.
