Site Reliability Engineer

5 - 7 years

0 Lacs

Posted: 4 days ago | Platform: Foundit

Work Mode: Remote

Job Type: Full Time

Job Description

Job Title: SRE / DevOps Engineer

Location: Remote (India)

Experience: 5+ Years

Role Overview:

Site Reliability / DevOps Engineer

Kubernetes (GKE)

Key Responsibilities:

  • Design, implement, and manage highly available GCP infrastructure using Terraform (IaC).
  • Build and operate Kubernetes (GKE) clusters, including deployments, ingress, autoscaling, and Helm-based releases.
  • Develop and maintain CI/CD pipelines using GitHub Actions and Google Cloud Build.
  • Implement SRE best practices: SLIs, SLOs, SLAs, error budgets, and incident response.
  • Containerize and deploy microservices using Docker and Kubernetes.
  • Implement monitoring, logging, and observability using Cloud Monitoring, Cloud Logging, Prometheus, and Grafana.
  • Troubleshoot production issues, perform root cause analysis, and drive permanent fixes.
  • Manage networking and security, including VPCs, load balancers, DNS, SSL/TLS, firewalls, IAM, and VPNs.
  • Collaborate with engineering teams to improve system reliability, performance, and scalability.
  • Automate operational tasks using Python, Bash, or Go.
  • Participate in on-call rotations and incident management processes.

Required Skills & Qualifications:

  • 5+ years of experience as an SRE / DevOps / Cloud Engineer.
  • Strong hands-on experience with Google Cloud Platform (GCP):
      • Compute Engine, GKE, Cloud Functions
      • Cloud Storage, VPC, IAM
      • Cloud Logging & Cloud Monitoring
  • Expert-level Kubernetes experience (preferably GKE):
      • Deployments, Services, Ingress
      • Autoscaling (HPA)
      • Helm charts
  • Strong experience with Terraform for Infrastructure as Code.
  • Proven experience building CI/CD pipelines using GitHub Actions and Cloud Build.
  • Strong understanding of Docker, containers, microservices, and service mesh concepts.
  • Experience with observability tools:
      • Stackdriver (Cloud Ops), Prometheus, Grafana
  • Solid understanding of networking & cloud security:
      • Load balancers, DNS, SSL
      • VPNs, firewalls, IAM best practices
  • Hands-on scripting experience in Python, Bash, or Go.
  • Excellent problem-solving, debugging, and communication skills.

Nice to Have:

  • Experience with service mesh (Istio, Linkerd).
  • Experience with SRE metrics and reliability engineering practices.
  • Knowledge of cost optimization (FinOps) on GCP.
  • Experience working in remote, globally distributed teams.
