Staff Software Development Engineer

2 - 5 years

4 - 8 Lacs

Posted: 1 day ago | Platform: Foundit

Skills Required

CI/CD pipelines

Work Mode

On-site

Job Type

Full Time

Job Description

Position Summary

Site Reliability Engineer (SRE) / DevOps Engineer

Key Responsibilities

  • Design, implement, and manage cloud infrastructure (GCP/AWS/Azure) using Infrastructure as Code (Terraform)
  • Build, maintain, and optimize CI/CD pipelines with tools such as GitLab CI, CircleCI, ArgoCD
  • Ensure high availability and performance of applications running on Kubernetes (GKE/EKS/AKS) and container orchestration tools
  • Implement observability solutions using Prometheus, Grafana, ELK, and other monitoring/logging tools
  • Work with development teams to enhance application performance and deployment workflows
  • Automate and manage IAM, RBAC, network policies, and vulnerability scanning
  • Participate in incident management, root cause analysis, and postmortem processes
  • Continuously improve infrastructure reliability and reduce manual operational efforts (toil)

Basic Qualifications

  • Strong knowledge of Linux system administration
  • Proficiency in scripting languages such as Python, Bash, or Go
  • Solid hands-on experience with cloud platforms (GCP preferred; AWS or Azure acceptable)
  • Proficient in Kubernetes operations, including Helm charts, service meshes, and operators
  • Experience with Terraform and Infrastructure as Code best practices
  • Experience building and maintaining CI/CD pipelines (e.g., GitLab CI, CircleCI, ArgoCD)
  • Familiarity with observability tools (Prometheus, Grafana, ELK, etc.)
  • Good understanding of networking concepts: TCP/IP, DNS, Load Balancing, Firewalls

Preferred Qualifications

  • Experience with advanced networking and service meshes (e.g., Istio)
  • Familiarity with SRE principles: SLOs, SLIs, error budgets
  • Exposure to multi-cluster or hybrid-cloud infrastructure setups
  • Experience with incident response and post-incident review processes

Key Skills

Site Reliability Engineering, DevOps, GCP, AWS, Azure, Terraform, CI/CD, GitLab CI, CircleCI, ArgoCD, Kubernetes, GKE, EKS, AKS, Helm, Prometheus, Grafana, ELK, Python, Bash, Go, IAM, RBAC, Network Policies, Service Mesh, Istio, TCP/IP, DNS, Load Balancers, Firewalls, Monitoring, Logging, Error Budgets, SLOs, SLIs, Incident Management

Gruve

Transportation & Logistics

San Francisco

Bengaluru / Bangalore, Karnataka, India