Site Reliability Engineer

4 - 6 years

10 - 20 Lacs

Posted: 2 months ago | Platform: Naukri

Work Mode

Hybrid

Job Type

Full Time

Job Description

Site Reliability Engineer (SRE)

Ideal candidates have strong foundations in infrastructure automation, systems engineering, and software development—along with a passion for building reliable systems at scale.

Key Responsibilities:

Reliability Engineering & Automation

  • Design and implement Infrastructure as Code (IaC) using Terraform, Pulumi, CloudFormation, or Ansible to provision and manage scalable cloud infrastructure.
  • Build self-healing and auto-scaling infrastructure using Kubernetes, Docker, and managed services across AWS, Azure, or GCP.
  • Automate operational tasks including failovers, backups, and capacity adjustments.

SLI/SLO Monitoring & Observability

  • Define and track Service Level Indicators (SLIs) and Objectives (SLOs) to measure and maintain service reliability.
  • Build and manage observability stacks using Prometheus, Grafana, Datadog, CloudWatch, AppDynamics (AppD), or equivalent.
  • Improve alert quality and reduce noise through intelligent alerting and tuning.

Incident Response & Operational Excellence

  • Lead production incident response, perform root cause analysis (RCA), and write blameless postmortems to drive continuous improvement.
  • Establish and refine runbooks, playbooks, and on-call processes to improve Mean Time to Recovery (MTTR).
  • Participate in on-call rotations to support critical production systems.

CI/CD Reliability & Release Engineering

  • Develop and optimize CI/CD pipelines (Jenkins, GitHub Actions, ArgoCD, TeamCity) with built-in checks for reliability, performance, and security.
  • Implement progressive delivery patterns such as canary deployments and blue/green rollouts.
  • Collaborate with developers to ensure release processes are safe, repeatable, and observable.

Security, Compliance & Risk Management

  • Enforce cloud security best practices for IAM, network segmentation, and secret management.
  • Integrate DevSecOps practices and tooling (e.g., Snyk, SonarQube, OWASP ZAP) into pipelines for early vulnerability detection.
  • Ensure systems adhere to regulatory and compliance standards (SOC 2, ISO 27001, GDPR, HIPAA, etc.).

Collaboration & Mentorship

  • Work cross-functionally with engineering, QA, and platform teams to embed reliability into the SDLC.
  • Provide guidance and mentorship to junior SREs and engineers on reliability practices.
  • Champion a culture of operational excellence, documentation, and knowledge sharing.

Skills & Qualifications:

Technical Expertise

  • Cloud Platforms: Strong hands-on experience with AWS, GCP, or Azure across compute, networking, storage, and identity.
  • Kubernetes: Advanced experience managing production-grade clusters (EKS, GKE, AKS), Helm, and containerized workloads.
  • CI/CD Tools: Jenkins, GitHub Actions, ArgoCD, TeamCity, Spinnaker (preferred).
  • IaC: Terraform, CloudFormation, Pulumi, Ansible (strong experience required).
  • Programming & Scripting: Proficiency in Python, Go, or Java. Strong scripting skills with Bash or PowerShell.
  • Version Control: Expert-level Git usage and branching strategies (GitOps experience is a plus).
  • Monitoring & Logging: Familiarity with Prometheus, Grafana, ELK, Datadog, New Relic, or AppD.

Security & Compliance

  • Strong understanding of cloud security principles and IAM policies.
  • Experience with automated security testing and static code analysis tools.

Soft Skills

  • Analytical thinker with strong troubleshooting and problem-solving skills.
  • Clear communication and ability to drive cross-team collaboration.
  • Strong ownership mindset and bias for action in high-pressure situations.
  • Ability to manage multiple priorities and lead technical initiatives.

Preferred Qualifications:

  • Certifications: AWS Solutions Architect, GCP Professional Cloud DevOps Engineer, or Azure DevOps Expert.
  • Experience with GitOps tooling such as GitHub, Jenkins, ArgoCD, TeamCity, JFrog, etc.
  • Exposure to serverless architecture (Lambda, GCF, Azure Functions).
  • Experience with chaos engineering and resiliency testing frameworks.

Education & Experience:

  • Bachelor’s degree in Computer Science, Information Technology, or a related field.
  • 5+ years of experience in SRE, DevOps, or cloud infrastructure roles with a focus on system reliability.

About Picarro:

We are the world's leader in timely, trusted, and actionable data using enhanced optical spectroscopy. Our solutions are used in various applications, including natural gas leak detection, ethylene oxide emissions monitoring, semiconductor fabrication, pharmaceutical, petrochemical, atmospheric science, air quality, greenhouse gas measurements, food safety, hydrology, ecology, and more. Our software and hardware are designed and manufactured in Santa Clara, California. They are used in over 90 countries worldwide and are backed by over 65 patents related to cavity ring-down spectroscopy (CRDS) technology. They are unparalleled in their precision, ease of use, and reliability.

At Picarro, we are committed to fostering a diverse and inclusive workplace. All qualified applicants will receive consideration for employment without regard to race, sex, color, religion, national origin, protected veteran status, gender identity, sexual orientation, or disability. Posted positions are not open to third-party recruiters/agencies, and unsolicited resume submissions will be considered free referrals.

Picarro Technologies

Environmental Monitoring, Instrumentation

Santa Clara
