Site Reliability Engineer (SRE) – Infrastructure & Automation

7 years

0 Lacs

Posted: 1 day ago | Platform: LinkedIn

Work Mode: On-site

Job Description

About InstaService

InstaService is revolutionizing the home services industry through AI-driven technology, connecting customers with trusted professionals instantly. We’re growing fast across 23+ states and expanding nationwide — backed by strong traction, rapid adoption, and a mission to simplify how people get work done at home.

Senior Site Reliability Engineer (SRE)


What You’ll Do

  • Lead incident response, conduct root cause analysis, and ensure permanent preventive measures.
  • Design and optimize CI/CD pipelines, automate deployments, and enforce release stability.
  • Build and manage scalable infrastructure on AWS, GCP, or Azure using Terraform, Ansible, and Kubernetes.
  • Continuously monitor system health with Prometheus, Grafana, ELK, and CloudWatch.
  • Conduct load and performance testing (k6, JMeter, Locust) and optimize systems for high-traffic events.
  • Improve observability, reduce alert noise, and enhance signal clarity for faster debugging.
  • Collaborate with developers and architects to ensure systems meet SLOs, SLIs, and SLAs.
  • Develop automation scripts and tools in Python, Go, Node.js, or Shell to streamline operations.
  • Manage distributed systems and message queues like Kafka or RabbitMQ.
  • Drive a culture of reliability, automation, and scalability across teams.


What We’re Looking For

  • 4–7 years of experience in SRE or DevOps roles (preferably in high-scale or e-commerce environments).
  • Strong hands-on experience with Kubernetes, Docker, Terraform, Ansible, and CI/CD pipelines.
  • Deep understanding of Linux systems, networking, and distributed architecture.
  • Solid programming skills in Python, Go, or Node.js.
  • Experience managing cloud platforms (AWS, GCP, or Azure).
  • Proven track record of maintaining production uptime and optimizing system performance.


Nice to Have

  • Experience with observability stacks, distributed tracing, and incident automation.
  • Familiarity with microservices and event-driven systems.
  • Exposure to cost optimization and capacity planning in multi-cloud environments.


Why Join InstaService?

  • Fast-growing startup reshaping a massive industry
  • Work on high-scale systems and impactful technology
  • Collaborative and innovation-driven team
  • Competitive compensation and growth opportunities

