6 - 8 years
13 - 18 Lacs
Gurugram
Posted: 1 week ago
Work from Office
Full Time
Responsibilities:
- Define and enforce SLOs, SLIs, and error budgets across microservices
- Architect an observability stack (metrics, logs, traces) and drive operational insights
- Automate toil and manual ops with robust tooling and runbooks
- Own the incident response lifecycle: detection, triage, RCA, and postmortems
- Collaborate with product teams to build fault-tolerant systems
- Champion performance tuning, capacity planning, and scalability testing
- Optimise costs while maintaining the reliability of cloud infrastructure

Must-have skills:
- 6+ years in SRE/Infrastructure/Backend-related roles using cloud-native technologies
- 2+ years in an SRE-specific capacity
- Strong experience with monitoring/observability tools (Datadog, Prometheus, Grafana, ELK, etc.)
- Experience with infrastructure-as-code (Terraform/Ansible)
- Proficiency in Kubernetes, service mesh (Istio/Linkerd), and container orchestration
- Deep understanding of distributed systems, networking, and failure domains
- Expertise in automation with Python, Bash, or Go
- Proficiency in incident management, SLAs/SLOs, and system tuning
- Hands-on experience with GCP (preferred)/AWS/Azure and cloud cost optimisation
- Participation in on-call rotations and experience running large-scale production systems

Nice-to-have skills:
- Familiarity with chaos engineering practices and tools (Gremlin, Litmus)
- Background in performance testing and load simulation (Gatling, Locust, k6, JMeter)
GreyOrange