Site Reliability Engineer

Cybage

5 - 10 years

12 - 22 Lacs

Pune

Posted: 6 days ago

Skills Required

Infrastructure as Code, Site Reliability Engineering, Python, Automation, AWS Cloud, Docker, Linux, Prometheus, Grafana, Kubernetes

Work Mode

Hybrid

Job Type

Full Time

Job Description

Key Responsibilities

Build and scale observability systems: Design and maintain infrastructure for collecting, aggregating, and analyzing telemetry data (metrics, logs, and traces).
Enable actionable insights: Develop dashboards, alerts, and visualizations that turn raw data into clear, meaningful information for engineers, SREs, and business stakeholders.
Collaborate across teams: Partner with engineering, operations, and SRE teams to define SLIs/SLOs and improve visibility into system performance and health.
Drive best practices: Advocate for and support consistent instrumentation, effective alerting, and strong observability practices across engineering teams.
Optimize systems and tools: Continuously assess the performance, usage, and cost of observability tools, identifying opportunities for improvement and efficiency.
Automate: Engineer capabilities that drive the adoption of SRE principles and best practices in everything deployed within the Nexxen environment.
Improve: In collaboration with engineering teams, develop plans to improve the reliability of applications and infrastructure, and assist these teams in engineering those improvements.
Support incident response: Participate in and help improve the incident response process, reducing MTTR and contributing to post-incident reviews and root cause analysis.

What We're Looking For

Technical Skills

Programming experience in languages like Go, Python, Java, or Node.js. Able to contribute tools and advise on application-level instrumentation improvements.
Observability tooling expertise with these tools:
LGTM (Loki, Grafana, Tempo, Mimir)
Datadog
CloudWatch
Prometheus
PagerDuty
ClickStack
VictoriaMetrics
Groundcover
Libre
Zabbix
Cloud experience with AWS and services like EC2, EKS, ECS, and VPC networking.
Containers & orchestration: Familiarity with Docker and Kubernetes.
Infrastructure as Code & automation: Experience with tools like Terraform, Ansible, Chef, or SCCM to manage observability infrastructure at scale.
Linux systems knowledge: Strong understanding of Linux, shell scripting, and the storage/networking stack.
Tracing: Deep understanding of tracing technology and OpenTelemetry.
SRE practices: SLIs, SLOs, error budgets, and failure domains.

