Site Reliability Engineer

Uplers

2 - 5 years

10 - 20 Lacs

Bengaluru

Posted: 4 months ago

Skills Required

ELK Grafana Terraform Linux Prometheus Dynatrace Bash AWS Kubernetes Python

Work Mode

Work from Office

Job Type

Full Time

Job Description

Experience: 2+ years
Expected Notice Period: 15 Days
Shift: (GMT+05:30) Asia/Kolkata (IST)

Must have skills required:

Bash, Dynatrace, ELK, Grafana, Prometheus, Terraform, AWS, Kubernetes, Linux, Python

Job Overview

We are looking for a Site Reliability Engineer (SRE) with 2.5 to 5 years of experience to join our team. The ideal candidate will be responsible for ensuring the availability, scalability, and reliability of our distributed systems, improving observability, automating infrastructure, and enhancing system performance. This role provides an opportunity to work on high-scale, mission-critical environments and contribute to building a resilient infrastructure.

Key Responsibilities

Improve observability by implementing and managing monitoring, logging, and alerting solutions using Prometheus, ELK stack, and Grafana.

Work with APM tools such as Dynatrace and New Relic to monitor performance metrics and define SLIs, SLOs, and error budgets.

Participate in incident management, including on-call rotation and Root Cause Analysis (RCA).

Automate infrastructure provisioning using Terraform and Infrastructure as Code (IaC) principles.

Ensure system scalability, reliability, and performance in a distributed environment.

Strengthen security by applying cybersecurity best practices, vulnerability assessments, and compliance policies.

Collaborate with cross-functional teams to establish SRE best practices, improve release pipelines, and minimize deployment risks.

Maintain and improve disaster recovery plans to enhance resilience.

Manage and optimize workflows using Apache Airflow to ensure efficient scheduling and execution of data pipelines.

Support Snowflake data operations, ensuring high availability, performance optimization, and security compliance.

Qualifications & Certifications

Education:

Bachelor's degree in Computer Science, Engineering, or related fields.

Experience:

2.5 to 5 years of experience in Site Reliability Engineering, Observability, or Performance Monitoring.

Hands-on experience in:

Monitoring and observability using Prometheus, ELK, Grafana.

Application Performance Monitoring (APM) tools like Dynatrace, New Relic, or Datadog.

Incident response and on-call rotation management.

Infrastructure automation using Terraform.

Distributed systems operations and scaling.

Load testing and performance analysis using tools like JMeter, k6, or Locust.

Security at scale, including vulnerability scanning and compliance automation.

Workflow automation and orchestration using Apache Airflow.

Experience with Snowflake, including query optimization, data management, and security controls.

Technical Skills:

Strong knowledge of cloud platforms (AWS preferred).

Experience with troubleshooting distributed systems and high-traffic environments.

Hands-on knowledge of Linux, networking, and security fundamentals.

Familiarity with containers and container orchestration (Docker, Kubernetes).

Ability to write automation scripts using Python, Bash, or Go.

Preferred Certifications:

AWS Certified DevOps Engineer - Professional (or equivalent AWS certification).

HashiCorp Certified: Terraform Associate.

Certified Kubernetes Administrator (CKA).

Google SRE Professional Certificate (preferred but not mandatory).

Uplers

Digital Services

Ahmedabad
