Site Reliability Engineer

7 - 12 years

7 - 11 Lacs

Posted: 5 days ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Location: Pan India, as per the company's designated LTIM locations

Shift Type: Rotational shifts, including night shifts and weekend availability

Experience: 7 years

Job Summary

We are looking for a skilled and adaptable Site Reliability Engineer (SRE) / Observability Engineer to join our dynamic project team. The ideal candidate will play a critical role in ensuring system reliability, scalability, observability, and performance while collaborating closely with development and operations teams. This position requires strong technical expertise, problem-solving abilities, and a commitment to 24/7 operational excellence.

Key Responsibilities

Site Reliability Engineering

Design, build, and maintain scalable and reliable infrastructure

Automate system provisioning and configuration using tools like Terraform, Ansible, Chef, or Puppet

Develop tools and scripts in Python, Go, Java, or Bash for automation and monitoring

Administer and optimize Linux/Unix systems with a strong understanding of TCP/IP, DNS, load balancers, and firewalls

Implement and manage cloud infrastructure across AWS or Kubernetes

Maintain and enhance CI/CD pipelines using tools like Jenkins and ArgoCD

Monitor systems using Prometheus, Grafana, Nagios, or Datadog and respond to incidents efficiently

Conduct postmortems and define SLAs/SLOs for system reliability and performance

Plan for capacity and performance using benchmarking tools, and implement autoscaling and failover systems

Observability Engineering

Instrument services with relevant metrics, logs, and traces using OpenTelemetry, Prometheus, Jaeger, Zipkin, etc.

Build and manage observability pipelines using Grafana, ELK Stack, Splunk, Datadog, or Honeycomb

Work with time-series databases (e.g., InfluxDB, Prometheus) and log aggregation platforms

Design actionable alerts and dashboards to improve system observability and reduce alert fatigue

Partner with developers to promote observability best practices and define key performance indicators (KPIs)

Required Skills and Qualifications

Proven experience as an SRE or Observability Engineer in complex production environments

Hands-on expertise in Linux/Unix systems and cloud infrastructure (AWS/Kubernetes)

Strong programming and scripting skills in Python, Go, Bash, or Java

Deep understanding of monitoring, logging, and alerting systems

Experience with modern Infrastructure as Code and CI/CD practices

Ability to analyze and troubleshoot production issues in real time

Excellent communication skills to collaborate with cross-functional teams and stakeholders

Flexibility to work in rotational shifts including night shifts and weekends as required by project demands

A proactive mindset with a focus on continuous improvement and reliability

Apptad

IT Services and IT Consulting

Alpharetta, Georgia
