Site Reliability Engineer

8 - 10 years

Posted: 1 day ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

Job Title:

Site Reliability Engineer

Location:

Bangalore

Experience:

8 years

We are looking for a skilled Site Reliability Engineer to ensure the reliability, scalability, and performance of critical systems and services. The SRE will work closely with development, operations, and cloud teams to implement automation, monitoring, incident management, and continuous improvement processes across applications and infrastructure.

Key Responsibilities

  • Design, implement, and maintain reliable, scalable, and highly available systems and services.
  • Monitor system performance, availability, and capacity to ensure SLAs and SLOs are met.
  • Automate deployment, configuration, and operational tasks using scripting and orchestration tools.
  • Respond to incidents, perform root cause analysis, and implement preventive measures.
  • Collaborate with development teams to ensure systems are designed for reliability and performance.
  • Build and maintain CI/CD pipelines for applications and infrastructure.
  • Implement observability solutions including logging, metrics, and alerting systems.
  • Manage cloud infrastructure (AWS, Azure, or GCP) and ensure security, cost optimization, and compliance.
  • Continuously improve systems and processes through automation, tooling, and best practices.

Required Skills

  • Strong experience with cloud platforms: AWS, Azure, or GCP.
  • Proficiency in Linux/Unix system administration.
  • Expertise in infrastructure as code (Terraform, CloudFormation, Ansible).
  • Experience with containerization and orchestration: Docker, Kubernetes, OpenShift.
  • Hands-on experience with monitoring and observability tools: Prometheus, Grafana, ELK Stack, Splunk.
  • Proficiency in CI/CD tools: Jenkins, GitLab CI, Azure DevOps.
  • Scripting and automation skills in Python, Bash, or Go.
  • Strong understanding of networking, security, and high-availability architectures.
  • Familiarity with incident management, alerting, and troubleshooting production systems.

Good To Have

  • Experience with service mesh technologies (Istio, Linkerd).
  • Knowledge of microservices architecture and cloud-native applications.
  • Familiarity with Site Reliability Engineering principles, SLOs, SLAs, and error budgets.
