Senior Site Reliability Engineer

6 - 11 years

7 - 11 Lacs

Posted: 12 hours ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

About Us
At Sails, we help leading companies design, develop, and operate the products and services that will define tomorrow's world. We specialize in envisioning, designing, engineering, and managing digital goods and experiences for high-growth organizations striving to disrupt through innovation and velocity. Our experience helps businesses in fast-growing areas, including hi-tech, manufacturing, banking and financial services, insurance, consumer services, public services, and healthcare, achieve their goals.

Our USPs

Digital Innovation
Passionate Approach
Transparent Business Model

Job Summary

We are looking for an experienced and driven Senior Site Reliability Engineer (SRE) to architect, implement, and maintain robust cloud infrastructure. This role demands a deep understanding of AWS, Kubernetes, and ECS, along with the ability to build scalable, secure, and highly available infrastructure from scratch. The ideal candidate will be a strong advocate for DevOps principles, automation, and reliability, and will have the skills to support and optimize complex microservices-based architectures.

Key Responsibilities

Infrastructure Design & Implementation

  • Design and build highly scalable, fault-tolerant, and secure cloud infrastructure using AWS, Kubernetes, and ECS.
  • Lead efforts in infrastructure as code (IaC) using tools like Terraform or CloudFormation.
  • Develop and enforce best practices for infrastructure provisioning, security, and cost optimization.

System Reliability & Performance

  • Ensure availability, performance, scalability, and security of production systems.
  • Implement observability strategies including monitoring, logging, and alerting using tools such as Prometheus, Grafana, ELK, or Datadog.
  • Analyse system performance metrics and proactively identify potential issues and bottlenecks.

DevOps & Automation

  • Build and maintain CI/CD pipelines to streamline code deployments across environments.
  • Drive automation in infrastructure provisioning, configuration management, and operational tasks.
  • Ensure repeatable and reliable deployments using containers and orchestration tools like Kubernetes and ECS.

Service Management

  • Own the SRE lifecycle, including incident management, postmortems, root cause analysis, and runbook creation.
  • Collaborate closely with development and QA teams to ensure seamless microservices integration, deployment, and lifecycle management.
  • Maintain service-level objectives (SLOs), service-level agreements (SLAs), and error budgets.

Security & Compliance

  • Implement and enforce cloud security best practices for networking, identity and access management, and data protection.
  • Support audits, compliance assessments, and vulnerability remediation.
  • Monitor for security anomalies and work with security teams to respond to threats.

Technical Skills

  • 6+ years of hands-on experience in Site Reliability Engineering, DevOps, or Cloud Engineering.
  • Expertise in AWS services such as EC2, S3, RDS, IAM, VPC, Lambda, CloudWatch, etc.
  • Strong knowledge of Kubernetes and container orchestration best practices.
  • Experience managing services on Amazon ECS (Fargate or EC2).
  • Proficient in infrastructure-as-code tools like Terraform, CloudFormation, or Pulumi.
  • Skilled in scripting languages such as Python, Bash, or Go.
  • Solid grasp of networking, load balancing, DNS, and firewall rules in cloud environments.
  • Deep understanding of microservices architectures, API gateways, and service meshes.

Soft Skills

  • Proven leadership and cross-functional collaboration skills.
  • Strong problem-solving and incident-resolution mindset.
  • Clear communication, documentation, and stakeholder reporting abilities.
  • Passion for continuous improvement and automation.

Preferred Qualifications

  • AWS certifications such as AWS Certified DevOps Engineer - Professional, AWS Certified Solutions Architect - Professional, or equivalent.
  • Familiarity with service meshes like Istio or Linkerd.
  • Experience with serverless architectures and event-driven systems.
  • Knowledge of regulatory compliance (SOC 2, ISO 27001, GDPR) in cloud environments.

Open Date: Jul-04-2025

Sails Software Solutions

Software Development

Novi, MI
