Site Reliability Engineer

Sails Software Inc

6 years

0 Lacs

visakhapatnam andhra pradesh india

Posted: 1 month ago

Skills Required

reliability aws kubernetes devops automation support microservices design code terraform provisioning security scalability strategies monitoring logging metrics drive configuration management orchestration service analysis development integration deployment networking data audits compliance engineering iam vpc scripting python firewall api leadership collaboration communication documentation reporting certifications datadog gcp jenkins automate test containerization docker helm sonarqube sonar

Work Mode

On-site

Job Type

Full Time

Job Description

SRE - AWS

Job Summary

We are looking for an experienced and driven Senior Site Reliability Engineer (SRE) to architect, implement, and maintain robust cloud infrastructure. This role demands a deep understanding of AWS, Kubernetes, ECS, and the ability to build scalable, secure, and highly available infrastructure from scratch. The ideal candidate will be a strong advocate for DevOps principles, automation, and reliability, and will possess the skills to support and optimize complex microservices-based architectures.

Key Responsibilities

Infrastructure Design & Implementation

• Design and build highly scalable, fault-tolerant, and secure cloud infrastructure using AWS, Kubernetes, and ECS.

• Lead efforts in infrastructure as code (IaC) using tools like Terraform or CloudFormation.

• Develop and enforce best practices for infrastructure provisioning, security, and cost optimization.

System Reliability & Performance

• Ensure availability, performance, scalability, and security of production systems.

• Implement observability strategies including monitoring, logging, and alerting using tools such as Prometheus, Grafana, ELK, or Datadog.

• Analyse system performance metrics and proactively identify potential issues and bottlenecks.

DevOps & Automation

• Build and maintain CI/CD pipelines to streamline code deployments across environments.

• Drive automation in infrastructure provisioning, configuration management, and operational tasks.

• Ensure repeatable and reliable deployments using containers and orchestration tools like Kubernetes and ECS.

Service Management

• Own the SRE lifecycle, including incident management, postmortems, root cause analysis, and runbook creation.

• Collaborate closely with development and QA teams to ensure seamless microservices integration, deployment, and lifecycle management.

• Maintain service-level objectives (SLOs), service-level agreements (SLAs), and error budgets.
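
As a rough illustration of the error-budget arithmetic behind the SLO item above: an availability SLO implies a fixed allowance of downtime per window, and the budget is whatever part of that allowance has not yet been spent. The sketch below assumes a simple time-based availability SLO over a 30-day window. The function names and the example figures are hypothetical and do not come from the listing.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Return the downtime (in minutes) an availability SLO allows over the window.

    slo_target is a fraction, e.g. 0.999 for "three nines" (assumed example value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Return the fraction of the error budget still unspent.

    A negative result means the budget for the window has been exceeded.
    """
    budget = error_budget_minutes(slo_target, window_days)
    return 1.0 - downtime_minutes / budget


if __name__ == "__main__":
    # Hypothetical service: 99.9% availability target, 10.8 minutes of downtime so far.
    print(f"budget: {error_budget_minutes(0.999):.1f} min")
    print(f"remaining: {budget_remaining(0.999, 10.8):.0%}")
```

Teams often track the remaining fraction to decide when to slow down releases; request-based SLOs would count failed requests instead of minutes.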

Security & Compliance

• Implement and enforce cloud security best practices for networking, identity and access management, and data protection.

• Support audits, compliance assessments, and vulnerability remediation.

• Monitor for security anomalies and work with security teams to respond to threats.

Technical Skills

• 6+ years of hands-on experience in Site Reliability Engineering, DevOps, or Cloud Engineering.

• Expertise in AWS services such as EC2, S3, RDS, IAM, VPC, Lambda, CloudWatch, etc.

• Strong knowledge of Kubernetes and container orchestration best practices.

• Experience managing services on Amazon ECS (Fargate or EC2).

• Proficient in infrastructure-as-code tools like Terraform, CloudFormation, or Pulumi.

• Skilled in scripting languages such as Python, Bash, or Go.

• Solid grasp of networking, load balancing, DNS, and firewall rules in cloud environments.

• Deep understanding of microservices architectures, API gateways, and service meshes.

Soft Skills

• Proven leadership and cross-functional collaboration skills.

• Strong problem-solving and incident-resolution mindset.

• Clear communication, documentation, and stakeholder reporting abilities.

• Passion for continuous improvement and automation.

Preferred Qualifications

• AWS certifications such as AWS Certified DevOps Engineer, Solutions Architect – Professional, or equivalent.

• Familiarity with service meshes like Istio or Linkerd.

• Experience with serverless architectures and event-driven systems.

• Knowledge of regulatory compliance (SOC2, ISO 27001, GDPR) in cloud environments.

Skills – AWS Cloud, CI/CD, EC2, Kubernetes, Grafana, Datadog, Python

Key Responsibilities:

Cloud Platform: GCP

• Infrastructure Automation: Design, implement, and manage infrastructure as code using Terraform to provision and manage GCP resources.

• Container Orchestration: Deploy and manage Kubernetes clusters, ensuring efficient operation of containerized applications.

• Continuous Integration/Continuous Deployment (CI/CD): Develop and maintain CI/CD pipelines using Jenkins to automate application build, test, and deployment processes.

• Containerization: Collaborate with development teams to containerize applications using Docker and manage deployments with Helm Charts.

• Code Quality Assurance: Integrate and manage SonarQube to ensure code quality and security standards are met.

• Monitoring and Logging: Implement and manage monitoring solutions using Datadog to ensure system health, performance, and security.

• Collaboration: Work closely with cross-functional teams, including developers, QA, and operations, to streamline processes and improve productivity.

Requirements:

• Experience: 5+ years in DevOps or cloud engineering roles, with at least 3 years of relevant experience in the specified technologies.

• Technical Proficiency:

o Hands-on experience with GCP services and architecture.

o Proficiency in Terraform for infrastructure as code implementations.

o Strong understanding and experience with Kubernetes and Docker.

o Experience in setting up and managing CI/CD pipelines using Jenkins.

o Familiarity with Helm Charts for application deployment.

o Experience with SonarQube for code quality analysis.

o Proficiency in monitoring and logging tools, particularly Datadog.

• Scripting Skills: Proficiency in scripting languages such as Bash or Python is an added advantage.

• Soft Skills:

o Strong problem-solving abilities and analytical thinking.

o Excellent communication skills, both verbal and written.

o Ability to work collaboratively in a team environment.

o Strong organizational and time management skills.

Skills – Terraform, Kubernetes clusters, Docker, GCP, SonarQube
