Site Reliability Engineer

3 - 6 years

20 - 25 Lacs

Posted: 1 hour ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Role & Responsibilities

1. Infrastructure & Cloud Operations

  • Design, implement, and maintain highly available, scalable infrastructure on the AWS cloud platform
  • Manage AWS services including EC2, RDS, S3, VPC, CloudFormation, Lambda, ECS/EKS, and monitoring services
  • Optimize cloud resource utilization and cost management strategies
  • Ensure security best practices and compliance across cloud infrastructure

2. Production Deployment & CI/CD

  • Lead production deployment processes for enterprise software applications
  • Design and implement robust CI/CD pipelines using tools such as Jenkins, GitLab CI, AWS CodePipeline, or similar platforms
  • Establish deployment strategies including blue-green deployments, canary releases, and rollback procedures
  • Monitor and troubleshoot production systems to ensure minimal downtime and optimal performance

3. Infrastructure as Code & Automation

  • Develop and maintain infrastructure as code using tools like Terraform, CloudFormation, or AWS CDK
  • Create automation scripts and tools to reduce manual operational overhead
  • Implement configuration management using tools such as Ansible, Puppet, or Chef
  • Build self-healing systems and automated monitoring solutions

4. Scripting & Programming

  • Write efficient scripts in Python, Bash, Go, or other relevant programming languages
  • Develop tools for system monitoring, alerting, and operational efficiency
  • Contribute to internal tooling and automation frameworks
  • Debug and optimize existing automation and deployment scripts

5. Networking & Security

  • Configure and manage cloud networking components including VPCs, subnets, security groups, and load balancers
  • Implement network security best practices and troubleshoot connectivity issues
  • Manage DNS, CDN, and other network services
  • Ensure proper network segmentation and access controls

6. Collaboration & Communication

  • Work closely with DevOps, Database Administrators, System Administrators, and Software Development teams
  • Participate in on-call rotation and incident response procedures
  • Lead post-incident reviews and implement preventive measures
  • Communicate technical concepts clearly to both technical and non-technical stakeholders

Required Skills and Experience:

  • Minimum 3 years of experience in Site Reliability Engineering, DevOps, or a similar role
  • 5+ years preferred with demonstrated progression in responsibility and technical expertise
  • Extensive hands-on experience with AWS cloud services and SysOps
  • Proven track record in production deployment of enterprise software systems
  • Strong understanding of CI/CD concepts and implementation experience
  • Proficiency in infrastructure as code tools and methodologies
  • Advanced scripting abilities in Python, Bash, Go, or similar programming languages
  • Solid understanding of cloud networking concepts, security groups, VPCs, and load balancing
  • Experience with containerization technologies (Docker, Kubernetes)
  • Knowledge of monitoring and observability tools (CloudWatch, Prometheus, Grafana, ELK stack)
  • Familiarity with database administration and performance optimization
  • Understanding of security best practices and compliance frameworks
  • Excellent written and spoken English communication skills in a professional setting
  • Strong analytical and problem-solving abilities
  • Experience working in cross-functional team environments
  • Ability to work independently and manage multiple priorities effectively
  • Customer-focused mindset with attention to detail

Good to Have:

  • AWS certifications (Solutions Architect, SysOps Administrator, or DevOps Engineer)
  • Experience with microservices architecture and serverless technologies
  • Knowledge of disaster recovery and business continuity planning
  • Background in performance tuning and capacity planning
  • Experience with agile development methodologies
  • Previous experience in enterprise environments with high availability requirements
