Site Reliability Engineer

2 - 7 years

2 - 7 Lacs

Posted: 21 hours ago | Platform: Foundit

Skills Required

CI/CD

Work Mode

On-site

Job Type

Full Time

Job Description

Key Responsibilities:

  • System Reliability: Monitor, maintain, and improve system uptime and availability.
  • Infrastructure as Code (IaC): Design, implement, and manage infrastructure using tools such as CloudFormation, Terraform, Ansible, or Puppet.
  • Automation: Develop and maintain CI/CD pipelines and deployment scripts to streamline software releases.
  • Containerization: Manage and orchestrate application containers using Docker Swarm and AWS ECS.
  • Monitoring and Alerting: Set up and maintain monitoring tools like CloudWatch, Datadog, Zenduty, and New Relic for proactive issue resolution.
  • Scalability and Performance: Work with development teams to optimize application and infrastructure performance.
  • Security: Implement and maintain security best practices across development and operations pipelines.
  • Incident Management: Participate in incident response, root cause analysis, and preventive measures.
  • Documentation: Maintain clear documentation of system architecture, deployment processes, and best practices.
  • Collaboration: Facilitate communication and knowledge sharing between development, operations, and other teams.

Qualifications:

  • Bachelor's degree in Computer Science, Information Technology, or related field.
  • Minimum 2 years of experience in DevOps, System Operations, or SRE roles.
  • Strong Linux knowledge and shell scripting skills.
  • Proficiency with the AWS cloud platform and its services.
  • Understanding of networking, security, and secure infrastructure best practices.
  • Knowledge of ELK stack and Kafka.
  • Experience with Docker Swarm or AWS ECS for container orchestration.
  • Hands-on experience with CloudFormation, Terraform, and Ansible.
  • Familiarity with CI/CD pipelines and version control (GitLab CI, Jenkins, Git).
  • Working knowledge of databases (MySQL, Postgres).
  • Willingness to participate in L1 incident response rotation.
  • AWS certifications (e.g., AWS Certified DevOps Engineer, AWS Certified Solutions Architect) are a plus.

People Group

Human Resources / Staffing

San Francisco
