Site Reliability Engineer - Platform Engineering

4 years

0 Lacs

Posted: 6 days ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

This role is for one of Weekday's clients.

Min Experience: 4 years

Job Type: full-time

Requirements

We are looking for an experienced and motivated Site Reliability Engineer (SRE) - Platform Engineering to join our growing technology team. In this role, you will be responsible for designing, building, and maintaining scalable, resilient, and secure infrastructure platforms that support business-critical applications and services. The SRE will work at the intersection of software development and systems engineering to ensure the availability, performance, and reliability of our platforms.

This role requires deep expertise in automation, cloud-native technologies, monitoring, and platform operations. The ideal candidate is passionate about solving complex infrastructure challenges, streamlining deployment pipelines, and building highly reliable systems.

Key Responsibilities

  • Platform Engineering: Design, implement, and optimize platform services and infrastructure to ensure high availability, scalability, and performance.
  • Reliability & Resilience: Build self-healing and fault-tolerant systems while proactively identifying and eliminating reliability risks.
  • Automation: Develop Infrastructure as Code (IaC) solutions using tools like Terraform, Ansible, or CloudFormation to automate infrastructure provisioning and configuration.
  • Monitoring & Observability: Implement monitoring, logging, and alerting systems using tools such as Prometheus, Grafana, ELK, or Datadog to track platform health and performance.
  • Incident Management: Troubleshoot incidents, perform root cause analysis, and ensure timely resolution while minimizing downtime and customer impact.
  • DevOps & CI/CD: Collaborate with development teams to enhance CI/CD pipelines for seamless deployment and integration, ensuring reliability in production environments.
  • Cloud Infrastructure: Manage cloud environments (AWS, Azure, or GCP) and optimize for cost, security, and performance.
  • Security & Compliance: Implement security best practices, monitor vulnerabilities, and ensure compliance with industry standards across infrastructure and platforms.
  • Collaboration: Partner with software engineers, product teams, and IT operations to align infrastructure capabilities with business requirements.
  • Continuous Improvement: Analyze existing infrastructure and processes, identify areas for improvement, and implement best practices for operational efficiency.
  • Capacity Planning: Forecast infrastructure requirements, ensuring the platform is always prepared to handle current and future workloads.

Qualifications & Skills

  • Bachelor's degree in Computer Science, Information Technology, or a related field. Equivalent practical experience may be considered.
  • 4+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering.
  • Strong proficiency with cloud platforms (AWS, Azure, or GCP).
  • Hands-on experience with Infrastructure as Code (Terraform, Ansible, or CloudFormation).
  • Solid understanding of Linux systems administration, networking, and container orchestration (Docker, Kubernetes).
  • Experience with CI/CD pipelines (Jenkins, GitLab CI, or similar tools).
  • Proficiency in scripting/programming languages such as Python, Go, Bash, or Java.
  • Strong knowledge of monitoring and observability tools (Prometheus, Grafana, ELK, Datadog, Splunk).
  • Familiarity with incident response and on-call support practices.
  • Knowledge of security best practices and compliance frameworks.
  • Excellent problem-solving, debugging, and analytical skills.
  • Strong communication and collaboration abilities to work effectively with cross-functional teams.
