Site Reliability Engineer (SRE)

4 - 8 years

0 Lacs

Posted: 4 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

Role Overview:

This role focuses on ensuring the reliability, scalability, and performance of cloud-based infrastructure and services on Google Cloud Platform (GCP). Responsibilities include designing, deploying, and managing infrastructure; implementing monitoring solutions; developing and maintaining provisioning scripts; building CI/CD pipelines; ensuring cloud security; analyzing system performance; maintaining backup strategies; and defining service level objectives.

Key Responsibilities:

- Design, deploy, and manage scalable and resilient infrastructure on Google Cloud Platform (GCP).
- Implement robust monitoring solutions (e.g., Prometheus, Grafana) to proactively detect and resolve issues.
- Develop and maintain Terraform, Ansible, or other Infrastructure-as-Code (IaC) scripts for provisioning and managing cloud resources.
- Build and optimize CI/CD pipelines using GitHub Actions, Cloud Build, or Jenkins for automated deployments.
- Ensure cloud security best practices, IAM policies, and compliance standards are met.
- Analyze system performance, troubleshoot latency issues, and implement improvements.
- Design and maintain backup strategies, failover mechanisms, and multi-region deployments.
- Define and monitor service level objectives (SLOs) and service level indicators (SLIs) to maintain high reliability.

Qualifications Required:

- Bachelor's degree in Computer Science from a four-year college or university, or related experience and/or training, or an equivalent combination of education and experience.
- 4-8 years of experience in SRE, DevOps, or Cloud Engineering roles.
- Strong hands-on experience with Google Cloud Platform (GCP) services.
- Experience with Terraform, Ansible, or other Infrastructure-as-Code (IaC) tools.
- Proficiency in scripting/programming languages like Python, Go, or Bash.
- Expertise in CI/CD pipelines and container orchestration using Docker and Kubernetes (GKE preferred).
- Hands-on experience with monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, Cloud Operations Suite).
- Strong knowledge of networking concepts (VPC, load balancers, DNS, VPNs, etc.) in a cloud environment.
- Experience with incident response, on-call rotations, and root cause analysis.
- Solid understanding of security best practices for cloud-based environments.
- Creative thinker and team player.
- Ready to adopt new technologies.
- Excellent communication skills.
- Resourcefulness and troubleshooting aptitude.

Additional Company Details:

The company emphasizes flexibility, with a hybrid work policy in place to ensure teams can perform at their best. It prioritizes work-life balance by encouraging fun activities, games, events, and outings to make the work environment vibrant and engaging.
