Job Description
As an SRE Team Lead at our company, you will guide the team in building reliable, scalable, high-performance systems while fostering engineering excellence and operational maturity. You will work extensively with GCP and modern cloud-native infrastructure.

**Key Responsibilities:**

- **Cloud Platforms (Mandatory):**
  - You should have at least 5 years of hands-on experience with GCP, covering areas like Compute, GKE, VPC, IAM, Cloud Storage, Pub/Sub, etc.
  - Experience with AWS for hybrid or migration scenarios is a bonus.
- **Containerization & Orchestration:**
  - You must have at least 3 years of experience with Docker and Kubernetes, including design, deployment, and monitoring.
- **Infrastructure as Code (IaC):**
  - You should have at least 3 years of experience with Terraform (preferred) or CloudFormation.
- **Monitoring, Observability & Logging:**
  - You should have at least 4 years of experience with tools like Prometheus, Grafana, New Relic, ELK, or Splunk.
  - A strong understanding of SLOs, SLIs, golden signals, and end-to-end system health is essential.
- **Automation & Scripting:**
  - You must have at least 3 years of experience with Python or Bash.
  - Experience with configuration management tools like Ansible, Salt, or Chef is required.
- **CI/CD:**
  - You should have at least 3 years of experience working with pipelines/tools like Jenkins or GitLab CI.
- **Reliability & Performance:**
  - You must have at least 5 years of experience designing scalable, highly available distributed systems.
  - Expertise in performance tuning and resource optimization is critical.
- **Incident Response:**
  - You should have at least 3 years of experience as a primary on-call responder.
  - Experience with PagerDuty and mature incident management practices is preferred.
- **Collaboration & Leadership:**
  - Proven experience in mentoring engineers, driving cross-team alignment, and fostering a culture of growth and knowledge-sharing is required.
- **Documentation & Problem Solving:**
  - Strong documentation habits and a methodical, problem-first mindset are essential.
  - You should be able to troubleshoot complex distributed systems end to end.

**Qualifications Required:**

- 5+ years of hands-on experience with GCP (Compute, GKE, VPC, IAM, Cloud Storage, Pub/Sub, etc.)
- Experience with AWS for hybrid or migration scenarios is a plus
- 3+ years of experience with Docker and Kubernetes (design, deployment, monitoring)
- 3+ years with Terraform (preferred) or CloudFormation
- 4+ years with tools like Prometheus, Grafana, New Relic, ELK, or Splunk
- 3+ years with Python or Bash
- Experience with configuration management tools (Ansible, Salt, Chef)
- 3+ years working with pipelines/tools like Jenkins or GitLab CI
- 5+ years designing scalable, highly available distributed systems
- 3+ years as a primary on-call responder
- Strong documentation habits and a problem-first mindset

**About the Company:**

At GlobalLogic, we prioritize a culture of caring, continuous learning, interesting work, balance, and flexibility. You'll be part of a high-trust organization with a rich array of programs, training curricula, and opportunities for personal and professional growth. Join us in creating innovative solutions and shaping the world through intelligent products and services.