Posted: 7 hours ago
Work from Office
Full Time
Hiring SRE Lead for the Hyderabad location
Requirements and Qualifications:
• Bachelor’s or Master’s degree in Computer Science, Information Systems, Engineering, or a related technical field.
• 12+ years of total experience in infrastructure, platform engineering, or software development roles, including at least 3–5 years in an SRE or DevOps leadership role.
• Deep understanding of Linux/Unix systems, networking fundamentals, and containerized environments (Docker, Kubernetes).
• Proven experience managing large-scale production systems, including high-availability, distributed, and event-driven architectures.
• Strong hands-on experience with cloud platforms such as AWS, GCP, or Azure and infrastructure-as-code tools (e.g., Terraform, CloudFormation).
• Proficiency in at least one scripting or programming language (Python, Go, Shell, Java, etc.).
• Demonstrated experience building observability solutions (metrics, logs, traces) and integrating them into proactive monitoring and alerting systems.
• Solid understanding of incident response practices, runbook automation, on-call rotation management, and disaster recovery planning.
• Familiarity with modern CI/CD tools (Jenkins, GitLab CI, Argo CD, Spinnaker) and release automation best practices.
• Strong problem-solving and debugging skills, especially in high-pressure, production-critical environments.
• Excellent leadership, communication, and cross-functional collaboration skills.
Responsibilities:
• Lead the SRE function, owning end-to-end service reliability, observability, incident management, capacity planning, and production readiness.
• Establish SLOs, SLIs, and error budgets in collaboration with product and engineering teams to drive service quality goals.
• Build and maintain highly available, fault-tolerant, and self-healing infrastructure leveraging IaC, automation, and scalable architectures.
• Design and implement monitoring, alerting, and observability platforms using tools like Prometheus, Grafana, Datadog, ELK/EFK stack, or equivalent.
• Drive the evolution of CI/CD pipelines, release automation, and safe deployment practices using GitOps or similar methodologies.
• Lead and refine the incident management lifecycle, including root cause analysis (RCA), incident postmortems, and production runbooks.
• Optimize cost, performance, and scalability of cloud infrastructure across hybrid or multi-cloud environments (AWS, GCP, Azure).
• Champion DevSecOps and SRE best practices, advocating for early detection, chaos engineering, and continuous improvement in resilience engineering.
• Mentor and develop a team of SREs and platform engineers; conduct performance reviews and technical coaching.
• Serve as a key advisor in architectural reviews to ensure systems are built with reliability, scalability, and observability in mind.
• Maintain strong partnerships with Security, Product, QA, and Engineering teams to support agile development and delivery.
GlobalLogic