Lead Site Reliability Engineer

8 - 12 years

Posted: 1 day ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

The SRE Lead (Engineering & Reliability) role at Landmark Group requires an experienced and dynamic individual to oversee the reliability, scalability, and performance of critical systems. As the SRE Lead, you will be responsible for implementing SRE practices, leading a team of engineers, and driving automation, monitoring, and incident response strategies. This position combines software engineering and systems engineering expertise to build and maintain high-performing, reliable systems.

Your key responsibilities will include maintaining high availability and reliability of critical services; defining and monitoring SLIs, SLOs, and SLAs; proactively identifying and resolving performance bottlenecks; and establishing incident management processes and on-call rotations. You will lead incident response, root cause analysis, and post-incident reviews, and drive the implementation of the actionable insights they produce. You will also develop and implement automated solutions to reduce manual operational tasks, enhance system observability through metrics, logging, and distributed tracing tools, and optimize CI/CD pipelines for seamless deployments. The role also involves collaborating with software engineering teams and product/engineering teams, and ensuring seamless integration of monitoring and alerting systems across teams.

As a leader and team builder, you will manage, mentor, and grow a team of SREs, promote SRE best practices, and foster a culture of reliability and performance across the organization.

Essential skills for this role include capacity planning and cost optimization; technical expertise in cloud platforms, Kubernetes, infrastructure-as-code tools, Java, distributed systems, databases, load balancing, monitoring and observability, automation, CI/CD, and incident management; and leadership and communication skills.

Preferred qualifications include experience with database optimization, Kafka or other messaging systems, and autoscaling techniques; previous experience in SRE, DevOps, or infrastructure engineering leadership; and an understanding of compliance and security best practices in distributed systems.

Join us at Landmark Group to be a key driver in building and scaling reliable systems in a fast-paced environment, work with cutting-edge technologies, and lead a high-impact team while fostering a culture of reliability and innovation.
