As the Team Lead, Site Reliability Operations (SRO), for our multi-cloud environment spanning Amazon Web Services (AWS) and Google Cloud Platform (GCP), you will lead and mentor a team of Operations Engineers (SRO). You will oversee the day-to-day monitoring, maintenance, and support of Finalsite infrastructure and deployed products, ensuring operational excellence and adherence to SLAs. This role requires a strong leader who can drive process improvements, facilitate effective communication between the Operations and SRE teams, and strategically manage resources across AWS and GCP.

Key Responsibilities:

Lead, mentor, and manage a team of Operations Engineers (SRO), fostering a culture of continuous improvement, accountability, and collaboration.
Oversee the 24/5 monitoring, incident response, and problem management processes for multi-cloud infrastructure and applications in AWS and GCP.
Develop, implement, and enforce operational procedures, runbooks, and best practices to ensure stability, performance, and security.
Collaborate closely with developers, support, data integration, SRE, and other engineering teams to streamline deployment processes, improve system reliability, and reduce operational overhead.
Drive automation initiatives within the operations team to enhance efficiency and reduce manual effort.
Manage infrastructure capacity and optimize resource utilization and costs across AWS and GCP.
Conduct performance reviews, set goals, and support the professional development of team members.
Act as an escalation point for critical operational issues and lead incident management efforts to resolution.
Ensure compliance with security policies, regulatory requirements, and internal audit standards.
Prepare and present operational reports, metrics, and dashboards to senior management.
Oversee cleanup activities for client data and efficient release of unused infrastructure resources.
Provide 24/5 support to the team by working flexible hours.

Qualifications:

Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
6+ years of experience in an Operations or SRE role, with at least 2 years in a leadership or senior engineering role.
Strong hands-on experience managing and supporting infrastructure in multi-cloud environments (AWS and GCP).
Proven ability to lead and motivate a technical team.
Extensive experience with monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, CloudWatch, New Relic).
Solid understanding of networking, security, and database concepts in cloud environments.
Experience with incident management, problem management, and change management processes.
Ability to communicate and collaborate effectively across technical and non-technical teams is essential. Strong verbal English skills are mandatory.
Ability to work effectively in a fast-paced, dynamic environment.

Finalsite

Software Development

Glastonbury CT