Sr Site Reliability Engineer I

8 - 13 years

50 - 55 Lacs

Posted: 1 day ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

  • SRE Strategy and Leadership: Develop and implement a comprehensive SRE strategy aligned with the company's goals and objectives. Lead junior members of the team to drive the reliability, performance, and scalability of technology solutions.
  • Observability and Monitoring: Establish observability practices to ensure real-time insights into system performance, availability, and customer experience. Implement monitoring tools, metrics, and dashboards to proactively identify and address potential issues.
  • Reliability Engineering Best Practices: Promote and implement standard methodologies, including error budgeting, chaos engineering, and disaster recovery planning. Cultivate a culture of resilience and reliability within technology.
  • Automation and Efficiency: Champion automation initiatives to streamline operational workflows, deployment processes, and incident response tasks. Leverage automation tools and orchestration to improve reliability and reduce manual intervention.
  • Production Support Optimization: Lead all aspects of the end-to-end production support process, including incident management, problem resolution, and service-level agreement (SLA) compliance. Drive continuous improvement initiatives to enhance operational effectiveness and reduce mean time to resolution (MTTR).
  • Colleague Journeys: Collaborate with multi-functional teams to enhance colleague journeys through seamless and reliable technology experiences.
Qualifications:
  • 8-13 years of experience and a degree, or equivalent experience, in Computer Science, Information Technology, or a related field.
  • Advanced certifications in SRE or related areas are a plus.
  • Leadership and people management skills, with the ability to inspire and empower successful SRE teams.
Required Skills:
  • Hands-on coding of highly available distributed systems in any of the following programming languages: Java, Python, or JavaScript.
  • Knowledge of the modern observability stack: Splunk, Elasticsearch, Prometheus, Grafana.
  • Knowledge of cloud-based SRE practices and experience with public cloud platforms such as AWS, Azure, or Google Cloud.
  • Familiarity with microservices architecture and design.
  • Demonstrated expertise in driving culture change, DevOps practices, and continuous improvement in SRE and production support functions.
  • Deep understanding of observability tools and methodologies, including experience with logging, monitoring, tracing, and performance analysis platforms.
  • Knowledge of ServiceNow or another ticketing tool; ITIL experience.

AMERICAN EXPRESS

Financial Services

New York NY
