Site Reliability Engineer-L3

4 - 9 years

20 - 30 Lacs

Posted: 1 week ago | Platform: Naukri

Work Mode

Remote

Job Type

Full Time

Job Description

We are looking for a skilled TechOps Lead to manage and maintain the technical operations of our OTT platform. The ideal candidate will have experience in Application Support, Content Delivery Networks, Logging & Triaging, and cloud-based technologies. You will be responsible for ensuring the high availability, scalability, and performance of our platform, and for triaging and detecting issues using trend analysis.

Role & Responsibilities:

  • Must be familiar with end-to-end incident handling.
  • Monitor, identify, and respond to incidents promptly to minimize business impact.
  • Prioritize, classify, and escalate incidents based on severity and urgency.
  • Coordinate and facilitate communication between stakeholders during incidents.
  • Perform root cause analysis and implement preventive measures.
  • Document incidents and resolutions, and generate performance reports.
  • Provide technical support by handling and consulting on BAU work and incidents for the respective applications.
  • Act as an escalation point for user issues and requests, and for L1/L2 support.
  • Report issues to senior management.
  • Define, document, and maintain SLAs, technical documentation, and knowledge bases to support the platform.
  • Monitor application performance and identify areas for improvement.
  • Build and maintain effective, productive relationships with stakeholders in business, development, product, and third-party system providers.
  • Facilitate coordination across L1/L2 and L3/engineering teams to investigate and resolve ongoing platform or application issues impacting the business.
  • Work in shifts as part of a rota covering 24x7. In the event of a major outage or issue, we may ask for flexibility to help provide appropriate cover. Weekend on-call coverage needs to be provided on a rotational/need basis.
  • Understand reliability metrics and enhance automation solutions for auto-healing and incident resolution.
  • Understand and improve applications, and plan for faster MTTD, MTTR, and auto-healing.

Preferred candidate profile:

  • 4 to 7 years in Application Support/SRE or a related field.
  • Experience with any API monitoring tool (experience with Datadog and Coralogix is ideal).
  • Knowledge of CDNs (Akamai, Cloudflare, etc.) and cloud-based technologies (AWS, GCP, etc.).
  • Comfortable with large-scale production systems, configuration management, load balancing, and distributed systems.
  • Must be strong in backend development (80%) with some frontend experience (20%).
  • Experience with troubleshooting tools and techniques for FE, BE, API, etc.
  • Familiar with job scheduling tools such as cron, and experienced with application monitoring tools.
  • Knowledge of web services (SOAP-based and RESTful).
  • Prior experience in L2/L3 support.
  • Well versed in at least one scripting language (Shell, Python, etc.).
  • Strong problem-solving skills and attention to detail.

If you are interested, please share an updated copy of your resume at Jyotsana.bisht@cloud-Kinetics.com

Cloud Kinetics

Information Technology and Services

Bengaluru

51-200 Employees

29 Jobs

    Key People

  • Raghavendra Rao
    Co-Founder & CEO
  • Abhishek Sharma
    Co-Founder & COO
