Ent Apps Service Reliability Advisor

3 - 6 years

20 - 25 Lacs

Posted: 1 day ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

We are seeking a skilled and experienced Service Reliability Advisor to join our diverse team as part of the newly created Service Reliability Centre (SRC). In this role, you will help improve the availability and performance of Arm's business-critical applications by utilising Arm's AI Operations (AIOPS) and observability platforms. You will collaborate closely with development and engineering teams to build and maintain robust observability and response processes.

Responsibilities:

  • Lead the analysis and resolution of application incidents across platforms including SAP, Salesforce, O365 and Slack.
  • Work with third parties and application teams to expand monitoring coverage, define alert thresholds, and onboard new applications and services into SRC support.
  • Drive proactive monitoring, tuning, and optimization of systems using Dynatrace and other observability tools.
  • Identify opportunities to adopt automation to support the AIOPS platform.
  • Conduct root cause analysis of incidents and implement preventive measures.
  • Manage the escalation of incidents to suppliers and Arm's technical on-call rotas as appropriate.
  • Log all issues in the Service Management Tool and handle them to completion within EIT service levels and the quality criteria matrix.
  • Work on a shift pattern within a 24/7/365 operating model, while being able to work independently and flexibly in response to emergencies or critical issues.

Required Skills and Experience:

  • 3-6 years of hands-on experience in application operations, development or support roles.
  • Strong experience with observability tools, preferably Dynatrace (or Datadog, Splunk, Uptrends, Prometheus, etc.), for real-time monitoring, alerting, and diagnostics.
  • Proficiency in one or more scripting or programming languages (e.g., Python, Java, PowerShell, .NET, Node.js, Ansible or JavaScript).
  • Practical knowledge of application automation, including writing and maintaining playbooks.
  • Proficient in ticket management via an ITSM platform such as ServiceNow.
  • Experience leading incident response, driving service restoration and coordinating root cause analysis.
  • Effective communicator within a team with a proactive approach and personal accountability for outcomes.
  • Ability to analyse incident patterns and metrics to proactively recommend reliability improvements.

Nice To Have Skills and Experience:

  • Experience working with enterprise business applications such as SAP and Salesforce.
  • Experience within a technical monitoring and alerting toolset.
  • Experience of supporting SaaS platforms such as Slack, Office 365 and Zoom.
  • Experience in developing an APM/Observability platform.
  • Background in automation and DevOps practices.

ARM Embedded Technologies

Technology / Embedded Systems

San Jose
