Job Description
As a Major Incident Manager, you will play a crucial role in assessing the impact and severity of critical major incidents. You will gather information and data to support incident analysis and decision-making, and act as the central point of contact during critical incidents, ensuring that all relevant teams are informed and engaged. You will develop and implement an effective communication template and ensure clear and timely communication with internal stakeholders, external partners, and relevant authorities. Maintaining response SLAs and coordinating with external vendors for additional support will also be part of your role.
You will be expected to maintain detailed records of incident response activities, including actions taken, decisions made, and outcomes. Conducting a thorough post-incident analysis to identify lessons learned and areas for improvement will be critical. Implementing changes to enhance incident response capabilities and providing guidance and support to the incident response team during challenging incidents are key aspects of the role.
Your daily activities will include generating comprehensive incident reports, leading restoration efforts for all business- and customer-impacting incidents, and coordinating triage, recovery, and communication during major incidents. You will lead major incident technical bridges and drive all activities towards service restoration. You will also assign related Problem records to resolution teams and coordinate root cause analysis (RCA) through to closure. In addition, you will establish metrics and reporting to create visibility into all Major Incidents and the progress of open Problems, work on a 24x7x365 on-call rotation, and create incident handling plans that define roles, duties, and escalation procedures.
Ensuring that the organization's incident management framework remains current and effective, developing contingency plans for various scenarios, and reviewing and updating incident response plans and procedures as needed are also part of your responsibilities.
To be successful in this role, you should have 7+ years of experience supporting IT operations in a large-scale environment, 5+ years of experience leading the resolution of major incidents, and 5+ years of experience dedicated to Incident and Problem Management. A strong understanding of ITIL and of Incident, Problem, and Change Management processes, as well as a working background in AWS, Azure, ServiceNow, and/or other technologies, will be beneficial. Experience managing ITIL workflows in ServiceNow is also desired. Your skills in Major Incident Management, Incident Reporting, Security Incident Response, and Critical Analysis will be essential for excelling in this role.