Job Description:
Senior Delivery Manager (Production Support and DevOps)
The person is responsible for ensuring the smooth and reliable operation of production systems and applications, acting as a point of contact for incidents and ensuring the efficient resolution of issues. They also play a key role in incident management, root cause analysis, and continuous improvement efforts. The person should have excellent leadership and people management skills and should be able to lead a large team of Production Support and DevOps Engineers.
Key roles and responsibilities:
Incident Management and Support:
Monitoring and Troubleshooting:
Continuously monitor systems and applications for performance issues, incidents, and alerts, and proactively respond to incidents.
Issue Resolution:
Diagnose and resolve production issues using advanced troubleshooting techniques.
Root Cause Analysis:
Perform in-depth analysis to identify the root causes of incidents and prevent recurrence.
Documentation and Communication:
Create and maintain documentation related to production issues and resolutions, and effectively communicate with stakeholders, including development and operations teams.
Incident Management:
Oversee the incident management process, including prioritization, escalation, and resolution, ensuring timely and effective incident resolution.
System Performance and Optimization:
Performance Monitoring:
Monitor system performance metrics, identify bottlenecks, and recommend solutions for performance optimization.
Process Improvement:
Implement and maintain processes and procedures to improve production support efficiency and reduce downtime.
Automation:
Identify and implement automation opportunities to streamline repetitive tasks and reduce manual effort.
Data Analysis:
Analyze data related to production performance, incident trends, and support requests to identify areas for improvement and optimization.
Cross-Functional Collaboration:
Collaboration with Development and Operations:
Work closely with development, operations, and other relevant teams to ensure seamless software deployment and integration.
Communication and Reporting:
Provide regular reports on system performance, incident status, and support metrics to senior management and stakeholders.
On-Call Support:
Participate in on-call rotations and respond to production issues after business hours.
Other Responsibilities:
Training and Documentation:
Develop and deliver training materials and documentation to support production support teams.
Process Improvement:
Identify and implement improvements to production support processes and procedures.
Knowledge Management:
Maintain and update knowledge databases and documentation to support troubleshooting and incident resolution.
Continuous Improvement:
Drive continuous improvement initiatives to enhance the overall efficiency and reliability of production support.
Technical Skills:
Excellent knowledge of ServiceNow, New Relic, and AWS Cloud, as well as application, system, network, cloud, and DevOps technologies.
Experience:
12+ years
Certification:
ITIL and AWS certifications are desired.
We offer you a competitive total rewards package, continuing education and training, and tremendous potential with a growing worldwide organization.
DISCLAIMER:
Nothing in this job description restricts management's right to assign or reassign duties and responsibilities of this job to other entities, including but not limited to subsidiaries, partners, or purchasers of Alight business units.