
Director, Application Operations - SRE

S&P Global Market Intelligence

15 - 20 years

50 - 55 Lacs

Hyderabad

Posted: 5 months ago


Skills Required

SRE, Risk management, DevOps, Change management, ITSM, Problem management, Information security

Work Mode

Work from Office

Job Type

Full Time

Job Description

The Role: Director, Application Operations, SRE (Site Reliability Engineering)

The Team: This team is part of the global SRE group that provides Site Reliability Engineering services for the critical applications analysts use to conduct business. The Application Operations team is responsible for the stability (uptime), reliability (quality and performance) and engineering of these applications to improve business outcomes, user experience and efficiency. The team operates at the intersection of IT operations and software development, ensuring that our services are not only robust but also agile enough to adapt to ever-evolving business needs.

Impact and Responsibilities: The impact of this role extends far beyond the immediate team. You will be instrumental in shaping the reliability and performance standards of our critical applications, ensuring they meet the highest benchmarks. By driving advancements in automation and cloud technologies, you will contribute significantly to the organization's strategic goals and toil reduction, enhancing both the user experience and operational efficiency. You will nurture team members to be best-in-class through upskilling and cross-skilling.

General & Team Management:
- Ensure the team balances its focus between daily operational tasks and strategic long-term projects
- Drive the adoption of new technologies and processes through training and mentoring
- Lead, mentor, guide, coach and transform an Application Operations team into SREs
- Create and maintain documentation for systems and processes to ensure continuity and knowledge sharing within the team
- Adopt Gen AI to leverage the knowledge repository
- Collaborate with cross-functional teams to ensure seamless integration and support for new technologies and initiatives
- Oversee daily operations and ensure shifts are adequately managed
- Set the roadmap; derive goals for each team member; review, motivate and support team members to make them successful

Stability:
- Build an SRE practice that improves system stability with monitoring and AIOps
- Avert P1/P2 incidents and minimize business impact
- Analyze system vulnerabilities and SPOFs (single points of failure) and address them proactively to improve stability
- Refactor monolithic apps and databases into containerized services to improve delivery and scale
- Work with business users to understand needs and issues, develop root cause analyses and work with cross-functional teams to address them permanently

Reliability:
- Monitor system performance and create strategies to improve it
- Reduce the number of incidents and the time taken to resolve them (MTTR)
- Develop and implement disaster recovery plans to ensure business continuity
- Lead the DevOps transformation to improve delivery of value to the business, reduce costs and manual errors, increase release velocity and improve configuration management

Engineering:
- Take part in architecture and development design reviews (shift-left) for new implementation and integration projects to build SRE best practices into the SDLC
- Continuously look for opportunities to automate tasks, simplify processes and enable self-service to reduce toil

Value Stream Alignment: While alignment as a horizontal lead is expected to begin with, it is expected that you will also take on the role of an SRE value stream lead going forward.
- Ensure smooth inter-working with value streams (VS) to meet objectives and realize value
- Foster two-way knowledge sharing with VS and reduce dependency on SRE
- Help shepherd VS to improve SRE maturity levels; implement and prioritize best practices such as monitoring, post-mortems, toil reduction and retrospectives
- Application-to-User-Journey orientation and transformation

What's in it for you: In this role, you will have the opportunity to collaborate with a diverse and talented team, working on cutting-edge technology solutions to drive efficiency and innovation within the organization. You will be at the forefront of implementing best practices in site reliability engineering, with a strong emphasis on automation, cloud technologies, and performance optimization. You will interface with the value stream leads to improve the SRE practices and maturity levels within the value streams.

What We're Looking For:

Basic Qualifications:
- Bachelor's degree in computer science or equivalent is required, or in lieu, demonstrated equivalent work experience
- 15+ years of experience in the Information Technology domain, including cloud, systems and database administration, networking, performance, and application operations
- Proven experience in IT Operations and/or Site Reliability Engineering, with successful handling of Application Operations in a complex IT setup
- Manage multi-cloud (AWS/Azure) environments
- Engineer and implement proactive monitoring of applications, infrastructure and databases
- Engineer automation to self-heal and mature towards AIOps
- Manage, innovate, and create processes, software and tools that continuously improve the availability, reliability, scalability, latency and efficiency of platforms
- Engineer self-service portals, scalable platforms and repeatable processes that allow product teams to own the entire life cycle of their products, reducing the SRE dependency
- Excellent communication skills with experience in managing, coaching, and building highly effective teams
- Manage and inspire a team of full stack Site Reliability Engineers across regions and time zones, emphasizing collaboration and efficiency
- Establish relationships with business teams and other IT partners
- Identify and measure KPIs such as CSAT/NPS scores, and establish feedback channels that correlate directly with UX
- Manage cost through forecasting consumption, budgeting, tagging assets and tracking cost, disposing of unused allocations and right-sizing, optimizing usage and correlating cost to business value
- Establish incident and defect review processes to help guide and continually improve the stability of applications
- Shape and leverage advanced conceptual thinking to solve complex or entirely novel situations
- Actively pursue innovative solutions that align with the company's tolerance for risk (business and reputational)
- Look at external companies, products and capabilities and how they may accelerate Ratings technology initiatives

Preferred Qualifications:
- Experience in application and data architecture, system design, algorithms, data structures, complexity analysis, and software design
- Ability to architect high-availability applications and servers on cloud, adhering to best practices
- Ability to perform technical deep-dives into code, networking, systems, databases and storage configuration
- Experience working in Agile software product development
- Experience working with stakeholders and collaborating across organizational boundaries
- Configuration management, automation of patching, threat and vulnerability management, security monitoring, network security, endpoint security, cloud application and data security
- Awareness of security frameworks like NIST to address technology, information and resilience risk, information security and risk management
- Support and transform ITSM processes (Incident, Change and Problem management) to align with DevOps maturity

More Jobs at S&P Global Market Intelligence

Title | Location | Experience | Salary
Full Stack / ReactJS Software Developer | Gurugram | 3 - 7 yrs | INR 5 - 9 Lacs
Solution Architect (Java, Angular) | Hyderabad | 15 - 20 yrs | INR 45 - 50 Lacs
Senior Associate, Independent Quality Team | Hyderabad | 10 - 15 yrs | INR 30 - 35 Lacs
Quality Assurance Team Lead | Bengaluru | 6 - 7 yrs | INR 8 - 12 Lacs
Vendor Operations Administrator (Technology) | Hyderabad | 5 - 7 yrs | INR 7 - 9 Lacs

S&P Global Market Intelligence

Financial Services

New York
