Role Summary:
- We are seeking a highly skilled professional to lead initiatives at the intersection of Generative AI (GenAI), Agentic AI frameworks, Root Cause Analysis (RCA), and Full Stack Observability.
- This role is ideal for individuals who can leverage AI-driven insights to enhance incident detection, impact analysis, and system reliability across complex IT environments.
Key Responsibilities:
- Design and implement AI-driven RCA and impact analysis frameworks integrated with observability platforms
- Utilize Agentic AI and GenAI technologies to automate and enhance decision-making in incident response and service reliability
- Integrate observability tools (e.g., AppDynamics, Dynatrace, DataDog, Grafana, CloudWatch, Splunk, ELK) with AI models for intelligent alerting and diagnostics
- Collaborate with SRE, DevOps, and platform teams to define observability KPIs and RCA workflows
- Develop autonomous agents for real-time monitoring, anomaly detection, and resolution
- Drive innovation in predictive analytics and self-healing systems using GenAI capabilities
- Ensure scalability, security, and compliance of AI-integrated observability solutions
Technical Requirements:
Required Skills and Experience:
- Hands-on experience with GenAI and Agentic AI frameworks (e.g., LangChain, LangGraph)
- Strong understanding of Root Cause Analysis (RCA) and Impact Analysis methodologies
- Experience in designing AI-powered automation and monitoring systems
- Familiarity with cloud-native architectures and distributed systems
- Excellent analytical, problem-solving, and communication skills
Additional Responsibilities:
- Ability to develop value-creating strategies and models that enable clients to innovate, drive growth, and increase their business profitability
- Good knowledge of software configuration management systems
- Awareness of the latest technologies and industry trends
- Logical thinking and problem-solving skills, along with an ability to collaborate
- Understanding of the financial processes for various types of projects and the various pricing models available
- Ability to assess current processes, identify improvement areas, and suggest technology solutions
- Knowledge of one or two industry domains
- Client-interfacing skills
- Project and team management
Preferred Skills:
- Technology->Infra_ToolAdministration-ITSM->ServiceNow->ITSM
- Technology->Microservices->Microservices API Management
- Technology->Microsoft Technologies->.Net Application Development
- Technology->Machine Learning->Responsible AI
- Technology->Machine Learning->Generative AI
- Foundational->SDLC->Problem Analysis->Root Cause Analysis