Short department description:
The Allianz Technology A-IF02CDC department delivers a strategic IT Disaster Recovery Resiliency Orchestration (DRRO) service to over 20 Allianz entities, covering hundreds of mission-critical applications. This service ensures operational continuity and business resilience through cutting-edge automation and orchestration technologies.
We are looking for a seasoned Amelia Automation & AIOps Specialist with a strong background in AI-driven automation and orchestration, combined with solid experience in Disaster Recovery (DR) for enterprise IT environments. This role focuses primarily on designing and implementing automation solutions using Amelia, while leveraging your DR knowledge to ensure those workflows are operationally resilient, compliant with regulations, and recovery-ready. You'll be at the intersection of AI, infrastructure automation, and DR orchestration, bringing intelligence and efficiency to critical IT operations.
Key Tasks:
- Design, develop, and maintain intelligent automation workflows using the Amelia AIOps platform, integrating conversational AI, cognitive agents, and adaptive process logic to streamline IT operations.
- Implement and optimize Amelia-driven orchestration solutions by leveraging Generative AI and agentic frameworks (e.g., LangChain, AutoGen, Semantic Kernel, LangGraph) to enable dynamic response and decision-making capabilities.
- Integrate Amelia automation with enterprise IT ecosystems, including ITSM tools (ServiceNow), observability platforms (Splunk, Dynatrace), cloud environments (AWS, Azure, GCP), and DevOps pipelines.
- Develop and maintain reusable automation components, APIs, and integrations (REST, GraphQL, Kafka) to ensure scalable and modular automation architectures.
- Utilize AI/ML models (built using frameworks like TensorFlow, PyTorch, Hugging Face) to enhance anomaly detection, root cause analysis, and predictive automation within Amelia workflows.
- Build and manage infrastructure-as-code (IaC) and automation scripts using tools such as Ansible, Terraform, and scripting languages (Python, Bash, Java) to support orchestration and self-healing capabilities.
- Support the automation of Disaster Recovery (DR) workflows, ensuring failover/failback processes are effectively orchestrated and aligned with RTO/RPO requirements across infrastructure and application layers.
- Lead the onboarding of new applications and services to the DR resiliency automation platform, customizing automation paths to meet organizational recovery policies and compliance needs.
- Collaborate with DR, IT Ops, and Cloud teams to ensure all DR workflows are validated, tested, and audit-ready, with automation logs and dashboards built into the Amelia environment.
- Support the incident and problem management processes of the DRRO service.
- Contribute to the evolution of automation and DR best practices, incorporating the latest advances in AIOps, Generative AI, and cloud-native tooling into operational standards.
Qualification, education, work experience:
Bachelor's or Master's degree in Computer Science, Business Information Systems, or a related field, or equivalent practical experience.
7+ years of experience in AI, Automation, and AIOps, with expertise in enterprise automation and orchestration solutions.
Extensive hands-on experience with Amelia AIOps, including workflow development, system integration, and automation of IT operations.
Proficient in automation scripting (Python, Bash, PowerShell) and developing intelligent workflows for IT operations and disaster recovery.
Strong understanding of disaster recovery principles, including RTO/RPO strategies, and experience automating recovery processes.
Solid experience with cloud platforms (AWS, Azure, GCP), containerization (Docker, Kubernetes), and orchestration tools (Argo, Helm).
Familiar with infrastructure technologies (VMware, networking, storage, backup/recovery, Commvault) and ITSM/ITOM platforms (e.g., ServiceNow, Splunk).
Expertise in Generative AI frameworks (LangChain, AutoGen, Semantic Kernel) and AI/ML techniques using TensorFlow, PyTorch, and Hugging Face.
Skilled in using automation tools (Ansible, Puppet, Terraform) and API integrations (REST, GraphQL, Kafka).
Strong interpersonal skills and ability to consult with both technical and non-technical stakeholders.
Fluent in English; German is a plus.
Relevant certifications in cloud platforms and AI/automation technologies are advantageous.