Key Responsibilities
1. Containerization Strategy & Implementation
- Design and maintain containerized environments using Kubernetes (or similar orchestration platforms).
- Define best practices around container usage and image creation, versioning, and lifecycle management.
- Optimize container performance, resource usage, and startup times.
2. Automation & Infrastructure as Code (IaC)
- Automate infrastructure provisioning and configuration.
- Build and maintain CI/CD pipelines that support containerized deployments (e.g., GitHub Actions, GitLab CI, Jenkins, ArgoCD).
- Reduce manual toil through scripting and automation of repetitive tasks.
3. Site Reliability Engineering (SRE) Practices
- Define and monitor SLIs/SLOs/SLAs to ensure service reliability and performance.
- Integrate with observability stacks (e.g., Datadog) for proactive monitoring and alerting.
- Conduct incident response, root cause analysis, and postmortem investigations to improve system reliability.
4. Security & Compliance
- Enforce container security best practices (e.g., image scanning, runtime protection, RBAC policies).
- Collaborate with security teams to ensure compliance with internal and external standards.
5. Collaboration & Enablement
- Work closely with Digital M&S application teams (product owners, functional experts), Quality & Compliance, and technical platform teams to provide self-service container platforms.
- Educate teams on container usage and benefits, deployment workflows, and reliability principles.
- Create documentation, runbooks, and onboarding materials for internal teams.
Technical Skills
- Deep expertise in container technologies, including architecture, security, advanced troubleshooting, and container orchestration at scale in production.
- Hybrid infrastructure expertise: ability to design and operate container platforms across both cloud environments (MS Azure and AWS) and on-premises data centers.
- Solid knowledge of Linux systems, networking, and distributed systems.
- Solid automation skills using Ansible and scripting languages (Python, Bash, PowerShell).
- CI/CD pipeline design for containerized applications using Jenkins, GitLab CI, GitHub Actions, or ArgoCD.
- Strong grasp of SRE principles, including SLI/SLO definition, error budgets, and incident management.
- Observability stack implementation (e.g., using Datadog or Splunk).
- Performance tuning and capacity planning for container platforms.
Additional Skills
- Problem-solving mindset with the ability to troubleshoot complex distributed systems
- Systems thinking: Understands the full stack from infrastructure to application.
- Business-centric thinking: Translates reliability and performance improvements into outcomes that support operational efficiency and customer satisfaction.
- Strong communication skills for working with diverse technical and operational teams
- Collaborative approach to working effectively with site teams and product stakeholders
- Adaptability to work in both cloud and traditional on-premises environments
- Documentation & Enablement: Creates clear runbooks, onboarding guides, and internal tooling.
- Attention to detail for compliance and validation requirements
- Continuous learning mindset to stay current with evolving technologies
- Advanced English proficiency with articulate and concise communication.
Experience Requirements
- 5-7+ years of hands-on experience with containerization technologies
- 3+ years of experience in reliability engineering, DevOps, or infrastructure automation
- Proven experience in hybrid cloud or cloud migration projects.
- Experience with industrial applications and/or working in regulated industries (pharmaceutical, chemical, food & beverage) is highly valued
Education Requirements
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a related technical field
- Master's degree is a plus
- Relevant certifications (e.g., CKA/CKS, Docker DCA, Azure/AWS certifications, Certified SRE Professional, or similar) are highly valued
- Continuous learning through professional development and industry training is a significant advantage