About The Role
The DevOps team is responsible for building and running the ecosystem that delivers services to our customers internally and externally. They are embedded in project tribes to collaborate and ensure project objectives are achieved within time, cost, and quality constraints. In true DevOps spirit, they will improve the DevOps process and ensure Cloud Infrastructure architectural integrity, automating where possible and ensuring production website stability, integrity, performance, and supportability.
Required Skills & Experience
- Degree in IT/Software Engineering or similar, or equivalent practical experience
- 2-4 years of experience in DevOps supporting cloud environments, specifically in AWS or GCP
- 1+ year of experience with automation tools and refactoring
- Excellent technical problem-solving skills which you can quickly draw on in unfamiliar situations
- Exposure to Agile/DevOps principles and CI/CD tools
- Good writing skills and demonstrated experience in documenting work
- Good capabilities in source control technologies such as Git
- Good capabilities with Python/Bash/PowerShell
- Familiarity with multiple operating systems, particularly Linux
- Good spoken and written English
Must-haves
- AWS certification or similar Cloud certification and working cloud support experience
- Experience with Docker or Kubernetes (EKS)
- Experience with Terraform
- Experience in setting up Prometheus and Grafana in Amazon EKS
Good-to-have
- Experience with CDN tools
- Experience using cloud security tools
- Experience with multiple public cloud providers (AWS/GCP/Azure)
- Experience with SQL / MongoDB / PostgreSQL / GraphQL
- Experience with content management tools and front-end applications
Key Responsibilities
- To support the DevOps process for web-based products hosted on cloud infrastructure, specifically:
- To respond to and complete tickets, meet SLAs, and manage Reporter expectations.
- To collaborate with assigned tribe change streams to deliver project/change objectives:
- Understand requirements, and support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
- Build software and systems to manage infrastructure and applications through automated deployment.
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve performance, reliability, scalability, security, and velocity.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- To monitor and respond to alerts, issues, and incidents about cloud infrastructure (and corporate infrastructure as required).
- Practice sustainable incident response and provide appropriate communications and blameless post-mortems.
- To drive DevOps process and Cloud infrastructure improvements as part of service and security improvement roadmap, specifically:
- Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation, and refinement.
- To support training and learning by sharing knowledge with the Tech team and taking responsibility for own professional development.
- Explore and evaluate new technologies and solutions to push our capabilities forward.
- To articulate and escalate risks and issues, provide recommended solutions to problems, and implement them.
- Document procedures, configuration changes, and guidelines.
- To maintain cloud infrastructure and networking as per Cloud policy, standards, and governance requirements.
Other Duties
Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties, or responsibilities that are required of the employee for this position. Duties, responsibilities, and activities may change at any time without notice.
After-hours support, as agreed, may be required, for example on-call support, releases that need to be done outside business hours due to the potential risk of disruption to the business, and incident work that needs to be resolved as a priority.
Skills
- Amazon Web Services (AWS), Kubernetes, Terraform, and Docker