Resolve to Save Lives (RTSL) is a global health organization that partners locally and globally to create and scale solutions to the world’s deadliest health threats. Millions of people die from preventable health threats. We collaborate to close the gap between proven, life-saving solutions and the people who need them. Since 2017, we’ve worked with governments and other partners in more than 60 countries to save millions of lives. We work toward a future where people live longer, healthier lives, communities flourish, and economies thrive. This is an ambitious vision, and it inspires us and our partners to make progress every day.
The Digital Team at RTSL is implementing or helping to implement cutting-edge digital tools such as the Simple app, DHIS2, and the Africa Covid Dashboard. We work with national and regional health organizations to accelerate progress and advance the use of digital technologies to save lives through our approach of simplicity, speed, and scale.
Position Summary
The primary purpose of this consultancy is to provide technical expertise to ensure the reliability and performance of our production systems. You will assess and simplify our current infrastructure, implementing and maintaining efficient, scalable solutions. A key part of this position will be to train and mentor the engineering team, to support the adoption of best practices and knowledge sharing. Additionally, you will be responsible for tackling the existing backlog of tasks and suggesting long-term improvements to streamline our operations.
Duration and Location
- This is a 12-month part-time or a 6-month full-time contract
- This position is open to candidates based in India
Core Tasks and Activities
The DevOps Consultant shall provide IT/DevOps services. The services may include, but are not limited to:
- Production Support
- Kubernetes / ArgoCD
- Review Kubernetes clusters and pipelines, then provide a written audit report with recommended actions
- Disentangle our Kubernetes/ArgoCD pipelines and our CI/CD pipelines into clear, independent pipelines
- Monitoring
- Implement proper Prometheus federation across the different systems we're managing
- Costs
- Suggest cost optimizations across our AWS accounts
- Rationalize the online services we are using
- SSO
- Implement SSO for all the online services we are using
- Implement SSO for all the software we are using
- Federate CVH team SSO with RTSL SSO
- Document all DevOps processes for internal use.
- Train the Backend Engineering team so they can fully maintain the environment
Contract Deliverables
Initial deliverables, to be produced during the early phase of the engagement
- Comprehensive document describing the initial assessment of the situation, including:
- Cost analysis (with recommendations to reduce/control the costs)
- Authentication Providers audit
- Vulnerabilities and points of failure, with recommendations on how to render the infrastructure more resilient
- Plan for the proper implementation of Prometheus federation to be validated with the manager
- Plan outlining areas for cost reductions
Final deliverables, to be produced as each task is completed
- AWS Cost reduction report, and tools to monitor costs efficiently
- Single unified Grafana/Prometheus instance with relevant alerts
- Unified SSO for all services we use
- All the tools we're using should be tied to a single identity provider, either by integrating them with our SSO or by switching to an SSO-compliant alternative
- Comprehensive set of documentation covering all DevOps procedures and good practices.
Contract Management
The Independent Contractor will submit all deliverables to the Senior Engineering Manager, who will oversee this contract and monitor progress toward the deliverables.
Qualifications
Education
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
Experience
- 5+ years of software development experience, including at least 3 years in DevOps
- Proven track record of maintaining and supporting production environments
- Experience collaborating with multi-disciplinary and cross-functional teams
Skills & Abilities
- Strong proficiency in Kubernetes
- Strong experience with AWS
- Knowledge of Prometheus and Grafana
- Experience with cost-benefit analysis and ability to translate findings into actionable recommendations
- Ability to train and mentor technical teams
- Strong communication skills with the ability to produce clear documentation of procedures and best practices
Application Process
Interested candidates should submit their CV and a cover letter detailing their suitability for the role.
RTSL believes its programs are strengthened when they are developed and supported by individuals with diverse life experiences whose understanding of social and cultural issues can help make our work and workforce more inclusive. We encourage applications from and provide equal employment opportunities to all qualified applicants without regard to race, color, religion, gender, gender identity or expression, ancestry, sexual orientation, national origin, age, disability, marital status, organ donor status, or status as a veteran. Resolve to Save Lives complies with all applicable US EEO laws.