Roles and Responsibilities

The Leonardo Centre on Business for Society seeks an all-rounder data scientist with excellent Python proficiency, problem-solving skills, and critical thinking to support our team. The role focuses mainly on developing a codebase and data processing pipelines for our core dataset, although the selected candidate will also have the opportunity to drive direct impact through analysis and algorithms in projects involving major universities, industry leaders, and governments.

- Automate exploratory data analysis (EDA) outputs using data wrangling, charts, and LLMs.
- Collaborate with cross-functional teams to integrate data engineering solutions into existing systems.
- Consolidate diverse datasets and integrate them into a central relational database.
- Validate and integrate machine learning models into existing pipelines.
- Develop and integrate pipelines to analyze large volumes of PDF documents on an S3-based data lake.

Desired Candidate Profile

- Strong proficiency in Python, with expertise in libraries such as Polars, Plotly, the OpenAI API, and s3fs.
- Critical thinking and attention to detail.
- Expertise in, or the capacity to develop, elaborate data validation and data quality assurance pipelines on complex datasets.
- Autonomous problem-solver.
- Experience using AI as a coding assistant.