We are seeking a Cloud Data Engineer who excels in cleaning, transforming, and preparing complex data for integration into machine learning models. In this role, you will work closely with clients to understand their data and ensure it is properly formatted, validated, and optimized for use in our data pipelines.
This position emphasizes data hygiene, reconciliation, and transformation: turning raw, messy data into structured, reliable datasets that flow seamlessly through our systems. The ideal candidate will combine analytical thinking with strong technical skills in Python (NumPy, Pandas) and SQL, bringing a software-engineering mindset to data quality and operational efficiency.
Responsibilities
- Collaborate with client teams to understand data sources, formats, and integration requirements.
- Clean, reconcile, and transform raw customer data to align with our platform’s ingestion standards.
- Build and maintain data transformation and validation pipelines to ensure consistent, high-quality outputs.
- Diagnose and resolve data discrepancies, ensuring accuracy and consistency across systems.
- Develop scripts and workflows for repeatable data processing and format standardization.
- Partner with internal engineering and client success teams to deliver clean, production-ready datasets.
- Quickly learn new libraries and APIs to connect to external data sources, for example the Looker SDK, the BigQuery Python client, or the Snowflake Python connector.
- Document data mappings, transformation logic, and validation procedures for long-term maintainability.
- Create effective visualizations that surface insights from the data.
Qualifications
- 1–3 years of experience in a data-focused role (data science, software engineering, data analysis, data operations, or data engineering).
- Bachelor’s degree in Computer Science, Information Systems, Statistics, or a related field preferred.
- Strong proficiency in Python, particularly NumPy and Pandas for data manipulation and transformation.
- Solid experience with SQL for data querying, joining, and reconciliation across multiple sources.
- Experience building or maintaining ETL/ELT pipelines or data wrangling workflows.
- Attention to detail and a commitment to maintaining high data quality and consistency.
- Ability to interpret and clean messy or incomplete data for use in production systems.
- Excellent communication skills for collaborating with both technical and non-technical stakeholders.
Preferred Skills
- Experience with data validation, schema alignment, or reconciliation across disparate systems.
- Familiarity with version control, testing, or other software engineering best practices.
- Exposure to data visualization or reporting tools for validating and communicating results.
- Background in client-facing or consulting environments where data integrity and delivery were key.
About Ikigai Labs
Ikigai Labs is committed to equal employment opportunity and non-discrimination for all employees and qualified applicants without regard to a person's race, color, sex, gender identity or expression, age, religion, national origin, ancestry or citizenship, ethnicity, disability, military or protected veteran status, genetic information, sexual orientation, marital or familial status, or any other personal characteristic protected under applicable law.