As a core member of the Data Transformation team, you'll own the critical business data pipelines that power every Firmable product. You'll work hands-on with DBT, SQL, and Python to transform raw data into reliable, business-ready insights. Partnering closely with stakeholders across product, analytics, and business teams, you'll implement robust transformation logic that ensures data quality and drives real-time decision-making at scale for our customers.
The core responsibilities for the job include the following:
Pipeline Development and Architecture
- Design and build scalable data transformation pipelines using DBT, converting raw data into the world's most accurate business dataset.
- Develop end-to-end data workflows with Python and Airflow, ensuring reliability and business continuity.
- Implement sophisticated business logic through SQL transformations and data modelling.
- Deploy and orchestrate pipelines at scale on AWS cloud infrastructure.
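As a rough illustration of the kind of transformation logic these pipelines implement (a hypothetical Python sketch; the record fields and function names are invented for the example, not Firmable's actual code):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Company:
    name: str
    domain: str

def normalise(raw: dict) -> Company:
    """Normalise one raw record into a business-ready row."""
    return Company(
        name=raw.get("name", "").strip().title(),
        domain=raw.get("domain", "").strip().lower(),
    )

def deduplicate(rows: list[dict]) -> list[Company]:
    """Keep the first record seen for each normalised domain."""
    seen, out = set(), []
    for row in rows:
        company = normalise(row)
        if company.domain and company.domain not in seen:
            seen.add(company.domain)
            out.append(company)
    return out
```

Normalising before deduplicating is what lets near-duplicate raw records (differing only in case or whitespace) collapse into a single clean row.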
Data Quality and AI Integration
- Clean, deduplicate, and enrich datasets using AI and LLMs for maximum data quality.
- Build comprehensive testing frameworks with validation, monitoring, and anomaly detection.
- Automate quality checks using fine-tuned models and GPT-based heuristics to catch issues before they reach production.
- Implement real-time schema drift detection and data validation systems.
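To give a flavour of what schema drift detection can look like (a simplified, hypothetical Python sketch; a production system would typically hook into warehouse metadata or DBT tests rather than inspect records directly):

```python
def detect_schema_drift(expected: dict, batch: list[dict]) -> list[str]:
    """Compare incoming records against an expected schema.

    `expected` maps column name -> Python type; returns a list of
    human-readable issues, empty if the batch conforms.
    """
    issues = []
    for i, record in enumerate(batch):
        missing = expected.keys() - record.keys()
        extra = record.keys() - expected.keys()
        if missing:
            issues.append(f"record {i}: missing columns {sorted(missing)}")
        if extra:
            issues.append(f"record {i}: unexpected columns {sorted(extra)}")
        for col, typ in expected.items():
            # Flag values whose type drifted, e.g. an int column arriving as str.
            if col in record and not isinstance(record[col], typ):
                issues.append(
                    f"record {i}: {col} is {type(record[col]).__name__}, "
                    f"expected {typ.__name__}"
                )
    return issues
```

Running such a check at ingestion time turns silent upstream schema changes into explicit, actionable alerts.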
Stakeholder Collaboration and Innovation
- Partner with product, analytics, and business teams to translate requirements into technical solutions.
- Engage regularly with stakeholders to understand evolving needs and deliver scalable solutions.
- Champion new tooling and best practices, from next-gen model distillation to novel vector databases.
- Drive continuous improvement of our data transformation architecture.
Development Excellence
- Leverage AI coding assistants (Cursor) daily to accelerate development and maintain code quality.
- Write efficient, performance-optimised SQL for complex transformations across large datasets.
- Maintain transformation reliability through robust monitoring and automated workflows.
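One simple form of the monitoring described above is a daily row-count anomaly check (an illustrative sketch using only the Python standard library; the threshold and inputs are invented for the example):

```python
from statistics import mean, stdev

def is_volume_anomaly(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """Flag today's row count if it deviates more than `threshold`
    standard deviations from the recent history of daily counts."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu  # history is perfectly flat; any change is notable
    return abs(today - mu) / sigma > threshold
```

A check like this, wired into the orchestrator after each run, catches a half-empty load before it propagates downstream.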
Core Technical Skills
The core requirements for the job include the following:
- 4+ years building production data transformation pipelines in business-critical environments.
- Expert-level proficiency in data modelling and complex business logic implementation.
- Strong Python expertise, including pandas, numpy, and data pipeline automation.
- Advanced SQL skills with proven ability to optimise performance and design scalable data models.
- Extensive Airflow experience for end-to-end pipeline orchestration and workflow management.
- Experience integrating LLMs with data pipelines.
- Familiarity with AI-assisted development practices and tools such as Cursor.
Platform and Architecture Experience
- Hands-on experience with cloud data platforms and modern data stack technologies.
- Experience with Databricks or similar data platforms such as Snowflake, Redshift, or RDS.
- Solid understanding of data warehousing, dimensional modelling, and transformation best practices.
- Experience with AWS or similar cloud platforms for data pipeline deployment.
Business and Collaboration Skills
- Strong stakeholder engagement abilities with experience translating business requirements to technical solutions.
- Excellent communication skills for working with non-technical teams.
Product Mindset
- You understand how data quality directly impacts customer value.
- A business-focused approach: you care about outcomes, not just pipeline uptime.
Problem-Solving and Teamwork
- Proven track record of building reliable, maintainable data solutions in collaborative environments.
- Excellent problem-solving abilities with a focus on scalable, efficient solutions.
This job was posted by Aishwarya from Firmable.