The Principal Data Engineer will be a senior individual contributor responsible for designing, building, and optimizing advanced data solutions that power enterprise-wide analytics, AI/ML, and data-driven decision-making. Reporting directly to the Senior Director of Data Platforms, this role will serve as a technical expert and thought partner in scaling modern data architectures, enabling AI orchestration, and delivering robust, secure, and compliant data products. 
This position is highly hands-on, requiring expertise in data engineering, Databricks, Python, AI/ML platforms, and orchestration frameworks. The Principal Engineer will work across functional teams to design and implement high-performance pipelines, ensure platform reliability, and set technical standards for the organization's data engineering practices.
Job Duties & Responsibilities
- Design & Build Data Systems: Architect and implement scalable data pipelines, lakehouse/lake/warehouse environments, APIs, and orchestration workflows to support analytics, AI/ML, and business intelligence. 
- Enable AI & ML at Scale: Partner with Data Science and AI teams to productionize ML models, automate workflows, and enable AI orchestration frameworks (e.g., MLflow, Airflow, Databricks workflows). 
- Technical Leadership: Act as a hands-on subject matter expert in Databricks, Python, Spark, and related technologies, driving adoption of best practices and mentoring other engineers.
- Optimize Performance: Ensure data pipelines and platforms are highly available, observable, and performant at scale through monitoring, automation, and continuous improvement. 
- Ensure Compliance & Security: Build solutions that adhere to data governance, privacy, and regulatory frameworks (HIPAA, SOC 2, Good Clinical Practice (GCP), GDPR) within clinical research, life sciences, and healthcare contexts.
- Collaborate Across Functions: Work closely with platform engineering, analytics, product management, and compliance teams to deliver aligned solutions that meet enterprise needs. 
- Advance Modern Architectures: Contribute to evolving data platform strategies, including event-driven architectures, data mesh concepts, and lakehouse adoption. 
Location
This role is open to candidates working in the United States (remote or hybrid). 
Basic Qualifications
- Bachelor's degree in Computer Science, Engineering, Data Science, or equivalent practical experience.
- 8+ years of data engineering experience in designing, implementing, and optimizing large-scale data systems. 
- Strong proficiency in Python, with production-level experience in building reusable, scalable data pipelines. 
- Hands-on expertise with Databricks (Delta Lake, Spark, MLflow) and modern orchestration frameworks (Airflow, Prefect, Dagster, etc.).
- Proven track record of deploying and supporting AI/ML pipelines in production environments. 
- Experience with cloud platforms (AWS, Azure, or GCP) for building secure and scalable data solutions. 
- Familiarity with regulatory compliance and data governance standards in healthcare or life sciences. 
Preferred Qualifications
- Experience with event-driven systems (Kafka, Kinesis) and real-time data architectures. 
- Strong background in data modeling, lakehouse/lake/warehouse design, and query optimization. 
- Exposure to AI orchestration platforms and generative AI use cases. 
- Contributions to open-source projects or published work in data engineering/ML. 
- Agile development experience, including CI/CD, automated testing, and DevOps practices. 
Mandatory Competencies 
Development Tools and Management - CI/CD
Tech - Agile Methodology 
DevOps/Configuration Mgmt - Cloud Platforms - AWS 
DevOps/Configuration Mgmt - Cloud Platforms - GCP 
Database - Database Programming - SQL 
Data Science and Machine Learning - Databricks
Big Data - Spark
Data Science and Machine Learning - AI/ML
Data Science and Machine Learning - Gen AI
Behavioral - Communication and collaboration