Posted: 1 week ago
On-site | Full Time
Overview

We are looking for a Principal Quality Engineer to lead the quality, governance, and reliability efforts across our cloud-based enterprise data lakehouse platform. This role will define and implement the standards, tools, and processes required to ensure that our data is accurate, timely, secure, and dependable. You will work closely with cross-functional teams, including engineering, product, methodology, and governance, to deliver trusted data pipelines and products that meet both business objectives and compliance standards.

Primary Responsibilities

- Define and drive the quality strategy for data pipelines, curated datasets, and data products across all layers of the lakehouse.
- Act as a thought leader in data quality and governance, influencing architectural decisions and mentoring peers on best practices and emerging trends.
- Design and automate validation processes for both raw and enriched data to ensure quality and to detect schema changes, anomalies, and data gaps.
- Ensure all platform and data releases meet defined quality standards prior to production deployment.
- Lead or contribute to release assessments, deployment validations, and post-release evaluations.
- Design and execute comprehensive test plans for both batch and real-time data pipelines, covering latency, event sequencing, data loss, duplication, reprocessing, performance under load, and scalability across large datasets and concurrent workloads.
- Collaborate with engineering and product teams to incorporate quality gates throughout the development lifecycle.
- Drive lineage, metadata, and change tracking in the lakehouse, partnering with governance and security teams to enforce data classification, access, and lifecycle policies, ensuring full traceability and compliance.
- Conduct root cause analysis for data issues and lead efforts to implement preventive improvements.
- Integrate quality controls into CI/CD pipelines and orchestration workflows.
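As a rough illustration of the kind of automated validation described above, the sketch below checks a batch of records for schema drift (unexpected or missing fields) and data gaps (null values). This is a minimal, self-contained example, not Optum's actual framework; the field names and expected schema are hypothetical.

```python
# Illustrative sketch only: schema-drift and data-gap checks over a batch
# of records represented as plain dicts. Field names are hypothetical.

EXPECTED_SCHEMA = {"patient_id": str, "visit_date": str, "amount": float}

def validate_batch(records):
    """Return a list of issue strings describing schema drift and nulls."""
    issues = []
    for i, rec in enumerate(records):
        # Schema drift: fields added to or dropped from the expected schema
        extra = set(rec) - set(EXPECTED_SCHEMA)
        missing = set(EXPECTED_SCHEMA) - set(rec)
        if extra:
            issues.append(f"record {i}: unexpected fields {sorted(extra)}")
        if missing:
            issues.append(f"record {i}: missing fields {sorted(missing)}")
        # Data gaps: expected fields present but null
        for field in EXPECTED_SCHEMA:
            if field not in missing and rec.get(field) is None:
                issues.append(f"record {i}: null value in {field}")
    return issues

good = {"patient_id": "p1", "visit_date": "2024-01-01", "amount": 10.0}
bad = {"patient_id": None, "visit_date": "2024-01-02"}
print(validate_batch([good, bad]))
```

In a real lakehouse this logic would typically run as a Spark or Delta Lake job against each layer's tables rather than over in-memory dicts, but the shape of the checks is the same.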
- Foster a strong data quality culture by encouraging ownership and accountability within teams.

Required Qualifications

- Good understanding of modern data lakehouse architectures and distributed data processing frameworks (e.g., Spark, Delta Lake).
- Proficiency in SQL and at least one data-centric programming language (e.g., Python or Scala).
- Hands-on experience implementing or using data validation and quality assurance frameworks.
- Hands-on experience with CI/CD practices, automated testing, and data deployment workflows.
- Proven ability to design, automate, and test batch and real-time streaming data pipelines, ensuring reliability, scalability, and performance across diverse data workloads.
- Hands-on experience working with healthcare data, including electronic medical records (EMR), claims datasets, and FHIR-based data sources.
- Hands-on experience with cloud-based data platforms, with exposure to Microsoft Azure services such as Data Lake, Databricks, Data Factory, and related tools.
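To make the CI/CD quality-gate idea concrete, here is a minimal sketch of a release gate that blocks deployment when the validation failure rate exceeds a threshold or row counts drop against the previous load. The metric names and the 1% threshold are illustrative assumptions, not part of the role description.

```python
# Illustrative sketch only: a pre-deployment quality gate of the kind a
# CI/CD pipeline might call before promoting a data release. The metric
# keys and threshold are hypothetical.

def quality_gate(metrics, max_failure_rate=0.01):
    """Return (passed, reason) for a candidate data release."""
    rate = metrics["failed_rows"] / max(metrics["total_rows"], 1)
    if rate > max_failure_rate:
        return False, f"failure rate {rate:.2%} exceeds {max_failure_rate:.2%}"
    if metrics["total_rows"] < metrics["previous_rows"]:
        return False, "row count dropped versus previous load (possible data gap)"
    return True, "ok"

ok, reason = quality_gate(
    {"failed_rows": 2, "total_rows": 1000, "previous_rows": 950}
)
print(ok, reason)  # → True ok
```

In practice a gate like this would read its metrics from the validation job's output and return a non-zero exit code to fail the pipeline stage, but the decision logic is the core of it.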
Optum
Chennai, Tamil Nadu, India
Experience: Not specified
Salary: Not disclosed