Summary
Role description
We are looking for an experienced Lead Data Quality Engineer with a strong background in data engineering to drive enterprise-wide data quality and reliability initiatives. This role will focus on building and scaling data quality frameworks, validation processes, and performance testing across modern, cloud-agnostic data platforms. The ideal candidate is a hands-on technologist who thrives on solving complex data challenges, ensuring trustworthy and high-performing data pipelines, and collaborating across teams to uphold enterprise data standards.
Key Responsibilities
Data Quality & Performance Engineering (Primary Focus)
- Define, implement, and enhance data quality frameworks, validation rules, and monitoring systems for data pipelines and APIs.
- Lead performance and reliability testing for large-scale data lakes, warehouses, and streaming systems.
- Diagnose and optimize bottlenecks in latency, throughput, and system reliability.
- Integrate and manage tools such as Great Expectations, dbt, JMeter, Locust, Grafana, and Prometheus.
- Lead a team of SDETs in test automation and data platform QA.
- Define data quality KPIs with business stakeholders and share actionable insights.
- Partner with Data/Platform Engineers to embed quality checks into CI/CD pipelines and infrastructure deployments.
Data Engineering & Platform Support (Secondary Focus)
- Contribute to the design, development, and optimization of ETL/ELT pipelines and data platforms.
- Support scalable data ingestion, transformation, and storage solutions.
- Ensure data governance, lineage, and compliance across multi-cloud environments.
- Collaborate with analytics and BI teams to guarantee reliable data availability for reporting, APIs, and ML models.
- Contribute to infrastructure automation with Terraform, CloudFormation, or equivalent IaC tools.
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field.
- 12+ years in software engineering, with 5+ years in Data Engineering/Data Quality and 2+ years in a lead role.
- Strong programming skills in Java, Python, and SQL.
- Expertise with automation/testing tools (e.g., Selenium, REST Assured, Postman, Cypress).
- Experience with relational, NoSQL, and document-oriented databases (e.g., PostgreSQL, MongoDB, Elasticsearch).
- Hands-on with big data & distributed systems (Spark, Kafka, Hadoop).
- Proven track record in performance engineering and tuning data systems.
- Experience with at least one major cloud provider (AWS, Azure, or GCP); multi-cloud familiarity preferred.
- Strong communication and stakeholder engagement skills.
Preferred Qualifications
- Cloud certifications (AWS, Azure, or GCP) with a Data or DevOps focus.
- Experience with:
  - Data governance/catalog tools (Collibra, Apache Atlas, Alation).
  - Lakehouse technologies (Delta Lake, Apache Iceberg, Hudi).
  - DataOps/MLOps platforms (MLflow, Kubeflow, TFX).
- Prior exposure to regulated industries or domain-specific environments (e.g., Hospitality).
Skills
Data Quality Assurance, Data Architecture, ETL Design