Draup - Data Analyst - Tech
Job Summary
We are looking for a highly skilled Big Data & ETL Tester to join our data engineering and analytics team. The ideal candidate will have strong experience in PySpark, SQL, and Python, with a deep understanding of ETL pipelines, data validation, and cloud-based testing on AWS. Familiarity with data visualization tools like Apache Superset or Power BI is a strong plus. You will work closely with our data engineering team to ensure data availability, consistency, and quality across complex data pipelines, and help transform business requirements into robust data testing frameworks.
Key Responsibilities
- Collaborate with big data engineers to validate data pipelines and ensure data integrity across ingestion, processing, and transformation stages.
- Write complex PySpark and SQL queries to test and validate large-scale datasets.
- Perform ETL testing, covering schema validation, data completeness, accuracy, transformation logic, and performance testing (see the illustrative sketch after this list).
- Conduct root cause analysis of data issues using structured debugging approaches.
- Build automated test scripts in Python for regression, smoke, and end-to-end data testing.
- Analyze large datasets to track KPIs and performance metrics supporting business operations and strategic decisions.
- Work with data analysts and business teams to translate business needs into testable data validation frameworks.
- Communicate testing results, insights, and data gaps via reports or dashboards (Superset/Power BI preferred).
- Identify and document areas of improvement in data processes and advocate for automation opportunities.
- Maintain detailed documentation of test plans, test cases, results, and associated dashboards.
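To make the ETL-testing responsibilities above concrete, here is a minimal PySpark sketch of the kind of validation check this role involves (column coverage, row-count reconciliation, and null checks). The S3 paths, column names, and app name are hypothetical placeholders for illustration only, not a description of Draup's actual pipelines or tooling.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

def validate_load(spark, source_path, target_path, key_column="record_id"):
    # Paths and the key column name are hypothetical placeholders.
    source = spark.read.parquet(source_path)
    target = spark.read.parquet(target_path)

    # Schema validation: the target must expose at least the source's columns.
    missing_cols = set(source.columns) - set(target.columns)
    assert not missing_cols, f"Columns missing in target: {missing_cols}"

    # Completeness: no rows should be lost between ingestion and transformation.
    assert source.count() == target.count(), "Row count mismatch between source and target"

    # Quality: the key column must never be null in the target.
    null_keys = target.filter(F.col(key_column).isNull()).count()
    assert null_keys == 0, f"{null_keys} target rows have a null {key_column}"

if __name__ == "__main__":
    spark = SparkSession.builder.appName("etl-validation-sketch").getOrCreate()
    validate_load(
        spark,
        "s3://example-bucket/raw/events/",       # hypothetical source location
        "s3://example-bucket/curated/events/",   # hypothetical target location
    )
    spark.stop()

In practice, checks like these would typically be wrapped in a test framework and wired into CI/CD so they run automatically against each pipeline release.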
Required Skills and Qualifications
- 2+ years of experience in big data testing and ETL testing.
- Strong hands-on skills in PySpark, SQL, and Python.
- Solid experience working with cloud platforms, especially AWS (S3, EMR, Glue, Lambda, Athena, etc.).
- Familiarity with data warehouse and lakehouse architectures.
- Working knowledge of Apache Superset, Power BI, or similar visualization tools.
- Ability to analyze large, complex datasets and provide actionable insights.
- Strong understanding of data modeling concepts, data governance, and quality frameworks.
- Experience with automation frameworks and CI/CD for data validation is a plus.
Preferred Qualifications
- Experience with Airflow, dbt, or other data orchestration tools.
- Familiarity with data cataloging tools (e.g., AWS Glue Data Catalog).
- Prior experience in a product or SaaS-based company with high data volume environments.
Why Join Us?
- Opportunity to work with a cutting-edge data stack in a fast-paced environment.
- Collaborate with passionate data professionals driving real business impact.
- Flexible work environment with a focus on learning and innovation.
Draup is a member of the Ethical AI Governance Group (EAIGG)
As an AI-first company, Draup has been a champion of ethical and responsible AI since day one. Our models adhere to the strictest data standards and are routinely audited for bias.