On-site
Full Time
Job Summary
We are looking for a highly skilled Big Data & ETL Tester to join our data engineering and analytics team. The ideal candidate will have strong experience in PySpark, SQL, and Python, with a deep understanding of ETL pipelines, data validation, and cloud-based testing on AWS. Familiarity with data visualization tools like Apache Superset or Power BI is a strong plus.
You will work closely with our data engineering team to ensure data availability, consistency, and quality across complex data pipelines, and help transform business requirements into robust data testing frameworks.
Key Responsibilities
• Collaborate with big data engineers to validate data pipelines and ensure data integrity across ingestion, processing, and transformation stages.
• Write complex PySpark and SQL queries to test and validate large-scale datasets.
• Perform ETL testing, covering schema validation, data completeness, accuracy, transformation logic, and performance testing.
• Conduct root cause analysis of data issues using structured debugging approaches.
• Build automated test scripts in Python for regression, smoke, and end-to-end data testing.
• Analyze large datasets to track KPIs and performance metrics supporting business operations and strategic decisions.
• Work with data analysts and business teams to translate business needs into testable data validation frameworks.
• Communicate testing results, insights, and data gaps via reports or dashboards (Superset/Power BI preferred).
• Identify and document areas of improvement in data processes and advocate for automation opportunities.
• Maintain detailed documentation of test plans, test cases, results, and associated dashboards.
Required Skills and Qualifications
• 2+ years of experience in big data testing and ETL testing.
• Strong hands-on skills in PySpark, SQL, and Python.
• Solid experience working with cloud platforms, especially AWS (S3, EMR, Glue, Lambda, Athena, etc.).
• Familiarity with data warehouse and lakehouse architectures.
• Working knowledge of Apache Superset, Power BI, or similar visualization tools.
• Ability to analyze large, complex datasets and provide actionable insights.
• Strong understanding of data modeling concepts, data governance, and quality frameworks.
• Experience with automation frameworks and CI/CD for data validation is a plus.
Preferred Qualifications
• Experience with Airflow, dbt, or other data orchestration tools.
• Familiarity with data cataloging tools (e.g., AWS Glue Data Catalog).
• Prior experience in a product or SaaS-based company with high-volume data environments.
Why Join Us?
• Opportunity to work with a cutting-edge data stack in a fast-paced environment.
• Collaborate with passionate data professionals driving real business impact.
• Flexible work environment with a focus on learning and innovation.
Draup