This role is for one of Weekday's clients.

Min Experience: 5 years
Location: Gurugram (Gurgaon)
Job Type: Full-time
Requirements
We are seeking an experienced Data Engineer with strong expertise in AWS, Databricks, PySpark, and SQL to design, develop, and optimize scalable data pipelines. The ideal candidate will have a deep understanding of modern data architectures, ETL frameworks, and cloud-based data ecosystems. You will work closely with data scientists, analysts, and business stakeholders to ensure reliable, high-quality, and timely data delivery that powers key business decisions and analytics initiatives.

This role requires both hands-on technical capability and strategic thinking, as you'll contribute to building robust data foundations and scalable data platforms.
Key Responsibilities
- Data Pipeline Development:
  - Design, build, and maintain scalable ETL/ELT pipelines for both batch and streaming data processing using PySpark and Databricks.
  - Optimize and automate data workflows to ensure efficient and reliable data movement across systems.
  - Develop end-to-end data integration solutions from diverse data sources to centralized data lakes and warehouses.
- Cloud Data Engineering (AWS):
  - Work extensively with AWS data services such as S3, Glue, Kinesis, and Redshift to design and manage modern data architectures.
  - Implement data lake and data warehouse solutions that ensure scalability, security, and performance.
  - Build workflow orchestration using Airflow, AWS Step Functions, or similar tools for automation and scheduling.
- Data Modeling & Optimization:
  - Develop and optimize SQL queries for performance, scalability, and data quality.
  - Design data models and schemas to support analytical workloads and reporting.
  - Ensure data accuracy, consistency, and lineage through validation and quality checks.
- Collaboration & Cross-Functional Support:
  - Work closely with data scientists, analysts, and product teams to understand data requirements and deliver relevant datasets for analytics and machine learning use cases.
  - Partner with platform and DevOps teams to ensure smooth data pipeline deployment and monitoring.
  - Translate complex technical concepts into business-friendly insights and documentation.
- Data Governance & Security:
  - Implement data management best practices, including metadata management, access controls, and compliance with data governance standards.
  - Ensure adherence to security and privacy guidelines across all data solutions.
Key Skills and Qualifications
- 5+ years of professional experience as a Data Engineer or in a similar role in a cloud-based environment.
- Expertise in PySpark for distributed data processing and transformation.
- Advanced proficiency in SQL, including query optimization, performance tuning, and large-scale data manipulation.
- Hands-on experience with Databricks for collaborative data development and pipeline orchestration.
- Strong understanding of the AWS data stack, including S3, Glue, Kinesis, Lambda, and Redshift.
- Experience in building and maintaining data lakes and data pipelines (batch and streaming).
- Proficiency in workflow orchestration tools such as Airflow or AWS Step Functions.
- Familiarity with data versioning, CI/CD pipelines, and infrastructure-as-code (IaC) is a plus.
- Strong problem-solving skills, attention to detail, and ability to work in fast-paced, agile environments.