Job Description
About The Role
Project Role: Data Engineer
Project Role Description: Design, develop, and maintain data solutions for data generation, collection, and processing. Create data pipelines, ensure data quality, and implement ETL (extract, transform, and load) processes to migrate and deploy data across systems.
Must have skills: PySpark, Adobe Suite
Good to have skills: MySQL, Python (Programming Language), AWS Big Data, service design exposure
Minimum 3 years of experience is required
Educational Qualification: 15 years of full-time education
Summary: We are looking for a Data Engineer skilled in Python, PySpark, SQL, AWS Glue, Redshift, and Airflow to build and maintain scalable data pipelines. The ideal candidate will work on cloud-based ETL processes and data integration, and will support analytics across the organization.
Roles and Responsibilities:
Develop and maintain ETL/ELT pipelines using PySpark and AWS Glue (an illustrative sketch follows this list).
Write clean and efficient Python code for data processing and automation tasks.
Integrate, load, and optimize data in AWS Redshift for reporting and analytics.
Use Airflow to schedule, orchestrate, and monitor data workflows.
Work with AWS services such as S3, Glue Catalog, Lambda, and Athena as needed.
Troubleshoot pipeline issues and improve performance and reliability.
Collaborate with analysts and data teams to deliver accurate, high-quality datasets.
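For context, here is a minimal sketch of the kind of PySpark ETL step described in the responsibilities above. The bucket paths, column names, and cleansing rules are illustrative assumptions only, not part of the role description.

```python
# Minimal PySpark ETL sketch: extract raw files from S3, apply basic
# cleansing, and write curated, partitioned output back to S3.
# All paths and column names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_daily_etl").getOrCreate()

# Extract: read raw CSV files landed in S3 (path is illustrative).
orders = spark.read.option("header", True).csv("s3://example-raw-bucket/orders/")

# Transform: type casting, a simple data-quality filter, and de-duplication.
cleaned = (
    orders
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("double"))
    .filter(F.col("order_id").isNotNull())
    .dropDuplicates(["order_id"])
)

# Load: write partitioned Parquet back to S3, e.g. for Redshift Spectrum
# or a downstream COPY into Redshift.
(
    cleaned
    .withColumn("order_date", F.to_date("order_ts"))
    .write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-curated-bucket/orders/")
)
```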
Technical Skills:
Strong programming skills in Python.
3-5 years of hands-on experience with PySpark for distributed data processing.
Experience with AWS Glue (ETL jobs, Glue Catalog).
Knowledge of AWS Redshift (loading, querying, optimization basics).
Experience building workflows in Airflow (an illustrative DAG sketch follows this list).
Solid understanding of SQL and data modeling fundamentals.
Familiarity with AWS cloud environments.
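As a rough illustration of the Airflow workflow experience listed above, the sketch below shows a two-task DAG with a transform step followed by a Redshift load. The DAG id, schedule, and task bodies are hypothetical, and the example assumes Airflow 2.4+ where the `schedule` argument is available.

```python
# Minimal Airflow DAG sketch: orchestrate a transform step and a Redshift
# load step on a daily schedule. Task logic is a placeholder only.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_and_transform():
    # Placeholder for triggering a PySpark/Glue job or other processing step.
    print("running extract/transform step")


def load_to_redshift():
    # Placeholder for a COPY into Redshift or similar load step.
    print("loading curated data into Redshift")


with DAG(
    dag_id="orders_daily_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # assumes Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    transform = PythonOperator(
        task_id="extract_and_transform",
        python_callable=extract_and_transform,
    )
    load = PythonOperator(
        task_id="load_to_redshift",
        python_callable=load_to_redshift,
    )

    transform >> load  # the load runs only after the transform step succeeds
```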
Additional Information:
Bachelor's degree in Computer Science or MCA.
Experience working in a global delivery model with onshore-offshore coordination.