7.0 - 12.0 years
15 - 30 Lacs
Gurugram, Delhi / NCR
Work from Office
Job Description

We are seeking a highly skilled Senior Data Engineer with deep expertise in AWS data services, data wrangling using Python and PySpark, and a solid understanding of data governance, lineage, and quality frameworks. The ideal candidate will have a proven track record of delivering end-to-end data pipelines for logistics, supply chain, enterprise finance, or B2B analytics use cases.

Role & responsibilities

- Design, build, and optimize ETL pipelines using AWS Glue 3.0+ and PySpark.
- Implement scalable and secure data lakes using Amazon S3, following bronze/silver/gold zoning.
- Write performant SQL using AWS Athena (Presto) with CTEs, window functions, and aggregations.
- Take full ownership of each pipeline, from ingestion through transformation, validation, metadata, and documentation to dashboard-ready output.
- Build pipelines that are not just performant, but audit-ready and metadata-rich from the first version.
- Integrate classification tags and ownership metadata into all columns using AWS Glue Catalog tagging conventions.
- Ensure no pipeline moves to the QA or BI team without completed validation logs and field-level metadata.
- Develop job orchestration workflows using AWS Step Functions integrated with EventBridge or CloudWatch.
- Manage schemas and metadata using AWS Glue Data Catalog.
- Enforce data quality using Great Expectations, with checks for null %, ranges, and referential rules.
- Ensure data lineage with OpenMetadata or Amundsen and add metadata classifications (e.g., PII, KPIs).
- Collaborate with data scientists on ML pipelines, handling JSON/Parquet I/O and feature engineering.
- Prepare flattened, filterable datasets for BI tools like Sigma, Power BI, or Tableau.
- Interpret business metrics such as forecasted revenue, margin trends, occupancy/utilization, and volatility.
- Work with consultants, QA, and business teams to finalize KPIs and logic.

Preferred candidate profile

- Strong hands-on experience with AWS: Glue, S3, Athena, Step Functions, EventBridge, CloudWatch, Glue Data Catalog.
- Programming skills in Python 3.x, PySpark, and SQL (Athena/Presto).
- Proficiency with Pandas and NumPy for data wrangling, feature extraction, and time series slicing.
- Strong command of data governance tools such as Great Expectations and OpenMetadata / Amundsen.
- Familiarity with tagging sensitive metadata (PII, KPIs, model inputs).
- Ability to create audit logs for QA and rejected data.
- Experience in feature engineering: rolling averages, deltas, and time-window tagging.
- BI-readiness with Sigma, with exposure to Power BI / Tableau (nice to have).
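For candidates gauging the expected level, the data-quality checks named in the description (null %, ranges, referential rules) and the audit log for rejected data can be illustrated with a minimal sketch. It uses plain standard-library Python rather than the Great Expectations API, and everything specific in it is an assumption made for illustration: the field names (`shipment_id`, `carrier_id`, `weight_kg`), the thresholds, and the `validate_rows` helper do not come from the posting.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationReport:
    """Outcome of one validation run (hypothetical structure)."""
    null_pct: dict = field(default_factory=dict)  # column -> percent of null values
    rejected: list = field(default_factory=list)  # audit log of (row_index, reason)
    passed: bool = True


def validate_rows(rows, required, ranges, known_carriers, max_null_pct=5.0):
    """Run null-%, range, and referential checks over a list of dict rows."""
    report = ValidationReport()
    total = len(rows)

    # Null % check: fail the run if any required column exceeds the threshold.
    for col in required:
        nulls = sum(1 for r in rows if r.get(col) is None)
        pct = 100.0 * nulls / total if total else 0.0
        report.null_pct[col] = pct
        if pct > max_null_pct:
            report.passed = False

    for i, row in enumerate(rows):
        # Range check: non-null values must fall inside [lo, hi].
        for col, (lo, hi) in ranges.items():
            value = row.get(col)
            if value is not None and not (lo <= value <= hi):
                report.rejected.append((i, f"{col}={value} outside [{lo}, {hi}]"))

        # Referential check: carrier_id must exist in the reference set.
        carrier = row.get("carrier_id")
        if carrier is not None and carrier not in known_carriers:
            report.rejected.append((i, f"unknown carrier_id {carrier}"))

    # Any rejected row fails the run, so the pipeline stops before QA/BI handoff.
    if report.rejected:
        report.passed = False
    return report


# Hypothetical sample data for a logistics-style table.
rows = [
    {"shipment_id": 1, "carrier_id": "C1", "weight_kg": 12.5},
    {"shipment_id": 2, "carrier_id": "C9", "weight_kg": 8.0},   # unknown carrier
    {"shipment_id": 3, "carrier_id": "C2", "weight_kg": -4.0},  # out of range
    {"shipment_id": 4, "carrier_id": None, "weight_kg": 3.2},   # null carrier
]

report = validate_rows(
    rows,
    required=["shipment_id", "carrier_id", "weight_kg"],
    ranges={"weight_kg": (0.0, 1000.0)},
    known_carriers={"C1", "C2"},
    max_null_pct=30.0,
)

for row_index, reason in report.rejected:
    print(f"row {row_index}: {reason}")
```

In a real pipeline the same three kinds of rule would more likely be declared as a Great Expectations suite and the rejected-row log written alongside the output, but the shape of the result (per-column null percentages plus a per-row rejection log) is what the validation-log requirement is asking for.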
Posted 1 day ago
7.0 - 12.0 years
15 - 30 Lacs
Gurugram
Hybrid
Posted 1 day ago