Posted: 2 days ago
On-site | Full Time
We are looking for an experienced Senior Data Engineer to lead the development of scalable, AWS-native data lake pipelines with a strong focus on time series forecasting and upsert-ready architectures. The role requires end-to-end ownership of the data lifecycle, from ingestion through partitioning and versioning to BI delivery. The ideal candidate is highly proficient in AWS data services, PySpark, and versioned storage formats such as Apache Hudi and Iceberg, and understands the nuances of data quality and observability in large-scale analytics systems.
Responsibilities
● Design and implement data lake zoning (Raw → Clean → Modeled) using Amazon S3, AWS Glue, and Athena.
● Ingest structured and unstructured datasets including POS, USDA, Circana, and internal sales data.
● Build versioned, upsert-friendly ETL pipelines using Apache Hudi or Iceberg (see the PySpark/Hudi sketch after this list).
● Create forecast-ready datasets with lagged, rolling, and trend features for revenue and occupancy modeling.
● Optimize Athena datasets with partitioning, CTAS queries, and metadata tagging.
● Implement S3 lifecycle policies, intelligent file partitioning, and audit logging.
● Build reusable transformation logic using dbt-core or PySpark to support KPIs and time series outputs.
● Integrate robust data quality checks using custom logs, AWS CloudWatch, or other DQ tooling.
● Design and manage a forecast feature registry with metrics versioning and traceability.
● Collaborate with BI and business teams to finalize schema design and deliverables for dashboard consumption.
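For illustration only, a minimal PySpark sketch of the kind of pipeline described above: it derives lagged and rolling revenue features for forecasting and writes the result to the Modeled zone as a Hudi upsert. The table, column names (store_id, sale_date, revenue), and S3 paths are hypothetical placeholders, and the job is assumed to run on Glue or EMR with the Hudi Spark bundle on the classpath.

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

# Hudi requires Kryo serialization; the Hudi Spark bundle is assumed to be available.
spark = (SparkSession.builder
         .appName("sales-forecast-features")
         .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
         .getOrCreate())

# Hypothetical Clean-zone input: one row per store per day with a revenue figure.
clean = spark.read.parquet("s3://example-bucket/clean/sales/")

# Lagged and rolling features for forecasting (per store, ordered by date).
w = Window.partitionBy("store_id").orderBy("sale_date")
roll_7 = w.rowsBetween(-6, 0)
features = (clean
            .withColumn("revenue_lag_1", F.lag("revenue", 1).over(w))
            .withColumn("revenue_lag_7", F.lag("revenue", 7).over(w))
            .withColumn("revenue_roll_7d_avg", F.avg("revenue").over(roll_7))
            .withColumn("record_id", F.concat_ws("-", "store_id", "sale_date"))
            .withColumn("updated_at", F.current_timestamp()))

# Upsert-friendly write to the Modeled zone using Apache Hudi.
hudi_options = {
    "hoodie.table.name": "sales_features",
    "hoodie.datasource.write.recordkey.field": "record_id",
    "hoodie.datasource.write.precombine.field": "updated_at",
    "hoodie.datasource.write.partitionpath.field": "sale_date",
    "hoodie.datasource.write.operation": "upsert",
}
(features.write.format("hudi")
         .options(**hudi_options)
         .mode("append")
         .save("s3://example-bucket/modeled/sales_features/"))

Re-running the job with corrected source rows updates existing records in place rather than appending duplicates, which is what makes the dataset upsert-ready for downstream forecasting.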
Essential Skills
● Deep hands-on experience with AWS Glue, Athena, S3, Step Functions, and Glue Data Catalog.
● Strong command of PySpark, dbt-core, CTAS query optimization, and partition strategies (a CTAS example follows this list).
● Working knowledge of Apache Hudi, Iceberg, or Delta Lake for versioned ingestion.
● Experience in S3 metadata tagging and scalable data lake design patterns.
● Expertise in feature engineering and forecasting dataset preparation (lags, trends, windows).
● Proficiency in Git-based workflows (Bitbucket), CI/CD, and deployment automation.
● Strong understanding of time series KPIs, such as revenue forecasts, occupancy trends, or demand volatility.
● Familiarity with data observability best practices, including field-level logging, anomaly alerts, and classification tagging (a minimal logging sketch follows this list).
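As a small illustration of the CTAS-based partitioning called out above, the snippet below submits a CREATE TABLE AS SELECT statement to Athena via boto3, materializing a Parquet table partitioned by sale_date. The database, table, bucket names, and columns are hypothetical.

import boto3

athena = boto3.client("athena", region_name="us-east-1")

# CTAS: materialize a Modeled-zone table as partitioned Parquet.
# In Athena CTAS, partition columns must appear last in the SELECT list.
ctas_sql = """
CREATE TABLE modeled.sales_daily
WITH (
  format = 'PARQUET',
  external_location = 's3://example-bucket/modeled/sales_daily/',
  partitioned_by = ARRAY['sale_date']
) AS
SELECT store_id, revenue, sale_date
FROM clean.sales
"""

athena.start_query_execution(
    QueryString=ctas_sql,
    QueryExecutionContext={"Database": "modeled"},
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
)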
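And a minimal sketch of the field-level data quality logging mentioned above: it publishes a per-column null-rate metric to CloudWatch so an alarm can alert on anomalies. The namespace, metric, table, and column names are hypothetical; the counts would come from the PySpark or dbt job that built the table.

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

def publish_null_rate(table: str, column: str, null_count: int, row_count: int) -> None:
    """Publish the null rate of a single column as a custom CloudWatch metric."""
    null_rate = (null_count / row_count) if row_count else 0.0
    cloudwatch.put_metric_data(
        Namespace="DataLake/Quality",
        MetricData=[{
            "MetricName": "NullRate",
            "Dimensions": [
                {"Name": "Table", "Value": table},
                {"Name": "Column", "Value": column},
            ],
            "Value": null_rate,
            "Unit": "None",
        }],
    )

# Example call after a pipeline run:
publish_null_rate("modeled.sales_features", "revenue_lag_7", null_count=42, row_count=100000)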
Regards,
Sahiba
8296043355
Talent Corner HR Services Pvt Ltd