Role Description
Role Overview
We are seeking a highly skilled Data Engineer with expertise in building scalable data pipelines, ETL/ELT workflows, and cloud-based data solutions. The ideal candidate will be proficient in Python, SQL, PySpark, and modern data warehousing and lakehouse platforms such as Snowflake, BigQuery, and Delta Lake. This role requires hands-on development, strong problem-solving ability, and end-to-end ownership of data engineering deliverables.
Key Responsibilities
Data Pipeline Development
- Design, build, and optimize ETL/ELT pipelines for ingesting, transforming, and joining large datasets (see the PySpark sketch after this list).
- Develop high-quality code (Python, PySpark, SQL) following best practices and coding standards.
- Perform pipeline testing, debugging, and performance tuning to ensure efficiency, scalability, and cost-optimization.
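For illustration, a minimal PySpark sketch of the kind of ingest-transform-join pipeline described above; the paths, tables, and column names are placeholders rather than part of any actual project.

```python
from pyspark.sql import SparkSession, functions as F

# All paths, tables, and columns below are hypothetical placeholders.
spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Ingest: read raw source datasets.
orders = spark.read.parquet("s3://raw-zone/orders/")
customers = spark.read.parquet("s3://raw-zone/customers/")

# Transform and join: keep completed orders and enrich them with customer attributes.
curated = (
    orders
    .filter(F.col("order_status") == "COMPLETED")
    .join(customers, on="customer_id", how="left")
    .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write a partitioned, curated dataset for downstream consumers.
(curated
    .write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://curated-zone/orders_enriched/"))
```

Partitioning the curated output by date is one common way the tuning and cost-optimization responsibilities above show up in practice, since it limits how much data downstream queries scan.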
Data Architecture & Modelling
- Create and maintain data schemas, data models (3NF, star schema, wide/tall projections), and metadata structures (see the schema sketch after this list).
- Implement and optimize data warehouse solutions on Snowflake, BigQuery, Delta Lake, and data lake systems.
- Work with data governance tools (e.g., Collibra) and maintain data cataloging standards.
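As a sketch of the star-schema modelling referenced above, the following snippet issues illustrative Spark SQL DDL from Python; it assumes a Delta-Lake-enabled Spark session, and every table and column name is hypothetical.

```python
from pyspark.sql import SparkSession

# Hypothetical session; assumes Delta Lake is configured on the cluster.
spark = SparkSession.builder.appName("star_schema_ddl").getOrCreate()

# Dimension tables: descriptive attributes keyed by surrogate keys.
spark.sql("""
    CREATE TABLE IF NOT EXISTS dim_customer (
        customer_key BIGINT,
        customer_name STRING,
        region STRING
    ) USING DELTA
""")

spark.sql("""
    CREATE TABLE IF NOT EXISTS dim_date (
        date_key INT,
        calendar_date DATE,
        fiscal_quarter STRING
    ) USING DELTA
""")

# Fact table: measures plus foreign keys into each dimension (the "star").
spark.sql("""
    CREATE TABLE IF NOT EXISTS fact_sales (
        customer_key BIGINT,
        date_key INT,
        quantity INT,
        net_amount DECIMAL(18, 2)
    ) USING DELTA
""")
```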
Cloud & DevOps
- Build pipelines using cloud ETL and orchestration tools (Informatica, AWS Glue, Azure Data Factory, GCP Dataproc/Dataflow, Airflow; see the DAG sketch after this list).
- Understand infrastructure cost impacts, perform cloud resource optimizations, monitor usage, and ensure SLAs.
- Apply DevOps practices including Git-based version control and CI/CD processes.
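A minimal Airflow sketch of the orchestration work described above, assuming Airflow 2.4+; the DAG id, task names, and callables are placeholders standing in for real ingest and transform logic that would live in a Git-versioned repository and ship through CI/CD.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Hypothetical callables standing in for real ingest/transform logic.
def ingest_raw_data(**context):
    print("pull source files into the raw zone")

def run_transformations(**context):
    print("run PySpark/dbt transformations on the raw data")

# `schedule` assumes Airflow 2.4+; older versions use `schedule_interval`.
with DAG(
    dag_id="daily_orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest_raw_data", python_callable=ingest_raw_data)
    transform = PythonOperator(task_id="run_transformations", python_callable=run_transformations)

    ingest >> transform  # simple linear dependency: ingest first, then transform
```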
Project Execution & Stakeholder Management
- Interpret business requirements and translate them into technical design and architecture.
- Support the Project Manager in planning, sprint execution, and delivery of assigned modules.
- Conduct product demos, clarify requirements, and collaborate with customer architects.
Quality, Testing & Documentation
- Review and create design documents (HLD, LLD, SAD), test cases, infrastructure costing, source-to-target mappings (STMs), and user documentation.
- Perform root cause analysis (RCA), defect prevention, and continuous pipeline reliability improvement.
- Ensure compliance with engineering standards, security policies, and configuration management guidelines.
Team Leadership & Contribution
- Mentor junior engineers and guide them on best practices and certification paths.
- Contribute reusable components, frameworks, templates, and knowledge-base assets.
- Proactively identify risks, support retention initiatives, and drive team engagement.
Technical Skills
Required Skills & Qualifications
- Strong hands-on experience in Python (Pandas, PySpark), SQL optimization, and data manipulation.
- Expertise in ETL tools: Informatica, Airflow, Glue, Dataproc, ADF; dbt is a nice-to-have.
- Proficiency in Snowflake, including metadata management, RBAC, query profiling, and cost control (see the sketch after this list).
- Knowledge of data modeling, data warehouse concepts, and performance tuning.
- Experience building robust pipelines with fault-tolerant architecture and metadata handling.
- Strong understanding of data security concepts, governance, RBAC, and compliance.
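As one concrete example of the Snowflake query-profiling and cost-control skills listed above, a small sketch using the snowflake-connector-python package to surface the slowest recent queries from ACCOUNT_USAGE; the connection parameters are placeholders, and real credentials should come from a secrets manager rather than code.

```python
import snowflake.connector

# Placeholder connection details; real credentials belong in a secrets manager.
conn = snowflake.connector.connect(
    account="my_account",
    user="data_engineer",
    password="***",
    warehouse="ANALYTICS_WH",
    role="SYSADMIN",
)

# Surface the slowest queries from the last day as candidates for tuning and cost review.
QUERY = """
SELECT query_id, user_name, warehouse_name,
       total_elapsed_time / 1000 AS elapsed_seconds,
       bytes_scanned
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -1, CURRENT_TIMESTAMP())
ORDER BY total_elapsed_time DESC
LIMIT 20
"""

cur = conn.cursor()
for row in cur.execute(QUERY):
    print(row)
cur.close()
conn.close()
```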
Cloud Skills
- Hands-on experience in AWS / Azure / GCP (data-specific services).
- Familiarity with Lakehouse, Delta Lake, S3/GCS/ADLS, NoSQL stores, and distributed data systems.
Soft Skills
- Strong communication skills for interacting with stakeholders and presenting solutions.
- Experience working in Agile/Scrum environments, participating in ceremonies, and maintaining scrum artefacts.
- Ability to estimate efforts, manage deliverables, and own end-to-end modules.
Success Metrics (KPIs)
- Adherence to engineering standards and project timelines
- Reduced defects and non-compliance issues
- Faster pipeline performance and resource optimization
- Quick turnaround for production issues
- Completion of required certifications and trainings
- High stakeholder satisfaction and sustained team engagement
Preferred Certifications
- Snowflake, AWS/GCP/Azure Data Engineer, Informatica
- Domain-specific certifications (as required)
Skills
Snowflake, Airflow, dbt, Data Engineering