Job Description :
We are seeking an experienced and driven Data Engineer with 5+ years of hands-on experience in building scalable data infrastructure and systems. You will play a key role in designing and developing robust, high-performance ETL pipelines and managing large-scale datasets to support critical business functions. This role requires deep technical expertise, strong problem-solving skills, and the ability to thrive in a fast-paced, evolving environment.

Key Responsibilities :
- Design, develop, and maintain scalable and reliable ETL/ELT pipelines for processing large volumes of data (terabytes and beyond).
- Model and structure data for performance, scalability, and usability.
- Work with cloud infrastructure (preferably Azure) to build and optimize data workflows.
- Leverage distributed computing frameworks like Apache Spark and Hadoop for large-scale data processing.
- Build and manage data lake/lakehouse architectures in alignment with best practices.
- Optimize ETL performance and manage cost-effective data operations.
- Collaborate closely with cross-functional teams including data science, analytics, and software engineering.
- Ensure data quality, integrity, and security across all stages of the data lifecycle.

Required Skills & Qualifications :
- 7 to 10 years of relevant experience in big data engineering.
- Advanced proficiency in Python.
- Strong skills in SQL for complex data manipulation and analysis.
- Hands-on experience with Apache Spark, Hadoop, or similar distributed systems.
- Proven track record of handling large-scale datasets (TBs) in production environments.
- Cloud development experience with Azure (preferred), AWS, or GCP.
- Solid understanding of data lake and data lakehouse architectures.
- Expertise in ETL performance tuning and cost optimization techniques.
- Knowledge of data structures, algorithms, and modern software engineering practices.

Soft Skills :
- Strong communication skills with the ability to explain complex technical concepts clearly and concisely.
- Self-starter who learns quickly and takes ownership.
- High attention to detail with a strong sense of data quality and reliability.
- Comfortable working in an agile, fast-changing environment with incomplete requirements.

Preferred Qualifications :
- Experience with tools like Apache Airflow, Azure Data Factory, or similar.
- Familiarity with CI/CD and DevOps in the context of data engineering.
- Knowledge of data governance, cataloging, and access control principles.

Skills : Python, SQL, AWS, Azure, Hadoop