Responsibilities:
· Data Pipeline Development: Build and maintain scalable data pipelines to extract, transform, and load (ETL) data from various sources (e.g., databases, APIs, files) into data warehouses or data lakes.
· Data Infrastructure: Design, implement, and manage data infrastructure components, including data warehouses, data lakes, and data marts.
· Data Quality: Ensure data quality by implementing data validation, cleansing, and standardization processes.
· Performance Optimization: Optimize data pipelines and infrastructure for performance and efficiency.
· Collaboration: Collaborate with data analysts, data scientists, and business stakeholders to understand their data needs and translate them into technical requirements.
· Tool and Technology Selection: Evaluate and select appropriate data engineering tools and technologies (e.g., SQL, Python, Spark, Hadoop, cloud platforms).
· Documentation: Create and maintain clear and comprehensive documentation for data pipelines, infrastructure, and processes.

Experience: 5 - 10 years

Skills:
· Strong proficiency in SQL and at least one programming language (e.g., Python, Java).
· Experience with data warehousing and data lake technologies (e.g., Snowflake, AWS Redshift, Databricks).
· Knowledge of cloud platforms (e.g., AWS, GCP, Azure) and cloud-based data services.
· Understanding of data modeling and data architecture concepts.
· Experience with ETL/ELT tools and frameworks.
· Excellent problem-solving and analytical skills.
· Ability to work independently and as part of a team.

Preferred Qualifications:
· Experience with real-time data processing and streaming technologies (e.g., Kafka, Flink).
· Knowledge of machine learning and artificial intelligence concepts.
· Experience with data visualization tools (e.g., Tableau, Power BI).
· Certification in cloud platforms or data engineering.