Job Description
About The Role
Project Role: Data Engineer
Project Role Description: Design, develop, and maintain data solutions for data generation, collection, and processing. Create data pipelines, ensure data quality, and implement ETL (extract, transform, and load) processes to migrate and deploy data across systems.
Must-have skills: Databricks Unified Data Analytics Platform
Good-to-have skills: NA
A minimum of 5 years of experience is required.
Educational Qualification: 15 years of full-time education
Summary: As a Data Platform Engineer, you will assist with the data platform blueprint and design, collaborating with Integration Architects and Data Architects to ensure cohesive integration between systems and data models. You will play a crucial role in shaping the data platform components.
Roles and Responsibilities:
- Work as part of the data engineering team to build, maintain, and optimize scalable data pipelines for large-scale data processing.
- Develop and implement ETL/ELT processes using PySpark, Spark, and other relevant tools to move and transform data from various sources.
- Assist in designing and deploying solutions on major cloud platforms such as AWS, Azure, or GCP.
- Support the development and maintenance of Big Data processing frameworks and data lakes to handle structured and unstructured data.
- Collaborate with data scientists, analysts, and other engineers to ensure data accuracy and availability.
- Implement data ingestion strategies, ensuring the secure and efficient movement of data across different storage solutions.
- Work on real-time streaming data pipelines and batch data processing to handle high-volume workloads.
- Develop and maintain reusable code for data extraction, transformation, and loading (ETL) operations.
- Contribute to performance tuning of Spark jobs and data pipelines to ensure scalability and efficiency.
- Assist in maintaining governance and data security practices across cloud platforms.

Technical Skills:
- Experience with AWS, Azure, or GCP for data engineering workflows.
- Strong proficiency in PySpark, Spark, or similar frameworks for building scalable data pipelines.
- Understanding of Big Data architectures, data storage, and data processing concepts.
- Familiarity with cloud-native data storage solutions such as S3, Blob Storage, BigQuery, or Redshift.
- Experience with data orchestration tools like Apache Airflow or similar.
- Knowledge of data formats like Parquet, Avro, or JSON.
- Strong coding skills in Python for building data pipelines.
- Good understanding of SQL and database technologies.
- Excellent troubleshooting, debugging, and performance optimization skills.
Additional Information:
- The candidate should have a minimum of 6 years of experience in Databricks Unified Data Analytics Platform.
- Experience with real-time data processing and Kafka or similar tools.
- Exposure to CI/CD pipelines and DevOps practices in cloud environments.
- Familiarity with DBT (Data Build Tool) for data transformation workflows.
- Experience with other ETL tools like Informatica, Talend, or Matillion.
Educational Qualification: 15 years of full-time education is required.