Data Engineer-GCP

3 - 7 years

0 Lacs

Posted: 4 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

Role Overview: As a skilled Data Engineer with experience in Google Cloud Platform (GCP), PySpark, SQL, and ETL processes, you will be responsible for building, optimizing, and maintaining scalable data pipelines and workflows. Your expertise in technologies such as Apache Airflow, PySpark, and cloud-native tools will be crucial in ensuring efficient data processing.

Key Responsibilities:
- Design, develop, and maintain efficient and scalable data pipelines using PySpark and SQL.
- Build and manage workflows/orchestration using Apache Airflow.
- Utilize GCP services like BigQuery, Cloud Storage, Dataflow, and Composer.
- Implement and optimize ETL processes to guarantee data quality, consistency, and reliability.
- Collaborate with data analysts, data scientists, and engineering teams to meet data requirements.
- Monitor pipeline performance and troubleshoot issues promptly.
- Write clean, maintainable, and well-documented code.

Qualifications Required:
- Strong programming skills in PySpark and SQL.
- Proven experience with Google Cloud Platform (GCP).
- Solid grasp of ETL concepts and data pipeline design.
- Hands-on experience with Apache Airflow or similar orchestration tools.
- Familiarity with cloud-based data warehousing and storage systems such as BigQuery and Cloud Storage.
- Experience in performance tuning and optimizing large-scale data pipelines.
- Strong problem-solving skills and the ability to work both independently and collaboratively within a team.
