GCP & PySpark with ETL - Lead

5 - 9 years

0 Lacs

Posted: 1 month ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As an ETL Developer with expertise in PySpark and Google Cloud Platform (GCP), your role involves designing, developing, and optimizing ETL pipelines on GCP. You will work with various GCP services such as BigQuery, Cloud Dataflow, Cloud Composer (Apache Airflow), and Cloud Storage for data transformation and orchestration. Your responsibilities include developing and optimizing Spark-based ETL processes for large-scale data processing, implementing best practices for data governance, security, and monitoring in a cloud environment, and ensuring data quality, validation, and consistency across pipelines.

Key Responsibilities:
- Design, develop, and optimize ETL pipelines using PySpark on Google Cloud Platform (GCP).
- Work with BigQuery, Cloud Dataflow, Cloud Composer (Apache Airflow), and Cloud Storage for data transformation and orchestration.
- Develop and optimize Spark-based ETL processes for large-scale data processing.
- Implement best practices for data governance, security, and monitoring in a cloud environment.
- Collaborate with data engineers, analysts, and business stakeholders to understand data requirements.
- Troubleshoot performance bottlenecks and optimize Spark jobs for efficient execution.
- Automate data workflows using Apache Airflow or Cloud Composer.
- Ensure data quality, validation, and consistency across pipelines.

Qualifications Required:
- 5+ years of experience in ETL development with a focus on PySpark.
- Strong hands-on experience with Google Cloud Platform (GCP) services such as BigQuery, Cloud Dataflow, Cloud Composer (Apache Airflow), and Cloud Storage.
- Proficiency in Python and PySpark for big data processing.
- Experience with data lake architectures and data warehousing concepts.
- Knowledge of SQL for data querying and transformation.
- Experience with CI/CD pipelines for data pipeline automation.
- Strong debugging and problem-solving skills.
- Experience with Kafka or Pub/Sub for real-time data processing.
- Knowledge of Terraform for infrastructure automation on GCP.
- Experience with containerization (Docker, Kubernetes).
- Familiarity with DevOps and monitoring tools like Prometheus, Stackdriver, or Datadog.

UST

IT Services and IT Consulting

Aliso Viejo CA
