Posted: 1 week ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Job Title - Data Engineer

Job Location - Pune (Onsite)

Job Type - Full-time

Start Date - 1st Oct 2015


About the Role

We are seeking a highly skilled Data Engineer to design, build, and manage robust data pipelines and frameworks on Google Cloud Platform (GCP). The ideal candidate will have hands-on experience in PySpark, Python, GCP services (BigQuery, Cloud Functions, Pub/Sub), and Terraform, with strong capabilities in pipeline development, monitoring, and documentation (HLD & LLD).

Key Responsibilities

  • Data Pipeline Development
      • Design, build, and optimize scalable ETL/ELT data pipelines using PySpark and Python.
      • Implement GCP-native solutions leveraging BigQuery, Cloud Functions, Pub/Sub, and related services.
      • Use Terraform to automate infrastructure provisioning and deployments.

  • Pipeline Monitoring & Reliability
      • Implement monitoring, logging, and alerting mechanisms to ensure pipeline reliability and data quality.
      • Troubleshoot pipeline issues and optimize performance.

  • Architecture & Documentation
      • Contribute to High-Level Design (HLD) and Low-Level Design (LLD) documents for data solutions.
      • Collaborate with architects, data scientists, and business teams to translate requirements into technical specifications.

  • Collaboration & Best Practices
      • Work with cross-functional teams to integrate pipelines into broader data platforms.
      • Follow best practices for code quality, version control, CI/CD, and security.

Required Skills & Experience

  • Strong proficiency in PySpark and Python for data processing.
  • Hands-on experience with GCP services: BigQuery, Cloud Functions, Pub/Sub.
  • Infrastructure-as-Code expertise with Terraform.
  • Experience in building, deploying, and monitoring large-scale data pipelines.
  • Knowledge of data architecture and ability to prepare HLD and LLD documentation.
  • Strong problem-solving skills and ability to work in agile environments.

Preferred Qualifications

  • Experience with technologies such as Hadoop, Hive, Kafka, Snowflake, Matillion, and AWS.
  • Knowledge of CI/CD pipelines (Jenkins, GitLab, GitHub Actions, etc.).
  • Familiarity with data governance, lineage, and security frameworks.
  • Experience with containerization (Docker, Kubernetes) is a plus.

Recommended Jobs for You