
Lead I - Senior Data Engineer - PySpark - GCP or Any Cloud - SQL

Experience: 6 years

Salary: 0 Lacs

Posted: 1 day ago | Platform: LinkedIn

Work Mode: On-site

Job Type: Full Time

Job Description

Roles & Responsibilities

Development & Implementation

  • Design, build, and maintain large-scale batch and real-time data pipelines using PySpark, Spark, Hive, and related big data tools.
  • Write clean, efficient, and scalable code aligned with application design and coding standards.
  • Create and maintain technical documentation including design documents, test cases, and configurations.

Technical Leadership

  • Contribute to high-level design (HLD), low-level design (LLD), and data architecture documents.
  • Review and validate designs and code from peers and junior developers.
  • Lead technical discussions and decisions with cross-functional teams.

Data Management & Optimization

  • Optimize data processing workflows for efficiency, cost, and performance.
  • Manage data quality and ensure data accuracy, lineage, and governance across the pipeline.

Stakeholder Collaboration

  • Collaborate with product managers, data stewards, and business stakeholders to translate data requirements into robust engineering solutions.
  • Clarify requirements and propose design options to customers.

Testing & Quality Assurance

  • Write and review unit tests and integration tests to ensure data integrity and performance.
  • Monitor and troubleshoot data pipeline issues and ensure minimal downtime.

Agile Project Contribution

  • Participate in sprint planning, estimation, and daily stand-ups.
  • Ensure on-time delivery of user stories and bug fixes.
  • Drive release planning and execution processes.

Team Mentorship & Leadership

  • Set FAST goals and provide timely feedback to team members.
  • Mentor junior engineers, contribute to a positive team environment, and drive continuous improvement.

Compliance & Documentation

  • Ensure adherence to compliance standards such as SOX, HIPAA, and organizational coding standards.
  • Contribute to knowledge repositories, project wikis, and best practice documents.

Must-Have Skills

  • At least 6 years of experience as a Data Engineer.
  • Hands-on expertise in PySpark and SQL.
  • Experience in Google Cloud Platform (GCP) or similar cloud environments (AWS, Azure).
  • Proficient in Big Data technologies such as Spark, Hadoop, Hive.
  • Solid understanding of ETL/ELT frameworks, data warehousing, and data modeling.
  • Strong knowledge of CI/CD and related DevOps tools (Jenkins, Git, Ansible, etc.).
  • Excellent problem-solving and analytical skills.
  • Strong written and verbal communication skills.
  • Experience with Agile/Scrum methodologies.

Good-to-Have Skills

  • Experience with data orchestration tools (Airflow, Control-M).
  • Familiarity with modern data platforms such as Snowflake, DataRobot, Denodo.
  • Experience in containerized environments (Kubernetes, Docker).
  • Exposure to data security, governance, and compliance frameworks.
  • Hands-on experience with Terraform, ARM Templates, or similar infrastructure-as-code tools for infrastructure automation.
  • Domain knowledge in banking, healthcare, or retail industries.

Skills

Spark, Hadoop, Hive, GCP

UST

IT Services and IT Consulting

Aliso Viejo, CA

10,001+ Employees

1,291 Jobs

Key People

  • Kris Canekeratne, Co-Founder & CEO
  • Sandeep Reddy, President
