Experience: 3 - 8 years

Salary: 2 - 4 Lacs

Posted: 1 day ago | Platform: Foundit

Work Mode: On-site

Job Type: Full Time

Job Description

We are seeking an experienced Data Engineer with a strong background in designing and developing scalable data pipelines using PySpark and AWS. The ideal candidate will have a deep understanding of data architecture and pipeline optimization, along with hands-on experience in big data environments. The role requires strong communication skills to work with both technical and business teams, and a problem-solving mindset to develop efficient data engineering strategies.

Key Responsibilities:

  • Communicate with users, technical teams, and senior management to gather requirements and align data engineering strategies.
  • Partner with business stakeholders to define business requirements and translate them into user stories and technical specifications.
  • Collaborate with enterprise data architects and engineering leads to ensure best practices and architectural consistency.
  • Ensure compliance with data quality, security, and governance standards.
  • Design and develop scalable, fault-tolerant data pipelines for batch and streaming data processing on AWS (a minimal sketch of both modes follows this list).
  • Translate business needs into technical solutions with a focus on scalability and modularity.
  • Optimize PySpark pipelines for performance, memory management, and parallel processing.
  • Use the Spark UI and other tools to tune and debug long-running pipelines.
  • Participate in Agile development processes and CI/CD workflows for continuous integration and deployment.
  • Communicate technical results and data insights clearly to business stakeholders.
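
The following is a minimal sketch of the kind of batch-and-streaming pipeline the role describes, not an actual implementation: the bucket paths (s3://my-bucket/...), the column names (order_id, order_ts), and the availableNow trigger (Spark 3.3+) are all assumptions for illustration.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-pipeline").getOrCreate()

# ---- Batch mode: read raw CSV from S3, clean, write partitioned Parquet ----
raw = spark.read.option("header", True).csv("s3://my-bucket/raw/orders/")  # hypothetical path

clean = (
    raw.dropDuplicates(["order_id"])                      # hypothetical key column
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
)

(clean.write
      .mode("overwrite")
      .partitionBy("order_date")                          # enables partition pruning downstream
      .parquet("s3://my-bucket/curated/orders/"))

# ---- Streaming mode: the same transform over files landing in the raw prefix ----
stream = (spark.readStream
               .schema(raw.schema)                        # streaming sources need an explicit schema
               .option("header", True)
               .csv("s3://my-bucket/raw/orders/"))

query = (stream
         .withColumn("order_date", F.to_date(F.to_timestamp("order_ts")))
         .writeStream
         .format("parquet")
         .option("path", "s3://my-bucket/curated/orders_stream/")
         .option("checkpointLocation", "s3://my-bucket/checkpoints/orders/")  # required for fault tolerance
         .trigger(availableNow=True)                      # Spark 3.3+: drain the backlog, then stop
         .start())
query.awaitTermination()
```

Partitioning the curated output by order_date lets downstream Hive or Spark SQL queries prune partitions instead of scanning the full dataset.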

Required Qualifications:

  • 3+ years of experience in designing and developing data pipelines using Python (PySpark) and Spark SQL on AWS.
  • Strong hands-on experience with SQL, Hive, and large datasets in big data environments.
  • Proficiency in PySpark DataFrames, joins, caching, memory management, partitioning, and parallelism.
  • Experience with the Spark UI, DAGs, and tuning Spark configuration for optimized performance (see the tuning sketch after this list).
  • Familiarity with building pipelines in both batch and streaming modes.
  • Experience working in Agile teams and using Git and CI/CD tools.
  • Knowledge of Hive table design and partitioning strategies for performance.
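
As a sketch of the tuning techniques named above (broadcast joins, caching, right-sizing shuffle parallelism), assuming hypothetical curated tables and paths; none of the names are prescribed by the role.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning-demo").getOrCreate()

# Right-size shuffle parallelism for the cluster instead of the default 200.
spark.conf.set("spark.sql.shuffle.partitions", "64")

orders = spark.read.parquet("s3://my-bucket/curated/orders/")      # large fact table (hypothetical)
status = spark.read.parquet("s3://my-bucket/curated/dim_status/")  # small dimension (hypothetical)

# Broadcast the small side so the large table is joined without a shuffle.
joined = orders.join(broadcast(status), "status_code")

# Cache a result that is reused by several downstream aggregations.
joined.cache()

daily = joined.groupBy("order_date").agg(F.count("*").alias("orders"))
by_status = joined.groupBy("status_name").agg(F.count("*").alias("orders"))

# Coalesce before writing to avoid a flood of small output files.
daily.coalesce(8).write.mode("overwrite").parquet("s3://my-bucket/marts/daily_orders/")
by_status.coalesce(8).write.mode("overwrite").parquet("s3://my-bucket/marts/orders_by_status/")

joined.unpersist()
```

After a run, the SQL tab in the Spark UI should show a BroadcastHashJoin in the plan rather than a SortMergeJoin, confirming the shuffle on the large side was avoided.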

Desired Qualifications:

  • Experience in data modeling and ETL/ELT processes.
  • Hands-on experience with workflow schedulers like Autosys or CA Workload Automation.
  • Proficiency with AWS SDKs and integration with native AWS services (a small boto3 sketch follows this list).
  • Understanding of data architecture, analytics workflows, and pipeline orchestration.
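
A small sketch of AWS SDK integration using boto3, with a hypothetical bucket and prefix: a readiness check like this is commonly run by a scheduler (e.g., Autosys) before launching the Spark job.

```python
import boto3

s3 = boto3.client("s3")

def partition_ready(bucket: str, prefix: str) -> bool:
    """Return True if at least one object exists under the given S3 prefix."""
    resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return resp.get("KeyCount", 0) > 0

# Hypothetical bucket and date-partitioned prefix.
if partition_ready("my-bucket", "raw/orders/order_date=2024-01-01/"):
    print("Upstream partition present; safe to launch the Spark job.")
else:
    print("Upstream partition missing; skip this run or retry later.")
```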

Company: Go Digital Technology Consulting

Industry: Technology Consulting

Location: Tech City
