PySpark Developer

5 - 10 years

5 - 15 Lacs

Posted: 16 hours ago | Platform: Naukri


Work Mode: Work from Office

Job Type: Full Time

Job Description

Location:

Experience: 5 - 10 years

Key Responsibilities:

  • Design and build robust, scalable ETL/ELT pipelines using PySpark to ingest data from diverse sources (databases, logs, APIs, files); see the sketch after this list.
  • Transform and curate raw transactional and log data into analysis-ready datasets in the Data Hub and analytical data marts.
  • Develop reusable, parameterized Spark jobs for batch and micro-batch processing.
  • Optimize the performance and scalability of PySpark jobs across large data volumes.
  • Ensure data quality, consistency, lineage, and proper documentation across ingestion flows.
  • Collaborate with Data Architects, Modelers, and Data Scientists to implement ingestion logic aligned with business needs.
  • Work with cloud-based data platforms (e.g., AWS S3, Glue, EMR, Redshift) for data movement and storage.
  • Support version control, CI/CD, and infrastructure-as-code practices where applicable.
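
To illustrate the kind of pipeline work described above, a minimal PySpark batch ETL sketch follows. It is illustrative only: the bucket, paths, and column names are hypothetical placeholders, not part of this role's actual stack.

    # Minimal illustrative PySpark batch ETL job; all names and paths
    # are hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_ingest").getOrCreate()

    # Ingest raw JSON logs (a semi-structured source).
    raw = spark.read.json("s3://example-bucket/raw/orders/")

    # Curate: drop invalid rows, normalize types, derive a partition column.
    curated = (
        raw.filter(F.col("order_id").isNotNull())
           .withColumn("order_ts", F.to_timestamp("order_ts"))
           .withColumn("amount", F.col("amount").cast("decimal(12,2)"))
           .withColumn("order_date", F.to_date("order_ts"))
    )

    # Write analysis-ready Parquet, partitioned for downstream data marts.
    (curated.write.mode("overwrite")
            .partitionBy("order_date")
            .parquet("s3://example-bucket/curated/orders/"))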

Required Skills & Qualifications:

  • 5+ years of experience in data engineering, with a strong focus on PySpark/Spark for big data processing.
  • Expertise in building data pipelines and ingestion frameworks from relational, semi-structured (JSON, XML), and unstructured (logs, PDFs) sources.
  • Proficiency in Python with strong knowledge of data processing libraries.
  • Strong SQL skills for querying and validating data in platforms like Amazon Redshift, PostgreSQL, or similar; see the validation sketch after this list.
  • Experience with distributed computing frameworks (e.g., Spark on EMR, Databricks).
  • Familiarity with workflow orchestration tools (e.g., AWS Step Functions or similar).
  • Solid understanding of data lake / data warehouse architectures and data modeling basics.
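
To illustrate the SQL validation work mentioned above, a minimal Spark SQL data-quality check follows; the table, path, and column names are hypothetical placeholders.

    # Minimal illustrative data-quality check using Spark SQL; table,
    # path, and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("orders_validate").getOrCreate()
    (spark.read.parquet("s3://example-bucket/curated/orders/")
          .createOrReplaceTempView("orders"))

    # Row count, null rate, and duplicate-key checks in one pass.
    spark.sql("""
        SELECT COUNT(*)                                         AS row_count,
               SUM(CASE WHEN amount IS NULL THEN 1 ELSE 0 END)  AS null_amounts,
               COUNT(*) - COUNT(DISTINCT order_id)              AS duplicate_keys
        FROM orders
    """).show()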

Preferred Qualifications:

  • Experience with AWS data services: Glue, S3, Redshift, Lambda, CloudWatch, etc.
  • Familiarity with Delta Lake or similar technologies for large-scale data storage.
  • Exposure to real-time streaming frameworks (e.g., Spark Structured Streaming, Kafka); see the streaming sketch after this list.
  • Knowledge of data governance, lineage, and cataloging tools (e.g., AWS Glue Catalog, Apache Atlas).
  • Understanding of DevOps/CI-CD pipelines for data projects using Git, Jenkins, or similar tools.
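
To illustrate the streaming exposure mentioned above, a minimal Spark Structured Streaming sketch follows; the broker address, topic, and paths are hypothetical, and the job assumes the spark-sql-kafka connector package is available on the classpath.

    # Minimal illustrative Spark Structured Streaming job reading from
    # Kafka; broker, topic, and paths are hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_stream").getOrCreate()

    events = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")
              .option("subscribe", "orders")
              .load())

    # Kafka values arrive as bytes; decode to strings for downstream parsing.
    decoded = events.select(F.col("value").cast("string").alias("payload"))

    # Append micro-batches to S3 with checkpointing for fault tolerance.
    query = (decoded.writeStream.format("parquet")
             .option("path", "s3://example-bucket/stream/orders/")
             .option("checkpointLocation", "s3://example-bucket/chk/orders/")
             .outputMode("append")
             .start())
    query.awaitTermination()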

Application Process:

Interested candidates should email a resume and cover letter to velkiruba.s@sunware.in.

Sunware Technologies

IT Services and IT Consulting

San Diego
