Python PySpark Developer

3 - 7 years

0 Lacs

Noida, All India

Posted: 4 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As a Python Developer, you will design, develop, and implement performant ETL pipelines using PySpark on Apache Spark within AWS EMR.

Your responsibilities will include:
- Writing reusable, testable, and efficient code for ETL pipelines
- Integrating data storage solutions in Spark, particularly AWS S3 object storage
- Tuning the performance of PySpark scripts
- Ensuring that overall build and delivery quality is of a high standard and that all deliveries are made on time
- Handling customer meetings effectively, with excellent communication skills

Qualifications required for this role include:
- Minimum 5 years of programming experience with Python, demonstrating strong proficiency
- Familiarity with functional programming concepts
- At least 3 years of hands-on experience developing ETL data pipelines using PySpark on AWS EMR
- Hands-on experience with JSON processing using Python
- Good understanding of Spark's RDD API and DataFrame API
- Experience configuring EMR clusters on AWS and working with the Apache Spark Data Sources API
- Experience troubleshooting and performance tuning Spark jobs
- Understanding of the fundamental design principles behind business processes

Nice-to-have skills include:
- Knowledge of the AWS SDK and CLI
- Experience setting up continuous integration/deployment of Spark jobs to EMR clusters
- Familiarity with scheduling Spark applications on an AWS EMR cluster
- Understanding of the differences between Hadoop MapReduce and Apache Spark
- Proficient understanding of code versioning tools such as Git and SVN

Experience required for this role is 4-7 years.
