Senior PySpark Data Engineer

7 - 12 years

9 - 14 Lacs

Posted: 6 days ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Job Summary

Synechron is seeking an experienced and technically proficient Senior PySpark Data Engineer to join our data engineering team. In this role, you will be responsible for developing, optimizing, and maintaining large-scale data processing solutions using PySpark. Your expertise will support our organization's efforts to leverage big data for actionable insights, enabling data-driven decision-making and strategic initiatives.

Software Requirements

Required Skills:

  • Proficiency in PySpark
  • Familiarity with Hadoop ecosystem components (e.g., HDFS, Hive, Spark SQL)
  • Experience with Linux/Unix operating systems
  • Streaming data platforms such as Apache Kafka or similar

Preferred Skills:

  • Experience with cloud-based big data platforms (e.g., AWS EMR, Azure HDInsight)
  • Knowledge of Python (beyond PySpark), Java, or Scala relevant to big data applications
  • Familiarity with data orchestration tools (e.g., Apache Airflow, Luigi)

Overall Responsibilities

  • Design, develop, and optimize scalable data processing pipelines using PySpark (a minimal sketch of such a pipeline follows this list).
  • Collaborate with data engineers, data scientists, and business analysts to understand data requirements and deliver solutions.
  • Implement data transformations, aggregations, and extraction processes to support analytics and reporting.
  • Manage large datasets in distributed storage systems, ensuring data integrity, security, and performance.
  • Troubleshoot and resolve performance issues within big data workflows.
  • Document data processes, architectures, and best practices to promote consistency and knowledge sharing.
  • Support data migration and integration efforts across varied platforms.
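As a minimal illustration of the kind of batch pipeline this role centers on, the sketch below reads raw records from distributed storage, filters and transforms them, aggregates, and writes the result back. The paths, the transactions dataset, and column names (account_id, amount, and so on) are hypothetical placeholders, not details from this posting.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical example: daily transaction totals per account.
# All paths and column names are illustrative placeholders.
spark = (
    SparkSession.builder
    .appName("daily-transaction-aggregates")
    .getOrCreate()
)

# Read raw records from distributed storage (HDFS/S3 path is a placeholder).
transactions = spark.read.parquet("s3://example-bucket/raw/transactions/")

# Transform: keep completed transactions and derive a date column.
completed = (
    transactions
    .filter(F.col("status") == "COMPLETED")
    .withColumn("txn_date", F.to_date("txn_timestamp"))
)

# Aggregate: total amount and record count per account per day.
daily_totals = (
    completed
    .groupBy("account_id", "txn_date")
    .agg(
        F.sum("amount").alias("total_amount"),
        F.count("*").alias("txn_count"),
    )
)

# Write results partitioned by date for efficient downstream reads.
(
    daily_totals.write
    .mode("overwrite")
    .partitionBy("txn_date")
    .parquet("s3://example-bucket/curated/daily_transaction_totals/")
)

spark.stop()

In practice, a job like this would be scheduled through an orchestrator such as Apache Airflow and tuned for the cluster it runs on.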
Strategic Objectives:

  • Enable efficient and reliable data processing to meet organizational analytics and reporting needs.
  • Maintain high standards of data security, compliance, and operational durability.
  • Drive continuous improvement in data workflows and infrastructure.

Performance Outcomes & Expectations:

  • Efficient processing of large-scale data workloads with minimal downtime.
  • Clear, maintainable, and well-documented code.
  • Active participation in team reviews, knowledge transfer, and innovation initiatives.

Technical Skills (By Category)

Programming Languages:

  • Required: PySpark (essential); Python (needed for scripting and automation)
  • Preferred: Java, Scala

Databases/Data Management:

  • Required: Experience with distributed data storage (HDFS, S3, or similar) and data warehousing solutions (Hive, Snowflake)
  • Preferred: Experience with NoSQL databases (Cassandra, HBase)

Cloud Technologies:

  • Required: Familiarity with deploying and managing big data solutions on cloud platforms such as AWS (EMR), Azure, or GCP
  • Preferred: Cloud certifications

Frameworks and Libraries:

  • Required: Spark SQL, Spark MLlib (basic familiarity)
  • Preferred: Integration with streaming platforms (e.g., Kafka), data validation tools

Development Tools and Methodologies:

  • Required: Version control systems (e.g., Git), Agile/Scrum methodologies
  • Preferred: CI/CD pipelines, containerization (Docker, Kubernetes)

Security Protocols:

  • Optional: Basic understanding of data security practices and compliance standards relevant to big data management

Experience Requirements

  • At least 7 years of experience in big data environments with hands-on PySpark development.
  • Proven ability to design and implement large-scale data pipelines.
  • Experience working with cloud and on-premises big data architectures.
  • Preference for candidates with domain-specific experience in finance, banking, or related sectors.
  • Candidates with substantial related experience and strong technical skills in big data, even from different domains, are encouraged to apply.

Day-to-Day Activities

  • Develop, test, and deploy PySpark data processing jobs to meet project specifications.
  • Collaborate in multi-disciplinary teams during sprint planning, stand-ups, and code reviews.
  • Optimize existing data pipelines for performance and scalability (see the tuning sketch at the end of this description).
  • Monitor data workflows, troubleshoot issues, and implement fixes.
  • Engage with stakeholders to gather new data requirements, ensuring solutions are aligned with business needs.
  • Contribute to documentation, standards, and best practices for data engineering processes.
  • Support the onboarding of new data sources, including integration and validation (see the streaming sketch at the end of this description).

Decision-Making Authority & Responsibilities:

  • Identify performance bottlenecks and propose effective solutions.
  • Decide on appropriate data processing approaches based on project requirements.
  • Escalate issues that impact project timelines or data integrity.

Qualifications

  • Bachelor's degree in Computer Science, Information Technology, or a related field; equivalent experience considered.
  • Relevant certifications preferred: Cloudera, Databricks, AWS Certified Data Analytics, or similar.
  • Commitment to ongoing professional development in data engineering and big data technologies.
  • Demonstrated ability to adapt to evolving data tools and frameworks.

Professional Competencies

  • Strong analytical and problem-solving skills, with the ability to model complex data workflows.
  • Excellent communication skills to articulate technical solutions to non-technical stakeholders.
  • Effective teamwork and collaboration in a multidisciplinary environment.
  • Adaptability to new technologies and emerging trends in big data.
  • Ability to prioritize tasks effectively and manage time in fast-paced projects.
  • An innovation mindset, actively seeking ways to improve data infrastructure and processes.
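To make the pipeline-optimization duties concrete, here is a small, hypothetical tuning sketch showing two common PySpark techniques: repartitioning a large input by its join key so work spreads evenly across executors, and broadcasting a small dimension table to avoid a shuffle join. The table names, partition count, and paths are assumptions for illustration only.

from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("pipeline-tuning-sketch").getOrCreate()

# Hypothetical inputs: a large fact table and a small dimension table.
events = spark.read.parquet("s3://example-bucket/raw/events/")
accounts = spark.read.parquet("s3://example-bucket/dim/accounts/")

# Repartition the large input by the join key so records are spread
# evenly instead of concentrating on a few hot partitions.
events = events.repartition(200, "account_id")

# Broadcast the small dimension table: Spark ships a copy to every
# executor, turning a shuffle join into a cheaper map-side join.
enriched = events.join(broadcast(accounts), on="account_id", how="left")

# Cache only when the result is reused by multiple downstream actions.
enriched.cache()
print(enriched.count())  # first action materializes the cache

spark.stop()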
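Since Kafka integration appears among the preferred skills, here is a minimal Spark Structured Streaming sketch that subscribes to a Kafka topic and lands messages in distributed storage with checkpointing. The broker address, topic, and paths are placeholders, and running it also requires the spark-sql-kafka connector package on the classpath.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-ingest-sketch").getOrCreate()

# Subscribe to a Kafka topic (broker and topic are placeholders).
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "transactions")
    .load()
)

# Kafka delivers key/value as binary; cast the value to a string
# before parsing (a real job would apply a schema here).
messages = raw.select(F.col("value").cast("string").alias("json_payload"))

# Write the stream with a checkpoint location so the job can recover
# its progress after a restart.
query = (
    messages.writeStream
    .format("parquet")
    .option("path", "s3://example-bucket/landing/transactions/")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/transactions/")
    .outputMode("append")
    .start()
)

query.awaitTermination()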

Synechron

Information Technology and Services

New York

1000+ Employees

Key People

  • Faisal Husain, Co-Founder & CEO
  • Maqbool Kazi, Managing Director
