Posted: 5 days ago
Work from Office
Full Time
• Design and Develop Data Pipelines: Create, maintain, and optimize scalable Extract,
Transform, Load (ETL) and ELT pipelines using PySpark on a distributed computing
environment (e.g., Databricks, AWS EMR, Azure Synapse).
• Code Migration and Modernization: Lead the effort to re-engineer existing Python-based data
processes, functions, and analytical logic (including legacy systems or complex Pandas
transformations) into efficient and performant PySpark code.
• Performance Tuning: Profile, optimize, and fine-tune PySpark jobs for maximum speed and
efficiency, focusing on techniques like partitioning, caching, broadcast variables, and query
optimization to handle terabyte-scale healthcare datasets.
• Data Quality and Governance: Implement robust data validation, cleansing, and monitoring
procedures within PySpark jobs to ensure the highest quality and integrity of claims, member,
and provider data.
Skillventory
hyderābād
15.0 - 20.0 Lacs P.A.