Posted: 20 hours ago
Platform:
On-site
Part Time
ROLES & RESPONSIBILITIES
Key Responsibilities
Analyze existing Hadoop, Pig, and Spark scripts from Dataproc and refactor them into Databricks-native PySpark.
Implement data ingestion and transformation pipelines using Delta Lake best practices.
Apply conversion rules and templates for automated code migration and testing.
Conduct data validation between legacy and migrated environments (schema, count, and data-level checks).
Collaborate on developing AI-driven tools for code conversion, dependency extraction, and error remediation.
Ensure adherence to best practices in code versioning, error handling, and performance optimization.
Participate in UAT, troubleshooting, and post-migration validation activities.
Technical Skills
Core: Python, PySpark, SQL
Databricks: Delta Lake, Unity Catalog, Databricks Workflows, MLflow (basic understanding)
GCP: Dataproc, BigQuery, GCS, Composer/Airflow, Cloud Functions
Data Engineering: Hadoop, Hive, Pig, Spark SQL
Automation: Experience with migration utilities or AI-assisted code transformation tools
CI/CD: Git, Jenkins, Terraform (preferred)
Validation: Data comparison utilities (Delta-to-Delta, DataFrame diffing, schema validation)
Preferred Experience
5–8 years in data engineering or big data application development.
Hands-on experience migrating Spark or Hadoop workloads to Databricks.
Familiarity with Delta architecture, data quality frameworks, and GCP cloud integration.
Exposure to GenAI-based tools for automation or code refactoring is a plus.
ABOUT THE COMPANY
Infogain is a human-centered digital platform and software engineering company based out of Silicon Valley. We engineer business outcomes for Fortune 500 companies and digital natives in the technology, healthcare, insurance, travel, telecom, and retail & CPG industries using technologies such as cloud, microservices, automation, IoT, and artificial intelligence. We accelerate experience-led transformation in the delivery of digital platforms. Infogain is also a Microsoft (NASDAQ: MSFT) Gold Partner and Azure Expert Managed Services Provider (MSP).
Infogain, an Apax Funds portfolio company, has offices in California, Washington, Texas, the UK, the UAE, and Singapore, with delivery centers in Seattle, Houston, Austin, Kraków, Noida, Gurgaon, Mumbai, Pune, and Bengaluru.