Senior Data Engineer

5 years

Posted: 2 days ago | Platform: LinkedIn

Work Mode: On-site
Job Type: Full Time

Job Description

Designation: Data Engineer
Experience: 5+ Years
Location: Mumbai (on-site)


Job Summary:
We are seeking a highly skilled Data Engineer with deep expertise in Apache Kafka integration with Databricks, structured streaming, and large-scale data pipeline design using the Medallion Architecture. The ideal candidate will demonstrate strong hands-on experience in building and optimizing real-time and batch pipelines, and will be expected to solve real coding problems during the interview.

Responsibilities:
• Design, develop, and maintain real-time and batch data pipelines in Databricks.
• Integrate Apache Kafka with Databricks using Structured Streaming (see the ingestion sketch after this section).
• Implement robust data ingestion frameworks using Databricks Autoloader.
• Build and maintain Medallion Architecture pipelines across Bronze, Silver, and Gold layers.
• Implement checkpointing, output modes, and appropriate processing modes in structured streaming jobs.
• Design and implement Change Data Capture (CDC) workflows and Slowly Changing Dimensions (SCD) Type 1 and Type 2 logic.
• Develop reusable components for merge/upsert operations and window function-based transformations.
• Handle large volumes of data efficiently through proper partitioning, caching, and cluster tuning techniques.
• Collaborate with cross-functional teams to ensure data availability, reliability, and consistency.

Must Have:
• Apache Kafka: integration, topic management, schema registry (Avro/JSON).
• Databricks & Spark Structured Streaming:
  - Output modes: Append, Update, Complete
  - Sinks: Memory, Console, File, Kafka, Delta
  - Checkpointing and fault tolerance
• Databricks Autoloader: schema inference, schema evolution, incremental loads.
• Medallion Architecture implementation expertise.
• Performance optimization:
  - Data partitioning strategies
  - Caching and persistence
  - Adaptive query execution and cluster configuration tuning
• SQL & Spark SQL: proficiency in writing efficient queries and transformations.
• Data governance: schema enforcement, data quality checks, and monitoring.

Good to Have:
• Strong coding skills in Python and PySpark.
• Experience working in CI/CD environments for data pipelines.
• Exposure to cloud platforms (AWS/Azure/GCP).
• Understanding of Delta Lake, time travel, and data versioning.
• Familiarity with orchestration tools like Airflow or Azure Data Factory.
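For context, here is a minimal sketch of the Kafka-to-Bronze ingestion pattern described above, using Spark Structured Streaming with checkpointing into a Delta table. The broker, topic, table, and checkpoint names are placeholders, not details from this posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, current_timestamp

spark = SparkSession.builder.appName("kafka-bronze-ingest").getOrCreate()

# Read the raw Kafka stream; key and value arrive as binary columns.
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "orders")                     # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Bronze layer: persist the payload as-is, plus ingestion metadata.
bronze = raw.select(
    col("key").cast("string").alias("key"),
    col("value").cast("string").alias("payload"),
    col("topic"), col("partition"), col("offset"),
    current_timestamp().alias("ingested_at"),
)

# The checkpoint location is what gives the job fault tolerance:
# on restart, Spark resumes from the recorded Kafka offsets.
query = (
    bronze.writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "/mnt/checkpoints/orders_bronze")  # placeholder
    .toTable("bronze.orders")  # placeholder table name
)
```

On Databricks, the same Bronze table could alternatively be fed from cloud storage with Autoloader (`format("cloudFiles")`), which adds the schema inference and schema evolution called out in the Must Have list.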


Mandatory Hands-on Coding Assessment (During Interview):
Candidates will be required to demonstrate hands-on proficiency in the following areas:

1. Window Functions:
   - Implement logic using ROW_NUMBER, RANK, and DENSE_RANK in Spark.
   - Use cases such as deduplication and ranking within groups.
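One way the window-function exercise might be approached: deduplication that keeps the latest record per key with ROW_NUMBER. RANK and DENSE_RANK follow the same pattern but allow ties (with and without gaps in the sequence). All column names here are illustrative.

```python
from pyspark.sql import SparkSession, Window
from pyspark.sql.functions import row_number, col

spark = SparkSession.builder.appName("window-dedup").getOrCreate()

# Toy data: two versions of customer 1, one of customer 2.
df = spark.createDataFrame(
    [(1, "a@x.com", "2024-01-01"),
     (1, "a@y.com", "2024-02-01"),
     (2, "b@x.com", "2024-01-15")],
    ["customer_id", "email", "updated_at"],
)

# ROW_NUMBER assigns a unique rank within each partition, ordered
# newest-first, so rn == 1 is the latest record per customer.
w = Window.partitionBy("customer_id").orderBy(col("updated_at").desc())

deduped = (
    df.withColumn("rn", row_number().over(w))
      .filter(col("rn") == 1)
      .drop("rn")
)
deduped.show()
```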

2. Merge/Upsert Logic:
   - Write PySpark code to perform MERGE operations in Delta Lake.
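A hedged sketch of the Delta Lake MERGE (upsert) exercise, using the delta-spark Python API available on Databricks; the table and key names are assumptions for the example.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-merge").getOrCreate()

target = DeltaTable.forName(spark, "silver.customers")  # placeholder target
updates = spark.table("bronze.customer_updates")        # placeholder source

# Classic upsert: update rows whose key already exists in the target,
# insert rows that do not.
(
    target.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdateAll()      # overwrite matched rows with new values
    .whenNotMatchedInsertAll()   # insert rows with no match in the target
    .execute()
)
```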

3. SCD Implementation:
   - SCD Type 1: overwriting existing records.
   - SCD Type 2: versioning records with effective start/end dates or is_current flags.
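One possible SCD Type 2 sketch using the is_current-flag variant named above: first expire the active version of changed rows, then append the new versions. The table and column names (dim_customer, effective_from, effective_to) are illustrative, and a production job would also compare attribute values to skip no-op changes.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession
from pyspark.sql.functions import current_timestamp, lit

spark = SparkSession.builder.appName("scd2").getOrCreate()

dim = DeltaTable.forName(spark, "gold.dim_customer")  # placeholder dimension
changes = spark.table("silver.customer_changes")      # placeholder changes feed

# Step 1: close out the currently-active row for each changed key.
(
    dim.alias("d")
    .merge(changes.alias("c"),
           "d.customer_id = c.customer_id AND d.is_current = true")
    .whenMatchedUpdate(set={
        "is_current": lit(False),
        "effective_to": current_timestamp(),
    })
    .execute()
)

# Step 2: append the incoming records as the new current versions.
new_rows = (
    changes
    .withColumn("is_current", lit(True))
    .withColumn("effective_from", current_timestamp())
    .withColumn("effective_to", lit(None).cast("timestamp"))
)
new_rows.write.format("delta").mode("append").saveAsTable("gold.dim_customer")
```

SCD Type 1, by contrast, is just the plain MERGE shown earlier: matched rows are overwritten and no history is kept.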

4. CDC (Change Data Capture):
   - Capture and process changes using techniques such as:
     - Comparison with previous snapshots
     - Using audit columns or timestamps
     - Kafka-based event-driven ingestion
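To illustrate the second technique listed (audit columns or timestamps), here is a minimal incremental-capture sketch; the watermark table and all column names are assumptions, and a first run would need a default watermark.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, max as spark_max

spark = SparkSession.builder.appName("cdc-timestamp").getOrCreate()

# High-water mark persisted between runs (placeholder bookkeeping table).
last_ts = (
    spark.table("ops.watermarks")
    .filter(col("source") == "orders")
    .agg(spark_max("last_loaded_at"))
    .collect()[0][0]
)

source = spark.table("bronze.orders")  # placeholder source table

# Only rows modified since the previous run count as changes.
changed = source.filter(col("updated_at") > last_ts)

# Downstream these changes would typically feed a MERGE into Silver
# (see the merge sketch above); snapshot diffing or Kafka-based CDC
# are the alternatives when no reliable audit column exists.
changed.write.format("delta").mode("append").saveAsTable("silver.orders_changes")
```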

Celebal Technologies

Technology Consulting and Services

Ahmedabad
