Spark Scala Kafka Developer

Experience: 6 - 10 years

Salary: 16 - 30 Lacs

Platform: Naukri

Work Mode: Hybrid

Job Type: Full Time

Job Description

Key Responsibilities:

  • Data Pipeline Development: Design, build, and maintain robust, scalable, and efficient ETL/ELT data pipelines using Scala and Apache Spark for large-scale batch and real-time data processing (a minimal pipeline sketch follows this list).
  • Real-time Streaming: Develop and manage high-throughput, low-latency data ingestion and streaming applications using Apache Kafka (producers, consumers, Kafka Streams, or ksqlDB where applicable); see the streaming sketch after this list.
  • Spark Expertise: Apply in-depth knowledge of Spark internals, Spark SQL, the DataFrames API, and RDDs. Optimize Spark jobs for performance, efficiency, and resource utilization through meticulous tuning (e.g., partitioning, caching, shuffle optimizations); see the tuning sketch after this list.
  • Data Modeling & SQL: Design and implement efficient data models for various analytical workloads (e.g., dimensional modeling, star/snowflake schemas, data lakehouse architectures). Write complex SQL queries for data extraction, transformation, and validation; see the star-schema sketch after this list.
  • Data Quality & Governance: Implement and enforce data quality checks, validation rules, and data governance standards within pipelines to ensure accuracy, completeness, and consistency of data; see the quality-gate sketch after this list.
  • Performance Monitoring & Troubleshooting: Monitor data pipeline performance, identify bottlenecks, and troubleshoot complex issues in production environments.
  • Collaboration: Work closely with data architects, data scientists, data analysts, and cross-functional engineering teams to understand data requirements, define solutions, and deliver high-quality data products.
  • Code Quality & Best Practices: Write clean, maintainable, and well-tested code. Participate in code reviews, contribute to architectural discussions, and champion data engineering best practices.
  • Documentation: Create and maintain comprehensive technical documentation, including design specifications, data flow diagrams, and operational procedures.
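
For illustration, here is a minimal batch ETL sketch in Scala/Spark along the lines of the pipeline-development responsibility. It is a sketch, not production code; all paths, column names, and the event schema are hypothetical placeholders, not details from this posting.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object DailyEventsEtl {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("daily-events-etl")
          .getOrCreate()

        // Extract: read raw JSON events (hypothetical input path).
        val raw = spark.read.json("s3a://raw-bucket/events/dt=2024-01-01/")

        // Transform: basic cleansing plus a per-user daily aggregate.
        val cleaned = raw
          .filter(col("user_id").isNotNull)
          .withColumn("event_ts", to_timestamp(col("event_time")))

        val dailyCounts = cleaned
          .groupBy(col("user_id"), to_date(col("event_ts")).as("event_date"))
          .agg(count("*").as("event_count"))

        // Load: write partitioned Parquet to the curated zone (hypothetical path).
        dailyCounts.write
          .mode("overwrite")
          .partitionBy("event_date")
          .parquet("s3a://curated-bucket/daily_event_counts/")

        spark.stop()
      }
    }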
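
For the streaming responsibility, a sketch using Spark Structured Streaming as one common way to consume Kafka in a Spark/Scala stack (the plain producer/consumer APIs or Kafka Streams are alternatives). Broker address, topic, and sink paths are hypothetical, and the spark-sql-kafka connector is assumed to be on the classpath.

    import org.apache.spark.sql.SparkSession

    object EventsStream {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("events-stream")
          .getOrCreate()

        // Source: subscribe to a Kafka topic (hypothetical broker and topic).
        val stream = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "events")
          .option("startingOffsets", "latest")
          .load()

        // Kafka rows carry key/value as binary; cast to strings for processing.
        val messages = stream.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

        // Sink: append to Parquet; the checkpoint makes the query restartable.
        val query = messages.writeStream
          .format("parquet")
          .option("path", "s3a://curated-bucket/events_stream/")
          .option("checkpointLocation", "s3a://curated-bucket/_chk/events_stream/")
          .start()

        query.awaitTermination()
      }
    }

The checkpoint location is what lets a restarted query resume from its last committed Kafka offsets rather than reprocessing the topic.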
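
A tuning sketch for the Spark-expertise bullet. The partition counts and config values below are placeholders to be sized from real data volumes and cluster resources, not recommendations.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col
    import org.apache.spark.storage.StorageLevel

    object TuningSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("tuning-sketch")
          // Shuffle parallelism: size to cluster cores and shuffle volume.
          .config("spark.sql.shuffle.partitions", "400")
          // Adaptive Query Execution coalesces small shuffle partitions at runtime.
          .config("spark.sql.adaptive.enabled", "true")
          .getOrCreate()

        // Hypothetical curated dataset.
        val orders = spark.read.parquet("s3a://curated-bucket/orders/")

        // Cache a DataFrame that several downstream actions reuse.
        val completed = orders
          .filter(col("status") === "COMPLETED")
          .persist(StorageLevel.MEMORY_AND_DISK)

        // Repartition by the aggregation key to spread skewed shuffle work.
        val byCustomer = completed.repartition(200, col("customer_id"))

        byCustomer.groupBy("customer_id").count().show(10)

        completed.unpersist()
        spark.stop()
      }
    }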
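
For the data-modeling bullet, a Spark SQL sketch of a star-schema query: a fact table joined to a dimension and aggregated. Table and column names are hypothetical examples.

    import org.apache.spark.sql.SparkSession

    object RevenueByRegion {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("revenue-by-region").getOrCreate()

        // Register curated Parquet datasets as views (hypothetical paths).
        spark.read.parquet("s3a://curated-bucket/fact_sales/").createOrReplaceTempView("fact_sales")
        spark.read.parquet("s3a://curated-bucket/dim_store/").createOrReplaceTempView("dim_store")

        // Fact-to-dimension join, aggregated per region.
        val revenue = spark.sql(
          """SELECT d.region,
            |       SUM(f.sale_amount) AS total_revenue
            |FROM   fact_sales f
            |JOIN   dim_store  d ON f.store_id = d.store_id
            |GROUP  BY d.region
            |ORDER  BY total_revenue DESC""".stripMargin)

        revenue.show()
        spark.stop()
      }
    }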
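
For the data-quality bullet, a minimal in-pipeline quality gate: count rule violations and fail the job before bad data reaches downstream consumers. The rules and column names are hypothetical.

    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.col

    object QualityGate {
      // Number of rows violating basic completeness/validity rules.
      def violations(df: DataFrame): Long =
        df.filter(col("user_id").isNull || col("event_count") < 0).count()

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("quality-gate").getOrCreate()
        val curated = spark.read.parquet("s3a://curated-bucket/daily_event_counts/")

        val bad = violations(curated)
        // Fail fast so the pipeline never publishes inconsistent data.
        require(bad == 0, s"Data quality check failed: $bad violating rows")

        spark.stop()
      }
    }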

Required Skills & Qualifications:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
  • 3+ years of hands-on experience as a Data Engineer or a similar role focused on Big Data.
  • Expert-level proficiency in Scala for developing robust and scalable data applications.
  • Strong hands-on experience with Apache Spark, including Spark Core, Spark SQL, and DataFrames API. Proven ability to optimize Spark jobs.
  • Solid experience with Apache Kafka for building real-time data streaming solutions (producer, consumer APIs, stream processing concepts).
  • Advanced SQL skills for data manipulation, analysis, and validation.
  • Experience with distributed file systems (e.g., HDFS) and object storage (e.g., Amazon S3, Azure Data Lake Storage, Google Cloud Storage).
  • Familiarity with data warehousing concepts and methodologies.
  • Experience with version control systems (e.g., Git).
  • Excellent problem-solving, analytical, and debugging skills.
  • Strong communication and collaboration abilities, with a passion for building data solutions.
