Posted: 1 week ago | Platform: Glassdoor


Work Mode: On-site

Job Type: Full-time

Job Description

Job Title: Data Engineer

Location: Bangalore / Chennai

Experience: 6 to 9 years

Job Summary:

We are looking for a highly skilled PySpark and Scala Developer with hands-on experience in building and optimizing large-scale data processing systems using Apache Spark. The ideal candidate will have a deep understanding of distributed computing, data transformations, and performance tuning, along with strong programming and problem-solving skills.

Key Responsibilities:

  • Design, develop, and maintain data pipelines using Apache Spark (PySpark & Scala) for large-scale data processing.
  • Implement data ingestion, cleansing, transformation, and aggregation workflows from various structured and unstructured data sources (a minimal PySpark sketch follows this list).
  • Optimize Spark jobs for performance and scalability in distributed environments.
  • Integrate Spark-based ETL processes with data lakes, warehouses, and streaming sources (e.g., Kafka, HDFS, S3).
  • Collaborate with data engineers, analysts, and architects to define and implement robust data solutions.
  • Develop reusable and modular components to support analytics and machine learning use cases.
  • Perform unit testing, debugging, and troubleshooting of Spark jobs.
  • Monitor and tune cluster resource usage and job performance.
  • Ensure data quality, reliability, and compliance with organizational data governance standards.
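
To give a concrete sense of the pipeline work described above, here is a minimal PySpark sketch of an ingest, cleanse, and aggregate batch job. All paths, column names, and schemas are hypothetical and for illustration only; this is a sketch of the kind of work involved, not a prescribed implementation.

# Minimal illustrative batch pipeline (hypothetical paths and columns).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-pipeline").getOrCreate()

# Ingest: read raw structured data (hypothetical S3 path).
raw = spark.read.json("s3a://example-bucket/raw/orders/")

# Cleanse and transform: drop malformed rows, normalize types.
orders = (
    raw
    .where(F.col("order_id").isNotNull())
    .withColumn("amount", F.col("amount").cast("double"))
    .withColumn("order_date", F.to_date("order_ts"))
)

# Aggregate: daily revenue and order counts per region.
daily = (
    orders
    .groupBy("region", "order_date")
    .agg(F.sum("amount").alias("revenue"), F.count("*").alias("orders"))
)

# Load: write partitioned Parquet to a curated zone (hypothetical path).
daily.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://example-bucket/curated/daily_revenue/"
)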

Required Skills and Experience:

  • 5–9 years of experience in data engineering or big data development.
  • Strong proficiency in Apache Spark (core, SQL, and streaming).
  • Hands-on programming experience in PySpark and Scala.
  • Solid understanding of distributed computing concepts, Spark internals, and RDD/DataFrame APIs.
  • Experience with Hadoop ecosystem (HDFS, Hive, YARN, etc.).
  • Strong SQL skills and experience working with relational databases (see the short Spark SQL example after this list).
  • Experience integrating with cloud platforms (AWS, Azure, or GCP).
  • Familiarity with workflow orchestration tools (e.g., Airflow, Oozie, or Azure Data Factory).
  • Knowledge of CI/CD practices and version control (Git, Jenkins).
  • Excellent analytical, debugging, and problem-solving skills.
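
As a brief illustration of the Spark SQL and DataFrame proficiency listed above, the snippet below shows that both APIs operate interchangeably over the same data. The table and column names are hypothetical.

# Illustrative only: DataFrame API and Spark SQL over the same data.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-example").getOrCreate()

df = spark.createDataFrame(
    [("c1", "Bangalore", 120.0), ("c2", "Chennai", 80.5)],
    ["customer_id", "city", "spend"],
)

# Register a temp view so the same DataFrame can be queried with SQL.
df.createOrReplaceTempView("customers")
spark.sql(
    "SELECT city, SUM(spend) AS total_spend "
    "FROM customers GROUP BY city"
).show()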

Good to Have:

  • Experience with Delta Lake, Databricks, or Snowflake.
  • Exposure to streaming frameworks such as Kafka or Spark Streaming (a hedged streaming sketch follows this list).
  • Knowledge of Python libraries for data analysis (Pandas, NumPy).
  • Familiarity with containerization (Docker, Kubernetes).
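
To illustrate the streaming exposure mentioned above, the following sketch reads a Kafka topic with Spark Structured Streaming. The broker address, topic name, and event schema are assumptions, and the spark-sql-kafka connector is assumed to be on the classpath.

# Hedged sketch: Kafka ingestion via Spark Structured Streaming.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Hypothetical event schema for the JSON payloads on the topic.
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("user_id", StringType()),
    StructField("amount", DoubleType()),
])

# Read a Kafka topic as a streaming DataFrame (hypothetical broker/topic).
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Write parsed events to the console sink, purely for demonstration.
query = events.writeStream.format("console").outputMode("append").start()
query.awaitTermination()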


Pay: Up to ₹2,500,000 per year

Experience:

  • PySpark: 5 years (Preferred)
  • Scala: 5 years (Preferred)

Location:

  • Bangalore, Karnataka (Required)

Work Location: In person
