Data Engineer (Hadoop)

Experience: 5 years

Posted: 3 days ago | Platform: LinkedIn

Work Mode: On-site
Job Type: Full Time

Job Description

Position Overview

We are seeking an experienced Senior Data Engineer with at least 5 years of hands-on experience in data engineering. The ideal candidate will have a solid understanding of big data technologies and be skilled in building scalable data infrastructure, designing ETL pipelines, and working with tools such as Hadoop, PySpark, Kafka, and Apache NiFi.

Key Responsibilities

  • Design, develop, and maintain large-scale, high-performance data systems and data pipelines using Python, PySpark, Hadoop, and Kafka (a minimal sketch of such a pipeline follows this list).
  • Build, deploy, and optimize ETL workflows to process and transform large volumes of structured and unstructured data.
  • Collaborate with cross-functional teams to understand requirements and implement solutions that meet business needs.
  • Work with Apache NiFi for data ingestion, transformation, and flow management.
  • Write and optimize complex SQL queries for data manipulation and reporting.
  • Apply strong data structures and algorithms knowledge to solve complex technical problems.
  • Automate tasks and processes using Shell Script and Linux-based tools.
  • Participate in code reviews and design discussions.
  • Ensure adherence to best practices in software development, testing, and deployment.
  • Continuously improve software performance, scalability, and reliability.
  • Stay up to date with the latest developments in data engineering and big data technologies, and incorporate them into the team's practices.
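
For illustration, here is a minimal sketch of the kind of batch ETL pipeline described in the first responsibility above. The HDFS paths, dataset, and column names are hypothetical assumptions, not details of this role's actual stack:

```python
# Minimal PySpark batch ETL sketch: ingest raw JSON events from HDFS,
# clean and aggregate them, and write curated Parquet for reporting.
# All paths and column names below are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("daily-events-etl")  # hypothetical job name
    .getOrCreate()
)

# Extract: raw JSON events landed on HDFS (path is an assumption).
raw = spark.read.json("hdfs:///data/raw/events/2024-01-01/")

# Transform: drop rows without a user, derive a date column from the
# assumed event_ts field, and aggregate counts per day and event type.
daily_counts = (
    raw.filter(F.col("user_id").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
       .groupBy("event_date", "event_type")
       .count()
)

# Load: write partitioned Parquet for downstream SQL and reporting use.
(daily_counts.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("hdfs:///data/curated/daily_event_counts/"))

spark.stop()
```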


Required Skills and Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
  • At least 5 years of professional software development experience, with strong expertise in the following:
  • Python: Advanced proficiency in Python, including libraries such as pandas and NumPy.
  • PySpark: Experience with distributed data processing using PySpark.
  • Hadoop: Familiarity with the Hadoop ecosystem, including HDFS, MapReduce, and related tools.
  • Kafka: Hands-on experience building and maintaining Kafka-based messaging systems (see the streaming-read sketch after this list).
  • SQL: Strong knowledge of relational databases and advanced SQL querying.
  • Data Structures & Algorithms: Strong understanding and practical application of data structures and algorithms.
  • Data Engineering Best Practices: Deep understanding of data modeling, pipeline design, and data infrastructure architecture.
  • ETL Pipelines: Expertise in designing, building, and maintaining efficient ETL pipelines.
  • Apache NiFi: Knowledge of data flow management using Apache NiFi.
  • Shell Scripting: Proficiency in writing efficient shell scripts for task automation.
  • Linux: Strong knowledge of Linux systems and tools for development and deployment.
  • Experience with Agile development methodologies.
  • Excellent problem-solving skills and ability to troubleshoot complex technical issues.
  • Strong communication skills with the ability to work in a collaborative team environment.
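
As a companion to the Kafka requirement above, here is a minimal sketch of reading a Kafka topic with PySpark Structured Streaming. The broker address, topic name, payload schema, and output paths are assumptions, and the spark-sql-kafka connector package must be available on the cluster:

```python
# Minimal sketch: consume JSON messages from a Kafka topic and persist
# them to HDFS with Structured Streaming. Requires the
# spark-sql-kafka-0-10 connector; broker, topic, and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, LongType

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

# Assumed shape of the JSON payload on the topic.
schema = StructType([
    StructField("user_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_ts", LongType()),
])

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "events")                     # hypothetical topic
    .load()
    # Kafka values arrive as bytes; decode and parse the JSON payload.
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream
    .format("parquet")
    .option("path", "hdfs:///data/streaming/events/")         # hypothetical
    .option("checkpointLocation", "hdfs:///checkpoints/events/")
    .start()
)
query.awaitTermination()
```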


Preferred Qualifications

  • Experience with platforms such as Cloudera or Databricks.
  • Familiarity with container technologies such as Docker and Kubernetes.
  • Knowledge of data warehousing and data lakes.

Company: Tata Communications
Industry: Telecommunications
Location: Chennai
