Hadoop Architect

5 - 7 years

10 - 20 Lacs

Posted: 4 days ago | Platform: Naukri


Work Mode

Hybrid

Job Type

Full Time

Job Description

Job Summary

Data Lake Expert: real-time analytics, AI/ML pipelines, and regulatory compliance

Key Responsibilities

1. Data Lake Architecture & Design

  • Architect and implement scalable data lakehouse solutions using Apache Iceberg, Hudi, or Delta Lake.
  • Design partitioning, schema evolution, and metadata management strategies for large-scale datasets.
  • Develop unified batch and streaming data pipelines leveraging Apache Spark, Flink, and Kafka.

2. Data Ingestion & Processing

  • Build robust ETL/ELT frameworks to handle high-volume structured and semi-structured data (JSON, Parquet, ORC, Avro).
  • Optimize Spark jobs for performance, scalability, and cost efficiency on cloud or hybrid infrastructure.
  • Integrate data from multiple sources: APIs, relational databases, message queues, and IoT devices.

3. Data Governance & Quality

  • Implement data versioning, lineage tracking, time travel, and ACID transactions using Iceberg/Hudi.
  • Enforce data governance, access control, and security policies (RBAC, encryption, masking).
  • Collaborate with Data Scientists and Analysts to ensure data accuracy, completeness, and freshness.

4. Performance & Optimization

  • Tune cluster configurations, caching, and query performance on Spark and Presto/Trino.
  • Benchmark storage formats (Parquet, ORC) and compression codecs for optimal throughput.
  • Design efficient checkpointing and compaction mechanisms for streaming data pipelines.

5. DevOps & Integration

  • Deploy and manage pipelines using Airflow, Dagster, or Prefect.
  • Automate CI/CD for data pipelines using GitHub Actions, Jenkins, or Azure DevOps.
  • Integrate the data lake with downstream systems: BI tools (Superset, Power BI), ML pipelines, and APIs.

Required Skills and Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or Information Systems.
  • 5+ years of hands-on experience in building and maintaining large-scale data platforms.
  • Expert knowledge of Apache Spark, Iceberg, Hudi, or Delta Lake.
  • Strong experience with Kafka, Airflow, Trino/Presto, and Python or Scala.
  • Deep understanding of data modeling, distributed systems, and big data performance tuning.
  • Proficiency in SQL and NoSQL databases (PostgreSQL, MongoDB, Cassandra, etc.).
  • Knowledge of data governance frameworks and security best practices.

Preferred Qualifications

  • Experience with Lakehouse architecture in production environments.
  • Exposure to machine learning pipelines and real-time analytics use cases.
  • Familiarity with interoperability standards for open table formats (Iceberg, Delta, Hudi).
  • Understanding of Docker, Kubernetes, and cloud-native deployment strategies.
  • Certification in AWS Data Analytics, Databricks, or Cloudera is a plus.

Soft Skills

  • Strong problem-solving and analytical skills.
  • Excellent communication and documentation abilities.
  • Ability to work collaboratively in cross-functional teams (AI, BI, DevOps, Security).
  • Self-motivated, detail-oriented, and proactive in technology research.

Synkcode

Software Development

Dubai, Vadodara
