Data Architect with data lake implementation

6 - 11 years

25 - 37 Lacs

Posted: 5 days ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Skills Required:

  • Familiarity with data processing engines such as Apache Spark, Flink, or other big data tools.
  • Design, develop, and implement robust data lake architectures on cloud platforms (AWS/Azure).
  • Implement streaming and batch data pipelines using Apache Hudi, Apache Hive, and cloud-native services such as AWS Glue and Azure Data Lake.
  • Architect and optimize ingestion, compaction, partitioning, and indexing strategies in Apache Hudi.
  • Develop scalable data transformation and ETL frameworks using Python, Spark, and Flink.
  • Work closely with DataOps/DevOps to build CI/CD pipelines and monitoring tools for data lake platforms.
  • Ensure data governance, schema evolution handling, lineage tracking, and compliance.
  • Sound knowledge of Hive, Parquet/ORC formats, and the trade-offs between Delta Lake, Hudi, and Iceberg.
  • Strong understanding of schema evolution, data versioning, and ACID guarantees in data lakes.
  • Collaborate with analytics and BI teams to deliver clean, reliable, and timely datasets.
  • Troubleshoot performance bottlenecks in big data processing workloads and pipelines.
  • Experience with data governance tools and practices, including data cataloging, data lineage, and metadata management.
  • Strong understanding of data integration and movement between different storage systems (databases, data lakes, data warehouses).
  • Strong understanding of API integration for data ingestion, including RESTful services and streaming data.
  • Experience with data migration strategies, tools, and frameworks for moving data from legacy on-premises systems to cloud-based solutions.
  • Proficiency with data warehousing solutions (e.g., Google BigQuery, Snowflake).
  • Expertise in data modeling tools and techniques (e.g., SAP Datasphere, Sparx Enterprise Architect).
  • Strong knowledge of SQL and NoSQL databases (e.g., MongoDB, Cassandra).
  • Familiarity with cloud platforms (e.g., AWS, Azure, Google Cloud).

Nice To Have

  • Experience with Apache Iceberg, Delta Lake
  • Familiarity with Kinesis, Kafka, or any streaming platform
  • Exposure to dbt, Airflow, or Dagster
  • Experience in data cataloging, data governance tools, and column-level lineage tracking
