Staff/Principal Data Engineer

8 - 13 years

35 - 40 Lacs

Posted: 2 weeks ago | Platform: Naukri

Work Mode: Work from Office

Job Type: Full Time

Job Description

Roles & Responsibilities
  • Define and lead the data architecture vision and strategy, ensuring it supports analytics, ML, and business operations at scale.
  • Architect and manage cloud-native data platforms using Databricks and AWS, leveraging the lakehouse architecture to unify data engineering and ML workflows.
  • Build and optimize large-scale batch and streaming pipelines using Apache Spark, Airflow, and AWS Glue, ensuring high availability and fault tolerance.
  • Design and develop data marts, warehouses, and analytics-ready datasets tailored for BI, product, and data science teams.
  • Implement robust ETL/ELT pipelines with a focus on reusability, modularity, and automated testing.
  • Enforce and scale data governance practices, including data lineage, cataloging, access management, and compliance with security and privacy standards.
  • Partner with ML Engineers and Data Scientists to build and deploy ML pipelines, leveraging Databricks MLflow, Feature Store, and MLOps practices.
  • Provide architectural leadership across data modeling, data observability, pipeline monitoring, and CI/CD for data workflows.
  • Evaluate emerging tools and frameworks, recommending technologies that align with platform scalability and cost-efficiency.
  • Mentor data engineers and foster a culture of technical excellence, innovation, and ownership across data teams.


Required Skills & Qualifications
  • 8+ years of hands-on experience in data engineering, with at least 4 years in a lead or architect-level role.
  • Deep expertise in Apache Spark, with proven experience developing large-scale distributed data processing pipelines.
  • Strong experience with the Databricks platform and its ecosystem (e.g., Delta Lake, Unity Catalog, MLflow, job orchestration, Workspaces, Clusters, Lakehouse architecture).
  • Extensive experience with workflow orchestration using Apache Airflow.
  • Proficiency in both SQL and NoSQL databases (e.g., Postgres, DynamoDB, MongoDB, Cassandra) with a deep understanding of schema design, query tuning, and data partitioning.
  • Proven background in building data warehouse/data mart architectures using AWS services like Redshift, Athena, Glue, Lambda, DMS, and S3.
  • Strong programming and scripting ability in Python (preferred) or other AWS-compatible languages.
  • Solid understanding of data modeling techniques, versioned datasets, and performance tuning strategies.
  • Hands-on experience implementing data governance, lineage tracking, data cataloging, and compliance frameworks (GDPR, HIPAA, etc.).
  • Experience with real-time data streaming using tools like Kafka, Kinesis, or Flink.
  • Working knowledge of MLOps tooling and workflows, including automated model deployment, monitoring, and ML pipeline orchestration.
  • Familiarity with MLflow, Feature Store, and Databricks-native ML tooling is a plus.
  • Strong grasp of CI/CD for data and ML pipelines, automated testing, and infrastructure-as-code (Terraform, CDK, etc.).
  • Excellent communication, leadership, and mentoring skills with a collaborative mindset and the ability to influence across functions.
