Posted: 1 day ago
On-site
Full Time
Key responsibilities
1. Architecture and roadmap
Define reference architectures for lakehouse and medallion patterns using Delta Lake, OneLake, and Synapse/Fabric Lakehouse for scalable analytics and AI.
Create domain-driven data models, canonical schemas, and patterns for batch and streaming integration (bronze/silver/gold).
2. Platform design and build
Design ingestion frameworks for batch (ADF/Fabric Pipelines) and streaming (Event Hubs, Kafka, IoT Hub) into ADLS/OneLake with Delta and Change Data Capture.
Architect Databricks workloads (PySpark/Scala/SQL) for ETL/ELT, feature engineering, and ML data prep with robust job orchestration and scheduling.
3. Real-time streaming
Lead Structured Streaming architectures in Databricks with exactly-once semantics, watermarking, and stateful aggregations; design Kappa/Lambda architectures where appropriate.
Implement low-latency serving layers and materialized views for near-real-time analytics and operational reporting.
4. Microsoft Fabric implementation
Establish Fabric workspaces, Lakehouse, Pipelines, Dataflows Gen2, Shortcuts to ADLS/OneLake, and semantic model standards for governed self-service BI.
Define data product patterns integrating Fabric with Databricks and Power BI for governed, reusable datasets.
5. Data governance and security
Implement RBAC/ABAC, Unity Catalog, Purview (lineage, glossary, classifications), encryption, network isolation, and data masking/tokenization.
Define data quality SLAs, expectations, and contracts; embed quality checks, observability, and lineage in pipelines.
6. DevOps and FinOps
Standardize CI/CD (Azure DevOps/GitHub), environment strategy, IaC (Bicep/Terraform), cluster policies, and workspace baselines.
Optimize cost via right-sized clusters, autoscaling, Photon, Delta optimization/Z-Order, and job scheduling.
7. Delivery leadership
Lead design reviews, threat modeling, performance testing, and production readiness; mentor engineers and partner with product/enterprise architects.
Translate business requirements into technical designs, estimates, and roadmaps; drive stakeholder communication and risk management.
Required skills and experience
8–12 years in data engineering/architecture, including 4+ years on the Azure data stack; strong leadership in complex enterprise programs.
Deep expertise
Databricks: PySpark/SQL, Delta Lake, Structured Streaming, Jobs/Workflows, Unity Catalog, cluster policies, performance tuning.
Azure: ADLS Gen2, Event Hubs/Kafka, Azure Functions/Logic Apps, Key Vault, ADF, Synapse; VNETs, Private Endpoints, Managed Identity.
Fabric: Lakehouse, OneLake, Pipelines, Dataflows Gen2, Shortcuts, semantic models, governance integration with Purview and Power BI.
Architecture patterns
Lakehouse, medallion, Data Mesh/data products, CDC with Debezium/Fivetran/ADF mapping data flows, SCD handling, schema evolution.
Batch and streaming design, watermarking, state store management, idempotency, backfills, and late/duplicate data handling.
Data management
Dimensional and semantic modeling, Data Vault/Kimball, query performance, partitioning, Z-Order, OPTIMIZE/VACUUM, file sizing.
DQ frameworks (Great Expectations/Deequ), monitoring/observability (Log Analytics, Databricks metrics), SLA/SLO design.
Security and compliance
Purview lineage and classification, Unity Catalog governance, PII/PHI handling, encryption, tokenization; audit, SOC2/ISO, GDPR/DPDP familiarity.
DevOps/IaC and automation
Git-based development, branch strategies, CI/CD for notebooks/SQL/artifacts, IaC for data resources, automated testing.
Communication and leadership
Strong stakeholder engagement, technical writing, solution estimation, and mentoring.
Nice to have
Experience with data products and mesh operating models; product lifecycle and contracts between producer/consumer domains.
ML/feature store integration (Databricks Feature Store), MLOps awareness for data readiness.
Knowledge of dbt, Terraform, Airflow, Confluent, and enterprise SSO/SCIM provisioning with Databricks/Fabric.
Qualifications
Bachelor’s/Master’s in Computer Science, Engineering, or a related field.
Certifications: Azure Solutions Architect Expert, Azure Data Engineer Associate, Databricks Data Engineer Professional/Associate, Microsoft Fabric Data Engineer Associate.
PwC India