Your future duties and responsibilities
- Develop and build ingestion, transformation, storage, and consumption layers on Databricks.
- Develop and maintain data models, data flow diagrams, and other solution documentation based on Kimball dimensional modelling principles.
- Develop and implement Lakehouse solutions using Delta Lake, Unity Catalog, and Structured Streaming.
- Implement data governance using Unity Catalog, including fine-grained access control, column-level lineage, data classification, audit logging, and centralized metadata management across workspaces and cloud environments.
- Develop scalable ETL/ELT pipelines using Apache Spark, PySpark, and Databricks Workflows (see the sketch after this list).
- Integrate Databricks with enterprise systems such as data catalogs, data quality frameworks, ML platforms, and BI tools.
- Design and develop high-performance reporting models, paginated reports, configurable inquiries, and interactive dashboards using Power BI.
- Set up CI/CD pipelines, version control, and automated testing for Databricks notebooks and jobs.
- Tune performance and configure clusters for Databricks workloads.
- Participate in architectural reviews and code audits to ensure adherence to standards and scalability.
- Collaborate closely with clients, business stakeholders, and internal teams to translate business requirements into technical solutions.
- Stay current with Databricks innovations and advocate for adoption of new features and capabilities.
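As a purely illustrative sketch of the kind of pipeline work described above (not a requirement of the role), the PySpark snippet below shows a minimal bronze-to-silver transformation written to a Delta table. The catalog, schema, table, and column names are hypothetical, and `spark` is assumed to be the session provided by the Databricks notebook runtime.

```python
# Illustrative sketch only: a minimal bronze-to-silver step on Databricks.
# Catalog/schema/table/column names are hypothetical; `spark` is the
# SparkSession provided by the Databricks notebook runtime.
from pyspark.sql import functions as F

bronze_df = spark.read.table("main.bronze.raw_orders")    # raw ingested data

silver_df = (
    bronze_df
    .dropDuplicates(["order_id"])                         # basic de-duplication
    .withColumn("order_date", F.to_date("order_ts"))      # derive a reporting date
    .filter(F.col("order_amount") >= 0)                   # simple data-quality rule
)

# Persist as a governed Delta table (registered in Unity Catalog)
(silver_df.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("main.silver.orders"))
```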
Required qualifications to be successful in this role
- Education: Bachelor's degree in computer science or a related field, or higher, with a minimum of 3 years of relevant experience.
- 3 to 6 years of experience in ETL/Power BI, with 3+ years in Databricks and Apache Spark.
- Strong proficiency in SQL and DAX.
- Experience in projects migrating Snowflake and other custom EDW/ETL solutions to Databricks.
- Experience migrating reporting solutions such as Cognos and SAP BO to Power BI and Databricks.
- Experience with Kimball dimensional modelling and data warehousing concepts.
- Experience in designing and deploying ETL/ELT pipelines for large-scale data integration.
- Proficiency in Power BI for paginated report and dashboard development, including DAX.
- Strong experience in Delta Lake, Structured Streaming, PySpark, and SQL.
- Strong understanding of Lakehouse architecture, data mesh, and modern data stack principles.
- Experience with Unity Catalog, Databricks Repos, Jobs API, and Workflows.
- Proven ability to design and implement secure, governed, and highly available data platforms.
- Familiarity with cloud platforms (Azure, AWS, GCP) and their integration with Databricks.
- Experience with CI/CD, DevOps, and infrastructure-as-code tools (Terraform, GitHub Actions, Azure DevOps).
- Knowledge of the machine learning lifecycle, MLflow, and model deployment strategies.
- Understanding of E-R data models (conceptual, logical, and physical).
- Strong analytical skills, including a thorough understanding of how to interpret customer business requirements and translate them into technical designs and solutions.
- Strong communication skills, both verbal and written; capable of collaborating effectively across a variety of IT and business groups, regions, and roles, and able to interact effectively with all levels.
- Strong problem-solving skills; ability to identify where focus is needed and bring clarity to business objectives, requirements, and priorities.
Must-Have Skills
- Azure Databricks, Databricks Lakehouse Architecture
- ETL/ELT, Data Architecture
- Apache Spark / PySpark, Delta Lake, Delta Live Tables (DLT)
- Unity Catalog, Medallion Architecture
- Dimensional Modeling (Star & Snowflake), Kimball, Data Vault
- Slowly Changing Dimensions (SCD Types 1, 2, 3) (illustrated below)
- Data Governance, RBAC, Data Lineage, Metadata Management
- CI/CD & DevOps (Azure DevOps, GitHub Actions, Terraform)
- SQL, Power BI, Self-Service Analytics, Semantic Model, Paginated Reports
- Data Quality (Great Expectations), Performance Tuning, Cost Optimization
- Cloud Platforms (Azure, AWS, GCP), Azure Data Factory, Synapse, Event Hubs
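For illustration only, the sketch below shows one common way to apply a Slowly Changing Dimension Type 2 update to a Delta table using the `DeltaTable` merge API. The table names, the `address` change-detection column, and the `is_current`/`start_date`/`end_date` housekeeping columns are all assumptions, and the staging and dimension schemas are assumed to align.

```python
# Illustrative sketch only: simplified SCD Type 2 upsert into a Delta dimension.
# All table and column names are hypothetical assumptions.
from delta.tables import DeltaTable
from pyspark.sql import functions as F

updates_df = spark.read.table("staging.customer_changes")
current_df = spark.read.table("main.sales.dim_customer").filter("is_current = true")

# Keep only rows that are new or whose tracked attribute changed
# (single-column change detection; a real pipeline would compare more attributes)
incoming = (
    updates_df.alias("s")
    .join(current_df.alias("t"),
          F.col("s.customer_id") == F.col("t.customer_id"), "left")
    .filter(F.col("t.customer_id").isNull() |
            (F.col("s.address") != F.col("t.address")))
    .select("s.*")
)

# Step 1: close out the superseded current rows
(DeltaTable.forName(spark, "main.sales.dim_customer").alias("t")
    .merge(incoming.alias("s"),
           "t.customer_id = s.customer_id AND t.is_current = true")
    .whenMatchedUpdate(set={"is_current": "false",
                            "end_date": "current_date()"})
    .execute())

# Step 2: append the new current versions
(incoming
    .withColumn("is_current", F.lit(True))
    .withColumn("start_date", F.current_date())
    .withColumn("end_date", F.lit(None).cast("date"))
    .write.format("delta").mode("append")
    .saveAsTable("main.sales.dim_customer"))
```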
Nice-to-Have Skills
- Streaming frameworks (Kafka, Event Hubs), workspace automation
- Advanced data modeling for Finance, Performance Budgeting, and HRM systems
- Subject-area models for financial reporting, workforce analytics, payroll insights
- Delta Change Data Feed (CDF) and real-time data marts (illustrated below)
- Certifications:
  - Databricks Certified Data Engineer Associate / Professional
  - Databricks Certified Associate Developer for Apache Spark
  - Azure / Power BI certifications
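As a hedged example of the Delta Change Data Feed item above, the snippet below reads row-level changes from a Delta table to feed a downstream mart. The table name and starting version are placeholders, and CDF is assumed to be enabled on the source table.

```python
# Illustrative sketch only: read the Change Data Feed from a Delta table.
# Table name and starting version are placeholders; CDF must be enabled on
# the source (TBLPROPERTIES: delta.enableChangeDataFeed = true).
changes_df = (
    spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingVersion", 10)
    .table("main.silver.orders")
)

# _change_type marks inserts, update pre/post images, and deletes;
# drop the pre-images when building a current-state mart
changes_df.filter("_change_type <> 'update_preimage'").show()
```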
Skills
- Azure Data Factory
- Azure DevOps
- English