Job Description
As a Data Engineer with 5+ years of experience, you will design and develop scalable, reusable, and efficient data pipelines using modern data engineering platforms such as Microsoft Fabric, PySpark, and Data Lakehouse architectures. Your role will involve integrating data from diverse sources, transforming it into actionable insights, and ensuring high standards of data governance and quality. You will play a key role in establishing and enforcing data governance policies, monitoring pipeline performance, and optimizing for efficiency.

Key Responsibilities
- Design and build robust data pipelines using Microsoft Fabric components, including Pipelines, Notebooks (PySpark), Dataflows, and Lakehouse architecture.
- Ingest and transform data from cloud platforms (Azure, AWS), on-premises databases, SaaS platforms (e.g., Salesforce, Workday), and REST/OpenAPI-based APIs.
- Develop and maintain semantic models and define standardized KPIs for reporting and analytics in Power BI or equivalent BI tools.
- Implement and manage Delta Tables across bronze/silver/gold layers using the Lakehouse medallion architecture within OneLake or equivalent environments.
- Apply metadata-driven design principles to ensure pipeline parameterization, reusability, and scalability.
- Monitor, debug, and optimize pipeline performance; implement logging, alerting, and observability mechanisms.
- Establish and enforce data governance policies, including schema versioning, data lineage tracking, role-based access control (RBAC), and audit trail mechanisms.
- Perform data quality checks, including null detection, duplicate handling, schema drift management, outlier identification, and Slowly Changing Dimension (SCD) type management.

Required Skills & Qualifications
- 5+ years of hands-on experience in data engineering or related fields.
- Solid understanding of data lake/lakehouse architectures, preferably with Microsoft Fabric or equivalent tools (e.g., Databricks, Snowflake, Azure Synapse).
- Strong experience with PySpark and SQL, and with dataflows and notebooks.
- Exposure to BI tools such as Power BI, Tableau, or equivalent for data consumption layers.
- Experience with Delta Lake or similar transactional storage layers.
- Familiarity with data ingestion from SaaS applications, APIs, and enterprise databases.
- Understanding of data governance, lineage, and RBAC principles.
- Strong analytical, problem-solving, and communication skills.

Nice to Have
- Prior experience with Microsoft Fabric and the OneLake platform.
- Knowledge of CI/CD practices in data engineering.
- Experience implementing monitoring/alerting tools for data pipelines.

Join us for the opportunity to work on cutting-edge data engineering solutions in a fast-paced, collaborative environment focused on innovation and learning. Gain exposure to end-to-end data product development and deployment cycles.
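For candidates unfamiliar with the data quality checks listed under Key Responsibilities, here is a minimal sketch of null detection and duplicate handling in plain Python. In a Fabric notebook this logic would typically use PySpark DataFrame operations (e.g., filtering on null columns and dropDuplicates); the field names and sample records below are purely hypothetical.

```python
# Hypothetical sample records standing in for rows of an ingested table.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},            # missing required field
    {"id": 1, "email": "a@example.com"}, # duplicate key
]

# Null detection: flag rows whose required field is missing.
nulls = [r for r in records if r["email"] is None]

# Duplicate handling: keep the first occurrence per key,
# skipping rows that failed the null check.
seen, deduped = set(), []
for r in records:
    if r["email"] is not None and r["id"] not in seen:
        seen.add(r["id"])
        deduped.append(r)
```

In PySpark terms, the two steps correspond roughly to `df.filter(col("email").isNull())` for detection and `df.dropna().dropDuplicates(["id"])` for cleanup.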