Data Engineer

BAY Area Technology Solutions

10 - 20 years

12 - 22 Lacs

Bengaluru

Posted: 1 day ago

Skills Required

Data engineering, Azure Data Factory, Azure, PySpark, data warehouses, CI/CD, Databricks, Informatica, ETL, Unity Catalog, Python, hybrid cloud

Work Mode

Work from Office

Job Type

Full Time

Job Description

Role: Data Engineer

Experience: 10 to 20 years

Key Responsibilities:

Serve as the point of contact and subject matter expert for all Databricks-related activities, including architecture, development, and operational best practices.
Work closely with the Sales team to propose data roadmaps to prospects intending to migrate to the cloud, and create proofs of concept that showcase our expertise.
Design, develop, and manage ETL/ELT pipelines in Databricks using Python (PySpark), integrating various data sources to support business operations.
Leverage Unity Catalog to ensure data lineage, security, and governance are properly managed across the Databricks environment.
Implement and maintain CI/CD pipelines for Databricks, ensuring smooth deployments, version control, and automation using Git and other DevOps tools.
Build scalable data architectures, including Data Lakes, Lakehouses, and Data Warehouses, ensuring efficient data management and accessibility.
Configure and optimize Databricks clusters, jobs, and workflows for both batch and streaming data processing to handle large-scale datasets.
Stay up to date with the latest Databricks features and advancements, continuously enhancing our data engineering practices.
Collaborate with cross-functional teams to implement data governance and ensure compliance with security and industry regulations.
Monitor and tune Databricks workloads to ensure high performance and scalability, adapting to business needs as required.
Provide training, guidance, and mentorship to fellow cloud engineers, ensuring adherence to best practices and fostering a collaborative environment.

Qualifications:

5+ years of experience in data engineering with significant expertise in Databricks and Apache Spark.
Proficient in Unity Catalog for managing data lineage, security, and governance within the Databricks ecosystem.
Experience estimating and migrating legacy data warehouse workloads to Azure or hybrid cloud environments.
Experience building and optimizing ETL pipelines using tools like Azure Data Factory, Informatica, or similar.
Strong understanding of CI/CD practices with experience in Git for version control and integration with Databricks.
Expertise in SQL development and performance tuning for large-scale datasets.
Knowledge of the Azure ecosystem, including data services such as Azure Data Factory, Azure Data Lake, and Azure Storage.
Ability to work with both batch and streaming data processing pipelines.
Experience with data modeling and dimensional design (e.g., star schema).
Good understanding of data governance, compliance, and security best practices.
Excellent communication and problem-solving skills, with the ability to manage multiple priorities.
Ability to stay current on Databricks innovations and proactively introduce new features and capabilities to the team.
