Databricks Engineer

3 - 5 years


Posted: 11 hours ago | Platform: LinkedIn


Work Mode

On-site

Job Type

Full Time

Job Description

ABOUT US:

Founded in 2016, DataZymes is a next-generation analytics and data science company driving technology- and digital-led innovation for its clients, helping them get more value from their data and analytics investments. Our platforms are built on best-of-breed technologies, protecting clients' current investments while delivering more value for their money. As a premier partner for many Business Intelligence and Information Management companies, we also provide advisory and consulting services, helping clients make the right decisions and put together a long-term roadmap.

 

Our mission at DataZymes is to scale analytics and enable healthcare organizations to achieve non-linear, long-term, and sustainable growth. In a short span, we have built a high-performance team in focused practice areas, developed digital-enabled solutions, and begun working with some marquee names in the US healthcare industry.

 

JOB LOCATION:

 

QUALIFICATION REQUIRED:


EXPERIENCE REQUIRED:

3-5 years

EMPLOYMENT TYPE:

Full Time

Key Responsibilities

  • Pipeline Development: Design, build, and maintain efficient and scalable ETL/ELT pipelines on the Databricks platform using PySpark, SQL, and Delta Live Tables (DLT).
  • Lakehouse Management: Implement and manage data solutions within the Databricks Lakehouse Platform, ensuring best practices for data storage, governance, and management using Delta Lake and Unity Catalog.
  • Code Optimization: Write high-quality, maintainable, and optimized PySpark code for large-scale data processing and transformation tasks.
  • AI & ML Integration: Collaborate with data scientists to productionize machine learning models. Utilize Databricks AI features such as the Feature Store, MLflow for model lifecycle management, and AutoML for accelerating model development.
  • Data Quality & Governance: Implement robust data quality checks and validation frameworks to ensure data accuracy, completeness, and reliability within Delta tables.
  • Performance Tuning: Monitor, troubleshoot, and optimize the performance of Databricks jobs, clusters, and SQL warehouses to ensure efficiency and cost-effectiveness.
  • Collaboration: Work closely with data analysts, data scientists, and business stakeholders to understand their data requirements and deliver effective solutions.
  • Documentation: Create and maintain comprehensive technical documentation for data pipelines, architectures, and processes.


Required Qualifications & Skills


  • Experience: 3-5 years of hands-on experience in a data engineering role.
  • Databricks Expertise: Proven, in-depth experience with the Databricks platform, including Databricks Workflows, Notebooks, Clusters, and Delta Live Tables.
  • Programming Skills: Strong proficiency in Python and extensive hands-on experience with PySpark for data manipulation and processing.
  • Data Architecture: Solid understanding of modern data architectures, including the Lakehouse paradigm, Data Lakes, and Data Warehousing.
  • Delta Lake: Hands-on experience with Delta Lake, including schema evolution, ACID transactions, and time travel features.
  • SQL Proficiency: Excellent SQL skills and the ability to write complex queries for data analysis and transformation.
  • Databricks AI: Practical experience with Databricks AI/ML capabilities, particularly MLflow and the Feature Store.
  • Cloud Experience: Experience working with at least one major cloud provider (AWS, Azure, or GCP).
  • Problem-Solving: Strong analytical and problem-solving skills with the ability to debug complex data issues.
  • Communication: Excellent verbal and written communication skills.


Preferred Qualifications


  • Databricks Certified Data Engineer Associate/Professional certification.
  • Experience with CI/CD tools (e.g., Jenkins, Azure DevOps, GitHub Actions) for data pipelines.
  • Familiarity with streaming technologies like Structured Streaming.
  • Knowledge of data governance tools and practices within Unity Catalog.

