Tech Lead - Databricks

4.0 - 7.0 years

4.0 - 7.0 Lacs P.A.

Navi Mumbai, Maharashtra, India

Posted: 1 week ago | Platform: Foundit


Work Mode: On-site

Job Type: Full Time

Job Description

We are seeking a skilled Databricks Architect to design, implement, and optimize scalable data solutions within our cloud-based data platform. This role requires extensive knowledge of Databricks (Azure/AWS), data engineering, and a deep understanding of data architecture principles, with the ability to drive strategy, best practices, and hands-on implementation for high-performance data processing and analytics solutions.

Responsibilities:

- Solution Architecture: Design and architect end-to-end data solutions using Databricks and Azure/AWS, including data ingestion, processing, and storage.
- Delta Lake Implementation: Leverage Delta Lake and the Lakehouse architecture to create robust, unified data structures that support advanced analytics and machine learning (a minimal sketch follows this list).
- Data Processing Development: Design, develop, and automate large-scale, high-performance data processing systems (batch and/or streaming) to drive business growth and enhance the product experience.
- Performance Tuning: Ensure optimal performance of data pipelines and workloads by applying best practices for resource management, auto-scaling, and query optimization in Databricks.
- Engineering Best Practices: Advocate for high-quality software engineering practices in building scalable data infrastructure and pipelines.
- Architecture/Solution Development: Develop architectures and solutions for large data projects using Databricks.
- Project Leadership: Lead data engineering projects to ensure pipelines are reliable, efficient, testable, and maintainable.
- Data Modeling: Design data models optimized for storage, retrieval, and critical product and business requirements.
- Logging Architecture: Understand and influence logging to support data flow, implementing logging best practices as needed.
- Standardization and Tooling: Contribute to shared data engineering tools and standards to boost productivity and quality for data engineers across the company.
- Collaboration: Work closely with leadership, engineers, program managers, and data scientists to understand and meet data needs.
- Partner Education: Use data engineering expertise to identify gaps in, and improve, existing logging and processes for partners.
- Data Governance: Collaborate with stakeholders to build data lineage, data governance, and data cataloging using Unity Catalog.
- Agile Project Management: Lead projects using agile methodologies.
- Communication: Communicate effectively with stakeholders at all organizational levels.
- Team Development: Recruit, retain, and develop team members, preparing them for increased responsibilities and challenges.
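To make the Delta Lake and batch-processing responsibilities above concrete, here is a minimal PySpark sketch of the kind of pipeline this role owns: ingesting a raw CSV drop into a partitioned Delta table. The paths, table layout, and column names (event_ts, event_id, event_date) are illustrative assumptions, not part of this posting.

```python
# Minimal Delta Lake batch-ingestion sketch (paths and schema are assumed).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder.appName("delta-ingest-sketch")
    # Delta Lake support is built into Databricks runtimes; these two settings
    # are only needed on open-source Spark with the delta-spark package installed.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Batch-read a raw CSV landing zone.
raw = (
    spark.read.option("header", "true")
    .option("inferSchema", "true")
    .csv("/mnt/raw/events/")  # assumed landing path
)

# Light curation: derive a partition column, drop duplicate records.
curated = (
    raw.withColumn("event_date", F.to_date("event_ts"))  # assumed timestamp column
       .dropDuplicates(["event_id"])                     # assumed business key
)

# Append into a date-partitioned Delta table; partitioning keeps
# date-bounded queries cheap, which matters at large data scale.
(
    curated.write.format("delta")
    .mode("append")
    .partitionBy("event_date")
    .save("/mnt/curated/events")  # assumed Delta table location
)
```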
Requirements:

- Experience: 10+ years of relevant industry experience.
- ETL Expertise: Skilled in custom ETL design, implementation, and maintenance.
- Data Modeling: Experience in developing and designing data models for reporting systems.
- Databricks Proficiency: Hands-on experience with Databricks SQL workloads.
- Data Ingestion: Expertise in ingesting data from offline files (e.g., CSV, TXT, JSON) as well as from APIs, databases, and CDC feeds; should have delivered such projects in the past.
- Pipeline Observability: Skilled in setting up robust observability for complete pipelines and Databricks on Azure/AWS.
- Database Knowledge: Proficient in relational databases and SQL query authoring.
- Programming and Frameworks: Experience with Java, Scala, Spark, PySpark, Python, and Databricks.
- Cloud Platforms: Cloud experience required (Azure/AWS preferred).
- Data Scale Handling: Experience working with large-scale data.
- Pipeline Design and Operations: Proven experience in designing, building, and operating robust data pipelines.
- Performance Monitoring: Skilled in deploying high-performance pipelines with reliable monitoring and logging.
- Cross-Team Collaboration: Able to work effectively across teams to establish the overarching data architecture and provide team guidance.
- ETL Optimization: Ability to optimize ETL pipelines to reduce data transfer and storage costs.
- Auto Scaling: Skilled in using Databricks SQL's auto-scaling feature to adjust worker numbers based on workload.

Tech Stack:

- Cloud Platform: Azure/AWS.
- Databricks (Azure/AWS): Databricks SQL Serverless, Databricks SQL, Databricks workspaces, Databricks notebooks, Databricks job scheduling, Data Catalog.
- Data Architecture: Delta Lake, Lakehouse concepts.
- Data Processing: Spark Structured Streaming and batch (see the streaming sketch below).
- File Formats: CSV, Avro, Parquet.
- CI/CD: CI/CD for ETL pipelines.
- Governance Model: Databricks' unified governance model (Unity Catalog) across clouds, supporting open formats and APIs.
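For the streaming side of the stack, a minimal Spark Structured Streaming sketch is shown below, continuously reading a landing directory of CSV files into a Delta table. The paths, schema, and checkpoint location are illustrative assumptions; in practice the source might equally be Kafka or a CDC feed.

```python
# Minimal Structured Streaming sketch: continuous file ingestion into Delta.
# All paths and the schema below are illustrative assumptions, not from the posting.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("streaming-ingest-sketch").getOrCreate()

# Streaming file sources require an explicit schema (no inference at runtime).
schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_ts", TimestampType()),
    StructField("payload", StringType()),
])

stream = (
    spark.readStream
    .schema(schema)
    .option("header", "true")
    .csv("/mnt/landing/events/")  # assumed landing directory
)

query = (
    stream.writeStream
    .format("delta")
    .outputMode("append")
    # The checkpoint records source offsets, so a restarted stream
    # resumes where it left off instead of reprocessing files.
    .option("checkpointLocation", "/mnt/checkpoints/events")
    .start("/mnt/curated/events_stream")  # assumed Delta output path
)

query.awaitTermination()
```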
