Posted: 3 weeks ago
Platform:
On-site
Full Time
Main Purpose:
▪ Collaborate with data scientists and business stakeholders to design, develop, and maintain efficient data pipelines feeding into the organization's data lake.
▪ Maintain the integrity and quality of the data lake, enabling accurate and actionable insights for data scientists and informed decision-making for business stakeholders.
▪ Utilize extensive knowledge of data engineering and cloud technologies to enhance the organization’s data infrastructure, promoting a culture of data-driven decision-making.
▪ Apply data engineering expertise to define and optimize data pipelines using advanced concepts to improve the efficiency and accessibility of data storage.
▪ Own the development of an extensive data catalog, ensuring robust data governance and facilitating effective data access and utilization across the organization.

Knowledge Skills and Abilities, Key Responsibilities:
▪ Contribute to the development of scalable and performant data pipelines on Databricks, leveraging Delta Lake, Delta Live Tables (DLT), and other core Databricks components.
▪ Develop data lakes/warehouses designed for optimized storage, querying, and real-time updates using Delta Lake.
▪ Implement effective data ingestion strategies from various sources (streaming, batch, API-based), ensuring seamless integration with Databricks.
▪ Ensure the integrity, security, quality, and governance of data across our Databricks-centric platforms.
▪ Collaborate with stakeholders (data scientists, analysts, product teams) to translate business requirements into Databricks-native data solutions.
▪ Build and maintain ETL/ELT processes, heavily utilizing Databricks, Spark (Scala or Python), SQL, and Delta Lake for transformations.
▪ Experience with CI/CD and DevOps practices specifically tailored for the Databricks environment.
▪ Monitor and optimize the cost-efficiency of data operations on Databricks, ensuring optimal resource utilization.
▪ Utilize a range of Databricks tools, including the Databricks CLI and REST API, alongside Apache Spark™, to develop, manage, and optimize data engineering solutions.

Key Relationships and Department Overview:
▪ Internal: Data Engineering Manager, developers across various departments, managers of departments in other regional hubs of Puma Energy
▪ External: Platform providers
Puma Energy
Mumbai Metropolitan Region
Salary: Not disclosed
Experience: Not specified