
196 Delta Lake Jobs - Page 7

JobPe aggregates job listings for easy access, but you apply directly on the original job portal.

6.0 - 10.0 years

22 - 25 Lacs

Bengaluru

Hybrid

Mandatory Skills & Experience: 6 to 8 years of experience in data engineering, with strong experience in Oracle DWH/ODS environments. Minimum 3+ years hands-on experience in Databricks (including PySpark, SQL, Delta Lake, Workflows). Strong understanding of Lakehouse architecture, cloud data platforms, and big data processing. Proven experience in migrating data warehouse and ETL workloads from Oracle to cloud platforms. Experience with PL/SQL, query tuning, and reverse engineering legacy systems. Exposure to Pentaho and/or TIBCO Data Virtualization/Integration tools. Experience with CI/CD pipelines, version control (e.g., Git), and automated testing. Familiarity with data governance, security policies, and compliance in cloud environments. Strong communication and documentation skills. Preferred Skills (Advantage): Experience in cloud migration projects (AWS/Azure). Knowledge of Delta Lake, Unity Catalog, and Databricks workflows. Exposure to Kafka for real-time data streaming. Experience with ETL tools like Pentaho or TIBCO will be an added advantage. AWS/Azure/Databricks certifications. Tools & Technologies: Databricks, Oracle, Hadoop (HDFS, Hive, Sqoop), AWS (S3, EMR, Glue, Lambda, RDS); PySpark, SQL, Python, Kafka; CI/CD (Jenkins, GitHub Actions); Orchestration (Airflow, Control-M); JIRA, Confluence, Git (GitHub/Bitbucket). Cloud Certifications (Preferred): Databricks Certified Data Engineer; AWS Certified Solutions Architect/Developer
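
For context, a minimal PySpark sketch of the kind of Oracle-to-Delta copy such a migration involves; the JDBC URL, schema, table, and partition bounds are placeholders, and the Oracle JDBC driver is assumed to be installed on the cluster.

```python
# Hypothetical sketch: bulk-copy one Oracle table into a Delta table on Databricks.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already available as `spark` on Databricks

oracle_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1")  # placeholder host/service
    .option("dbtable", "DWH.SALES_FACT")                             # placeholder table
    .option("user", "etl_user")
    .option("password", "********")          # in practice, pull from a secret scope
    .option("fetchsize", 10000)
    .option("numPartitions", 8)              # parallel reads via partitioned JDBC
    .option("partitionColumn", "SALE_ID")
    .option("lowerBound", 1)
    .option("upperBound", 100_000_000)
    .load()
)

# Land the data as a Delta table in the bronze layer of the lake
(oracle_df.write
    .format("delta")
    .mode("overwrite")
    .save("/mnt/lake/bronze/sales_fact"))
```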

Posted 1 month ago

Apply

8.0 - 13.0 years

25 - 40 Lacs

Chennai

Work from Office

Architect & Build Scalable Systems: Design and implement petabyte-scale lakehouse architectures to unify data lakes and warehouses. Real-Time Data Engineering: Develop and optimize streaming pipelines using Kafka, Pulsar, and Flink. Required candidate profile: Data engineering experience with large-scale systems. Expert proficiency in Java for data-intensive applications. Hands-on experience with lakehouse architectures, stream processing, and event streaming.

Posted 1 month ago

Apply

7.0 - 9.0 years

15 - 18 Lacs

Pune

Work from Office

We are looking for a highly skilled Senior Databricks Developer to join our data engineering team. You will be responsible for building scalable and efficient data pipelines using Databricks, Apache Spark, Delta Lake, and cloud-native services (Azure/AWS/GCP). You will work closely with data architects, data scientists, and business stakeholders to deliver high-performance, production-grade solutions. Key Responsibilities: - Design, build, and maintain scalable and efficient data pipelines on Databricks using PySpark, Spark SQL, and optionally Scala. - Work with Databricks components including Workspace, Jobs, DLT (Delta Live Tables), Repos, and Unity Catalog. - Implement and optimize Delta Lake solutions aligned with Lakehouse and Medallion architecture best practices. - Collaborate with data architects, engineers, and business teams to understand requirements and deliver production-grade solutions. - Integrate CI/CD pipelines using tools such as Azure DevOps, GitHub Actions, or similar for Databricks deployments. - Ensure data quality, consistency, governance, and security by using tools like Unity Catalog or Azure Purview. - Use orchestration tools such as Apache Airflow, Azure Data Factory, or Databricks Workflows to schedule and monitor pipelines. - Apply strong SQL skills and data warehousing concepts in data modeling and transformation logic. - Communicate effectively with technical and non-technical stakeholders to translate business requirements into technical solutions. Required Skills and Qualifications: - Hands-on experience in data engineering, specifically in Databricks. - Deep expertise in Databricks Workspace, Jobs, DLT, Repos, and Unity Catalog. - Strong programming skills in PySpark, Spark SQL; Scala experience is a plus. - Proficient in working with one or more cloud platforms: Azure, AWS, or GCP. - Experience with Delta Lake, Lakehouse architecture, and medallion architecture patterns. - Proficient in building CI/CD pipelines for Databricks using DevOps tools. - Familiarity with orchestration and ETL/ELT tools such as Airflow, ADF, or Databricks Workflows. - Strong understanding of data governance, metadata management, and lineage tracking. - Excellent analytical, communication, and stakeholder management skills.
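
For illustration, a minimal bronze-to-silver hop in a medallion-style Delta pipeline; paths, column names, and casts are assumptions, not part of the posting.

```python
# Minimal sketch of a bronze -> silver hop in a medallion layout.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: raw records landed as-is
bronze = spark.read.format("delta").load("/mnt/lake/bronze/orders")

# Silver: typed, deduplicated, with basic quality filters
silver = (
    bronze
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("order_date", F.to_date("order_ts"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    .filter(F.col("order_id").isNotNull())
    .dropDuplicates(["order_id"])
)

(silver.write
    .format("delta")
    .mode("overwrite")
    .option("overwriteSchema", "true")
    .partitionBy("order_date")
    .save("/mnt/lake/silver/orders"))
```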

Posted 1 month ago

Apply

4.0 - 8.0 years

7 - 11 Lacs

Hyderabad, Bengaluru

Hybrid

Job Summary We are seeking a skilled Azure Data Engineer with 4 years of overall experience , including at least 2 years of hands-on experience with Azure Databricks (Must) . The ideal candidate will have strong expertise in building and maintaining scalable data pipelines and working across cloud-based data platforms. Key Responsibilities Design, develop, and optimize large-scale data pipelines using Azure Data Factory, Azure Databricks, and Azure Synapse. Implement data lake solutions and work with structured and unstructured datasets in Azure Data Lake Storage (ADLS). Collaborate with data scientists, analysts, and engineering teams to design and deliver end-to-end data solutions. Develop ETL/ELT processes and integrate data from multiple sources. Monitor, debug, and optimize workflows for performance and cost-efficiency. Ensure data governance, quality, and security best practices are maintained. Must-Have Skills 4+ years of total experience in data engineering. 2+ years of experience with Azure Databricks (PySpark, Notebooks, Delta Lake) . Strong experience with Azure Data Factory , Azure SQL , and ADLS . Proficient in writing SQL queries and Python/Scala scripting. Understanding of CI/CD pipelines and version control systems (e.g., Git). Solid grasp of data modeling and warehousing concepts.
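
For illustration, a hedged sketch of landing CSV files from ADLS Gen2 into a Delta table; the storage account, containers, and secret scope are placeholders, and dbutils is available only inside a Databricks notebook.

```python
# Hedged sketch: read raw CSVs from ADLS Gen2 and append them to a Delta table.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

storage_account = "mydatalake"  # placeholder account name
spark.conf.set(
    f"fs.azure.account.key.{storage_account}.dfs.core.windows.net",
    dbutils.secrets.get(scope="adls-scope", key="storage-key"),  # Databricks-notebook-only helper
)

raw = (
    spark.read
    .option("header", "true")
    .csv(f"abfss://landing@{storage_account}.dfs.core.windows.net/sales/2024/")
)

raw.write.format("delta").mode("append").save(
    f"abfss://curated@{storage_account}.dfs.core.windows.net/delta/sales"
)
```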

Posted 1 month ago

Apply

11.0 - 17.0 years

20 - 35 Lacs

Indore, Hyderabad

Work from Office

Greetings of the Day!! We have a job opening for Microsoft Fabric + ADF with one of our clients. If you are interested in this position, please share your updated resume to this email id: shaswati.m@bct-consulting.com. Primary Skill: Microsoft Fabric. Secondary Skill 1: Azure Data Factory (ADF). 12+ years of experience in Microsoft Azure Data Engineering for analytical projects. Proven expertise in designing, developing, and deploying high-volume, end-to-end ETL pipelines for complex models, including batch and real-time data integration frameworks using Azure, Microsoft Fabric, and Databricks. Extensive hands-on experience with Azure Data Factory, Databricks (with Unity Catalog), Azure Functions, Synapse Analytics, Data Lake, Delta Lake, and Azure SQL Database for managing and processing large-scale data integrations. Experience in Databricks cluster optimization and workflow management to ensure cost-effective and high-performance processing. Sound knowledge of data modelling, data governance, data quality management, and data modernization processes. Develop architecture blueprints and technical design documentation for Azure-based data solutions. Provide technical leadership and guidance on cloud architecture best practices, ensuring scalable and secure solutions. Keep abreast of emerging Azure technologies and recommend enhancements to existing systems. Lead proofs of concept (PoCs) and adopt agile delivery methodologies for solution development and delivery.

Posted 1 month ago

Apply

13.0 - 20.0 years

40 - 45 Lacs

Bengaluru

Work from Office

Principal Architect - Platform & Application Architect. Experience: 15+ years in software/data platform architecture, including 5+ years in architectural leadership roles; architecture & data platform expertise. Education: Bachelor's/Master's in CS, Engineering, or related field. Title: Principal Architect. Location: Onsite, Bangalore. Experience: 15+ years in software & data platform architecture and technology strategy. Role Overview: We are seeking a Platform & Application Architect to lead the design and implementation of a next-generation, multi-domain data platform and its ecosystem of applications. In this strategic and hands-on role, you will define the overall architecture, select and evolve the technology stack, and establish best practices for governance, scalability, and performance. Your responsibilities will span the full data lifecycle (ingestion, processing, storage, and analytics) while ensuring the platform is adaptable to diverse and evolving customer needs. This role requires close collaboration with product and business teams to translate strategy into actionable, high-impact platforms & products. Key Responsibilities: 1. Architecture & Strategy: Design the end-to-end architecture for an on-prem/hybrid data platform (data lake/lakehouse, data warehouse, streaming, and analytics components). Define and document data blueprints, data domain models, and architectural standards. Lead build vs. buy evaluations for platform components and recommend best-fit tools and technologies. 2. Data Ingestion & Processing: Architect batch and real-time ingestion pipelines using tools like Kafka, Apache NiFi, Flink, or Airbyte. Oversee scalable ETL/ELT processes and orchestrators (Airflow, dbt, Dagster). Support diverse data sources: IoT, operational databases, APIs, flat files, unstructured data. 3. Storage & Modeling: Define strategies for data storage and partitioning (data lakes, warehouses, Delta Lake, Iceberg, or Hudi). Develop efficient data strategies for both OLAP and OLTP workloads. Guide schema evolution, data versioning, and performance tuning. 4. Governance, Security, and Compliance: Establish data governance, cataloging, and lineage tracking frameworks. Implement access controls, encryption, and audit trails to ensure compliance with DPDPA, GDPR, HIPAA, etc. Promote standardization and best practices across business units. 5. Platform Engineering & DevOps: Collaborate with infrastructure and DevOps teams to define CI/CD, monitoring, and DataOps pipelines. Ensure observability, reliability, and cost efficiency of the platform. Define SLAs, capacity planning, and disaster recovery plans. 6. Collaboration & Mentorship: Work closely with data engineers, scientists, analysts, and product owners to align platform capabilities with business goals. Mentor teams on architecture principles, technology choices, and operational excellence. Skills & Qualifications: Bachelor's or Master's degree in Computer Science, Engineering, or a related field. 12+ years of experience in software engineering, including 5+ years in architectural leadership roles. Proven expertise in designing and scaling distributed systems, microservices, APIs, and event-driven architectures using Java, Python, or Node.js. Strong hands-on experience with building scalable data platforms in on-premise/hybrid/cloud environments. Deep knowledge of modern data lake and warehouse technologies (e.g., Snowflake, BigQuery, Redshift) and table formats like Delta Lake or Iceberg. Familiarity with data mesh, data fabric, and lakehouse paradigms.
Strong understanding of system reliability, observability, DevSecOps practices, and platform engineering principles. Demonstrated success in leading large-scale architectural initiatives across enterprise-grade or consumer-facing platforms. Excellent communication, documentation, and presentation skills, with the ability to simplify complex concepts and influence at executive levels. Certifications such as TOGAF or AWS Solutions Architect (Professional) and experience in regulated domains (e.g., finance, healthcare, aviation) are desirable.

Posted 1 month ago

Apply

3.0 - 7.0 years

5 - 9 Lacs

Hyderabad

Work from Office

What you will do Role Description: We are seeking a Senior Data Engineer with expertise in Graph Data technologies to join our data engineering team and contribute to the development of scalable, high-performance data pipelines and advanced data models that power next-generation applications and analytics. This role combines core data engineering skills with specialized knowledge in graph data structures, graph databases, and relationship-centric data modeling, enabling the organization to leverage connected data for deep insights, pattern detection, and advanced analytics use cases. The ideal candidate will have a strong background in data architecture, big data processing, and Graph technologies and will work closely with data scientists, analysts, architects, and business stakeholders to design and deliver graph-based data engineering solutions. Roles & Responsibilities: Design, build, and maintain robust data pipelines using Databricks (Spark, Delta Lake, PySpark) for complex graph data processing workflows. Own the implementation of graph-based data models, capturing complex relationships and hierarchies across domains. Build and optimize Graph Databases such as Stardog, Neo4j, Marklogic or similar to support query performance, scalability, and reliability. Implement graph query logic using SPARQL, Cypher, Gremlin, or GSQL, depending on platform requirements. Collaborate with data architects to integrate graph data with existing data lakes, warehouses, and lakehouse architectures. Work closely with data scientists and analysts to enable graph analytics, link analysis, recommendation systems, and fraud detection use cases. Develop metadata-driven pipelines and lineage tracking for graph and relational data processing. Ensure data quality, governance, and security standards are met across all graph data initiatives. Mentor junior engineers and contribute to data engineering best practices, especially around graph-centric patterns and technologies. Stay up to date with the latest developments in graph technology, graph ML, and network analytics. What we expect of you Must-Have Skills: Hands-on experience in Databricks, including PySpark, Delta Lake, and notebook-based development. Hands-on experience with graph database platforms such as Stardog, Neo4j, Marklogic etc. Strong understanding of graph theory, graph modeling, and traversal algorithms Proficiency in workflow orchestration, performance tuning on big data processing Strong understanding of AWS services Ability to quickly learn, adapt and apply new technologies with strong problem-solving and analytical skills Excellent collaboration and communication skills, with experience working with Scaled Agile Framework (SAFe), Agile delivery practices, and DevOps practices. 
Good-to-Have Skills: Good to have deep expertise in Biotech & Pharma industries Experience in writing APIs to make the data available to the consumers Experienced with SQL/NOSQL database, vector database for large language models Experienced with data modeling and performance tuning for both OLAP and OLTP databases Experienced with software engineering best-practices, including but not limited to version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven etc.), automated unit testing, and Dev Ops Education and Professional Certifications Masters degree and 3 to 4 + years of Computer Science, IT or related field experience Bachelors degree and 5 to 8 + years of Computer Science, IT or related field experience AWS Certified Data Engineer preferred Databricks Certificate preferred Scaled Agile SAFe certification preferred Soft Skills: Excellent analytical and troubleshooting skills. Strong verbal and written communication skills Ability to work effectively with global, virtual teams High degree of initiative and self-motivation. Ability to manage multiple priorities successfully. Team-oriented, with a focus on achieving team goals. Ability to learn quickly, be organized and detail oriented. Strong presentation and public speaking skills.

Posted 1 month ago

Apply

0.0 - 2.0 years

2 - 4 Lacs

Hyderabad

Work from Office

Role Description: We are looking for an Associate Data Engineer with deep expertise in writing data pipelines to build scalable, high-performance data solutions. The ideal candidate will be responsible for developing, optimizing and maintaining complex data pipelines, integration frameworks, and metadata-driven architectures that enable seamless access and analytics. This role requires a deep understanding of big data processing, distributed computing, data modeling, and governance frameworks to support self-service analytics, AI-driven insights, and enterprise-wide data management. Roles & Responsibilities: Data Engineer who owns development of complex ETL/ELT data pipelines to process large-scale datasets Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions Ensure data integrity, accuracy, and consistency through rigorous quality checks and monitoring Explore and implement new tools and technologies to enhance the ETL platform and the performance of the pipelines Proactively identify and implement opportunities to automate tasks and develop reusable frameworks Eager to understand the biotech/pharma domains & build highly efficient data pipelines to migrate and deploy complex data across systems Work in an Agile and Scaled Agile (SAFe) environment, collaborating with cross-functional teams, product owners, and Scrum Masters to deliver incremental value Use JIRA, Confluence, and Agile DevOps tools to manage sprints, backlogs, and user stories. Support continuous improvement, test automation, and DevOps practices in the data engineering lifecycle Collaborate and communicate effectively with product and cross-functional teams to understand business requirements and translate them into technical solutions Must-Have Skills: Experience in Data Engineering with a focus on Databricks, AWS, Python, SQL, and Scaled Agile methodologies Proficiency in and strong understanding of data processing and transformation using big data frameworks (Databricks, Apache Spark, Delta Lake, and distributed computing concepts) Strong understanding of AWS services and the ability to demonstrate the same Ability to quickly learn, adapt and apply new technologies Strong problem-solving and analytical skills Excellent communication and teamwork skills Experience with Scaled Agile Framework (SAFe), Agile delivery, and DevOps practices Good-to-Have Skills: Data Engineering experience in the Biotechnology or pharma industry Exposure to APIs, full stack development Experienced with SQL/NoSQL databases, vector databases for large language models Experienced with data modeling and performance tuning for both OLAP and OLTP databases Experienced with software engineering best practices, including but not limited to version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven, etc.), automated unit testing, and DevOps Education and Professional Certifications: Bachelor's degree and 2 to 5+ years of Computer Science, IT or related field experience OR Master's degree and 1 to 4+ years of Computer Science, IT or related field experience AWS Certified Data Engineer preferred Databricks Certificate preferred Scaled Agile SAFe certification preferred Soft Skills: Excellent analytical and troubleshooting skills. Strong verbal and written communication skills Ability to work effectively with global, virtual teams High degree of initiative and self-motivation. Ability to manage multiple priorities successfully. Team-oriented, with a focus on achieving team goals.
Ability to learn quickly, be organized and detail oriented. Strong presentation and public speaking skills.

Posted 1 month ago

Apply

9.0 - 14.0 years

11 - 16 Lacs

Hyderabad

Work from Office

Role Description: We are seeking a seasoned Solution Architect to drive the architecture, development and implementation of data solutions for Amgen functional groups. The ideal candidate should be able to work in large-scale Data Analytic initiatives, and engage and work along with Business, Program Management, Data Engineering and Analytic Engineering teams. Champion the enterprise data analytic strategy, data architecture blueprints and architectural guidelines. As a Solution Architect, you will play a crucial role in designing, building, and optimizing data solutions for Amgen functional groups such as R&D, Operations and GCO. Roles & Responsibilities: Implement and manage large-scale data analytic solutions for Amgen functional groups that align with the Amgen Data strategy Collaborate with Business, Program Management, Data Engineering and Analytic Engineering teams to deliver data solutions Responsible for the design, development, optimization, delivery and support of data solutions on AWS and Databricks architecture Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions. Provide expert guidance and mentorship to the team members, fostering a culture of innovation and best practices. Be passionate and hands-on to quickly experiment with new data-related technologies Define guidelines, standards, strategies, security policies and change management policies to support the Enterprise Data platform. Collaborate and align with EARB, Cloud Infrastructure, Security and other technology leaders on Enterprise Data Architecture changes Work with different project and application groups to drive growth of the Enterprise Data Platform using effective written/verbal communication skills, and lead demos at different roadmap sessions Overall management of the Enterprise Data Platform on the AWS environment to ensure that the service delivery is cost effective and business SLAs around uptime, performance and capacity are met Ensure scalability, reliability, and performance of data platforms by implementing best practices for architecture, cloud resource optimization, and system tuning. Collaboration with RunOps engineers to continuously increase our ability to push changes into production with as little manual overhead and as much speed as possible. Maintain knowledge of market trends and developments in data integration, data management and analytics software/tools Work as part of a team in a SAFe Agile/Scrum model Basic Qualifications and Experience: Master's degree with 6 - 8 years of experience in Computer Science, IT or related field OR Bachelor's degree with 9 - 12 years of experience in Computer Science, IT or related field. Functional Skills: Must-Have Skills: 7+ years of hands-on experience in Data integrations, Data Management and the BI technology stack. Strong experience with one or more Data Management tools such as AWS data lake, Snowflake or Azure Data Fabric Expert-level proficiency with Databricks and experience in optimizing data pipelines and workflows in Databricks environments. Strong experience with Python, PySpark, and SQL for building scalable data workflows and pipelines. Experience with Apache Spark, Delta Lake, and other relevant technologies for large-scale data processing.
Familiarity with BI tools including Tableau and PowerBI Demonstrated ability to enhance cost-efficiency, scalability, and performance for data solutions Strong analytical and problem-solving skills to address complex data solutions Good-to-Have Skills: Preferred to have experience in life science or tech or consultative solution architecture roles Experience working with agile development methodologies such as Scaled Agile. Professional Certifications AWS Certified Data Engineer preferred Databricks Certificate preferred Soft Skills: Excellent analytical and troubleshooting skills. Strong verbal and written communication skills Ability to work effectively with global, virtual teams High degree of initiative and self-motivation. Ability to manage multiple priorities successfully. Team-oriented, with a focus on achieving team goals Strong presentation and public speaking skills.

Posted 1 month ago

Apply

0.0 years

0 Lacs

Hyderabad / Secunderabad, Telangana, India

On-site

Genpact (NYSE: G) is a global professional services and solutions firm delivering outcomes that shape the future. Our 125,000+ people across 30+ countries are driven by our innate curiosity, entrepreneurial agility, and desire to create lasting value for clients. Powered by our purpose - the relentless pursuit of a world that works better for people - we serve and transform leading enterprises, including the Fortune Global 500, with our deep business and industry knowledge, digital operations services, and expertise in data, technology, and AI. Inviting applications for the role of Lead Consultant - Databricks Developer! In this role, the Databricks Developer is responsible for solving real-world, cutting-edge problems to meet both functional and non-functional requirements. Responsibilities: Maintains close awareness of new and emerging technologies and their potential application for service offerings and products. Work with architects and lead engineers on solutions to meet functional and non-functional requirements. Demonstrated knowledge of relevant industry trends and standards. Demonstrate strong analytical and technical problem-solving skills. Must have experience in the Data Engineering domain. Qualifications we seek in you! Minimum qualifications: Bachelor's Degree or equivalency (CS, CE, CIS, IS, MIS, or engineering discipline) or equivalent work experience. Maintains close awareness of new and emerging technologies and their potential application for service offerings and products. Work with architects and lead engineers on solutions to meet functional and non-functional requirements. Demonstrated knowledge of relevant industry trends and standards. Demonstrate strong analytical and technical problem-solving skills. Must have excellent coding skills in either Python or Scala, preferably Python. Must have experience in the Data Engineering domain. Must have implemented at least 2 projects end-to-end in Databricks. Must have hands-on experience with Databricks components including Delta Lake, dbConnect, db API 2.0, and Databricks workflows orchestration. Must be well versed with the Databricks Lakehouse concept and its implementation in enterprise environments. Must have a good understanding of how to create complex data pipelines. Must have good knowledge of data structures & algorithms. Must be strong in SQL and Spark SQL. Must have strong performance optimization skills to improve efficiency and reduce cost. Must have worked on both batch and streaming data pipelines. Must have extensive knowledge of Spark and Hive data processing frameworks. Must have worked on any cloud (Azure, AWS, GCP) and the most common services like ADLS/S3, ADF/Lambda, CosmosDB/DynamoDB, ASB/SQS, and cloud databases. Must be strong in writing unit test cases and integration tests. Must have strong communication skills and have worked in a team of size 5 plus. Must have a great attitude towards learning new skills and upskilling existing skills. Preferred Qualifications: Good to have Unity Catalog and basic governance knowledge. Good to have Databricks SQL Endpoint understanding. Good to have CI/CD experience to build the pipeline for Databricks jobs. Good to have worked on a migration project to build a unified data platform. Good to have knowledge of DBT. Good to have knowledge of Docker and Kubernetes.
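
As a rough illustration of the Jobs API 2.0 mentioned above, a hedged sketch that triggers an existing Databricks job over REST; the workspace URL, token, and job id are placeholders.

```python
# Hedged sketch: trigger an existing Databricks job through the REST Jobs API 2.0.
import requests

DATABRICKS_HOST = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder workspace URL
TOKEN = "dapiXXXXXXXXXXXX"                                              # placeholder PAT; keep in a secret store
JOB_ID = 123                                                            # placeholder job id

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"job_id": JOB_ID, "notebook_params": {"run_date": "2024-01-01"}},
    timeout=30,
)
resp.raise_for_status()
print("Triggered run:", resp.json()["run_id"])
```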
Genpact is an Equal Opportunity Employer and considers applicants for all positions without regard to race, color, religion or belief, sex, age, national origin, citizenship status, marital status, military/veteran status, genetic information, sexual orientation, gender identity, physical or mental disability or any other characteristic protected by applicable laws. Genpact is committed to creating a dynamic work environment that values respect and integrity, customer focus, and innovation. For more information, visit . Follow us on Twitter, Facebook, LinkedIn, and YouTube. Furthermore, please do note that Genpact does not charge fees to process job applications and applicants are not required to pay to participate in our hiring process in any other way. Examples of such scams include purchasing a 'starter kit,' paying to apply, or purchasing equipment or training.

Posted 1 month ago

Apply

1.0 - 3.0 years

3 - 5 Lacs

New Delhi, Chennai, Bengaluru

Hybrid

Your day at NTT DATA We are seeking an experienced Data Engineer to join our team in delivering cutting-edge Generative AI (GenAI) solutions to clients. The successful candidate will be responsible for designing, developing, and deploying data pipelines and architectures that support the training, fine-tuning, and deployment of LLMs for various industries. This role requires strong technical expertise in data engineering, problem-solving skills, and the ability to work effectively with clients and internal teams. What you'll be doing Key Responsibilities: Design, develop, and manage data pipelines and architectures to support GenAI model training, fine-tuning, and deployment Data Ingestion and Integration: Develop data ingestion frameworks to collect data from various sources, transform, and integrate it into a unified data platform for GenAI model training and deployment. GenAI Model Integration: Collaborate with data scientists to integrate GenAI models into production-ready applications, ensuring seamless model deployment, monitoring, and maintenance. Cloud Infrastructure Management: Design, implement, and manage cloud-based data infrastructure (e.g., AWS, GCP, Azure) to support large-scale GenAI workloads, ensuring cost-effectiveness, security, and compliance. Write scalable, readable, and maintainable code using object-oriented programming concepts in languages like Python, and utilize libraries like Hugging Face Transformers, PyTorch, or TensorFlow Performance Optimization: Optimize data pipelines, GenAI model performance, and infrastructure for scalability, efficiency, and cost-effectiveness. Data Security and Compliance: Ensure data security, privacy, and compliance with regulatory requirements (e.g., GDPR, HIPAA) across data pipelines and GenAI applications. Client Collaboration: Collaborate with clients to understand their GenAI needs, design solutions, and deliver high-quality data engineering services. Innovation and R&D: Stay up to date with the latest GenAI trends, technologies, and innovations, applying research and development skills to improve data engineering services. Knowledge Sharing: Share knowledge, best practices, and expertise with team members, contributing to the growth and development of the team. Bachelor's degree in Computer Science, Engineering, or related fields (Master's recommended) Experience with vector databases (e.g., Pinecone, Weaviate, Faiss, Annoy) for efficient similarity search and storage of dense vectors in GenAI applications 5+ years of experience in data engineering, with a strong emphasis on cloud environments (AWS, GCP, Azure, or Cloud Native platforms) Proficiency in programming languages like SQL, Python, and PySpark Strong data architecture, data modeling, and data governance skills Experience with Big Data Platforms (Hadoop, Databricks, Hive, Kafka, Apache Iceberg), Data Warehouses (Teradata, Snowflake, BigQuery), and lakehouses (Delta Lake, Apache Hudi) Knowledge of DevOps practices, including Git workflows and CI/CD pipelines (Azure DevOps, Jenkins, GitHub Actions) Experience with GenAI frameworks and tools (e.g., TensorFlow, PyTorch, Keras) Nice to have: Experience with containerization and orchestration tools like Docker and Kubernetes Experience integrating vector databases and implementing similarity search techniques, with a focus on GraphRAG, is a plus Familiarity with API gateway and service mesh architectures Experience with low latency/streaming, batch, and micro-batch processing Familiarity with Linux-based operating systems and REST APIs
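
As a rough illustration of the vector-database requirement, a small Faiss similarity-search sketch; it assumes the faiss package is installed, and the dimensions and vectors are made up.

```python
# Illustrative sketch (not from the posting): indexing dense vectors with Faiss
# for a similarity-search step in a GenAI retrieval pipeline.
import numpy as np
import faiss

dim = 384                                                     # e.g., a small sentence-embedding size
doc_vectors = np.random.rand(1000, dim).astype("float32")     # stand-in document embeddings
query = np.random.rand(1, dim).astype("float32")              # stand-in query embedding

index = faiss.IndexFlatL2(dim)    # exact L2 search; fine for small corpora
index.add(doc_vectors)

distances, ids = index.search(query, 5)   # top-5 nearest documents
print(ids[0], distances[0])
```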

Posted 1 month ago

Apply

0.0 years

0 Lacs

Hyderabad / Secunderabad, Telangana, India

On-site

Ready to shape the future of work? At Genpact, we don't just adapt to change, we drive it. AI and digital innovation are redefining industries, and we're leading the charge. Genpact's AI Gigafactory, our industry-first accelerator, is an example of how we're scaling advanced technology solutions to help global enterprises work smarter, grow faster, and transform at scale. From large-scale models to agentic AI, our breakthrough solutions tackle companies' most complex challenges. If you thrive in a fast-moving, tech-driven environment, love solving real-world problems, and want to be part of a team that's shaping the future, this is your moment. Genpact (NYSE: G) is an advanced technology services and solutions company that delivers lasting value for leading enterprises globally. Through our deep business knowledge, operational excellence, and cutting-edge solutions - we help companies across industries get ahead and stay ahead. Powered by curiosity, courage, and innovation, our teams implement data, technology, and AI to create tomorrow, today. Get to know us at genpact.com and on LinkedIn, X, YouTube, and Facebook. Inviting applications for the role of Consultant - Databricks Developer with experience in Unity Catalog + Python, Spark, Kafka for ETL! In this role, the Databricks Developer is responsible for solving real-world, cutting-edge problems to meet both functional and non-functional requirements. Responsibilities: Develop and maintain scalable ETL pipelines using Databricks with a focus on Unity Catalog for data asset management. Implement data processing frameworks using Apache Spark for large-scale data transformation and aggregation. Integrate real-time data streams using Apache Kafka and Databricks to enable near real-time data processing. Develop data workflows and orchestrate data pipelines using Databricks Workflows or other orchestration tools. Design and enforce data governance policies, access controls, and security protocols within Unity Catalog. Monitor data pipeline performance, troubleshoot issues, and implement optimizations for scalability and efficiency. Write efficient Python scripts for data extraction, transformation, and loading. Collaborate with data scientists and analysts to deliver data solutions that meet business requirements. Maintain data documentation, including data dictionaries, data lineage, and data governance frameworks. Qualifications we seek in you! Minimum qualifications: Bachelor's degree in Computer Science, Data Engineering, or a related field. Experience in data engineering with a focus on Databricks development. Proven expertise in Databricks, Unity Catalog, and data lake management. Strong programming skills in Python for data processing and automation. Experience with Apache Spark for distributed data processing and optimization. Hands-on experience with Apache Kafka for data streaming and event processing. Proficiency in SQL for data querying and transformation. Strong understanding of data governance, data security, and data quality frameworks. Excellent communication skills and the ability to work in a cross-functional environment. Must have experience in the Data Engineering domain. Must have implemented at least 2 projects end-to-end in Databricks. Must have hands-on experience with Databricks components including Delta Lake, dbConnect, db API 2.0, and Databricks workflows orchestration. Must be well versed with the Databricks Lakehouse concept and its implementation in enterprise environments.
Must have a good understanding of how to create complex data pipelines. Must have good knowledge of data structures & algorithms. Must be strong in SQL and Spark SQL. Must have strong performance optimization skills to improve efficiency and reduce cost. Must have worked on both batch and streaming data pipelines. Must have extensive knowledge of Spark and Hive data processing frameworks. Must have worked on any cloud (Azure, AWS, GCP) and the most common services like ADLS/S3, ADF/Lambda, CosmosDB/DynamoDB, ASB/SQS, and cloud databases. Must be strong in writing unit test cases and integration tests. Must have strong communication skills and have worked in a team of size 5 plus. Must have a great attitude towards learning new skills and upskilling existing skills. Preferred Qualifications: Good to have Unity Catalog and basic governance knowledge. Good to have Databricks SQL Endpoint understanding. Good to have CI/CD experience to build the pipeline for Databricks jobs. Good to have worked on a migration project to build a unified data platform. Good to have knowledge of DBT. Good to have knowledge of Docker and Kubernetes. Why join Genpact? Be a transformation leader - Work at the cutting edge of AI, automation, and digital innovation Make an impact - Drive change for global enterprises and solve business challenges that matter Accelerate your career - Get hands-on experience, mentorship, and continuous learning opportunities Work with the best - Join 140,000+ bold thinkers and problem-solvers who push boundaries every day Thrive in a values-driven culture - Our courage, curiosity, and incisiveness - built on a foundation of integrity and inclusion - allow your ideas to fuel progress. Come join the tech shapers and growth makers at Genpact and take your career in the only direction that matters: Up. Let's build tomorrow together. Genpact is an Equal Opportunity Employer and considers applicants for all positions without regard to race, color, religion or belief, sex, age, national origin, citizenship status, marital status, military/veteran status, genetic information, sexual orientation, gender identity, physical or mental disability or any other characteristic protected by applicable laws. Genpact is committed to creating a dynamic work environment that values respect and integrity, customer focus, and innovation. Furthermore, please do note that Genpact does not charge fees to process job applications and applicants are not required to pay to participate in our hiring process in any other way. Examples of such scams include purchasing a 'starter kit,' paying to apply, or purchasing equipment or training.

Posted 1 month ago

Apply

4.0 - 9.0 years

12 - 22 Lacs

Gurugram

Work from Office

To Apply - Submit Details via Google Form - https://forms.gle/8SUxUV2cikzjvKzD9 As a Senior Consultant in our Consulting team, you'll build and nurture positive working relationships with teams and clients with the intention to exceed client expectations. Seeking experienced AWS Data Engineers to design, implement, and maintain robust data pipelines and analytics solutions using AWS services. The ideal candidate will have a strong background in AWS data services, big data technologies, and programming languages. Role & responsibilities 1. Design and implement scalable, high-performance data pipelines using AWS services 2. Develop and optimize ETL processes using AWS Glue, EMR, and Lambda 3. Build and maintain data lakes using S3 and Delta Lake 4. Create and manage analytics solutions using Amazon Athena and Redshift 5. Design and implement database solutions using Aurora, RDS, and DynamoDB 6. Develop serverless workflows using AWS Step Functions 7. Write efficient and maintainable code using Python/PySpark and SQL/PostgreSQL 8. Ensure data quality, security, and compliance with industry standards 9. Collaborate with data scientists and analysts to support their data needs 10. Optimize data architecture for performance and cost-efficiency 11. Troubleshoot and resolve data pipeline and infrastructure issues Preferred candidate profile 1. Bachelor's degree in Computer Science, Information Technology, or related field 2. Relevant years of experience as a Data Engineer, with at least 60% of experience focusing on AWS 3. Strong proficiency in AWS data services: Glue, EMR, Lambda, Athena, Redshift, S3 4. Experience with data lake technologies, particularly Delta Lake 5. Expertise in database systems: Aurora, RDS, DynamoDB, PostgreSQL 6. Proficiency in Python and PySpark programming 7. Strong SQL skills and experience with PostgreSQL 8. Experience with AWS Step Functions for workflow orchestration Technical Skills: - AWS Services: Glue, EMR, Lambda, Athena, Redshift, S3, Aurora, RDS, DynamoDB, Step Functions - Big Data: Hadoop, Spark, Delta Lake - Programming: Python, PySpark - Databases: SQL, PostgreSQL, NoSQL - Data Warehousing and Analytics - ETL/ELT processes - Data Lake architectures - Version control: Git - Agile methodologies
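
For illustration, a hedged boto3 sketch of submitting an Athena query over an S3 data lake; the database, table, and bucket names are placeholders.

```python
# Hedged sketch: run an Athena query against S3-backed tables with boto3.
import boto3

athena = boto3.client("athena", region_name="ap-south-1")  # placeholder region

resp = athena.start_query_execution(
    QueryString="SELECT order_date, SUM(amount) AS total FROM sales GROUP BY order_date",
    QueryExecutionContext={"Database": "analytics"},                         # placeholder database
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/queries/"},  # placeholder bucket
)
print("Query execution id:", resp["QueryExecutionId"])
```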

Posted 1 month ago

Apply

4.0 - 9.0 years

12 - 22 Lacs

Gurugram, Bengaluru

Work from Office

To Apply - Submit Details via Google Form - https://forms.gle/8SUxUV2cikzjvKzD9 As a Senior Consultant in our Consulting team, you'll build and nurture positive working relationships with teams and clients with the intention to exceed client expectations. Seeking experienced AWS Data Engineers to design, implement, and maintain robust data pipelines and analytics solutions using AWS services. The ideal candidate will have a strong background in AWS data services, big data technologies, and programming languages. Role & responsibilities 1. Design and implement scalable, high-performance data pipelines using AWS services 2. Develop and optimize ETL processes using AWS Glue, EMR, and Lambda 3. Build and maintain data lakes using S3 and Delta Lake 4. Create and manage analytics solutions using Amazon Athena and Redshift 5. Design and implement database solutions using Aurora, RDS, and DynamoDB 6. Develop serverless workflows using AWS Step Functions 7. Write efficient and maintainable code using Python/PySpark and SQL/PostgreSQL 8. Ensure data quality, security, and compliance with industry standards 9. Collaborate with data scientists and analysts to support their data needs 10. Optimize data architecture for performance and cost-efficiency 11. Troubleshoot and resolve data pipeline and infrastructure issues Preferred candidate profile 1. Bachelor's degree in Computer Science, Information Technology, or related field 2. Relevant years of experience as a Data Engineer, with at least 60% of experience focusing on AWS 3. Strong proficiency in AWS data services: Glue, EMR, Lambda, Athena, Redshift, S3 4. Experience with data lake technologies, particularly Delta Lake 5. Expertise in database systems: Aurora, RDS, DynamoDB, PostgreSQL 6. Proficiency in Python and PySpark programming 7. Strong SQL skills and experience with PostgreSQL 8. Experience with AWS Step Functions for workflow orchestration Technical Skills: - AWS Services: Glue, EMR, Lambda, Athena, Redshift, S3, Aurora, RDS, DynamoDB, Step Functions - Big Data: Hadoop, Spark, Delta Lake - Programming: Python, PySpark - Databases: SQL, PostgreSQL, NoSQL - Data Warehousing and Analytics - ETL/ELT processes - Data Lake architectures - Version control: Git - Agile methodologies

Posted 1 month ago

Apply

7.0 - 12.0 years

15 - 22 Lacs

Bengaluru

Hybrid

Job Summary: We are seeking a talented Data Engineer with strong expertise in Databricks, specifically in Unity Catalog, PySpark, and SQL, to join our data team. You'll play a key role in building secure, scalable data pipelines and implementing robust data governance strategies using Unity Catalog. Key Responsibilities: Design and implement ETL/ELT pipelines using Databricks and PySpark. Work with Unity Catalog to manage data governance, access controls, lineage, and auditing across data assets. Develop high-performance SQL queries and optimize Spark jobs. Collaborate with data scientists, analysts, and business stakeholders to understand data needs. Ensure data quality and compliance across all stages of the data lifecycle. Implement best practices for data security and lineage within the Databricks ecosystem. Participate in CI/CD, version control, and testing practices for data pipelines. Required Skills: Proven experience with Databricks and Unity Catalog (data permissions, lineage, audits). Strong hands-on skills with PySpark and Spark SQL. Solid experience writing and optimizing complex SQL queries. Familiarity with Delta Lake, data lakehouse architecture, and data partitioning. Experience with cloud platforms like Azure or AWS. Understanding of data governance, RBAC, and data security standards. Preferred Qualifications: Databricks Certified Data Engineer Associate or Professional. Experience with tools like Airflow, Git, Azure Data Factory, or dbt. Exposure to streaming data and real-time processing. Knowledge of DevOps practices for data engineering.
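
For illustration, a few Unity Catalog governance statements of the kind this role would issue from a notebook; the catalog, schema, table, and group names are placeholders.

```python
# Hedged sketch: Unity Catalog access control and inspection from a Databricks notebook.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Grant a group read access down the catalog -> schema -> table hierarchy
spark.sql("GRANT USE CATALOG ON CATALOG main TO `data_analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA main.silver TO `data_analysts`")
spark.sql("GRANT SELECT ON TABLE main.silver.orders TO `data_analysts`")

# Inspect current grants and table metadata for auditing
spark.sql("SHOW GRANTS ON TABLE main.silver.orders").show(truncate=False)
spark.sql("DESCRIBE TABLE EXTENDED main.silver.orders").show(truncate=False)
```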

Posted 1 month ago

Apply

0.0 years

0 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Ready to shape the future of work? At Genpact, we don't just adapt to change, we drive it. AI and digital innovation are redefining industries, and we're leading the charge. Genpact's AI Gigafactory, our industry-first accelerator, is an example of how we're scaling advanced technology solutions to help global enterprises work smarter, grow faster, and transform at scale. From large-scale models to agentic AI, our breakthrough solutions tackle companies' most complex challenges. If you thrive in a fast-moving, tech-driven environment, love solving real-world problems, and want to be part of a team that's shaping the future, this is your moment. Genpact (NYSE: G) is an advanced technology services and solutions company that delivers lasting value for leading enterprises globally. Through our deep business knowledge, operational excellence, and cutting-edge solutions - we help companies across industries get ahead and stay ahead. Powered by curiosity, courage, and innovation, our teams implement data, technology, and AI to create tomorrow, today. Get to know us at genpact.com and on LinkedIn, X, YouTube, and Facebook. Inviting applications for the role of Senior Principal Consultant - Senior Data Engineer - Databricks, Azure & Mosaic AI. Role Summary: We are seeking a Senior Data Engineer with extensive expertise in Data & Analytics platform modernization using Databricks, Azure, and Mosaic AI. This role will focus on designing and optimizing cloud-based data architectures, leveraging AI-driven automation to enhance data pipelines, governance, and processing at scale. Key Responsibilities: Architect & modernize Data & Analytics platforms using Databricks on Azure. Design and optimize Lakehouse architectures integrating Azure Data Lake, Databricks Delta Lake, and Synapse Analytics. Implement Mosaic AI for AI-driven automation, predictive analytics, and intelligent data engineering solutions. Lead the migration of legacy data platforms to a modern cloud-native Data & AI ecosystem. Develop high-performance ETL pipelines, integrating Databricks with Azure services such as Data Factory, Synapse, and Purview. Utilize MLflow & Mosaic AI for AI-enhanced data processing and decision-making. Establish data governance, security, lineage tracking, and metadata management across modern data platforms. Work collaboratively with business leaders, data scientists, and engineers to drive innovation. Stay at the forefront of emerging trends in AI-powered data engineering and modernization strategies. Qualifications we seek in you! Minimum Qualifications: Experience in Data Engineering, Cloud Platforms, and AI-driven automation. Expertise in Databricks (Apache Spark, Delta Lake, MLflow) and Azure (Data Lake, Synapse, ADF, Purview). Strong experience with Mosaic AI for AI-powered data engineering and automation. Advanced proficiency in SQL, Python, and Scala for big data processing. Experience in modernizing Data & Analytics platforms, migrating from on-prem to cloud. Knowledge of Data Lineage, Observability, and AI-driven Data Governance frameworks. Familiarity with Vector Databases & Retrieval-Augmented Generation (RAG) architectures for AI-powered data analytics. Strong leadership, problem-solving, and stakeholder management skills. Preferred Skills: Experience with Knowledge Graphs (Neo4J, TigerGraph) for data structuring. Exposure to Kubernetes, Terraform, and CI/CD for scalable cloud deployments.
Background in streaming technologies (Kafka, Spark Streaming, Kinesis). Why join Genpact? Be a transformation leader - Work at the cutting edge of AI, automation, and digital innovation Make an impact - Drive change for global enterprises and solve business challenges that matter Accelerate your career - Get hands-on experience, mentorship, and continuous learning opportunities Work with the best - Join 140,000+ bold thinkers and problem-solvers who push boundaries every day Thrive in a values-driven culture - Our courage, curiosity, and incisiveness - built on a foundation of integrity and inclusion - allow your ideas to fuel progress. Come join the tech shapers and growth makers at Genpact and take your career in the only direction that matters: Up. Let's build tomorrow together. Genpact is an Equal Opportunity Employer and considers applicants for all positions without regard to race, color, religion or belief, sex, age, national origin, citizenship status, marital status, military/veteran status, genetic information, sexual orientation, gender identity, physical or mental disability or any other characteristic protected by applicable laws. Genpact is committed to creating a dynamic work environment that values respect and integrity, customer focus, and innovation. Furthermore, please do note that Genpact does not charge fees to process job applications and applicants are not required to pay to participate in our hiring process in any other way. Examples of such scams include purchasing a 'starter kit,' paying to apply, or purchasing equipment or training.

Posted 1 month ago

Apply

5.0 - 10.0 years

10 - 12 Lacs

Chennai

Work from Office

Databricks developer with deep SQL expertise to support the development of scalable data pipelines and analytics workflows; will work closely with data engineers and BI analysts to prepare clean, query-optimized datasets for reporting and modeling.
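
For illustration, a short sketch of preparing a query-optimized Delta table for BI consumption; the table and column names are assumptions.

```python
# Hedged sketch: compact and cluster a Delta table so downstream BI queries scan less data.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compact small files and co-locate rows on common filter columns
spark.sql("OPTIMIZE analytics.sales_curated ZORDER BY (region, order_date)")

# Collect column statistics that help the optimizer and reporting queries
spark.sql("ANALYZE TABLE analytics.sales_curated COMPUTE STATISTICS FOR ALL COLUMNS")
```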

Posted 1 month ago

Apply

3.0 - 6.0 years

7 - 12 Lacs

Pune

Work from Office

We are seeking an experienced Databricks Developer with expertise in Delta Live Tables to join our team. The ideal candidate will possess a strong background in designing, developing, and maintaining data pipelines using Databricks and Delta Lake technologies. Key Responsibilities: Develop, implement, and optimize data pipelines and workflows using Databricks platform. Design and manage Delta Live Tables for real-time, streaming and batch data processing solutions. Monitor and troubleshoot data pipelines to ensure high performance, reliability, and data quality. Qualifications: 3 to 6 years of hands-on experience in Databricks platform development. Proven expertise in Delta Lake and Delta Live Tables. Strong SQL and Python/Scala programming skills. Experience with cloud platforms such as Azure, AWS, or GCP (preferably Azure). Familiarity with data modeling and data warehousing concepts.
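
For illustration, a minimal Delta Live Tables sketch (it runs only inside a DLT pipeline, where `spark` is predefined); the source path and expectation rule are assumptions.

```python
# Hedged sketch of a small Delta Live Tables pipeline: streaming bronze ingest plus a
# quality-checked silver table. Runs only when deployed as a DLT pipeline.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw clickstream events ingested as-is")
def events_bronze():
    return (
        spark.readStream.format("cloudFiles")          # Auto Loader source
        .option("cloudFiles.format", "json")
        .load("/mnt/landing/clickstream/")             # placeholder landing path
    )

@dlt.table(comment="Cleaned events with a basic quality check")
@dlt.expect_or_drop("valid_user", "user_id IS NOT NULL")
def events_silver():
    return (
        dlt.read_stream("events_bronze")
        .withColumn("event_ts", F.to_timestamp("event_ts"))
    )
```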

Posted 1 month ago

Apply

3.0 - 5.0 years

5 - 8 Lacs

Hyderabad

Work from Office

Understanding of Spark core concepts like RDDs, DataFrames, DataSets, Spark SQL and Spark Streaming. Experience with Spark optimization techniques. Deep knowledge of Delta Lake features like time travel, schema evolution, and data partitioning. Ability to design and implement data pipelines using Spark and Delta Lake as the data storage layer. Proficiency in Python/Scala/Java for Spark development and integration with ETL processes. Knowledge of data ingestion techniques from various sources (flat files, CSV, APIs, databases). Understanding of data quality best practices and data validation techniques. Other Skills: Understanding of data warehouse concepts, data modelling techniques. Expertise in Git for code management. Familiarity with CI/CD pipelines and containerization technologies. Nice to have experience using data integration tools like DataStage/Prophecy/Informatica/Ab Initio.
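
For illustration, a small PySpark sketch touching the Delta Lake features named above: partitioned writes, schema evolution, and time travel; the path and columns are placeholders.

```python
# Hedged sketch of partitioned Delta writes, schema evolution, and time travel.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
path = "/mnt/lake/silver/customers"   # placeholder path

df = spark.range(100).withColumn("country", F.lit("IN"))

# Partitioned Delta write (version 0 of the table)
df.write.format("delta").mode("overwrite").partitionBy("country").save(path)

# Schema evolution: append a frame that carries an extra column
df2 = df.withColumn("segment", F.lit("retail"))
df2.write.format("delta").mode("append").option("mergeSchema", "true").save(path)

# Time travel: read the table as of the earlier version
v0 = spark.read.format("delta").option("versionAsOf", 0).load(path)
print(v0.count(), "rows at version 0")
```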

Posted 1 month ago

Apply

4.0 - 8.0 years

6 - 10 Lacs

Pune

Work from Office

Roles and Responsibilities: The Senior Tech Lead - Databricks leads the design, development, and implementation of advanced data solutions. Must have extensive experience in Databricks, cloud platforms, and data engineering, with a proven ability to lead teams and deliver complex projects. Responsibilities: Lead the design and implementation of Databricks-based data solutions. Architect and optimize data pipelines for batch and streaming data. Provide technical leadership and mentorship to a team of data engineers. Collaborate with stakeholders to define project requirements and deliverables. Ensure best practices in data security, governance, and compliance. Troubleshoot and resolve complex technical issues in Databricks environments. Stay updated on the latest Databricks features and industry trends. Key Technical Skills & Responsibilities: Experience in data engineering using Databricks or Apache Spark-based platforms. Proven track record of building and optimizing ETL/ELT pipelines for batch and streaming data ingestion. Hands-on experience with Azure services such as Azure Data Factory, Azure Data Lake Storage, Azure Databricks, Azure Synapse Analytics, or Azure SQL Data Warehouse. Proficiency in programming languages such as Python, Scala, and SQL for data processing and transformation. Expertise in Spark (PySpark, Spark SQL, or Scala) and Databricks notebooks for large-scale data processing. Familiarity with Delta Lake, Delta Live Tables, and medallion architecture for data lakehouse implementations. Experience with orchestration tools like Azure Data Factory or Databricks Jobs for scheduling and automation. Design and implement Azure Key Vault and scoped credentials. Knowledge of Git for source control and CI/CD integration for Databricks workflows, cost optimization, and performance tuning. Familiarity with Unity Catalog, RBAC, or enterprise-level Databricks setups. Ability to create reusable components, templates, and documentation to standardize data engineering workflows is a plus. Ability to define best practices, support multiple projects, and sometimes mentor junior engineers is a plus. Must have experience working with streaming data sources and Kafka (preferred). Eligibility Criteria: Bachelor's degree in Computer Science, Data Engineering, or a related field Extensive experience with Databricks, Delta Lake, PySpark, and SQL Databricks certification (e.g., Certified Data Engineer Professional) Experience with machine learning and AI integration in Databricks Strong understanding of cloud platforms (AWS, Azure, or GCP) Proven leadership experience in managing technical teams Excellent problem-solving and communication skills

Posted 1 month ago

Apply

5.0 - 8.0 years

5 - 8 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Senior data management/integration engineer. End-to-end data management and engineering experience (10+ years). Very strong Python programming and PySpark development experience in a large-scale data ecosystem. Solid foundation in SQL data management. Familiarity with data lake/Delta Lake architectures. Familiarity with requirements analysis and experience interacting and engaging with business teams. Retail industry experience (Fortune 1000) around supply chain and operating with large data sets is a must-have.

Posted 2 months ago

Apply

10.0 - 14.0 years

10 - 14 Lacs

Hyderabad / Secunderabad, Telangana, India

On-site

What you will do In this vital role you will be responsible for designing, developing, and maintaining software solutions for Research scientists. Additionally, it involves automating operations, monitoring system health, and responding to incidents to minimize downtime. You will join a multi-functional team of scientists and software professionals that enables technology and data capabilities to evaluate drug candidates and assess their abilities to affect the biology of drug targets. This team implements scientific software platforms that enable the capture, analysis, storage, and reporting of in vitro assays and in vivo / pre-clinical studies as well as those that manage compound inventories / biological sample banks. The ideal candidate possesses experience in the pharmaceutical or biotech industry, strong technical skills, and full stack software engineering experience (spanning SQL, back-end, front-end web technologies, automated testing). Roles & Responsibilities: Design, develop, and implement applications and modules, including custom reports, interfaces, and enhancements Analyze and understand the functional and technical requirements of applications, solutions and systems and translate them into software architecture and design specifications Develop and implement unit tests, integration tests, and other testing strategies to ensure the quality of the software Identify and resolve software bugs and performance issues Work closely with multi-functional teams, including product management, design, and QA, to deliver high-quality software on time Maintain detailed documentation of software designs, code, and development processes Customize modules to meet specific business requirements Work on integrating with other systems and platforms to ensure seamless data flow and functionality Provide ongoing support and maintenance for applications, ensuring that they operate smoothly and efficiently Possesses strong rapid prototyping skills and can quickly translate concepts into working code Contribute to both front-end and back-end development using cloud technology Develop innovative solutions using generative AI technologies Create and maintain documentation on software architecture, design, deployment, disaster recovery, and operations Identify and resolve technical challenges effectively Stay updated with the latest trends and advancements Work closely with the product team, business team including scientists, and other collaborators Roles & Responsibilities: Project & Portfolio Delivery Lead the execution of initiatives across the data platforms portfolio, ensuring projects are delivered on time, within scope, and to expected quality standards. Coordinate cross-functional teams (business, engineering, architecture, operations, governance) to deliver tools, technologies and platforms. Lead initiatives for evaluating the latest market technologies in the areas of Data Engineering, Management & Governance. Financial Management Own and manage project and portfolio budgets, including tracking actuals vs forecasts, accruals, and reporting on financial performance to stakeholders. Partner with Finance, Procurement, and Vendor Management teams to support contract reviews and platform costs. Proactively monitor financial risks and ensure alignment of project spend with approved business cases and funding models. Prepare financial summaries and variance reports for leadership and program steering committees.
Planning & Governance:
- Maintain integrated plans and roadmaps across projects within the data platforms portfolio
- Run governance forums, manage stakeholder expectations, and ensure project artifacts, status reports, and RAID logs are consistently maintained

Stakeholder & Communication Management:
- Serve as the central point of contact between technical teams, business stakeholders, and vendors
- Lead project steering committee meetings and provide clear, concise updates to senior leadership

Agile & Hybrid Delivery:
- Apply Agile, SAFe, or hybrid delivery methods based on project needs; support backlog grooming, sprint planning, and release planning
- Promote continuous improvement in delivery through retrospectives and feedback loops

Must-Have Skills:
- Demonstrated experience managing project financials (budgeting, forecasting, variance analysis, cost optimization)
- Experience working in large, complex enterprise environments with cross-functional stakeholders
- Familiarity with modern data platforms such as Azure Data Lake, Databricks, Snowflake, Synapse, Kafka, Delta Lake, etc.
- Strong understanding of the data management lifecycle, data architecture, and platform components (ingestion, processing, governance, access)
- Excellent interpersonal, presentation, and negotiation skills
- PMP, PMI-ACP, SAFe, or equivalent certifications are a plus

Basic Qualifications and Experience:
- Master's degree with 8-10+ years of experience in Business, Engineering, IT, or a related field; OR
- Bachelor's degree with 10-14+ years of experience in Business, Engineering, IT, or a related field; OR
- Diploma with 14+ years of experience in Business, Engineering, IT, or a related field

Good-to-Have Skills:
- Strong understanding of cloud infrastructure, data and analytics tools such as Databricks, Informatica, Power BI, and Tableau, and data governance technologies
- Experience with cloud (e.g., AWS) and on-premises compute infrastructure
- Experience with the Databricks platform

Professional Certifications:
- Project Management certifications
- Agile Certified Practitioner (preferred)
- AWS certification

Soft Skills:
- Excellent interpersonal, presentation, and negotiation skills
- Strong analytical abilities to assess and improve data processes and solutions
- Excellent verbal and written communication skills, with the ability to convey complex data concepts clearly to technical and non-technical stakeholders
- Effective problem-solving skills to address data-related issues and implement scalable solutions
- Ability to work effectively with global, virtual teams

Posted 2 months ago

Apply

7.0 - 12.0 years

20 - 35 Lacs

Mumbai

Work from Office

Job Summary: We are looking for a highly skilled Azure Data Engineer with a strong background in real-time and batch data ingestion and big data processing, particularly using Kafka and Databricks. The ideal candidate will have a deep understanding of streaming architectures, Medallion data models, and performance optimization techniques in cloud environments. This role requires hands-on technical expertise, including live coding during the interview process.

Key Responsibilities:
- Design and implement streaming data pipelines integrating Kafka with Databricks using Structured Streaming (a minimal sketch of this pattern follows this listing)
- Architect and maintain a Medallion Architecture with well-defined Bronze, Silver, and Gold layers
- Implement efficient ingestion using Databricks Autoloader for high-throughput data loads
- Work with large volumes of structured and unstructured data, ensuring high availability and performance
- Apply performance tuning techniques such as partitioning, caching, and cluster resource optimization
- Collaborate with cross-functional teams (data scientists, analysts, business users) to build robust data solutions
- Establish best practices for code versioning, deployment automation, and data governance

Required Technical Skills:
- Strong expertise in Azure Databricks and Spark Structured Streaming: processing modes (append, update, complete), output modes (append, complete, update), checkpointing and state management
- Experience with Kafka integration for real-time data pipelines
- Deep understanding of the Medallion Architecture
- Proficiency with Databricks Autoloader and schema evolution
- Deep understanding of Unity Catalog and foreign catalogs
- Strong knowledge of Spark SQL, Delta Lake, and DataFrames
- Expertise in performance tuning (query optimization, cluster configuration, caching strategies)
- Solid data management strategies, including governance and access management
- Strong data modelling and data warehousing concepts, and Databricks as a platform
- Solid understanding of window functions
- Proven experience with merge/upsert logic, implementing SCD Type 1 and Type 2, and handling CDC (Change Data Capture) scenarios
- Industry expertise in at least one of Retail, Telecom, or Energy
- Real-time use case execution
- Data modelling

Location: Mumbai
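Illustrative example (not part of the listing): a minimal PySpark sketch of the Kafka-to-Bronze ingestion pattern described above, using Spark Structured Streaming with checkpointing into a Delta table. The broker address, topic name, and storage paths are hypothetical placeholders; a production pipeline would add schema handling, Unity Catalog governance, and monitoring.

# Hedged sketch: Kafka -> Bronze Delta ingestion with Spark Structured Streaming.
# Broker, topic, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, current_timestamp

spark = SparkSession.builder.appName("kafka_bronze_ingest").getOrCreate()

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                      # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Keep the raw payload plus ingestion metadata in the Bronze layer.
bronze = raw.select(
    col("key").cast("string").alias("event_key"),
    col("value").cast("string").alias("payload"),
    col("topic"),
    col("timestamp").alias("event_ts"),
    current_timestamp().alias("ingested_at"),
)

query = (
    bronze.writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "/mnt/checkpoints/events_bronze")  # placeholder path
    .start("/mnt/delta/bronze/events")                               # placeholder path
)
query.awaitTermination()

The checkpoint location is what makes the stream restartable and, combined with the Delta sink, supports exactly-once processing; Silver and Gold layers would typically be built as further streaming or batch jobs reading from this Bronze table.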

Posted 2 months ago

Apply

10.0 - 12.0 years

0 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Req ID: 323226. NTT DATA strives to hire exceptional, innovative and passionate individuals who want to grow with us. If you want to be part of an inclusive, adaptable, and forward-thinking organization, apply now. We are currently seeking a Digital Solution Architect Sr. Advisor to join our team in Bengaluru, Karnataka (IN-KA), India (IN).

Key Responsibilities:
- Design data platform architectures (data lakes, lakehouses, DWH) using modern cloud-native tools (e.g., Databricks, Snowflake, BigQuery, Synapse, Redshift)
- Architect data ingestion, transformation, and consumption pipelines using batch and streaming methods
- Enable real-time analytics and machine learning through scalable and modular data frameworks
- Define data governance models, metadata management, lineage tracking, and access controls
- Collaborate with AI/ML, application, and business teams to identify high-impact use cases and optimize data usage
- Lead modernization initiatives from legacy data warehouses to cloud-native and distributed architectures
- Enforce data quality and observability practices for mission-critical workloads

Required Skills:
- 10+ years in data architecture, with strong grounding in modern data platforms and pipelines
- Deep knowledge of SQL/NoSQL, Spark, Delta Lake, Kafka, and ETL/ELT frameworks (one common Delta Lake upsert pattern is sketched after this listing)
- Hands-on experience with cloud data platforms (AWS, Azure, GCP)
- Understanding of data privacy, security, lineage, and compliance (GDPR, HIPAA, etc.)
- Experience implementing data mesh/data fabric concepts is a plus
- Expertise in writing and presenting technical solutions using tools such as Word, PowerPoint, Excel, Visio, etc.
- High level of executive presence, with the ability to articulate solutions to CXO-level executives

Preferred Qualifications:
- Certifications in Snowflake, Databricks, or cloud-native data platforms
- Exposure to AI/ML data pipelines, MLOps, and real-time data applications
- Familiarity with data visualization and BI tools (Power BI, Tableau, Looker, etc.)

About NTT DATA: NTT DATA is a $30 billion trusted global innovator of business and technology services. We serve 75% of the Fortune Global 100 and are committed to helping clients innovate, optimize and transform for long-term success. As a Global Top Employer, we have diverse experts in more than 50 countries and a robust partner ecosystem of established and start-up companies. Our services include business and technology consulting, data and artificial intelligence, industry solutions, as well as the development, implementation and management of applications, infrastructure and connectivity. We are one of the leading providers of digital and AI infrastructure in the world. NTT DATA is a part of NTT Group, which invests over $3.6 billion each year in R&D to help organizations and society move confidently and sustainably into the digital future. Visit us at NTT DATA endeavors to make accessible to any and all users. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process, please contact us at . This contact information is for accommodation requests only and cannot be used to inquire about the status of applications. NTT DATA is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status. For our EEO Policy Statement, please click .
If you'd like more information on your EEO rights under the law, please click . For Pay Transparency information, please click.
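Illustrative example (not part of the listing): a hedged PySpark sketch of a common Delta Lake ELT pattern this role would oversee, a batch upsert (MERGE) from a Bronze source into a Silver table. Table paths and the join key are hypothetical placeholders, and a real design would also cover schema evolution, SCD handling, and access controls.

# Hedged sketch: upsert (merge) a staging DataFrame into a Silver Delta table.
# Paths and the customer_id key are hypothetical placeholders.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("silver_upsert").getOrCreate()

updates = spark.read.format("delta").load("/mnt/delta/bronze/customers")  # placeholder source

target = DeltaTable.forPath(spark, "/mnt/delta/silver/customers")         # placeholder target

(
    target.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)

The same MERGE primitive is the usual building block for CDC apply steps and SCD Type 1 updates; SCD Type 2 adds effective-date columns and closes out superseded rows instead of overwriting them.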

Posted 2 months ago

Apply

3.0 - 6.0 years

1 - 6 Lacs

Gurugram

Work from Office

Role & responsibilities:
- Design, develop, and maintain scalable Python applications for data processing and analytics
- Build and manage ETL pipelines using Databricks on Azure/AWS cloud platforms (a small illustrative sketch follows below)
- Collaborate with analysts and other developers to understand business requirements and implement data-driven solutions
- Optimize and monitor existing data workflows to improve performance and scalability
- Write clean, maintainable, and testable code following industry best practices
- Participate in code reviews and provide constructive feedback
- Maintain documentation and contribute to project planning and reporting

Skills & Experience:
- Bachelor's degree in Computer Science, Engineering, or a related field
- Prior experience as a Python Developer or in a similar role, with a strong portfolio showcasing past projects
- 2-5 years of Python experience
- Strong proficiency in Python programming
- Hands-on experience with the Databricks platform (Notebooks, Delta Lake, Spark jobs, cluster configuration, etc.)
- Good knowledge of Apache Spark and its Python API (PySpark)
- Experience with cloud platforms (preferably Azure or AWS) and working with Databricks on cloud
- Familiarity with data pipeline orchestration tools (e.g., Airflow, Azure Data Factory, etc.)
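Illustrative example (not part of the listing): a small, hedged PySpark sketch of the kind of batch ETL step described above, reading raw CSV files, applying light cleanup, and writing a Delta table on Databricks. The input path, column names, and output path are hypothetical placeholders.

# Hedged sketch: batch CSV -> Delta cleanup step.
# Input path, columns, and output path are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date, trim

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

orders = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("/mnt/raw/orders/")  # placeholder input path
)

cleaned = (
    orders
    .withColumn("order_date", to_date(col("order_date")))
    .withColumn("customer_name", trim(col("customer_name")))
    .dropDuplicates(["order_id"])
    .filter(col("order_id").isNotNull())
)

# Persist the cleaned data as a Delta table for downstream analytics.
(
    cleaned.write
    .format("delta")
    .mode("overwrite")
    .save("/mnt/delta/orders_clean")  # placeholder output path
)

In practice a job like this would be scheduled through Databricks Workflows, Airflow, or Azure Data Factory rather than run ad hoc.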

Posted 2 months ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

