
190 Delta Lake Jobs

Set up a job alert
JobPe aggregates listings for easy access, but you apply directly on the original job portal.

2.0 - 6.0 years

0 Lacs

Maharashtra

On-site

As a dedicated professional at Leighton Asia, you will be responsible for spearheading the development of end-to-end Azure integration solutions in alignment with CIMIC's OneIT Target Reference Architecture. Your key responsibilities will include ensuring compliance with CIMIC's Security Standards, constructing data and integration solutions that adhere to CIMIC's Target Reference Architecture, and hands-on development and unit testing of the solutions. You will play a crucial role in following CIMIC's path to production environments, providing technical support during test phases, and conducting handover sessions to transition solutions to BAU Support. Additionally, you will deploy solutions to PROD through CIMIC's CAB process and offer post go-live hypercare support. Your expertise will be instrumental in preparing technical specification documents, obtaining code review approvals, and ensuring compliance with standards. Your work will focus on the development and maintenance of CIMIC's data and integration solutions while aligning with CIMIC's Target Reference Architecture, strategic roadmap, and emerging technologies.

To excel in this role, you should hold a degree in ICT, Engineering, or Science in a relevant discipline and possess at least 2-3 years of experience in a similar capacity. Your qualifications should include data and integration solution design skills; familiarity with Azure projects covering data ingestion, storage, transformation/enrichment, data lake solutions, point-to-point integrations, and real-time and batch integrations; and an understanding of security compliance requirements. A solid grasp of Azure standards applicable to data and integration solutions, experience collaborating with onshore and offshore teams, and a commitment to architectural design and solutions are essential. Moreover, your technical proficiency should encompass two of the three pillars of technical skills outlined below.

Tech Pillar #1 Ingestion:
- API Manager, REST APIs, Web API
- Azure Logic Apps, Function Apps (C# - no, Python - 5 years)
- Azure Data Factory and Integration Runtimes
- Azure Storage, Delta Lake
- Creation of Azure GitHub repos, build and release pipelines
- Azure Key Vault
- Service Principals, security, and Azure AD groups/Entra ID

Tech Pillar #2 Process/Enrichment:
- Azure Databricks
- Azure SQL and SQL MI (with SQL skills)
- Azure Data Factory (including orchestration of Databricks notebooks)
- Azure Storage, Delta Lake
- Creation of Azure GitHub repos, build and release pipelines
- Azure Key Vault
- Service Principals, security, and Azure AD groups/Entra ID

By joining our team, you will play a vital role in delivering innovative solutions that drive the success of our projects while contributing to the growth and development of CIMIC Group.
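For context on the kind of hands-on work described above, here is a minimal sketch of an ingestion notebook that an Azure Data Factory pipeline might orchestrate in Databricks, landing a raw extract as an append-only bronze Delta table on ADLS. The storage account, container, and paths are illustrative placeholders, not details from the posting.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # already provided in a Databricks notebook

# Hypothetical locations; real paths would come from pipeline parameters or Key Vault-backed config.
raw_path = "abfss://landing@examplestorage.dfs.core.windows.net/erp/invoices/"
bronze_path = "abfss://lake@examplestorage.dfs.core.windows.net/bronze/erp/invoices"

raw_df = (
    spark.read.option("multiLine", True).json(raw_path)    # raw JSON dropped by the ingestion job
    .withColumn("_ingested_at", F.current_timestamp())     # audit columns for traceability
    .withColumn("_source_file", F.input_file_name())
)

# Bronze is append-only and kept as close to the source shape as possible.
raw_df.write.format("delta").mode("append").save(bronze_path)
```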

Posted 13 hours ago

Apply

10.0 - 14.0 years

0 Lacs

Dehradun, Uttarakhand

On-site

As a Data Modeler, your primary responsibility will be to design and develop conceptual, logical, and physical data models supporting enterprise data initiatives. You will work with modern storage formats like Parquet and ORC, and build and optimize data models within Databricks Unity Catalog. Collaborating with data engineers, architects, analysts, and stakeholders, you will ensure alignment with ingestion pipelines and business goals. Translating business and reporting requirements into robust data architecture, you will follow best practices in data warehousing and Lakehouse design. Your role will involve maintaining metadata artifacts, enforcing data governance, quality, and security protocols, and continuously improving modeling processes.

You should have over 10 years of hands-on experience in data modeling within Big Data environments. Your expertise should include OLTP, OLAP, dimensional modeling, and enterprise data warehouse practices. Proficiency in modeling methodologies like Kimball, Inmon, and Data Vault is essential. Hands-on experience with modeling tools such as ER/Studio, ERwin, PowerDesigner, SQLDBM, dbt, or Lucidchart is preferred. Experience in Databricks with Unity Catalog and Delta Lake is required, along with a strong command of SQL and Apache Spark for querying and transformation. Familiarity with the Azure Data Platform, including Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, and Azure SQL Database, is beneficial. Exposure to Azure Purview or similar data cataloging tools is a plus.

Strong communication and documentation skills are necessary for this role, as well as the ability to work in cross-functional agile environments. A Bachelor's or Master's degree in Computer Science, Information Systems, Data Engineering, or a related field is required. Certifications such as Microsoft DP-203: Data Engineering on Microsoft Azure are a plus. Experience working in agile/scrum environments and exposure to enterprise data security and regulatory compliance frameworks like GDPR and HIPAA are advantageous.
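As a rough illustration of the modeling work described above, the sketch below registers a Type 2 customer dimension as a Delta table under a Unity Catalog three-level namespace. The catalog, schema, and column names are hypothetical, not taken from the posting.

```python
# Assumes a Databricks session with Unity Catalog enabled; `spark` is the active SparkSession.
spark.sql("""
    CREATE TABLE IF NOT EXISTS analytics.sales.dim_customer (
        customer_sk     BIGINT GENERATED ALWAYS AS IDENTITY,  -- surrogate key
        customer_id     STRING NOT NULL,                      -- business/natural key
        customer_name   STRING,
        segment         STRING,
        effective_from  DATE,
        effective_to    DATE,
        is_current      BOOLEAN
    )
    USING DELTA
    COMMENT 'Type 2 customer dimension (gold layer)'
""")
```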

Posted 13 hours ago

Apply

4.0 - 8.0 years

0 Lacs

Pune, Maharashtra

On-site

As a Big Data Architect specializing in Databricks at Codvo, a global empathy-led technology services company, your role is critical in designing sophisticated data solutions that drive business value for enterprise clients and power internal AI products. Your expertise will be instrumental in architecting scalable, high-performance data lakehouse platforms and end-to-end data pipelines, making you the go-to expert for modern data architecture in a cloud-first world.

Your key responsibilities will include designing and documenting robust, end-to-end big data solutions on cloud platforms (AWS, Azure, GCP) with a focus on the Databricks Lakehouse Platform. You will provide technical guidance and oversight to data engineering teams on best practices for data ingestion, transformation, and processing using Spark. Additionally, you will design and implement effective data models and establish data governance policies for data quality, security, and compliance within the lakehouse. Evaluating and recommending appropriate data technologies, tools, and frameworks to meet project requirements, and collaborating closely with stakeholders to translate complex business requirements into tangible technical architecture, will also be part of your role. Leading and building Proofs of Concept (PoCs) to validate architectural approaches and new technologies in the big data and AI space will be crucial.

To excel in this role, you should have 10+ years of experience in data engineering, data warehousing, or software engineering, with at least 4+ years in a dedicated Data Architect role. Deep, hands-on expertise with Apache Spark and the Databricks platform is mandatory, including Delta Lake, Unity Catalog, and Structured Streaming. Proven experience architecting and deploying data solutions on major cloud providers, proficiency in Python or Scala, expert-level SQL skills, a strong understanding of modern AI concepts, and in-depth knowledge of data warehousing concepts and modern Lakehouse patterns are essential.

This position is remote and based in India, with working hours from 2:30 PM to 11:30 PM. Join us at Codvo and be part of a team that practices product innovation, mature software engineering, and its core values of Respect, Fairness, Growth, Agility, and Inclusiveness every day to offer expertise, outside-the-box thinking, and measurable results.
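Since the role calls out Structured Streaming on the Databricks Lakehouse, here is a minimal, hedged sketch of a streaming ingestion job that reads events from Kafka and appends them to a bronze Delta table. Broker address, topic, and paths are placeholders for illustration only.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")   # hypothetical broker
    .option("subscribe", "orders")                        # hypothetical topic
    .option("startingOffsets", "latest")
    .load()
    .selectExpr("CAST(key AS STRING) AS key",
                "CAST(value AS STRING) AS payload",
                "timestamp")
)

# Checkpointing gives the stream reliable, incremental delivery into Delta.
(events.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/lake/_checkpoints/orders_bronze")
    .outputMode("append")
    .start("/mnt/lake/bronze/orders"))
```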

Posted 14 hours ago

Apply

7.0 - 9.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

We're Hiring: Senior Data Engineer (7+ Years Experience)
Location: Gurugram, Haryana, India
Duration: 6 Months C2H (Contract to Hire)
Apply Now: [HIDDEN TEXT]

What We're Looking For:
- 7+ years of experience in data engineering
- Strong expertise in building scalable, robust batch and real-time data pipelines
- Proficiency in AWS Data Services (S3, Glue, Athena, EMR, Kinesis, etc.)
- Advanced SQL skills and deep knowledge of file formats: Parquet, Delta Lake, Iceberg, Hudi
- Hands-on experience with CDC patterns (see the sketch below)
- Experience with stream processing (Apache Flink, Kafka Streams) and distributed frameworks like PySpark
- Expertise in Apache Airflow for workflow orchestration
- Solid foundation in data warehousing concepts and experience with both relational and NoSQL databases
- Strong communication and problem-solving skills
- Passion for staying up to date with the latest in the data tech landscape
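Because the posting highlights CDC patterns on Delta Lake with PySpark, the following is a small, illustrative upsert that applies a batch of change records to a target Delta table with MERGE. Table paths, column names, and the `op` flag convention are assumptions for the sketch, not details from the ad.

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

# Hypothetical CDC feed with an `op` column: 'I' insert, 'U' update, 'D' delete.
changes = spark.read.format("parquet").load("s3://example-bucket/cdc/customers/")
target = DeltaTable.forPath(spark, "s3://example-bucket/lake/silver/customers")

(target.alias("t")
    .merge(changes.alias("c"), "t.customer_id = c.customer_id")
    .whenMatchedDelete(condition="c.op = 'D'")                  # apply source deletes
    .whenMatchedUpdateAll(condition="c.op = 'U'")               # apply updates
    .whenNotMatchedInsertAll(condition="c.op IN ('I', 'U')")    # insert new rows
    .execute())
```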

Posted 15 hours ago

Apply

8.0 - 12.0 years

0 Lacs

Hyderabad, Telangana

On-site

You should possess 8-10 years of Data and Analytics experience, including at least 5 years in Azure Data Factory and a minimum of 2 years in Databricks. Your communication and presentation skills should be excellent. Mastery of Data Engineering and Data Analysis is required, along with Databricks certification. You should have expertise in ETL/ELT processes, specifically in Azure Data Factory, Synapse Pipelines, and data transformation tasks such as deduplication, filtering, and sorting. Proficiency in SQL and programming languages like Python is expected.

As part of your skill set, you should be proficient in Azure Data Engineering tools such as ADLS, SQL DB, Data Factory, Databricks, and Key Vault. Experience in job scheduling, automation, and orchestration using Azure Logic Apps, Functions, or any other ETL scheduler is essential. Designing and developing production data pipelines within a big data architecture using Python and Scala, along with a good understanding of Delta Lake, is crucial. Experience in building and delivering data analytics solutions using Azure is required.

You should be familiar with project delivery methodologies like Waterfall, Agile, and Scrum, and have experience managing both internal and external stakeholders. Additionally, working experience with reporting tools like Power BI, Tableau, etc., will be an advantage.
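To ground the transformation tasks mentioned above (deduplication, filtering, sorting), here is a small PySpark sketch of the kind of cleanup step that might run in a Databricks notebook called from an ADF or Synapse pipeline. The input path and column names are purely illustrative.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

orders = spark.read.format("delta").load("/mnt/lake/bronze/orders")  # hypothetical bronze table

cleaned = (
    orders
    .dropDuplicates(["order_id"])                 # deduplication on the business key
    .filter(F.col("order_status").isNotNull())    # filter out incomplete records
    .filter(F.col("order_amount") > 0)
    .orderBy(F.col("order_date").desc())          # sorting for downstream consumption
)

cleaned.write.format("delta").mode("overwrite").save("/mnt/lake/silver/orders")
```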

Posted 1 day ago

Apply

6.0 - 10.0 years

0 Lacs

Pune, Maharashtra

On-site

As a Senior Data Engineer at our Pune location, you will play a critical role in designing, developing, and maintaining scalable data pipelines and architectures using Databricks on Azure/AWS cloud platforms. With 6 to 9 years of experience in the field, you will collaborate with stakeholders to integrate large datasets, optimize performance, implement ETL/ELT processes, ensure data governance, and work closely with cross-functional teams to deliver accurate solutions.

Your responsibilities will include building, maintaining, and optimizing data workflows, integrating datasets from various sources, tuning pipelines for performance and scalability, implementing ETL/ELT processes using Spark and Databricks, ensuring data governance, collaborating with different teams, documenting data pipelines, and developing automated processes for continuous integration and deployment of data solutions.

To excel in this role, you should have 6 to 9 years of hands-on experience as a Data Engineer; expertise in Apache Spark, Delta Lake, and Azure/AWS Databricks; proficiency in Python, Scala, or Java; advanced SQL skills; and experience with cloud data platforms, data warehousing solutions, data modeling, ETL tools, version control systems, and automation tools. Additionally, soft skills such as problem-solving, attention to detail, and the ability to work in a fast-paced environment are essential. Nice-to-have skills include experience with Databricks SQL and Databricks Delta, knowledge of machine learning concepts, and experience in CI/CD pipelines for data engineering solutions.

Joining our team offers challenging work with international clients, growth opportunities, a collaborative culture, and global project involvement. We provide competitive salaries, flexible work schedules, health insurance, performance-based bonuses, and other standard benefits. If you are passionate about data engineering, possess the required skills and qualifications, and thrive in a dynamic and innovative environment, we welcome you to apply for this exciting opportunity.

Posted 1 day ago

Apply

10.0 - 14.0 years

0 Lacs

Dehradun, Uttarakhand

On-site

You should have familiarity with modern storage formats like Parquet and ORC. Your responsibilities will include designing and developing conceptual, logical, and physical data models to support enterprise data initiatives. You will build, maintain, and optimize data models within Databricks Unity Catalog, developing efficient data structures using Delta Lake to optimize performance, scalability, and reusability. Collaboration with data engineers, architects, analysts, and stakeholders is essential to ensure data model alignment with ingestion pipelines and business goals. You will translate business and reporting requirements into a robust data architecture using best practices in data warehousing and Lakehouse design. Additionally, maintaining comprehensive metadata artifacts such as data dictionaries, data lineage, and modeling documentation is crucial. Enforcing and supporting data governance, data quality, and security protocols across data ecosystems will be part of your role. You will continuously evaluate and improve modeling processes.

The ideal candidate will have 10+ years of hands-on experience in data modeling in Big Data environments. Expertise in OLTP, OLAP, dimensional modeling, and enterprise data warehouse practices is required. Proficiency in modeling methodologies including Kimball, Inmon, and Data Vault is expected. Hands-on experience with modeling tools like ER/Studio, ERwin, PowerDesigner, SQLDBM, dbt, or Lucidchart is preferred. Proven experience in Databricks with Unity Catalog and Delta Lake is necessary, along with a strong command of SQL and Apache Spark for querying and transformation. Experience with the Azure Data Platform, including Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, and Azure SQL Database, is beneficial. Exposure to Azure Purview or similar data cataloging tools is a plus. Strong communication and documentation skills are required, with the ability to work in cross-functional agile environments.

Qualifications for this role include a Bachelor's or Master's degree in Computer Science, Information Systems, Data Engineering, or a related field. Certifications such as Microsoft DP-203: Data Engineering on Microsoft Azure are desirable. Experience working in agile/scrum environments and exposure to enterprise data security and regulatory compliance frameworks (e.g., GDPR, HIPAA) are also advantageous.

Posted 1 day ago

Apply

5.0 - 9.0 years

0 Lacs

Karnataka

On-site

We are seeking experienced and talented engineers to join our team. Your main responsibilities will include designing, building, and maintaining the software that drives the global logistics industry. WiseTech Global is a leading provider of software for the logistics sector, facilitating connectivity for major companies like DHL and FedEx within their supply chains. Our organization is product- and engineer-focused, with a strong commitment to enhancing the functionality and quality of our software through continuous innovation. Our primary Research and Development center in Bangalore plays a pivotal role in our growth strategies and product development roadmap.

As a Lead Software Engineer, you will serve as a mentor, a leader, and an expert in your field. You should be adept at effective communication with senior management while also being hands-on with the code to deliver effective solutions.

The technical environment you will work in includes technologies such as C#, Java, C++, Python, Scala, Spring, Spring Boot, Apache Spark, Hadoop, Hive, Delta Lake, Kafka, Debezium, GKE (Kubernetes Engine), Composer (Airflow), DataProc, DataStreams, DataFlow, MySQL RDBMS, MongoDB NoSQL (Atlas), UIPath, Helm, Flyway, Sterling, EDI, Redis, Elastic Search, Grafana Dashboard, and Docker.

Before applying, please note that WiseTech Global may engage external service providers to assess applications. By submitting your application and personal information, you agree to WiseTech Global sharing this data with external service providers who will handle it confidentially in compliance with privacy and data protection laws.

Posted 2 days ago

Apply

6.0 - 10.0 years

0 Lacs

Karnataka

On-site

As a Lead Azure Data Engineer at CGI, you will have the opportunity to be part of a dynamic team of builders who are dedicated to helping clients succeed. With our global resources, expertise, and stability, we aim to achieve results for our clients and members. If you are looking for a challenging role that offers professional growth and development, this is the perfect opportunity for you.

In this role, you will be responsible for supporting the development and maintenance of our trading and risk data platform. Your main focus will be on designing and building data foundations and end-to-end solutions to maximize the value from data. You will collaborate with other data professionals to integrate and enrich trade data from various ETRM systems and create scalable solutions to enhance the usage of TRM data across different platforms and teams.

Key Responsibilities:
- Implement and manage a lakehouse using Databricks and the Azure tech stack (ADLS Gen2, ADF, Azure SQL).
- Utilize SQL, Python, Apache Spark, and Delta Lake for data engineering tasks.
- Implement data integration techniques, ETL processes, and data pipeline architectures.
- Develop CI/CD pipelines for code management using Git.
- Create and maintain technical documentation for the platform.
- Ensure the platform is developed with software engineering, data analytics, and data security best practices.
- Optimize data processing and storage systems for high performance, reliability, and security.
- Work in Agile methodology and utilize ADO Boards for sprint deliveries.
- Demonstrate excellent communication skills to convey technical and business concepts effectively.
- Collaborate with team members at all levels to share ideas and knowledge effectively.

Required Qualifications:
- Bachelor's degree in computer science or a related field.
- 6 to 10 years of experience in software development/engineering.
- Proficiency in Azure technologies including Databricks, ADLS Gen2, ADF, and Azure SQL.
- Strong hands-on experience with SQL, Python, Apache Spark, and Delta Lake.
- Knowledge of data integration techniques, ETL processes, and data pipeline architectures.
- Experience in building CI/CD pipelines and using Git for code management.
- Familiarity with Agile methodology and ADO Boards for sprint deliveries.

At CGI, we believe in ownership, teamwork, respect, and belonging. As a CGI Partner, you will have the opportunity to turn meaningful insights into action, develop innovative solutions, and collaborate with a diverse team to shape your career and contribute to our collective success. Join us on this exciting journey of growth and innovation at one of the largest IT and business consulting services firms in the world.
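As a rough sketch of the trade-data enrichment work described above (not CGI's actual implementation), the snippet below joins raw ETRM trade records with a counterparty reference table in Delta Lake and writes a curated output. All table paths and columns are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

trades = spark.read.format("delta").load("abfss://lake@example.dfs.core.windows.net/silver/etrm_trades")
counterparties = spark.read.format("delta").load("abfss://lake@example.dfs.core.windows.net/silver/counterparties")

enriched = (
    trades.join(counterparties, on="counterparty_id", how="left")
    .withColumn("notional_eur", F.col("notional") * F.col("fx_rate_to_eur"))  # simple enrichment rule
    .withColumn("_processed_at", F.current_timestamp())
)

(enriched.write.format("delta")
    .mode("overwrite")
    .option("overwriteSchema", "true")
    .save("abfss://lake@example.dfs.core.windows.net/gold/trades_enriched"))
```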

Posted 2 days ago

Apply

2.0 - 6.0 years

0 Lacs

Jaipur, Rajasthan

On-site

As a Databricks Engineer specializing in the Azure Data Platform, you will be responsible for designing, developing, and optimizing scalable data pipelines within the Azure ecosystem. You should have hands-on experience with Python-based ETL development, Lakehouse architecture, and building Databricks workflows utilizing the bronze-silver-gold data modeling approach.

Your key responsibilities will include developing and maintaining ETL pipelines using Python and Apache Spark in Azure Databricks, implementing and managing bronze-silver-gold data lake layers using Delta Lake, and working with various Azure services such as Azure Data Lake Storage (ADLS), Azure Data Factory (ADF), and Azure Synapse for end-to-end pipeline orchestration. It will be crucial to ensure data quality, integrity, and lineage across all layers of the data pipeline, optimize Spark performance, manage cluster configurations, and schedule jobs effectively in Databricks. Collaboration with data analysts, architects, and business stakeholders to deliver data-driven solutions will also be part of your role.

To be successful in this role, you should have at least 3+ years of experience with Python in a data engineering environment, 2+ years of hands-on experience with Azure Databricks and Apache Spark, and a strong background in building scalable data lake pipelines following the bronze-silver-gold architecture. In-depth knowledge of Delta Lake, Parquet, and data versioning, along with familiarity with Azure Data Factory, ADLS Gen2, and SQL, is required. Experience with CI/CD pipelines and job orchestration tools such as Azure DevOps or Airflow would be advantageous. Excellent communication skills, both verbal and written, are essential.

Nice-to-have qualifications include experience with data governance, security, and monitoring in Azure, exposure to real-time streaming or event-driven pipelines (Kafka, Event Hub), and an understanding of MLflow, Unity Catalog, or other data cataloging tools.

By joining our team, you will have the opportunity to be part of high-impact, cloud-native data initiatives, work in a collaborative and growth-oriented team focused on innovation, and contribute to modern data architecture standards using the latest Azure technologies. If you are ready to advance your career as a Databricks Engineer on the Azure Data Platform, please send your updated resume to hr@vidhema.com. We look forward to hearing from you and potentially welcoming you to our team.
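Here is a compact, assumed sketch of the bronze-silver-gold layering the posting describes, written as PySpark against Delta tables in ADLS. Storage paths, column names, and cleansing rules are illustrative only.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
base = "abfss://lake@examplestorage.dfs.core.windows.net"  # hypothetical ADLS container

# Bronze: raw events exactly as landed.
bronze = spark.read.format("delta").load(f"{base}/bronze/sales_events")

# Silver: cleansed and conformed records.
silver = (
    bronze.dropDuplicates(["event_id"])
    .filter(F.col("amount").isNotNull())
    .withColumn("event_date", F.to_date("event_ts"))
)
silver.write.format("delta").mode("overwrite").save(f"{base}/silver/sales_events")

# Gold: business-level aggregate ready for reporting.
gold = silver.groupBy("event_date", "store_id").agg(F.sum("amount").alias("daily_sales"))
gold.write.format("delta").mode("overwrite").save(f"{base}/gold/daily_store_sales")
```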

Posted 2 days ago

Apply

6.0 - 10.0 years

0 Lacs

Karnataka

On-site

At EY, you'll have the chance to build a career as unique as you are, with the global scale, support, inclusive culture and technology to become the best version of you. And we're counting on your unique voice and perspective to help EY become even better, too. Join us and build an exceptional experience for yourself, and a better working world for all.

Our Technology team builds innovative digital solutions rapidly and at scale to deliver the next generation of Financial and Non-Financial services across the globe. The position is a senior technical, hands-on delivery role, requiring knowledge of data engineering, cloud infrastructure and platform engineering, platform operations, and production support using ground-breaking cloud and big data technologies. The ideal candidate with 6-8 years of experience will possess strong technical skills, an eagerness to learn, a keen interest in the three key pillars that our team supports (Financial Crime, Financial Risk, and Compliance technology transformation), the ability to work collaboratively in a fast-paced environment, and an aptitude for picking up new tools and techniques on the job, building on existing skill sets as a foundation.

In this role, you will:
- Ingest and provision raw datasets, enriched tables, and/or curated, re-usable data assets to enable a variety of use cases.
- Drive improvements in the reliability and frequency of data ingestion, including increasing real-time coverage.
- Support and enhance data ingestion infrastructure and pipelines.
- Design and implement data pipelines that collect data from disparate sources across the enterprise and external sources and deliver it to our data platform.
- Build Extract, Transform, and Load (ETL) workflows, using both advanced data manipulation tools and programmatic manipulation, ensuring data is available at each stage in the data flow and in the form needed for each system, service, and customer along that flow.
- Identify and onboard data sources using existing schemas and, where required, conduct exploratory data analysis to investigate and provide solutions.
- Evaluate modern technologies, frameworks, and tools in the data engineering space to drive innovation and improve data processing capabilities.

Core/Must-Have Skills:
- 3-8 years of expertise in designing and implementing data warehouses and data lakes using the Oracle tech stack (ETL: ODI, SSIS; DB: PL/SQL and AWS Redshift).
- At least 4+ years of experience managing data extraction, transformation, and loading from various sources using Oracle Data Integrator, with exposure to other tools like SSIS.
- At least 4+ years of experience in database design and dimension modeling using Oracle PL/SQL and Microsoft SQL Server.
- Experience in developing ETL processes (ETL control tables, error logging, auditing, data quality, etc.) and implementing reusability, parameterization, and workflow design.
- Advanced working SQL knowledge and experience working with relational and NoSQL databases, as well as working familiarity with a variety of databases (Oracle, SQL Server, Neo4j).
- Strong analytical and critical thinking skills, with the ability to identify and resolve issues in data pipelines and systems.
- Expertise in data modeling and DB design with skills in performance tuning.
- Experience with OLAP and OLTP databases and data structuring/modeling with an understanding of key data points.
- Experience building and optimizing data pipelines on Azure Databricks, AWS Glue, or Oracle Cloud.
- Ability to create and support ETL pipelines and table schemas to facilitate the accommodation of new and existing data sources for the Lakehouse.
- Experience with data visualization (Power BI/Tableau) and SSRS.

Good to Have:
- Experience working in Financial Crime, Financial Risk, and Compliance technology transformation domains.
- Certification on any cloud tech stack, preferably Microsoft Azure.
- In-depth knowledge and hands-on experience with data engineering, data warehousing, and Delta Lake on-prem (Oracle RDBMS, Microsoft SQL Server) and in the cloud (Azure, AWS, or Oracle Cloud).
- Ability to script (Bash, Azure CLI), code (Python, C#), and query (SQL, PL/SQL, T-SQL), coupled with software version control systems (e.g., GitHub) and CI/CD systems.
- Design and development of systems for the maintenance of the Azure/AWS Lakehouse, ETL processes, business intelligence, and data ingestion pipelines for AI/ML use cases.

EY | Building a better working world

EY exists to build a better working world, helping to create long-term value for clients, people, and society and build trust in the capital markets. Enabled by data and technology, diverse EY teams in over 150 countries provide trust through assurance and help clients grow, transform, and operate. Working across assurance, consulting, law, strategy, tax, and transactions, EY teams ask better questions to find new answers for the complex issues facing our world today.

Posted 2 days ago

Apply

0.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Work Location: Hyderabad

What Gramener offers you: Gramener will offer you an inviting workplace, talented colleagues from diverse backgrounds, career paths, and steady growth prospects with great scope to innovate. We aim to create an ecosystem of easily configurable data applications focused on storytelling for public and private use.

Data Architect

We are seeking an experienced Data Architect to design and govern scalable, secure, and efficient data platforms in a data mesh environment. You will lead data architecture initiatives across multiple domains, enabling self-serve data products built on Databricks and AWS, and support both operational and analytical use cases.

Key Responsibilities:
- Design and implement enterprise-grade data architectures leveraging the medallion architecture (Bronze, Silver, Gold).
- Develop and enforce data modeling standards, including flattened data models optimized for analytics.
- Define and implement MDM strategies (Reltio), data governance frameworks (Collibra), and data classification policies.
- Lead the development of data landscapes, capturing sources, flows, transformations, and consumption layers.
- Collaborate with domain teams to ensure consistency across decentralized data products in a data mesh architecture.
- Guide best practices for ingesting and transforming data using Fivetran, PySpark, SQL, and Delta Live Tables (DLT); see the sketch after this listing.
- Define metadata and data quality standards across domains.
- Provide architectural oversight for data platform development on Databricks (Lakehouse) and the AWS ecosystem.

Key Skills & Qualifications

Must-Have Technical Skills (Reltio, Collibra, Ataccama, Immuta):
- Experience in the Pharma domain.
- Data modeling (dimensional, flattened, common data model, canonical, and domain-specific; entity-level data understanding from a business process point of view).
- Master Data Management (MDM) principles and tools (Reltio).
- Data governance and data classification frameworks.
- Strong experience with Fivetran, PySpark, SQL, and Python.
- Deep understanding of Databricks (Delta Lake, Unity Catalog, Workflows, DLT).
- Experience with AWS services related to data (e.g., S3, Glue, Redshift, IAM).
- Experience with Snowflake.

Architecture & Design:
- Proven expertise in Data Mesh or domain-oriented data architecture.
- Experience with medallion/lakehouse architecture.
- Ability to create data blueprints and landscape maps across complex enterprise systems.

Soft Skills:
- Strong stakeholder management across business and technology teams.
- Ability to translate business requirements into scalable data designs.
- Excellent communication and documentation skills.

Preferred Qualifications:
- Familiarity with regulatory and compliance frameworks (e.g., GxP, HIPAA, GDPR).
- Background in data product building.

About Us: We consult and deliver solutions to organizations where data is the core of decision-making. We undertake strategic data consulting for organizations, laying out the roadmap for data-driven decision-making. This helps organizations convert data into a strategic differentiator. Through a host of our products, solutions, and service offerings, we analyze and visualize large amounts of data. To know more about us, visit the Gramener Website and Gramener Blog.
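As a hedged illustration of the Delta Live Tables (DLT) ingestion pattern referenced in the responsibilities above, the sketch below defines a bronze and a silver table with a simple data quality expectation. The source path, table names, and quality rule are assumptions, not Gramener specifics.

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Bronze: raw patient events as landed by the ingestion tool")
def bronze_patient_events():
    # Hypothetical raw landing zone populated by Fivetran or a similar tool.
    return spark.read.format("json").load("s3://example-bucket/landing/patient_events/")

@dlt.table(comment="Silver: cleansed patient events")
@dlt.expect_or_drop("valid_patient_id", "patient_id IS NOT NULL")  # drop rows failing the rule
def silver_patient_events():
    return (
        dlt.read("bronze_patient_events")
        .dropDuplicates(["event_id"])
        .withColumn("event_date", F.to_date("event_ts"))
    )
```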

Posted 2 days ago

Apply

7.0 - 9.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Bangalore/Gurugram/Hyderabad
Experience: 7+ years

We are seeking a talented Data Engineer with strong expertise in Databricks, specifically in Unity Catalog, PySpark, and SQL, to join our data team. You'll play a key role in building secure, scalable data pipelines and implementing robust data governance strategies using Unity Catalog.

Key Responsibilities:
- Design and implement ETL/ELT pipelines using Databricks and PySpark.
- Work with Unity Catalog to manage data governance, access controls, lineage, and auditing across data assets (see the sketch after this listing).
- Develop high-performance SQL queries and optimize Spark jobs.
- Collaborate with data scientists, analysts, and business stakeholders to understand data needs.
- Ensure data quality and compliance across all stages of the data lifecycle.
- Implement best practices for data security and lineage within the Databricks ecosystem.
- Participate in CI/CD, version control, and testing practices for data pipelines.

Required Skills:
- Proven experience with Databricks and Unity Catalog (data permissions, lineage, audits).
- Strong hands-on skills with PySpark and Spark SQL.
- Solid experience writing and optimizing complex SQL queries.
- Familiarity with Delta Lake, data lakehouse architecture, and data partitioning.
- Experience with cloud platforms like Azure or AWS.
- Understanding of data governance, RBAC, and data security standards.

Preferred Qualifications:
- Databricks Certified Data Engineer Associate or Professional.
- Experience with tools like Airflow, Git, Azure Data Factory, or dbt.
- Exposure to streaming data and real-time processing.
- Knowledge of DevOps practices for data engineering.
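To make the Unity Catalog governance bullet more concrete, here is a minimal, assumed sketch of granting role-based access on a catalog, schema, and table from a Databricks notebook. Catalog, schema, table, and group names are hypothetical.

```python
# Assumes a Unity Catalog-enabled Databricks workspace where `spark` is the active session
# and the account-level group `data_analysts` already exists.
spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `data_analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA analytics.sales TO `data_analysts`")
spark.sql("GRANT SELECT ON TABLE analytics.sales.orders TO `data_analysts`")

# Quick audit of the grants currently in place on the table.
spark.sql("SHOW GRANTS ON TABLE analytics.sales.orders").show(truncate=False)
```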

Posted 2 days ago

Apply

5.0 - 10.0 years

15 - 25 Lacs

Hyderabad/Secunderabad, Bangalore/Bengaluru, Delhi / NCR

Hybrid

Ready to shape the future of work? At Genpact, we don't just adapt to change, we drive it. AI and digital innovation are redefining industries, and we're leading the charge. Genpact's AI Gigafactory, our industry-first accelerator, is an example of how we're scaling advanced technology solutions to help global enterprises work smarter, grow faster, and transform at scale. From large-scale models to agentic AI, our breakthrough solutions tackle companies' most complex challenges. If you thrive in a fast-moving, tech-driven environment, love solving real-world problems, and want to be part of a team that's shaping the future, this is your moment.

Genpact (NYSE: G) is an advanced technology services and solutions company that delivers lasting value for leading enterprises globally. Through our deep business knowledge, operational excellence, and cutting-edge solutions, we help companies across industries get ahead and stay ahead. Powered by curiosity, courage, and innovation, our teams implement data, technology, and AI to create tomorrow, today. Get to know us at genpact.com and on LinkedIn, X, YouTube, and Facebook.

Inviting applications for the role of Senior Principal Consultant, AWS DataLake!

Responsibilities:
- Knowledge of Data Lake on AWS services, with exposure to creating external tables and Spark programming (see the sketch after this listing).
- Ability to work on Python programming, writing effective and scalable Python code for automation, data wrangling, and ETL.
- Designing and implementing robust applications and working on automation using Python code.
- Debugging applications to ensure low latency and high availability.
- Writing optimized custom SQL queries.
- Experience in team and client handling.
- Prowess in documentation related to systems, design, and delivery.
- Integrating user-facing elements into applications.
- Knowledge of external tables and Data Lake concepts.
- Ability to allocate tasks, collaborate on status exchanges, and bring things to successful closure.
- Implementing security and data protection solutions.
- Must be capable of writing SQL queries for validating dashboard outputs.
- Must be able to translate visual requirements into detailed technical specifications.
- Well versed in handling Excel, CSV, text, JSON, and other unstructured file formats using Python.
- Expertise in at least one popular Python framework (like Django, Flask, or Pyramid).
- Good understanding of and exposure to Git, Bamboo, Confluence, and Jira.
- Good with DataFrames and ANSI SQL using pandas.
- Team player with a collaborative approach and excellent communication skills.

Qualifications we seek in you!

Minimum Qualifications:
- BE/B Tech/MCA.
- Excellent written and verbal communication skills.
- Good knowledge of Python and PySpark.

Preferred Qualifications/Skills:
- Strong ETL knowledge of any ETL tool is good to have.
- Knowledge of AWS cloud and Snowflake is good to have.
- Knowledge of PySpark is a plus.

Why join Genpact?
- Be a transformation leader: work at the cutting edge of AI, automation, and digital innovation.
- Make an impact: drive change for global enterprises and solve business challenges that matter.
- Accelerate your career: get hands-on experience, mentorship, and continuous learning opportunities.
- Work with the best: join 140,000+ bold thinkers and problem-solvers who push boundaries every day.
- Thrive in a values-driven culture: our courage, curiosity, and incisiveness, built on a foundation of integrity and inclusion, allow your ideas to fuel progress.

Come join the tech shapers and growth makers at Genpact and take your career in the only direction that matters: Up. Let's build tomorrow together.

Genpact is an Equal Opportunity Employer and considers applicants for all positions without regard to race, color, religion or belief, sex, age, national origin, citizenship status, marital status, military/veteran status, genetic information, sexual orientation, gender identity, physical or mental disability, or any other characteristic protected by applicable laws. Genpact is committed to creating a dynamic work environment that values respect and integrity, customer focus, and innovation. Furthermore, please note that Genpact does not charge fees to process job applications, and applicants are not required to pay to participate in our hiring process in any other way. Examples of such scams include purchasing a 'starter kit,' paying to apply, or purchasing equipment or training.
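As a loose illustration of the "external tables over a data lake" responsibility mentioned above (not Genpact's actual setup), the sketch below registers an external table over Parquet files in S3 using Spark SQL and queries it. Bucket, database, and column names are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("CREATE DATABASE IF NOT EXISTS lake_db")

# The table's data stays in S3; only metadata is registered in the catalog,
# so dropping the table does not delete the underlying files.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lake_db.customer_orders (
        order_id    STRING,
        customer_id STRING,
        order_ts    TIMESTAMP,
        amount      DOUBLE
    )
    USING PARQUET
    LOCATION 's3://example-bucket/lake/customer_orders/'
""")

spark.sql("""
    SELECT customer_id, SUM(amount) AS total_spend
    FROM lake_db.customer_orders
    GROUP BY customer_id
    ORDER BY total_spend DESC
    LIMIT 10
""").show()
```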

Posted 3 days ago

Apply

9.0 - 14.0 years

20 - 35 Lacs

Noida

Remote

Job Title: Databricks SME Engineer
Work Timing: US EST Hours
Experience: 8+ years
Location: Remote

Job Responsibilities (hands-on Databricks SME Engineer):
- Architect, configure, and optimize Databricks pipelines for large-scale data processing within an Azure Data Lakehouse environment.
- Set up and manage Azure infrastructure components including Databricks Workspaces, Azure containers (AKS/ACI), storage accounts, and networking.
- Design and implement a monitoring and observability framework using tools like Azure Monitor, Log Analytics, and Prometheus/Grafana.
- Collaborate with platform and data engineering teams to enable a microservices-based architecture for scalable and modular data solutions.
- Drive automation and CI/CD practices using Terraform, ARM templates, and GitHub Actions/Azure DevOps.

Required Skills & Experience:
- Strong hands-on experience with Azure Databricks, Delta Lake, and Apache Spark.
- Deep understanding of Azure services: Resource Manager, AKS, ACR, Key Vault, and networking.
- Proven experience in microservices architecture and container orchestration.
- Expertise in infrastructure-as-code, scripting (Python, Bash), and DevOps tooling.
- Familiarity with data governance, security, and cost optimization in cloud environments.

Bonus:
- Experience with event-driven architectures (Kafka/Event Grid).
- Knowledge of data mesh principles and distributed data ownership.

Interested candidates can apply: dsingh15@fcsltd.com

Posted 3 days ago

Apply

10.0 - 14.0 years

0 Lacs

Pune, Maharashtra

On-site

About the Team

As a part of the DoorDash organization, you will be joining a data-driven team that values timely, accurate, and reliable data to make informed business and product decisions. Data serves as the foundation of DoorDash's success, and the Data Engineering team is responsible for building database solutions tailored to various use cases such as reporting, product analytics, marketing optimization, and financial reporting. By implementing robust data structures and data warehouse architecture, this team plays a crucial role in facilitating decision-making processes at DoorDash. Additionally, the team focuses on enhancing the developer experience by developing tools that support the organization's high-velocity demands.

About the Role

DoorDash is seeking a dedicated Data Engineering Manager to lead the development of enterprise-scale data solutions. In this role, you will serve as a technical expert on all aspects of data architecture, empowering data engineers, data scientists, and DoorDash partners. Your responsibilities will include fostering a culture of engineering excellence, enabling engineers to deliver reliable and flexible solutions at scale. Furthermore, you will be instrumental in building and nurturing a high-performing team, driving innovation and success in a dynamic and fast-paced environment.

In this role, you will:
- Lead and manage a team of data engineers, focusing on hiring, building, growing, and nurturing impactful business-focused data teams.
- Drive the technical and strategic vision for embedded pods and foundational enablers to meet current and future scalability and interoperability needs.
- Strive for continuous improvement of data architecture and development processes.
- Balance quick wins with long-term strategy and engineering excellence, breaking down large systems into user-friendly data assets and reusable components.
- Collaborate cross-functionally with stakeholders, external partners, and peer data leaders.
- Utilize effective planning and execution tools to ensure short-term and long-term team and stakeholder success.
- Prioritize reliability and quality as essential components of data solutions.

Qualifications:
- Bachelor's, Master's, or Ph.D. in Computer Science or equivalent field.
- Over 10 years of experience in data engineering, data platform, or related domains.
- Minimum of 2 years of hands-on management experience.
- Strong communication and leadership skills, with a track record of hiring and growing teams in a fast-paced environment.
- Proficiency in programming languages such as Python, Kotlin, and SQL.
- Prior experience with technologies like Snowflake, Databricks, Spark, Trino, and Pinot.
- Familiarity with the AWS ecosystem and large-scale batch/real-time ETL orchestration using tools like Airflow, Kafka, and Spark Streaming.
- Knowledge of data lake file formats including Delta Lake, Apache Iceberg, Glue Catalog, and S3.
- Proficiency in system design and experience with AI solutions in the data space.

At DoorDash, we are dedicated to fostering a diverse and inclusive community within our company and beyond. We believe that innovation thrives in an environment where individuals from diverse backgrounds, experiences, and perspectives come together. We are committed to providing equal opportunities for all and creating an inclusive workplace where everyone can excel and contribute to our collective success.

Posted 3 days ago

Apply

4.0 - 8.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

As a Data Engineer MS Fabric at our Chennai-Excelencia Office location, you will leverage your 4+ years of experience to design, build, and optimize data pipelines using Microsoft Fabric, Azure Data Factory, and Synapse Analytics. Your primary responsibilities will include developing and maintaining Lakehouses, Notebooks, and data flows within the Microsoft Fabric ecosystem, ensuring efficient data integration, quality, and governance across OneLake and other Fabric components, and implementing real-time analytics pipelines for high-throughput data processing.

To excel in this role, you must have proficiency in Microsoft Fabric, Azure Data Factory (ADF), Azure Synapse Analytics, Delta Lake, OneLake, Lakehouses, Python, PySpark, Spark SQL, T-SQL, and ETL/ELT development. Your work will involve collaborating with cross-functional teams to define and deliver end-to-end data engineering solutions, participating in Agile ceremonies, and utilizing tools like JIRA for project tracking and delivery. Additionally, you will be tasked with performing complex data transformations using various data formats and handling large-scale data warehousing and analytics workloads.

Preferred skills for this position include a strong understanding of distributed computing and cloud-native data architecture, experience with DataOps practices and data quality frameworks, familiarity with CI/CD for data pipelines, and proficiency in monitoring tools and job scheduling frameworks to ensure the reliability and performance of data systems. Strong problem-solving and analytical thinking, excellent communication and collaboration skills, and a self-motivated and proactive approach with a continuous learning mindset are essential soft skills required for success in this role.

Posted 3 days ago

Apply

3.0 - 7.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

As a Data Engineer, you will be responsible for developing and maintaining a metadata-driven generic ETL framework to automate ETL code. Your primary tasks will include designing, building, and optimizing ETL/ELT pipelines using Databricks (PySpark/SQL) on AWS. You will be required to ingest data from a variety of structured and unstructured sources such as APIs, RDBMS, flat files, and streaming services.

In this role, you will also develop and maintain robust data pipelines for both batch and streaming data utilizing Delta Lake and Spark Structured Streaming. Implementing data quality checks, validations, and logging mechanisms will be essential to ensure data accuracy and reliability. You will work on optimizing pipeline performance, cost, and reliability, and collaborate closely with data analysts, BI teams, and business stakeholders to deliver high-quality datasets. Additionally, you will support data modeling efforts, including star and snowflake schemas and denormalized table approaches, and assist in data warehousing initiatives. Your responsibilities will also involve working with orchestration tools like Databricks Workflows to schedule and monitor pipelines effectively.

To excel in this role, you should have hands-on experience in ETL/Data Engineering roles and possess strong expertise in Databricks (PySpark, SQL, Delta Lake). Experience with Spark optimization, partitioning, caching, and handling large-scale datasets is crucial. Proficiency in SQL and scripting in Python or Scala is required, along with a solid understanding of data lakehouse/medallion architectures and modern data platforms. Knowledge of cloud storage systems like AWS S3, familiarity with DevOps practices (Git, CI/CD, Terraform, etc.), and strong debugging, troubleshooting, and performance-tuning skills are also essential for this position. Following best practices for version control, CI/CD, and collaborative development will be a key part of your responsibilities.

If you are passionate about data engineering, enjoy working with cutting-edge technologies, and thrive in a collaborative environment, this role offers an exciting opportunity to contribute to the success of data-driven initiatives within the organization.
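For a sense of what a metadata-driven ETL framework can look like in practice, here is a deliberately simplified sketch: a control table lists each source and its target, and one generic loop ingests them all. Table names, columns, and paths are assumptions for illustration, not the framework described in the posting.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Control/metadata table: one row per source to ingest.
# Assumed columns: source_name, source_format, source_path, target_path, load_mode
control = spark.read.format("delta").load("s3://example-bucket/etl_control/sources").collect()

for row in control:
    df = (
        spark.read.format(row["source_format"])   # e.g. "csv", "json", "parquet"
        .option("header", "true")
        .load(row["source_path"])
        .withColumn("_ingested_at", F.current_timestamp())
        .withColumn("_source_name", F.lit(row["source_name"]))
    )
    (df.write.format("delta")
        .mode(row["load_mode"])                   # "append" or "overwrite", per the metadata
        .save(row["target_path"]))
```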

Posted 3 days ago

Apply

5.0 - 10.0 years

0 Lacs

Hyderabad, Telangana

On-site

You are an experienced Azure Databricks Engineer who will be responsible for designing, developing, and maintaining scalable data pipelines and supporting data infrastructure in an Azure cloud environment. Your key responsibilities will include designing ETL pipelines using Azure Databricks, building robust data architectures on Azure, collaborating with stakeholders to define data requirements, optimizing data pipelines for performance and reliability, implementing data transformations and cleansing processes, managing Databricks clusters, and leveraging Azure services for data orchestration and storage.

You must possess 5-10 years of experience in data engineering or a related field, with extensive hands-on experience in Azure Databricks and Apache Spark. Strong knowledge of Azure cloud services such as Azure Data Lake, Data Factory, Azure SQL, and Azure Synapse Analytics is required. Experience with Python, Scala, or SQL for data manipulation, ETL frameworks, Delta Lake, Parquet formats, Azure DevOps, CI/CD pipelines, big data architecture, and distributed systems is essential. Knowledge of data modeling, performance tuning, and optimization of big data solutions is expected, along with problem-solving skills and the ability to work in a collaborative environment.

Preferred qualifications include experience with real-time data streaming tools, Azure certifications, machine learning frameworks and their integration with Databricks, and data visualization tools like Power BI. A bachelor's degree in Computer Science, Data Engineering, Information Technology, or a related field is required for this role.

Posted 3 days ago

Apply

8.0 - 10.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

This job is with Kyndryl, an inclusive employer and a member of myGwork, the largest global platform for the LGBTQ+ business community. Please do not contact the recruiter directly.

Who We Are

At Kyndryl, we design, build, manage and modernize the mission-critical technology systems that the world depends on every day. So why work at Kyndryl? We are always moving forward - always pushing ourselves to go further in our efforts to build a more equitable, inclusive world for our employees, our customers and our communities.

The Role

As a Data Engineer, you will leverage your expertise in Databricks, big data platforms, and modern data engineering practices to develop scalable data solutions for our clients. Candidates with healthcare experience, particularly with EPIC systems, are strongly encouraged to apply. This includes creating data pipelines, integrating data from various sources, and implementing data security and privacy measures. The Data Engineer will also be responsible for monitoring and troubleshooting data flows and optimizing data storage and processing for performance and cost efficiency.

Responsibilities
- Develop data ingestion, data processing, and analytical pipelines for big data, relational databases, and data warehouse solutions.
- Design and implement data pipelines and ETL/ELT processes using Databricks, Apache Spark, and related tools.
- Collaborate with business stakeholders, analysts, and data scientists to deliver accessible, high-quality data solutions.
- Provide guidance on cloud migration strategies and data architecture patterns such as Lakehouse and Data Mesh.
- Provide pros/cons and migration considerations for private and public cloud architectures.
- Provide technical expertise in troubleshooting, debugging, and resolving complex data and system issues.
- Create and maintain technical documentation, including system diagrams, deployment procedures, and troubleshooting guides.
- Work with data governance, data security, and data privacy (Unity Catalog or Purview).

Your Future at Kyndryl

Every position at Kyndryl offers a way forward to grow your career. We have opportunities that you won't find anywhere else, including hands-on experience, learning opportunities, and the chance to certify in all four major platforms. Whether you want to broaden your knowledge base or narrow your scope and specialize in a specific sector, you can find your opportunity here.

Who You Are

You're good at what you do and possess the required experience to prove it. However, equally as important - you have a growth mindset; keen to drive your own personal and professional development. You are customer-focused - someone who prioritizes customer success in their work. And finally, you're open and borderless - naturally inclusive in how you work with others.

Required Technical And Professional Experience
- 3+ years of consulting or client service delivery experience on Azure.
- Graduate/Postgraduate in computer science, computer engineering, or equivalent, with a minimum of 8 years of experience in the IT industry.
- 3+ years of experience in developing data ingestion, data processing, and analytical pipelines for big data, relational databases such as SQL Server, and data warehouse solutions such as Azure Synapse.
- Extensive hands-on experience implementing data ingestion, ETL, and data processing.
- Hands-on experience with Big Data technologies such as Java, Python, SQL, ADLS/Blob, PySpark and Spark SQL, Databricks, HDInsight, and live streaming technologies such as Event Hub.
- Experience with cloud-based database technologies (Azure PaaS DB, AWS RDS, and NoSQL).
- Cloud migration methodologies and processes, including tools like Azure Data Factory, Data Migration Service, etc.
- Experience with monitoring and diagnostic tools (SQL Profiler, Extended Events, etc.).
- Expertise in data mining, data storage, and Extract-Transform-Load (ETL) processes.
- Experience with relational databases and expertise in writing and optimizing T-SQL queries and stored procedures.
- Experience in using Big Data file formats and compression techniques.
- Experience with developer tools such as Azure DevOps, Visual Studio Team Server, Git, Jenkins, etc.
- Experience with private and public cloud architectures, pros/cons, and migration considerations.
- Excellent problem-solving, analytical, and critical thinking skills.
- Ability to manage multiple projects simultaneously while maintaining a high level of attention to detail.
- Communication skills: must be able to communicate with both technical and nontechnical audiences and derive technical requirements with stakeholders.

Preferred Technical And Professional Experience
- Cloud platform certification, e.g., Microsoft Certified: Azure Data Engineer Associate (DP-700), AWS Certified Data Analytics - Specialty, Elastic Certified Engineer, Google Cloud Professional Data Engineer.
- Professional certification, e.g., Open Certified Technical Specialist with Data Engineering Specialization.
- Experience working with EPIC healthcare systems (e.g., Clarity, Caboodle).
- Databricks certifications (e.g., Databricks Certified Data Engineer Associate or Professional).
- Knowledge of GenAI tools, Microsoft Fabric, or Microsoft Copilot.
- Familiarity with healthcare data standards and compliance (e.g., HIPAA, GDPR).
- Experience with DevSecOps and CI/CD deployments.
- Experience in NoSQL database design.
- Knowledge of GenAI fundamentals and industry supporting use cases.
- Hands-on experience with Delta Lake and Delta Tables within the Databricks environment for building scalable and reliable data pipelines.

Being You

Diversity is a whole lot more than what we look like or where we come from; it's how we think and who we are. We welcome people of all cultures, backgrounds, and experiences. But we're not doing it single-handedly: our Kyndryl Inclusion Networks are only one of many ways we create a workplace where all Kyndryls can find and provide support and advice. This dedication to welcoming everyone into our company means that Kyndryl gives you - and everyone next to you - the ability to bring your whole self to work, individually and collectively, and support the activation of our equitable culture. That's the Kyndryl Way.

What You Can Expect

With state-of-the-art resources and Fortune 100 clients, every day is an opportunity to innovate, build new capabilities, new relationships, new processes, and new value. Kyndryl cares about your well-being and prides itself on offering benefits that give you choice, reflect the diversity of our employees, and support you and your family through the moments that matter - wherever you are in your life journey. Our employee learning programs give you access to the best learning in the industry to receive certifications, including Microsoft, Google, Amazon, Skillsoft, and many more. Through our company-wide volunteering and giving platform, you can donate, start fundraisers, volunteer, and search over 2 million non-profit organizations. At Kyndryl, we invest heavily in you; we want you to succeed so that together, we will all succeed.

Get Referred!

If you know someone that works at Kyndryl, when asked 'How Did You Hear About Us' during the application process, select 'Employee Referral' and enter your contact's Kyndryl email address.

Posted 3 days ago

Apply

7.0 - 9.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Teamwork makes the stream work. Roku is changing how the world watches TV. Roku is the #1 TV streaming platform in the U.S., Canada, and Mexico, and we've set our sights on powering every television in the world. Roku pioneered streaming to the TV. Our mission is to be the TV streaming platform that connects the entire TV ecosystem. We connect consumers to the content they love, enable content publishers to build and monetize large audiences, and provide advertisers unique capabilities to engage consumers. From your first day at Roku, you'll make a valuable - and valued - contribution. We're a fast-growing public company where no one is a bystander. We offer you the opportunity to delight millions of TV streamers around the world while gaining meaningful experience across a variety of disciplines.

About the team: Roku runs one of the largest data lakes in the world. We store over 70 PB of data, run 10+ million queries per month, and scan over 100 PB of data per month. The Big Data team is responsible for building, running, and supporting the platform that makes this possible. We provide all the tools needed to acquire, generate, process, monitor, validate, and access the data in the lake, for both streaming and batch data. We are also responsible for generating the foundational data. The systems we provide include Scribe, Kafka, Hive, Presto, Spark, Flink, Pinot, and others. The team is actively involved in open source, and we are planning to increase our engagement over time.

About the Role: Roku is in the process of modernizing its Big Data Platform. We are defining a new architecture to improve user experience, minimize cost, and increase efficiency. Are you interested in helping us build this state-of-the-art big data platform? Are you an expert in Big Data technologies? Have you looked under the hood of these systems? Are you interested in open source? If you answered yes to these questions, this role is for you!

What you will be doing: You will be responsible for streamlining and tuning existing Big Data systems and pipelines and building new ones; making sure the systems run efficiently and with minimal cost is a top priority (a brief tuning sketch follows this listing). You will be making changes to the underlying systems and, if an opportunity arises, you can contribute your work back to open source. You will also be responsible for supporting internal customers and the on-call services for the systems we host. Providing a stable environment and a great user experience is another top priority for the team.

We are excited if you have:
- 7+ years of production experience building big data platforms based on Spark, Trino, or equivalent
- Strong programming expertise in Java, Scala, Kotlin, or another JVM language
- A robust grasp of distributed systems concepts, algorithms, and data structures
- Strong familiarity with the Apache Hadoop ecosystem: Spark, Kafka, Hive/Iceberg/Delta Lake, Presto/Trino, Pinot, etc.
- Experience working with at least 3 of the technologies/tools mentioned here: Big Data/Hadoop, Kafka, Spark, Trino, Flink, Airflow, Druid, Hive, Iceberg, Delta Lake, Pinot, Storm, etc.
- Extensive hands-on experience with a public cloud (AWS or GCP)
- BS/MS degree in CS or equivalent
- AI literacy / an AI growth mindset

Benefits: Roku is committed to offering a diverse range of benefits as part of our compensation package to support our employees and their families. Our comprehensive benefits include global access to mental health and financial wellness support and resources. Local benefits include statutory and voluntary benefits, which may include healthcare (medical, dental, and vision), life, accident, disability, commuter, and retirement options (401(k)/pension). Our employees can take time off work for vacation and other personal reasons to balance their evolving work and life needs. It's important to note that not every benefit is available in all locations or for every role. For details specific to your location, please consult with your recruiter.

The Roku Culture: Roku is a great place for people who want to work in a fast-paced environment where everyone is focused on the company's success rather than their own. We try to surround ourselves with people who are great at their jobs, who are easy to work with, and who keep their egos in check. We appreciate a sense of humor. We believe a fewer number of very talented folks can do more for less cost than a larger number of less talented teams. We're independent thinkers with big ideas who act boldly, move fast, and accomplish extraordinary things through collaboration and trust. In short, at Roku you'll be part of a company that's changing how the world watches TV. We have a unique culture that we are proud of. We think of ourselves primarily as problem-solvers, which itself is a two-part idea. We come up with the solution, but the solution isn't real until it is built and delivered to the customer. That penchant for action gives us a pragmatic approach to innovation, one that has served us well since 2002. To learn more about Roku, our global footprint, and how we've grown, visit https://www.weareroku.com/factsheet. By providing your information, you acknowledge that you have read our Applicant Privacy Notice and authorize Roku to process your data subject to those terms.
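For context on the kind of tuning work described above, here is a minimal, illustrative PySpark sketch of the sort of adjustments involved in streamlining a batch pipeline. The configuration values, storage paths, and column names are assumptions for illustration only and are not taken from Roku's actual platform.

from pyspark.sql import SparkSession, functions as F

# Illustrative tuning knobs; real values depend on cluster size and data volume.
spark = (
    SparkSession.builder
    .appName("pipeline-tuning-sketch")
    .config("spark.sql.adaptive.enabled", "true")                      # adaptive query execution
    .config("spark.sql.adaptive.coalescePartitions.enabled", "true")   # merge small shuffle partitions
    .config("spark.sql.files.maxPartitionBytes", "256m")               # larger input splits, fewer tasks
    .getOrCreate()
)

# Hypothetical daily aggregation: read a partitioned table, aggregate, write back partitioned.
events = spark.read.parquet("s3://example-bucket/events/")             # placeholder path
daily = (
    events
    .filter(F.col("event_date") == "2024-01-01")                      # prune to a single partition
    .groupBy("event_date", "channel_id")
    .agg(F.countDistinct("device_id").alias("unique_devices"))
)
daily.write.mode("overwrite").partitionBy("event_date").parquet("s3://example-bucket/daily_summary/")

Partition pruning and adaptive execution are generic levers for cutting scan and shuffle cost; the specific levers used in a role like this would depend on the workload.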

Posted 3 days ago

Apply

5.0 - 9.0 years

0 Lacs

karnataka

On-site

As a Data Engineer with 5+ years of experience, you will be responsible for designing and developing scalable, reusable, and efficient data pipelines using modern Data Engineering platforms such as Microsoft Fabric, PySpark, and Data Lakehouse architectures. Your role will involve integrating data from diverse sources, transforming it into actionable insights, and ensuring high standards of data governance and quality. You will play a key role in establishing and enforcing data governance policies, monitoring pipeline performance, and optimizing for efficiency.

Key Responsibilities
- Design and build robust data pipelines using Microsoft Fabric components including Pipelines, Notebooks (PySpark), Dataflows, and Lakehouse architecture.
- Ingest and transform data from cloud platforms (Azure, AWS), on-prem databases, SaaS platforms (e.g., Salesforce, Workday), and REST/OpenAPI-based APIs.
- Develop and maintain semantic models and define standardized KPIs for reporting and analytics in Power BI or equivalent BI tools.
- Implement and manage Delta Tables across bronze/silver/gold layers using the Lakehouse medallion architecture within OneLake or equivalent environments (a short sketch follows this listing).
- Apply metadata-driven design principles to ensure pipeline parameterization, reusability, and scalability.
- Monitor, debug, and optimize pipeline performance; implement logging, alerting, and observability mechanisms.
- Establish and enforce data governance policies including schema versioning, data lineage tracking, role-based access control (RBAC), and audit trail mechanisms.
- Perform data quality checks including null detection, duplicate handling, schema drift management, outlier identification, and Slowly Changing Dimension (SCD) type management.

Required Skills & Qualifications
- 5+ years of hands-on experience in Data Engineering or related fields.
- Solid understanding of data lake/lakehouse architectures, preferably with Microsoft Fabric or equivalent tools (e.g., Databricks, Snowflake, Azure Synapse).
- Strong experience with PySpark, SQL, and working with dataflows and notebooks.
- Exposure to BI tools like Power BI, Tableau, or equivalent for data consumption layers.
- Experience with Delta Lake or similar transactional storage layers.
- Familiarity with data ingestion from SaaS applications, APIs, and enterprise databases.
- Understanding of data governance, lineage, and RBAC principles.
- Strong analytical, problem-solving, and communication skills.

Nice to Have
- Prior experience with the Microsoft Fabric and OneLake platform.
- Knowledge of CI/CD practices in data engineering.
- Experience implementing monitoring/alerting tools for data pipelines.

Join us for the opportunity to work on cutting-edge data engineering solutions in a fast-paced, collaborative environment focused on innovation and learning. Gain exposure to end-to-end data product development and deployment cycles.
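To make the medallion-layer responsibility above concrete, here is a minimal PySpark sketch of a bronze-to-gold flow on Delta tables. It assumes a Fabric or Databricks-style runtime where Delta support is already configured; the storage paths, column names, and deduplication keys are placeholders rather than details from the role.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

# Bronze: land raw data as-is for auditability (path is a placeholder).
raw = spark.read.json("Files/landing/orders/")
raw.write.format("delta").mode("append").save("Tables/bronze_orders")

# Silver: clean, type, and deduplicate.
bronze = spark.read.format("delta").load("Tables/bronze_orders")
silver = (
    bronze
    .dropDuplicates(["order_id"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("order_date", F.to_date("order_ts"))
    .filter(F.col("order_id").isNotNull())
)
silver.write.format("delta").mode("overwrite").save("Tables/silver_orders")

# Gold: aggregated, reporting-ready KPI table for the semantic model.
gold = silver.groupBy("order_date").agg(
    F.count("*").alias("order_count"),
    F.sum("order_amount").alias("total_revenue"),
)
gold.write.format("delta").mode("overwrite").save("Tables/gold_daily_orders")

In practice each layer would be parameterized (metadata-driven paths, watermarking, SCD handling) rather than hard-coded as shown here.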

Posted 5 days ago

Apply

5.0 - 9.0 years

0 Lacs

ahmedabad, gujarat

On-site

As a Data Engineer, you will be responsible for designing and building efficient data pipelines using Azure Databricks (PySpark). You will implement business logic for data transformation and enrichment at scale, as well as manage and optimize Delta Lake storage solutions. Additionally, you will develop REST APIs using FastAPI to expose processed data and deploy them on Azure Functions for scalable and serverless data access (a sketch follows this listing).

Your role will also involve developing and managing Airflow DAGs to orchestrate ETL processes, ingesting and processing data from various internal and external sources on a scheduled basis. You will handle data storage and access using PostgreSQL and MongoDB, writing optimized SQL queries to support downstream applications and analytics.

Collaboration is key in this role, as you will work cross-functionally with teams to deliver reliable, high-performance data solutions. It is essential to follow best practices in code quality, version control, and documentation to ensure the success of projects.

To excel in this position, you should have at least 5 years of hands-on experience as a Data Engineer and strong expertise in Azure Cloud services. Proficiency in Azure Databricks, PySpark, Delta Lake, Python, and FastAPI for API development is required. Experience with Azure Functions for serverless API deployments, managing ETL pipelines using Apache Airflow, and hands-on experience with PostgreSQL and MongoDB are also essential. Strong SQL skills and experience in handling large datasets will be beneficial for this role.
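As an illustration of the FastAPI responsibility mentioned above, here is a minimal read-only endpoint over a processed PostgreSQL table. The table, columns, and environment variables are assumptions made for the sketch; a real deployment on Azure Functions would typically wrap the app with an ASGI adapter such as azure.functions.AsgiMiddleware.

import os
import psycopg2
from fastapi import FastAPI, HTTPException

app = FastAPI(title="processed-data-api")

def get_conn():
    # Connection settings come from environment variables, never from source control.
    return psycopg2.connect(
        host=os.environ["PGHOST"],
        dbname=os.environ["PGDATABASE"],
        user=os.environ["PGUSER"],
        password=os.environ["PGPASSWORD"],
    )

@app.get("/metrics/{region}")
def read_metrics(region: str, limit: int = 100):
    # Parameterized query over a hypothetical curated table.
    query = """
        SELECT metric_date, metric_name, metric_value
        FROM processed.daily_metrics
        WHERE region = %s
        ORDER BY metric_date DESC
        LIMIT %s
    """
    with get_conn() as conn, conn.cursor() as cur:
        cur.execute(query, (region, limit))
        rows = cur.fetchall()
    if not rows:
        raise HTTPException(status_code=404, detail="No data for region")
    return [
        {"metric_date": r[0].isoformat(), "metric_name": r[1], "metric_value": float(r[2])}
        for r in rows
    ]

Parameterized queries keep the endpoint safe from SQL injection, and the same app object can be served locally with uvicorn while you iterate before packaging it for Azure Functions.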

Posted 5 days ago

Apply

15.0 - 24.0 years

35 - 45 Lacs

Mumbai, Bengaluru, Mumbai (All Areas)

Work from Office

Greetings!!! This is in regard to a job opportunity for a Data Architect with Datamatics Global Services Ltd.

Position: Data Architect
Website: https://www.datamatics.com/
Job Location: Mumbai (Andheri - Seepz) / Bangalore (Kalyani Neptune, Bannerghatta Road)

Job Description:

Job Overview: We are seeking a Data Architect to lead end-to-end solutioning for enterprise data platforms while driving strategy, architecture, and innovation within our Data Center of Excellence (COE). This role requires deep expertise in Azure, Databricks, SQL, and Python, alongside strong pre-sales and advisory capabilities. The architect will serve as a trusted advisor, mentoring and guiding delivery teams, and defining scalable data strategies that align with business objectives.

Key Responsibilities:

Core Engineering - Data Architecture & Solutioning
- Design and implement enterprise-wide data architectures, ensuring scalability, security, and performance.
- Lead end-to-end data solutioning, covering ingestion, transformation, governance, analytics, and visualization.
- Architect high-performance data pipelines leveraging Azure Data Factory, Databricks, SQL, and Python.
- Establish data governance frameworks, integrating Delta Lake, Azure Purview, and metadata management best practices.
- Optimize data models, indexing strategies, and high-volume query processing (a small Delta maintenance sketch follows this listing).
- Oversee data security, access controls, and compliance policies within cloud environments.
- Mentor engineering teams, guiding best practices in data architecture, pipeline development, and optimization.

Data COE & Thought Leadership
- Define data architecture strategies, frameworks, and reusable assets for the Data COE.
- Drive best practices, standards, and innovation across data engineering and analytics teams.
- Act as a subject matter expert, shaping data strategy, scalability models, and governance frameworks.
- Lead data modernization efforts, advising on cloud migration, system optimization, and future-proofing architectures.
- Deliver technical mentorship, ensuring teams adopt cutting-edge data engineering techniques.
- Represent the Data COE in industry discussions, internal training, and thought leadership sessions.

Pre-Sales & Solution Advisory
- Engage in pre-sales consulting, defining enterprise data strategies for prospects and existing customers.
- Craft solution designs and architecture blueprints, and contribute to proof-of-concept (PoC) implementations.
- Partner with sales and consulting teams to translate client needs into scalable data solutions.
- Provide strategic guidance on Azure, Databricks, and cloud adoption roadmaps.
- Present technical proposals and recommendations to executive stakeholders and customers.
- Stay ahead of emerging cloud data trends to enhance solution offerings.

Required Skills & Qualifications:
- 15+ years of experience in data architecture, engineering, and cloud data solutions.
- Proven expertise in Azure, Databricks, SQL, and Python as primary technologies.
- Proficiency in other relevant cloud and data engineering tools based on business needs.
- Deep knowledge of data governance, metadata management, and security policies.
- Strong pre-sales, consulting, and solution advisory experience in enterprise data platforms.
- Advanced skills in SQL optimization, data pipeline architecture, and high-scale analytics.
- Leadership experience in mentoring teams, defining best practices, and driving thought leadership.
- Expertise in Delta Lake, Azure Purview, and scalable data architectures.
- Strong stakeholder management skills across technical and business domains.

Preferred but Not Mandatory:
- Familiarity with Microsoft Fabric and Power BI data accessibility techniques.
- Hands-on experience with CI/CD for data pipelines, DevOps, and version control practices.

Additional Notes:
- The technologies listed above are primary but indicative.
- The candidate should have the flexibility to work with additional tools and platforms based on business needs.
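As a small illustration of the Delta Lake optimization work referenced in the responsibilities above, the sketch below shows routine table-maintenance commands as they might be run from a Databricks notebook. The table name, clustering columns, and retention window are placeholders; the actual optimization strategy would be defined per workload.

from pyspark.sql import SparkSession

# In a Databricks notebook the `spark` session is already provided; this keeps the sketch self-contained.
spark = SparkSession.builder.getOrCreate()

# Compact small files and co-locate rows on common filter columns to speed up high-volume queries.
spark.sql("OPTIMIZE sales.transactions ZORDER BY (customer_id, txn_date)")

# Remove data files no longer referenced by the table, keeping 7 days (168 hours) of history.
spark.sql("VACUUM sales.transactions RETAIN 168 HOURS")

# Encourage well-sized files on future writes (Databricks-specific table property).
spark.sql("""
    ALTER TABLE sales.transactions
    SET TBLPROPERTIES ('delta.autoOptimize.optimizeWrite' = 'true')
""")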

Posted 5 days ago

Apply

5.0 - 10.0 years

20 - 25 Lacs

Bengaluru

Hybrid

Job title: Senior Software Engineer
Experience: 5-8 years
Primary skills: Python, Spark or PySpark, DWH ETL
Database: SparkSQL or PostgreSQL
Secondary skills: Databricks (Delta Lake, Delta tables, Unity Catalog)
Work Model: Hybrid (twice weekly)
Cab Facility: Yes
Work Timings: 10am to 7pm
Interview Process: 3 rounds (3rd round face-to-face, mandatory)
Work Location: Karle Town Tech Park, Nagawara, Hebbal, Bengaluru 560045

About Business Unit: The Architecture Team plays a pivotal role in the end-to-end design, governance, and strategic direction of product development within Epsilon People Cloud (EPC). As a centre of technical excellence, the team ensures that every product feature is engineered to meet the highest standards of scalability, security, performance, and maintainability. Their responsibilities span architectural ownership of critical product features, driving techno-product leadership, enforcing architectural governance, and ensuring systems are built with scalability, security, and compliance in mind. They design multi-cloud and hybrid cloud solutions that support seamless integration across diverse environments and contribute significantly to interoperability between EPC products and the broader enterprise ecosystem. The team fosters innovation and technical leadership while actively collaborating with key partners to align technology decisions with business goals. Through this, the Architecture Team ensures the delivery of future-ready, enterprise-grade, efficient and performant, secure and resilient platforms that form the backbone of Epsilon People Cloud.

Why we are looking for you:
- You have experience working as a Data Engineer with strong database fundamentals and an ETL background.
- You have experience working in a data warehouse environment and dealing with data volumes in the terabytes and above.
- You have experience working in relational data systems, preferably PostgreSQL and SparkSQL.
- You have excellent design and coding skills and can mentor a junior engineer in the team.
- You have excellent written and verbal communication skills.
- You are experienced and comfortable working with global clients.
- You work well with teams and are able to work with multiple collaborators, including clients, vendors, and delivery teams.
- You are proficient with bug tracking and test management toolsets that support development processes such as CI/CD.

What you will enjoy in this role:
- As part of the Epsilon Technology practice, the pace of the work matches the fast-evolving demands of the industry. You will get to work on the latest tools and technology and deal with data at petabyte scale.
- Work on homegrown frameworks on Spark, Airflow, etc.
- Exposure to the Digital Marketing domain, where Epsilon is a market leader. Understand and work closely with consumer data across different segments, which will eventually provide insights into consumer behaviours and patterns to design digital ad strategies.
- As part of a dynamic team, you will have opportunities to innovate and put your recommendations forward, using existing standard methodologies and defining new ones as industry standards evolve.
- Opportunity to work with Business, System, and Delivery to build a solid foundation in the Digital Marketing domain.
- An open and transparent environment that values innovation and efficiency.

What will you do?
- Develop a deep understanding of the business context under which your team operates and present feature recommendations in an agile working environment.
- Lead, design, and code solutions on and off database to ensure application access and enable data-driven decision making for the company's multi-faceted ad serving operations.
- Work closely with Engineering resources across the globe to ensure enterprise data warehouse solutions and assets are actionable, accessible, and evolving in lockstep with the needs of the ever-changing business model. This role requires deep expertise in Spark and strong proficiency in ETL, SQL, and modern data engineering practices.
- Design, develop, and manage ETL/ELT pipelines in Databricks using PySpark/SparkSQL, integrating various data sources to support business operations.
- Lead in the areas of solution design, code development, quality assurance, data modelling, and business intelligence.
- Mentor junior engineers in the team.
- Stay abreast of developments in the data world in terms of governance, quality, and performance optimization.
- Run effective client meetings, understand deliverables, and drive successful outcomes.

Qualifications:
- Bachelor's Degree in Computer Science or an equivalent degree is required.
- 5-8 years of data engineering experience with expertise in Apache Spark and databases (preferably Databricks) in marketing technologies and data management, and technical understanding in these areas.
- Monitor and tune Databricks workloads to ensure high performance and scalability, adapting to business needs as required.
- Solid experience in basic and advanced SQL writing and tuning.
- Experience with Python.
- Solid understanding of CI/CD practices, with experience in Git for version control and integration for Spark data projects.
- Good understanding of Disaster Recovery and Business Continuity solutions.
- Experience with scheduling applications with complex interdependencies, preferably Airflow (a minimal DAG sketch follows this listing).
- Good experience working with geographically and culturally diverse teams.
- Understanding of data management concepts in both traditional relational databases and big data lakehouse solutions such as Apache Hive, AWS Glue, or Databricks.
- Excellent written and verbal communication skills.
- Ability to handle complex products.
- Good communication and problem-solving skills, with the ability to manage multiple priorities.
- Ability to diagnose and solve problems quickly.
- Diligent; able to multi-task, prioritize, and quickly change priorities.
- Good time management.
- Good to have: knowledge of cloud platforms (cloud security) and familiarity with Terraform or other infrastructure-as-code tools.

About Epsilon: Epsilon is a global data, technology and services company that powers the marketing and advertising ecosystem. For decades, we have provided marketers from the world's leading brands the data, technology and services they need to engage consumers with 1 View, 1 Vision and 1 Voice. 1 View of their universe of potential buyers. 1 Vision for engaging each individual. And 1 Voice to harmonize engagement across paid, owned and earned channels. Epsilon's comprehensive portfolio of capabilities across our suite of digital media, messaging and loyalty solutions bridges the divide between marketing and advertising technology. We process 400+ billion consumer actions each day using advanced AI and hold many patents of proprietary technology, including real-time modeling languages and consumer privacy advancements. Thanks to the work of every employee, Epsilon has been consistently recognized as industry-leading by Forrester, Adweek and the MRC. Epsilon is a global company with more than 9,000 employees around the world.
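For the Airflow scheduling experience called out in the qualifications, here is a minimal, generic DAG sketch showing interdependent tasks. The DAG id, schedule, and task callables are hypothetical stand-ins; a real pipeline at this level would typically hand the transform step to Databricks or a homegrown Spark framework rather than a local Python function.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(**_):
    # Placeholder: pull data from source systems.
    print("pull source data")

def transform(**_):
    # Placeholder: run the Spark/Databricks transformation.
    print("run Spark transformation")

def load(**_):
    # Placeholder: publish curated output to the warehouse.
    print("publish to the warehouse")

with DAG(
    dag_id="daily_etl_sketch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",      # Airflow 2.4+ style; older versions use schedule_interval
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Declare the interdependencies: extract must finish before transform, then load.
    t_extract >> t_transform >> t_load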

Posted 6 days ago

Apply