8.0 - 13.0 years
15 - 25 Lacs
Bengaluru
Hybrid
We are looking for a seasoned Data Architect (Senior Engineer) to lead the design and implementation of scalable, secure, and privacy-compliant data architectures that support feedback-driven AI systems.
Posted 2 months ago
8.0 - 13.0 years
15 - 25 Lacs
Chennai
Work from Office
Required Skills and Qualifications: Bachelor's/Master's degree in Computer Science, Information Technology, or a related field. Proven experience as a Solution Architect or in a similar role. Expertise in programming languages and frameworks: Java, Angular, Python, C++. Proficiency in AI/ML frameworks and libraries such as TensorFlow, PyTorch, Scikit-learn, or Keras. Experience in deploying AI models in production, including optimizing for performance and scalability. Understanding of deep learning, NLP, computer vision, or generative AI techniques. Hands-on experience with model fine-tuning, transfer learning, and hyperparameter optimization. Strong knowledge of enterprise architecture frameworks (TOGAF, Zachman, etc.). Expertise in distributed systems, microservices, and cloud-native architectures. Experience in API design, data pipelines, and integration of AI services within existing systems. Strong knowledge of databases: MongoDB, SQL, NoSQL. Proficiency in working with large-scale datasets, data wrangling, and ETL pipelines. Hands-on experience with CI/CD pipelines for AI development. Version control systems like Git and experience with ML lifecycle tools such as MLflow or DVC. Proven track record of leading AI-driven projects from ideation to deployment. Hands-on experience with cloud platforms (AWS, Azure, GCP) for deploying AI solutions. Familiarity with Agile methodologies, especially POD-based execution models. Strong problem-solving skills and ability to design scalable solutions. Excellent communication skills to articulate technical solutions to stakeholders. Preferred Qualifications: Experience in e-commerce, AdTech, or OOH (Out-of-Home) advertising technology. Knowledge of tools like Jira and Confluence, and Agile frameworks like Scrum or Kanban. Certification in cloud technologies (e.g., AWS Solutions Architect).
Posted 2 months ago
6.0 - 10.0 years
15 - 30 Lacs
Indore, Jaipur, Bengaluru
Work from Office
Experience in dashboard story development, dashboard creation, and data engineering pipelines. Manage and organize large volumes of application log data using Google BigQuery. Experience with log analytics, user engagement metrics, and product performance metrics. Required Candidate Profile: Experience with tools like Tableau, Power BI, or ThoughtSpot AI. Understand log data generated by Python-based applications. Ensure data integrity, consistency, and accessibility for analytical purposes.
Posted 2 months ago
2.0 - 4.0 years
3 - 7 Lacs
Bengaluru
Work from Office
We need a proficient Data Engineer with experience monitoring and fixing jobs for data pipelines written in Azure Data Factory and Python. Design and implement data models for Snowflake to support analytical solutions. Develop ETL processes to integrate data from various sources into Snowflake. Optimize data storage and query performance in Snowflake. Collaborate with cross-functional teams to gather requirements and deliver scalable data solutions. Monitor and maintain Snowflake environments, ensuring optimal performance and data security. Create documentation for data architecture, processes, and best practices. Provide support and training for teams utilizing Snowflake services. Roles and Responsibilities: Strong experience with Snowflake architecture and data warehousing concepts. Proficiency in SQL for data querying and manipulation. Familiarity with ETL tools such as Talend, Informatica, or Apache NiFi. Experience with data modeling techniques and tools. Knowledge of cloud platforms, specifically AWS, Azure, or Google Cloud. Understanding of data governance and compliance requirements. Excellent analytical and problem-solving skills. Strong communication and collaboration skills to work effectively within a team. Experience with Python or Java for data pipeline development is a plus.
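As a rough illustration of the kind of Snowflake integration work described above, here is a minimal sketch using the snowflake-connector-python package; the account identifier, credentials, file path, and RAW_ORDERS table are hypothetical placeholders, and a production pipeline would typically be driven from Azure Data Factory or a scheduler rather than a standalone script.

```python
# Minimal sketch only: stage a local extract and bulk-load it into a hypothetical
# RAW_ORDERS table. Credentials would come from a secrets store, not hard-coded values.
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345.east-us-2.azure",   # placeholder account identifier
    user="ETL_USER",
    password="***",
    warehouse="ETL_WH",
    database="ANALYTICS",
    schema="STAGING",
)

try:
    cur = conn.cursor()
    # Upload the extract to the table's internal stage, then bulk-load it.
    cur.execute("PUT file:///tmp/orders.csv @%RAW_ORDERS OVERWRITE = TRUE")
    cur.execute(
        "COPY INTO RAW_ORDERS FROM @%RAW_ORDERS "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
    )
    print(cur.fetchall())  # COPY INTO returns one status row per loaded file
finally:
    conn.close()
```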
Posted 2 months ago
10.0 - 15.0 years
12 - 16 Lacs
Pune, Bengaluru
Work from Office
We are seeking a talented and experienced Kafka Architect with migration experience to Google Cloud Platform (GCP) to join our team. As a Kafka Architect, you will be responsible for designing, implementing, and managing our Kafka infrastructure to support our data processing and messaging needs, while also leading the migration of our Kafka ecosystem to GCP. You will work closely with our engineering and data teams to ensure seamless integration and optimal performance of Kafka on GCP. Responsibilities: Discovery, analysis, planning, design, and implementation of Kafka deployments on GKE, with a specific focus on migrating Kafka from AWS to GCP. Design, architect, and implement scalable, high-performance Kafka architectures and clusters to meet our data processing and messaging requirements. Lead the migration of our Kafka infrastructure from on-premises or other cloud platforms to Google Cloud Platform (GCP). Conduct thorough discovery and analysis of existing Kafka deployments on AWS. Develop and implement best practices for Kafka deployment, configuration, and monitoring on GCP. Develop a comprehensive migration strategy for moving Kafka from AWS to GCP. Collaborate with engineering and data teams to integrate Kafka into our existing systems and applications on GCP. Optimize Kafka performance and scalability on GCP to handle large volumes of data and high throughput. Plan and execute the migration, ensuring minimal downtime and data integrity. Test and validate the migrated Kafka environment to ensure it meets performance and reliability standards. Ensure Kafka security on GCP by implementing authentication, authorization, and encryption mechanisms. Troubleshoot and resolve issues related to Kafka infrastructure and applications on GCP. Ensure seamless data flow between Kafka and other data sources/sinks. Implement monitoring and alerting mechanisms to ensure the health and performance of Kafka clusters. Stay up to date with Kafka developments and GCP services to recommend and implement new features and improvements. Requirements: Bachelor's degree in Computer Science, Engineering, or a related field (Master's degree preferred). Proven experience as a Kafka Architect or similar role, with a minimum of [5] years of experience. Deep knowledge of Kafka internals and ecosystem, including Kafka Connect, Kafka Streams, and KSQL. In-depth knowledge of Apache Kafka architecture, internals, and ecosystem components. Proficiency in scripting and automation for Kafka management and migration. Hands-on experience with Kafka administration, including cluster setup, configuration, and tuning. Proficiency in Kafka APIs, including Producer, Consumer, Streams, and Connect. Strong programming skills in Java, Scala, or Python. Experience with Kafka monitoring and management tools such as Confluent Control Center, Kafka Manager, or similar. Solid understanding of distributed systems, data pipelines, and stream processing. Experience leading migration projects to Google Cloud Platform (GCP), including migrating Kafka workloads. Familiarity with GCP services such as Google Kubernetes Engine (GKE), Google Cloud Storage, Google Cloud Pub/Sub, and BigQuery. Excellent communication and collaboration skills. Ability to work independently and manage multiple tasks in a fast-paced environment.
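By way of illustration only, the snippet below shows a minimal confluent-kafka producer configured for SASL_SSL, the kind of authenticated, encrypted connectivity the posting calls for; the broker address, topic name, and credentials are placeholders, not a prescribed setup.

```python
# Minimal sketch: produce to a secured Kafka cluster. Endpoints and secrets are
# placeholders and would normally come from configuration or a secret manager.
from confluent_kafka import Producer

conf = {
    "bootstrap.servers": "broker-1.example.internal:9093",
    "security.protocol": "SASL_SSL",   # TLS encryption in transit
    "sasl.mechanisms": "PLAIN",        # or SCRAM/OAUTHBEARER, depending on the cluster
    "sasl.username": "pipeline-svc",
    "sasl.password": "***",
    "acks": "all",                     # wait for full in-sync-replica acknowledgment
}

producer = Producer(conf)

def delivery_report(err, msg):
    # Called once per message to confirm delivery or surface an error.
    if err is not None:
        print(f"delivery failed: {err}")
    else:
        print(f"delivered to {msg.topic()}[{msg.partition()}] @ offset {msg.offset()}")

producer.produce(
    "orders.events",                   # placeholder topic
    key="order-42",
    value=b'{"status":"created"}',
    on_delivery=delivery_report,
)
producer.poll(0)     # serve delivery callbacks
producer.flush(10)   # block until outstanding messages are delivered (or timeout)
```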
Posted 2 months ago
4.0 - 9.0 years
4 - 8 Lacs
Hyderabad
Work from Office
Data Transformation: Utilize Data Build Tool (dbt) to transform raw data into curated data models according to business requirements. Implement data transformations and aggregations to support analytical and reporting needs. Orchestration and Automation: Design and implement automated workflows using Google Cloud Composer to orchestrate data pipelines and ensure timely data delivery. Monitor and troubleshoot data pipelines, identifying and resolving issues proactively. Develop and maintain documentation for data pipelines and workflows. GCP Expertise: Leverage GCP services, including BigQuery, Cloud Storage, and Pub/Sub, to build a robust and scalable data platform. Optimize BigQuery performance and cost through efficient query design and data partitioning. Implement data security and access controls in accordance with banking industry standards. Collaboration and Communication: Collaborate with Solution Architect and Data Modeler to understand data requirements and translate them into technical solutions. Communicate effectively with team members and stakeholders, providing regular updates on project progress. Participate in code reviews and contribute to the development of best practices. Data Pipeline Development: Design, develop, and maintain scalable and efficient data pipelines using Google Cloud Dataflow to ingest data from various sources, including relational databases (RDBMS), data streams, and files. Implement data quality checks and validation processes to ensure data accuracy and consistency. Optimize data pipelines for performance and cost-effectiveness. Banking Domain Knowledge (Preferred): Understanding of banking data domains, such as customer data, transactions, and financial products. Familiarity with regulatory requirements and data governance standards in the banking industry. Required Experience: Bachelor's degree in Computer Science, Engineering, or a related field. ETL knowledge. 4-9 years of experience in data engineering, with a focus on building data pipelines and data transformations. Strong proficiency in SQL and experience working with relational databases. Hands-on experience with Google Cloud Platform (GCP) services, including Dataflow, BigQuery, Cloud Composer, and Cloud Storage. Experience with data transformation tools, preferably Data Build Tool (dbt). Proficiency in Python or other scripting languages is a plus. Experience with data orchestration and automation. Strong problem-solving and analytical skills. Excellent communication and collaboration skills. Experience with data streams like Pub/Sub or similar. Experience in working with files such as CSV, JSON, and Parquet. Primary Skills: GCP, Dataflow, BigQuery, Cloud Composer, Cloud Storage, data pipelines, SQL, dbt, DWH concepts. Secondary Skills: Python, banking domain knowledge, Pub/Sub, cloud certifications (e.g., Data Engineer), Git or any other version control system.
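For orientation, here is a minimal Cloud Composer (Airflow) DAG that orchestrates a daily dbt build after an ingestion step; the DAG id, project path, profiles directory, and schedule are assumptions, and a production setup might use a dedicated dbt operator or a Kubernetes pod instead of BashOperator.

```python
# Minimal sketch of a Composer/Airflow DAG that runs dbt after a hypothetical
# ingestion task; all paths and names are illustrative only.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_dbt_build",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    ingest = BashOperator(
        task_id="ingest_sources",
        bash_command="echo 'trigger Dataflow / source ingestion here'",
    )

    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="cd /home/airflow/gcs/dags/dbt_project && dbt run --profiles-dir .",
    )

    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="cd /home/airflow/gcs/dags/dbt_project && dbt test --profiles-dir .",
    )

    ingest >> dbt_run >> dbt_test
```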
Posted 2 months ago
3.0 - 7.0 years
10 - 20 Lacs
Noida, Gurugram, Delhi / NCR
Hybrid
Salary: 8 to 24 LPA. Exp: 3 to 7 years. Location: Gurgaon (Hybrid). Notice: Immediate to 30 days. Job Title: Senior Data Engineer. Job Summary: We are looking for an experienced Senior Data Engineer with 5+ years of hands-on experience in cloud data engineering platforms, specifically AWS, Databricks, and Azure. The ideal candidate will play a critical role in designing, building, and maintaining scalable data pipelines and infrastructure to support our analytics and business intelligence initiatives. Key Responsibilities: Design, develop, and optimize scalable data pipelines using AWS services (e.g., S3, Glue, Redshift, Lambda). Build and maintain ETL/ELT workflows leveraging Databricks and Apache Spark for processing large datasets. Work extensively with Azure data services such as Azure Data Lake, Azure Synapse, Azure Data Factory, and Azure Databricks. Collaborate with data scientists, analysts, and stakeholders to understand data requirements and deliver high-quality data solutions. Ensure data quality, reliability, and security across multiple cloud platforms. Monitor and troubleshoot data pipelines, implement performance tuning, and optimize resource usage. Implement best practices for data governance, metadata management, and documentation. Stay current with emerging cloud data technologies and industry trends to recommend improvements. Required Qualifications: 5+ years of experience in data engineering with strong expertise in AWS, Databricks, and Azure cloud platforms. Hands-on experience with big data processing frameworks, particularly Apache Spark. Proficient in building complex ETL/ELT pipelines and managing data workflows. Strong programming skills in Python, Scala, or Java. Experience working with structured and unstructured data in cloud storage solutions. Knowledge of SQL and experience with relational and NoSQL databases. Familiarity with CI/CD pipelines and DevOps practices in cloud environments. Strong analytical and problem-solving skills with an ability to work independently and in teams. Preferred Skills: Experience with containerization and orchestration tools (Docker, Kubernetes). Familiarity with machine learning pipelines and tools. Knowledge of data modeling, data warehousing, and analytics architecture.
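As a simple, hedged sketch of the pipeline work described above, the PySpark job below reads raw JSON from S3, applies a light transformation, and writes partitioned Parquet; the bucket paths, column names, and partition key are hypothetical.

```python
# Minimal sketch only; bucket paths, columns, and the partition key are placeholders.
# On EMR/Glue the "s3://" scheme is typical; plain Spark clusters often use "s3a://".
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

raw = spark.read.json("s3://example-raw-bucket/orders/2024/*/")

clean = (
    raw.dropDuplicates(["order_id"])                 # basic dedup on the business key
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)                   # drop obviously invalid rows
)

(
    clean.write
         .mode("overwrite")
         .partitionBy("order_date")
         .parquet("s3://example-curated-bucket/orders/")
)
```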
Posted 2 months ago
3.0 - 7.0 years
10 - 20 Lacs
Noida, Gurugram, Delhi / NCR
Hybrid
Salary: 8 to 24 LPA. Exp: 3 to 7 years. Location: Gurgaon (Hybrid). Notice: Immediate to 30 days. Job Profile: Experienced Data Engineer with a strong foundation in designing, building, and maintaining scalable data pipelines and architectures. Skilled in transforming raw data into clean, structured formats for analytics and business intelligence. Proficient in modern data tools and technologies such as SQL, T-SQL, Python, Databricks, and cloud platforms (Azure). Adept at data wrangling, modeling, ETL/ELT development, and ensuring data quality, integrity, and security. Collaborative team player with a track record of enabling data-driven decision-making across business units. As a Data Engineer, the candidate will work on assignments for one of our Utilities clients. Collaborating with cross-functional teams and stakeholders involves gathering data requirements, aligning business goals, and translating them into scalable data solutions. The role includes working closely with data analysts, scientists, and business users to understand needs, designing robust data pipelines, and ensuring data is accessible, reliable, and well-documented. Regular communication, iterative feedback, and joint problem-solving are key to delivering high-impact, data-driven outcomes that support organizational objectives. This position requires a proven track record of transforming processes and driving customer value and cost savings, with experience in running end-to-end analytics for large-scale organizations. Design, build, and maintain scalable data pipelines to support analytics, reporting, and advanced modeling needs. Collaborate with consultants, analysts, and clients to understand data requirements and translate them into effective data solutions. Ensure data accuracy, quality, and integrity through validation, cleansing, and transformation processes. Develop and optimize data models, ETL workflows, and database architectures across cloud and on-premises environments. Support data-driven decision-making by delivering reliable, well-structured datasets and enabling self-service analytics. Provide seamless integration with cloud platforms (Azure), making it easy to build and deploy end-to-end data pipelines in the cloud. Use scalable clusters for handling large datasets and complex computations in Databricks, optimizing performance and cost management. Must have: client engagement experience and collaboration with cross-functional teams. Data engineering background in Databricks. Capable of working effectively as an individual contributor or in collaborative team environments. Effective communication and thought leadership with a proven record. Candidate Profile: Bachelor's/Master's degree in economics, mathematics, computer science/engineering, operations research, or related analytics areas. 3+ years of experience, which must be in data engineering. Hands-on experience with SQL, Python, Databricks, and cloud platforms like Azure. Prior experience in managing and delivering end-to-end projects. Outstanding written and verbal communication skills. Able to work in a fast-paced, continuously evolving environment and ready to take on uphill challenges. Able to understand cross-cultural differences and work with clients across the globe.
Posted 2 months ago
8.0 - 13.0 years
10 - 15 Lacs
Bengaluru
Work from Office
In this role, you will play a key role in designing, building, and optimizing scalable data products within the Telecom Analytics domain. You will collaborate with cross-functional teams to implement AI-driven analytics, autonomous operations, and programmable data solutions. This position offers the opportunity to work with cutting-edge Big Data and Cloud technologies, enhance your data engineering expertise, and contribute to advancing Nokia's data-driven telecom strategies. If you are passionate about creating innovative data solutions, mastering cloud and big data platforms, and working in a fast-paced, collaborative environment, this role is for you! You have: Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field with 8+ years of experience in data engineering with a focus on Big Data, Cloud, and Telecom Analytics. Hands-on expertise in Ab Initio for data cataloguing, metadata management, and lineage. Skills in data warehousing, OLAP, and modelling using BigQuery, Clickhouse, and SQL. Experience with data persistence technologies like S3, HDFS, and Iceberg. Hands-on experience with Python and scripting languages. It would be nice if you also had: Experience with data exploration and visualization using Superset or BI tools. Knowledge of ETL processes and streaming tools such as Kafka. Background in building data products for the telecom domain and understanding of AI and machine learning pipeline integration. Data Governance: Manage source data within the Metadata Hub and Data Catalog. ETL Development: Develop and execute data processing graphs using Express It and the Co-Operating System. ETL Optimization: Debug and optimize data processing graphs using the Graphical Development Environment (GDE). API Integration: Leverage Ab Initio APIs for metadata and graph artifact management. CI/CD Implementation: Implement and maintain CI/CD pipelines for metadata and graph deployments. Team Leadership & Mentorship: Mentor team members and foster best practices in Ab Initio development and deployment.
Posted 2 months ago
3.0 - 8.0 years
5 - 9 Lacs
Bengaluru
Work from Office
Employment Type: Full Time, Permanent. Working mode: Regular. Job Description: Utilizes software engineering principles to deploy and maintain fully automated data transformation pipelines that combine a large variety of storage and computation technologies to handle a distribution of data types and volumes in support of data architecture design. Key Responsibilities: A Data Engineer designs data products and data pipelines that are resilient to change, modular, flexible, scalable, reusable, and cost effective. - Design, develop, and maintain data pipelines and ETL processes using Microsoft Azure services (e.g., Azure Data Factory, Azure Synapse, Azure Databricks, Azure Fabric). - Utilize Azure data storage accounts for organizing and maintaining data pipeline outputs (e.g., Azure Data Lake Storage Gen 2 & Azure Blob storage). - Collaborate with data scientists, data analysts, data architects and other stakeholders to understand data requirements and deliver high-quality data solutions. - Optimize data pipelines in the Azure environment for performance, scalability, and reliability. - Ensure data quality and integrity through data validation techniques and frameworks. - Develop and maintain documentation for data processes, configurations, and best practices. - Monitor and troubleshoot data pipeline issues to ensure timely resolution. - Stay current with industry trends and emerging technologies to ensure our data solutions remain cutting-edge. - Manage the CI/CD process for deploying and maintaining data solutions.
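To make the data-validation responsibility concrete, here is a small, illustrative PySpark check of the kind that could run inside Azure Databricks; the ADLS Gen2 storage account, container, path, and key column are assumptions rather than a prescribed framework.

```python
# Minimal sketch: basic completeness/uniqueness checks on a curated dataset in
# ADLS Gen2. Storage account, container, and column names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()

df = spark.read.parquet(
    "abfss://curated@examplestorageacct.dfs.core.windows.net/sales/daily/"
)

total = df.count()
null_keys = df.filter(F.col("sale_id").isNull()).count()
dupe_keys = total - df.dropDuplicates(["sale_id"]).count()

if null_keys > 0 or dupe_keys > 0:
    # Fail the pipeline run so the bad batch never reaches downstream consumers.
    raise ValueError(
        f"Data quality check failed: {null_keys} null keys, {dupe_keys} duplicate keys"
    )
print(f"Data quality checks passed for {total} rows")
```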
Posted 2 months ago
7.0 - 12.0 years
5 - 15 Lacs
Bengaluru
Remote
Role & Responsibilities: Design, develop, and maintain Collibra workflows tailored to our project's specific needs. Collaborate with cross-functional teams to ensure seamless integration of Collibra with other systems. Educate team members (or, where needed, yourself) on Collibra's features and best practices. Engage with customers to gather requirements and provide solutions that meet their needs. Stay updated with the latest developments in Collibra and data engineering technologies. Must-Haves: Excellent communication skills in English (reading, writing, and speaking). Background in Data Engineering or related disciplines. Eagerness to learn and become proficient in Collibra and its features. Ability to understand and apply Collibra's use cases within the project scope. Nice-to-Haves (Preferred Candidate Profile): Previous experience with Collibra or similar data cataloguing software. Familiarity with workflow design and optimization. Experience in requirement engineering, particularly in customer-facing roles. Knowledge of other cataloguing software and their integration with Collibra.
Posted 2 months ago
9.0 - 14.0 years
8 - 13 Lacs
Bengaluru
Work from Office
Utilizes software engineering principles to deploy and maintain fully automated data transformation pipelines that combine a large variety of storage and computation technologies to handle a distribution of data types and volumes in support of data architecture design. A Senior Data Engineer designs and oversees the entire data infrastructure, data products and data pipelines that are resilient to change, modular, flexible, scalable, reusable, and cost effective. Key Responsibilities: Oversee the entire data infrastructure to ensure scalability, operational efficiency, and resiliency. - Mentor junior data engineers within the organization. - Design, develop, and maintain data pipelines and ETL processes using Microsoft Azure services (e.g., Azure Data Factory, Azure Synapse, Azure Databricks, Azure Fabric). - Utilize Azure data storage accounts for organizing and maintaining data pipeline outputs (e.g., Azure Data Lake Storage Gen 2 & Azure Blob storage). - Collaborate with data scientists, data analysts, data architects and other stakeholders to understand data requirements and deliver high-quality data solutions. - Optimize data pipelines in the Azure environment for performance, scalability, and reliability. - Ensure data quality and integrity through data validation techniques and frameworks. - Develop and maintain documentation for data processes, configurations, and best practices. - Monitor and troubleshoot data pipeline issues to ensure timely resolution. - Stay current with industry trends and emerging technologies to ensure our data solutions remain cutting-edge. - Manage the CI/CD process for deploying and maintaining data solutions. Keywords: ETL, Data Pipeline, Data Quality, Data Analytics, Data Modeling, Azure Databricks, Synapse Analytics, Azure Data Factory, Data Validation, Data Engineering
Posted 2 months ago
7.0 - 12.0 years
15 - 22 Lacs
Gurugram
Work from Office
Data Scientist: 7+ years of experience in AI/ML and Big Data. Proficient in Python, SQL, TensorFlow, PyTorch, Scikit-learn, and Spark MLlib. Cloud proficiency (GCP, AWS/Azure). Strong analytical and communication skills. Location: Gurgaon. Salary: 22 LPA. Immediate joiners.
Posted 2 months ago
5.0 - 7.0 years
13 - 15 Lacs
Pune
Work from Office
About us: We are building a modern, scalable, fully automated on-premise data platform, designed to handle complex data workflows, including data ingestion, ETL processes, physics-based calculations, and machine learning predictions. Orchestrated using Dagster, our platform integrates with multiple data sources, edge devices, and storage systems. A core principle of our architecture is self-service: granting data scientists, analysts, and engineers granular control over the entire journey of their data assets, as well as empowering teams to modify and extend their data pipelines with minimal friction. We're looking for a hands-on Data Engineer to help develop, maintain, and optimize this platform. Role & responsibilities: - Design, develop, and maintain robust data pipelines using Dagster for orchestration - Build and manage ETL pipelines with Python and SQL - Optimize performance and reliability of the platform within on-premise infrastructure constraints - Develop solutions for processing and aggregating data on edge devices, including data filtering, compression, and secure transmission - Maintain metadata and data lineage; ensure data quality, consistency, and compliance with governance and security policies - Implement CI/CD workflows for the platform on a local Kubernetes cluster - Architect the platform with a self-service mindset, including clear abstractions, reusable components, and documentation - Develop in collaboration with data scientists, analysts, and frontend developers to understand evolving data needs - Define and maintain clear contracts/interfaces with source systems, ensuring resilience to upstream changes Preferred candidate profile: - 5-7 years of experience in database-driven projects or related fields - 1-2 years of experience with data platforms, orchestration, and big data management - Proven experience as a Data Engineer or similar role, with a focus on backend data processing and infrastructure - Hands-on experience with Dagster or similar data orchestration tools (e.g., Airflow, Prefect, Luigi, Databricks) - Proficiency with SQL and Python - Strong understanding of data modeling, ETL/ELT best practices, and batch/stream processing - Familiarity with on-premises deployments and challenges (e.g., network latency, storage constraints, resource management) - Experience with version control (Git) and CI/CD practices for data workflows - Understanding of data governance, access control, and data cataloging
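As a minimal illustration of the Dagster orchestration described above, the sketch below wires two software-defined assets together; the asset names and toy data are placeholders for the real ingestion and edge-processing logic, not the platform's actual code.

```python
# Minimal sketch of a Dagster ingestion/transform pair, assuming a hypothetical
# sensor-readings source; names, thresholds, and data are illustrative only.
import pandas as pd
from dagster import Definitions, asset, materialize

@asset
def raw_sensor_readings() -> pd.DataFrame:
    # In the real platform this would pull from an edge device or source database.
    return pd.DataFrame({"device_id": [1, 2], "temperature_c": [21.4, 87.9]})

@asset
def filtered_sensor_readings(raw_sensor_readings: pd.DataFrame) -> pd.DataFrame:
    # Simple quality filter; production logic would also handle compression/transmission.
    return raw_sensor_readings[raw_sensor_readings["temperature_c"] < 80.0]

defs = Definitions(assets=[raw_sensor_readings, filtered_sensor_readings])

if __name__ == "__main__":
    # Materialize both assets in-process, e.g. for local testing before deployment.
    result = materialize([raw_sensor_readings, filtered_sensor_readings])
    assert result.success
```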
Posted 2 months ago
5.0 - 10.0 years
8 - 14 Lacs
Bengaluru
Work from Office
Key Responsibilities : - Design and develop scalable PySpark pipelines to ingest, parse, and process XML datasets with extreme hierarchical complexity. - Implement efficient XPath expressions, recursive parsing techniques, and custom schema definitions to extract data from nested XML structures. - Optimize Spark jobs through partitioning, caching, and parallel processing to handle terabytes of XML data efficiently. - Transform raw hierarchical XML data into structured DataFrames for analytics, machine learning, and reporting use cases. - Collaborate with data architects and analysts to define data models for nested XML schemas. - Troubleshoot performance bottlenecks and ensure reliability in distributed environments (e.g., AWS, Databricks, Hadoop). - Document parsing logic, data lineage, and optimization strategies for maintainability. Qualifications : - 5+ years of hands-on experience with PySpark and Spark XML libraries (e.g., `spark-xml`) in production environments. - Proven track record of parsing XML data with 20+ levels of nesting using recursive methods and schema inference. - Expertise in XPath, XQuery, and DataFrame transformations (e.g., `explode`, `struct`, `selectExpr`) for hierarchical data. - Strong understanding of Spark optimization techniques: partitioning strategies, broadcast variables, and memory management. - Experience with distributed computing frameworks (e.g., Hadoop, YARN) and cloud platforms (AWS, Azure, GCP). - Familiarity with big data file formats (Parquet, Avro) and orchestration tools (Airflow, Luigi). - Bachelor's degree in Computer Science, Data Engineering, or a related field. Preferred Skills : - Experience with schema evolution and versioning for nested XML/JSON datasets. - Knowledge of Scala or Java for extending Spark XML libraries. - Exposure to Databricks, Delta Lake, or similar platforms. - Certifications in AWS/Azure big data technologies.
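For context on the nested-XML work this role centers on, here is a hedged PySpark sketch using the spark-xml connector; the file path, row tag, and field names are illustrative, and a deeply nested source would repeat the explode step (or use a recursive schema walker) once per level of the hierarchy.

```python
# Minimal sketch of flattening nested XML with PySpark and the spark-xml package.
# Assumes the com.databricks:spark-xml connector is available on the cluster
# (via --packages or cluster config); paths and tags are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("nested-xml-flatten").getOrCreate()

raw = (
    spark.read.format("xml")
    .option("rowTag", "order")          # each <order> element becomes one row
    .load("s3://example-bucket/raw/orders.xml")
)

# Explode one level of nesting; real pipelines repeat this (or drive it from the
# inferred schema) for each additional level of the hierarchy.
flattened = (
    raw.select("orderId", F.explode("items.item").alias("item"))
       .select("orderId", F.col("item.sku"), F.col("item.qty"))
)

flattened.write.mode("overwrite").parquet("s3://example-bucket/curated/order_items/")
```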
Posted 2 months ago
5.0 - 10.0 years
8 - 14 Lacs
Gurugram
Work from Office
Key Responsibilities : - Design and develop scalable PySpark pipelines to ingest, parse, and process XML datasets with extreme hierarchical complexity. - Implement efficient XPath expressions, recursive parsing techniques, and custom schema definitions to extract data from nested XML structures. - Optimize Spark jobs through partitioning, caching, and parallel processing to handle terabytes of XML data efficiently. - Transform raw hierarchical XML data into structured DataFrames for analytics, machine learning, and reporting use cases. - Collaborate with data architects and analysts to define data models for nested XML schemas. - Troubleshoot performance bottlenecks and ensure reliability in distributed environments (e.g., AWS, Databricks, Hadoop). - Document parsing logic, data lineage, and optimization strategies for maintainability. Qualifications : - 5+ years of hands-on experience with PySpark and Spark XML libraries (e.g., `spark-xml`) in production environments. - Proven track record of parsing XML data with 20+ levels of nesting using recursive methods and schema inference. - Expertise in XPath, XQuery, and DataFrame transformations (e.g., `explode`, `struct`, `selectExpr`) for hierarchical data. - Strong understanding of Spark optimization techniques: partitioning strategies, broadcast variables, and memory management. - Experience with distributed computing frameworks (e.g., Hadoop, YARN) and cloud platforms (AWS, Azure, GCP). - Familiarity with big data file formats (Parquet, Avro) and orchestration tools (Airflow, Luigi). - Bachelor's degree in Computer Science, Data Engineering, or a related field. Preferred Skills : - Experience with schema evolution and versioning for nested XML/JSON datasets. - Knowledge of Scala or Java for extending Spark XML libraries. - Exposure to Databricks, Delta Lake, or similar platforms. - Certifications in AWS/Azure big data technologies.
Posted 2 months ago
7.0 - 11.0 years
20 - 30 Lacs
Hyderabad, Bengaluru
Hybrid
Responsibilities include working on MDM platforms (ETL, data modelling, data warehousing) and managing complex database-related analysis, as well as designing, implementing, and supporting moderate to large-sized databases. The role will help provide production support, enhance existing data assets, and design and develop ETL processes. Job Description: He/she will be responsible for the design and development of ETL processes for a large data warehouse. Required Qualifications: Experience with Master Data Management platforms (ETL or EAI), data warehousing concepts, code management, and automated testing. Experience in developing ETL design guidelines, standards, and procedures to ensure a manageable ETL infrastructure across the enterprise. Strong command of MS SQL/Oracle SQL, MongoDB, PL/SQL, and complex data analysis using SQL queries. Development experience in the Big Data ecosystem, with the ability to design, develop, document, and architect Hadoop applications, would be a plus. Experience in HDFS/Hive/Spark/NoSQL (HBase). Strong knowledge of data architecture concepts. Strong knowledge of reporting and analytics concepts. Knowledge of software engineering best practices, with experience implementing CI/CD using Jenkins. Knowledge of the Agile methodology for delivering software solutions. Good-to-have skills: SQL Server DB, Windows server file handling, and PowerShell scripting. What will you do in this role? Manage and develop ETL processes for a large data warehouse. Provide analysis and design reviews with other development teams to avoid duplication of effort and inefficiency in solving the same application problem with different solutions. Work closely with business partners, business analysts, and software architects to create and operationalize common data products and consumption layers. Act as a developer in providing application design guidance and consultation, utilizing a thorough understanding of applicable technology, tools, and existing designs. Develop simple or highly complex code. Verify program logic by overseeing the preparation of test data, testing, and debugging of programs.
Posted 2 months ago
5.0 - 10.0 years
8 - 14 Lacs
Hyderabad
Work from Office
Key Responsibilities : - Design and develop scalable PySpark pipelines to ingest, parse, and process XML datasets with extreme hierarchical complexity. - Implement efficient XPath expressions, recursive parsing techniques, and custom schema definitions to extract data from nested XML structures. - Optimize Spark jobs through partitioning, caching, and parallel processing to handle terabytes of XML data efficiently. - Transform raw hierarchical XML data into structured DataFrames for analytics, machine learning, and reporting use cases. - Collaborate with data architects and analysts to define data models for nested XML schemas. - Troubleshoot performance bottlenecks and ensure reliability in distributed environments (e.g., AWS, Databricks, Hadoop). - Document parsing logic, data lineage, and optimization strategies for maintainability. Qualifications : - 5+ years of hands-on experience with PySpark and Spark XML libraries (e.g., `spark-xml`) in production environments. - Proven track record of parsing XML data with 20+ levels of nesting using recursive methods and schema inference. - Expertise in XPath, XQuery, and DataFrame transformations (e.g., `explode`, `struct`, `selectExpr`) for hierarchical data. - Strong understanding of Spark optimization techniques: partitioning strategies, broadcast variables, and memory management. - Experience with distributed computing frameworks (e.g., Hadoop, YARN) and cloud platforms (AWS, Azure, GCP). - Familiarity with big data file formats (Parquet, Avro) and orchestration tools (Airflow, Luigi). - Bachelor's degree in Computer Science, Data Engineering, or a related field. Preferred Skills : - Experience with schema evolution and versioning for nested XML/JSON datasets. - Knowledge of Scala or Java for extending Spark XML libraries. - Exposure to Databricks, Delta Lake, or similar platforms. - Certifications in AWS/Azure big data technologies.
Posted 2 months ago
5.0 - 10.0 years
8 - 14 Lacs
Mumbai
Remote
Key Responsibilities : - Design and develop scalable PySpark pipelines to ingest, parse, and process XML datasets with extreme hierarchical complexity. - Implement efficient XPath expressions, recursive parsing techniques, and custom schema definitions to extract data from nested XML structures. - Optimize Spark jobs through partitioning, caching, and parallel processing to handle terabytes of XML data efficiently. - Transform raw hierarchical XML data into structured DataFrames for analytics, machine learning, and reporting use cases. - Collaborate with data architects and analysts to define data models for nested XML schemas. - Troubleshoot performance bottlenecks and ensure reliability in distributed environments (e.g., AWS, Databricks, Hadoop). - Document parsing logic, data lineage, and optimization strategies for maintainability. Qualifications : - 5+ years of hands-on experience with PySpark and Spark XML libraries (e.g., `spark-xml`) in production environments. - Proven track record of parsing XML data with 20+ levels of nesting using recursive methods and schema inference. - Expertise in XPath, XQuery, and DataFrame transformations (e.g., `explode`, `struct`, `selectExpr`) for hierarchical data. - Strong understanding of Spark optimization techniques: partitioning strategies, broadcast variables, and memory management. - Experience with distributed computing frameworks (e.g., Hadoop, YARN) and cloud platforms (AWS, Azure, GCP). - Familiarity with big data file formats (Parquet, Avro) and orchestration tools (Airflow, Luigi). - Bachelor's degree in Computer Science, Data Engineering, or a related field. Preferred Skills : - Experience with schema evolution and versioning for nested XML/JSON datasets. - Knowledge of Scala or Java for extending Spark XML libraries. - Exposure to Databricks, Delta Lake, or similar platforms. - Certifications in AWS/Azure big data technologies.
Posted 2 months ago
5.0 - 10.0 years
8 - 14 Lacs
Jaipur
Remote
Key Responsibilities : - Design and develop scalable PySpark pipelines to ingest, parse, and process XML datasets with extreme hierarchical complexity. - Implement efficient XPath expressions, recursive parsing techniques, and custom schema definitions to extract data from nested XML structures. - Optimize Spark jobs through partitioning, caching, and parallel processing to handle terabytes of XML data efficiently. - Transform raw hierarchical XML data into structured DataFrames for analytics, machine learning, and reporting use cases. - Collaborate with data architects and analysts to define data models for nested XML schemas. - Troubleshoot performance bottlenecks and ensure reliability in distributed environments (e.g., AWS, Databricks, Hadoop). - Document parsing logic, data lineage, and optimization strategies for maintainability. Qualifications : - 5+ years of hands-on experience with PySpark and Spark XML libraries (e.g., `spark-xml`) in production environments. - Proven track record of parsing XML data with 20+ levels of nesting using recursive methods and schema inference. - Expertise in XPath, XQuery, and DataFrame transformations (e.g., `explode`, `struct`, `selectExpr`) for hierarchical data. - Strong understanding of Spark optimization techniques: partitioning strategies, broadcast variables, and memory management. - Experience with distributed computing frameworks (e.g., Hadoop, YARN) and cloud platforms (AWS, Azure, GCP). - Familiarity with big data file formats (Parquet, Avro) and orchestration tools (Airflow, Luigi). - Bachelor's degree in Computer Science, Data Engineering, or a related field. Preferred Skills : - Experience with schema evolution and versioning for nested XML/JSON datasets. - Knowledge of Scala or Java for extending Spark XML libraries. - Exposure to Databricks, Delta Lake, or similar platforms. - Certifications in AWS/Azure big data technologies.
Posted 2 months ago
5.0 - 10.0 years
8 - 14 Lacs
Chennai
Work from Office
Key Responsibilities : - Design and develop scalable PySpark pipelines to ingest, parse, and process XML datasets with extreme hierarchical complexity. - Implement efficient XPath expressions, recursive parsing techniques, and custom schema definitions to extract data from nested XML structures. - Optimize Spark jobs through partitioning, caching, and parallel processing to handle terabytes of XML data efficiently. - Transform raw hierarchical XML data into structured DataFrames for analytics, machine learning, and reporting use cases. - Collaborate with data architects and analysts to define data models for nested XML schemas. - Troubleshoot performance bottlenecks and ensure reliability in distributed environments (e.g., AWS, Databricks, Hadoop). - Document parsing logic, data lineage, and optimization strategies for maintainability. Qualifications : - 5+ years of hands-on experience with PySpark and Spark XML libraries (e.g., `spark-xml`) in production environments. - Proven track record of parsing XML data with 20+ levels of nesting using recursive methods and schema inference. - Expertise in XPath, XQuery, and DataFrame transformations (e.g., `explode`, `struct`, `selectExpr`) for hierarchical data. - Strong understanding of Spark optimization techniques: partitioning strategies, broadcast variables, and memory management. - Experience with distributed computing frameworks (e.g., Hadoop, YARN) and cloud platforms (AWS, Azure, GCP). - Familiarity with big data file formats (Parquet, Avro) and orchestration tools (Airflow, Luigi). - Bachelor's degree in Computer Science, Data Engineering, or a related field. Preferred Skills : - Experience with schema evolution and versioning for nested XML/JSON datasets. - Knowledge of Scala or Java for extending Spark XML libraries. - Exposure to Databricks, Delta Lake, or similar platforms. - Certifications in AWS/Azure big data technologies.
Posted 2 months ago
14.0 - 24.0 years
35 - 55 Lacs
Hyderabad, Bengaluru, Delhi / NCR
Hybrid
About the role: We are seeking a Sr. Practice Manager. With Insight, you will be involved in different phases of the Software Development Lifecycle, including Analysis, Design, Development, and Deployment. We will count on you to be proficient in software design and development, data modelling, data processing, and data visualization. Along the way, you will get to: Help customers leverage existing data resources and implement new technologies and tooling to enable data science and data analytics. Track the performance of our resources and related capabilities. Mentor and manage other data engineers and ensure data engineering best practices are being followed. Constantly evolve and scale our capabilities along with the growth of the business and the needs of our customers. Be Ambitious: This opportunity is not just about what you do today but also about where you can go tomorrow. As a Practice Manager, you are positioned for swift advancement within our organization through a structured career path. When you bring your hunger, heart, and harmony to Insight, your potential will be met with continuous opportunities to upskill, earn promotions, and elevate your career. What we're looking for: A Sr. Practice Manager with: A total of 14+ years of relevant experience, with at least 5-6 years in people management, managing a team of 20+. Minimum 12 years of experience in data technology. Experience in data warehousing and an excellent command of SQL, data modeling, and ETL development. Hands-on experience with SQL Server and Microsoft Azure (Data Factory, Data Lake, Databricks). Experience in MSBI (SSRS, SSIS, SSAS), writing queries and stored procedures. (Good to have) Experience using Power BI, MDX, DAX, MDS, DQS. (Good to have) Experience developing designs related to predictive analytics models. Ability to handle performance improvement tasks and data archiving. Proficient in provisioning relevant Azure resources, forecasting hardware usage, and managing to a budget.
Posted 2 months ago
4.0 - 8.0 years
1 - 4 Lacs
New Delhi, Bengaluru
Work from Office
Role Overview: We're hiring a top-tier AI Solution Architect who thrives on building scalable, open-source-first frameworks in Python for industrial AI applications. You will lead the architecture, design, and deployment of Neuralix's proprietary DLT (Data Lifecycle Templatization) engine, turning complex industrial problems into elegant, reusable code artifacts. This is a career-defining opportunity to build the foundational platform for one of the most audacious AI startups in the world. What You'll Do: Architect and build modular, reusable Python frameworks that streamline industrial data pipelines, signal processing, and AI model deployment. Collaborate directly with product teams, data scientists, and field engineers to convert real-world chaos into structured, scalable solutions. Translate client-specific AI use cases into templatized microservices, with clear interfaces and observability hooks. Drive the evolution of our DLT engine from a powerful internal tool to an industry-wide platform. Mentor engineers, review critical code paths, and shape the engineering culture of the company. Contribute to open source components and help publish technical content or white papers under the Neuralix brand. What We're Looking For: 4-7 years of hands-on experience in startups or fast-paced engineering teams. Proven track record in Python framework development (not just app dev, but scalable libraries, plugins, SDKs, etc.). Deep understanding of data pipelines, sensor data, streaming frameworks, or time-series AI. Bonus: experience with physics-informed ML, MLflow, Apache Airflow, or similar orchestration tools. Strong GitHub portfolio, open source contributions, or authorship of reusable internal tools. Exceptional clarity in system design, documentation, and API design. Why Neuralix? You'll build something foundational: not just a model, not just a product, but a platform used across defense, energy, and mission-critical industrial systems. You'll work with first-principles thinkers: we don't hire for buzzwords; we solve hard problems with clarity and conviction. You'll have an outsized impact: this is not a corporate innovation lab; every line of code you write will move industries forward. Apply Now: If you're obsessed with clean abstractions, efficient design, and building real-world AI systems from scratch, we want to hear from you. Send your resume, GitHub, or anything else you're proud of to careers@neuralix.ai.
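Purely by way of illustration, the sketch below shows one common pattern for the kind of templatized, pluggable pipeline framework this role describes; the class names, step registry, and toy "normalize" step are hypothetical and do not represent Neuralix's actual DLT engine.

```python
# Illustrative sketch of a pluggable "pipeline template" registry; all names are hypothetical.
from abc import ABC, abstractmethod
from typing import Callable, Dict, List

class PipelineStep(ABC):
    @abstractmethod
    def run(self, data: dict) -> dict: ...

_REGISTRY: Dict[str, Callable[[], PipelineStep]] = {}

def register_step(name: str):
    """Decorator that registers a step so templates can refer to it by name."""
    def wrap(factory: Callable[[], PipelineStep]):
        _REGISTRY[name] = factory
        return factory
    return wrap

@register_step("normalize")
class Normalize(PipelineStep):
    def run(self, data: dict) -> dict:
        vals = data["signal"]
        peak = max(abs(v) for v in vals) or 1.0
        return {**data, "signal": [v / peak for v in vals]}

def run_template(step_names: List[str], data: dict) -> dict:
    """Execute a templatized pipeline described purely as an ordered list of step names."""
    for name in step_names:
        data = _REGISTRY[name]().run(data)
    return data

print(run_template(["normalize"], {"signal": [2.0, -4.0, 1.0]}))
```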
Posted 2 months ago
9.0 - 13.0 years
20 - 25 Lacs
Bengaluru
Work from Office
Senior Software Engineer (12-15 yrs), Supply Chain Retail Technology. Who We Are: Wayfair runs the largest custom e-commerce large parcel network in the United States, approximately 1.6 million square meters of logistics space. The nature of the network is inherently a highly variable ecosystem that requires flexible, reliable, and resilient systems to operate efficiently. We are looking for a passionate Backend Software Engineer to join the Fulfilment Optimisation team. What You'll Do: Partner with your business stakeholders to provide them with transparency, data, and resources to make informed decisions. Be a technical leader within and across the teams you work with. Drive high-impact architectural decisions and hands-on development, including inception, design, execution, and delivery following good design and coding practices. Obsessively focus on production readiness for the team, including testing, monitoring, deployment, documentation, and proactive troubleshooting. Identify risks and gaps in technical approaches and propose solutions to meet team and project goals. Create proposals and action plans to garner support across the organization. Influence and contribute to the team's strategy and roadmap. Tenacity for learning: curious, and constantly pushing the boundary of what is possible. We Are a Match Because You Have: A Bachelor's Degree in Computer Science or a related engineering field. At least 12 years of experience in a senior engineer or technical lead role. Should have mentored 10-12 people. Experience developing and designing scalable distributed systems with a deep understanding of architectural and design patterns, object-oriented design, and modern programming languages. Excellent communication skills and ability to work effectively with engineers, product managers, data scientists, analysts, and business stakeholders. Passion for mentoring and leading peer engineers. Experience designing APIs and microservices. Experience working on cloud technologies, specifically GCP, is a plus. Deep understanding of data processing and data pipelines. Common open source platforms, tools, and frameworks, e.g., Kafka, Kubernetes, containerization, Java microservices, GraphQL APIs, Aerospike, etc. Designing and developing recommendation systems and productionalizing ML models for real-time decisions, large-scale data processing, and event-driven systems and technologies is a plus. About Wayfair Inc.: Wayfair is one of the world's largest online destinations for the home. Whether you work in our global headquarters in Boston or Berlin, or in our warehouses or offices throughout the world, we're reinventing the way people shop for their homes. Through our commitment to industry-leading technology and creative problem-solving, we are confident that Wayfair will be home to the most rewarding work of your career. If you're looking for rapid growth, constant learning, and dynamic challenges, then you'll find that amazing career opportunities are knocking. No matter who you are, Wayfair is a place you can call home. We're a community of innovators, risk-takers, and trailblazers who celebrate our differences, and know that our unique perspectives make us stronger, smarter, and well-positioned for success. We value and rely on the collective voices of our employees, customers, community, and suppliers to help guide us as we build a better Wayfair and world for all. Every voice, every perspective matters. That's why we're proud to be an equal opportunity employer. We do not discriminate on the basis of race, color, ethnicity, ancestry, religion, sex, national origin, sexual orientation, age, citizenship status, marital status, disability, gender identity, gender expression, veteran status, genetic information, or any other legally protected characteristic. Your personal data is processed in accordance with our Candidate Privacy Notice (https://wayfair.com/careers/privacy). If you have any questions or wish to exercise your rights under applicable privacy and data protection laws, please contact us at dataprotectionofficer@wayfair.com.
Posted 2 months ago
5.0 - 10.0 years
20 - 35 Lacs
Hyderabad, Chennai, Bengaluru
Hybrid
Location: Bangalore, Hyderabad, Chennai. Notice Period: Immediate to 20 days. Experience: 5+ years. Relevant Experience: 5+ years. Skills: Data Engineering, Azure, Python, Pandas, SQL, PySpark, Databricks, data pipelines, Synapse.
Posted 2 months ago