10.0 - 14.0 years
0 Lacs
Pune, Maharashtra
On-site
About the Team
As a part of the DoorDash organization, you will be joining a data-driven team that values timely, accurate, and reliable data to make informed business and product decisions. Data serves as the foundation of DoorDash's success, and the Data Engineering team builds database solutions tailored to use cases such as reporting, product analytics, marketing optimization, and financial reporting. By implementing robust data structures and data warehouse architecture, this team plays a crucial role in facilitating decision-making at DoorDash. The team also enhances the developer experience by building tools that support the organization's high-velocity demands.

About the Role
DoorDash is seeking a dedicated Data Engineering Manager to lead the development of enterprise-scale data solutions. In this role, you will serve as a technical expert on all aspects of data architecture, empowering data engineers, data scientists, and DoorDash partners. You will foster a culture of engineering excellence, enabling engineers to deliver reliable and flexible solutions at scale, and you will build and nurture a high-performing team, driving innovation and success in a dynamic, fast-paced environment.

In this role, you will:
- Lead and manage a team of data engineers, focusing on hiring, building, growing, and nurturing impactful business-focused data teams.
- Drive the technical and strategic vision for embedded pods and foundational enablers to meet current and future scalability and interoperability needs.
- Strive for continuous improvement of data architecture and development processes.
- Balance quick wins with long-term strategy and engineering excellence, breaking down large systems into user-friendly data assets and reusable components.
- Collaborate cross-functionally with stakeholders, external partners, and peer data leaders.
- Use effective planning and execution tools to ensure short-term and long-term team and stakeholder success.
- Prioritize reliability and quality as essential components of data solutions.

Qualifications:
- Bachelor's, Master's, or Ph.D. in Computer Science or an equivalent field.
- Over 10 years of experience in data engineering, data platform, or related domains.
- Minimum of 2 years of hands-on management experience.
- Strong communication and leadership skills, with a track record of hiring and growing teams in a fast-paced environment.
- Proficiency in programming languages such as Python, Kotlin, and SQL.
- Prior experience with technologies such as Snowflake, Databricks, Spark, Trino, and Pinot.
- Familiarity with the AWS ecosystem and large-scale batch/real-time ETL orchestration using tools like Airflow, Kafka, and Spark Streaming.
- Knowledge of data lake table formats and storage, including Delta Lake, Apache Iceberg, Glue Catalog, and S3.
- Proficiency in system design and experience with AI solutions in the data space.

At DoorDash, we are dedicated to fostering a diverse and inclusive community within our company and beyond. We believe that innovation thrives when individuals from diverse backgrounds, experiences, and perspectives come together. We are committed to providing equal opportunities for all and creating an inclusive workplace where everyone can excel and contribute to our collective success.
Posted 4 days ago
7.0 - 11.0 years
0 Lacs
Karnataka
On-site
As a Lead Database Engine Developer, you will play a crucial role in scaling our database engine to operate at exabyte scale. Our analytical database engine processes trillions of data points daily, serving rapid queries with a 60 ms response time at P50. Your technical expertise and leadership will be pivotal in ensuring that the system seamlessly manages exabytes of data on a daily basis.

Your responsibilities will include developing and executing innovative technical strategies for the database engine that align with New Relic's business objectives. You will focus on optimizing scalability and performance to handle exabyte-scale data while maintaining exceptional query performance. You will also enhance data ingestion pipelines to support trillions of data points, collaborate with cross-functional teams to fine-tune query execution and response times, and ensure high reliability, fault tolerance, and disaster recovery capabilities for mission-critical cloud services.

To excel in this position, you should possess at least 7 years of experience in database engine development, with exposure to core areas of database products: query optimization and execution, distributed databases and parallel query execution, and expression optimization and evaluation. Proficiency in C, C++, Unix, Linux, Windows, data structures and algorithms, database internals, PostgreSQL, CitusDB, and MySQL is required, along with experience on major cloud providers (AWS, Azure, or GCP) and extensive experience building and operating large-scale distributed systems in a SaaS environment.

Your ability to collaborate effectively, influence decisions at an interpersonal level, and communicate clearly both in writing and verbally will be crucial. Domain knowledge in observability, experience operating containerized services (Kubernetes or Mesos/Marathon), and a solid understanding of databases such as RDS, MySQL, and PostgreSQL are also important. Familiarity with configuration management tools (Ansible, Terraform, or Puppet) and technologies such as ElasticSearch/OpenSearch, Apache Iceberg, Apache Spark, Spark SQL, and Cassandra will be beneficial. Experience with data platforms, data lakes, scalability, integration with multiple data sources, benchmarking, large-scale distributed database deployments, data ingestion, query performance optimization, integrations, and migrations is highly desirable.

Ideally, you should hold a BS/MS/PhD in CS or an equivalent field to thrive in this challenging and rewarding role.
Posted 4 days ago
8.0 - 12.0 years
8 - 12 Lacs
Bengaluru, Karnataka, India
On-site
Job description

About this Team:
IT Data Platform (ITDP) is the powerhouse data platform driving Target's tech efficiencies, seamlessly integrating operational and analytical needs. It fuels every facet of Target Tech, from boosting developer productivity and enhancing system intelligence to ensuring top-notch security and compliance.

Target Tech builds the technology that makes Target the easiest, safest, and most joyful place to shop and work. From digital to supply chain to cybersecurity, develop innovations that power the future of retail while relying on best-in-class data science algorithms that drive value. Target Tech is at the forefront of the industry, revolutionizing technology efficiency with cutting-edge data and AI.

ITDP meticulously tracks tech data points across stores, multi-cloud environments, data centers, and distribution centers. IT Data Platform leverages advanced AI algorithms to analyze vast datasets, providing actionable insights that drive strategic decision-making. By integrating Generative AI, it enhances predictive analytics, enabling proactive solutions and optimizing operational efficiencies.

Basic Qualifications:
- 4-year degree or equivalent experience
- 8+ years of industry experience in software design, development, and algorithm-related solutions
- 8+ years of experience in programming languages such as Java, Python, and Scala
- Hands-on experience developing distributed systems, large-scale systems, databases, and/or backend APIs
- Demonstrated expertise in analysis and optimization of systems capacity, performance, and operational health
- Stays current with new and evolving technologies via formal training and self-directed education

Preferred Qualifications:
- Experience with Big Data tools and the Hadoop ecosystem, such as Apache Spark, Apache Iceberg, Kafka, ORC, MapReduce, YARN, Hive, and HDFS
- Experience architecting, building, and running large-scale systems
- Experience with industry open-source projects, databases, and/or large-data distributed systems

Key Responsibilities:
- Data Platform Management: Lead the design, implementation, and optimization of the Data Platform, ensuring scalability and data correctness.
- Development: Oversee the development and maintenance of all core components of the platform.
- Unified APIs: Manage and create highly scalable APIs with GraphQL at enterprise scale (a client-side sketch follows this list).
- Platform Monitoring and Observability: Ensure monitoring solutions and security tools maintain integrity and trust in data and APIs.
- Leadership and Mentorship: Provide technical leadership and mentorship to engineering teams, fostering a culture of collaboration and continuous improvement.
- Technology Design and Architecture: Articulate technology designs and architectural decisions to team members, ensuring alignment with business goals and technical standards.
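As a rough illustration of the unified-API responsibility above, here is a minimal sketch of calling a GraphQL endpoint from Python with the requests library. The endpoint URL, query, and field names are assumptions invented for illustration, not Target's actual schema.

```python
# Hypothetical GraphQL query against a unified-API layer (endpoint and schema
# are assumptions); GraphQL servers typically expose one POST endpoint that
# accepts a query document plus variables.
import requests

query = """
query Asset($id: ID!) {
  asset(id: $id) { name environment lastSeen }
}
"""

resp = requests.post(
    "https://itdp.example.com/graphql",          # placeholder endpoint
    json={"query": query, "variables": {"id": "vm-1024"}},
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["data"]["asset"])
```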
Posted 5 days ago
10.0 - 14.0 years
0 Lacs
Karnataka
On-site
As a Senior Staff Software Engineer in Data Lakehouse Engineering, you will play a crucial role in designing and implementing the data lakehouse platform, supporting both data engineering and lakehouse applications. Your responsibilities will include overseeing the productionalization of data engineering pipelines: end-to-end data pipelines, model development, deployment, monitoring, and refresh. You will also drive technology development and architecture to ensure the platforms, systems, tools, models, and services meet technical standards for security, quality, reliability, usability, scalability, performance, efficiency, and operability as the needs of Wex and its customers evolve. Balancing near-term and long-term requirements in collaboration with other teams across the organization is essential.

Your technical ownership will extend to Wex's data lakehouse architecture and service technology implementations, emphasizing architecture, technical direction, engineering best practices, and quality/compliance. Collaboration with the Platform Engineering and Data Lakehouse Engineering teams will be a key aspect of your role. The vision behind Wex's data lakehouse is a unified, scalable, and intelligent data infrastructure that enables the organization to leverage its data effectively, with goals including data democratization, agility and scalability, and advanced insights and innovation through data and AI technology.

We are seeking a highly motivated and experienced software engineer to join our organization and contribute to building out the data lakehouse platform for Wex. Reporting to the Sr. Manager of Data Lakehouse Engineering in Bangalore, the ideal candidate will possess deep technical expertise in building and scaling lakehouse environments, coupled with strong leadership and communication skills to align efforts across the organization.

Your impact will be significant: you will lead the development of the technology and platform for the company's lakehouse requirements, ensuring the functional richness, reliability, performance, and flexibility of the platform. You will design the architecture, lead the implementation of the lakehouse system and services, and challenge the status quo to drive technical solutions that effectively serve the broad risk area of Wex. Collaboration with various engineering teams, information security teams, and external partners will be essential to ensure the security, privacy, and integration of the platform. You will also create, prioritize, manage, and execute roadmaps and project plans, and report on the status of development, quality, operations, and system performance.

You will drive the technical vision and strategy of the lakehouse to meet business needs, set high standards for your team, provide technical guidance and mentorship, and foster an environment of continuous learning and innovation. Upholding strong engineering principles and ensuring a culture of transparency and inclusion will be integral to your leadership.

To be successful in this role, you should bring at least 10 years of software design and development experience at large scale and strong software development skills in your chosen programming language. Experience with data lakehouse formats, Spark programming, cloud architecture tools and services, CI/CD automation, and agile development practices will be advantageous, along with excellent analytical skills, mentorship capabilities, and strong written and verbal communication skills.

In terms of personal characteristics, you should demonstrate a collaborative, mission-driven style, high standards of integrity and corporate stewardship, and the ability to operate in a fast-paced entrepreneurial environment. Leading with empathy, fostering a culture of trust and transparency, and communicating effectively in various settings will be key to your success. You should also exhibit talent development and scouting abilities, intellectual curiosity, learning agility, and the capacity to drive change through influence and stakeholder management across a complex business environment.
Posted 1 week ago
5.0 - 10.0 years
0 Lacs
Hyderabad, Telangana
On-site
We are looking for an experienced and dedicated Senior Manager of Business Intelligence & Data Engineering to lead a team of engineers. In this role, you will oversee the Business Intelligence (BI) ecosystem, including designing and maintaining data pipelines, enabling advanced analytics, and providing actionable insights through BI tools and data visualization.

Your responsibilities will include leading the design and development of scalable data architectures on AWS, managing data lakes, implementing data modeling and productization, collaborating with business stakeholders to create actionable insights, ensuring thorough documentation of data pipelines and systems, promoting knowledge sharing within the team, and staying current with industry trends in data engineering and BI.

You should have at least 10 years of experience in data engineering or a related field, with a strong track record in designing and implementing large-scale distributed data systems. You should also possess expertise in BI, data visualization, people management, CI/CD tools, cloud-based data warehousing, AWS services, data lake architectures, Apache Spark, SQL, enterprise BI platforms, and microservices-based architectures. Strong communication skills, a collaborative mindset, and the ability to deliver insights to technical and executive audiences are essential for this role.

Bonus points will be awarded if you have knowledge of data science and machine learning concepts, experience with Infrastructure as Code practices, familiarity with data governance and security in cloud environments, and domain understanding of apparel, retail, manufacturing, supply chain, or logistics.

If you are passionate about leading a high-performing team, driving innovation in data engineering and BI, and contributing to the success of a global sports platform like Fanatics Commerce, we welcome you to apply for this exciting opportunity.
Posted 1 week ago
12.0 - 14.0 years
12 - 14 Lacs
Bengaluru, Karnataka, India
On-site
The candidate should have proven expertise in building scalable, customer-facing platforms and in evangelizing the platform with customers and internal stakeholders.

- Expert-level knowledge of cloud computing, including VPC network design, the shared responsibility matrix, cloud databases, NoSQL databases, data pipelines on the cloud, VMs and VM orchestration, and serverless frameworks, across all three major cloud providers (AWS, Azure, GCP), preferably hands-on in at least two of the three public clouds.
- Expert-level knowledge of data ingestion paradigms and the use of different database types (OLTP, OLAP) for specific purposes.
- Hands-on experience with Apache Spark, Apache Flink, Kafka, Kinesis, Pub/Sub, Databricks, Apache Airflow, Apache Iceberg, and Presto.
- Expertise in designing ML pipelines for experiment management, model management, feature management, model retraining, and A/B testing of models, and in designing APIs for model inferencing at scale. Proven expertise with Kubeflow and SageMaker/Vertex AI/Azure AI.
- SME in LLM serving paradigms, with deep knowledge of GPU architectures and of distributed training and serving of large language models. Expertise in model- and data-parallel training, with frameworks like DeepSpeed and serving frameworks like vLLM (a minimal serving sketch follows this list).
- Proven expertise in model fine-tuning and model optimization techniques to achieve better latencies and better accuracies, and in reducing the training and resource requirements of fine-tuning LLM and LVM models.
- Wide knowledge of different LLM models, with informed opinions on the applicability of each model to specific use cases.
- Proven expertise working on specific customer use cases and delivering solutions end to end, from engineering to production.
- Proven expertise in DevOps and LLMOps, with knowledge of Kubernetes, Docker, and container orchestration, and deep knowledge of LLM orchestration frameworks such as Flowise, Langflow, and LangGraph.

Skill Matrix:
- LLM: Hugging Face OSS LLMs, GPT, Gemini, Claude, Mixtral, Llama
- LLM Ops: MLflow, LangChain, LangGraph, LangFlow, Flowise, LlamaIndex, SageMaker, AWS Bedrock, Vertex AI, Azure AI
- DevOps: Kubernetes, Docker, FluentD, Kibana, Grafana, Prometheus
- Databases/Data warehouses: DynamoDB, Cosmos DB, MongoDB, RDS, MySQL, PostgreSQL, Aurora, Spanner, Google BigQuery
- Cloud expertise: AWS/Azure/GCP
- Cloud certifications: AWS Professional Solutions Architect, AWS Machine Learning Specialty, Azure Solutions Architect Expert
- Languages: Proficient in Python, SQL, JavaScript
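Since the role centers on LLM serving with frameworks like vLLM, here is a minimal offline-inference sketch using vLLM's Python API. The model name and sampling settings are illustrative assumptions; production serving would more typically use vLLM's OpenAI-compatible server rather than this batch path.

```python
# Minimal vLLM offline-inference sketch (model name is an assumption).
from vllm import LLM, SamplingParams

# Load an open-weights model; tensor_parallel_size shards it across GPUs.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", tensor_parallel_size=1)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Summarize the benefits of continuous batching."], params)

for out in outputs:
    # Each RequestOutput holds one or more completions; take the first.
    print(out.outputs[0].text)
```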
Posted 1 week ago
4.0 - 6.0 years
4 - 8 Lacs
Bengaluru
Hybrid
Hiring an AWS Data Engineer for a 6-month hybrid contractual role based in Bellandur, Bengaluru. The ideal candidate will have 4-6 years of experience in data engineering, with strong expertise in AWS services (S3, EC2, RDS, Lambda, EKS), PostgreSQL, Redis, Apache Iceberg, and Graph/Vector Databases. Proficiency in Python or Golang is essential. Responsibilities include designing and optimizing data pipelines on AWS, managing structured and in-memory data, implementing advanced analytics with vector/graph databases, and collaborating with cross-functional teams. Prior experience with CI/CD and containerization (Docker/Kubernetes) is a plus.
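As a rough sketch of the S3/Lambda pipeline work this role describes, below is a hypothetical Lambda handler for an S3-triggered ingestion step. The bucket, key layout, and downstream target are assumptions; only the standard boto3 client and the S3 event shape are real.

```python
# Hypothetical S3-triggered Lambda ingestion step (names are assumptions).
import json
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # S3 put-event records carry the bucket and object key of each new file.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        rows = json.loads(body)
        # A downstream write (e.g., to PostgreSQL or an Iceberg table) would go here.
        print(f"ingested {len(rows)} rows from s3://{bucket}/{key}")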
Posted 2 weeks ago
7.0 - 12.0 years
10 - 15 Lacs
Bengaluru
Hybrid
Hiring an AWS Data Engineer for a 6-month hybrid contractual role based in Bellandur, Bengaluru. The ideal candidate will have 7+ years of experience in data engineering, with strong expertise in AWS services (S3, EC2, RDS, Lambda, EKS), PostgreSQL, Redis, Apache Iceberg, and Graph/Vector Databases. Proficiency in Python or Golang is essential. Responsibilities include designing and optimizing data pipelines on AWS, managing structured and in-memory data, implementing advanced analytics with vector/graph databases, and collaborating with cross-functional teams. Prior experience with CI/CD and containerization (Docker/Kubernetes) is a plus.
Posted 3 weeks ago
4.0 - 6.0 years
6 - 8 Lacs
Bengaluru, Bellandur
Hybrid
Hiring an AWS Data Engineer for a 6-month hybrid contractual role based in Bellandur, Bengaluru. The ideal candidate will have 4-6 years of experience in data engineering, with strong expertise in AWS services (S3, EC2, RDS, Lambda, EKS), PostgreSQL, Redis, Apache Iceberg, and Graph/Vector Databases. Proficiency in Python or Golang is essential. Responsibilities include designing and optimizing data pipelines on AWS, managing structured and in-memory data, implementing advanced analytics with vector/graph databases, and collaborating with cross-functional teams. Prior experience with CI/CD and containerization (Docker/Kubernetes) is a plus.
Posted 1 month ago
4.0 - 9.0 years
4 - 9 Lacs
Hyderabad, Telangana, India
On-site
- Good experience in Apache Iceberg, Apache Spark, and Trino
- Proficiency in SQL and data modeling
- Experience with an open data lakehouse using Apache Iceberg
- Experience with data lakehouse architecture built on Apache Iceberg and Trino
- Design and implement scalable data lakehouse solutions using Apache Iceberg and Trino to optimize data storage and query performance
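To make the Iceberg/Spark skill concrete, here is a minimal sketch of a PySpark session configured for an Iceberg catalog, creating and querying a partitioned table. The catalog name, warehouse path, and table are made up, and it assumes the Iceberg Spark runtime jar is already on the cluster classpath.

```python
# Sketch: PySpark with an Iceberg Hadoop catalog (names/paths are assumptions).
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("iceberg-demo")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", "s3a://my-bucket/warehouse")
    .getOrCreate()
)

# Iceberg's hidden partitioning: days() derives partitions from order_date.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lake.db.orders (
        order_id BIGINT, amount DOUBLE, order_date DATE
    ) USING iceberg PARTITIONED BY (days(order_date))
""")

spark.sql(
    "SELECT order_date, SUM(amount) FROM lake.db.orders GROUP BY order_date"
).show()
```

The same table would then be queryable from Trino through an Iceberg catalog, which is the usual division of labor in this stack: Spark for batch writes, Trino for interactive reads.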
Posted 1 month ago
5.0 - 10.0 years
20 - 25 Lacs
Bengaluru
Work from Office
The Platform Data Engineer will be responsible for designing and implementing robust data platform architectures, integrating diverse data technologies, and ensuring scalability, reliability, performance, and security across the platform. The role involves setting up and managing infrastructure for data pipelines, storage, and processing; developing internal tools to enhance platform usability; implementing monitoring and observability; collaborating with software engineering teams for seamless integration; and driving capacity planning and cost optimization initiatives.
Posted 1 month ago
5.0 - 10.0 years
3 - 14 Lacs
Bengaluru / Bangalore, Karnataka, India
On-site
Key Responsibilities:
- Design & Implement Data Architecture: Design, implement, and maintain the overall data platform architecture, ensuring the scalability, security, and performance of the platform.
- Data Technologies Integration: Select, integrate, and configure data technologies (cloud platforms like AWS, Azure, and GCP; data lakes; data warehouses; streaming platforms like Kafka; containerization technologies).
- Infrastructure Management: Set up and manage the infrastructure for data pipelines, data storage, and data processing across platforms like Kubernetes and Airflow.
- Develop Frameworks & Tools: Develop internal frameworks to improve the efficiency and usability of the platform for other teams, such as Data Engineers and Data Scientists.
- Data Platform Monitoring & Observability: Implement and manage monitoring and observability for the data platform, ensuring high availability and fault tolerance.
- Collaboration: Work closely with software engineering teams to integrate the data platform with other business systems and applications.
- Capacity & Cost Optimization: Participate in capacity planning and cost optimization for data infrastructure, ensuring efficient utilization of resources.

Tech Stack Requirements:
- Apache Iceberg (version 0.13.2): Experience managing table formats for scalable data storage.
- Apache Spark (version 3.4 and above): Expertise in building and maintaining batch and streaming data processing capabilities.
- Apache Kafka (version 3.9 and above): Proficiency in managing messaging platforms for real-time data streaming.
- Role-Based Access Control (RBAC): Experience with Apache Ranger (version 2.6.0) for implementing and administering security and access controls.
- RDBMS: Experience with near-real-time data storage solutions, specifically Oracle (version 19c).
- Great Expectations (version 1.3.4): Familiarity with implementing Data Quality (DQ) frameworks to ensure data integrity and consistency.
- Data Lineage & Cataloging: Experience with OpenLineage and DataHub (version 0.15.0) for managing data lineage and catalog solutions.
- Trino (version 4.7.0): Proficiency with query engines for batch processing.
- Container Platforms: Hands-on experience managing container platforms such as SKE (version 1.29 on AKS).
- Airflow (version 2.10.4): Experience with workflow and scheduling tools for orchestrating and managing data pipelines (an orchestration sketch follows this list).
- DBT (Data Build Tool): Proficiency with ETL/ELT frameworks like dbt for data transformation and automation.
- Data Tokenization: Experience with data tokenization technologies like Protegrity (version 9.2) for ensuring data security.

Desired Skills:
- Domain Expertise: Familiarity with the Banking domain is a plus, including working with financial data and regulatory requirements.
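To illustrate the orchestration piece of this stack, here is a minimal Airflow 2.x DAG chaining a Spark batch job with a dbt transformation. The DAG id, schedule, script paths, and project directory are all assumptions for illustration.

```python
# Illustrative Airflow DAG: Spark ingest followed by a dbt run (paths/names
# are assumptions, not a known project layout).
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="lakehouse_daily_batch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = BashOperator(
        task_id="spark_ingest",
        bash_command="spark-submit --master yarn /jobs/ingest_to_iceberg.py",
    )
    transform = BashOperator(
        task_id="dbt_transform",
        bash_command="dbt run --project-dir /dbt/lakehouse",
    )
    # dbt only runs after the Spark ingest succeeds.
    ingest >> transform
```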
Posted 1 month ago
8.0 - 13.0 years
25 - 40 Lacs
Chennai
Work from Office
Architect & Build Scalable Systems: Design and implement petabyte-scale lakehouse architectures to unify data lakes and warehouses.
Real-Time Data Engineering: Develop and optimize streaming pipelines using Kafka, Pulsar, and Flink.

Required candidate profile: Data engineering experience with large-scale systems; expert proficiency in Java for data-intensive applications; hands-on experience with lakehouse architectures, stream processing, and event streaming.
Posted 1 month ago
8.0 - 10.0 years
0 Lacs
Bengaluru / Bangalore, Karnataka, India
On-site
NTT DATA strives to hire exceptional, innovative and passionate individuals who want to grow with us. If you want to be part of an inclusive, adaptable, and forward-thinking organization, apply now. We are currently seeking a Cloud Solution Delivery Lead Consultant to join our team in Bangalore, Karnataka (IN-KA), India (IN).

Data Engineer Lead
- Robust hands-on experience with industry-standard tooling and techniques, including SQL, Git, and CI/CD pipelines (mandatory)
- Management, administration, and maintenance of data streaming tools such as Kafka/Confluent Kafka and Flink
- Experienced with software support for applications written in Python and SQL
- Administration, configuration, and maintenance of Snowflake and DBT
- Experience with data product environments that use tools such as Kafka Connect, Snyk, Confluent Schema Registry, Atlan, IBM MQ, SonarQube, Apache Airflow, Apache Iceberg, DynamoDB, Terraform, and GitHub
- Debugging issues, root cause analysis, and applying fixes
- Management and maintenance of ETL processes (bug fixing and batch job monitoring)

Training & Certification:
- Apache Kafka Administration
- Snowflake Fundamentals/Advanced Training

Experience:
- 8 years of experience in a technical role working with AWS
- At least 2 years in a leadership or management role

About NTT DATA
NTT DATA is a $30 billion trusted global innovator of business and technology services. We serve 75% of the Fortune Global 100 and are committed to helping clients innovate, optimize and transform for long-term success. As a Global Top Employer, we have diverse experts in more than 50 countries and a robust partner ecosystem of established and start-up companies. Our services include business and technology consulting, data and artificial intelligence, industry solutions, as well as the development, implementation and management of applications, infrastructure and connectivity. We are one of the leading providers of digital and AI infrastructure in the world. NTT DATA is part of NTT Group, which invests over $3.6 billion each year in R&D to help organizations and society move confidently and sustainably into the digital future.

NTT DATA endeavors to make its website accessible to any and all users. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process, please contact us. This contact information is for accommodation requests only and cannot be used to inquire about the status of applications. NTT DATA is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.
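Given the Kafka administration emphasis in the role above, here is a minimal producer sketch using the confluent-kafka Python client. The broker address, topic, and payload are assumptions for illustration.

```python
# Minimal confluent-kafka producer (broker, topic, and payload are assumptions).
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "broker1:9092"})

def on_delivery(err, msg):
    # Fires once the broker acknowledges (or rejects) the message.
    if err is not None:
        print(f"delivery failed: {err}")
    else:
        print(f"delivered to {msg.topic()}[{msg.partition()}]")

producer.produce(
    "events",
    key="order-123",
    value='{"status": "created"}',
    callback=on_delivery,
)
producer.flush()  # Block until all queued messages are delivered.
```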
Posted 1 month ago
1.0 - 3.0 years
3 - 5 Lacs
New Delhi, Chennai, Bengaluru
Hybrid
Your day at NTT DATA
We are seeking an experienced Data Engineer to join our team in delivering cutting-edge Generative AI (GenAI) solutions to clients. The successful candidate will be responsible for designing, developing, and deploying data pipelines and architectures that support the training, fine-tuning, and deployment of LLMs for various industries. This role requires strong technical expertise in data engineering, problem-solving skills, and the ability to work effectively with clients and internal teams.

What you'll be doing
Key Responsibilities:
- Design, develop, and manage data pipelines and architectures to support GenAI model training, fine-tuning, and deployment.
- Data Ingestion and Integration: Develop data ingestion frameworks to collect data from various sources, transform it, and integrate it into a unified data platform for GenAI model training and deployment.
- GenAI Model Integration: Collaborate with data scientists to integrate GenAI models into production-ready applications, ensuring seamless model deployment, monitoring, and maintenance.
- Cloud Infrastructure Management: Design, implement, and manage cloud-based data infrastructure (e.g., AWS, GCP, Azure) to support large-scale GenAI workloads, ensuring cost-effectiveness, security, and compliance.
- Write scalable, readable, and maintainable code using object-oriented programming concepts in languages like Python, and utilize libraries like Hugging Face Transformers, PyTorch, or TensorFlow.
- Performance Optimization: Optimize data pipelines, GenAI model performance, and infrastructure for scalability, efficiency, and cost-effectiveness.
- Data Security and Compliance: Ensure data security, privacy, and compliance with regulatory requirements (e.g., GDPR, HIPAA) across data pipelines and GenAI applications.
- Client Collaboration: Collaborate with clients to understand their GenAI needs, design solutions, and deliver high-quality data engineering services.
- Innovation and R&D: Stay up to date with the latest GenAI trends, technologies, and innovations, applying research and development skills to improve data engineering services.
- Knowledge Sharing: Share knowledge, best practices, and expertise with team members, contributing to the growth and development of the team.

Requirements:
- Bachelor's degree in Computer Science, Engineering, or related fields (Master's recommended)
- Experience with vector databases (e.g., Pinecone, Weaviate, Faiss, Annoy) for efficient similarity search and storage of dense vectors in GenAI applications (see the sketch after this posting)
- 5+ years of experience in data engineering, with a strong emphasis on cloud environments (AWS, GCP, Azure, or cloud-native platforms)
- Proficiency in programming languages like SQL, Python, and PySpark
- Strong data architecture, data modeling, and data governance skills
- Experience with big data platforms (Hadoop, Databricks, Hive, Kafka, Apache Iceberg), data warehouses (Teradata, Snowflake, BigQuery), and lakehouses (Delta Lake, Apache Hudi)
- Knowledge of DevOps practices, including Git workflows and CI/CD pipelines (Azure DevOps, Jenkins, GitHub Actions)
- Experience with GenAI frameworks and tools (e.g., TensorFlow, PyTorch, Keras)

Nice to have:
- Experience with containerization and orchestration tools like Docker and Kubernetes
- Ability to integrate vector databases and implement similarity search techniques, with a focus on GraphRAG
- Familiarity with API gateway and service mesh architectures
- Experience with low-latency/streaming, batch, and micro-batch processing
- Familiarity with Linux-based operating systems and REST APIs
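To ground the vector-database requirement, here is a minimal dense-vector similarity search with FAISS. The dimensions and vectors are random placeholders; a real GenAI pipeline would index embeddings produced by a model.

```python
# Sketch of similarity search with FAISS (data is synthetic for illustration).
import numpy as np
import faiss

dim = 384                                   # typical sentence-embedding width
vectors = np.random.rand(1000, dim).astype("float32")

index = faiss.IndexFlatL2(dim)              # exact L2 search; use IVF/HNSW at scale
index.add(vectors)

query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, k=5)   # top-5 nearest stored vectors
print(ids[0], distances[0])
```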
Posted 1 month ago
8.0 - 12.0 years
0 Lacs
Mumbai, Maharashtra, India
On-site
Introduction
A career in IBM Consulting is rooted in long-term relationships and close collaboration with clients across the globe. You'll work with visionaries across multiple industries to improve the hybrid cloud and AI journey for the most innovative and valuable companies in the world. Your ability to accelerate impact and make meaningful change for your clients is enabled by our strategic partner ecosystem and our robust technology platforms across the IBM portfolio, including Software and Red Hat. Curiosity and a constant quest for knowledge serve as the foundation to success in IBM Consulting. In your role, you'll be encouraged to challenge the norm, investigate ideas outside of your role, and come up with creative solutions resulting in groundbreaking impact for a wide network of clients. Our culture of evolution and empathy centers on long-term career growth and development opportunities in an environment that embraces your unique skills and experience.

Your role and responsibilities
Role Overview: We are looking for an experienced Denodo SME to design, implement, and optimize data virtualization solutions using Denodo as the enterprise semantic and access layer over a Cloudera-based data lakehouse. The ideal candidate will lead the integration of structured and semi-structured data across systems, enabling unified access for analytics, BI, and operational use cases.

Key Responsibilities:
- Design and deploy the Denodo Platform for data virtualization over Cloudera, RDBMS, APIs, and external data sources.
- Define logical data models, derived views, and metadata mappings across layers (integration, business, presentation).
- Connect to Cloudera Hive, Impala, Apache Iceberg, Oracle, and other on-prem/cloud sources.
- Publish REST/SOAP APIs and JDBC/ODBC endpoints for downstream analytics and applications (a consumer-side sketch follows this posting).
- Tune virtual views, caching strategies, and federation techniques to meet performance SLAs for high-volume data access.
- Implement Denodo smart query acceleration, usage monitoring, and access governance.
- Configure role-based access control (RBAC) and row/column-level security, and integrate with enterprise identity providers (LDAP, Kerberos, SSO).
- Work with data governance teams to align Denodo with enterprise metadata catalogs (e.g., Apache Atlas, Talend).

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- 8-12 years in data engineering, with 4+ years of hands-on experience in the Denodo Platform.
- Strong experience integrating RDBMS (Oracle, SQL Server), Cloudera CDP (Hive, Iceberg), and REST/SOAP APIs.
- Denodo Admin Tool, VQL, Scheduler, Data Catalog; SQL, shell scripting, basic Python (preferred).
- Deep understanding of query optimization, caching, memory management, and federation principles.
- Experience implementing data security, masking, and user access control in Denodo.
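From the consumer side, a Denodo-published view looks like any other SQL endpoint. Below is a hedged sketch querying one over ODBC with pyodbc; the DSN, credentials, and view name are assumptions, and Denodo also exposes JDBC and REST endpoints for the same views.

```python
# Hedged sketch: querying a Denodo-published virtual view over ODBC
# (DSN, user, and view names are assumptions).
import pyodbc

conn = pyodbc.connect("DSN=denodo_vdp;UID=report_user;PWD=secret")
cursor = conn.cursor()

# The virtual view federates Cloudera and Oracle sources behind one SQL surface.
cursor.execute(
    "SELECT customer_id, total_spend FROM bv_customer_360 WHERE region = ?",
    "APAC",
)
for row in cursor.fetchmany(10):
    print(row.customer_id, row.total_spend)

conn.close()
```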
Posted 2 months ago
4.0 - 9.0 years
10 - 20 Lacs
Hyderabad, Chennai, Bengaluru
Work from Office
JD:
• Good experience in Apache Iceberg, Apache Spark, Trino
• Proficiency in SQL and data modeling
• Experience with an open data lakehouse using Apache Iceberg
• Experience with data lakehouse architecture with Apache Iceberg and Trino
Posted 2 months ago
7.0 - 12.0 years
10 - 20 Lacs
Hyderabad
Remote
Job Title: Senior Data Engineer
Location: Remote
Job Type: Full-time
Experience Level: 7+ years

About the Role:
We are seeking a highly skilled Senior Data Engineer to join our team in building a modern data platform on AWS. You will play a key role in transitioning from legacy systems to a scalable, cloud-native architecture using technologies like Apache Iceberg, AWS Glue, Redshift, and Atlan for governance. This role requires hands-on experience across both legacy (e.g., Siebel, Talend, Informatica) and modern data stacks.

Responsibilities:
- Design, develop, and optimize data pipelines and ETL/ELT workflows on AWS.
- Migrate legacy data solutions (Siebel, Talend, Informatica) to modern AWS-native services.
- Implement and manage a data lake architecture using Apache Iceberg and AWS Glue.
- Work with Redshift for data warehousing, including performance tuning and modeling.
- Apply data quality and observability practices using Soda or similar tools (a sketch follows this posting).
- Ensure data governance and metadata management using Atlan (or other tools like Collibra or Alation).
- Collaborate with data architects, analysts, and business stakeholders to deliver robust data solutions.
- Build scalable, secure, and high-performing data platforms supporting both batch and real-time use cases.
- Participate in defining and enforcing data engineering best practices.

Required Qualifications:
- 7+ years of experience in data engineering and data pipeline development.
- Strong expertise with AWS services, especially Redshift, Glue, S3, and Athena.
- Proven experience with Apache Iceberg or similar open table formats (like Delta Lake or Hudi).
- Experience with legacy tools like Siebel, Talend, and Informatica.
- Knowledge of data governance tools like Atlan, Collibra, or Alation.
- Experience implementing data quality checks using Soda or equivalent.
- Strong SQL and Python skills; familiarity with Spark is a plus.
- Solid understanding of data modeling, data warehousing, and big data architectures.
- Strong problem-solving skills and the ability to work in an Agile environment.
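As a rough illustration of the Soda-based quality gate this role mentions, here is a minimal Soda Core scan sketch. The data source name, connection settings, table, and checks are all illustrative assumptions, not a known project configuration.

```python
# Hedged sketch of a Soda Core data-quality scan (all names/credentials are
# placeholders; a real setup would target Redshift via its own connector).
from soda.scan import Scan

scan = Scan()
scan.set_data_source_name("warehouse")
scan.add_configuration_yaml_str("""
data_source warehouse:
  type: postgres
  host: localhost
  username: analyst
  password: secret
  database: analytics
""")
scan.add_sodacl_yaml_str("""
checks for orders:
  - row_count > 0
  - missing_count(order_id) = 0
""")

exit_code = scan.execute()  # 0 indicates all checks passed
print(scan.get_logs_text())
```

In a pipeline, a non-zero exit code would typically fail the orchestration task so bad data never propagates downstream.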
Posted 2 months ago
4.0 - 7.0 years
10 - 14 Lacs
Noida
Work from Office
Location: Noida (in-office/hybrid; client site if required)
Type: Full-Time | Immediate joiners preferred

Must-Have Skills:
- GCP (BigQuery, Dataflow, Dataproc, Cloud Storage)
- PySpark / Spark
- Distributed computing expertise
- Apache Iceberg (preferred), Hudi, or Delta Lake

Role Overview:
Be part of a high-impact data engineering team focused on building scalable, cloud-native data pipelines. You'll support and enhance EMR platforms using DevOps principles, helping deliver real-time health alerts and diagnostics for platform performance.

Key Responsibilities:
- Provide data engineering support to EMR platforms
- Design and implement cloud-native, automated data solutions
- Collaborate with internal teams to deliver scalable systems
- Continuously improve infrastructure reliability and observability

Technical Environment:
- Databases: Oracle, MySQL, MSSQL, MongoDB
- Distributed engines: Spark/PySpark, Presto, Flink/Beam
- Cloud infra: GCP (preferred), AWS (nice to have), Terraform
- Big data formats: Iceberg, Hudi, Delta
- Tools: SQL, data modeling, Palantir Foundry, Jenkins, Confluence
- Bonus: stats/math tools (NumPy, PyMC3), Linux scripting

Ideal for engineers with cloud-native, real-time data platform experience, especially those who have worked with EMR and modern lakehouse stacks.
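For the GCP side of this stack, here is a minimal BigQuery client sketch of the kind of aggregation query such pipelines run. The project, dataset, and table names are placeholders, and it assumes application-default credentials are configured.

```python
# Minimal google-cloud-bigquery sketch (project/dataset/table are placeholders).
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")

sql = """
    SELECT DATE(event_ts) AS day, COUNT(*) AS events
    FROM `my-gcp-project.analytics.raw_events`
    GROUP BY day
    ORDER BY day DESC
    LIMIT 7
"""

# result() blocks until the query job finishes, then yields rows.
for row in client.query(sql).result():
    print(row.day, row.events)
```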
Posted 2 months ago