
491 Data Pipeline Jobs - Page 3

JobPe aggregates job listings for easy access; you apply directly on the original job portal.

2.0 - 6.0 years

3 - 7 Lacs

Hyderabad

Work from Office

Profile
• Design, develop, and optimize database solutions with a focus on SQL-based development and data transformation
• Develop code based on reading and understanding business and functional requirements, following the Agile process
• Produce high-quality code to meet all project deadlines and ensure the functionality matches the requirements
• Analyze and resolve issues found during the testing or pre-production phases of the software delivery lifecycle; coordinate changes with project team leaders and cross-work team members
• Provide technical support to project team members and respond to inquiries regarding errors or questions about programs
• Interact with architects, other tech leads, team members, and the project manager as required to address technical and schedule issues
• Suggest and implement process improvements for estimating, development, and testing processes
• BS degree in Computer Science or an applicable programming area of study
• A minimum of 2 years of prior work experience in application or database development; must demonstrate experience delivering systems and projects from inception through implementation
• Strong experience with SQL development on SQL Server and/or Oracle
• Proficiency in SQL and PL/SQL, including writing queries, stored procedures, and performance tuning
• Familiarity with data modelling and database design principles
• Experience working in Agile/Scrum environments is preferred
• Understanding of asynchronous and synchronous transactions and processing; experience with JMS, MDBs, MQ is a plus
• Experience with Snowflake, Python, data warehousing technologies, data pipelines, or cloud-based data platforms is a plus
• Excellent communication skills
• Strong system/technical analysis skills
• Self-motivation with an ability to prioritize multiple tasks
• Ability to develop a strong internal network across the platform
• Excellent collaboration, communication, negotiation, and conflict resolution skills
• Ability to think creatively and seek optimum solutions
• Ability to grasp loosely defined concepts and transform them into tangible results and key deliverables
• Very strong problem-solving skills
• Diagnostic skills with the ability to analyze technical, business, and financial issues and options
• Ability to infer from previous examples, willingness to understand how an application is put together
• Action-oriented, with the ability to quickly deal with c

Posted 1 week ago

Apply

7.0 - 11.0 years

0 Lacs

haryana

On-site

Genpact is a global professional services and solutions firm dedicated to delivering outcomes that shape the future. With a workforce of over 125,000 professionals spanning more than 30 countries, we are fueled by our innate curiosity, entrepreneurial agility, and commitment to creating lasting value for our clients. Our purpose, the relentless pursuit of a world that works better for people, drives us to serve and transform leading enterprises, including the Fortune Global 500, leveraging our deep business and industry knowledge, digital operations services, and expertise in data, technology, and AI.

We are currently seeking applications for the position of Principal Consultant - Databricks Lead Developer. As a Databricks Developer in this role, you will be tasked with solving cutting-edge real-world problems to meet both functional and non-functional requirements.

Responsibilities:
- Keep abreast of new and emerging technologies and assess their potential application for service offerings and products.
- Collaborate with architects and lead engineers to devise solutions that meet functional and non-functional requirements.
- Demonstrate proficiency in understanding relevant industry trends and standards.
- Showcase strong analytical and technical problem-solving skills.
- Possess experience in the Data Engineering domain.

Qualifications we are looking for:

Minimum qualifications:
- Bachelor's Degree or equivalency in CS, CE, CIS, IS, MIS, or an engineering discipline, or equivalent work experience.
- <<>> years of experience in IT.
- Familiarity with new and emerging technologies and their possible applications for service offerings and products.
- Collaboration with architects and lead engineers to develop solutions meeting functional and non-functional requirements.
- Understanding of industry trends and standards.
- Strong analytical and technical problem-solving abilities.
- Proficiency in either Python or Scala, preferably Python.
- Experience in the Data Engineering domain.

Preferred qualifications:
- Knowledge of Unity Catalog and basic governance.
- Understanding of Databricks SQL Endpoint.
- Experience with CI/CD for building Databricks job pipelines.
- Exposure to migration projects for building unified data platforms.
- Familiarity with DBT, Docker, and Kubernetes.

If you are a proactive individual with a passion for innovation and a strong commitment to continuous learning and upskilling, we invite you to apply for this exciting opportunity to join our team at Genpact.

Posted 1 week ago

Apply

5.0 - 10.0 years

6 - 8 Lacs

Hyderabad

Remote

Job Title: Senior-Level Data Engineer - Healthcare Domain
Location: Remote Option
Experience: 5+ Years
Employment Type: Full-Time

About the Role
We are looking for a Senior Data Engineer with extensive experience in healthcare data ecosystems and Databricks-based pipelines. The ideal candidate brings deep technical expertise in building large-scale data platforms, optimizing performance, and ensuring compliance with healthcare data standards (e.g., HIPAA, EDI, HCC). This role requires the ability to lead data initiatives, mentor junior engineers, and work cross-functionally with product, analytics, and compliance teams.

Key Responsibilities
- Architect, develop, and manage large-scale, secure, and high-performance data pipelines on Databricks using Spark, Delta Lake, and cloud-native tools (see the sketch after this listing).
- Design and implement healthcare-specific data models to support analytics, AI/ML, and operational reporting.
- Ingest and transform complex data types such as 837/835 claims, EHR/EMR records, provider/member files, lab results, and clinical notes.
- Lead data governance, quality, and security initiatives, ensuring compliance with HIPAA, HITECH, and organizational policies.
- Collaborate with cross-functional stakeholders to understand data needs and provide robust, scalable solutions.
- Mentor junior and mid-level engineers, performing code reviews and providing technical guidance.
- Identify performance bottlenecks and implement optimizations in Spark jobs and SQL transformations.
- Own and evolve best practices for CI/CD, version control, and deployment automation.
- Stay up to date with industry standards (e.g., FHIR, HL7, OMOP) and evaluate new tools/technologies.

Required Qualifications
- 5+ years of experience in data engineering, with 3+ years in the healthcare or life sciences domain.
- Deep expertise with Databricks, Scala, Apache Spark (preferably PySpark), and Delta Lake.
- Proficiency in SQL, Python, and data modeling (dimensional/star schema, normalized models).
- Strong command of 837/835 EDI formats, CPT/ICD-10/DRG/HCC coding, and data regulatory frameworks.
- Experience with cloud platforms such as Azure, AWS, or GCP and cloud-native data services (e.g., S3, ADLS, Glue, Data Factory).
- Familiarity with orchestration tools like Airflow, dbt, or Azure Data Factory.
- Proven ability to work in agile environments, manage stakeholder expectations, and deliver end-to-end data products.
- Experience implementing monitoring, observability, and alerting for data pipelines.
- Strong written and verbal communication skills for both technical and non-technical audiences.

Education
Bachelor's degree in Business Administration, Healthcare Informatics, Information Systems, or a related field.
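For illustration only, a minimal sketch of the kind of Databricks-style ingestion pipeline this posting describes: reading raw 835 claim extracts and landing them as a Delta table. The file paths, column names, and table layout are hypothetical, and the snippet assumes a local PySpark environment with the delta-spark package installed.

```python
# Hypothetical sketch: land raw 835 remittance extracts as a Delta table.
# Assumes pyspark and delta-spark are installed; paths and columns are illustrative.
from delta import configure_spark_with_delta_pip
from pyspark.sql import SparkSession, functions as F

builder = (
    SparkSession.builder.appName("claims-ingest")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

raw = (spark.read.option("header", True)
       .csv("/mnt/raw/claims_835/*.csv"))          # hypothetical landing path

curated = (raw
           .withColumn("ingest_ts", F.current_timestamp())
           .withColumn("claim_amount", F.col("claim_amount").cast("decimal(18,2)"))
           .dropDuplicates(["claim_id"]))           # basic de-duplication on a claim key

(curated.write.format("delta")
        .mode("append")
        .partitionBy("service_year")                # assumes a service_year column exists
        .save("/mnt/curated/claims_835"))           # hypothetical curated path
```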

Posted 1 week ago

Apply

7.0 - 12.0 years

27 - 42 Lacs

Hyderabad

Work from Office

Job Description:
1. Be a hands-on problem solver with a consultative approach who can apply Machine Learning and Deep Learning algorithms to solve business challenges
   a. Use knowledge of a wide variety of AI/ML techniques and algorithms to find what combination of these techniques can best solve the problem
   b. Improve model accuracy to deliver greater business impact
   c. Estimate the business impact due to deployment of the model
2. Work with the domain/customer teams to understand the business context and data dictionaries, and apply the relevant Deep Learning solution for the given business challenge
3. Work with tools and scripts for sufficiently pre-processing the data and feature engineering for model development - Python / R / SQL / cloud data pipelines
4. Design, develop, and deploy Deep Learning models using TensorFlow / PyTorch
5. Experience in using Deep Learning models with text, speech, image, and video data
   a. Design and develop NLP models for text classification, custom entity recognition, relationship extraction, text summarization, topic modeling, reasoning over knowledge graphs, and semantic search using NLP tools like spaCy and open-source TensorFlow, PyTorch, etc. (a small spaCy sketch follows this listing)
   b. Design and develop image recognition and video analysis models using Deep Learning algorithms and open-source tools like OpenCV
   c. Knowledge of state-of-the-art Deep Learning algorithms
6. Optimize and tune Deep Learning models for the best possible accuracy
7. Use visualization tools/modules to explore and analyze outcomes and for model validation, e.g., using Power BI / Tableau
8. Work with application teams on deploying models on cloud as a service or on-prem
   a. Deployment of models in a test/control framework for tracking
   b. Build CI/CD pipelines for ML model deployment
9. Integrate AI & ML models with other applications using REST APIs and other connector technologies
10. Constantly upskill and stay updated with the latest techniques and best practices. Write white papers and create demonstrable assets to summarize the AI/ML work and its impact.
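As a small illustration of the entity-recognition work named above, a minimal sketch using spaCy's pretrained pipeline. The model name and sample sentence are assumptions; the en_core_web_sm model must already be downloaded.

```python
# Minimal NER sketch with spaCy; assumes `python -m spacy download en_core_web_sm`
# has been run. The sample sentence is illustrative only.
import spacy

nlp = spacy.load("en_core_web_sm")

doc = nlp("Acme Bank opened a new data center in Hyderabad in March 2024.")

# Print each recognized entity with its label (ORG, GPE, DATE, ...).
for ent in doc.ents:
    print(ent.text, ent.label_)
```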

Posted 1 week ago

Apply

4.0 - 8.0 years

8 - 12 Lacs

Pune

Work from Office

Piller Soft Technology is looking for a Lead Data Engineer to join our dynamic team and embark on a rewarding career journey.
- Designing and developing data pipelines: Lead data engineers are responsible for designing and developing data pipelines that move data from various sources to storage and processing systems.
- Building and maintaining data infrastructure: Lead data engineers are responsible for building and maintaining data infrastructure, such as data warehouses, data lakes, and data marts.
- Ensuring data quality and integrity: Lead data engineers are responsible for ensuring data quality and integrity by setting up data validation processes and implementing data quality checks (a minimal example follows this listing).
- Managing data storage and retrieval: Lead data engineers are responsible for managing data storage and retrieval by designing and implementing data storage systems, such as NoSQL databases or Hadoop clusters.
- Developing and maintaining data models: Lead data engineers are responsible for developing and maintaining data models, such as data dictionaries and entity-relationship diagrams, to ensure consistency in data architecture.
- Managing data security and privacy: Lead data engineers are responsible for managing data security and privacy by implementing security measures, such as access controls and encryption, to protect sensitive data.
- Leading and managing a team: Lead data engineers may be responsible for leading and managing a team of data engineers, providing guidance and support for their work.
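A hedged sketch of the kind of data-quality check mentioned above: simple null and duplicate validations over a pandas DataFrame. The column names, sample data, and report shape are assumptions, not a prescribed framework.

```python
# Illustrative data-quality checks; column names and rules are hypothetical.
import pandas as pd

def run_quality_checks(df: pd.DataFrame, key_col: str, required_cols: list[str]) -> dict:
    """Return a small report of basic data-quality metrics."""
    report = {
        "row_count": len(df),
        "duplicate_keys": int(df[key_col].duplicated().sum()),
    }
    for col in required_cols:
        report[f"null_{col}"] = int(df[col].isna().sum())
    return report

if __name__ == "__main__":
    sample = pd.DataFrame(
        {"order_id": [1, 2, 2, 4], "amount": [10.0, None, 5.5, 7.25]}
    )
    print(run_quality_checks(sample, key_col="order_id", required_cols=["amount"]))
```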

Posted 1 week ago

Apply

3.0 - 8.0 years

13 - 18 Lacs

Bengaluru

Work from Office

Develop code and test case scenarios by applying relevant software craftsmanship principles and meet the acceptance criteria. Complete the assigned learning path and contribute to daily meetings. Deliver on all aspects of the Software Development Lifecycle (SDLC) in line with Agile and IT craftsmanship principles. Take part in team ceremonies, be they agile practices or chapter meetings. Deliver high-quality clean code and design that can be re-used. Actively work with other development teams to define and implement APIs and rules for data access. Perform bug-free release validations and produce test and defect reports. Contribute to developing scripts, configuring quality, and automating framework usage. Run and maintain test suites with the guidance of seniors. Support existing data models, data dictionary, data pipeline standards, and storage of source, process, and consumer metadata.

Profile required
- 3+ years of expertise and hands-on experience in Core Java, Python/Spark, and a good conceptual understanding of OOPs and Data Engineering
- Hands-on experience of at least 2 years with Postgres or other SQL databases
- Hands-on experience of at least 2 years with Spring Boot
- Hands-on experience of at least 2 years in web GUI development using ReactJS/AngularJS
- Hands-on experience in API development
- Prior experience working with CI/CD tools (Maven, Git, Jenkins)
- Good to have: working knowledge of cloud platforms
- Professional attitude: self-motivated, fast learner, team player, independent, with the ability to handle multiple tasks and functional topics simultaneously

Posted 1 week ago

Apply

5.0 - 10.0 years

10 - 15 Lacs

Chennai, Bengaluru

Work from Office

Job Description:
Job Title: ETL Testing
Experience: 5-8 Years
Location: Chennai, Bangalore
Employment Type: Full Time
Job Type: Work from Office (Monday - Friday)
Shift Timing: 12:30 PM to 9:30 PM

Required Skills: Analytical skills to understand requirements and develop test cases, the ability to understand and manage data, and strong SQL skills. Hands-on testing of data pipelines built using Glue, S3, Redshift, and Lambda; collaborate with developers to build automated testing where appropriate (a minimal reconciliation sketch follows this listing); understanding of data concepts like data lineage, data integrity, and quality; experience testing financial data is a plus.
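To illustrate the kind of automated pipeline test described above, a hedged sketch of a source-versus-target row-count reconciliation. SQLite stands in for Redshift purely so the example is self-contained; the table name and connection setup are assumptions.

```python
# Source-vs-target row-count reconciliation, sketched with SQLite stand-ins.
# In practice the cursors would come from the warehouse drivers (e.g. Redshift).
import sqlite3

def row_count(cursor, table: str) -> int:
    cursor.execute(f"SELECT COUNT(*) FROM {table}")
    return cursor.fetchone()[0]

def test_counts_match(src_cur, tgt_cur, table: str) -> None:
    src, tgt = row_count(src_cur, table), row_count(tgt_cur, table)
    assert src == tgt, f"{table}: source={src} rows, target={tgt} rows"

if __name__ == "__main__":
    src = sqlite3.connect(":memory:")
    tgt = sqlite3.connect(":memory:")
    for db in (src, tgt):
        db.execute("CREATE TABLE trades (id INTEGER)")
        db.executemany("INSERT INTO trades VALUES (?)", [(1,), (2,), (3,)])
    test_counts_match(src.cursor(), tgt.cursor(), "trades")
    print("row counts reconcile")
```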

Posted 1 week ago

Apply

5.0 - 8.0 years

10 - 16 Lacs

Bengaluru

Work from Office

Perform gap analysis & assess the impact of AI implementations on business processes. Develop prototypes & proof-of-concept AI solutions using tools like Python, TensorFlow, or R. Support UAT.

Required Candidate profile
Experience in AI/ML or data analytics projects; AI/ML concepts, data pipelines, statistical modeling. Proficiency in Python, R, or SQL preferred. AI tools: Azure AI, AWS SageMaker, Google AI, OpenAI.

Posted 1 week ago

Apply

8.0 - 13.0 years

25 - 40 Lacs

Hyderabad

Work from Office

Key Responsibilities
- Design conformed star and snowflake schemas; implement SCD2 dimensions and fact tables (a minimal SCD2 sketch follows this listing).
- Lead Spark (PySpark/Scala) or AWS Glue ELT pipelines from RDS Zero-ETL/S3 into Redshift.
- Tune RA3 clusters (sort/dist keys, WLM queues, Spectrum partitions) for sub-second BI queries.
- Establish data-quality, lineage, and cost-governance dashboards using CloudWatch and Terraform/CDK.
- Collaborate with Product & Analytics to translate HR KPIs into self-service data marts.
- Mentor junior engineers; drive documentation and coding standards.

Must-Have Skills
- Amazon Redshift (sort & dist keys, RA3, Spectrum)
- Spark on EMR/Glue (PySpark or Scala)
- Dimensional modelling (Kimball), star schema, SCD2
- Advanced SQL plus Python/Scala scripting
- AWS IAM, KMS, CloudWatch, Terraform/CDK, CI/CD (GitHub Actions or CodePipeline)

Nice-to-Have
- dbt, Airflow, Kinesis/Kafka, Lake Formation row-level ACLs
- GDPR / SOC 2 compliance exposure
- AWS Data Analytics or Solutions Architect certification

Education
B.E./B.Tech in Computer Science, IT, or a related field (Master's preferred but not mandatory).

Compensation & Benefits
- Competitive CTC: 25-40 LPA
- Health insurance for self & dependents

Why Join Us?
- Own a greenfield HR analytics platform with executive sponsorship.
- Modern AWS stack (Redshift RA3, Lake Formation, EMR on EKS).
- Culture of autonomy, fast decision-making, and continuous learning.

Application Process
1. 30-minute technical screen
2. 4-hour take-home Spark/SQL challenge
3. 90-minute architecture deep dive
4. Panel interview (leadership & stakeholder communication)
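A minimal sketch of the SCD Type 2 pattern referenced above, expressed in plain PySpark so it runs locally. The dimension columns, the tracked attribute, and the expire-and-append approach are assumptions for illustration, not this team's actual implementation.

```python
# Illustrative SCD Type 2 update in PySpark: expire changed rows, append new versions.
# Column names and the toy data are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("scd2-sketch").getOrCreate()

dim = spark.createDataFrame(
    [(1, "Alice", "Finance", "2023-01-01", None, True)],
    "emp_id INT, name STRING, dept STRING, valid_from STRING, valid_to STRING, is_current BOOLEAN",
)
updates = spark.createDataFrame(
    [(1, "Alice", "HR"), (2, "Bob", "Sales")],
    "emp_id INT, name STRING, dept STRING",
)
load_date = F.lit("2024-01-01")

# Keys whose tracked attribute changed: close out the current version.
changed = (dim.filter("is_current").alias("d")
              .join(updates.alias("u"), "emp_id")
              .filter(F.col("d.dept") != F.col("u.dept"))
              .select("emp_id"))

expired = (dim.join(changed, "emp_id", "left_semi")
              .withColumn("valid_to", load_date)
              .withColumn("is_current", F.lit(False)))
unchanged = dim.join(changed, "emp_id", "left_anti")

# New versions: changed employees plus brand-new ones.
new_keys = changed.union(updates.join(dim, "emp_id", "left_anti").select("emp_id"))
new_rows = (updates.join(new_keys, "emp_id", "left_semi")
                   .withColumn("valid_from", load_date)
                   .withColumn("valid_to", F.lit(None).cast("string"))
                   .withColumn("is_current", F.lit(True)))

dim_next = unchanged.unionByName(expired).unionByName(new_rows)
dim_next.orderBy("emp_id", "valid_from").show()
```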

Posted 1 week ago

Apply

10.0 - 16.0 years

50 - 100 Lacs

Bengaluru

Hybrid

Title: Principal Data Engineer
Keywords: Java | AWS | Spark | Kafka | MySQL | Elasticsearch
Office location: Bangalore, EGL - Domlur
Experience: 10 to 16 years

Responsibilities:
As a Principal Data Engineer, you will be responsible for:
- Leading the design and implementation of high-scale, cloud-native data pipelines for real-time and batch workloads (a streaming sketch follows this listing).
- Collaborating with product managers, architects, and backend teams to translate business needs into secure and scalable data solutions.
- Integrating big data frameworks (like Spark, Kafka, Flink) with cloud-native services (AWS/GCP/Azure) to support security analytics use cases.
- Driving CI/CD best practices, infrastructure automation, and performance tuning across distributed environments.
- Evaluating and piloting the use of AI/LLM technologies in data pipelines (e.g., anomaly detection, metadata enrichment, automation).
- Evaluating and integrating LLM-based automation and AI-enhanced observability into engineering workflows.
- Ensuring data security and privacy compliance.
- Mentoring engineers, ensuring high engineering standards, and promoting technical excellence across teams.

What We're Looking For (Minimum Qualifications)
- 10-16 years of experience in big data architecture and engineering, including deep proficiency with the AWS cloud platform.
- Expertise in distributed systems and frameworks such as Apache Spark, Scala, Kafka, Flink, and Elasticsearch, with experience building production-grade data pipelines.
- Strong programming skills in Java for building scalable data applications.
- Hands-on experience with ETL tools and orchestration systems.
- Solid understanding of data modeling across both relational (PostgreSQL, MySQL) and NoSQL (HBase) databases, and of performance tuning.

What Will Make You Stand Out
- Experience integrating AI/ML or LLM frameworks (e.g., LangChain, LlamaIndex) into data workflows.
- Experience implementing CI/CD pipelines with Kubernetes, Docker, and Terraform.
- Knowledge of modern data warehousing (e.g., BigQuery, Snowflake) and data governance principles (GDPR, HIPAA).
- Strong ability to translate business goals into technical architecture and mentor teams through delivery.
- Familiarity with visualization tools (Tableau, Power BI) to communicate data insights, even if not a primary responsibility.
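A minimal sketch of the real-time pipeline pattern this role describes: consuming JSON security events from Kafka with Spark Structured Streaming. It is written in PySpark for brevity; the broker address, topic name, and event schema are assumptions, and the job assumes the spark-sql-kafka connector package is available on the cluster.

```python
# Consume JSON security events from Kafka and write parsed rows to the console.
# Broker, topic, and schema are hypothetical; requires the spark-sql-kafka package.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("cyber-events-stream").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("source_ip", StringType()),
    StructField("action", StringType()),
    StructField("event_time", TimestampType()),
])

raw = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")   # hypothetical broker
       .option("subscribe", "cyber-events")                 # hypothetical topic
       .load())

events = (raw.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
             .select("e.*"))

query = (events.writeStream
         .outputMode("append")
         .format("console")
         .option("truncate", False)
         .start())
query.awaitTermination()
```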

Posted 1 week ago

Apply

8.0 - 13.0 years

0 Lacs

Pune

Work from Office

Responsibilities:
* Design, develop & maintain data pipelines using SQL, AWS & Snowflake.
* Collaborate with cross-functional teams on data warehousing projects.

Posted 1 week ago

Apply

3.0 - 8.0 years

15 - 27 Lacs

Pune, Bengaluru

Work from Office

Velotio Technologies is a product engineering company working with innovative startups and enterprises. We have provided full-stack product development for 110+ startups across the globe, building products in the cloud-native, data engineering, B2B SaaS, IoT & Machine Learning space. Our team of 400+ elite software engineers solves hard technical problems while transforming customer ideas into successful products.

Requirements
- Implement a cloud-native analytics platform with high performance and scalability.
- Build an API-first infrastructure for data in and data out.
- Build data ingestion capabilities for internal data as well as external spend data.
- Leverage data classification AI algorithms to cleanse and harmonize data.
- Own data modelling, microservice orchestration, and monitoring & alerting.
- Build solid expertise in the entire application suite and leverage this knowledge to better design application and data frameworks.
- Adhere to iterative development processes to deliver concrete value each release while driving the longer-term technical vision.
- Engage with cross-organizational teams such as Product Management, Integrations, Services, Support, and Operations to ensure the success of overall software development, implementation, and deployment.

What you will bring:
- Bachelor's degree in computer science, information systems, computer engineering, systems analysis or a related discipline, or equivalent work experience.
- 4-8 years of experience building enterprise SaaS web applications using one or more modern frameworks/technologies: Java / .Net / C, etc.
- Exposure to Python and familiarity with AI/ML-based data cleansing, deduplication, and entity resolution techniques.
- Familiarity with an MVC framework such as Django or Rails.
- Full-stack web development experience with hands-on experience building responsive UIs, Single Page Applications, and reusable components, with a keen eye for UI design and usability.
- Understanding of microservices and event-driven architecture.
- Strong knowledge of APIs and integration with the backend.
- Experience with relational SQL and NoSQL databases such as MySQL / PostgreSQL / AWS Aurora / Cassandra.
- Proven expertise in performance optimization and monitoring tools.
- Strong knowledge of cloud platforms (e.g., AWS, Azure, or GCP).
- Experience with CI/CD tooling and software delivery and bundling mechanisms.

Note: We need to fill this position soon. Please apply if your notice period is less than 15 days.

Posted 1 week ago

Apply

4.0 - 7.0 years

10 - 20 Lacs

Hyderabad, Gurugram, Bengaluru

Work from Office

Proven experience as a Data Engineer. Strong experience in SQL and ETL, Hadoop and its ecosystem. Expertise in full and incremental data loading techniques. Contact / WhatsApp: 8712691790 / navya.k@liveconnections.in *JOB IN BANGALORE*

Required Candidate profile
• Design, develop, and maintain scalable data pipelines and systems for data processing.
• Utilize Hadoop and related technologies to manage large-scale data processing.

Posted 2 weeks ago

Apply

9.0 - 14.0 years

8 - 13 Lacs

Bengaluru

Work from Office

Key Responsibilities:
- Oversee the entire data infrastructure to ensure scalability, operational efficiency, and resiliency.
- Mentor junior data engineers within the organization.
- Design, develop, and maintain data pipelines and ETL processes using Microsoft Azure services (e.g., Azure Data Factory, Azure Synapse, Azure Databricks, Azure Fabric).
- Utilize Azure data storage accounts for organizing and maintaining data pipeline outputs (e.g., Azure Data Lake Storage Gen 2 & Azure Blob Storage).
- Collaborate with data scientists, data analysts, data architects, and other stakeholders to understand data requirements and deliver high-quality data solutions.
- Optimize data pipelines in the Azure environment for performance, scalability, and reliability.
- Ensure data quality and integrity through data validation techniques and frameworks.
- Develop and maintain documentation for data processes, configurations, and best practices.
- Monitor and troubleshoot data pipeline issues to ensure timely resolution.
- Stay current with industry trends and emerging technologies to ensure our data solutions remain cutting-edge.
- Manage the CI/CD process for deploying and maintaining data solutions.

Posted 2 weeks ago

Apply

7.0 - 12.0 years

8 - 18 Lacs

Bengaluru

Hybrid

Role: Cyber Data Pipeline Engineer
Experience: 7-14 Years
Location: Bengaluru

Description / Overview
We are seeking a skilled and motivated Data Pipeline Engineer to join our team. In this role, you will manage and maintain critical data pipeline platforms that collect, transform, and transmit cyber events data to downstream platforms such as Elasticsearch and Splunk. You will be responsible for ensuring the reliability, scalability, and performance of the pipeline infrastructure while building complex integrations with cloud and on-premises cyber systems. Our key stakeholders are cyber teams including security response, investigations, and insider threat.

Role Profile
A successful applicant will contribute to several important initiatives, including:
- Collaborate with cyber teams to identify, onboard, and integrate new data sources into the platform.
- Design and implement data mapping, transformation, and routing processes to meet analytics and monitoring requirements.
- Develop automation tools that integrate with in-house developed configuration management frameworks and APIs.
- Monitor the health and performance of the data pipeline infrastructure.
- Work as a top-level escalation point to perform complex troubleshooting, working with other infrastructure teams to resolve issues.
- Create and maintain detailed documentation for pipeline architecture, processes, and integrations.

Required Skills
- Hands-on experience deploying and managing large-scale dataflow products like Cribl, Logstash, or Apache NiFi.
- Hands-on experience integrating data pipelines with cloud platforms (e.g., AWS, Azure, Google Cloud) and on-premises systems.
- Hands-on experience in developing and validating field extraction using regular expressions (a small extraction sketch follows this listing).
- A solid understanding of operating systems and networking concepts: Linux/Unix system administration, HTTP, and encryption.
- Good understanding of software version control, deployment & build tools using DevOps SDLC practices (Git, Jenkins, Jira).
- Strong analytical and troubleshooting skills.
- Excellent verbal & written communication skills.
- Appreciation of Agile methodologies, specifically Kanban.

Desired Skills
- Enterprise experience with a distributed event streaming platform like Apache Kafka, AWS Kinesis, Google Pub/Sub, or MQ.
- Infrastructure automation and integration experience, ideally using Python and Ansible.
- Familiarity with cybersecurity concepts, event types, and monitoring requirements.
- Experience in parsing and normalizing data in Elasticsearch using Elastic Common Schema (ECS).
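A small illustration of the regular-expression field extraction named in the required skills. The log format and field names are hypothetical; a real pipeline would typically express the same pattern inside a tool such as Logstash or Cribl rather than a standalone script.

```python
# Extract fields from a firewall-style log line using named groups.
# The log format and field names here are illustrative only.
import re

LOG_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)\s+"
    r"(?P<action>ALLOW|DENY)\s+"
    r"src=(?P<src_ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"dst=(?P<dst_ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"port=(?P<dst_port>\d+)"
)

line = "2024-05-01T12:30:45Z DENY src=10.0.0.5 dst=192.168.1.20 port=443"

match = LOG_PATTERN.search(line)
if match:
    event = match.groupdict()        # {'timestamp': ..., 'action': 'DENY', ...}
    event["dst_port"] = int(event["dst_port"])
    print(event)
else:
    print("line did not match the expected format")
```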

Posted 2 weeks ago

Apply

2.0 - 4.0 years

4 - 5 Lacs

Hyderabad

Work from Office

Job description
As a key member of the data team, you will be responsible for developing and managing business intelligence solutions, creating data visualizations, and providing actionable insights to support decision-making processes. You will work closely with stakeholders to understand business requirements, translate them into technical tasks, and develop robust data analytics and BI solutions while ensuring high-quality deliverables and client satisfaction.

Desired Skills and Experience

Essential skills
- 3-4 years of experience in the development and deployment of high-performance, complex Tableau dashboards.
- Experienced in writing complex Tableau calculations and using Alteryx transformations to arrive at the desired solution.
- SQL skills and solid expertise in database design principles and data warehousing concepts.
- In-depth knowledge of data platforms like SQL Server and similar platforms.
- Strong understanding of data models in business intelligence (BI) tools such as Tableau and other analytics tools.
- Problem-solving skills with a track record of resolving complex technical issues.
- Exposure to Tableau Server and ETL concepts.
- Tableau and Alteryx certifications in data-related fields are preferred.
- Strong communication skills, both written and oral, with business and financial aptitude.
- Design, develop, and maintain Alteryx workflows for data processing and reporting.
- Extract data from SAP systems using DVW Alteryx Connectors and other integration tools.
- Perform ETL operations using Alteryx and manage datasets within MS SQL Server.
- Collaborate with stakeholders to understand data requirements and deliver actionable insights.
- Create and maintain documentation for workflows, processes, and data pipelines.
- Independently manage assigned tasks and ensure timely delivery.

Education: Bachelor's or Master's in science or engineering disciplines (Computer Science, Engineering, Maths, Physics, etc.)

Key Responsibilities
- Develop impactful and self-serving Tableau dashboards and visualizations to support business functions.
- Conduct in-depth data analysis, extracting insights and trends from the data warehouse using SQL.
- Fulfill ad-hoc data requests and create insightful reports for business stakeholders.
- Contribute to data documentation and data quality efforts from a reporting perspective.
- Translate business questions into data-driven answers and actionable recommendations.
- Optimize BI solutions for performance and scalability.
- Incorporate feedback from clients and continuously improve models, dashboards, and processes.
- Create and maintain comprehensive documentation covering Data Fabric architecture, processes, and procedures.
- Strategize and ideate the solution design; develop visual mockups, storyboards, flow diagrams, wireframes, and interactive prototypes.
- Comfortable working with large, complex datasets and conducting data quality checks, validations, and reconciliations.

Key Skills
Tableau Desktop, Tableau Server, Alteryx Designer, Alteryx Server, Tableau Reporting, Dashboard Visualization, Data Warehousing, Data Pipelines, MS SQL Server, ETL, Business Analysis, Agile Methodology, Data Management, Troubleshooting, Data Modelling, Digital Transformation, Data Integration

Key Metrics
- Strong hands-on experience in solutioning and deploying analytics visualization solutions using Tableau.
- Strong hands-on knowledge of MS SQL Server, data warehousing, and data pipelines.
- Exposure to Tableau Server and Alteryx Server.
- Tableau Desktop certified or Alteryx Designer certified is a plus.

Posted 2 weeks ago

Apply

8.0 - 12.0 years

35 - 45 Lacs

Hyderabad, Bengaluru

Work from Office

We are seeking a hands-on and forward-thinking Principal or Lead Engineer with deep expertise in Java-based backend development and a strong grasp of Generative AI technologies. You will lead the design, development, and deployment of Gen AI-based solutions, working across data, ML engineering, and software engineering teams to integrate AI capabilities into our core platforms and client-facing products.

Responsibilities
- Lead end-to-end architecture and implementation of Generative AI solutions integrated with Java-based applications.
- Evaluate, fine-tune, and deploy foundational and LLM models (e.g., GPT, LLaMA, Claude, Gemini) for use cases such as code generation, summarization, intelligent assistants, etc.
- Collaborate with engineers and practice leads to identify and scope high-impact Gen AI use cases.
- Build solutions based on AI models together with engineers.
- Mentor and guide junior engineers and set technical direction across AI initiatives.
- Build scalable APIs and microservices in Java/Spring Boot that interact with AI models.
- Optimize performance and cost of AI solutions in production using prompt engineering, retrieval-augmented generation (RAG), caching, and model selection (a minimal retrieval sketch follows this listing).
- Contribute to AI model experimentation and evaluation pipelines (OpenAI, Hugging Face, LangChain, etc.).
- Drive adoption of Gen AI best practices (e.g., guardrails, ethical AI, observability, and feedback loops).

Requirements
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 10+ years of experience in Java/J2EE/Spring Boot and backend architecture.
- 3+ years of experience working on ML/AI/Gen AI systems, including hands-on work with LLM APIs or open-source models.
- Strong knowledge of modern Gen AI frameworks like LangChain, LlamaIndex, and vector DBs (e.g., FAISS, Pinecone, Chroma).
- Experience integrating with LLMs via APIs (OpenAI, Azure OpenAI, Hugging Face) or self-hosted models.
- Working knowledge of Python for AI model orchestration and prototyping.
- Solid understanding of data pipelines, REST APIs, containerization (Docker, Kubernetes), and CI/CD workflows.
- Experience with AWS, GCP, or Azure AI/ML services.

Nice to have
- Familiarity with prompt engineering and fine-tuning LLMs using techniques like LoRA or PEFT.
- Experience building RAG-based chatbots, copilots, or AI-powered developer tools.
- Contributions to AI communities, research, or open-source Gen AI projects.
- Strong communication and stakeholder management skills.
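A toy sketch of the retrieval step in a RAG flow, as mentioned above. TF-IDF similarity stands in for a real embedding model and vector database, and the documents, query, and prompt wording are all illustrative assumptions.

```python
# Toy retrieval step of a RAG flow: rank a small document set against a query.
# TF-IDF stands in for a real embedding model and vector DB; documents are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Refunds are processed within 5 business days of approval.",
    "Our API rate limit is 100 requests per minute per key.",
    "Support is available 24x7 via chat and email.",
]
query = "How long do refunds take?"

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(documents)
query_vec = vectorizer.transform([query])

scores = cosine_similarity(query_vec, doc_matrix)[0]
best = scores.argmax()

# The top-ranked passage would then be inserted into the LLM prompt as context.
context = documents[best]
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```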

Posted 2 weeks ago

Apply

8.0 - 10.0 years

3 - 7 Lacs

Bengaluru

Work from Office

Must have:
- Strong in programming languages like Python and Java
- Hands-on experience with one cloud (GCP preferred)
- Experience working with Docker
- Environment management (e.g., venv, pip, poetry, etc.)
- Experience with orchestrators like Vertex AI Pipelines, Airflow, etc.
- Understanding of the full ML cycle end-to-end
- Data engineering and feature engineering techniques
- Experience with ML modelling and evaluation metrics
- Experience with TensorFlow, PyTorch, or another framework
- Experience with model monitoring
- Advanced SQL knowledge
- Awareness of streaming concepts like windowing, late arrival, triggers, etc. (a windowing sketch follows this listing)

Good to have:
- Hyperparameter tuning experience.
- Proficiency in either Apache Spark, Apache Beam, or Apache Flink.
- Hands-on experience with distributed computing.
- Working experience in data architecture design.
- Awareness of storage and compute options and when to choose what.
- Good understanding of cluster optimisation / pipeline optimisation strategies.
- Exposure to GCP tools to develop end-to-end data pipelines for various scenarios (including ingesting data from traditional databases as well as integration of API-based data sources).
- Business mindset to understand data and how it will be used for BI and analytics purposes.
- Working experience with CI/CD pipelines, deployment methodologies, and Infrastructure as Code (e.g., Terraform).
- Hands-on experience with Kubernetes.
- Vector databases like Qdrant.
- LLM experience (embeddings generation, embeddings indexing, RAG, Agents, etc.).

Key Responsibilities:
- Design, develop, and implement AI models and algorithms using Python and Large Language Models (LLMs).
- Collaborate with data scientists, engineers, and business stakeholders to define project requirements and deliver impactful AI-driven solutions.
- Optimize and manage data pipelines, ensuring efficient data storage and retrieval with PostgreSQL.
- Continuously research emerging AI trends and best practices to enhance model performance and capabilities.
- Deploy, monitor, and maintain AI applications in production environments, adhering to best industry standards.
- Document technical designs, workflows, and processes to facilitate clear knowledge transfer and project continuity.
- Communicate technical concepts effectively to both technical and non-technical team members.

Required Skills and Qualifications:
- Proven expertise in Python programming for AI/ML applications
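To illustrate the windowing and late-arrival concepts listed above, a minimal Spark Structured Streaming sketch using the built-in rate source. The window size and watermark threshold are arbitrary choices for the example.

```python
# Windowed aggregation with a watermark so events arriving more than 30 seconds
# late are dropped. Uses the built-in "rate" source so the sketch is self-contained.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("windowing-sketch").getOrCreate()

events = (spark.readStream.format("rate")
          .option("rowsPerSecond", 5)
          .load())                                    # columns: timestamp, value

counts = (events
          .withWatermark("timestamp", "30 seconds")   # tolerate 30s of lateness
          .groupBy(F.window("timestamp", "10 seconds"))
          .agg(F.count("*").alias("events"),
               F.sum("value").alias("value_sum")))

query = (counts.writeStream
         .outputMode("update")                        # emit windows as they change
         .format("console")
         .option("truncate", False)
         .start())
query.awaitTermination()
```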

Posted 2 weeks ago

Apply

2.0 - 6.0 years

3 - 7 Lacs

Gurugram

Work from Office

We are looking for a PySpark Developer who loves solving complex problems across a full spectrum of technologies. You will help ensure our technological infrastructure operates seamlessly in support of our business objectives.

Responsibilities
- Develop and maintain data pipelines implementing ETL processes.
- Take responsibility for Hadoop development and implementation.
- Work closely with a data science team implementing data analytic pipelines.
- Help define data governance policies and support data versioning processes.
- Maintain security and data privacy, working closely with the Data Protection Officer internally.
- Analyse a vast number of data stores and uncover insights.

Skillset Required
- Ability to design, build, and unit test applications in PySpark.
- Experience with Python development and Python data transformations.
- Experience with SQL scripting on one or more platforms - Hive, Oracle, PostgreSQL, MySQL, etc.
- In-depth knowledge of Hadoop, Spark, and similar frameworks.
- Strong knowledge of Data Management principles.
- Experience with normalizing/de-normalizing data structures, and developing tabular, dimensional, and other data models.
- Knowledge of YARN, clusters, executors, and cluster configuration.
- Hands-on work with different file formats like JSON, Parquet, CSV, etc. (a small read/write sketch follows this listing).
- Experience with the CLI on Linux-based platforms.
- Experience analysing current ETL/ELT processes and defining and designing new processes.
- Experience analysing business requirements in a BI/Analytics context and designing data models to transform raw data into meaningful insights.
- Good to have: knowledge of data visualization.
- Experience in processing large amounts of structured and unstructured data, including integrating data from multiple sources.
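A brief sketch of the file-format handling mentioned above: reading JSON and writing partitioned Parquet with PySpark. The paths, column names, and partition key are hypothetical placeholders.

```python
# Read line-delimited JSON, apply a light transformation, and write partitioned Parquet.
# Input/output paths and columns are illustrative placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("file-formats-sketch").getOrCreate()

orders = spark.read.json("/data/raw/orders/*.json")       # hypothetical input path

cleaned = (orders
           .withColumn("order_date", F.to_date("order_ts"))
           .filter(F.col("amount") > 0))

(cleaned.write.mode("overwrite")
        .partitionBy("order_date")
        .parquet("/data/curated/orders"))                  # hypothetical output path

# Reading it back for downstream consumers:
curated = spark.read.parquet("/data/curated/orders")
curated.printSchema()
```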

Posted 2 weeks ago

Apply

4.0 - 9.0 years

4 - 9 Lacs

Gurugram

Work from Office

As a Mid Databricks Engineer, you will play a pivotal role in designing, implementing, and optimizing data processing pipelines and analytics solutions on the Databricks platform. You will collaborate closely with cross-functional teams to understand business requirements, architect scalable solutions, and ensure the reliability and performance of our data infrastructure. This role requires deep expertise in Databricks, strong programming skills, and a passion for solving complex engineering challenges.

What you'll do:
- Design and develop data processing pipelines and analytics solutions using Databricks.
- Architect scalable and efficient data models and storage solutions on the Databricks platform.
- Collaborate with architects and other teams to migrate the current solution to Databricks.
- Optimize performance and reliability of Databricks clusters and jobs to meet SLAs and business requirements.
- Use best practices for data governance, security, and compliance on the Databricks platform.
- Mentor junior engineers and provide technical guidance.
- Stay current with emerging technologies and trends in data engineering and analytics to drive continuous improvement.

You'll be expected to have:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5 to 8 years of overall experience and 2+ years of experience designing and implementing data solutions on the Databricks platform.
- Proficiency in programming languages such as Python, Scala, or SQL.
- Strong understanding of distributed computing principles and experience with big data technologies such as Apache Spark.
- Experience with cloud platforms such as AWS, Azure, or GCP, and their associated data services.
- Proven track record of delivering scalable and reliable data solutions in a fast-paced environment.
- Excellent problem-solving skills and attention to detail.
- Strong communication and collaboration skills with the ability to work effectively in cross-functional teams.
- Good to have: experience with containerization technologies such as Docker and Kubernetes.
- Knowledge of DevOps practices for automated deployment and monitoring of data pipelines.

Posted 2 weeks ago

Apply

6.0 - 9.0 years

9 - 13 Lacs

Gurugram

Work from Office

Experience: 6+ years as an Azure Data Engineer, including at least one end-to-end implementation in Microsoft Fabric.

Responsibilities:
- Lead the design and implementation of Microsoft Fabric-centric data platforms and data warehouses.
- Develop and optimize ETL/ELT processes within the Microsoft Azure ecosystem, effectively utilizing relevant Fabric solutions.
- Ensure data integrity, quality, and governance throughout the Microsoft Fabric environment.
- Collaborate with stakeholders to translate business needs into actionable data solutions.
- Troubleshoot and optimize existing Fabric implementations for enhanced performance.

Skills:
- Solid foundational knowledge in data warehousing, ETL/ELT processes, and data modeling (dimensional, normalized).
- Design and implement scalable and efficient data pipelines using Data Factory (Data Pipeline, Dataflow Gen 2, etc.) in Fabric, PySpark notebooks, Spark SQL, and Python. This includes data ingestion, data transformation, and data loading processes.
- Experience ingesting data from SAP systems like SAP ECC/S4HANA/SAP BW, etc., will be a plus.
- Nice to have: the ability to develop dashboards or reports using tools like Power BI.

Coding Fluency:
- Proficiency in SQL, Python, or other languages for data scripting, transformation, and automation.

Posted 2 weeks ago

Apply

3.0 - 7.0 years

12 - 15 Lacs

Hyderabad, Bengaluru, Delhi / NCR

Work from Office

We are looking for an experienced Data Engineer/BI Developer with strong hands-on expertise in Microsoft Fabric technologies, including OneLake, Lakehouse, Data Lake, Warehouse, and Real-Time Analytics, along with proven skills in Power BI, Azure Synapse Analytics, and Azure Data Factory (ADF). The ideal candidate should also possess working knowledge of DevOps practices for data engineering and deployment automation.

Key Responsibilities:
- Design and implement scalable data solutions using Microsoft Fabric components: OneLake, Data Lake, Lakehouse, Warehouse, and Real-Time Analytics.
- Build and manage end-to-end data pipelines integrating structured and unstructured data from multiple sources.
- Integrate Microsoft Fabric with Power BI, Synapse Analytics, and Azure Data Factory to enable modern data analytics solutions.
- Develop and maintain Power BI datasets, dashboards, and reports using data from Fabric Lakehouses or Warehouses.
- Implement data governance, security, and compliance policies within the Microsoft Fabric ecosystem.
- Collaborate with stakeholders for requirements gathering, data modeling, and performance tuning.
- Leverage Azure DevOps / Git for version control, CI/CD pipelines, and deployment automation of data artifacts.
- Monitor, troubleshoot, and optimize data flows and transformations for performance and reliability.

Required Skills:
- 3-8 years of experience in data engineering, BI development, or similar roles.
- Strong hands-on experience with the Microsoft Fabric ecosystem: OneLake, Data Lake, Lakehouse, Warehouse, Real-Time Analytics.
- Proficient in Power BI for interactive reporting and visualization.
- Experience with Azure Synapse Analytics, ADF (Azure Data Factory), and related Azure services.
- Good understanding of data modeling, SQL, T-SQL, and Spark/Delta Lake concepts.
- Working knowledge of DevOps tools and CI/CD processes for data deployment (Azure DevOps preferred).
- Familiarity with DataOps and version control practices for data solutions.

Preferred Qualifications:
- Microsoft certifications (e.g., DP-203, PL-300, or Microsoft Fabric certifications) are a plus.
- Experience with Python, Notebooks, or KQL for Real-Time Analytics is advantageous.
- Knowledge of data governance tools (e.g., Microsoft Purview) is a plus.

Location: Remote - Bengaluru, Hyderabad, Delhi / NCR, Chennai, Pune, Kolkata, Ahmedabad, Mumbai

Posted 2 weeks ago

Apply

3.0 - 6.0 years

5 - 8 Lacs

Gurugram

Work from Office

About the job:
As a Mid Databricks Engineer, you will play a pivotal role in designing, implementing, and optimizing data processing pipelines and analytics solutions on the Databricks platform. You will collaborate closely with cross-functional teams to understand business requirements, architect scalable solutions, and ensure the reliability and performance of our data infrastructure. This role requires deep expertise in Databricks, strong programming skills, and a passion for solving complex engineering challenges.

What You'll Do:
- Design and develop data processing pipelines and analytics solutions using Databricks.
- Architect scalable and efficient data models and storage solutions on the Databricks platform.
- Collaborate with architects and other teams to migrate the current solution to Databricks.
- Optimize performance and reliability of Databricks clusters and jobs to meet SLAs and business requirements.
- Use best practices for data governance, security, and compliance on the Databricks platform.
- Mentor junior engineers and provide technical guidance.
- Stay current with emerging technologies and trends in data engineering and analytics to drive continuous improvement.

You'll Be Expected To Have:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 3 to 6 years of overall experience and 2+ years of experience designing and implementing data solutions on the Databricks platform.
- Proficiency in programming languages such as Python, Scala, or SQL.
- Strong understanding of distributed computing principles and experience with big data technologies such as Apache Spark.
- Experience with cloud platforms such as AWS, Azure, or GCP, and their associated data services.
- Proven track record of delivering scalable and reliable data solutions in a fast-paced environment.
- Excellent problem-solving skills and attention to detail.
- Strong communication and collaboration skills with the ability to work effectively in cross-functional teams.
- Good to have: experience with containerization technologies such as Docker and Kubernetes.
- Knowledge of DevOps practices for automated deployment and monitoring of data pipelines.

Posted 2 weeks ago

Apply

3.0 - 6.0 years

12 - 16 Lacs

Thiruvananthapuram

Work from Office

AWS Cloud Services (Glue, Lambda, Athena, Lakehouse)
AWS CDK for Infrastructure-as-Code (IaC) with TypeScript
Data pipeline development & orchestration using AWS Glue
Strong programming skills in Python, PySpark, Spark SQL, TypeScript

Required Candidate profile
3 to 5 years of client-facing and team leadership experience. Candidates have to work with UK clients; work timings will be aligned with the client's requirements and may follow UK time zones.

Posted 2 weeks ago

Apply

7.0 - 10.0 years

9 - 12 Lacs

Hyderabad

Hybrid

Responsibilities of the Candidate:
- Be responsible for the design and development of big data solutions. Partner with domain experts, product managers, analysts, and data scientists to develop big data pipelines in Hadoop.
- Be responsible for moving all legacy workloads to a cloud platform.
- Work with data scientists to build Client pipelines using heterogeneous sources and provide engineering services for PySpark data science applications.
- Ensure automation through CI/CD across platforms, both in the cloud and on-premises.
- Define needs around maintainability, testability, performance, security, quality, and usability for the data platform.
- Drive implementation, consistent patterns, reusable components, and coding standards for data engineering processes.
- Convert SAS-based pipelines into languages like PySpark and Scala to execute on Hadoop and non-Hadoop ecosystems.
- Tune big data applications on Hadoop and non-Hadoop platforms for optimal performance.
- Apply an in-depth understanding of how data analytics collectively integrate within the sub-function, as well as coordinate and contribute to the objectives of the entire function.
- Produce a detailed analysis of issues where the best course of action is not evident from the information available, but actions must be recommended/taken.
- Assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients, and assets, by driving compliance with applicable laws, rules, and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct, and business practices, and escalating, managing, and reporting control issues with transparency.

Requirements:
- 6+ years of total IT experience.
- 3+ years of experience with Hadoop (Cloudera)/big data technologies.
- Knowledge of the Hadoop ecosystem and Big Data technologies; hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, Solr).
- Experience in designing and developing data pipelines for data ingestion or transformation using Java, Scala, or Python.
- Experience with Spark programming (PySpark, Scala, or Java).
- Hands-on experience with Python/PySpark/Scala and basic libraries for machine learning is required.
- Proficiency in programming in Java or Python, with prior Apache Beam/Spark experience a plus.
- Hands-on experience in CI/CD, scheduling, and scripting.
- Ensure automation through CI/CD across platforms, both in the cloud and on-premises.
- System-level understanding: data structures, algorithms, distributed storage & compute.
- Can-do attitude on solving complex business problems, good interpersonal and teamwork skills.

Posted 2 weeks ago

Apply