
252 ETL Pipelines Jobs - Page 7

Set up a Job Alert
JobPe aggregates listings for easy access, but you apply directly on the original job portal.

5.0 - 7.0 years

9 - 11 Lacs

Hyderabad

Work from Office

Role: PySpark Developer
Locations: Multiple
Work Mode: Hybrid
Interview Mode: Virtual (2 Rounds)
Type: Contract-to-Hire (C2H)

Job Summary
We are looking for a skilled PySpark Developer with hands-on experience in building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark, Python, and modern data engineering tools in cloud environments such as AWS.

Key Skills & Responsibilities
- Strong expertise in PySpark and Apache Spark for batch and real-time data processing.
- Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation (see the sketch below).
- Proficiency in Python for scripting, automation, and building reusable components.
- Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows.
- Familiarity with the AWS ecosystem, especially S3 and related file system operations.
- Strong understanding of Unix/Linux environments and shell scripting.
- Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks.
- Ability to handle CDC (Change Data Capture) operations on large datasets.
- Experience in performance tuning, optimizing Spark jobs, and troubleshooting.
- Strong knowledge of data modeling, data validation, and writing unit test cases.
- Exposure to real-time and batch integration with downstream/upstream systems.
- Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging.
- Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git).

Preferred Skills
- Experience in building or integrating APIs for data provisioning.
- Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView.
- Familiarity with AI/ML model development using PySpark in cloud environments.
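
For illustration, a minimal sketch of the kind of batch ETL this role describes: ingest raw files from S3 with PySpark, apply transformations and basic validation, and write partitioned output. Bucket names, paths, columns, and the validation rules are hypothetical, not part of the posting.

```python
# Minimal illustrative sketch of a batch ETL job; names and paths are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Ingest: read raw CSV landed in S3
raw = spark.read.option("header", True).csv("s3a://example-raw-bucket/orders/")

# Transform: type-cast, derive a partition column, drop obvious bad rows
orders = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("order_id").isNotNull() & (F.col("amount") > 0))
)

# Validate: fail fast if the load is empty or has duplicate keys
if orders.count() == 0:
    raise ValueError("No valid rows ingested")
if orders.groupBy("order_id").count().filter("count > 1").count() > 0:
    raise ValueError("Duplicate order_id values found")

# Load: write partitioned Parquet back to the curated zone
(orders.write.mode("overwrite")
       .partitionBy("order_date")
       .parquet("s3a://example-curated-bucket/orders/"))
```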

Posted 1 month ago

Apply

4.0 - 9.0 years

4 - 9 Lacs

Gurgaon / Gurugram, Haryana, India

On-site

Responsibilities
Data Pipeline Engineers are expected to be involved from the inception of projects: understand requirements, then architect, develop, deploy, and maintain data pipelines (ETL/ELT). Typically they work in a multi-disciplinary squad (we follow Agile!), partnering with program and product managers to expand the product offering based on business demands. Design is an iterative process, whether for UX, services, or infrastructure. Our goal is to drive up user engagement and adoption of the platform while constantly working towards modernizing and improving platform performance and scalability. Deployment and maintenance require close interaction with various teams, which means maintaining a positive and collaborative working relationship with teams within DOE as well as with the wider Aladdin developer community. Production support for applications is usually required for issues that cannot be resolved by the operations team. Creative and inventive problem-solving skills for reduced turnaround times are highly valued. Preparing user documentation to maintain both development and operations continuity is integral to the role.

An ideal candidate would have
- At least 4+ years of experience as a data engineer
- Experience in SQL, Sybase, and Linux (a must)
- Experience coding in two of these languages for server-side/data processing: Java, Python, C++
- 2+ years of experience using a modern data stack (Spark, Snowflake, BigQuery, etc.) on cloud platforms (Azure, GCP, AWS)
- Experience building ETL/ELT pipelines for complex data engineering projects (Airflow, dbt, or Great Expectations would be a plus)
- Experience with database modeling and normalization techniques
- Experience with object-oriented design patterns
- Experience with DevOps tools like Git, Maven, Jenkins, GitLab CI, Azure DevOps
- Experience with Agile development concepts and related tools
- Ability to troubleshoot and fix performance issues across the codebase and database queries
- Excellent written and verbal communication skills
- Ability to operate in a fast-paced environment
- Strong interpersonal skills with a can-do attitude under challenging circumstances
- BA/BS or equivalent practical experience

Skills that would be a plus
- Perl, ETL tools (Informatica, Talend, dbt, etc.)
- Experience with Snowflake or other cloud data warehousing products
- Exposure to workflow management tools such as Airflow
- Exposure to messaging platforms such as Kafka
- Exposure to NoSQL platforms such as Cassandra, MongoDB
- Building and delivering REST APIs

Posted 1 month ago

Apply

11.0 - 17.0 years

20 - 35 Lacs

Indore, Hyderabad

Work from Office

Greetings of the day!

We have a job opening for Microsoft Fabric + ADF with one of our clients. If you are interested in this position, please share your updated resume at shaswati.m@bct-consulting.com.

Primary Skill: Microsoft Fabric
Secondary Skill: Azure Data Factory (ADF)

- 12+ years of experience in Microsoft Azure data engineering for analytical projects.
- Proven expertise in designing, developing, and deploying high-volume, end-to-end ETL pipelines for complex models, including batch and real-time data integration frameworks using Azure, Microsoft Fabric, and Databricks.
- Extensive hands-on experience with Azure Data Factory, Databricks (with Unity Catalog), Azure Functions, Synapse Analytics, Data Lake, Delta Lake, and Azure SQL Database for managing and processing large-scale data integrations.
- Experience in Databricks cluster optimization and workflow management to ensure cost-effective and high-performance processing.
- Sound knowledge of data modelling, data governance, data quality management, and data modernization processes.
- Develop architecture blueprints and technical design documentation for Azure-based data solutions.
- Provide technical leadership and guidance on cloud architecture best practices, ensuring scalable and secure solutions.
- Keep abreast of emerging Azure technologies and recommend enhancements to existing systems.
- Lead proofs of concept (PoCs) and adopt agile delivery methodologies for solution development and delivery.

Posted 1 month ago

Apply

4.0 - 8.0 years

12 - 18 Lacs

Hyderabad, Chennai, Coimbatore

Hybrid

We are seeking a skilled and motivated Data Engineer to join our dynamic team. The ideal candidate will have experience in designing, developing, and maintaining scalable data pipelines and architectures using Hadoop, PySpark, ETL processes, and cloud technologies.

Responsibilities:
- Design, develop, and maintain data pipelines for processing large-scale datasets.
- Build efficient ETL workflows to transform and integrate data from multiple sources.
- Develop and optimize Hadoop and PySpark applications for data processing.
- Ensure data quality, governance, and security standards are met across systems.
- Implement and manage cloud-based data solutions (AWS, Azure, or GCP).
- Collaborate with data scientists and analysts to support business intelligence initiatives.
- Troubleshoot performance issues and optimize query executions in big data environments.
- Stay updated with industry trends and advancements in big data and cloud technologies.

Required Skills:
- Strong programming skills in Python, Scala, or Java.
- Hands-on experience with the Hadoop ecosystem (HDFS, Hive, Spark, etc.).
- Expertise in PySpark for distributed data processing.
- Proficiency in ETL tools and workflows (SSIS, Apache NiFi, or custom pipelines).
- Experience with cloud platforms (AWS, Azure, GCP) and their data-related services.
- Knowledge of SQL and NoSQL databases.
- Familiarity with data warehousing concepts and data modeling techniques.
- Strong analytical and problem-solving skills.

Interested candidates can reach us at +91 7305206696 / saranyadevib@talentien.com

Posted 1 month ago

Apply

5.0 - 10.0 years

7 - 17 Lacs

Chandigarh

Remote

We are looking for a skilled and motivated Data Engineer with deep expertise in GCP, BigQuery, and Apache Airflow to join our data platform team. The ideal candidate should have hands-on experience building scalable data pipelines, automating workflows, migrating large-scale datasets, and optimizing distributed systems, along with experience building web APIs using Python. This role will play a key part in designing and maintaining robust data engineering solutions across cloud and on-prem environments.

Key Responsibilities
BigQuery & Cloud Data Pipelines:
- Design and implement scalable ETL pipelines for ingesting large-scale datasets.
- Build solutions for efficient querying of tables in BigQuery.
- Automate scheduled data ingestion using Google Cloud services and scheduled Apache Airflow DAGs.
Airflow DAG Development & Automation:
- Build dynamic and configurable DAGs using JSON-based input that can be reused across multiple data processes (see the sketch after this listing).
- Create DAGs for data migration to/from BigQuery and external systems (SFTP, SharePoint, email, etc.).
- Develop custom Airflow operators to meet business needs.
Data Security & Encryption:
- Build secure data pipelines with end-to-end encryption for external data exports and imports.
Data Migration & Integration:
- Data migration and replication across various systems, including Salesforce, MySQL, SQL Server, and BigQuery.

Required Skills & Qualifications:
- Strong hands-on experience with Google BigQuery, Apache Airflow, and cloud storage (GCS/S3).
- Deep understanding of ETL/ELT concepts, data partitioning, and pipeline scheduling.
- Proven ability to automate complex workflows and build reusable pipeline frameworks.
- Programming knowledge in Python, SQL, and scripting for automation.
- Hands-on experience building web APIs/applications using Python.
- Familiarity with cloud platforms (GCP/AWS) and distributed computing frameworks.
- Strong problem-solving, analytical thinking, and debugging skills.
- Basic understanding of object-oriented fundamentals.
- Working knowledge of version control tools such as GitLab and Bitbucket.

Industry Knowledge & Experience:
- Deep expertise in Google BigQuery, Apache Airflow, Python, and SQL.
- Experience with BI tools such as DOMO, Looker, and Tableau.
- Working knowledge of Salesforce and data extraction methods.
- Prior experience working with data encryption, SFTP automation, or ad-hoc data requests.
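
A hedged sketch of the "dynamic and configurable DAGs using JSON-based input" responsibility, assuming Airflow 2.x: one DAG whose tasks are generated from a JSON config. The DAG id, schedule, table names, and config shape are hypothetical.

```python
# Hedged sketch: a configurable DAG driven by JSON input. Assumes Airflow 2.4+
# (use schedule_interval instead of schedule on older 2.x releases).
import json
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# In practice this JSON might be read from a GCS object or an Airflow Variable.
CONFIG = json.loads("""
{
  "dag_id": "bq_ingest_example",
  "schedule": "0 2 * * *",
  "tables": ["orders", "customers"]
}
""")

def load_table(table_name: str, **context):
    # Placeholder task body: load one table from GCS into BigQuery.
    print(f"Loading {table_name} for run {context['ds']}")

with DAG(
    dag_id=CONFIG["dag_id"],
    schedule=CONFIG["schedule"],
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    # One task per configured table, generated from the JSON config.
    for table in CONFIG["tables"]:
        PythonOperator(
            task_id=f"load_{table}",
            python_callable=load_table,
            op_kwargs={"table_name": table},
        )
```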

Posted 1 month ago

Apply

4.0 - 7.0 years

15 - 30 Lacs

Hyderabad

Hybrid

What are the ongoing responsibilities of this Data Engineer?
We are building a growing Data and AI team. You will play a critical role in the effort to centralize structured and unstructured data for the firm. We seek a candidate with skills in data modeling, data management, and data governance who can contribute first-hand to the firm's data strategy. The ideal candidate is a self-starter with a strong technical foundation, a collaborative mindset, and the ability to navigate complex data challenges.

What qualifications, skills, and experience would help someone be successful?
- Bachelor's degree in computer science or computer applications, or equivalent experience in lieu of a degree with 3 years of industry experience.
- Strong expertise in data modeling and data management concepts; experience in implementing master data management is preferred.
- Sound knowledge of Snowflake and data warehousing techniques.
- Experience in building, optimizing, and maintaining data pipelines and data management frameworks to support business needs.
- Proficiency in at least one programming language, preferably Python.
- Collaborate with cross-functional teams to translate business needs into scalable data and AI-driven solutions.
- Take ownership of projects from ideation to production, operating in a startup-like culture within an enterprise environment.
- Excellent communication, collaboration, and an ownership mindset.
- Foundational knowledge of API development and integration.
- Knowledge of Tableau and Alteryx is good to have.

Work Shift Timings: 2:00 PM - 11:00 PM IST

Posted 1 month ago

Apply

5.0 - 8.0 years

22 - 30 Lacs

Noida, Hyderabad, Bengaluru

Hybrid

Role: Data Engineer
Experience: 5 to 8 years
Location: Bangalore, Noida, and Hyderabad (Hybrid; 2 days per week in office required)
Notice Period: Immediate to 15 days (only immediate joiners will be considered)
Note: Candidates must have experience in Python, Kafka Streams, PySpark, and Azure Databricks. Candidates with experience only in PySpark and not in Python will not be considered.

Job Title: SSE - Kafka, Python, and Azure Databricks (Healthcare Data Project)

Role Overview:
We are looking for a highly skilled engineer with expertise in Kafka, Python, and Azure Databricks (preferred) to drive our healthcare data engineering projects. The ideal candidate will have deep experience in real-time data streaming, cloud-based data platforms, and large-scale data processing. This role requires strong technical leadership, problem-solving abilities, and the ability to collaborate with cross-functional teams.

Key Responsibilities:
- Lead the design, development, and implementation of real-time data pipelines using Kafka, Python, and Azure Databricks (see the consumer sketch below).
- Architect scalable data streaming and processing solutions to support healthcare data workflows.
- Develop, optimize, and maintain ETL/ELT pipelines for structured and unstructured healthcare data.
- Ensure data integrity, security, and compliance with healthcare regulations (HIPAA, HITRUST, etc.).
- Collaborate with data engineers, analysts, and business stakeholders to understand requirements and translate them into technical solutions.
- Troubleshoot and optimize Kafka streaming applications, Python scripts, and Databricks workflows.
- Mentor junior engineers, conduct code reviews, and ensure best practices in data engineering.
- Stay updated with the latest cloud technologies, big data frameworks, and industry trends.

Required Skills & Qualifications:
- 4+ years of experience in data engineering, with strong proficiency in Kafka and Python.
- Expertise in Kafka Streams, Kafka Connect, and Schema Registry for real-time data processing.
- Experience with Azure Databricks (or willingness to learn and adopt it quickly).
- Hands-on experience with cloud platforms (Azure preferred; AWS or GCP is a plus).
- Proficiency in SQL, NoSQL databases, and data modeling for big data processing.
- Knowledge of containerization (Docker, Kubernetes) and CI/CD pipelines for data applications.
- Experience working with healthcare data (EHR, claims, HL7, FHIR, etc.) is a plus.
- Strong analytical skills, a problem-solving mindset, and the ability to lead complex data projects.
- Excellent communication and stakeholder management skills.

Email: Sam@hiresquad.in
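
For illustration, a minimal sketch of the real-time ingestion side of this role, assuming the confluent-kafka Python client: consume events from a Kafka topic, validate them, and hand them to a downstream sink (a stand-in for the Databricks/Delta write step). The broker address, topic, group id, and event schema are hypothetical.

```python
# Hedged sketch of a Kafka consumer loop for real-time ingestion; all names are placeholders.
import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # hypothetical broker
    "group.id": "claims-ingest",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["claims-events"])         # hypothetical topic

def write_to_sink(record: dict) -> None:
    # Placeholder for the Databricks/Delta or database write step.
    print(record["claim_id"], record.get("status"))

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            print("Consumer error:", msg.error())
            continue
        # Basic validation before passing the event downstream
        event = json.loads(msg.value())
        if "claim_id" in event:
            write_to_sink(event)
finally:
    consumer.close()
```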

Posted 1 month ago

Apply

5.0 - 10.0 years

15 - 30 Lacs

Vadodara

Remote

We are seeking an experienced Senior Data Engineer to join our team. The ideal candidate will have a strong background in data engineering and AWS infrastructure, with hands-on experience in building and maintaining data pipelines and the necessary infrastructure components. The role will involve using a mix of data engineering tools and AWS services to design, build, and optimize data architecture.

Key Responsibilities:
- Design, develop, and maintain data pipelines using Airflow and AWS services.
- Implement and manage data warehousing solutions with Databricks and PostgreSQL.
- Automate tasks using Git and Jenkins.
- Develop and optimize ETL processes, leveraging AWS services like S3, Lambda, AppFlow, and DMS (see the loading sketch below).
- Create and maintain visual dashboards and reports using Looker.
- Collaborate with cross-functional teams to ensure smooth integration of infrastructure components.
- Ensure the scalability, reliability, and performance of data platforms.
- Work with Jenkins for infrastructure automation.

Technical and functional areas of expertise:
- Working as a senior individual contributor on a data-intensive project.
- Strong experience in building high-performance, resilient, and secure data processing pipelines, preferably using a Python-based stack.
- Extensive experience in building data-intensive applications, with a deep understanding of querying and modeling with relational databases, preferably on time-series data.
- Intermediate proficiency in AWS services (S3, Airflow).
- Proficiency in Python and PySpark.
- Proficiency with ThoughtSpot or Databricks.
- Intermediate proficiency in database scripting (SQL).
- Basic experience with Jenkins for task automation.

Nice to Have:
- Intermediate proficiency in data analytics tools (Power BI / Tableau / Looker / ThoughtSpot).
- Experience working with AWS Lambda, Glue, AppFlow, and other AWS transfer services.
- Exposure to PySpark and data automation tools like Jenkins or CircleCI.
- Familiarity with Terraform for infrastructure-as-code.
- Experience in data quality testing to ensure the accuracy and reliability of data pipelines.
- Proven experience working directly with U.S. client stakeholders.
- Ability to work independently and take the lead on tasks.

Education and experience:
- Bachelor's or Master's degree in computer science or related fields.
- 5+ years of experience.

Stack/Skills needed:
- Databricks
- PostgreSQL
- Python & PySpark
- AWS stack
- Power BI / Tableau / Looker / ThoughtSpot
- Familiarity with Git and/or CI/CD tools
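
A hedged sketch of one common S3-to-PostgreSQL loading step of the kind mentioned above, assuming boto3 and psycopg2: stream a CSV object from S3 straight into a staging table with COPY. The bucket, key, table, and connection details are placeholders.

```python
# Hedged sketch: bulk-load a CSV from S3 into PostgreSQL; all names are hypothetical.
import boto3
import psycopg2

s3 = boto3.client("s3")
obj = s3.get_object(Bucket="example-landing-bucket", Key="exports/daily_metrics.csv")

conn = psycopg2.connect(
    host="localhost", dbname="analytics", user="etl_user", password="changeme"
)
try:
    with conn.cursor() as cur:
        # COPY reads directly from the S3 response stream into a staging table
        cur.copy_expert(
            "COPY staging.daily_metrics FROM STDIN WITH (FORMAT csv, HEADER true)",
            obj["Body"],
        )
    conn.commit()
finally:
    conn.close()
```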

Posted 1 month ago

Apply

4.0 - 9.0 years

9 - 18 Lacs

Pune, Gurugram

Work from Office

This posting covers two Data Engineer profiles: the first specializes in traditional ETL with SAS DI and big data (Hadoop, Hive); the second is more versatile, skilled in modern data engineering with Python, MongoDB, and real-time processing.

Posted 1 month ago

Apply

6.0 - 11.0 years

20 - 35 Lacs

Hyderabad, Pune, Bengaluru

Hybrid

SAP Analytics Specialists (Datasphere & SAC): design analytics solutions, build ETL pipelines, ensure data quality, and create dashboards and visualizations with low/no-code tools. Requires expertise in Datasphere, ETL, and data modeling, along with SAP Finance/Operations knowledge.

Posted 1 month ago

Apply

5.0 - 9.0 years

7 - 17 Lacs

Pune

Work from Office

Job Overview:
Diacto is seeking an experienced and highly skilled Data Architect to lead the design and development of scalable and efficient data solutions. The ideal candidate will have strong expertise in Azure Databricks, Snowflake (with DBT, GitHub, and Airflow), and Google BigQuery. This is a full-time, on-site role based out of our Baner, Pune office.

Qualifications:
- B.E./B.Tech in Computer Science, IT, or a related discipline
- MCS/MCA or equivalent preferred

Key Responsibilities:
- Design, build, and optimize robust data architecture frameworks for large-scale enterprise solutions
- Architect and manage cloud-based data platforms using Azure Databricks, Snowflake, and BigQuery
- Define and implement best practices for data modeling, integration, governance, and security
- Collaborate with engineering and analytics teams to ensure data solutions meet business needs
- Lead development using tools such as DBT, Airflow, and GitHub for orchestration and version control
- Troubleshoot data issues and ensure system performance, reliability, and scalability
- Guide and mentor junior data engineers and developers

Experience and Skills Required:
- 5 to 12 years of experience in data architecture, engineering, or analytics roles
- Hands-on expertise in Databricks, especially Azure Databricks
- Proficiency in Snowflake, with working knowledge of DBT, Airflow, and GitHub
- Experience with Google BigQuery and cloud-native data processing workflows
- Strong knowledge of modern data architecture, data lakes, warehousing, and ETL pipelines
- Excellent problem-solving, communication, and analytical skills

Nice to Have:
- Certifications in Azure, Snowflake, or GCP
- Experience with containerization (Docker/Kubernetes)
- Exposure to real-time data streaming and event-driven architecture

Why Join Diacto Technologies?
- Collaborate with experienced data professionals and work on high-impact projects
- Exposure to a variety of industries and enterprise data ecosystems
- Competitive compensation, learning opportunities, and an innovation-driven culture
- Work from our collaborative office space in Baner, Pune

How to Apply:
Option 1 (Preferred): Copy and paste the following link into your browser and submit your application for the automated interview process: https://app.candidhr.ai/app/candidate/gAAAAABoRrTQoMsfqaoNwTxsE_qwWYcpcRyYJk7NzSUmO3LKb6rM-8FcU58CUPYQKc65n66feHor-TGdCEfyouj0NmKdgYcNbA==/
Option 2:
1. Visit our website's career section at https://www.diacto.com/careers/
2. Scroll down to the "Who are we looking for?" section
3. Find the listing for "Data Architect (Data Bricks)"
4. Proceed with the virtual interview by clicking "Apply Now"

Posted 1 month ago

Apply

6.0 - 11.0 years

15 - 30 Lacs

Noida, Pune, Bengaluru

Hybrid

We are looking for a Snowflake Developer with deep expertise in Snowflake and DBT or SQL to help us build and scale our modern data platform.

Key Responsibilities:
- Design and build scalable ELT pipelines in Snowflake using DBT/SQL.
- Develop efficient, well-tested DBT models (staging, intermediate, and marts layers).
- Implement data quality, testing, and monitoring frameworks to ensure data reliability and accuracy.
- Optimize Snowflake queries, storage, and compute resources for performance and cost-efficiency.
- Collaborate with cross-functional teams to gather data requirements and deliver data solutions.

Required Qualifications:
- 5+ years of experience as a Data Engineer, with at least 4 years working with Snowflake.
- Proficient with DBT (Data Build Tool), including Jinja templating, macros, and model dependency management.
- Strong understanding of ELT patterns and modern data stack principles.
- Advanced SQL skills and experience with performance tuning in Snowflake.

Interested candidates, share your CV at himani.girnar@alikethoughts.com with the details below:
- Candidate's name
- Email and alternate email ID
- Contact and alternate contact number
- Total experience
- Relevant experience
- Current organization
- Notice period
- CCTC
- ECTC
- Current location
- Preferred location
- PAN card number

Posted 1 month ago

Apply

5.0 - 8.0 years

20 - 30 Lacs

Noida, Hyderabad, Bengaluru

Hybrid

Looking for Data Engineers, immediate joiners only, for Hyderabad, Bengaluru, and Noida locations.
Must have experience in Python, Kafka Streams, PySpark, and Azure Databricks.

Role and responsibilities:
- Lead the design, development, and implementation of real-time data pipelines using Kafka, Python, and Azure Databricks.
- Architect scalable data streaming and processing solutions to support healthcare data workflows.
- Develop, optimize, and maintain ETL/ELT pipelines for structured and unstructured healthcare data.
- Ensure data integrity, security, and compliance with healthcare regulations (HIPAA, HITRUST, etc.).
- Collaborate with data engineers, analysts, and business stakeholders to understand requirements and translate them into technical solutions.
- Troubleshoot and optimize Kafka streaming applications, Python scripts, and Databricks workflows.
- Mentor junior engineers, conduct code reviews, and ensure best practices in data engineering.
- Stay updated with the latest cloud technologies, big data frameworks, and industry trends.

Preferred candidate profile:
- 5+ years of experience in data engineering, with strong proficiency in Kafka and Python.
- Expertise in Kafka Streams, Kafka Connect, and Schema Registry for real-time data processing.
- Experience with Azure Databricks (or willingness to learn and adopt it quickly).
- Hands-on experience with cloud platforms (Azure preferred; AWS or GCP is a plus).
- Proficiency in SQL, NoSQL databases, and data modeling for big data processing.
- Knowledge of containerization (Docker, Kubernetes) and CI/CD pipelines for data applications.
- Experience working with healthcare data (EHR, claims, HL7, FHIR, etc.) is a plus.
- Strong analytical skills, a problem-solving mindset, and the ability to lead complex data projects.
- Excellent communication and stakeholder management skills.

Interested? Call Rose (9873538143 / WhatsApp: 8595800635) or email rose2hiresquad@gmail.com

Posted 1 month ago

Apply

4.0 - 6.0 years

20 - 25 Lacs

Noida

Work from Office

Technical Requirements:
- SQL (advanced level): Strong command of complex SQL logic, including window functions, CTEs, and pivot/unpivot, and proficiency in stored procedure/SQL script development. Experience writing maintainable SQL for transformations.
- Python for ETL: Ability to write modular and reusable ETL logic using Python. Familiarity with JSON manipulation and API consumption (see the sketch below).
- ETL Pipeline Development: Experienced in developing ETL/ELT pipelines, data profiling, validation, quality/health checks, error handling, logging and notifications, etc.

Nice-to-Have Skills:
- Experience with AWS Redshift, Databricks, and Yellowbrick.
- Knowledge of CI/CD practices for data workflows.

Key Responsibilities:
- Collaborate with analysts and data architects to develop and test ETL pipelines using SQL and Python in Databricks and Yellowbrick.
- Perform related data quality checks and implement validation frameworks.
- Optimize queries for performance and cost-efficiency.

Roles and Responsibilities:
- Leverage expertise in AWS Redshift, PostgreSQL, Databricks, and Yellowbrick to design and implement scalable data solutions.
- Partner with data analysts and architects to build and test robust ETL pipelines using SQL and Python.
- Develop and maintain data validation frameworks to ensure high data quality and reliability.
- Optimize database queries to enhance performance and ensure cost-effective data processing.
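
For illustration, a small sketch of the "Python for ETL" requirement: consume a REST API, flatten the JSON payload into rows, and run a minimal validation step before loading. The endpoint, field names, and load target are hypothetical.

```python
# Hedged sketch of modular ETL logic in Python; endpoint and schema are placeholders.
import requests

API_URL = "https://api.example.com/v1/transactions"  # hypothetical endpoint

def extract(page: int) -> list[dict]:
    """Pull one page of records from the source API."""
    resp = requests.get(API_URL, params={"page": page}, timeout=30)
    resp.raise_for_status()
    return resp.json()["results"]

def transform(records: list[dict]) -> list[dict]:
    """Flatten nested JSON and keep only the fields the target table needs."""
    rows = []
    for r in records:
        rows.append({
            "txn_id": r["id"],
            "amount": float(r["amount"]),
            "customer_id": r.get("customer", {}).get("id"),
            "created_at": r["created_at"],
        })
    return rows

def validate(rows: list[dict]) -> None:
    """Minimal quality/health check: no missing keys, no negative amounts."""
    bad = [r for r in rows if r["customer_id"] is None or r["amount"] < 0]
    if bad:
        raise ValueError(f"{len(bad)} rows failed validation")

if __name__ == "__main__":
    rows = transform(extract(page=1))
    validate(rows)
    print(f"{len(rows)} rows ready to load")  # load step (COPY/INSERT) omitted here
```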

Posted 1 month ago

Apply

0.0 - 5.0 years

0 - 5 Lacs

Pune, Maharashtra, India

On-site

As a Data Engineer at IBM, you will harness the power of data to unveil captivating stories and intricate patterns. You'll contribute to data gathering, storage, and both batch and real-time processing. Collaborating closely with diverse teams, you'll play an important role in deciding the most suitable data management systems and identifying the crucial data required for insightful analysis. As a Data Engineer, you'll tackle obstacles related to database integration and untangle complex, unstructured data sets.

Your Role and Responsibilities
In this role, your responsibilities may include:
- Implementing and validating predictive models as well as creating and maintaining statistical models with a focus on big data, incorporating a variety of statistical and machine learning techniques.
- Designing and implementing various enterprise search applications such as Elasticsearch and Splunk for client requirements.
- Working in an Agile, collaborative environment, partnering with other scientists, engineers, consultants, and database administrators of all backgrounds and disciplines to bring analytical rigor and statistical methods to the challenges of predicting behaviors.
- Building teams or writing programs to cleanse and integrate data in an efficient and reusable manner, developing predictive or prescriptive models, and evaluating modeling results.

Required Education: Bachelor's Degree
Preferred Education: Master's Degree

Required Technical and Professional Expertise
- Expertise in designing and implementing scalable data warehouse solutions on Snowflake, including schema design, performance tuning, and query optimization.
- Strong experience in building data ingestion and transformation pipelines using Talend to process structured and unstructured data from various sources.
- Proficiency in integrating data from cloud platforms into Snowflake using Talend and native Snowflake capabilities.
- Hands-on experience with dimensional and relational data modeling techniques to support analytics and reporting requirements.

Preferred Technical and Professional Experience
- Understanding of optimizing Snowflake workloads, including clustering keys, caching strategies, and query profiling.
- Ability to implement robust data validation, cleansing, and governance frameworks within ETL processes.
- Proficiency in SQL and/or shell scripting for custom transformations and automation tasks.

Posted 1 month ago

Apply

2.0 - 5.0 years

14 - 17 Lacs

Mumbai

Work from Office

As a Data Engineer at IBM, you'll play a vital role in the development and design of applications, providing regular support and guidance to project teams on complex coding, issue resolution, and execution.

Your primary responsibilities include:
- Lead the design and construction of new solutions using the latest technologies, always looking to add business value and meet user requirements.
- Strive for continuous improvement by testing the built solution and working under an agile framework.
- Discover and implement the latest technology trends to maximize and build creative solutions.

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise
- Apache Spark (PySpark): In-depth knowledge of Spark's architecture, core APIs, and PySpark for distributed data processing.
- Big Data Technologies: Familiarity with Hadoop, HDFS, Kafka, and other big data tools.
- Data Engineering Skills: Strong understanding of ETL pipelines, data modeling, and data warehousing concepts.
- Strong proficiency in Python: Expertise in Python programming with a focus on data processing and manipulation.
- Data Processing Frameworks: Knowledge of data processing libraries such as Pandas and NumPy.
- SQL Proficiency: Experience writing optimized SQL queries for large-scale data analysis and transformation.
- Cloud Platforms: Experience working with cloud platforms like AWS, Azure, or GCP, including cloud storage systems.

Preferred technical and professional experience
- Define, drive, and implement an architecture strategy and standards for end-to-end monitoring.
- Partner with the rest of the technology teams, including application development, enterprise architecture, testing services, and network engineering.
- Good to have: detection and prevention tools for company products and platforms, and customer-facing applications.

Posted 1 month ago

Apply

2.0 - 5.0 years

14 - 17 Lacs

Kochi

Work from Office

As a Data Engineer at IBM, you'll play a vital role in the development and design of applications, providing regular support and guidance to project teams on complex coding, issue resolution, and execution.

Your primary responsibilities include:
- Lead the design and construction of new solutions using the latest technologies, always looking to add business value and meet user requirements.
- Strive for continuous improvement by testing the built solution and working under an agile framework.
- Discover and implement the latest technology trends to maximize and build creative solutions.

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise
- Apache Spark (PySpark): In-depth knowledge of Spark's architecture, core APIs, and PySpark for distributed data processing.
- Big Data Technologies: Familiarity with Hadoop, HDFS, Kafka, and other big data tools.
- Data Engineering Skills: Strong understanding of ETL pipelines, data modeling, and data warehousing concepts.
- Strong proficiency in Python: Expertise in Python programming with a focus on data processing and manipulation.
- Data Processing Frameworks: Knowledge of data processing libraries such as Pandas and NumPy.
- SQL Proficiency: Experience writing optimized SQL queries for large-scale data analysis and transformation.
- Cloud Platforms: Experience working with cloud platforms like AWS, Azure, or GCP, including cloud storage systems.

Preferred technical and professional experience
- Define, drive, and implement an architecture strategy and standards for end-to-end monitoring.
- Partner with the rest of the technology teams, including application development, enterprise architecture, testing services, and network engineering.
- Good to have: detection and prevention tools for company products and platforms, and customer-facing applications.

Posted 1 month ago

Apply

3.0 - 8.0 years

9 - 13 Lacs

Mumbai

Work from Office

Role Overview:
As a Big Data Engineer, you'll design and build robust data pipelines on Cloudera using Spark (Scala/PySpark) for ingestion, transformation, and processing of high-volume data from banking systems.

Key Responsibilities:
- Build scalable batch and real-time ETL pipelines using Spark and Hive (see the partitioned-write sketch below)
- Integrate structured and unstructured data sources
- Perform performance tuning and code optimization
- Support orchestration and job scheduling (NiFi, Airflow)

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise
- Experience: 3-15 years
- Proficiency in PySpark/Scala with Hive/Impala
- Experience with data partitioning, bucketing, and optimization
- Familiarity with Kafka, Iceberg, and NiFi is a must
- Knowledge of banking or financial datasets is a plus
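
A hedged sketch of the partitioning and bucketing skills this role calls for: writing a Spark DataFrame to a Hive-style table partitioned by date and bucketed by account. Table and column names are hypothetical, and a Hive metastore is assumed.

```python
# Hedged sketch: partitioned, bucketed write to a Hive-managed table; names are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("txn_load")
    .enableHiveSupport()          # assumes a Hive metastore is available
    .getOrCreate()
)

txns = spark.table("staging.transactions_raw")

# Partition by business date and bucket by account to keep joins and point
# lookups on account_id efficient.
(
    txns.write
    .mode("overwrite")
    .partitionBy("txn_date")
    .bucketBy(32, "account_id")
    .sortBy("account_id")
    .saveAsTable("curated.transactions")
)
```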

Posted 1 month ago

Apply

15.0 - 20.0 years

5 - 9 Lacs

Mumbai

Work from Office

Location: Mumbai

Role Overview:
As a Big Data Engineer, you'll design and build robust data pipelines on Cloudera using Spark (Scala/PySpark) for ingestion, transformation, and processing of high-volume data from banking systems.

Key Responsibilities:
- Build scalable batch and real-time ETL pipelines using Spark and Hive
- Integrate structured and unstructured data sources
- Perform performance tuning and code optimization
- Support orchestration and job scheduling (NiFi, Airflow)

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise
- Experience: 3-15 years
- Proficiency in PySpark/Scala with Hive/Impala
- Experience with data partitioning, bucketing, and optimization
- Familiarity with Kafka, Iceberg, and NiFi is a must
- Knowledge of banking or financial datasets is a plus

Posted 1 month ago

Apply

2.0 - 6.0 years

4 - 8 Lacs

Coimbatore

Work from Office

Key Responsibilities:
- Design, develop, and deploy scalable analytics and reporting solutions.
- Build and maintain ETL pipelines and data integration processes.
- Collaborate with data scientists, analysts, and business stakeholders to understand requirements and deliver insights.
- Develop custom dashboards and visualizations using tools such as Power BI, Tableau, or similar.
- Optimize database queries and ensure high performance of analytics systems.
- Automate data workflows and contribute to data quality initiatives.
- Implement security best practices in analytics platforms and applications.
- Stay updated with the latest trends in analytics technologies and tools.

Posted 1 month ago

Apply

4.0 - 9.0 years

12 - 22 Lacs

Gurugram

Work from Office

To apply, submit your details via Google Form: https://forms.gle/8SUxUV2cikzjvKzD9

As a Senior Consultant in our Consulting team, you'll build and nurture positive working relationships with teams and clients with the intention of exceeding client expectations. We are seeking experienced AWS Data Engineers to design, implement, and maintain robust data pipelines and analytics solutions using AWS services. The ideal candidate will have a strong background in AWS data services, big data technologies, and programming languages.

Role & responsibilities:
1. Design and implement scalable, high-performance data pipelines using AWS services
2. Develop and optimize ETL processes using AWS Glue, EMR, and Lambda (see the Glue sketch below)
3. Build and maintain data lakes using S3 and Delta Lake
4. Create and manage analytics solutions using Amazon Athena and Redshift
5. Design and implement database solutions using Aurora, RDS, and DynamoDB
6. Develop serverless workflows using AWS Step Functions
7. Write efficient and maintainable code using Python/PySpark and SQL/PostgreSQL
8. Ensure data quality, security, and compliance with industry standards
9. Collaborate with data scientists and analysts to support their data needs
10. Optimize data architecture for performance and cost-efficiency
11. Troubleshoot and resolve data pipeline and infrastructure issues

Preferred candidate profile:
1. Bachelor's degree in computer science, Information Technology, or a related field
2. Relevant years of experience as a Data Engineer, with at least 60% of experience focusing on AWS
3. Strong proficiency in AWS data services: Glue, EMR, Lambda, Athena, Redshift, S3
4. Experience with data lake technologies, particularly Delta Lake
5. Expertise in database systems: Aurora, RDS, DynamoDB, PostgreSQL
6. Proficiency in Python and PySpark programming
7. Strong SQL skills and experience with PostgreSQL
8. Experience with AWS Step Functions for workflow orchestration

Technical Skills:
- AWS Services: Glue, EMR, Lambda, Athena, Redshift, S3, Aurora, RDS, DynamoDB, Step Functions
- Big Data: Hadoop, Spark, Delta Lake
- Programming: Python, PySpark
- Databases: SQL, PostgreSQL, NoSQL
- Data Warehousing and Analytics
- ETL/ELT processes
- Data Lake architectures
- Version control: Git
- Agile methodologies
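
For illustration, a minimal AWS Glue PySpark job of the kind described in the responsibilities: read a catalogued table, clean it with the DataFrame API, and write partitioned Parquet to S3. The database, table, and bucket names are hypothetical.

```python
# Hedged sketch of a small Glue ETL job; catalog and S3 names are placeholders.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Extract: read a catalogued source table as a DynamicFrame, then switch to the
# Spark DataFrame API for the transformation step.
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders"
)
df = dyf.toDF()

cleaned = (
    df.dropDuplicates(["order_id"])
      .withColumn("amount", F.col("amount").cast("double"))
      .filter(F.col("amount") > 0)
)

# Load: write curated Parquet, partitioned by order date, to the data lake.
(cleaned.write.mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://example-curated-bucket/orders/"))

job.commit()
```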

Posted 1 month ago

Apply

4.0 - 9.0 years

12 - 22 Lacs

Gurugram, Bengaluru

Work from Office

To apply, submit your details via Google Form: https://forms.gle/8SUxUV2cikzjvKzD9

As a Senior Consultant in our Consulting team, you'll build and nurture positive working relationships with teams and clients with the intention of exceeding client expectations. We are seeking experienced AWS Data Engineers to design, implement, and maintain robust data pipelines and analytics solutions using AWS services. The ideal candidate will have a strong background in AWS data services, big data technologies, and programming languages.

Role & responsibilities:
1. Design and implement scalable, high-performance data pipelines using AWS services
2. Develop and optimize ETL processes using AWS Glue, EMR, and Lambda
3. Build and maintain data lakes using S3 and Delta Lake
4. Create and manage analytics solutions using Amazon Athena and Redshift
5. Design and implement database solutions using Aurora, RDS, and DynamoDB
6. Develop serverless workflows using AWS Step Functions
7. Write efficient and maintainable code using Python/PySpark and SQL/PostgreSQL
8. Ensure data quality, security, and compliance with industry standards
9. Collaborate with data scientists and analysts to support their data needs
10. Optimize data architecture for performance and cost-efficiency
11. Troubleshoot and resolve data pipeline and infrastructure issues

Preferred candidate profile:
1. Bachelor's degree in computer science, Information Technology, or a related field
2. Relevant years of experience as a Data Engineer, with at least 60% of experience focusing on AWS
3. Strong proficiency in AWS data services: Glue, EMR, Lambda, Athena, Redshift, S3
4. Experience with data lake technologies, particularly Delta Lake
5. Expertise in database systems: Aurora, RDS, DynamoDB, PostgreSQL
6. Proficiency in Python and PySpark programming
7. Strong SQL skills and experience with PostgreSQL
8. Experience with AWS Step Functions for workflow orchestration

Technical Skills:
- AWS Services: Glue, EMR, Lambda, Athena, Redshift, S3, Aurora, RDS, DynamoDB, Step Functions
- Big Data: Hadoop, Spark, Delta Lake
- Programming: Python, PySpark
- Databases: SQL, PostgreSQL, NoSQL
- Data Warehousing and Analytics
- ETL/ELT processes
- Data Lake architectures
- Version control: Git
- Agile methodologies

Posted 1 month ago

Apply

9.0 - 10.0 years

12 - 14 Lacs

Hyderabad

Work from Office

Responsibilities:
- Design, develop, and maintain data pipelines using Airflow / Data Flow / Data Lake
- Optimize performance and scalability of ETL processes with SQL and Python

Posted 1 month ago

Apply

4.0 - 9.0 years

4 - 9 Lacs

Hyderabad, Pune, Bengaluru

Work from Office

Role & responsibilities
- ETL Testing: Design, develop, and execute comprehensive test cases and plans for ETL pipelines and workflows to validate data transformation, integration, and migration.
- DBT Data Validation: Use DBT to test, document, and monitor data models, including writing DBT tests (generic, singular, custom); a comparable check expressed in Python is sketched below.
- Snowflake Testing: Validate data loads, transformations, and warehousing in Snowflake/SQL environments.
- Automation & CI/CD: Design and implement automated testing scripts for ETL processes, using tools and frameworks to integrate testing into CI/CD pipelines.
- Test Documentation: Create and maintain test documentation, including requirements traceability, test cases, results, and defect logs.
- Collaboration: Work with data engineers, BI developers, and other stakeholders to clarify requirements, resolve defects, and ensure timely delivery.
- Defect Management: Identify, document, and track defects through resolution, providing insights for process improvements.
- Data Quality Assurance: Develop and execute data quality checks to ensure accuracy, consistency, and completeness.
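
A hedged sketch of automated data-quality checks of the kind described above, written as pytest tests against Snowflake with snowflake-connector-python; they mirror dbt's unique and not_null generic tests. The table name and connection environment variables are hypothetical.

```python
# Hedged sketch: pytest-based data-quality checks against Snowflake; names are placeholders.
import os

import pytest
import snowflake.connector

TABLE = "ANALYTICS.MARTS.FCT_ORDERS"   # hypothetical mart table

@pytest.fixture(scope="module")
def cursor():
    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
    )
    yield conn.cursor()
    conn.close()

def test_order_id_is_unique(cursor):
    # Equivalent of dbt's "unique" generic test on the key column
    cursor.execute(
        f"SELECT COUNT(*) FROM (SELECT ORDER_ID FROM {TABLE} "
        f"GROUP BY ORDER_ID HAVING COUNT(*) > 1) AS dupes"
    )
    assert cursor.fetchone()[0] == 0

def test_order_id_not_null(cursor):
    # Equivalent of dbt's "not_null" generic test
    cursor.execute(f"SELECT COUNT(*) FROM {TABLE} WHERE ORDER_ID IS NULL")
    assert cursor.fetchone()[0] == 0
```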

Posted 1 month ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.
