
61 AWS EMR Jobs - Page 3

JobPe aggregates job listings for easy access, but you apply directly on the original job portal.

3 - 5 years

12 - 14 Lacs

Delhi NCR, Mumbai, Bengaluru

Work from Office


We are looking for a highly skilled and motivated Data Engineer to join our dynamic team. In this role, you will collaborate with cross-functional teams to design, build, and maintain scalable data platforms on the AWS Cloud. You'll play a key role in developing next-generation data solutions and optimizing current implementations.

Key Responsibilities:
- Build and maintain high-performance data pipelines using AWS Glue, EMR, Databricks, and Spark.
- Design and implement robust ETL processes to integrate and analyze large datasets.
- Develop and optimize data models for reporting, analytics, and machine learning workflows.
- Use Python, PySpark, and SQL for data transformation and optimization.
- Ensure data governance, security, and performance on AWS Cloud platforms.
- Collaborate with stakeholders to translate business needs into technical solutions.

Required Skills & Experience:
- 3-5 years of hands-on experience in data engineering.
- Proficiency in Python, SQL, and PySpark.
- Strong knowledge of Big Data ecosystems (Hadoop, Hive, Sqoop, HDFS).
- Expertise in Spark (Spark Core, Spark Streaming, Spark SQL) and Databricks.
- Experience with AWS services such as EMR, Glue, S3, EC2/EKS, and Lambda.
- Solid understanding of data modeling, warehousing, and ETL processes.
- Familiarity with data governance, quality, and security principles.

Location: Anywhere in India (Hyderabad, Ahmedabad, Pune, Chennai, Kolkata).
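For candidates new to the stack, a minimal PySpark ETL sketch of the kind of pipeline described above (illustrative only; the S3 bucket, file layout, and column names are assumptions, not details from the posting):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Illustrative ETL job: read raw orders from S3, clean and aggregate, write Parquet.
# Bucket and column names are hypothetical placeholders.
spark = SparkSession.builder.appName("orders-daily-etl").getOrCreate()

raw = spark.read.option("header", True).csv("s3://example-bucket/raw/orders/")

cleaned = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropna(subset=["order_id", "order_ts"])
       .dropDuplicates(["order_id"])
)

daily_revenue = (
    cleaned.groupBy(F.to_date("order_ts").alias("order_date"))
           .agg(F.sum("amount").alias("revenue"), F.count("*").alias("orders"))
)

daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/curated/daily_revenue/"
)
```

The same script runs largely unchanged on EMR or as a Glue Spark job, which is why postings like this group those services together.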

Posted 3 months ago

Apply

3 - 5 years

3 - 8 Lacs

Noida

Work from Office


We are seeking a highly skilled and motivated Data Engineer to join our dynamic team. As a Data Engineer, you will collaborate closely with our Data Scientists to develop and deploy machine learning models. Proficiency in the skills listed below will be crucial in building and maintaining pipelines for training and inference datasets.

Responsibilities:
- Work in tandem with Data Scientists to design, develop, and implement machine learning pipelines.
- Utilize PySpark for data processing, transformation, and preparation for model training.
- Leverage AWS EMR and S3 for scalable and efficient data storage and processing.
- Implement and manage ETL workflows using StreamSets for data ingestion and transformation.
- Design and construct pipelines to deliver high-quality training and inference datasets.
- Collaborate with cross-functional teams to ensure smooth deployment and real-time/near real-time inferencing capabilities.
- Optimize and fine-tune pipelines for performance, scalability, and reliability.
- Ensure IAM policies and permissions are appropriately configured for secure data access and management.
- Implement Spark architecture and optimize Spark jobs for scalable data processing.

Total experience expected: 4-6 years.
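As a rough illustration of the training-dataset pipelines mentioned above, a hedged PySpark sketch (the feature names, S3 paths, and join key are invented for the example and are not part of the posting):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Illustrative feature-preparation job for a training dataset on EMR.
# Table locations and feature/label columns are hypothetical.
spark = SparkSession.builder.appName("training-dataset-build").getOrCreate()

events = spark.read.parquet("s3://example-bucket/curated/events/")
labels = spark.read.parquet("s3://example-bucket/curated/labels/")

features = (
    events.groupBy("user_id")
          .agg(
              F.count("*").alias("event_count"),
              F.avg("session_seconds").alias("avg_session_seconds"),
              F.max("event_ts").alias("last_seen_ts"),
          )
)

training_set = features.join(labels, on="user_id", how="inner")

# Persist a reproducible snapshot for the Data Science team.
training_set.write.mode("overwrite").parquet(
    "s3://example-bucket/ml/training/churn/v1/"
)
```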

Posted 3 months ago

Apply

5 - 8 years

15 - 25 Lacs

Pune, Bengaluru, Kolkata

Hybrid


We're Hiring! Data Engineer (Python, PySpark, AWS) - Immediate Joiners Only!

Locations: Pune, Bangalore, Kolkata, Indore
Job Type: Full-time
Experience: 5 to 8 years

Are you an experienced Data Engineer looking for an exciting opportunity? We are seeking a highly skilled professional with expertise in Python, PySpark, FastAPI, SQL, and AWS to join our team immediately!

Key Responsibilities:
- Develop and maintain scalable data pipelines using PySpark and SQL.
- Implement APIs using FastAPI for seamless data processing.
- Optimize and manage AWS services (Lambda, Glue, EMR, Athena, Iceberg).
- Ensure data quality, integrity, and security across various systems.
- Collaborate with cross-functional teams to enhance data architecture.
- Troubleshoot and monitor data workflows for performance efficiency.

Required Skills:
- Strong programming skills in Python and PySpark
- Hands-on experience with FastAPI for API development
- Expertise in SQL query optimization
- Solid experience with AWS services (Lambda, Glue, EMR, Athena, Iceberg)
- Understanding of ETL processes, big data technologies, and distributed computing
- Ability to work in a fast-paced, collaborative environment

Preferred Qualifications:
- Experience with Data Lakehouse architecture
- Knowledge of CI/CD pipelines and DevOps
- Familiarity with Docker and Kubernetes is a plus
- Strong problem-solving and analytical skills

Who should apply?
- Professionals with 5-8 years of relevant experience
- Candidates who can join immediately

Learn more about us: www.calsoftinc.com
Interested? Please share your updated resume at ritu.singh@calsoftinc.com
Apply now or refer a connection who fits the role!
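For context on the "APIs using FastAPI" responsibility, a minimal sketch of a data-serving endpoint (the Parquet path, columns, and response shape are assumptions for illustration, not the company's actual API):

```python
from fastapi import FastAPI, HTTPException
import pandas as pd

# Illustrative FastAPI service exposing a small curated dataset.
# The Parquet path and schema are hypothetical placeholders.
app = FastAPI(title="daily-revenue-api")

REVENUE_PATH = "data/daily_revenue.parquet"


@app.get("/revenue/{order_date}")
def get_revenue(order_date: str) -> dict:
    """Return revenue metrics for a single date (YYYY-MM-DD)."""
    df = pd.read_parquet(REVENUE_PATH)
    row = df[df["order_date"] == order_date]
    if row.empty:
        raise HTTPException(status_code=404, detail="No data for that date")
    record = row.iloc[0]
    return {
        "order_date": order_date,
        "revenue": float(record["revenue"]),
        "orders": int(record["orders"]),
    }
```

Saved as main.py, this would typically be run locally with `uvicorn main:app --reload`.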

Posted 3 months ago

Apply

8 - 13 years

10 - 15 Lacs

Bengaluru

Work from Office


Role: Lead position with primary skill sets in AWS services, with experience in EC2, S3, Redshift, RDS, AWS Glue/EMR, Python, PySpark, SQL, Airflow, visualization tools, and Databricks.

Responsibilities:
- Design and implement data modeling, data ingestion, and data processing for various datasets.
- Design, develop, and maintain the ETL framework for new data sources.
- Migrate existing Talend ETL workflows into the new ETL framework using AWS Glue/EMR, PySpark, and/or data pipelines built in Python.
- Build orchestration workflows using Airflow.
- Develop and execute ad hoc data ingestion to support business analytics.
- Proactively interact with vendors on open questions and report status accordingly.
- Explore and evaluate tools/services to support business requirements.
- Help create a data-driven culture and impactful data strategies.
- Show aptitude for learning new technologies and solving complex problems.
- Connect with the customer to gather requirements and ensure timely delivery.

Qualifications:
- Minimum of a bachelor's degree, preferably in Computer Science, Information Systems, or Information Technology.
- Minimum 8+ years of experience on cloud platforms such as AWS, Azure, or GCP.
- Minimum 8+ years of experience with Amazon Web Services such as VPC, S3, EC2, Redshift, RDS, EMR, Athena, IAM, Glue, DMS, Data Pipeline & API, Lambda, etc.
- Minimum 8+ years of experience in ETL and data engineering using Python, AWS Glue, AWS EMR/PySpark, and Talend, with Airflow for orchestration.
- Minimum 8+ years of experience in SQL, Python, and source control such as Bitbucket, with CI/CD for code deployment.
- Experience with PostgreSQL, SQL Server, MySQL, and Oracle databases.
- Experience with MPP systems such as AWS Redshift and EMR.
- Experience in distributed programming with Python, Unix scripting, MPP, and RDBMS databases for data integration.
- Experience building distributed high-performance systems using Spark/PySpark and AWS Glue, and developing applications for loading/streaming data into databases such as Redshift.
- Experience with Agile methodology.
- Proven ability to write technical specifications for data extraction and good quality code.
- Experience with big data processing using Sqoop, Spark, and Hive is an additional plus.
- Experience with analytic visualization tools.
- Design of data solutions on Databricks, including Delta Lake, data warehouses, data marts, and other data solutions to support the analytics needs of the organization.
- Should be an individual contributor with experience in the technologies mentioned above.
- Should be able to lead the offshore team, ensuring on-time delivery, code review, and work management among team members.
- Should have experience in customer communication.
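Since the role calls for Airflow orchestration of Glue/EMR work, here is a hedged sketch of a simple DAG that triggers a Glue job via boto3 (the job name, region, and schedule are hypothetical; using a PythonOperator with boto3 is just one common approach):

```python
from datetime import datetime

import boto3
from airflow import DAG
from airflow.operators.python import PythonOperator

# Illustrative Airflow DAG that triggers a (hypothetical) AWS Glue job daily.
# Job name, region, and schedule are placeholders, not values from the posting.


def start_glue_job() -> str:
    """Kick off the Glue job and return the run id for logging."""
    glue = boto3.client("glue", region_name="ap-south-1")
    response = glue.start_job_run(JobName="orders-daily-etl")
    return response["JobRunId"]


with DAG(
    dag_id="orders_daily_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="0 2 * * *",  # run at 02:00 every day; older Airflow 2.x uses schedule_interval
    catchup=False,
) as dag:
    run_etl = PythonOperator(
        task_id="run_glue_etl",
        python_callable=start_glue_job,
    )
```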

Posted 3 months ago

Apply

4 - 8 years

20 - 27 Lacs

Bengaluru, Bangalore Rural, Hyderabad

Hybrid


Mode: Hybrid

Required Skills: Python, Spark/PySpark, Hive, Databricks, SQL, Airflow/GitHub, AWS (Data Engineer)

We are seeking an experienced Data Engineer to join our team. As a Data Engineer, you will be responsible for designing, implementing, and maintaining data pipelines, ETL processes, and data infrastructure. The ideal candidate should have a strong background in Python, Spark, SQL, and Databricks.

Responsibilities:
- Design, develop, and maintain data pipelines to extract, transform, and load data from various sources into our data lake or data warehouse.
- Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and ensure data quality.
- Optimize and tune data processing workflows for performance and scalability.
- Implement data governance and security best practices.
- Troubleshoot and resolve data-related issues.
- Monitor and maintain data infrastructure, ensuring high availability and reliability.
- Stay up to date with industry trends and emerging technologies in data engineering.

Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 4+ years of experience as a Data Engineer.
- Strong proficiency in Python for data processing and ETL.
- Experience with Apache Spark for distributed data processing.
- Proficiency in SQL for querying and manipulating data.
- Hands-on experience with Databricks or similar cloud-based data platforms.
- Familiarity with data modeling, data warehousing, and data lake concepts.
- Excellent problem-solving skills and attention to detail.
- Strong communication and collaboration skills.

If you are passionate about data engineering and enjoy working in a dynamic environment, we encourage you to apply!
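For the "ensure data quality" responsibility, a minimal sketch of the kind of post-load check a pipeline might run (the dataset path and required columns are assumptions made up for the example):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Illustrative data-quality check run after a pipeline load.
# Table path and required columns are hypothetical placeholders.
spark = SparkSession.builder.appName("dq-checks").getOrCreate()

df = spark.read.parquet("s3://example-bucket/curated/daily_revenue/")

required_columns = ["order_date", "revenue", "orders"]
missing = [c for c in required_columns if c not in df.columns]
if missing:
    raise ValueError(f"Missing expected columns: {missing}")

null_counts = df.select(
    [F.sum(F.col(c).isNull().cast("int")).alias(c) for c in required_columns]
).first()

bad = {c: null_counts[c] for c in required_columns if null_counts[c] > 0}
if bad:
    raise ValueError(f"Null values found in required columns: {bad}")

print(f"Quality checks passed for {df.count()} rows")
```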

Posted 3 months ago

Apply

4 - 6 years

6 - 8 Lacs

Pune

Work from Office


Capgemini Invent

Capgemini Invent is the digital innovation, consulting and transformation brand of the Capgemini Group, a global business line that combines market-leading expertise in strategy, technology, data science and creative design to help CxOs envision and build what's next for their businesses.

Your Role:
- Design, develop, and maintain scalable ETL/ELT pipelines to process structured, semi-structured, and unstructured data on the cloud.
- Build and optimize cloud storage, data lakes, and data warehousing solutions using platforms like Snowflake, BigQuery, AWS Redshift, ADLS, and S3.
- Develop cloud utility functions using services like AWS Lambda, AWS Step Functions, Cloud Run, Cloud Functions, and Azure Functions.
- Utilize cloud-native data integration tools, such as Azure Databricks, Azure Data Factory, AWS Glue, AWS EMR, Dataflow, and Dataproc, to transform and analyze data.

Your Profile:
- 4-5 years of IT experience with a minimum of 3 years of experience creating data pipelines and ETL/ELT on the cloud.
- Experience with any of these cloud providers: AWS, Azure, GCP.
- Experience with cloud storage, cloud databases, cloud data warehousing, and data lake solutions such as Snowflake, BigQuery, AWS Redshift, ADLS, and S3.
- Experience writing cloud utility functions such as AWS Lambda, AWS Step Functions, Cloud Run, Cloud Functions, and Azure Functions.
- Experience using cloud data integration services for structured, semi-structured, and unstructured data, such as Azure Databricks, Azure Data Factory, Azure Synapse Analytics, AWS Glue, AWS EMR, Dataflow, and Dataproc.
- Exposure to cloud DevOps practices such as infrastructure as code, CI/CD components, and automated deployments on the cloud.

What you will love about working here:
We recognize the significance of flexible work arrangements to provide support. Be it remote work or flexible work hours, you will get an environment to maintain a healthy work-life balance. At the heart of our mission is your career growth. Our array of career growth programs and diverse professions are crafted to support you in exploring a world of opportunities. Equip yourself with valuable certifications in the latest technologies such as Generative AI.

About Capgemini:
Capgemini is a global business and technology transformation partner, helping organizations accelerate their dual transition to a digital and sustainable world, while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With its strong over 55-year heritage, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, cloud and data, combined with its deep industry expertise and partner ecosystem. The Group reported 2023 global revenues of €22.5 billion.
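To illustrate the "cloud utility functions" item above, a minimal AWS Lambda handler sketch (the Glue job it starts and the S3 event wiring are hypothetical assumptions, not part of the posting):

```python
import json
import urllib.parse

import boto3

# Illustrative Lambda handler: when a new file lands in S3, start a Glue job.
# The Glue job name and the S3 event trigger configuration are placeholders.
glue = boto3.client("glue")


def lambda_handler(event, context):
    records = event.get("Records", [])
    started_runs = []
    for record in records:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        response = glue.start_job_run(
            JobName="orders-daily-etl",
            Arguments={"--input_path": f"s3://{bucket}/{key}"},
        )
        started_runs.append(response["JobRunId"])
    return {
        "statusCode": 200,
        "body": json.dumps({"started_runs": started_runs}),
    }
```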

Posted 3 months ago

Apply

2 - 4 years

4 - 6 Lacs

Kochi

Work from Office


As a Big Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in data engineering activities such as creating source-to-target pipelines/workflows and implementing solutions that address the client's needs.

Your primary responsibilities include:
- Design, build, optimize, and support new and existing data models and ETL processes based on our clients' business requirements.
- Build, deploy, and manage data infrastructure that can adequately handle the needs of a rapidly growing, data-driven organization.
- Coordinate data access and security to enable data scientists and analysts to easily access data whenever they need to.

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Developed PySpark code for AWS Glue jobs and for EMR.
- Worked on scalable distributed data systems using the Hadoop ecosystem on AWS EMR and the MapR distribution.
- Developed Python and PySpark programs for data analysis.
- Good working experience using Python to develop a custom framework for generating rules (similar to a rules engine).
- Developed Hadoop streaming jobs using Python for integrating Python API-supported applications.
- Developed Python code to gather data from HBase and designed the solution to implement it using PySpark.
- Used Apache Spark DataFrames/RDDs to apply business transformations and utilized Hive context objects to perform read/write operations.
- Re-wrote some Hive queries in Spark SQL to reduce the overall batch time.

Preferred technical and professional experience:
- Understanding of DevOps.
- Experience in building scalable end-to-end data ingestion and processing solutions.
- Experience with object-oriented and/or functional programming languages, such as Python, Java, and Scala.
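As a rough illustration of the "Hive query re-written in Spark SQL" item above (the database, table, and column names are invented for the example):

```python
from pyspark.sql import SparkSession

# Illustrative rewrite of a Hive aggregation as Spark SQL.
# The table and columns are hypothetical; enableHiveSupport() lets Spark
# read tables registered in an existing Hive metastore.
spark = (
    SparkSession.builder.appName("hive-to-spark-sql")
    .enableHiveSupport()
    .getOrCreate()
)

# Equivalent of a Hive query such as:
#   SELECT customer_id, SUM(amount) AS total_spend
#   FROM sales.orders
#   GROUP BY customer_id;
result = spark.sql(
    """
    SELECT customer_id, SUM(amount) AS total_spend
    FROM sales.orders
    GROUP BY customer_id
    """
)

result.write.mode("overwrite").saveAsTable("sales.customer_spend")
```

Running the same aggregation through the Spark engine rather than Hive's MapReduce execution is the usual source of the batch-time reduction the posting mentions.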

Posted 3 months ago

Apply

4 - 9 years

6 - 11 Lacs

Kochi

Work from Office


As a Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in the development of data solutions using the Spark framework with Python or Scala on Hadoop and the AWS Cloud Data Platform.

Responsibilities:
- Build data pipelines to ingest, process, and transform data from files, streams, and databases.
- Process data with Spark, Python, PySpark, Scala, Hive, HBase, or other NoSQL databases on Cloud Data Platforms (AWS) or HDFS.
- Develop efficient software code for multiple use cases leveraging the Spark framework with Python or Scala and big data technologies built on the platform.
- Develop streaming pipelines.
- Work with Hadoop/AWS ecosystem components to implement scalable solutions that meet ever-increasing data volumes, using big data/cloud technologies such as Apache Spark, Kafka, and cloud computing.

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Minimum 4+ years of experience in big data technologies with extensive data engineering experience in Spark with Python or Scala.
- Minimum 3 years of experience on Cloud Data Platforms on AWS.
- Experience with AWS EMR/AWS Glue/Databricks, AWS Redshift, and DynamoDB.
- Good to excellent SQL skills.
- Exposure to streaming solutions and message brokers such as Kafka.

Preferred technical and professional experience:
- Certification in AWS and Databricks, or Cloudera Spark certified developers.
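Since the role mentions streaming pipelines with Kafka, a minimal Spark Structured Streaming sketch (broker address, topic name, event schema, and output paths are assumptions; the spark-sql-kafka connector package must be available on the cluster):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Illustrative streaming pipeline: read JSON events from Kafka, extract a couple
# of fields, and append them to a data lake path. All names are placeholders.
spark = SparkSession.builder.appName("events-streaming-ingest").getOrCreate()

raw_stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")
    .option("subscribe", "order-events")
    .option("startingOffsets", "latest")
    .load()
)

events = (
    raw_stream.selectExpr("CAST(value AS STRING) AS json")
    .select(
        F.get_json_object("json", "$.order_id").alias("order_id"),
        F.get_json_object("json", "$.amount").cast("double").alias("amount"),
    )
)

query = (
    events.writeStream.format("parquet")
    .option("path", "s3://example-bucket/raw/order_events/")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/order_events/")
    .outputMode("append")
    .start()
)

query.awaitTermination()
```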

Posted 3 months ago

Apply

4 - 9 years

6 - 11 Lacs

Kochi

Work from Office


As a Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in the development of data solutions using the Spark framework with Python or Scala on Hadoop and the AWS Cloud Data Platform.

Responsibilities:
- Build data pipelines to ingest, process, and transform data from files, streams, and databases.
- Process data with Spark, Python, PySpark, Scala, Hive, HBase, or other NoSQL databases on Cloud Data Platforms (AWS) or HDFS.
- Develop efficient software code for multiple use cases leveraging the Spark framework with Python or Scala and big data technologies built on the platform.
- Develop streaming pipelines.
- Work with Hadoop/AWS ecosystem components to implement scalable solutions that meet ever-increasing data volumes, using big data/cloud technologies such as Apache Spark, Kafka, and cloud computing.

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Total 5-7+ years of experience in data management (DW, DL, data platform, lakehouse) and data engineering skills.
- Minimum 4+ years of experience in big data technologies with extensive data engineering experience in Spark with Python or Scala.
- Minimum 3 years of experience on Cloud Data Platforms on AWS.
- Exposure to streaming solutions and message brokers such as Kafka.
- Experience with AWS EMR/AWS Glue/Databricks, AWS Redshift, and DynamoDB.
- Good to excellent SQL skills.

Preferred technical and professional experience:
- Certification in AWS and Databricks, or Cloudera Spark certified developers.

Posted 3 months ago

Apply

4 - 9 years

6 - 11 Lacs

Kochi

Work from Office


As a senior SAP Consultant, you will serve as a client-facing practitioner, working collaboratively with clients to deliver high-quality solutions and acting as a trusted business advisor with a deep understanding of the SAP Accelerate delivery methodology (or equivalent) and its associated work products. You will work on projects that assist clients in integrating strategy, process, technology, and information to enhance effectiveness, reduce costs, and improve profit and shareholder value. There are opportunities for you to acquire new skills, work across different disciplines, take on new challenges, and develop a comprehensive understanding of various industries.

Your primary responsibilities include:
- Strategic SAP solution focus: working across technical design, development, and implementation of SAP solutions for simplicity, amplification, and maintainability that meet client needs.
- Comprehensive solution delivery: involvement in strategy development and solution implementation, leveraging your knowledge of SAP and working with the latest technologies.

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Total 5-7+ years of experience in data management (DW, DL, data platform, lakehouse) and data engineering skills.
- Minimum 4+ years of experience in big data technologies with extensive data engineering experience in Spark with Python or Scala.
- Minimum 3 years of experience on Cloud Data Platforms on AWS.
- Exposure to streaming solutions and message brokers such as Kafka.
- Experience with AWS EMR/AWS Glue/Databricks, AWS Redshift, and DynamoDB.
- Good to excellent SQL skills.

Preferred technical and professional experience:
- Certification in AWS and Databricks, or Cloudera Spark certified developers.

Posted 3 months ago

Apply

3 - 5 years

5 - 8 Lacs

Pune

Work from Office


Data engineers are responsible for building reliable and scalable data infrastructure that enables organizations to derive meaningful insights, make data-driven decisions, and unlock the value of their data assets.

Job Description - Grade Specific:
The role involves leading and managing a team of data engineers, overseeing data engineering projects, ensuring technical excellence, and fostering collaboration with stakeholders. The lead plays a critical role in driving the success of data engineering initiatives and ensuring the delivery of reliable and high-quality data solutions to support the organization's data-driven objectives.

Skills (competencies): Ab Initio, Agile (Software Development Framework), Apache Hadoop, AWS Airflow, AWS Athena, AWS CodePipeline, AWS EFS, AWS EMR, AWS Redshift, AWS S3, Azure ADLS Gen2, Azure Data Factory, Azure Data Lake Storage, Azure Databricks, Azure Event Hub, Azure Stream Analytics, Azure Synapse, Bitbucket, Change Management, Client Centricity, Collaboration, Continuous Integration and Continuous Delivery (CI/CD), Data Architecture Patterns, Data Format Analysis, Data Governance, Data Modeling, Data Validation, Data Vault Modeling, Database Schema Design, Decision-Making, DevOps, Dimensional Modeling, GCP Big Table, GCP BigQuery, GCP Cloud Storage, GCP DataFlow, GCP DataProc, Git, Google Big Table, Google DataProc, Greenplum, HQL, IBM DataStage, IBM DB2, Industry Standard Data Modeling (FSLDM), Industry Standard Data Modeling (IBM FSDM), Influencing, Informatica IICS, Inmon methodology, JavaScript, Jenkins, Kimball, Linux - Red Hat, Negotiation, Netezza, NewSQL, Oracle Exadata, Performance Tuning, Perl, Platform Update Management, Project Management, PySpark, Python, R, RDD Optimization, CentOS, SAS, Scala, Spark, Shell Script, Snowflake, Spark Code Optimization, SQL, Stakeholder Management, Sun Solaris, Synapse, Talend, Teradata, Time Management, Ubuntu, Vendor Management

Posted 3 months ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.
