Jobs
Interviews

7 Map Reduce Jobs

Set up a job alert
JobPe aggregates listings for easy access, but you apply directly on the original job portal.

5.0 - 12.0 years

0 Lacs

chennai, tamil nadu

On-site

You should have 5-12 years of experience in Big Data and related technologies, with expertise in distributed computing principles. Your skills should include an expert-level understanding of Apache Spark and hands-on programming with Python. Proficiency in Hadoop v2, MapReduce, HDFS, and Sqoop is required. Experience in building stream-processing systems using technologies like Apache Storm or Spark Streaming, as well as working with messaging systems such as Kafka or RabbitMQ, will be beneficial. A good understanding of Big Data querying tools like Hive and Impala, along with integration of data from multiple sources including RDBMS, ERP systems, and files, is necessary. You should possess knowledge of SQL queries, joins, stored procedures, and relational schemas. Experience with NoSQL databases such as HBase, Cassandra, and MongoDB, along with ETL techniques and frameworks, is expected. Performance tuning of Spark jobs and familiarity with native cloud data services such as AWS or Azure Databricks is essential.

The role requires the ability to lead a team efficiently, design and implement Big Data solutions, and work as a practitioner of Agile methodology. This position falls under the Data Engineer category and is also suited to candidates with ML/AI engineering, data science, or software engineering backgrounds.
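As an illustration of the Spark and MapReduce-style processing this posting asks for, here is a minimal PySpark sketch of a word count expressed as map and reduce steps; the file paths are hypothetical placeholders, not part of the listing.

```python
# Minimal sketch of a MapReduce-style word count in PySpark.
# Input and output paths are hypothetical; any text file on HDFS or local disk works.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("wordcount-sketch").getOrCreate()

lines = spark.sparkContext.textFile("hdfs:///data/sample.txt")  # hypothetical path
counts = (
    lines.flatMap(lambda line: line.split())   # map: emit one record per word
         .map(lambda word: (word, 1))          # map: (word, 1) pairs
         .reduceByKey(lambda a, b: a + b)      # reduce: sum counts per word
)
counts.saveAsTextFile("hdfs:///out/wordcount")  # hypothetical output path

spark.stop()
```

The same flatMap / map / reduceByKey shape underlies most MapReduce-style aggregations written against the Spark RDD API.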

Posted 1 day ago

Apply

3.0 - 8.0 years

0 Lacs

pune, maharashtra

On-site

You should have strong experience in PySpark, Python, Unix scripting, Spark SQL, and Hive. You must be proficient in writing SQL queries and creating views, and possess excellent oral and written communication skills. Prior experience in the insurance domain would be beneficial. A good understanding of the Hadoop ecosystem, including HDFS, MapReduce, Pig, Hive, Oozie, and YARN, is required. Knowledge of AWS services such as Glue, S3, Lambda, Step Functions, and EC2 is essential. Experience in data migration from platforms like Hive/S3 to Databricks is a plus. You should be able to prioritize, plan, organize, and manage multiple tasks efficiently while delivering high-quality work.

As a candidate, you should have 6-8 years of technical experience in PySpark and AWS (Glue, EMR, Lambda, Step Functions, S3), with at least 3 years of experience in Big Data/ETL using Python, Spark, and Hive, and 3+ years of experience in AWS. Your primary skills should include PySpark, AWS (Glue, EMR, Lambda, Step Functions, S3), and Big Data with Python, Spark, and Hive; exposure to Big Data migration is also important. Secondary skills that would be beneficial for this role include Informatica BDM/PowerCenter, Databricks, and MongoDB.
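For context on the PySpark-with-AWS skills listed above, the sketch below shows a small aggregation read from and written back to S3. The bucket, paths, and column names are hypothetical, and it assumes a Spark runtime (for example Glue or EMR) already configured with S3 access.

```python
# Minimal sketch: read Parquet from S3, aggregate, write the result back to S3.
# All paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("s3-etl-sketch").getOrCreate()

claims = spark.read.parquet("s3://example-bucket/raw/claims/")        # hypothetical bucket
summary = (
    claims.groupBy("policy_id")                                        # hypothetical column
          .agg(F.sum("claim_amount").alias("total_claims"))            # hypothetical column
)
summary.write.mode("overwrite").parquet("s3://example-bucket/curated/claims_summary/")

spark.stop()
```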

Posted 6 days ago

Apply

2.0 - 6.0 years

0 Lacs

maharashtra

On-site

Job Description: We are looking for a skilled PySpark Developer with 2-3 or 4-5 years of experience to join our team. As a PySpark Developer, you will be responsible for developing and maintaining data processing pipelines using PySpark, Apache Spark's Python API. You will work closely with data engineers, data scientists, and other stakeholders to design and implement scalable and efficient data processing solutions. A Bachelor's or Master's degree in Computer Science, Data Science, or a related field is required.

The ideal candidate should have strong expertise in the Big Data ecosystem, including Spark, Hive, Sqoop, HDFS, MapReduce, Oozie, YARN, HBase, and NiFi, should be below 35 years of age, and should have experience designing, developing, and maintaining PySpark pipelines that process large volumes of structured and unstructured data. You will collaborate with data engineers and data scientists to understand data requirements and design efficient data models and transformations. Optimizing and tuning PySpark jobs for performance, scalability, and reliability is a key responsibility, as is implementing data quality checks, error handling, and monitoring mechanisms to ensure data accuracy and pipeline robustness. You will also develop and maintain documentation for PySpark code, data pipelines, and data workflows.

Experience in developing production-ready Spark applications using Spark RDD APIs, DataFrames, Datasets, Spark SQL, and Spark Streaming is required. Strong experience with Hive bucketing and partitioning, as well as writing complex Hive queries using analytical functions, is essential. Knowledge of writing custom UDFs in Hive to support custom business requirements is a plus.

If you meet the above qualifications and are interested in this position, please email your resume, mentioning the position applied for in the subject line, to careers@cdslindia.com.
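To illustrate the analytical-function and custom-UDF skills this posting mentions, here is a minimal PySpark sketch using a small in-memory DataFrame in place of real Hive tables; the data, column names, and business rule are hypothetical.

```python
# Minimal sketch: a window (analytical) function plus a simple custom UDF in PySpark.
# The DataFrame, columns, and banding rule are hypothetical stand-ins for Hive tables.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("hive-style-sketch").getOrCreate()

df = spark.createDataFrame(
    [("north", "2024-01", 100), ("north", "2024-02", 120), ("south", "2024-01", 90)],
    ["region", "month", "sales"],
)

# Analytical function: running total of sales per region, ordered by month.
w = Window.partitionBy("region").orderBy("month")
df = df.withColumn("running_sales", F.sum("sales").over(w))

# Custom UDF: a trivial label standing in for bespoke business logic.
label = F.udf(lambda s: "HIGH" if s >= 100 else "LOW", StringType())
df.withColumn("sales_band", label(F.col("sales"))).show()

spark.stop()
```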

Posted 1 week ago

Apply

6.0 - 10.0 years

0 Lacs

karnataka

On-site

The Conversational AI team at Walmart is responsible for building and deploying core AI assistant experiences across Walmart, catering to millions of active users globally. As a Staff Data Scientist, you will play a crucial role in leading the evolution of the AI assistant platform by developing highly scalable Generative AI systems and infrastructure. This hands-on leadership position requires expertise in machine learning, ASR, large-scale distributed systems, multi-modal LLMs, and more.

Your responsibilities will include partnering with key business stakeholders to drive the development and planning of proofs of concept and production AI solutions within the Conversational AI space. You will translate business requirements into strategies, initiatives, and projects aligned with business objectives. Designing, testing, and deploying cutting-edge AI solutions at scale to enhance customer experiences will be a key aspect of your role. Collaboration with applied scientists, ML engineers, software engineers, and product managers will be essential in developing the next generation of AI assistant experiences. Staying updated on industry trends in Generative AI, speech/video processing, and AI assistant architecture patterns will be crucial. Additionally, providing technical leadership, guidance, and mentorship to a skilled team of data scientists, as well as driving innovation through problem-solving cycles and research publication, are integral parts of this role.

To qualify for this position, you should have a Master's degree with 8+ years or a Ph.D. with 6+ years of relevant experience in Computer Science, Statistics, Mathematics, or a related field. A strong track record in a data science tech lead role, extensive experience in designing and deploying AI products, and expertise in machine learning, NLP, speech processing, image processing, and deep learning models are required. Proficiency in industry tools and technologies, a deep interest in generative AI, and exceptional decision-making skills will be assets in this role. Furthermore, you should possess a thorough understanding of distributed technologies, public cloud platforms, and big data systems, along with experience working with geographically distributed teams. Business acumen, research acumen with publications in top-tier AI conferences, and strong programming skills in Python and Java are also essential qualifications.

Join the Conversational AI team at Walmart Global Tech, where you will have the opportunity to make a significant impact, innovate at scale, and shape the future of retail while working in a collaborative and inclusive environment.

Posted 1 week ago

Apply

4.0 - 8.0 years

0 Lacs

maharashtra

On-site

The opportunity available at EY is for a Big Data Engineer based in Pune, requiring a minimum of 4 years of experience. As a key member of the technical team, you will collaborate with engineers, data scientists, and data users in an Agile environment. Your responsibilities will include software design, Scala and Spark development, automated testing, promoting development standards, production support, troubleshooting, and liaising with BAs to ensure requirements are interpreted and implemented correctly. You will be involved in implementing tools and processes, handling performance, scale, availability, accuracy, and monitoring. Additionally, you will participate in regular planning and status meetings, provide input in sprint reviews and retrospectives, contribute to system architecture and design, and take part in peer code reviews.

Key technical skills required for this role include Scala or Java development and design, and experience with technologies such as Apache Hadoop, Apache Spark, Spark Streaming, YARN, Kafka, Hive, Python, and ETL frameworks. Hands-on experience in building data pipelines using Hadoop components and familiarity with version control tools, automated deployment tools, and requirement management is essential. Knowledge of big data modelling techniques and the ability to debug code issues are also necessary. Desired qualifications include experience with Elasticsearch, scheduling tools like Airflow and Control-M, an understanding of cloud design patterns, exposure to DevOps and Agile project methodology, and Hive QL development. The ideal candidate will possess strong communication skills and the ability to collaborate effectively, mentor developers, and lead technical initiatives. A Bachelor's or Master's degree in Computer Science, Engineering, or a related field is required.

EY is looking for individuals who can work collaboratively across teams, solve complex problems, and deliver practical solutions while adhering to commercial and legal requirements. The organization values agility, curiosity, mindfulness, positive energy, adaptability, and creativity in its employees. EY offers a personalized Career Journey, ample learning opportunities, and resources to help individuals understand their roles and opportunities better. EY is committed to being an inclusive employer that balances excellent client service with the career growth and wellbeing of its employees. As a global leader in assurance, tax, transaction, and advisory services, EY believes in providing training, opportunities, and creative freedom to its employees to help build a better working world, encouraging personal and professional growth through motivating and fulfilling experiences.
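As a rough illustration of the streaming data-pipeline experience described above, the sketch below reads from Kafka with Spark Structured Streaming. It is written in Python for consistency with the other examples on this page (the posting itself emphasizes Scala), and it assumes a hypothetical broker and topic plus the spark-sql-kafka connector package on the classpath.

```python
# Minimal sketch of a Kafka-to-console Structured Streaming read in PySpark.
# Assumes the spark-sql-kafka connector is available; broker and topic are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
         .option("subscribe", "events")                      # hypothetical topic
         .load()
)

query = (
    events.selectExpr("CAST(value AS STRING) AS value")      # message payload as text
          .writeStream.format("console")
          .outputMode("append")
          .start()
)
query.awaitTermination()
```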

Posted 3 weeks ago

Apply

0.0 years

0 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Job Description:

Skills: AWS EMR

Key Responsibilities: A day in the life of an Infoscion. As part of the Infosys delivery team, your primary role would be to interface with the client for quality assurance, issue resolution, and ensuring high customer satisfaction. You will understand requirements, create and review designs, validate the architecture, and ensure high levels of service offerings to clients in the technology domain. You will participate in project estimation, provide inputs for solution delivery, conduct technical risk planning, and perform code reviews and unit test plan reviews. You will lead and guide your teams towards developing optimized, high-quality code deliverables, continual knowledge management, and adherence to the organizational guidelines and processes. You would be a key contributor to building efficient programs and systems. If you think you fit right in to help our clients navigate their next in their digital transformation journey, this is the place for you.

Technical Requirements: Primary skills: Technology > Big Data > Data Processing > Map Reduce
Preferred Skills: Technology > Big Data > Data Processing > Map Reduce
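Since the primary skill here is classic MapReduce data processing, the following is a minimal, hypothetical Hadoop Streaming word count in Python: one script that acts as mapper or reducer depending on its argument. The invocation shown in the comments is illustrative only, not a command from the posting.

```python
# Minimal sketch of Hadoop Streaming word count in Python. Hypothetical invocation:
#   hadoop jar hadoop-streaming.jar -input /data/in -output /data/out \
#     -mapper "python wordcount.py map" -reducer "python wordcount.py reduce" \
#     -file wordcount.py
import sys

def mapper():
    # Emit a tab-separated (word, 1) pair for every word on stdin.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer():
    # Input arrives sorted by key; sum consecutive counts for each word.
    current, total = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = word, 0
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```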

Posted 2 months ago

Apply

7 - 11 years

50 - 60 Lacs

Mumbai, Delhi / NCR, Bengaluru

Work from Office

Role: Resident Solution Architect
Location: Remote

The Solution Architect at Koantek builds secure, highly scalable big data solutions to achieve tangible, data-driven outcomes, all while keeping simplicity and operational effectiveness in mind. This role collaborates with teammates, product teams, and cross-functional project teams to lead the adoption and integration of the Databricks Lakehouse Platform into the enterprise ecosystem and AWS/Azure/GCP architecture, and is responsible for implementing securely architected big data solutions that are operationally reliable, performant, and deliver on strategic initiatives.

Specific requirements for the role include: expert-level knowledge of data frameworks, data lakes, and open-source projects such as Apache Spark, MLflow, and Delta Lake; expert-level hands-on coding experience in Python, SQL, Spark/Scala, or PySpark; an in-depth understanding of the Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, RDD caching, and Spark MLlib; experience with IoT/event-driven/microservices in the cloud, including private and public cloud architectures, their pros and cons, and migration considerations; extensive hands-on experience implementing data migration and data processing using AWS/Azure/GCP services; extensive hands-on experience with the industry technology stack for data management, ingestion, capture, processing, and curation (Kafka, StreamSets, Attunity, GoldenGate, MapReduce, Hadoop, Hive, HBase, Cassandra, Spark, Flume, Impala, etc.); experience using Azure DevOps and CI/CD as well as Agile tools and processes including Git, Jenkins, Jira, and Confluence; experience in creating tables, partitioning, bucketing, and loading and aggregating data using Spark SQL/Scala; the ability to build ingestion to ADLS and enable a BI layer for analytics, with a strong understanding of data modeling and defining conceptual, logical, and physical data models; and proficient-level experience with architecture design, build, and optimization of big data collection, ingestion, storage, processing, and visualization.

Responsibilities: Work closely with team members to lead and drive enterprise solutions, advising on key decision points, trade-offs, best practices, and risk mitigation. Guide customers in transforming big data projects, including development and deployment of big data and AI applications. Promote, emphasize, and leverage big data solutions to deploy performant systems that appropriately auto-scale, are highly available, fault-tolerant, self-monitoring, and serviceable. Use a defense-in-depth approach in designing data solutions and AWS/Azure/GCP infrastructure. Assist and advise data engineers in the preparation and delivery of raw data for prescriptive and predictive modeling. Aid developers in identifying, designing, and implementing process improvements with automation tools to optimize data delivery. Implement processes and systems to monitor data quality and security, ensuring production data is accurate and available for key stakeholders and the business processes that depend on it. Employ change management best practices to ensure that data remains readily accessible to the business. Implement reusable design templates and solutions to integrate, automate, and orchestrate cloud operational needs, with experience in MDM using data governance solutions.

Qualifications: Overall experience of 12+ years in the IT field. Hands-on experience designing and implementing multi-tenant solutions using Azure Databricks for data governance, data pipelines for near real-time data warehousing, and machine learning solutions. Design and development experience with scalable and cost-effective Microsoft Azure/AWS/GCP data architecture and related solutions. Experience in a software development, data engineering, or data analytics field using Python, Scala, Spark, Java, or equivalent technologies. Bachelor's or Master's degree in Big Data, Computer Science, Engineering, Mathematics, or a similar area of study, or equivalent work experience. Good to have: advanced technical certifications such as Azure Solutions Architect Expert, AWS Certified Data Analytics, DASCA Big Data Engineering and Analytics, AWS Certified Cloud Practitioner, Solutions Architect, or Professional Google Cloud Certified.

Location: Mumbai, Delhi / NCR, Bengaluru, Kolkata, Chennai, Hyderabad, Ahmedabad, Pune, Remote
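To illustrate the Databricks Lakehouse and Spark SQL table-management skills this role calls for, here is a minimal PySpark sketch that writes a partitioned Delta table and queries it through Spark SQL; the paths, table name, and columns are hypothetical, and it assumes a Databricks runtime or a Spark session configured with the delta-spark package.

```python
# Minimal sketch: write a partitioned Delta table and query it via Spark SQL.
# Assumes Delta Lake support is configured; paths and columns are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-sketch").getOrCreate()

orders = spark.read.parquet("/mnt/raw/orders/")  # hypothetical source path

(orders.write.format("delta")
       .mode("overwrite")
       .partitionBy("order_date")                # hypothetical partition column
       .save("/mnt/lakehouse/orders_delta/"))    # hypothetical Delta location

# Register the Delta location as a table, then query it with Spark SQL.
spark.sql(
    "CREATE TABLE IF NOT EXISTS orders_delta USING DELTA "
    "LOCATION '/mnt/lakehouse/orders_delta/'"
)
spark.sql("SELECT order_date, COUNT(*) AS n FROM orders_delta GROUP BY order_date").show()
```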

Posted 2 months ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies