Jobs
Interviews

237 Cloudera Jobs - Page 9

Setup a job Alert
JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

5.0 - 10.0 years

5 - 15 Lacs

Chennai

Hybrid

Role & responsibilities Bigdata, Hadoop, Hive, SQL, Cloudera, Impala, Python, Pyspark Fundamentals of: Big data Cloudera Platform Unix Python Expertise in: SQL/HIVE Pyspark Nice to have: Django/Flask frameworks

Posted 2 months ago

Apply

2.0 - 7.0 years

3 - 7 Lacs

Kochi

Work from Office

Job Description Skills and Requirements: Resource must have 2+ years of IT Industry experience. Resource must have worked in AEMR, PMTX datasets The Resource must have strong background in data analysis, visualization and be proficient in using SharePoint, Python, Hive tables and Cloudera platform Should be able to Create Tableau Extracts and schedule it based on the requirement. Perform Data analysis to Identify trends and patterns in data extracts that supports business decision making. Should possess Strong SQL skills for data querying and manipulation. Ability to work independently and as part of a team.

Posted 2 months ago

Apply

1.0 - 4.0 years

1 - 5 Lacs

Mumbai

Work from Office

Location Mumbai Role Overview : As a Big Data Engineer, you'll design and build robust data pipelines on Cloudera using Spark (Scala/PySpark) for ingestion, transformation, and processing of high-volume data from banking systems. Key Responsibilities : Build scalable batch and real-time ETL pipelines using Spark and Hive Integrate structured and unstructured data sources Perform performance tuning and code optimization Support orchestration and job scheduling (NiFi, Airflow) Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Skills Required : Proficiency in PySpark/Scala with Hive/Impala Experience with data partitioning, bucketing, and optimization Familiarity with Kafka, Iceberg, NiFi is a must Knowledge of banking or financial datasets is a plus

Posted 2 months ago

Apply

2.0 - 5.0 years

14 - 17 Lacs

Mumbai

Work from Office

As a Big Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in data engineering activities like creating pipelines/workflows for Source to Target and implementing solutions that tackle the clients needs. Your primary responsibilities include Design, build, optimize and support new and existing data models and ETL processes based on our clients business requirements. Build, deploy and manage data infrastructure that can adequately handle the needs of a rapidly growing data driven organization. Coordinate data access and security to enable data scientists and analysts to easily access to data whenever they need too, Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Must have 5+ years exp in Big Data -Hadoop Spark -Scala ,Python Hbase, Hive Good to have Aws -S3, athena ,Dynomo DB, Lambda, Jenkins GIT Developed Python and pyspark programs for data analysis. Good working experience with python to develop Custom Framework for generating of rules (just like rules engine). Developed Python code to gather the data from HBase and designs the solution to implement using Pyspark. Apache Spark DataFrames/RDD's were used to apply business transformations and utilized Hive Context objects to perform read/write operations, Preferred technical and professional experience Understanding of Devops. Experience in building scalable end-to-end data ingestion and processing solutions Experience with object-oriented and/or functional programming languages, such as Python, Java and Scala

Posted 2 months ago

Apply

2.0 - 7.0 years

3 - 7 Lacs

Kochi

Work from Office

Skills and Requirements: Resource must have 2+ years of IT Industry experience. Resource must have worked in AEMR, PMTX datasets The Resource must have strong background in data analysis, visualization and be proficient in using SharePoint, Python, Hive tables and Cloudera platform Should be able to Create Tableau Extracts and schedule it based on the requirement. Perform Data analysis to Identify trends and patterns in data extracts that supports business decision making. Should possess Strong SQL skills for data querying and manipulation. Ability to work independently and as part of a team.

Posted 2 months ago

Apply

5.0 - 8.0 years

5 - 9 Lacs

Kolkata

Work from Office

Role Purpose The purpose of this role is to design, develop and troubleshoot solutions/ designs/ models/ simulations on various softwares as per clients/ project requirements Do 1. Design and Develop solutions as per clients specifications Work on different softwares like CAD, CAE to develop appropriate models as per the project plan/ customer requirements Test the protype and designs produced on the softwares and check all the boundary conditions (impact analysis, stress analysis etc) Produce specifications and determine operational feasibility by integrating software components into a fully functional software system Create a prototype as per the engineering drawings & outline CAD model is prepared Perform failure effect mode analysis (FMEA) for any new requirements received from the client Provide optimized solutions to the client by running simulations in virtual environment Ensure software is updated with latest features to make it cost effective for the client Enhance applications/ solutions by identifying opportunities for improvement, making recommendations and designing and implementing systems Follow industry standard operating procedures for various processes and systems as per the client requirement while modeling a solution on the software 2. Provide customer support and problem solving from time to time Perform defect fixing raised by the client or software integration team while solving the tickets raised Develop software verification plans and quality assurance procedures for the customer Troubleshoot, debug and upgrade existing systems on time & with minimum latency and maximum efficiency Deploy programs and evaluate user feedback for adequate resolution with customer satisfaction Comply with project plans and industry standards 3. Ensure reporting & documentation for the client Ensure weekly, monthly status reports for the clients as per requirements Maintain documents and create a repository of all design changes, recommendations etc Maintain time-sheets for the clients Providing written knowledge transfer/ history of the project Deliver No. Performance Parameter Measure 1. Design and develop solutions Adherence to project plan/ schedule, 100% error free on boarding & implementation, throughput % 2. Quality & CSAT On-Time Delivery, minimum corrections, first time right, no major defects post production, 100% compliance of bi-directional traceability matrix, completion of assigned certifications for skill upgradation 3. MIS & Reporting 100% on time MIS & report generation Mandatory Skills: StreamSets. Experience5-8 Years.

Posted 2 months ago

Apply

5.0 - 8.0 years

4 - 8 Lacs

Pune

Work from Office

Role Purpose The purpose of the role is to support process delivery by ensuring daily performance of the Production Specialists, resolve technical escalations and develop technical capability within the Production Specialists. Do Oversee and support process by reviewing daily transactions on performance parameters Review performance dashboard and the scores for the team Support the team in improving performance parameters by providing technical support and process guidance Record, track, and document all queries received, problem-solving steps taken and total successful and unsuccessful resolutions Ensure standard processes and procedures are followed to resolve all client queries Resolve client queries as per the SLA’s defined in the contract Develop understanding of process/ product for the team members to facilitate better client interaction and troubleshooting Document and analyze call logs to spot most occurring trends to prevent future problems Identify red flags and escalate serious client issues to Team leader in cases of untimely resolution Ensure all product information and disclosures are given to clients before and after the call/email requests Avoids legal challenges by monitoring compliance with service agreements Handle technical escalations through effective diagnosis and troubleshooting of client queries Manage and resolve technical roadblocks/ escalations as per SLA and quality requirements If unable to resolve the issues, timely escalate the issues to TA & SES Provide product support and resolution to clients by performing a question diagnosis while guiding users through step-by-step solutions Troubleshoot all client queries in a user-friendly, courteous and professional manner Offer alternative solutions to clients (where appropriate) with the objective of retaining customers’ and clients’ business Organize ideas and effectively communicate oral messages appropriate to listeners and situations Follow up and make scheduled call backs to customers to record feedback and ensure compliance to contract SLA’s Build people capability to ensure operational excellence and maintain superior customer service levels of the existing account/client Mentor and guide Production Specialists on improving technical knowledge Collate trainings to be conducted as triage to bridge the skill gaps identified through interviews with the Production Specialist Develop and conduct trainings (Triages) within products for production specialist as per target Inform client about the triages being conducted Undertake product trainings to stay current with product features, changes and updates Enroll in product specific and any other trainings per client requirements/recommendations Identify and document most common problems and recommend appropriate resolutions to the team Update job knowledge by participating in self learning opportunities and maintaining personal networks Deliver NoPerformance ParameterMeasure1ProcessNo. of cases resolved per day, compliance to process and quality standards, meeting process level SLAs, Pulse score, Customer feedback, NSAT/ ESAT2Team ManagementProductivity, efficiency, absenteeism3Capability developmentTriages completed, Technical Test performance Mandatory Skills: Hadoop. Experience5-8 Years.

Posted 2 months ago

Apply

3.0 - 6.0 years

9 - 14 Lacs

Mumbai

Work from Office

Role Overview : We are looking for aTalend Data Catalog Specialistto drive enterprise data governance initiatives by implementingTalend Data Catalogand integrating it withApache Atlasfor unified metadata management within a Cloudera-based data lakehouse. The role involves establishing metadata lineage, glossary harmonization, and governance policies to enhance trust, discovery, and compliance across the data ecosystem Key Responsibilities: o Set up and configure Talend Data Catalog to ingest and manage metadata from source systems, data lake (HDFS), Iceberg tables, Hive metastore, and external data sources. o Develop and maintain business glossaries , data classifications, and metadata models. o Design and implement bi-directional integration between Talend Data Catalog and Apache Atlas to enable metadata synchronization , lineage capture, and policy alignment across the Cloudera stack. o Map technical metadata from Hive/Impala to business metadata defined in Talend. o Capture end-to-end lineage of data pipelines (e.g., from ingestion in PySpark to consumption in BI tools) using Talend and Atlas. o Provide impact analysis for schema changes, data transformations, and governance rule enforcement. o Support definition and rollout of enterprise data governance policies (e.g., ownership, stewardship, access control). o Enable role-based metadata access , tagging, and data sensitivity classification. o Work with data owners, stewards, and architects to ensure data assets are well-documented, governed, and discoverable. o Provide training to users on leveraging the catalog for search, understanding, and reuse. Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise 6–12 years in data governance or metadata management, with at least 2–3 years in Talend Data Catalog. Talend Data Catalog, Apache Atlas, Cloudera CDP, Hive/Impala, Spark, HDFS, SQL. Business glossary, metadata enrichment, lineage tracking, stewardship workflows. Hands-on experience in Talend–Atlas integration , either through REST APIs, Kafka hooks, or metadata bridges. Preferred technical and professional experience .

Posted 2 months ago

Apply

3.0 - 7.0 years

6 - 10 Lacs

Mumbai

Work from Office

Role Overview : Looking for a Kafka SME to design and support real-time data ingestion pipelines using Kafka within a Cloudera-based Lakehouse architecture. Key Responsibilities : Design Kafka topics, partitions, schema registry Implement producer-consumer apps using Spark Structured Streaming Set up Kafka Connect, monitoring, and alerts Ensure secure, scalable message delivery Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Skills Required : Deep understanding of Kafka internals and ecosystem Integration with Cloudera and NiFi Schema evolution and serialization (Avro, Parquet) Performance tuning and fault-tolerance Preferred technical and professional experience Good communication skill. India market experience is preferred.

Posted 2 months ago

Apply

10.0 - 15.0 years

8 - 14 Lacs

Chennai

Work from Office

Years of Experience : 10-15 Yrs Shifts : 24*7 (Rotational Shift) Mode : Onsite Experience : 10+ yrs of experience in IT, with At least 7+ years of experience with cloud and system administration. At least 5 years of experience with and strong understanding of 'big data' technologies in Hadoop ecosystem - Hive, HDFS, Map/Reduce, Flume, Pig, Cloudera, HBase Sqoop, Spark etc. Job Overview : Smartavya Analytica Private Limited is seeking an experienced Hadoop Administrator to manage and support our Hadoop ecosystem. The ideal candidate will have strong expertise in Hadoop cluster administration, excellent troubleshooting skills, and a proven track record of maintaining and optimizing Hadoop environments. Key Responsibilities: Install, configure, and manage Hadoop clusters, including HDFS, YARN, Hive, HBase, and other ecosystem components. Monitor and manage Hadoop cluster performance, capacity, and security. Perform routine maintenance tasks such as upgrades, patching, and backups. Implement and maintain data ingestion processes using tools like Sqoop, Flume, and Kafka. Ensure high availability and disaster recovery of Hadoop clusters. Collaborate with development teams to understand requirements and provide appropriate Hadoop solutions. Troubleshoot and resolve issues related to the Hadoop ecosystem. Maintain documentation of Hadoop environment configurations, processes, and procedures. Requirement : Experience in Installing, configuring and tuning Hadoop distributions. Hands on experience in Cloudera. Understanding of Hadoop design principals and factors that affect distributed system performance, including hardware and network considerations. Provide Infrastructure Recommendations, Capacity Planning, work load management. Develop utilities to monitor cluster better Ganglia, Nagios etc. Manage large clusters with huge volumes of data Perform Cluster maintenance tasks Create and removal of nodes, cluster monitoring and troubleshooting Manage and review Hadoop log files

Posted 2 months ago

Apply

8.0 - 13.0 years

5 - 12 Lacs

Mysuru, Pune

Hybrid

Role & responsibilities 4+ years of experience as a Data Engineer or similar role - 3+ of years experience building data solutions at scale using one of the Enterprise Data platforms – Palantir Foundry, Snowflake, Cloudera/Hive, Amazon Redshift - 3+ years of experience with SQL and No-SQL databases (Snowflake or Hive) - 3+ years of hands-on experience with programming using Python, Spark or C# - Experience with DevOps principals and CI/CD - Strong understanding of ETL principles and data integration patterns - Experience with Agile and iterative development processes is a plus - Experience with cloud services such as AWS, Azure etc. and other big data tools like Spark, Kafka, etc. is a plus (not mandatory) - Knowledge of Typescript & Full stack development experience is a plus (not mandatory)

Posted 2 months ago

Apply

5.0 - 10.0 years

10 - 20 Lacs

Pune, Bengaluru, Delhi / NCR

Hybrid

Job Description for the Data Engineering: Location: PAN INDIA Experience between 7 - 14 years in performing the Data Engineering engagement Have experienced in Cloudera, Hadoop and SnowFlake Worked in the Impala and Kudu systems and can write code in Spark, PySpark Experienced in setting up Ooizee workflows Should be very good in SQL Have experience in Performance Tuning

Posted 2 months ago

Apply

6.0 - 11.0 years

25 - 30 Lacs

Kolkata, Mumbai, New Delhi

Work from Office

PureSoftware is looking for Data Engineer to join our dynamic team and embark on a rewarding career journey. Liaising with coworkers and clients to elucidate the requirements for each task. Conceptualizing and generating infrastructure that allows big data to be accessed and analyzed. Reformulating existing frameworks to optimize their functioning. Testing such structures to ensure that they are fit for use. Preparing raw data for manipulation by data scientists. Detecting and correcting errors in your work. Ensuring that your work remains backed up and readily accessible to relevant coworkers. Remaining up-to-date with industry standards and technological advancements that will improve the quality of your outputs.

Posted 2 months ago

Apply

5 - 8 years

5 - 9 Lacs

Bengaluru

Work from Office

Wipro Limited (NYSEWIT, BSE507685, NSEWIPRO) is a leading technology services and consulting company focused on building innovative solutions that address clients’ most complex digital transformation needs. Leveraging our holistic portfolio of capabilities in consulting, design, engineering, and operations, we help clients realize their boldest ambitions and build future-ready, sustainable businesses. With over 230,000 employees and business partners across 65 countries, we deliver on the promise of helping our customers, colleagues, and communities thrive in an ever-changing world. For additional information, visit us at www.wipro.com. About The Role Role Purpose The purpose of the role is to support process delivery by ensuring daily performance of the Production Specialists, resolve technical escalations and develop technical capability within the Production Specialists. ? Do Oversee and support process by reviewing daily transactions on performance parameters Review performance dashboard and the scores for the team Support the team in improving performance parameters by providing technical support and process guidance Record, track, and document all queries received, problem-solving steps taken and total successful and unsuccessful resolutions Ensure standard processes and procedures are followed to resolve all client queries Resolve client queries as per the SLA’s defined in the contract Develop understanding of process/ product for the team members to facilitate better client interaction and troubleshooting Document and analyze call logs to spot most occurring trends to prevent future problems Identify red flags and escalate serious client issues to Team leader in cases of untimely resolution Ensure all product information and disclosures are given to clients before and after the call/email requests Avoids legal challenges by monitoring compliance with service agreements ? Handle technical escalations through effective diagnosis and troubleshooting of client queries Manage and resolve technical roadblocks/ escalations as per SLA and quality requirements If unable to resolve the issues, timely escalate the issues to TA & SES Provide product support and resolution to clients by performing a question diagnosis while guiding users through step-by-step solutions Troubleshoot all client queries in a user-friendly, courteous and professional manner Offer alternative solutions to clients (where appropriate) with the objective of retaining customers’ and clients’ business Organize ideas and effectively communicate oral messages appropriate to listeners and situations Follow up and make scheduled call backs to customers to record feedback and ensure compliance to contract SLA’s ? Build people capability to ensure operational excellence and maintain superior customer service levels of the existing account/client Mentor and guide Production Specialists on improving technical knowledge Collate trainings to be conducted as triage to bridge the skill gaps identified through interviews with the Production Specialist Develop and conduct trainings (Triages) within products for production specialist as per target Inform client about the triages being conducted Undertake product trainings to stay current with product features, changes and updates Enroll in product specific and any other trainings per client requirements/recommendations Identify and document most common problems and recommend appropriate resolutions to the team Update job knowledge by participating in self learning opportunities and maintaining personal networks ? Deliver NoPerformance ParameterMeasure1ProcessNo. of cases resolved per day, compliance to process and quality standards, meeting process level SLAs, Pulse score, Customer feedback, NSAT/ ESAT2Team ManagementProductivity, efficiency, absenteeism3Capability developmentTriages completed, Technical Test performance Mandatory Skills: Hadoop. Experience5-8 Years. Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention. Of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA - as the world around us changes, so do we. Join a business powered by purpose and a place that empowers you to design your own reinvention. Come to Wipro. Realize your ambitions. Applications from people with disabilities are explicitly welcome.

Posted 2 months ago

Apply

5 - 8 years

6 - 10 Lacs

Bengaluru

Work from Office

About The Role Role Purpose The purpose of this role is to design, test and maintain software programs for operating systems or applications which needs to be deployed at a client end and ensure its meet 100% quality assurance parameters B?ig Data Developer - Spark,Scala,Pyspark Big Data Developer - Spark, Scala, Pyspark Coding & scripting Years of Experience5 to 12 years LocationBangalore Notice Period0 to 30 days Key Skills: - Proficient in Spark,Scala,Pyspark coding & scripting - Fluent in big data engineering development using the Hadoop/Spark ecosystem - Hands-on experience in Big Data - Good Knowledge of Hadoop Eco System - Knowledge of cloud architecture AWS - Data ingestion and integration into the Data Lake using the Hadoop ecosystem tools such as Sqoop, Spark, Impala, Hive, Oozie, Airflow etc. - Candidates should be fluent in the Python / Scala language - Strong communication skills ? 2. Perform coding and ensure optimal software/ module development Determine operational feasibility by evaluating analysis, problem definition, requirements, software development and proposed software Develop and automate processes for software validation by setting up and designing test cases/scenarios/usage cases, and executing these cases Modifying software to fix errors, adapt it to new hardware, improve its performance, or upgrade interfaces. Analyzing information to recommend and plan the installation of new systems or modifications of an existing system Ensuring that code is error free or has no bugs and test failure Preparing reports on programming project specifications, activities and status Ensure all the codes are raised as per the norm defined for project / program / account with clear description and replication patterns Compile timely, comprehensive and accurate documentation and reports as requested Coordinating with the team on daily project status and progress and documenting it Providing feedback on usability and serviceability, trace the result to quality risk and report it to concerned stakeholders ? 3. Status Reporting and Customer Focus on an ongoing basis with respect to project and its execution Capturing all the requirements and clarifications from the client for better quality work Taking feedback on the regular basis to ensure smooth and on time delivery Participating in continuing education and training to remain current on best practices, learn new programming languages, and better assist other team members. Consulting with engineering staff to evaluate software-hardware interfaces and develop specifications and performance requirements Document and demonstrate solutions by developing documentation, flowcharts, layouts, diagrams, charts, code comments and clear code Documenting very necessary details and reports in a formal way for proper understanding of software from client proposal to implementation Ensure good quality of interaction with customer w.r.t. e-mail content, fault report tracking, voice calls, business etiquette etc Timely Response to customer requests and no instances of complaints either internally or externally ? Deliver No. Performance Parameter Measure 1. Continuous Integration, Deployment & Monitoring of Software 100% error free on boarding & implementation, throughput %, Adherence to the schedule/ release plan 2. Quality & CSAT On-Time Delivery, Manage software, Troubleshoot queries, Customer experience, completion of assigned certifications for skill upgradation 3. MIS & Reporting 100% on time MIS & report generation Mandatory Skills: Python for Insights. Experience5-8 Years. Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention. Of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA - as the world around us changes, so do we. Join a business powered by purpose and a place that empowers you to design your own reinvention. Come to Wipro. Realize your ambitions. Applications from people with disabilities are explicitly welcome.

Posted 2 months ago

Apply

5 - 8 years

6 - 10 Lacs

Pune

Work from Office

About The Role Role Purpose The purpose of the role is to support process delivery by ensuring daily performance of the Production Specialists, resolve technical escalations and develop technical capability within the Production Specialists. ? Do Oversee and support process by reviewing daily transactions on performance parameters Review performance dashboard and the scores for the team Support the team in improving performance parameters by providing technical support and process guidance Record, track, and document all queries received, problem-solving steps taken and total successful and unsuccessful resolutions Ensure standard processes and procedures are followed to resolve all client queries Resolve client queries as per the SLA’s defined in the contract Develop understanding of process/ product for the team members to facilitate better client interaction and troubleshooting Document and analyze call logs to spot most occurring trends to prevent future problems Identify red flags and escalate serious client issues to Team leader in cases of untimely resolution Ensure all product information and disclosures are given to clients before and after the call/email requests Avoids legal challenges by monitoring compliance with service agreements ? Handle technical escalations through effective diagnosis and troubleshooting of client queries Manage and resolve technical roadblocks/ escalations as per SLA and quality requirements If unable to resolve the issues, timely escalate the issues to TA & SES Provide product support and resolution to clients by performing a question diagnosis while guiding users through step-by-step solutions Troubleshoot all client queries in a user-friendly, courteous and professional manner Offer alternative solutions to clients (where appropriate) with the objective of retaining customers’ and clients’ business Organize ideas and effectively communicate oral messages appropriate to listeners and situations Follow up and make scheduled call backs to customers to record feedback and ensure compliance to contract SLA’s ? Build people capability to ensure operational excellence and maintain superior customer service levels of the existing account/client Mentor and guide Production Specialists on improving technical knowledge Collate trainings to be conducted as triage to bridge the skill gaps identified through interviews with the Production Specialist Develop and conduct trainings (Triages) within products for production specialist as per target Inform client about the triages being conducted Undertake product trainings to stay current with product features, changes and updates Enroll in product specific and any other trainings per client requirements/recommendations Identify and document most common problems and recommend appropriate resolutions to the team Update job knowledge by participating in self learning opportunities and maintaining personal networks ? Deliver NoPerformance ParameterMeasure1ProcessNo. of cases resolved per day, compliance to process and quality standards, meeting process level SLAs, Pulse score, Customer feedback, NSAT/ ESAT2Team ManagementProductivity, efficiency, absenteeism3Capability developmentTriages completed, Technical Test performance Mandatory Skills: Big Data. Experience5-8 Years. Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention. Of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA - as the world around us changes, so do we. Join a business powered by purpose and a place that empowers you to design your own reinvention. Come to Wipro. Realize your ambitions. Applications from people with disabilities are explicitly welcome.

Posted 2 months ago

Apply

5 - 10 years

20 - 35 Lacs

Bengaluru

Work from Office

Position - Pyspark Developer Experience - 5+ yrs Location - Bangalore Notice Period - Immediate - 30 days Roles & Responsibilities: 5+ years of experience as a Data Engineer, with a strong focus on PySpark and the Cloudera Data Platform. PySpark: Advanced proficiency in PySpark, including working with RDDs, Data Frames, and optimization techniques. Cloudera Data Platform: Strong experience with Cloudera Data Platform (CDP) components, including Cloudera Manager, Hive, Impala, HDFS, and HBase. Data Warehousing: Knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala). Big Data Technologies: Familiarity with Hadoop, Kafka, and other distributed computing tools. Orchestration and Scheduling: Experience with Apache Oozie, Airflow, or similar orchestration frameworks. Scripting and Automation: Strong scripting skills in Linux. Data Pipeline Development: Design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark on the Cloudera Data Platform, ensuring data integrity and accuracy. Data Ingestion: Implement and manage data ingestion processes from a variety of sources (e.g., relational databases, APIs, file systems) to the data lake or data warehouse on CDP. Data Transformation and Processing: Use PySpark to process, cleanse, and transform large datasets into meaningful formats that support analytical needs and business requirements. Performance Optimization: Conduct performance tuning of PySpark code and Cloudera components, optimizing resource utilization and reducing runtime of ETL processes. Data Quality and Validation: Implement data quality checks, monitoring, and validation routines to ensure data accuracy and reliability throughout the pipeline. Automation and Orchestration: Automate data workflows using tools like Apache Oozie, Airflow, or similar orchestration tools within the Cloudera ecosystem. Monitoring and Maintenance: Monitor pipeline performance, troubleshoot issues, and perform routine maintenance on the Cloudera Data Platform and associated data processes. Collaboration: Work closely with other data engineers, analysts, product managers, and other stakeholders to understand data requirements and support various data-driven initiatives. Documentation: Maintain thorough documentation of data engineering processes, code, and pipeline configurations.

Posted 2 months ago

Apply

2 - 7 years

6 - 10 Lacs

Bengaluru

Work from Office

Hello Talented Techie! We provide support in Project Services and Transformation, Digital Solutions and Delivery Management. We offer joint operations and digitalization services for Global Business Services and work closely alongside the entire Shared Services organization. We make efficient use of the possibilities of new technologies such as Business Process Management (BPM) and Robotics as enablers for efficient and effective implementations. We are looking for Data Engineer ( AWS, Confluent & Snaplogic ) Data Integration Integrate data from various Siemens organizations into our data factory, ensuring seamless data flow and real-time data fetching. Data Processing Implement and manage large-scale data processing solutions using AWS Glue, ensuring efficient and reliable data transformation and loading. Data Storage Store and manage data in a large-scale data lake, utilizing Iceberg tables in Snowflake for optimized data storage and retrieval. Data Transformation Apply various data transformations to prepare data for analysis and reporting, ensuring data quality and consistency. Data Products Create and maintain data products that meet the needs of various stakeholders, providing actionable insights and supporting data-driven decision-making. Workflow Management Use Apache Airflow to orchestrate and automate data workflows, ensuring timely and accurate data processing. Real-time Data Streaming Utilize Confluent Kafka for real-time data streaming, ensuring low-latency data integration and processing. ETL Processes Design and implement ETL processes using SnapLogic , ensuring efficient data extraction, transformation, and loading. Monitoring and Logging Use Splunk for monitoring and logging data processes, ensuring system reliability and performance. You"™d describe yourself as: Experience 3+ relevant years of experience in data engineering, with a focus on AWS Glue, Iceberg tables, Confluent Kafka, SnapLogic, and Airflow. Technical Skills : Proficiency in AWS services, particularly AWS Glue. Experience with Iceberg tables and Snowflake. Knowledge of Confluent Kafka for real-time data streaming. Familiarity with SnapLogic for ETL processes. Experience with Apache Airflow for workflow management. Understanding of Splunk for monitoring and logging. Programming Skills Proficiency in Python, SQL, and other relevant programming languages. Data Modeling Experience with data modeling and database design. Problem-Solving Strong analytical and problem-solving skills, with the ability to troubleshoot and resolve data-related issues. Preferred Qualities: Attention to Detail Meticulous attention to detail, ensuring data accuracy and quality. Communication Skills Excellent communication skills, with the ability to collaborate effectively with cross-functional teams. Adaptability Ability to adapt to changing technologies and work in a fast-paced environment. Team Player Strong team player with a collaborative mindset. Continuous Learning Eagerness to learn and stay updated with the latest trends and technologies in data engineering. Create a better #TomorrowWithUs! This role, based in Bangalore, is an individual contributor position. You may be required to visit other locations within India and internationally. In return, you'll have the opportunity to work with teams shaping the future. At Siemens, we are a collection of over 312,000 minds building the future, one day at a time, worldwide. We value your unique identity and perspective and are fully committed to providing equitable opportunities and building a workplace that reflects the diversity of society. Come bring your authentic self and create a better tomorrow with us. Find out more about Siemens careers at: www.siemens.com/careers

Posted 2 months ago

Apply

5 - 8 years

25 - 30 Lacs

Hyderabad

Work from Office

Ecodel Infotel pvt ltd is looking for Data Engineer to join our dynamic team and embark on a rewarding career journey. Liaising with coworkers and clients to elucidate the requirements for each task. Conceptualizing and generating infrastructure that allows big data to be accessed and analyzed. Reformulating existing frameworks to optimize their functioning. Testing such structures to ensure that they are fit for use. Preparing raw data for manipulation by data scientists. Detecting and correcting errors in your work. Ensuring that your work remains backed up and readily accessible to relevant coworkers. Remaining up-to-date with industry standards and technological advancements that will improve the quality of your outputs.

Posted 2 months ago

Apply

5 - 7 years

0 - 0 Lacs

Thiruvananthapuram

Work from Office

Key Responsibilities: Big Data Architecture: Design, develop, and maintain scalable and distributed data architectures capable of processing large volumes of data. Data Storage Solutions: Implement and optimize data storage solutions using technologies such as Hadoop , Spark , and PySpark . PySpark Development: Develop and implement efficient ETL processes using PySpark to extract, transform, and load large datasets. Performance Optimization: Optimize PySpark applications for better performance, scalability, and resource management. Qualifications: Proven experience as a Big Data Engineer with a strong focus on PySpark . Deep understanding of Big Data processing frameworks and technologies. Strong proficiency in PySpark for developing and optimizing ETL processes and data transformations. Experience with distributed computing and parallel processing. Ability to collaborate in a fast-paced, innovative environment. Required Skills Pyspark,Big Data, Python,

Posted 2 months ago

Apply

2 - 5 years

6 - 10 Lacs

Gurugram

Work from Office

KDataScience (USA & INDIA) is looking for Data Engineer to join our dynamic team and embark on a rewarding career journey Liaising with coworkers and clients to elucidate the requirements for each task. Conceptualizing and generating infrastructure that allows big data to be accessed and analyzed. Reformulating existing frameworks to optimize their functioning. Testing such structures to ensure that they are fit for use. Preparing raw data for manipulation by data scientists. Detecting and correcting errors in your work. Ensuring that your work remains backed up and readily accessible to relevant coworkers. Remaining up-to-date with industry standards and technological advancements that will improve the quality of your outputs.

Posted 2 months ago

Apply

3 - 8 years

20 - 30 Lacs

Bengaluru

Work from Office

Greetings from Clover Infotech!!! Please review the job details and share the required details if you are interested in proceeding further. If you are not interested, request you to help me reach the right candidate Job Title: Data Engineer (PySpark) Location: Bangalore Employment Type: - Full-Time Experience Required: 3+ years About the Role We are looking for a highly skilled Data Engineer (PySpark) to join our dynamic data engineering team. This position is ideal for a technically sound and detail-oriented professional who has deep expertise in PySpark and the Cloudera Data Platform (CDP) . You will be instrumental in designing, developing, and maintaining scalable data pipelines, ensuring high data quality, and enabling data availability across our organization. The ideal candidate brings hands-on experience with big data technologies, data ingestion, transformation, and optimization using Clouderas ecosystem. Youll collaborate with cross-functional teams to create robust, high-performing data solutions that support business insights and strategic decision-making. Key Responsibilities Data Pipeline Development : Design, build, and manage scalable and high-performance ETL pipelines using PySpark on Cloudera. Data Ingestion : Integrate data from multiple sources (relational databases, APIs, file systems) into CDP environments. Data Processing : Cleanse, transform, and process large datasets to meet business and analytics needs. Performance Optimization : Fine-tune PySpark jobs and Cloudera tools to improve ETL performance and resource utilization. Data Quality : Implement and maintain data validation, error handling, and quality control processes. Workflow Automation : Automate jobs using tools like Apache Oozie, Airflow, or similar orchestration frameworks. Monitoring and Maintenance : Monitor job performance and ensure system reliability through proactive issue identification and resolution. Team Collaboration : Work closely with data analysts, data scientists, and business stakeholders to understand requirements and deliver data-driven solutions. Documentation : Create and maintain clear, comprehensive documentation for pipelines, processes, and systems. Qualifications Education & Experience Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field. Minimum 3 years of hands-on experience in a Data Engineering role with a focus on PySpark and Cloudera Data Platform. Technical Skills Advanced experience with PySpark (RDDs, DataFrames, performance tuning). Strong knowledge of Cloudera Data Platform and its components: Cloudera Manager, Hive, Impala, HDFS, HBase. Solid understanding of ETL processes , data warehousing , and SQL -based tools. Exposure to Hadoop , Kafka , and other big data frameworks. Experience in workflow orchestration using tools like Apache Oozie , Apache Airflow , etc. Proficient in Linux scripting and automation. Soft Skills Strong analytical and troubleshooting abilities. Effective communication skills – verbal and written. Self-starter with the ability to work independently and collaboratively. Detail-oriented with a commitment to delivering high-quality data solutions. What We Offer An opportunity to work with cutting-edge big data technologies. A collaborative team culture focused on innovation and growth. Career advancement and learning opportunities. Competitive compensation and benefits package. Please share the following details to proceed further. Currently Salary: - Expected Salary: - Notice Period: - Reason for looking for change: - Updated Resume: -Please attach. Job Application Disclaimer” We appreciate your interest in this opportunity. Due to the large number of applications, only shortlisted candidates will be contacted for an interview. However, we will keep your resume on file for future opportunities that match your profile. Thanks Vijin.appukuttan@cloverinfotech.com

Posted 2 months ago

Apply

2 - 5 years

14 - 17 Lacs

Mumbai

Work from Office

Who you are A seasoned Data Engineer with a passion for building and managing data pipelines in large-scale environments. Have good experience working with big data technologies, data integration frameworks, and cloud-based data platforms. Have a strong foundation in Apache Spark, PySpark, Kafka, and SQL.What you’ll doAs a Data Engineer – Data Platform Services, your responsibilities include: Data Ingestion & Processing Assisting in building and optimizing data pipelines for structured and unstructured data. Working with Kafka and Apache Spark to manage real-time and batch data ingestion. Supporting data integration using IBM CDC and Universal Data Mover (UDM). Big Data & Data Lakehouse Management Managing and processing large datasets using PySpark and Iceberg tables. Assisting in migrating data workloads from IIAS to Cloudera Data Lake. Supporting data lineage tracking and metadata management for compliance. Optimization & Performance Tuning Helping to optimize PySpark jobs for efficiency and scalability. Supporting data partitioning, indexing, and caching strategies. Monitoring and troubleshooting pipeline issues and performance bottlenecks. Security & Compliance Implementing role-based access controls (RBAC) and encryption policies. Supporting data security and compliance efforts using Thales CipherTrust. Ensuring data governance best practices are followed. Collaboration & Automation Working with Data Scientists, Analysts, and DevOps teams to enable seamless data access. Assisting in automation of data workflows using Apache Airflow. Supporting Denodo-based data virtualization for efficient data access. Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise 4-7 years of experience in big data engineering, data integration, and distributed computing. Strong skills in Apache Spark, PySpark, Kafka, SQL, and Cloudera Data Platform (CDP). Proficiency in Python or Scala for data processing. Experience with data pipeline orchestration tools (Apache Airflow, Stonebranch UDM). Understanding of data security, encryption, and compliance frameworks. Preferred technical and professional experience Experience in banking or financial services data platforms. Exposure to Denodo for data virtualization and DGraph for graph-based insights. Familiarity with cloud data platforms (AWS, Azure, GCP). Certifications in Cloudera Data Engineering, IBM Data Engineering, or AWS Data Analytics.

Posted 2 months ago

Apply

2 - 5 years

14 - 17 Lacs

Mumbai

Work from Office

Who you are: A Data Engineer specializing in enterprise data platforms, experienced in building, managing, and optimizing data pipelines for large-scale environments. Having expertise in big data technologies, distributed computing, data ingestion, and transformation frameworks. Proficient in Apache Spark, PySpark, Kafka, and Iceberg tables, and understand how to design and implement scalable, high-performance data processing solutions.What you’ll doAs a Data Engineer – Data Platform Services, responsibilities include: Data Ingestion & Processing Designing and developing data pipelines to migrate workloads from IIAS to Cloudera Data Lake. Implementing streaming and batch data ingestion frameworks using Kafka, Apache Spark (PySpark). Working with IBM CDC and Universal Data Mover to manage data replication and movement. Big Data & Data Lakehouse Management Implementing Apache Iceberg tables for efficient data storage and retrieval. Managing distributed data processing with Cloudera Data Platform (CDP). Ensuring data lineage, cataloging, and governance for compliance with Bank/regulatory policies. Optimization & Performance Tuning Optimizing Spark and PySpark jobs for performance and scalability. Implementing data partitioning, indexing, and caching to enhance query performance. Monitoring and troubleshooting pipeline failures and performance bottlenecks. Security & Compliance Ensuring secure data access, encryption, and masking using Thales CipherTrust. Implementing role-based access controls (RBAC) and data governance policies. Supporting metadata management and data quality initiatives. Collaboration & Automation Working closely with Data Scientists, Analysts, and DevOps teams to integrate data solutions. Automating data workflows using Airflow and implementing CI/CD pipelines with GitLab and Sonatype Nexus. Supporting Denodo-based data virtualization for seamless data access. Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise 4-7 years of experience in big data engineering, data integration, and distributed computing. Strong skills in Apache Spark, PySpark, Kafka, SQL, and Cloudera Data Platform (CDP). Proficiency in Python or Scala for data processing. Experience with data pipeline orchestration tools (Apache Airflow, Stonebranch UDM). Understanding of data security, encryption, and compliance frameworks. Preferred technical and professional experience Experience in banking or financial services data platforms. Exposure to Denodo for data virtualization and DGraph for graph-based insights. Familiarity with cloud data platforms (AWS, Azure, GCP). Certifications in Cloudera Data Engineering, IBM Data Engineering, or AWS Data Analytics.

Posted 2 months ago

Apply

2 - 5 years

7 - 11 Lacs

Mumbai

Work from Office

What you’ll do As a Data Engineer – Data Modeling, you will be responsible for: Data Modeling & Schema Design Developing conceptual, logical, and physical data models to support enterprise data requirements. Designing schema structures for Apache Iceberg tables on Cloudera Data Platform. Collaborating with ETL developers and data engineers to optimize data models for efficient ingestion and retrieval. Data Governance & Quality Assurance Ensuring data accuracy, consistency, and integrity across data models. Supporting data lineage and metadata management to enhance data traceability. Implementing naming conventions, data definitions, and standardization in collaboration with governance teams. ETL & Data Pipeline Support Assisting in the migration of data from IIAS to Cloudera Data Lake by designing efficient data structures. Working with Denodo for data virtualization, ensuring optimized data access across multiple sources. Collaborating with teams using Talend Data Quality (DQ) tools to ensure high-quality data in the models. Collaboration & Documentation Working closely with business analysts, architects, and reporting teams to understand data requirements. Maintaining data dictionaries, entity relationships, and technical documentation for data models. Supporting data visualization and analytics teams by designing reporting-friendly data models. Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise 4-7 years of experience in data modeling, database design, and data engineering. Hands-on experience with ERwin Data Modeler for creating and managing data models. Strong knowledge of relational databases (PostgreSQL) and big data platforms (Cloudera, Apache Iceberg). Proficiency in SQL and NoSQL database concepts. Understanding of data governance, metadata management, and data security principles. Familiarity with ETL processes and data pipeline optimization. Strong analytical, problem-solving, and documentation skills. Preferred technical and professional experience Experience working on Cloudera migration projects. Exposure to Denodo for data virtualization and Talend DQ for data quality management. Knowledge of Kafka, Airflow, and PySpark for data processing. Familiarity with GitLab, Sonatype Nexus, and CheckMarx for CI/CD and security compliance. Certifications in Data Modeling, Cloudera Data Engineering, or IBM Data Solutions.

Posted 2 months ago

Apply
cta

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies