4.0 - 8.0 years
15 - 30 Lacs
Noida, Hyderabad, India
Hybrid
Spark architecture, Spark tuning, Delta tables, medallion architecture, Databricks, Azure cloud services, Python OOP concepts, complex PySpark transformations, reading data from different file formats and sources and writing to Delta tables, data warehousing concepts, and processing large files and handling pipeline failures in current projects. Roles and Responsibilities mirror the skill set above.
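For context, a minimal PySpark sketch of the pattern this role describes: reading from different file formats and landing the results in Delta tables (the bronze layer of a medallion architecture). All paths, column names, and table locations here are hypothetical placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ingest-to-delta").getOrCreate()

# Read two feeds arriving in different source formats.
orders_csv = spark.read.option("header", "true").csv("/mnt/raw/orders/")
events_json = spark.read.json("/mnt/raw/events/")

# A simple cleansing transformation before landing the data.
orders = orders_csv.dropDuplicates(["order_id"]).filter("order_id IS NOT NULL")

# Append into bronze-layer Delta tables (Delta Lake available by default on Databricks).
orders.write.format("delta").mode("append").save("/mnt/bronze/orders")
events_json.write.format("delta").mode("append").save("/mnt/bronze/events")
```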
Posted 1 month ago
8.0 - 13.0 years
35 - 50 Lacs
Mumbai
Work from Office
Hiring Big Data Lead with 8+ years of experience for the US shift. Must have: - Big Data: Spark, Hadoop, Kafka, Hive, Flink - Backend: Python, Scala - NoSQL: MongoDB, Cassandra - Cloud: AWS/Azure/GCP, Snowflake, Databricks - Docker, Kubernetes, CI/CD. Required candidate profile: - Excellent at mentoring/training in Big Data: HDFS, YARN, Airflow, Hive, MapReduce, HBase, Kafka, ETL/ELT, real-time streaming, data modeling - Immediate joiner is a plus - Excellent communication skills
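As an illustration of the Spark-plus-Kafka real-time streaming stack this role lists, a minimal PySpark Structured Streaming sketch; the broker address, topic, and output paths are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-stream").getOrCreate()

# Subscribe to a Kafka topic as a streaming DataFrame.
# Requires the spark-sql-kafka connector package on the classpath.
stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "clickstream")
    .load()
)

# Kafka delivers key/value as binary; cast the payload to a string.
events = stream.selectExpr("CAST(value AS STRING) AS raw_event")

# Write the stream out with checkpointing for fault tolerance.
query = (
    events.writeStream.format("parquet")
    .option("path", "/data/landing/clickstream")
    .option("checkpointLocation", "/data/checkpoints/clickstream")
    .start()
)
query.awaitTermination()
```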
Posted 1 month ago
3.0 - 6.0 years
5 - 9 Lacs
Bengaluru
Work from Office
Your Role Strong Spark programming experience with Java. Good knowledge of SQL query writing and shell scripting. Experience working in Agile mode. Analyze, design, develop, deploy, and operate high-performance, high-quality services that serve users in a cloud environment. Good understanding of the client ecosystem and expectations. In charge of code reviews, integration process, test organization, and quality of delivery. Take part in development. Experienced in writing queries using SQL commands. Experienced in deploying and operating code in a cloud environment. Experienced in working without much supervision. Your Profile Primary Skill: Java, Spark, SQL. Secondary Skill / Good to have: Hadoop or any cloud technology, Kafka, or BO. What you'll love about working here: Choosing Capgemini means having the opportunity to make a difference, whether for the world's leading businesses or for society. It means getting the support you need to shape your career in the way that works for you. It means when the future doesn't look as bright as you'd like, you have the opportunity to make change, to rewrite it. When you join Capgemini, you don't just start a new job. You become part of something bigger. A diverse collective of free-thinkers, entrepreneurs and experts, all working together to unleash human energy through technology, for an inclusive and sustainable future. At Capgemini, people are at the heart of everything we do! You can exponentially grow your career by being part of innovative projects and taking advantage of our extensive Learning & Development programs. With us, you will experience an inclusive, safe, healthy, and flexible work environment to bring out the best in you! You also get a chance to make positive social change and build a better world by taking an active role in our Corporate Social Responsibility and Sustainability initiatives. And whilst you make a difference, you will also have a lot of fun.
Posted 1 month ago
12.0 - 15.0 years
13 - 17 Lacs
Mumbai
Work from Office
12+ years of experience in the Big Data space across architecture, design, development, testing, and deployment, with a full understanding of the SDLC. 1. Experience with Hadoop and its related technology stack. 2. Experience with the Hadoop ecosystem (HDP + CDP) / Big Data (especially Hive); hands-on experience with programming languages such as Java/Scala/Python; hands-on experience/knowledge of Spark. 3. Responsible for and focused on uptime and reliable running of all our ingestion/ETL jobs. 4. Good SQL and experience working in a Unix/Linux environment is a must. 5. Create and maintain optimal data pipeline architecture. 6. Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement. 7. Good to have: cloud experience. 8. Good to have: experience with Hadoop integration with data visualization tools like Power BI. Location: Mumbai, Pune, Chennai, Hyderabad, Coimbatore, Kolkata
Posted 1 month ago
5.0 - 10.0 years
14 - 17 Lacs
Pune
Work from Office
As a Big Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in data engineering activities like creating pipelines/workflows from source to target and implementing solutions that tackle the client's needs. Your primary responsibilities include: Design, build, optimize, and support new and existing data models and ETL processes based on our client's business requirements. Build, deploy, and manage data infrastructure that can adequately handle the needs of a rapidly growing data-driven organization. Coordinate data access and security to enable data scientists and analysts to easily access data whenever they need it. Required education: Bachelor's Degree. Preferred education: Master's Degree. Required technical and professional expertise: Must have 5+ years of experience in Big Data: Hadoop, Spark, Scala, Python, HBase, Hive. Good to have: AWS (S3, Athena, DynamoDB, Lambda), Jenkins, Git. Developed Python and PySpark programs for data analysis. Good working experience with Python to develop a custom framework for generating rules (like a rules engine). Developed Python code to gather data from HBase and designed the solution to implement it using PySpark. Apache Spark DataFrames/RDDs were used to apply business transformations, and Hive context objects were utilized to perform read/write operations. Preferred technical and professional experience: Understanding of DevOps. Experience in building scalable end-to-end data ingestion and processing solutions. Experience with object-oriented and/or functional programming languages, such as Python, Java, and Scala.
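A minimal sketch of the DataFrame-plus-Hive pattern this posting describes; in modern PySpark, a Hive-enabled SparkSession replaces the older HiveContext object. Database, table, and column names below are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hive support lets Spark read and write Hive metastore tables.
spark = (
    SparkSession.builder.appName("hive-transform")
    .enableHiveSupport()
    .getOrCreate()
)

# Read a Hive table, apply a business transformation, write the result back.
txns = spark.table("finance.transactions")
summary = (
    txns.filter(F.col("status") == "SETTLED")
    .groupBy("account_id")
    .agg(F.sum("amount").alias("total_amount"))
)
summary.write.mode("overwrite").saveAsTable("finance.account_totals")
```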
Posted 1 month ago
3.0 - 8.0 years
9 - 13 Lacs
Mumbai
Work from Office
Role Overview: As a Big Data Engineer, you'll design and build robust data pipelines on Cloudera using Spark (Scala/PySpark) for ingestion, transformation, and processing of high-volume data from banking systems. Key Responsibilities: Build scalable batch and real-time ETL pipelines using Spark and Hive. Integrate structured and unstructured data sources. Perform performance tuning and code optimization. Support orchestration and job scheduling (NiFi, Airflow). Required education: Bachelor's Degree. Preferred education: Master's Degree. Required technical and professional expertise: Experience: 3-15 years. Proficiency in PySpark/Scala with Hive/Impala. Experience with data partitioning, bucketing, and optimization. Familiarity with Kafka, Iceberg, and NiFi is a must. Knowledge of banking or financial datasets is a plus.
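To illustrate the partitioning and bucketing skills called out above, a small PySpark sketch; the table and column names are hypothetical, and the bucket count would be tuned to real data volumes.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("partition-bucket")
    .enableHiveSupport()
    .getOrCreate()
)

txns = spark.table("staging.card_transactions")

# Partitioning prunes whole directories at read time for date-filtered queries;
# bucketing pre-shuffles rows by a join key so joins on account_id can skip a shuffle.
# Note that bucketBy requires saveAsTable rather than a plain path write.
(
    txns.write.mode("overwrite")
    .partitionBy("txn_date")
    .bucketBy(32, "account_id")
    .sortBy("account_id")
    .saveAsTable("curated.card_transactions")
)
```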
Posted 1 month ago
5.0 - 10.0 years
14 - 17 Lacs
Mumbai
Work from Office
As a Big Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in data engineering activities like creating pipelines/workflows from source to target and implementing solutions that tackle the client's needs. Your primary responsibilities include: Design, build, optimize, and support new and existing data models and ETL processes based on our client's business requirements. Build, deploy, and manage data infrastructure that can adequately handle the needs of a rapidly growing data-driven organization. Coordinate data access and security to enable data scientists and analysts to easily access data whenever they need it. Required education: Bachelor's Degree. Preferred education: Master's Degree. Required technical and professional expertise: Must have 5+ years of experience in Big Data: Hadoop, Spark, Scala, Python, HBase, Hive. Good to have: AWS (S3, Athena, DynamoDB, Lambda), Jenkins, Git. Developed Python and PySpark programs for data analysis. Good working experience with Python to develop a custom framework for generating rules (like a rules engine). Developed Python code to gather data from HBase and designed the solution to implement it using PySpark. Apache Spark DataFrames/RDDs were used to apply business transformations, and Hive context objects were utilized to perform read/write operations. Preferred technical and professional experience: Understanding of DevOps. Experience in building scalable end-to-end data ingestion and processing solutions. Experience with object-oriented and/or functional programming languages, such as Python, Java, and Scala.
Posted 1 month ago
15.0 - 20.0 years
5 - 9 Lacs
Mumbai
Work from Office
Location: Mumbai. Role Overview: As a Big Data Engineer, you'll design and build robust data pipelines on Cloudera using Spark (Scala/PySpark) for ingestion, transformation, and processing of high-volume data from banking systems. Key Responsibilities: Build scalable batch and real-time ETL pipelines using Spark and Hive. Integrate structured and unstructured data sources. Perform performance tuning and code optimization. Support orchestration and job scheduling (NiFi, Airflow). Required education: Bachelor's Degree. Preferred education: Master's Degree. Required technical and professional expertise: Experience: 3–15 years. Proficiency in PySpark/Scala with Hive/Impala. Experience with data partitioning, bucketing, and optimization. Familiarity with Kafka, Iceberg, and NiFi is a must. Knowledge of banking or financial datasets is a plus.
Posted 1 month ago
5.0 - 10.0 years
10 - 20 Lacs
Chennai
Remote
Role & responsibilities: Develop, maintain, and enhance new data sources and tables, contributing to data engineering efforts to ensure a comprehensive and efficient data architecture. Serve as the liaison between the data engineering team and the airport operations teams, developing new data sources and overseeing enhancements to existing databases; act as one of the main contact points for data requests, metadata, and statistical analysis. Migrate all existing Hive Metastore tables to Unity Catalog, addressing access issues and ensuring a smooth transition of jobs and tables. Collaborate with IT teams to validate package (gold-level data) table outputs during the production deployment of developed notebooks. Develop and implement data quality alerting systems and Tableau alerting mechanisms for dashboards, setting up notifications for various thresholds. Create and maintain standard reports and dashboards to provide insights into airport performance, helping guide stations to optimize operations and improve performance. Preferred candidate profile: Master's degree / UG. Minimum 5-10 years of experience. Databricks (Azure). Good communication. Experience developing solutions on a Big Data platform utilizing tools such as Impala and Spark. Advanced knowledge/experience with Azure Databricks, PySpark, (Teradata)/Databricks SQL. Advanced knowledge/experience in Python along with associated development environments (e.g., JupyterHub, PyCharm, etc.). Advanced knowledge/experience in building Tableau, QlikView, or Power BI dashboards. Basic knowledge of HTML and JavaScript. Immediate joiner. Skills, Licenses & Certifications: Strong project management skills. Proficient with Microsoft Office applications (MS Excel, Access, and PowerPoint); advanced knowledge of Microsoft Excel. Advanced aptitude in problem-solving, including the ability to logically structure an appropriate analytical framework. Proficient in SharePoint and PowerApps, with the ability to use the Graph API.
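One simple way to picture the Hive Metastore-to-Unity Catalog migration this role mentions is a CTAS copy into a three-level Unity Catalog namespace. This is only a hedged sketch: the catalog, schema, table, and group names are hypothetical, and real migrations often rely on Databricks' dedicated upgrade tooling rather than hand-written copies.

```python
# Assumes a Databricks workspace with Unity Catalog enabled, where a
# SparkSession named `spark` is predefined (as in a Databricks notebook).

# Copy a legacy Hive Metastore table into a Unity Catalog namespace.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.operations.flight_delays
    AS SELECT * FROM hive_metastore.default.flight_delays
""")

# Grant access so downstream jobs and analysts can switch to the new table;
# the old table is retired only after consumers are re-pointed.
spark.sql("GRANT SELECT ON TABLE main.operations.flight_delays TO `analysts`")
```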
Posted 1 month ago
12.0 - 15.0 years
13 - 17 Lacs
Mumbai
Work from Office
Job Information: Job Opening ID: ZR_1688_JOB. Date Opened: 24/12/2022. Industry: Technology. Work Experience: 12-15 years. Job Title: Big Data Architect. City: Mumbai. Province: Maharashtra. Country: India. Postal Code: 400008. Number of Positions: 4. Location: Mumbai, Pune, Chennai, Hyderabad, Coimbatore, Kolkata. 12+ years of experience in the Big Data space across architecture, design, development, testing, and deployment, with a full understanding of the SDLC. 1. Experience with Hadoop and its related technology stack. 2. Experience with the Hadoop ecosystem (HDP + CDP) / Big Data (especially Hive); hands-on experience with programming languages such as Java/Scala/Python; hands-on experience/knowledge of Spark. 3. Responsible for and focused on uptime and reliable running of all our ingestion/ETL jobs. 4. Good SQL and experience working in a Unix/Linux environment is a must. 5. Create and maintain optimal data pipeline architecture. 6. Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement. 7. Good to have: cloud experience. 8. Good to have: experience with Hadoop integration with data visualization tools like Power BI.
Posted 1 month ago
6.0 - 11.0 years
8 - 13 Lacs
Hyderabad
Work from Office
10+ years of software development experience building large-scale distributed data processing systems/applications, data engineering, or large-scale internet systems. At least 4 years of experience developing/leading Big Data solutions at enterprise scale, with at least one end-to-end implementation. Strong experience in programming languages: Java/J2EE/Scala. Good experience in Spark/Hadoop/HDFS architecture, YARN, Confluent Kafka, HBase, Hive, Impala, and NoSQL databases. Experience with batch processing and AutoSys job scheduling and monitoring. Performance analysis, troubleshooting, and resolution (this includes familiarity with and investigation of Cloudera/Hadoop logs). Work with Cloudera on open issues that would result in cluster configuration changes, and implement them as needed. Strong experience with databases such as SQL, Hive, Elasticsearch, HBase, etc. Knowledge of Hadoop security, data management, and governance. Primary Skills: Java/Scala, ETL, Spark, Hadoop, Hive, Impala, Sqoop, HBase, Confluent Kafka, Oracle, Linux, Git, Jenkins CI/CD
Posted 1 month ago
4.0 - 9.0 years
17 - 27 Lacs
Chennai, Bengaluru
Work from Office
Role & responsibilities • Experience with big data technologies (Hadoop, Spark, Hive) • Proven experience as a development data engineer or in a similar role, with an ETL background • Experience with data integration/ETL best practices and data quality principles • Play a crucial role in ensuring the quality and reliability of the data by designing, implementing, and executing comprehensive testing • Build the comprehensive code base and business rules for testing and validating the data by going over the user stories • Knowledge of continuous integration and continuous deployment (CI/CD) pipelines • Familiarity with Agile/Scrum development methodologies • Excellent analytical and problem-solving skills • Strong communication and collaboration skills
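As a small illustration of the data quality validation this role emphasizes, a hedged PySpark sketch of row-count and null-key checks between a source and its ETL target; the table and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").enableHiveSupport().getOrCreate()

source = spark.table("staging.customers")
target = spark.table("warehouse.customers")

# Check 1: no rows lost between source and target.
src_count, tgt_count = source.count(), target.count()
assert tgt_count == src_count, f"Row count mismatch: {src_count} vs {tgt_count}"

# Check 2: the key column must never be null in the target.
null_keys = target.filter(F.col("customer_id").isNull()).count()
assert null_keys == 0, f"{null_keys} null customer_id values found"

print("All data quality checks passed")
```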
Posted 1 month ago
3.0 - 6.0 years
5 - 9 Lacs
Hyderabad
Work from Office
Job Role Strong Spark programming experience with Java. Good knowledge of SQL query writing and shell scripting. Experience working in Agile mode. Analyze, design, develop, deploy, and operate high-performance, high-quality services that serve users in a cloud environment. Good understanding of the client ecosystem and expectations. In charge of code reviews, integration process, test organization, and quality of delivery. Take part in development. Experienced in writing queries using SQL commands. Experienced in deploying and operating code in a cloud environment. Experienced in working without much supervision. Your Profile Primary Skill: Java, Spark, SQL. Secondary Skill / Good to have: Hadoop or any cloud technology, Kafka, or BO. What you'll love about working here: Choosing Capgemini means having the opportunity to make a difference, whether for the world's leading businesses or for society. It means getting the support you need to shape your career in the way that works for you. It means when the future doesn't look as bright as you'd like, you have the opportunity to make change, to rewrite it. When you join Capgemini, you don't just start a new job. You become part of something bigger. A diverse collective of free-thinkers, entrepreneurs and experts, all working together to unleash human energy through technology, for an inclusive and sustainable future. At Capgemini, people are at the heart of everything we do! You can exponentially grow your career by being part of innovative projects and taking advantage of our extensive Learning & Development programs. With us, you will experience an inclusive, safe, healthy, and flexible work environment to bring out the best in you! You also get a chance to make positive social change and build a better world by taking an active role in our Corporate Social Responsibility and Sustainability initiatives. And whilst you make a difference, you will also have a lot of fun.
Posted 2 months ago
3.0 - 8.0 years
1 - 5 Lacs
Bengaluru
Work from Office
Project Role: Infra Tech Support Practitioner. Project Role Description: Provide ongoing technical support and maintenance of production and development systems and software products (both remote and onsite) and for configured services running on various platforms (operating within a defined operating model and processes). Provide hardware/software support and implement technology at the operating-system level across all server and network areas, and for particular software solutions/vendors/brands. Work includes L1 and L2 (basic and intermediate level) troubleshooting. Must have skills: AIX System Administration. Good to have skills: Linux Operations, Red Hat OS Administration. Minimum 3 year(s) of experience is required. Educational Qualification: 15 years full-time education. Summary: As an Infra Tech Support Practitioner, you will engage in the ongoing technical support and maintenance of production and development systems and software products. Your typical day will involve addressing various technical issues, providing both remote and onsite assistance, and ensuring that configured services operate smoothly across multiple platforms. You will work within a defined operating model and processes, focusing on delivering high-quality support to meet the needs of the organization and its clients. Roles & Responsibilities: - Expected to perform independently and become an SME. - Active participation/contribution in team discussions is required. - Contribute to providing solutions to work-related problems. - Assist in the implementation of technology at the operating-system level across all server and network areas. - Engage in basic and intermediate level troubleshooting for hardware and software issues. Professional & Technical Skills: - Must Have Skills: Proficiency in AIX System Administration. - Good To Have Skills: Experience with Linux Operations, Red Hat OS Administration. - Strong understanding of server and network management. - Experience with system monitoring and performance tuning. - Familiarity with backup and recovery solutions. Additional Information: - The candidate should have a minimum of 3 years of experience in AIX System Administration. - This position is based at our Bengaluru office. - 15 years of full-time education is required.
Posted 2 months ago
3.0 - 5.0 years
9 - 13 Lacs
Pune
Work from Office
Job Title: Big Data Tester. About Us: Capco, a Wipro company, is a global technology and management consulting firm. Awarded Consultancy of the Year at the British Bank Awards and ranked among the Top 100 Best Companies for Women in India 2022 by Avtar & Seramount. With a presence in 32 cities across the globe, we support 100+ clients across the banking, financial services, and energy sectors. We are recognized for our deep transformation execution and delivery. WHY JOIN CAPCO: You will work on engaging projects with the largest international and local banks, insurance companies, payment service providers and other key players in the industry; projects that will transform the financial services industry. MAKE AN IMPACT: Innovative thinking, delivery excellence and thought leadership to help our clients transform their business. Together with our clients and industry partners, we deliver disruptive work that is changing energy and financial services. #BEYOURSELFATWORK: Capco has a tolerant, open culture that values diversity, inclusivity, and creativity. CAREER ADVANCEMENT: With no forced hierarchy at Capco, everyone has the opportunity to grow as we grow, taking their career into their own hands. DIVERSITY & INCLUSION: We believe that diversity of people and perspective gives us a competitive advantage. Job Title: Big Data Engineer. Role: Support, develop, and maintain automated test frameworks, tools, and test cases for Data Engineering and Data Warehouse applications. Collaborate with cross-functional teams, including software developers, data engineers, and data analysts, to ensure comprehensive testing coverage and adherence to quality standards. Conduct thorough testing of data pipelines, ETL processes, and data transformations using Big Data technologies. Apply your knowledge of Data Warehouse/Data Lake methodologies and best practices to validate the accuracy, completeness, and performance of our data storage and retrieval systems. Identify, document, and track software defects, working closely with the development team to ensure timely resolution. Participate in code reviews, design discussions, and quality assurance meetings to provide valuable insights and contribute to the overall improvement of our software products. Base Skill Requirements (Must, Technical): Bachelor's or Master's degree in Computer Science, Engineering, or a related field. 3-5 years of experience in software testing and development, with a focus on data-intensive applications. Proven experience in testing data pipelines and ETL processes: test planning, test environment planning, end-to-end testing, performance testing. Solid programming skills in Python, with proven automation effort to bring efficiency to test cycles. Solid understanding of data models and SQL. Must have experience with ETL (Extract, Transform, Load) processes and tools (scheduling and orchestration tools, ETL design understanding). Good understanding of Big Data technologies like Spark, Hive, and Impala. Understanding of Data Warehouse methodologies, applications, and processes. Experience working in an Agile/Scrum environment, with a solid understanding of user stories, acceptance criteria, and sprint cycles. Optional (Technical): Experience with scripting languages like Bash or Shell. Experience working with large-scale datasets and distributed data processing frameworks (e.g., Hadoop, Spark). Familiarity with data integration tools like Apache NiFi is a plus. Excellent problem-solving and debugging skills, with a keen eye for detail. Strong communication and collaboration skills to work effectively in a team-oriented environment. Eagerness to learn and contribute to a growing team.
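For flavor, a minimal pytest-style sketch of the automated ETL testing this role describes, validating a transformation on a tiny in-memory frame; the checks and column names are hypothetical stand-ins for real business rules.

```python
import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark():
    # Local session so tests run without a cluster.
    return SparkSession.builder.master("local[2]").appName("etl-tests").getOrCreate()


def test_amounts_are_non_negative(spark):
    # Tiny input frame standing in for the real source extract.
    src = spark.createDataFrame(
        [("A1", 120.0), ("A2", 75.5)], ["account_id", "amount"]
    )
    # Stand-in for the real transformation under test.
    result = src.filter("amount >= 0")
    assert result.count() == src.count()


def test_no_duplicate_keys(spark):
    src = spark.createDataFrame([("A1",), ("A2",)], ["account_id"])
    assert src.count() == src.dropDuplicates(["account_id"]).count()
```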
Posted 2 months ago
4.0 - 9.0 years
5 - 8 Lacs
Gurugram
Work from Office
RARR Technologies is looking for a Hadoop Admin to join our dynamic team and embark on a rewarding career journey. Responsible for managing day-to-day administrative tasks and providing support to employees, customers, and visitors. Responsibilities: 1. Manage incoming and outgoing mail, packages, and deliveries. 2. Maintain office supplies and equipment, and ensure that they are in good working order. 3. Coordinate scheduling and meetings, and make arrangements for travel and accommodations as needed. 4. Greet and assist visitors, and answer and direct phone calls as needed. Requirements: 1. Experience in an administrative support role, with a track record of delivering high-quality work. 2. Excellent organizational and time-management skills. 3. Strong communication and interpersonal skills, with the ability to interact effectively with employees, customers, and visitors. 4. Proficiency with Microsoft Office and other common office software, including email and calendar applications.
Posted 2 months ago
6.0 - 10.0 years
11 - 15 Lacs
Pune
Work from Office
We at Onix Datametica Solutions Private Limited are looking for a Bigdata Lead who has a passion for cloud, with knowledge of different on-premises and cloud data implementations in the field of Big Data and Analytics, including but not limited to Teradata, Netezza, Exadata, Oracle, Cloudera, Hortonworks, and the like. Ideal candidates should have technical experience in migrations and the ability to help customers get value from Datametica's tools and accelerators. Job Description: 6+ years of overall experience in developing, testing, and implementing Big Data projects using Hadoop, Spark, and Hive. Hands-on experience playing a lead role in Big Data projects, responsible for implementing one or more tracks within projects, identifying and assigning tasks within the team, and providing technical guidance to team members. Experience in setting up Hadoop services, implementing ETL/ELT pipelines, and working with terabytes of data ingestion and processing from varied systems. Experience working in an onshore/offshore model, leading technical discussions with customers, mentoring and guiding teams on technology, and preparing HLD and LLD documents. Required Skills and Abilities: Mandatory Skills: Spark, Scala/PySpark, Hadoop ecosystem including Hive, Sqoop, Impala, Oozie, Hue, Java, Python, SQL, Flume, bash (shell scripting). Experience implementing CI/CD pipelines and working experience with SCM tools such as Git, Bitbucket, etc. Hands-on experience in writing data ingestion and data processing pipelines using Spark and SQL; experience in implementing SCD Type 1 and 2, auditing, and exception handling mechanisms. Data warehousing project implementation with either a Scala or Hadoop programming background. Proficient with various development methodologies like waterfall, agile/scrum. Exceptional communication, organisation, and time management skills. Collaborative approach to decision-making. Strong analytical skills. Good To Have: Certifications in any of GCP, AWS, Azure, or Cloudera. Ability to work on multiple projects simultaneously, prioritising appropriately.
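Since the posting calls out SCD Type 1 and 2 implementation, here is a hedged, simplified sketch of an SCD Type 2 upsert using Delta Lake's MERGE as one possible implementation option (the posting itself is Hadoop/Hive oriented). The table, keys, and columns are hypothetical, and a production version would also de-duplicate unchanged rows before the append.

```python
from delta.tables import DeltaTable
from pyspark.sql import functions as F

# Assumes a Delta-enabled SparkSession named `spark` (e.g., a Databricks notebook).
dim = DeltaTable.forName(spark, "warehouse.dim_customer")
updates = spark.table("staging.customer_changes")

# Step 1: close out current rows whose tracked attributes changed.
(
    dim.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id AND t.is_current = true")
    .whenMatchedUpdate(
        condition="t.address <> s.address",
        set={"is_current": "false", "end_date": "current_date()"},
    )
    .execute()
)

# Step 2: append the incoming versions as the new current rows.
new_rows = (
    updates.withColumn("is_current", F.lit(True))
    .withColumn("start_date", F.current_date())
    .withColumn("end_date", F.lit(None).cast("date"))
)
new_rows.write.format("delta").mode("append").saveAsTable("warehouse.dim_customer")
```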
Posted 2 months ago
5.0 - 10.0 years
25 - 35 Lacs
Chennai, Bengaluru
Hybrid
5–12 years of experience in Big Data. Proficient in Apache Spark with hands-on experience. Proficient in Kafka and RabbitMQ messaging systems. Skilled in Hive and Impala for Big Data querying. Experience integrating data from RDBMS (SQL Server, Oracle) and ERP systems.
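A brief PySpark sketch of the RDBMS integration mentioned above, reading a SQL Server table over JDBC; the connection URL, credentials, table, and partition bounds are hypothetical placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdbms-ingest").getOrCreate()

# Read a table from SQL Server over JDBC (the JDBC driver jar must be on the classpath).
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://dbhost:1433;databaseName=sales")
    .option("dbtable", "dbo.orders")
    .option("user", "etl_user")
    .option("password", "***")
    .option("numPartitions", 8)            # parallelize the extract
    .option("partitionColumn", "order_id") # numeric column to split on
    .option("lowerBound", 1)
    .option("upperBound", 1000000)
    .load()
)

orders.write.mode("overwrite").parquet("/data/landing/orders")
```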
Posted 2 months ago
2.0 - 4.0 years
4 - 6 Lacs
Bengaluru
Work from Office
The Big Data (Scala, HIVE) role involves working with relevant technologies, ensuring smooth operations, and contributing to business objectives. Responsibilities include analysis, development, implementation, and troubleshooting within the Big Data (Scala, HIVE) domain.
Posted 2 months ago
2.0 - 4.0 years
4 - 6 Lacs
Chennai
Work from Office
The Big Data (Scala, HIVE) role involves working with relevant technologies, ensuring smooth operations, and contributing to business objectives. Responsibilities include analysis, development, implementation, and troubleshooting within the Big Data (Scala, HIVE) domain.
Posted 2 months ago
5.0 - 10.0 years
14 - 17 Lacs
Pune
Work from Office
As a Big Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in data engineering activities like creating pipelines/workflows from source to target and implementing solutions that tackle the client's needs. Your primary responsibilities include: Design, build, optimize, and support new and existing data models and ETL processes based on our client's business requirements. Build, deploy, and manage data infrastructure that can adequately handle the needs of a rapidly growing data-driven organization. Coordinate data access and security to enable data scientists and analysts to easily access data whenever they need it. Required education: Bachelor's Degree. Preferred education: Master's Degree. Required technical and professional expertise: Must have 5+ years of experience in Big Data: Hadoop, Spark, Scala, Python, HBase, Hive. Good to have: AWS (S3, Athena, DynamoDB, Lambda), Jenkins, Git. Developed Python and PySpark programs for data analysis. Good working experience with Python to develop a custom framework for generating rules (like a rules engine). Developed Python code to gather data from HBase and designed the solution to implement it using PySpark. Apache Spark DataFrames/RDDs were used to apply business transformations, and Hive context objects were utilized to perform read/write operations. Preferred technical and professional experience: Understanding of DevOps. Experience in building scalable end-to-end data ingestion and processing solutions. Experience with object-oriented and/or functional programming languages, such as Python, Java, and Scala.
Posted 2 months ago
3.0 - 5.0 years
12 - 13 Lacs
Thane, Navi Mumbai, Pune
Work from Office
We at Acxiom Technologies are hiring a PySpark Developer for our Mumbai location. Relevant Experience: 1 to 4 years. Location: Mumbai. Mode of Work: Work From Office. Notice Period: Up to 20 days. Job Description: Proven experience as a PySpark Developer. Hands-on expertise with AWS Redshift. Strong proficiency in PySpark, Spark, Python, and Hive. Solid experience with SQL. Excellent communication skills. Benefits of working at Acxiom: - Statutory Benefits - Paid Leaves - Phenomenal Career Growth - Exposure to the Banking Domain. About Acxiom Technologies: Acxiom Technologies is a leading software solutions services company that provides consulting services to global firms and has established itself as one of the most sought-after consulting organizations in the field of Data Management and Business Intelligence. Our website, https://www.acxtech.co.in/, gives a detailed overview of our company. Interested candidates can share their resumes on 7977418669. Thank you.
Posted 2 months ago
3.0 - 7.0 years
6 - 10 Lacs
Bengaluru
Work from Office
Overall Responsibilities: Data Pipeline Development: Design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark on the Cloudera Data Platform, ensuring data integrity and accuracy. Data Ingestion: Implement and manage data ingestion processes from a variety of sources (e.g., relational databases, APIs, file systems) to the data lake or data warehouse on CDP. Data Transformation and Processing: Use PySpark to process, cleanse, and transform large datasets into meaningful formats that support analytical needs and business requirements. Performance Optimization: Conduct performance tuning of PySpark code and Cloudera components, optimizing resource utilization and reducing runtime of ETL processes. Data Quality and Validation: Implement data quality checks, monitoring, and validation routines to ensure data accuracy and reliability throughout the pipeline. Automation and Orchestration: Automate data workflows using tools like Apache Oozie, Airflow, or similar orchestration tools within the Cloudera ecosystem. Monitoring and Maintenance: Monitor pipeline performance, troubleshoot issues, and perform routine maintenance on the Cloudera Data Platform and associated data processes. Collaboration: Work closely with other data engineers, analysts, product managers, and other stakeholders to understand data requirements and support various data-driven initiatives. Documentation: Maintain thorough documentation of data engineering processes, code, and pipeline configurations. Technical Skills: PySpark: Advanced proficiency in PySpark, including working with RDDs, DataFrames, and optimization techniques. Cloudera Data Platform: Strong experience with Cloudera Data Platform (CDP) components, including Cloudera Manager, Hive, Impala, HDFS, and HBase. Data Warehousing: Knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala). Big Data Technologies: Familiarity with Hadoop, Kafka, and other distributed computing tools. Orchestration and Scheduling: Experience with Apache Oozie, Airflow, or similar orchestration frameworks. Scripting and Automation: Strong scripting skills in Linux. Experience: 3+ years of experience as a Data Engineer, with a strong focus on PySpark and the Cloudera Data Platform. Proven track record of implementing data engineering best practices. Experience in data ingestion, transformation, and optimization on the Cloudera Data Platform. Day-to-Day Activities: Design, develop, and maintain ETL pipelines using PySpark on CDP. Implement and manage data ingestion processes from various sources. Process, cleanse, and transform large datasets using PySpark. Conduct performance tuning and optimization of ETL processes. Implement data quality checks and validation routines. Automate data workflows using orchestration tools. Monitor pipeline performance and troubleshoot issues. Collaborate with team members to understand data requirements.
Maintain documentation of data engineering processes and configurations. Qualifications: Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or a related field. Relevant certifications in PySpark and Cloudera technologies are a plus. Soft Skills: Strong analytical and problem-solving skills. Excellent verbal and written communication abilities. Ability to work independently and collaboratively in a team environment. Attention to detail and commitment to data quality.
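Since the role involves orchestrating PySpark jobs with Airflow or Oozie, a minimal Airflow 2.x DAG sketch that submits a Spark job via spark-submit; the DAG id, script path, and schedule are hypothetical.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# A minimal daily DAG that runs one PySpark ETL step via spark-submit.
# The `schedule` parameter is Airflow 2.4+; older versions use schedule_interval.
with DAG(
    dag_id="daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = BashOperator(
        task_id="ingest_orders",
        bash_command="spark-submit --master yarn /opt/jobs/ingest_orders.py",
    )
```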
Posted 2 months ago
5.0 - 10.0 years
5 - 15 Lacs
Chennai
Hybrid
Role & responsibilities: Big Data, Hadoop, Hive, SQL, Cloudera, Impala, Python, PySpark. Fundamentals of: Big Data, Cloudera Platform, Unix, Python. Expertise in: SQL/Hive, PySpark. Nice to have: Django/Flask frameworks.
Posted 2 months ago
1.0 - 4.0 years
1 - 5 Lacs
Mumbai
Work from Office
Location: Mumbai. Role Overview: As a Big Data Engineer, you'll design and build robust data pipelines on Cloudera using Spark (Scala/PySpark) for ingestion, transformation, and processing of high-volume data from banking systems. Key Responsibilities: Build scalable batch and real-time ETL pipelines using Spark and Hive. Integrate structured and unstructured data sources. Perform performance tuning and code optimization. Support orchestration and job scheduling (NiFi, Airflow). Required education: Bachelor's Degree. Preferred education: Master's Degree. Required technical and professional expertise: Skills Required: Proficiency in PySpark/Scala with Hive/Impala. Experience with data partitioning, bucketing, and optimization. Familiarity with Kafka, Iceberg, and NiFi is a must. Knowledge of banking or financial datasets is a plus.
Posted 2 months ago