
344 HDFS Jobs - Page 9

JobPe aggregates job listings for easy access, but you apply directly on the original job portal.

5.0 - 10.0 years

25 - 37 Lacs

Pune

Work from Office

Role Overview: Synechron is looking for an experienced Scala Spark Developer to join our advanced analytics and big data engineering team in Pune. The ideal candidate should have a strong background in data processing using Scala and Spark in distributed environments, with the ability to handle large-scale data pipelines for complex business needs.

Key Responsibilities:
- Design and develop scalable big data processing solutions using Scala and Apache Spark.
- Optimize data workflows and Spark jobs for performance and cost-efficiency.
- Work with large datasets across structured and unstructured sources.
- Collaborate with data engineers, analysts, and stakeholders to build end-to-end solutions.
- Maintain clean, modular, and well-documented code following best practices.

Preferred Qualifications:
- Experience with big data ecosystems (Hadoop, Hive, HDFS, etc.).
- Familiarity with cloud platforms (AWS, Azure, or GCP) for data engineering.
- Understanding of data lakes, streaming data, and real-time analytics is a plus.
- Strong debugging, performance tuning, and communication skills.

Educational Qualification: Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.

Posted 2 months ago

Apply

5.0 - 10.0 years

25 - 37 Lacs

Pune

Work from Office

Mandatory Skills: PySpark, Big Data Technologies

Role Overview: Synechron is hiring a skilled PySpark Developer for its advanced data engineering team in Pune. The ideal candidate will have strong experience in building scalable data pipelines and solutions using PySpark, with a solid understanding of Big Data ecosystems.

Key Responsibilities:
- Design, build, and maintain high-performance batch and streaming data pipelines using PySpark.
- Work with large-scale data processing frameworks and big data tools.
- Optimize and troubleshoot PySpark jobs for efficient performance.
- Collaborate with data scientists, analysts, and architects to translate business needs into technical solutions.
- Ensure best practices in code quality, version control, and documentation.

Preferred Qualifications:
- Hands-on experience with Big Data tools like Hive, HDFS, or HBase.
- Exposure to cloud-based data services (AWS, Azure, or GCP).
- Familiarity with workflow orchestration tools like Airflow or Oozie.
- Strong analytical, problem-solving, and communication skills.

Educational Qualification: Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
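For readers unfamiliar with the stack these two Synechron roles describe, the block below is a minimal, illustrative PySpark batch-pipeline sketch; it is not taken from the posting, and the input/output paths, table layout, and column names are hypothetical assumptions.

```python
# Minimal sketch of a PySpark batch pipeline of the kind described above.
# Paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_daily_batch").getOrCreate()

# Read raw data from a (hypothetical) HDFS landing zone.
orders = spark.read.parquet("hdfs:///data/landing/orders/")

# Basic cleansing and aggregation.
daily_revenue = (
    orders
    .filter(F.col("status") == "COMPLETED")
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("order_date", "region")
    .agg(F.sum("amount").alias("total_revenue"),
         F.countDistinct("customer_id").alias("unique_customers"))
)

# Write the curated output partitioned by date for downstream consumers.
(daily_revenue.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("hdfs:///data/curated/daily_revenue/"))

spark.stop()
```

A streaming variant of the same idea would swap `spark.read` for `spark.readStream` and add a checkpoint location on the write side.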

Posted 2 months ago

Apply

4.0 - 8.0 years

0 - 1 Lacs

Chennai

Work from Office

Role & responsibilities: Key Skills: Big Data frameworks such as Hadoop, Spark, PySpark, and Hive - Mandatory

Posted 2 months ago

Apply

5.0 - 10.0 years

15 - 30 Lacs

Bengaluru

Work from Office

Description: Work as a data analyst to triage and investigate data quality issues, data pipeline exceptions, and reporting issues.

Requirements: This role will primarily support Data Operations and Reporting projects, and will also help with other projects as needed. In this role, you will leverage your strong analytical skills to triage and investigate data quality and data pipeline exceptions and reporting issues. The ideal candidate should be able to work independently and actively engage other functional teams as needed. This role requires researching transactions and events using large amounts of data.

Technical Experience/Qualifications:
- At least 5 years of experience in software development
- At least 5 years of SQL experience in any RDBMS
- Minimum 5 years of experience in Python
- Strong analytical and problem-solving skills
- Strong communication skills
- Strong experience with data modeling
- Strong experience in data analysis and reporting
- Experience with version control tools such as GitHub
- Experience with shell scripting and Linux
- Knowledge of agile and scrum methodologies
- Preferred: experience in Hive SQL or related technologies such as BigQuery
- Preferred: experience in big data technologies like Hadoop, AWS/GCP, S3, Hive, Impala, HDFS, Spark, MapReduce
- Preferred: experience in reporting tools such as Looker or Tableau
- Preferred but not required: experience in finance and accounting

Job Responsibilities:
- Develop SQL queries as per technical requirements
- Investigate and fix day-to-day data-related issues
- Develop test plans and execute test scripts
- Perform data validation and analysis
- Develop new reports/dashboards as per technical requirements
- Modify existing reports/dashboards for bug fixes and enhancements
- Develop new ETL scripts and modify existing ones for bug fixes and enhancements
- Monitor ETL processes and fix issues in case of failure
- Monitor scheduled jobs and fix issues in case of failure
- Monitor data quality alerts and act on them

What We Offer:
- Exciting Projects: We focus on industries like high-tech, communication, media, healthcare, retail, and telecom. Our customer list is full of fantastic global brands and leaders who love what we build for them.
- Collaborative Environment: You can expand your skills by collaborating with a diverse team of highly talented people in an open, laid-back environment, or even abroad in one of our global centers or client facilities.
- Work-Life Balance: GlobalLogic prioritizes work-life balance, which is why we offer flexible work schedules, opportunities to work from home, and paid time off and holidays.
- Professional Development: Our dedicated Learning & Development team regularly organizes communication skills training (GL Vantage, Toastmasters), stress management programs, professional certifications, and technical and soft skill trainings.
- Excellent Benefits: We provide our employees with competitive salaries, family medical insurance, Group Term Life Insurance, Group Personal Accident Insurance, NPS (National Pension Scheme), periodic health awareness programs, extended maternity leave, annual performance bonuses, and referral bonuses.
- Fun Perks: We want you to love where you work, which is why we host sports events and cultural activities, offer food at subsidized rates, and throw corporate parties. Our vibrant offices also include dedicated GL Zones, rooftop decks, and a GL Club where you can have coffee or tea with your colleagues over a game, and we offer discounts for popular stores and restaurants!
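As an illustration of the data-quality triage work this role describes, here is a minimal Spark SQL sketch; the Hive database, table, and column names are hypothetical, and it assumes a Hive-enabled Spark session.

```python
# Minimal sketch of a data-quality triage query over a (hypothetical) Hive table.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("dq_triage")
         .enableHiveSupport()
         .getOrCreate())

# Count duplicate transaction IDs and null business keys in yesterday's load.
dq_report = spark.sql("""
    SELECT load_date,
           COUNT(*)                                            AS row_count,
           COUNT(*) - COUNT(DISTINCT transaction_id)           AS duplicate_ids,
           SUM(CASE WHEN account_id IS NULL THEN 1 ELSE 0 END) AS null_account_ids
    FROM finance.transactions
    WHERE load_date = date_sub(current_date(), 1)
    GROUP BY load_date
""")

dq_report.show(truncate=False)
```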

Posted 2 months ago

Apply

5.0 - 7.0 years

5 - 5 Lacs

Kochi, Hyderabad, Thiruvananthapuram

Work from Office

Key Responsibilities:
- Develop & Deliver: Build applications/features/components as per design specifications, ensuring high-quality code adhering to coding standards and project timelines.
- Testing & Debugging: Write, review, and execute unit test cases; debug code; validate results with users; and support defect analysis and mitigation.
- Technical Decision Making: Select optimal technical solutions, including reuse or creation of components, to enhance efficiency, cost-effectiveness, and quality.
- Documentation & Configuration: Create and review design documents, templates, checklists, and configuration management plans; ensure team compliance.
- Domain Expertise: Understand the customer's business domain deeply to advise developers and identify opportunities for value addition; obtain relevant certifications.
- Project & Release Management: Manage delivery of modules/user stories, estimate efforts, coordinate releases, and ensure adherence to engineering processes and timelines.
- Team Leadership: Set goals (FAST), provide feedback, mentor team members, maintain motivation, and manage people-related issues effectively.
- Customer Interaction: Clarify requirements, present design options, conduct demos, and build customer confidence through timely and quality deliverables.
- Technology Stack: Expertise in Big Data technologies (PySpark, Scala), plus preferred skills in AWS services (EMR, S3, Glue, Airflow, RDS, DynamoDB), CI/CD tools (Jenkins), relational and NoSQL databases, microservices, and containerization (Docker, Kubernetes).
- Soft Skills & Collaboration: Communicate clearly, work under pressure, handle dependencies and risks, collaborate with cross-functional teams, and proactively seek and offer help.

Required Skills: Big Data, PySpark, Scala

Additional Comments:
Must-Have Skills: Big Data (PySpark + Java/Scala)
Preferred Skills:
- AWS (EMR, S3, Glue, Airflow, RDS, DynamoDB, or similar)
- CI/CD (Jenkins or another tool)
- Experience with relational databases (any)
- Experience with NoSQL databases (any)
- Microservices, domain services, API gateways, or similar
- Containers (Docker, Kubernetes, or similar)

Posted 2 months ago

Apply

5.0 - 8.0 years

16 - 25 Lacs

Bengaluru

Work from Office

Job Summary: We are seeking a highly motivated Senior Data Engineer with expertise in designing, building, and securing data systems. The ideal candidate will have a strong background in data engineering, security compliance, and distributed systems, with a focus on ensuring adherence to industry standards and regulatory requirements.

Key Responsibilities:
- Design, implement, and maintain secure data systems, including wrapper solutions for components with minimal security controls, ensuring compliance with bank standards.
- Identify security design gaps in existing and proposed architectures and recommend enhancements to strengthen system resilience.
- Develop and enforce security controls for data transfers, including CRON jobs, ETLs, and JDBC/ODBC scripts.
- Ensure compliance with data sensitivity standards, such as avoiding storage of card numbers or PII in logs, and maintaining data integrity.
- Collaborate on distributed systems, focusing on resiliency, monitoring, and troubleshooting in production environments.
- Work with Agile/DevOps practices, CI/CD pipelines (GitHub, Jenkins), and scripting tools to optimize data workflows.
- Troubleshoot and resolve issues in large-scale data infrastructures, including SQL/NoSQL databases, HDFS, Hive, and HQL.

Requirements:
- 5+ years of total experience, with 4+ years in Informatica Big Data Management.
- Extensive knowledge of Oozie scheduling, HQL, Hive, HDFS, and data partitioning.
- Proficiency in SQL and NoSQL databases, along with Linux OS configuration and shell scripting.
- Strong understanding of networking concepts (DNS, Proxy, ACL, Policy) and data transfer security.
- In-depth knowledge of compliance and regulatory requirements (encryption, anonymization, policy controls).
- Familiarity with Agile/DevOps, CI/CD, and distributed systems monitoring.
- Ability to address data sensitivity concerns in logging, events, and in-memory storage.

Posted 2 months ago

Apply

16.0 - 20.0 years

16 - 20 Lacs

Bengaluru, Karnataka, India

On-site

Responsibilities:
- Lead technical projects end to end, from design to release, owning the complete delivery.
- Handle a large portfolio and account P&L, or bring experience in handling cross-market global rollouts; provide delivery oversight including operational parameters.
- Guide the team to design and build cloud solutions on Azure/AWS/GCP, choosing best-in-class services, tools, and methodologies.
- Translate business requirements into technology specifications, architecture design, data modeling, build, release management, DevOps, and stakeholder management, including tech/IT teams.
- Experience in one or more Agile and DevOps practices and tools such as Jenkins, JIRA, Confluence, Selenium, SonarQube, etc.
- Implement/execute projects and programs involving one or more of the following: programming languages and development platforms such as Java, .NET, Python, JavaScript, NodeJS, Go; data engineering tools like PySpark, Python, ETL processing; production support execution/managed services.
- Build the knowledge base required to deliver increasingly complex technology projects.
- Develop the software and systems needed for end-to-end execution on large projects.

Qualifications:
- 16 to 20 years of technology experience.
- Strong experience in data engineering, web app builds, system integration, application development, or data warehouse projects, across technologies used in the enterprise space.
- Proficient in data extraction from heterogeneous sources, performing EDA, data integrity and quality checks, applying data cleansing techniques, and transforming data for modeling.
- Expertise in Python, Azure, Hadoop (Hive, HBase, HDFS, MapReduce, others), and Spark is required.
- Ability to work with a global team of consulting professionals across multiple projects.
- Hands-on DevOps experience developing and delivering products in a collaborative, agile team environment.
- Excellent leadership, communication (written and oral), and interpersonal skills.
- Proven success in contributing to a team-oriented environment.
- Proven ability to work creatively and analytically in a problem-solving environment.

Posted 2 months ago

Apply

4.0 - 6.0 years

0 Lacs

Noida, Uttar Pradesh, India

On-site

Major skillset: GCP, PySpark, SQL, Python, Cloud Architecture, ETL, Automation
- 4+ years of experience in data engineering and data management, with a strong focus on Spark for building production-ready data pipelines.
- Experienced in analyzing large data sets from multiple data sources and building automated testing and validations.
- Knowledge of the Hadoop ecosystem and components like HDFS, Spark, Hive, Sqoop.
- Strong Python experience.
- Hands-on SQL and HQL skills to write optimized queries.
- Strong hands-on experience with GCP: BigQuery, Dataproc, Airflow DAGs, Dataflow, GCS, Pub/Sub, Secret Manager, Cloud Functions, Beam.
- Ability to work in a fast-paced, collaborative environment and work with various stakeholders to define strategic optimization initiatives.
- Deep understanding of distributed computing, memory tuning, and Spark optimization.
- Familiar with CI/CD workflows and Git.
- Experience in designing modular, automated, and secure ETL frameworks.

Posted 2 months ago

Apply

5.0 - 9.0 years

10 - 18 Lacs

Pune

Work from Office

- Experience in big data development and data engineering.
- Proficiency in Java and experience with Apache Spark.
- Experience in API development and integration.
- Strong understanding of data engineering principles and big data concepts.

Required Candidate profile:
- Familiarity with big data tools such as Hadoop, HDFS, Hive, HBase, and Kafka.
- Experience with SQL and NoSQL databases.
- Strong communication and collaboration skills.

Posted 2 months ago

Apply

8.0 - 13.0 years

25 - 40 Lacs

Hyderabad, Pune, Bengaluru

Work from Office

Roles and Responsibilities
Role: Big Data Tech Lead
Location: Bangalore, Hyderabad, Pune, and Chennai
Experience Range: 8 to 15 years
Mandatory skills: PySpark and Big Data
- Design, develop, and maintain large-scale data processing pipelines using Hadoop, PySpark, Hive, and HDFS.
- Ensure scalability, reliability, and performance of big data systems by monitoring logs, troubleshooting issues, and implementing improvements.
- Collaborate with cross-functional teams to gather requirements and deliver high-quality solutions on time.
- Develop expertise in multiple programming languages such as Java or Python to support various projects.
- Participate in code reviews to ensure adherence to coding standards and best practices.

Posted 2 months ago

Apply

5.0 - 8.0 years

12 - 20 Lacs

Hyderabad

Work from Office

Role & responsibilities: Data Engineer with Python, Spark, PySpark, and HDFS (9 positions).

Posted 2 months ago

Apply

7.0 - 10.0 years

6 - 7 Lacs

Navi Mumbai, SBI Belapur

Work from Office

ISA Non-captive, RTH-Y
Note:
1. This position requires the candidate to work from the client office starting from day one.
2. Ensure that you perform basic validation and gauge the candidate's interest level before uploading their profile to our system.
3. The candidate's band will be counted as per their relevant experience; profiles with less experience will not be considered for a higher band.
4. Full BGV of the candidate is required before onboarding.
5. If required, the candidate will be regularized after 6 months; hence a 6-month NOC is required from the date of joining.
Mode of Interview: Face to Face (Mandatory).

**JOB DESCRIPTION**
Total Years of Experience: 7-10 years
Relevant Years of Experience: 7-10 years
Mandatory Skills: Cloudera DBA

Detailed JD:
Key Responsibilities:
- Provision and manage Cloudera clusters (CDP Private Cloud Base)
- Monitor cluster health, performance, and resource utilization
- Implement security (Kerberos, Ranger, TLS), HA, and backup strategies
- Handle patching, upgrades, and incident response
- Collaborate with engineering and data teams to support workloads

Skills Required:
- Strong hands-on experience with Cloudera Manager, Ambari, HDFS, Hive, Impala, Spark
- Linux administration and scripting skills (Shell, Python)
- Experience with Kerberos, Ranger, and audit/compliance setups
- Exposure to Cloudera Support and ticketing processes

Posted 2 months ago

Apply

10.0 - 12.0 years

9 - 11 Lacs

Navi Mumbai, SBI Belapur

Work from Office

ISA NC. Full BGV of the candidate is required before onboarding, and an NOC must be provided after 90 days post date of joining.
Education qualification: B.Tech or BE.
The candidate should appear for the client interview at the Mumbai client office. RTH-Y
Note:
1. This position requires the candidate to work from the office starting from day one.
2. Ensure that you perform basic validation and gauge the candidate's interest level before uploading their profile to our system.
3. The candidate's band will be counted as per their relevant experience; profiles with less experience will not be considered for a higher band.
Mode of Interview: Face to Face (Mandatory).

Mandatory Skills:
- Strong hands-on experience with Cloudera Manager, Ambari, HDFS, Hive, Impala, Spark
- Linux administration and scripting skills (Shell, Python)
- Experience with Kerberos, Ranger, and audit/compliance setups
- Exposure to Cloudera Support and ticketing processes

Detailed JD:
(i) Provision and manage Cloudera clusters (CDP Private Cloud Base).
(ii) Monitor cluster health, performance, and resource utilization.
(iii) Implement security (Kerberos, Ranger, TLS), HA, and backup strategies.
(iv) Handle patching, upgrades, and incident response.
(v) Collaborate with engineering and data teams to support workloads.

Posted 2 months ago

Apply

6.0 - 10.0 years

30 - 35 Lacs

Bengaluru

Work from Office

We are seeking an experienced PySpark Developer / Data Engineer to design, develop, and optimize big data processing pipelines using Apache Spark and Python (PySpark). The ideal candidate should have expertise in distributed computing, ETL workflows, data lake architectures, and cloud-based big data solutions.

Key Responsibilities:
- Develop and optimize ETL/ELT data pipelines using PySpark on distributed computing platforms (Hadoop, Databricks, EMR, HDInsight).
- Work with structured and unstructured data to perform data transformation, cleansing, and aggregation.
- Implement data lake and data warehouse solutions on AWS (S3, Glue, Redshift), Azure (ADLS, Synapse), or GCP (BigQuery, Dataflow).
- Optimize PySpark jobs through performance tuning, partitioning, and caching strategies.
- Design and implement real-time and batch data processing solutions.
- Integrate data pipelines with Kafka, Delta Lake, Iceberg, or Hudi for streaming and incremental updates.
- Ensure data security, governance, and compliance with industry best practices.
- Work with data scientists and analysts to prepare and process large-scale datasets for machine learning models.
- Collaborate with DevOps teams to deploy, monitor, and scale PySpark jobs using CI/CD pipelines, Kubernetes, and containerization.
- Perform unit testing and validation to ensure data integrity and reliability.

Required Skills & Qualifications:
- 6+ years of experience in big data processing, ETL, and data engineering.
- Strong hands-on experience with PySpark (Apache Spark with Python).
- Expertise in SQL, the DataFrame API, and RDD transformations.
- Experience with big data platforms (Hadoop, Hive, HDFS, Spark SQL).
- Knowledge of cloud data processing services (AWS Glue, EMR, Databricks, Azure Synapse, GCP Dataflow).
- Proficiency in writing optimized queries, partitioning, and indexing for performance tuning.
- Experience with workflow orchestration tools like Airflow, Oozie, or Prefect.
- Familiarity with containerization and deployment using Docker, Kubernetes, and CI/CD pipelines.
- Strong understanding of data governance, security, and compliance (GDPR, HIPAA, CCPA, etc.).
- Excellent problem-solving, debugging, and performance optimization skills.
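The partitioning and caching strategies mentioned above can be illustrated with a short, hedged PySpark sketch; the bucket paths, column names, and configuration values below are hypothetical tuning choices, not prescriptions from the posting.

```python
# Minimal sketch of common PySpark tuning techniques: broadcast join,
# caching a reused intermediate, and partition-aligned writes.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("pyspark_tuning_sketch")
         .config("spark.sql.shuffle.partitions", "400")   # tune shuffle parallelism
         .config("spark.sql.adaptive.enabled", "true")    # adaptive query execution
         .getOrCreate())

events = spark.read.parquet("s3a://my-bucket/events/")       # hypothetical fact data
lookup = spark.read.parquet("s3a://my-bucket/dim_country/")  # small dimension table

# Broadcast the small dimension table to avoid a shuffle join.
enriched = events.join(F.broadcast(lookup), on="country_code", how="left")

# Cache a reused intermediate result instead of recomputing it twice.
enriched.cache()

daily = enriched.groupBy("event_date").count()
by_country = enriched.groupBy("country_code").count()

# Repartition by the write key so output files align with partition directories.
(daily.repartition("event_date")
      .write.mode("overwrite")
      .partitionBy("event_date")
      .parquet("s3a://my-bucket/curated/daily_counts/"))

by_country.write.mode("overwrite").parquet("s3a://my-bucket/curated/by_country/")
```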

Posted 2 months ago

Apply

7.0 - 10.0 years

10 - 15 Lacs

Bengaluru

Work from Office

Job Title: Senior Engineer | Java and Big Data
Company Name: Impetus Technologies

Job Description:
Impetus Technologies is seeking a skilled Senior Engineer with expertise in Java and Big Data technologies. As a Senior Engineer, you will be responsible for designing, developing, and deploying scalable data processing applications using Java and Big Data frameworks. Your role will involve collaborating with cross-functional teams to gather requirements, developing high-quality code, and optimizing data processing workflows. You will also mentor junior engineers and contribute to architectural decisions to enhance the performance and scalability of our systems.

Key Responsibilities:
- Design, develop, and maintain high-performance applications using Java and Big Data technologies.
- Implement data ingestion and processing workflows utilizing frameworks like Hadoop and Spark.
- Collaborate with the data architecture team to define data models and ensure efficient data storage and retrieval.
- Optimize existing applications for performance, scalability, and reliability.
- Mentor and guide junior engineers, providing technical leadership and fostering a culture of continuous improvement.
- Participate in code reviews and ensure best practices for coding, testing, and documentation are followed.
- Stay current with technology trends in Java and Big Data, and evaluate new tools and methodologies to enhance system capabilities.

Skills and Tools Required:
- Strong proficiency in the Java programming language, with experience in building complex applications.
- Hands-on experience with Big Data technologies such as Apache Hadoop, Apache Spark, and Apache Kafka.
- Understanding of distributed computing concepts and technologies.
- Experience with data processing frameworks and libraries, including MapReduce and Spark SQL.
- Familiarity with storage and database systems such as HDFS, NoSQL databases (like Cassandra or MongoDB), and SQL databases.
- Strong problem-solving skills and the ability to troubleshoot complex issues.
- Knowledge of version control systems like Git, and familiarity with CI/CD pipelines.
- Excellent communication and teamwork skills to collaborate effectively with peers and stakeholders.
- A bachelor's or master's degree in Computer Science, Engineering, or a related field is preferred.

Roles and Responsibilities

About the Role:
- You will be responsible for designing and developing scalable Java applications to handle Big Data processing.
- Your role will involve collaborating with cross-functional teams to implement innovative solutions that align with business objectives.
- You will also play a key role in ensuring code quality and performance through best practices and testing methodologies.

About the Team:
- You will work with a diverse team of skilled engineers, data scientists, and product managers who are passionate about technology and innovation.
- The team fosters a collaborative environment where knowledge sharing and continuous learning are encouraged.
- Regular brainstorming sessions and technical workshops will provide opportunities to enhance your skills and stay updated with industry trends.

You are Responsible for:
- Developing and maintaining high-performance Java applications that process large volumes of data efficiently.
- Implementing data integration and processing frameworks using Big Data technologies such as Hadoop and Spark.
- Troubleshooting and optimizing existing systems to improve performance and scalability.

To succeed in this role, you should have the following:
- Strong proficiency in Java and experience with Big Data technologies and frameworks.
- Solid understanding of data structures, algorithms, and software design principles.
- Excellent problem-solving skills and the ability to work independently as well as part of a team.
- Familiarity with cloud platforms and distributed computing concepts is a plus.

Posted 2 months ago

Apply

10.0 - 15.0 years

96 - 108 Lacs

Bengaluru

Work from Office

Responsibilities:
- Design data solutions using Java, Python, and Apache Spark.
- Collaborate with cross-functional teams on Azure cloud projects.
- Ensure data security through Redis caching and HDFS storage.

Posted 2 months ago

Apply

14.0 - 22.0 years

45 - 75 Lacs

Pune

Remote

Architecture design and total solution design: from requirements analysis through design and engineering for data ingestion, pipelines, data preparation and orchestration, and applying the right ML algorithms on the data stream for predictions.

Responsibilities:
- Define, design, and deliver ML architecture patterns operable in native and hybrid cloud architectures.
- Research, analyze, recommend, and select technical approaches to address challenging development and data integration problems related to ML model training and deployment in enterprise applications.
- Perform research activities to identify emerging technologies and trends that may affect Data Science/ML life-cycle management in the enterprise application portfolio.
- Implement the solution using AI orchestration.

Requirements:
- Hands-on programming and architecture capabilities in Python and Java.
- Minimum 6+ years of experience in enterprise application development (Java, .NET).
- Experience in implementing and deploying
- Experience in building data pipelines, data cleaning, feature engineering, and feature stores.
- Experience with data platforms like Databricks, Snowflake, and AWS/Azure/GCP cloud and data services.
- Machine learning solutions using various models, such as Linear/Logistic Regression, Support Vector Machines, (Deep) Neural Networks, Hidden Markov Models, Conditional Random Fields, Topic Modeling, Game Theory, Mechanism Design, etc.
- Strong hands-on experience with statistical packages and ML libraries (e.g. R, Python scikit-learn, Spark MLlib, etc.).
- Experience in effective data exploration and visualization (e.g. Excel, Power BI, Tableau, Qlik, etc.).
- Extensive background in statistical analysis and modeling (distributions, hypothesis testing, probability theory, etc.).
- Hands-on experience with RDBMS, NoSQL, and big data stores such as Elastic, Cassandra, HBase, Hive, and HDFS.
- Work experience in Solution Architect/Software Architect/Technical Lead roles.
- Experience with open-source software.
- Excellent problem-solving skills and ability to break down complexity.
- Ability to see multiple solutions to problems and choose the right one for the situation.
- Excellent written and oral communication skills.
- Demonstrated technical expertise in architecting solutions around AI, ML, deep learning, and related technologies.
- Experience developing AI/ML models in real-world environments and integrating AI/ML into large-scale enterprise applications using cloud-native or hybrid technologies.
- In-depth experience with AI/ML and data analytics services offered on Amazon Web Services and/or Microsoft Azure cloud solutions and their interdependencies.
- Specialization in at least one part of the AI/ML stack (frameworks and tools like MXNet and TensorFlow; ML platforms such as Amazon SageMaker for data scientists; API-driven AI services like Amazon Lex, Amazon Polly, Amazon Transcribe, Amazon Comprehend, and Amazon Rekognition to quickly add intelligence to applications with a simple API call).
- Demonstrated experience developing best practices and recommendations around tools/technologies for ML life-cycle capabilities such as data collection, data preparation, feature engineering, model management, MLOps, model deployment approaches, and model monitoring and tuning.

Back end: LLM APIs and hosting, both proprietary and open-source solutions, cloud providers, ML infrastructure
Orchestration: Workflow management such as LangChain, LlamaIndex, HuggingFace, Ollama
Data Management: LLM cache
Monitoring: LLM Ops tools
Tools & Techniques: prompt engineering, embedding models, vector DBs, validation frameworks, annotation tools, transfer learning, and others
Pipelines: Gen AI pipelines and implementation on cloud platforms (preference: Azure Databricks, Docker containers, Nginx, Jenkins)

Posted 2 months ago

Apply

7.0 - 10.0 years

8 - 12 Lacs

Bengaluru

Work from Office

Education Qualification: BE/B.Tech
Minimum Years of Experience: 7-10 years
Type of Employment: Permanent
Requirement: Immediate or max 15 days

Job Description: Big Data Developer (Hadoop/Spark/Kafka)
- This role is ideal for an experienced Big Data developer who is confident in taking complete ownership of the software development life cycle - from requirement gathering to final deployment.
- The candidate will be responsible for engaging with stakeholders to understand the use cases, translating them into functional and technical specifications (FSD & TSD), and implementing scalable, efficient big data solutions.
- A key part of this role involves working across multiple projects, coordinating with QA/support engineers for test case preparation, and ensuring deliverables meet high-quality standards.
- Strong analytical skills are necessary for writing and validating SQL queries, along with developing optimized code for data processing workflows.
- The ideal candidate should also be capable of writing unit tests and maintaining documentation to ensure code quality and maintainability.
- The role requires hands-on experience with the Hadoop ecosystem, particularly Spark (including Spark Streaming), Hive, Kafka, and shell scripting.
- Experience with workflow schedulers like Airflow is a plus, and working knowledge of cloud platforms (AWS, Azure, GCP) is beneficial.
- Familiarity with Agile methodologies will help in collaborating effectively in a fast-paced team environment.
- Job scheduling and automation via shell scripts, and the ability to optimize performance and resource usage in a distributed system, are critical.
- Prior experience in performance tuning and writing production-grade code will be valued.
- The candidate must demonstrate strong communication skills to effectively coordinate with business users, developers, and testers, and to manage dependencies across teams.

Key Skills Required:
Must Have:
- Hadoop, Spark (core & streaming), Hive, Kafka, Shell Scripting, SQL, TSD/FSD documentation.
Good to Have:
- Airflow, Scala, Cloud (AWS/Azure/GCP), Agile methodology.

This role is both technically challenging and rewarding, offering the opportunity to work on large-scale, real-time data processing systems in a dynamic, agile environment.
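To illustrate the Spark Streaming plus Kafka combination called out above, here is a minimal Structured Streaming sketch; the broker address, topic name, message schema, and HDFS paths are hypothetical, and running it assumes the spark-sql-kafka connector package is on the Spark classpath.

```python
# Minimal sketch of a Kafka-to-HDFS Structured Streaming job.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("kafka_stream_sketch").getOrCreate()

# Hypothetical JSON payload schema.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_ts", TimestampType()),
])

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical brokers
       .option("subscribe", "payments")                     # hypothetical topic
       .load())

# Kafka values arrive as bytes; parse the JSON payload into columns.
events = (raw.selectExpr("CAST(value AS STRING) AS json")
             .select(F.from_json("json", schema).alias("e"))
             .select("e.*"))

query = (events.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/streams/payments/")
         .option("checkpointLocation", "hdfs:///checkpoints/payments/")
         .outputMode("append")
         .start())

query.awaitTermination()
```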

Posted 2 months ago

Apply

6.0 - 8.0 years

25 - 30 Lacs

Bengaluru

Work from Office

- 6+ years of experience in information technology, with a minimum of 3-5 years of experience in managing and administering Hadoop/Cloudera environments.
- Cloudera CDP (Cloudera Data Platform), Cloudera Manager, and related tools.
- Hadoop ecosystem components (HDFS, YARN, Hive, HBase, Spark, Impala, etc.).
- Linux system administration, with experience in scripting languages (Python, Bash, etc.) and configuration management tools (Ansible, Puppet, etc.).
- Security and supporting tools (Kerberos, Ranger, Sentry), Docker, Kubernetes, Jenkins.
- Cloudera Certified Administrator for Apache Hadoop (CCAH) or similar certification.
- Cluster management, optimization, best-practice implementation, collaboration, and support.

Posted 2 months ago

Apply

3.0 - 7.0 years

17 - 25 Lacs

Bangalore Rural, Bengaluru

Work from Office

Job Description
We are looking for energetic, high-performing, and highly skilled Quality Assurance Engineers to help shape our technology and product roadmap. You will be part of the fast-paced, entrepreneurial Enterprise Personalization portfolio focused on delivering the next generation of global marketing capabilities. This team is responsible for global campaign tracking of new account acquisition and bounty payments, and leverages transformational technologies such as SQL, Hadoop, Spark, PySpark, HDFS, MapReduce, Hive, HBase, Kafka, and Java.

Focus: Provides domain expertise to engineers on Automation, Testing, and Quality Assurance (QA) methodologies and processes; crafts and executes test scripts; assists in preparation of test strategies; sets up and maintains test data and environments; and logs results.

Requirements:
- 4-6 years of hands-on software testing experience in developing test cases and test plans, with extensive knowledge of automated testing and architecture.
- Expert knowledge of testing frameworks and test automation design patterns like TDD, BDD, etc.
- Expertise in developing software test cases for Hive, Spark, and SQL written in PySpark SQL and Scala.
- Hands-on experience with performance and load testing tools such as JMeter, pytest, or similar.
- Experience with industry-standard tools for defect tracking, source code management, test case management, test automation, and other management and monitoring tools.
- Experience working with Agile methodology.
- Experience with a cloud platform (GCP).
- Experience in designing, developing, testing, debugging, and operating resilient distributed systems using Big Data clusters.
- Good sense for software quality, clean code principles, test-driven development, and an agile mindset.
- High engagement, self-organization, strong communication skills, and team spirit.
- Experience with building and adopting new test frameworks.
- Bonus skills: testing machine learning/data mining.

Roles & Responsibilities:
- Responsible for testing and quality assurance of large data processing pipelines using PySpark and SQL.
- Develops and tests software, including ongoing refactoring of code, and drives continuous improvement in code structure and quality.
- Functions as a platform SME who drives quality and automation strategy at the application level, identifies new opportunities, and drives Software Engineers to deliver the highest quality code.
- Delivers on capabilities for the portfolio automation strategy and executes against the test and automation strategy defined at the portfolio level.
- Works with engineers to drive improvements in code quality via manual and automated testing.
- Involved in the review of the user story backlog and requirements specifications for completeness and weaknesses in function, performance, reliability, scalability, testability, usability, security, and compliance testing; provides recommendations.
- Plans and defines the testing approach, providing advice on prioritization of testing activity in support of identified risks in project schedules or test scenarios.
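The test-automation expectations above (pytest plus PySpark SQL) can be illustrated with a small, hypothetical unit test; the transformation under test, its column names, and the bounty-eligibility rule are invented for the example.

```python
# Minimal sketch of a pytest-style unit test for a PySpark transformation.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def add_bounty_flag(df):
    """Hypothetical transformation: flag campaigns eligible for a bounty payment."""
    return df.withColumn("bounty_eligible", F.col("new_accounts") > 0)


@pytest.fixture(scope="session")
def spark():
    # Local single-threaded session so the test runs without a cluster.
    session = SparkSession.builder.master("local[1]").appName("qa_tests").getOrCreate()
    yield session
    session.stop()


def test_add_bounty_flag(spark):
    input_df = spark.createDataFrame(
        [("c1", 3), ("c2", 0)],
        ["campaign_id", "new_accounts"],
    )
    result = {row["campaign_id"]: row["bounty_eligible"]
              for row in add_bounty_flag(input_df).collect()}
    assert result == {"c1": True, "c2": False}
```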

Posted 2 months ago

Apply

8.0 - 11.0 years

45 - 50 Lacs

Noida, Kolkata, Chennai

Work from Office

Dear Candidate,

We are hiring a Julia Developer to build computational and scientific applications requiring speed and mathematical accuracy. Ideal for domains like finance, engineering, or AI research.

Key Responsibilities:
- Develop applications and models using the Julia programming language.
- Optimize for performance, parallelism, and numerical accuracy.
- Integrate with Python or C++ libraries where needed.
- Collaborate with data scientists and engineers on simulations and modeling.
- Maintain well-documented and reusable codebases.

Required Skills & Qualifications:
- Proficient in Julia, with knowledge of multiple dispatch and the type system
- Experience in numerical computing or scientific research
- Familiarity with Plots.jl, Flux.jl, or DataFrames.jl
- Understanding of Python, R, or MATLAB is a plus

Soft Skills:
- Strong troubleshooting and problem-solving skills.
- Ability to work independently and in a team.
- Excellent communication and documentation skills.

Note: If interested, please share your updated resume and preferred time for a discussion. If shortlisted, our HR team will contact you.

Kandi Srinivasa
Delivery Manager
Integra Technologies

Posted 2 months ago

Apply

2.0 - 5.0 years

18 - 21 Lacs

Hyderabad

Work from Office

Overview
Annalect is currently seeking a data engineer to join our technology team. In this role you will build Annalect products which sit atop cloud-based data infrastructure. We are looking for people who have a shared passion for technology, design and development, and data, and for fusing these disciplines together to build cool things. In this role, you will work on one or more software and data products in the Annalect Engineering Team. You will participate in technical architecture, design, and development of software products as well as research and evaluation of new technical solutions.

Responsibilities:
- Design, build, test, and deploy scalable and reusable systems that handle large amounts of data.
- Collaborate with product owners and data scientists to build new data products.
- Ensure data quality and reliability.

Qualifications:
- Experience designing and managing data flows.
- Experience designing systems and APIs to integrate data into applications.
- 4+ years of Linux, Bash, Python, and SQL experience.
- 2+ years using Spark and other frameworks to process large volumes of data.
- 2+ years using Parquet, ORC, or other columnar file formats.
- 2+ years using AWS cloud services, especially services used for data processing, e.g. Glue, Dataflow, Data Factory, EMR, Dataproc, HDInsight, Athena, Redshift, BigQuery, etc.
- Passion for technology: excitement for new technology, bleeding-edge applications, and a positive attitude towards solving real-world challenges.

Posted 2 months ago

Apply

7.0 - 12.0 years

12 - 22 Lacs

Pune, Chennai, Bengaluru

Work from Office

Dear Candidate,

Greetings of the day!

Location: Bangalore, Hyderabad, Pune, and Chennai
Experience: 3.5 to 13 years

Job Description
Key skills: Spark, PySpark, or Scala (any big data skill is fine); all skills are good to have.

Desired Competencies (Technical/Behavioral Competency)
Must-Have:
1. Minimum 3-12 years of experience in build and deployment of Big Data applications using SparkSQL and Spark Streaming in Python.
2. Minimum 2 years of extensive experience in design, build, and deployment of Python-based applications.
3. Design and develop ETL integration patterns using Python on Spark; develop a framework for converting existing PowerCenter mappings to PySpark (Python and Spark) jobs; expertise in graph algorithms and advanced recursion techniques; hands-on experience in generating/parsing XML and JSON documents and REST API requests/responses.

Good-to-Have:
- Hands-on experience writing complex SQL queries, and exporting and importing large amounts of data using utilities.

Posted 2 months ago

Apply

8.0 - 11.0 years

35 - 37 Lacs

Kolkata, Ahmedabad, Bengaluru

Work from Office

Dear Candidate,

We are hiring a Data Engineer to build and maintain data pipelines for our analytics platform. Perfect for engineers focused on data processing and scalability.

Key Responsibilities:
- Design and implement ETL processes
- Manage data warehouses and ensure data quality
- Collaborate with data scientists to provide necessary data
- Optimize data workflows for performance

Required Skills & Qualifications:
- Proficiency in SQL and Python
- Experience with data pipeline tools like Apache Airflow
- Familiarity with big data technologies (Spark, Hadoop)
- Bonus: Knowledge of cloud data services (AWS Redshift, Google BigQuery)

Soft Skills:
- Strong troubleshooting and problem-solving skills.
- Ability to work independently and in a team.
- Excellent communication and documentation skills.

Note: If interested, please share your updated resume and preferred time for a discussion. If shortlisted, our HR team will contact you.

Kandi Srinivasa
Delivery Manager
Integra Technologies
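As a small illustration of the Apache Airflow orchestration mentioned above, here is a minimal DAG sketch (assuming Airflow 2.4 or later); the DAG id, schedule, and the extract/transform/load callables are placeholders, not part of the posting.

```python
# Minimal sketch of an ETL pipeline orchestrated with Apache Airflow.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull raw data from the source system")       # placeholder step


def transform():
    print("clean and aggregate the extracted data")     # placeholder step


def load():
    print("write curated data to the warehouse")        # placeholder step


with DAG(
    dag_id="daily_sales_etl",          # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                 # requires Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Run the three steps in sequence.
    t_extract >> t_transform >> t_load
```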

Posted 2 months ago

Apply

5.0 - 9.0 years

0 - 0 Lacs

Mumbai, Pune, Bengaluru

Hybrid

Data Engineer
Experience: 5 to 10 years
Location: Pune (Yeravda), hybrid
Primary Skills: Scala coding, Spark SQL

**Key Responsibilities:**
- Design and implement high-performance data pipelines using Apache Spark and Scala.
- Optimize Spark jobs for efficiency and scalability.
- Collaborate with diverse data sources and teams to deliver valuable insights.
- Monitor and troubleshoot production pipelines to ensure smooth operations.
- Maintain thorough documentation for all systems and code.

**Required Skills & Qualifications:**
- Minimum of 3 years hands-on experience with Apache Spark and Scala.
- Strong grasp of distributed computing principles and Spark internals.
- Proficiency in working with big data technologies like HDFS, Hive, Kafka, and HBase.
- Ability to write optimized Spark jobs using Scala effectively.

Posted 2 months ago

Apply