
344 HDFS Jobs - Page 11

Set up a Job Alert
JobPe aggregates listings for easy access, but applications are submitted directly on the original job portal.

7.0 - 10.0 years

8 - 14 Lacs

Hyderabad

Hybrid

Responsibilities of the Candidate:
- Be responsible for the design and development of big data solutions. Partner with domain experts, product managers, analysts, and data scientists to develop Big Data pipelines in Hadoop.
- Be responsible for moving all legacy workloads to a cloud platform.
- Work with data scientists to build client pipelines using heterogeneous sources and provide engineering services for PySpark-based data science applications.
- Ensure automation through CI/CD across platforms, both in the cloud and on-premises.
- Define needs around maintainability, testability, performance, security, quality, and usability for the data platform.
- Drive implementation of consistent patterns, reusable components, and coding standards for data engineering processes.
- Convert SAS-based pipelines into languages such as PySpark and Scala to execute on Hadoop and non-Hadoop ecosystems (a minimal PySpark sketch follows this description).
- Tune Big Data applications on Hadoop and non-Hadoop platforms for optimal performance.
- Apply an in-depth understanding of how data analytics integrates within the sub-function, and coordinate and contribute to the objectives of the entire function.
- Produce detailed analysis of issues where the best course of action is not evident from the information available, but actions must be recommended or taken.
- Assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients, and assets, by driving compliance with applicable laws, rules, and regulations, adhering to policy, applying sound ethical judgment regarding personal behavior, conduct, and business practices, and escalating, managing, and reporting control issues with transparency.

Requirements:
- 6+ years of total IT experience.
- 3+ years of experience with Hadoop (Cloudera) / big data technologies.
- Knowledge of the Hadoop ecosystem and Big Data technologies; hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, Solr).
- Experience in designing and developing data pipelines for data ingestion or transformation using Java, Scala, or Python.
- Experience with Spark programming (PySpark, Scala, or Java).
- Hands-on experience with Python/PySpark/Scala and basic libraries for machine learning is required.
- Proficiency in programming in Java or Python; prior Apache Beam/Spark experience is a plus.
- Hands-on experience with CI/CD, scheduling, and scripting.
- Ensure automation through CI/CD across platforms, both in the cloud and on-premises.
- System-level understanding: data structures, algorithms, distributed storage and compute.
- Can-do attitude toward solving complex business problems, with good interpersonal and teamwork skills.
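
The SAS-to-PySpark conversion and Hive-based pipeline work described above can be illustrated with a minimal batch job. This is only a sketch, not part of the posting: the database, table, and column names (raw_db.transactions, account_id, txn_ts, amount) are placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("legacy-pipeline-migration")
         .enableHiveSupport()
         .getOrCreate())

# Read a raw Hive table (the kind of input a legacy SAS job might consume).
txns = spark.table("raw_db.transactions")

# Aggregate daily totals per account, a typical SAS DATA-step/PROC SUMMARY
# workload re-expressed with the DataFrame API.
daily = (txns
         .withColumn("txn_date", F.to_date("txn_ts"))
         .groupBy("account_id", "txn_date")
         .agg(F.sum("amount").alias("total_amount"),
              F.count("*").alias("txn_count")))

# Write the result back to a curated Hive table, partitioned by date.
(daily.write
      .mode("overwrite")
      .partitionBy("txn_date")
      .saveAsTable("curated_db.daily_account_totals"))
```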

Posted 3 months ago

Apply

12.0 - 15.0 years

35 - 50 Lacs

Hyderabad

Work from Office

Skill: Java, Spark, Kafka
Experience: 10 to 16 years
Location: Hyderabad

As a Data Engineer, you will:
- Support in designing and rolling out the data architecture and infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources.
- Identify data sources, design and implement data schemas/models, and integrate data that meets the requirements of the business stakeholders.
- Play an active role in the end-to-end delivery of AI solutions, from ideation and feasibility assessment to data preparation and industrialization.
- Work with business, IT, and data stakeholders to support data-related technical issues and data infrastructure needs, and to build the most flexible and scalable data platform.
- With a strong focus on DataOps, design, develop, and deploy scalable batch and/or real-time data pipelines (a minimal streaming sketch follows this description).
- Design, document, test, and deploy ETL/ELT processes.
- Find the right trade-offs between the performance, reliability, scalability, and cost of the data pipelines you implement.
- Monitor data processing efficiency and propose solutions for improvements.
- Have the discipline to create and maintain comprehensive project documentation.
- Build and share knowledge with colleagues and coach junior profiles.
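
A minimal Structured Streaming sketch of the kind of real-time pipeline mentioned above, assuming the spark-sql-kafka package is on the classpath; the broker address, topic name, event schema, and output paths are illustrative placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("realtime-ingest").getOrCreate()

# Schema of the JSON events on the topic (assumed for this sketch).
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("value", DoubleType()),
    StructField("event_ts", TimestampType()),
])

# Read a Kafka topic as a streaming DataFrame; broker and topic are placeholders.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")
       .option("subscribe", "events")
       .load())

# Kafka delivers bytes; parse the value column into structured fields.
events = (raw.selectExpr("CAST(value AS STRING) AS json")
             .select(F.from_json("json", event_schema).alias("e"))
             .select("e.*"))

# Write micro-batches to Parquet with checkpointing for recovery after failures.
query = (events.writeStream
         .format("parquet")
         .option("path", "/data/landing/events")
         .option("checkpointLocation", "/data/checkpoints/events")
         .outputMode("append")
         .start())

query.awaitTermination()
```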

Posted 3 months ago

Apply

8.0 - 13.0 years

25 - 40 Lacs

Bengaluru

Hybrid

Job Title / Primary Skill: Big Data Developer (Lead/Associate Manager)
Management Level: G150
Years of Experience: 8 to 13 years
Job Location: Bangalore (Hybrid)
Must-Have Skills: Big Data, Spark, Scala, SQL, Hadoop ecosystem
Educational Qualification: BE/BTech/MTech/MCA, or a Bachelor's or Master's degree in Computer Science

Job Overview: Overall experience of 8+ years in IT, software engineering, or a relevant discipline. Designs, develops, implements, and updates software systems in accordance with the needs of the organization. Evaluates, schedules, and resources development projects; investigates user needs; and documents, tests, and maintains computer programs.

Job Description: We are looking for developers with good Scala programming skills and knowledge of SQL.

Technical Skills:
- Scala, Python: Scala is often used for Hadoop-based projects, while both Python and Scala are common choices for Apache Spark-based projects.
- SQL: Knowledge of SQL (Structured Query Language) is important for querying and manipulating data.
- Shell script: Shell scripts are used for batch processing of data, for scheduling jobs, and often for deploying applications.
- Spark with Scala: Allows you to write Spark applications using the Spark API in Scala.
- Spark SQL: Allows you to work with structured data using SQL-like queries and the DataFrame API. You can execute SQL queries against DataFrames, enabling easy data exploration, transformation, and analysis (a short illustrative example follows this description).

The typical tasks and responsibilities of a Big Data Developer include:
1. Data Ingestion: Collecting and importing data from various sources, such as databases, logs, and APIs, into the Big Data infrastructure.
2. Data Processing: Designing data pipelines to clean, transform, and prepare raw data for analysis, often using technologies like Apache Hadoop and Apache Spark.
3. Data Storage: Selecting appropriate data storage technologies such as the Hadoop Distributed File System (HDFS), Hive, Impala, or cloud-based storage solutions (Snowflake, Databricks).
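
Since the posting describes Spark SQL as SQL-like queries plus the DataFrame API, here is a small self-contained sketch showing the same aggregation written both ways; the sample data and column names are invented for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark-sql-demo").getOrCreate()

# A small in-memory DataFrame standing in for data ingested from HDFS/Hive.
orders = spark.createDataFrame(
    [("o1", "electronics", 1200.0), ("o2", "grocery", 80.5), ("o3", "electronics", 650.0)],
    ["order_id", "category", "amount"],
)

# Register the DataFrame as a temporary view so it can be queried with SQL.
orders.createOrReplaceTempView("orders")

# The same aggregation expressed two ways: Spark SQL and the DataFrame API.
by_sql = spark.sql("""
    SELECT category, SUM(amount) AS total_amount, COUNT(*) AS order_count
    FROM orders
    GROUP BY category
""")

by_api = (orders.groupBy("category")
                .agg(F.sum("amount").alias("total_amount"),
                     F.count("*").alias("order_count")))

by_sql.show()
by_api.show()
```

Both calls produce the same result; the choice between SQL text and the DataFrame API is largely a matter of readability and team preference.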

Posted 3 months ago

Apply

4.0 - 9.0 years

6 - 16 Lacs

Coimbatore

Work from Office

Position Name: Data Engineer
Location: Coimbatore (Hybrid, 3 days per week)
Work Shift Timing: 1.30 pm to 10.30 pm (IST)
Mandatory Skills: Hadoop, Spark, Python, Databricks
Good to have: Java/Scala

The Role:
• Designing and building optimized data pipelines using cutting-edge technologies in a cloud environment to drive analytical insights.
• Constructing infrastructure for efficient ETL processes from various sources and storage systems.
• Leading the implementation of algorithms and prototypes to transform raw data into useful information.
• Architecting, designing, and maintaining database pipeline architectures, ensuring readiness for AI/ML transformations.
• Creating innovative data validation methods and data analysis tools.
• Ensuring compliance with data governance and security policies.
• Interpreting data trends and patterns to establish operational alerts.
• Developing analytical tools, programs, and reporting mechanisms.
• Conducting complex data analysis and presenting results effectively.
• Preparing data for prescriptive and predictive modeling.
• Continuously exploring opportunities to enhance data quality and reliability.
• Applying strong programming and problem-solving skills to develop scalable solutions.

Requirements:
• Experience with Big Data technologies (Hadoop, Spark, NiFi, Impala).
• Hands-on experience designing, building, deploying, testing, maintaining, monitoring, and owning scalable, resilient, and distributed data pipelines.
• High proficiency in Scala/Java and Spark for applied large-scale data processing.
• Expertise with big data technologies, including Spark, Data Lake, and Hive.
• Solid understanding of batch and streaming data processing techniques.
• Proficient knowledge of the Data Lifecycle Management process, including data collection, access, use, storage, transfer, and deletion.
• Expert-level ability to write complex, optimized SQL queries across extensive data volumes.
• Experience with HDFS, NiFi, and Kafka.
• Experience with Apache Ozone, Delta tables, Databricks, Axon (Kafka), Spring Batch, and Oracle DB.
• Familiarity with Agile methodologies.
• Obsession for service observability, instrumentation, monitoring, and alerting.
• Knowledge of or experience with architectural best practices for building data lakes.

Interested candidates can share their resume at Neesha1@damcogroup.com

Posted 3 months ago

Apply

4.0 - 8.0 years

15 - 30 Lacs

Noida, Hyderabad, India

Hybrid

Skills, roles, and responsibilities: Spark architecture, Spark tuning, Delta tables, medallion architecture, Databricks, Azure cloud services, Python OOP concepts, complex PySpark transformations, reading data from different file formats and sources and writing to Delta tables, data warehousing concepts, and how to process large files and handle pipeline failures in current projects (a short Delta-write sketch follows this description).
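
A minimal sketch of the medallion-style Delta pipeline described above (bronze landing, silver cleansing). It assumes a Spark session with the delta-spark extensions available, as on Databricks; all paths, column names, and configuration values are placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Local sessions need the Delta Lake extensions; on Databricks this is built in.
spark = (SparkSession.builder
         .appName("medallion-demo")
         .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
         .config("spark.sql.catalog.spark_catalog",
                 "org.apache.spark.sql.delta.catalog.DeltaCatalog")
         .getOrCreate())

# Bronze: land raw files as-is; path and schema are placeholders.
bronze = spark.read.json("/mnt/raw/events/")
bronze.write.format("delta").mode("append").save("/mnt/bronze/events")

# Silver: cleanse and conform the bronze data, then write another Delta table.
silver = (spark.read.format("delta").load("/mnt/bronze/events")
          .dropDuplicates(["event_id"])
          .filter(F.col("event_id").isNotNull())
          .withColumn("ingest_date", F.current_date()))
silver.write.format("delta").mode("overwrite").save("/mnt/silver/events")
```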

Posted 3 months ago

Apply

8.0 - 13.0 years

35 - 50 Lacs

Mumbai

Work from Office

Hiring a Big Data Lead with 8+ years of experience for US shift timings.
Must Have:
- Big Data: Spark, Hadoop, Kafka, Hive, Flink
- Backend: Python, Scala
- NoSQL: MongoDB, Cassandra
- Cloud: AWS/Azure/GCP, Snowflake, Databricks
- Docker, Kubernetes, CI/CD
Required Candidate Profile:
- Excellent in mentoring/training in Big Data: HDFS, YARN, Airflow, Hive, MapReduce, HBase, Kafka, plus ETL/ELT, real-time streaming, and data modeling
- Immediate joiner is a plus
- Excellent communication

Posted 3 months ago

Apply

5.0 - 9.0 years

2 - 3 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

HARMAN's engineers and designers are creative, purposeful, and agile. As part of this team, you'll combine your technical expertise with innovative ideas to help drive cutting-edge solutions in the car, enterprise, and connected ecosystem. Every day, you will push the boundaries of creative design, and HARMAN is committed to providing you with the opportunities, innovative technologies, and resources to build a successful career.

A Career at HARMAN: As a technology leader that is rapidly on the move, HARMAN is filled with people who are focused on making life better. Innovation, inclusivity, and teamwork are a part of our DNA. When you add that to the challenges we take on and solve together, you'll discover that at HARMAN you can grow, make a difference, and be proud of the work you do every day.

Job description:
- Lead and mentor a team of Python developers.
- Design, develop, and maintain highly scalable data processing applications.
- Write efficient, reusable, and well-documented code.
- Deliver big data projects using Spark, Scala, Python, SQL, HQL, and Hive.
- Leverage data pipelining applications to package work.
- Maintain and tune existing Hadoop applications.
- Work closely with QA, Operations, and various teams to deliver error-free software on time.
- Perform code reviews and provide constructive feedback.
- Actively participate in daily agile/scrum meetings.

Job skills:
- 7+ years of software development experience with Hadoop framework components (HDFS, Spark, PySpark, Sqoop, Hive, HQL, Scala).
- Experience in a leadership or supervisory role.
- 4+ years of experience using Python, SQL, and shell scripting.
- Experience in developing and tuning Spark applications.
- Excellent understanding of Spark architecture, DataFrames, and Spark tuning.
- Strong knowledge of database concepts, systems architecture, and data structures is a must.
- Process-oriented with strong analytical and problem-solving skills.
- Excellent written and verbal communication skills.
- Bachelor's degree in Computer Science or a related field.

HARMAN is proud to be an Equal Opportunity / Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.

Posted 3 months ago

Apply

6.0 - 11.0 years

19 - 27 Lacs

Haryana

Work from Office

About the Company / Job Description

Key responsibilities:
1. Understand, implement, and automate ETL pipelines to better industry standards.
2. Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, designing infrastructure for greater scalability, etc.
3. Develop, integrate, test, and maintain existing and new applications.
4. Design and create data pipelines (data lake / data warehouses) for real-world energy analytical solutions.
5. Expert-level proficiency in Python (preferred) for automating everyday tasks.
6. Strong understanding of and experience in distributed computing frameworks, particularly Spark, Spark SQL, Kafka, Spark Streaming, Hive, Azure Databricks, etc.
7. Limited experience with other leading cloud platforms, preferably Azure.
8. Hands-on experience with Azure Data Factory, Logic Apps, Analysis Services, Azure Blob Storage, etc.
9. Ability to work in a team in an agile setting, familiarity with JIRA, and a clear understanding of how Git works.
10. Must have 5-7 years of experience.

Posted 3 months ago

Apply

6.0 - 11.0 years

18 - 25 Lacs

Hyderabad

Work from Office

SUMMARY
Data Modeling Professional
Location: Hyderabad/Pune

Experience: The ideal candidate should possess at least 6 years of relevant experience in data modeling, with proficiency in SQL, Python, PySpark, Hive, ETL, Unix, and Control-M (or similar scheduling tools), along with GCP.

Key Responsibilities:
- Develop and configure data pipelines across various platforms and technologies.
- Write complex SQL queries for data analysis on databases such as SQL Server, Oracle, and Hive.
- Create solutions to support AI/ML models and generative AI.
- Work independently on specialized assignments within project deliverables.
- Provide solutions and tools to enhance engineering efficiencies.
- Design processes, systems, and operational models for end-to-end execution of data pipelines.

Preferred Skills: Experience with GCP, particularly Airflow, Dataproc, and BigQuery, is advantageous (a brief BigQuery example follows this description).

Requirements:
- Minimum 6 years of experience in data modeling with SQL, Python, PySpark, Hive, ETL, Unix, and Control-M (or similar scheduling tools).
- Proficiency in writing complex SQL queries for data analysis.
- Strong problem-solving and analytical abilities.
- Excellent communication and presentation skills.
- Ability to deliver high-quality materials against tight deadlines and to work effectively under pressure with rapidly changing priorities.

Note: The ability to communicate efficiently at a global level is paramount.
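
The preferred GCP/BigQuery skills could look like the following sketch using the google-cloud-bigquery Python client; the project, dataset, table, and column names are placeholders, and application-default credentials are assumed to be configured.

```python
from google.cloud import bigquery

# Hypothetical project, dataset, and table used only for illustration.
client = bigquery.Client(project="my-analytics-project")

sql = """
    SELECT customer_id, COUNT(*) AS order_count, SUM(order_value) AS total_value
    FROM `my-analytics-project.sales.orders`
    WHERE order_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    GROUP BY customer_id
    ORDER BY total_value DESC
    LIMIT 100
"""

# Run the query and iterate over the result rows.
for row in client.query(sql).result():
    print(row.customer_id, row.order_count, row.total_value)
```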

Posted 3 months ago

Apply

4.0 - 9.0 years

8 - 18 Lacs

Bengaluru

Hybrid

We have an immediate opening for a Big Data Developer with Encora Innovation Labs in Bangalore.
Experience: 4 to 8 years
Location: Bangalore (Hybrid)
Budget: Not a constraint for the right candidate

Job Description:
- Spark and Scala
- Hive, Hadoop
- Strong communication skills

If interested, please reply with your updated resume and a passport-size photo along with the details below.
Total Experience:
Relevant Experience:
CTC:
Expected CTC:
Notice Period (Immediate to 15 Days):
Current Location:
Preferred Location:
Any offers in hand:

Posted 3 months ago

Apply

5.0 - 10.0 years

20 - 35 Lacs

Chennai

Work from Office

5+ years of experience in ETL development with strong proficiency in Informatica BDM. Hands-on experience with big data platforms like Hadoop, Hive, HDFS, and Spark. Proficiency in SQL and working knowledge of Unix/Linux shell scripting. Experience in performance tuning of ETL jobs in a big data environment. Familiarity with data modeling concepts and working with large datasets. Strong problem-solving skills and attention to detail. Experience with job scheduling tools (e.g., Autosys, Control-M) is a plus.

Posted 3 months ago

Apply

3.0 - 8.0 years

3 - 8 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

The Regulatory Engineering team builds sophisticated applications and systems that help the firm file regulatory reports to various exchanges and regulators across the globe. The regulatory obligations cover all non-financial reporting across major businesses and asset classes in the firm. The applications and systems are required to process very high volumes of data in a short time so that the firm can meet tight regulatory SLAs.

How You Will Fulfill Your Potential:
- Hands-on technical developer implementing, supporting, and maintaining the regulatory reporting applications and systems.
- Engage in the entire software development lifecycle, including interacting with end users to elicit and convert requirements into technical solutions and to resolve support issues.
- Participate as part of a global team on large development projects within the regulatory reporting space.
- Design, evaluate, and recommend tools and technologies that the team should be using to help solve problems.
- Actively participate as a member of a global team on larger development projects and assume responsibility for components of global projects, depending on need.
- Support the system with business users and communicate ideas clearly and concisely to non-technical users of the system.

Basic Qualifications:
- Bachelor's degree in Computer Science, Computer Engineering, or a related field
- 3+ years of prior experience
- Experience with Java
- Experience with Apache Spark, Hadoop/HDFS; Sybase IQ preferred

Posted 3 months ago

Apply

4.0 - 9.0 years

7 - 12 Lacs

Bengaluru

Work from Office

Practice Overview:
Skill/Operating Group: Technology Consulting
Level: Consultant
Location: Gurgaon/Mumbai/Bangalore
Travel Percentage: Expected travel could be anywhere between 0-100%

Why Technology Consulting: The Technology Consulting business within Capability Network invents the future for clients by providing them with the right guidance, design thinking, and innovative solutions for technological transformation. As technology rapidly evolves, it's more important than ever to have an innovation advisor who can create a new vision or put one into place to solve the client's toughest business problems. Specialize in management or technology consulting to transform the world's leading organizations by building innovative business solutions as their trusted advisor by:
- Helping Clients: Rethinking IT and digital ecosystems and innovating to help clients keep pace with fast-changing, customer-driven environments.
- Enhancing your Skillset: Building expertise and innovating with leading-edge technologies such as Blockchain, Artificial Intelligence, and Cloud.
- Transforming Businesses: Developing customized, next-generation products and services that help clients shift to new business models designed for today's connected landscape of disruptive technologies.

Principal Duties and Responsibilities: Working closely with our clients, Consulting professionals design, build, and implement strategies and POCs that can help enhance business performance. They develop specialized expertise (strategic, industry, functional, technical) in a diverse project environment that offers multiple opportunities for career growth. The opportunities to make a difference within exciting client initiatives are limitless in this ever-changing business landscape. Here are just a few of your day-to-day responsibilities:
- Identifying, assessing, and solving complex business problems for your area of responsibility, where analysis of situations or data requires an in-depth evaluation of variable factors.
- Interact with client stakeholders to understand their AI problems and priority use-cases, define a problem statement, understand the scope of the engagement, and drive projects to deliver value to the client.
- Understand the client's business and IT goals and vision, and identify opportunities for reinvestment.
- Through your expertise and experience, guide your team-mates to suggest the right solutions to meet the needs of clients and help draw up practical implementation road maps that position them for long-term success.
- Benchmark against global research benchmarks and leading industry peers to understand the current state and recommend AI solutions.
- Conduct discovery workshops and design sessions to elicit AI opportunities and client pain areas.
- Experience in designing and developing enterprise-wide AI architecture and strategy.
- Contribute towards practice development initiatives like new offering development, people management, etc.

Qualifications:
- MBA degree from a Tier-1 college (preferable); Bachelor's degree.
- AI/ML/Cloud AI certifications preferred.
- Minimum of 4-9 years of large-scale consulting experience and managing teams in a consulting environment or at high-tech companies.

Experience: We are looking for Advanced Analytics and AI Consultants to join our growing team, with experience in machine learning and NLP applications, operating across all stages of the innovation spectrum with a view to building the future. Experience in designing and building end-to-end AI strategy and solutions using cloud and non-cloud AI, ML, and NLP algorithms.
- Assessment: Works as part of multi-disciplinary teams and is responsible for independently driving specific AI workstreams through collaboration with designers, other AI team members, platform and data engineers, business subject matter experts, and technology delivery teams to assess AI potential and develop use cases. Liaises effectively with the global community and senior leadership to identify and drive value opportunities around AI, and to support investments in POCs, pilots, and chargeable projects.
- Design: The Consultant's role on these projects centers around the application of analytics, data science, and advanced cognitive methods to derive insights from data and to develop POCs and appropriate high-level solution designs. Strong expertise in designing AutoML and MLOps AI solutions using any of the cloud services (AWS, GCP, Azure).
- Architecture: The nature of these projects changes as ideas and concepts mature, ranging from research, proofs-of-concept, and the art-of-the-possible to the delivery of real-world applications for our clients. The focus in this area is on the impact to clients' technology landscape/architecture and on ensuring formulation of relevant guiding principles and platform components. Expertise in designing enterprise AI strategy and Responsible AI frameworks is preferred. Strong experience handling multiple end-to-end AI lifecycle projects spanning data preparation, modeling, build, train, and deploy.
- Product/Framework/Tools evaluation: Collaborate with business experts for business understanding, with other consultants and platform engineers for solutions, and with technology teams for prototyping and client implementations. Evaluate existing products and frameworks and develop options for proposed solutions. Understand the frameworks/tools well enough to engage the client in meaningful discussions and steer towards the right recommendation.
- Modeling: Strong SME knowledge of data preparation, feature engineering, feature selection, training datasets, algorithm selection, optimization, and production deployment. Work as a technical SME advising teams and the client on the right AI solutions.

The Consultant should have practical industry expertise. The areas of Financial Services, Retail, Telecommunications, Life Sciences, and Resources are of interest, but experience in equivalent domains is also welcomed. Consultants should understand the key technology trends in their domain and the related business implications.

Key Competencies and Skills:
- Strong desire to work in technology-driven business transformation.
- Strong knowledge of technology trends across IT and digital and how they can be applied to companies to address real-world problems and opportunities.
- Exceptional interpersonal and presentation skills; ability to convey technology and business value propositions to senior stakeholders.
- Team-oriented and collaborative working style, both with clients and within the organization.
- Capacity to develop high-impact thought leadership that articulates a forward-thinking view of the market.
- Ability to develop and maintain strong internal and client relationships.
- Proven track record of working creatively and analytically in a problem-solving environment.
- Proven success in contributing to a team-oriented environment with effective consulting skills.
- Proven track record of quickly understanding the key value drivers of a business and how they impact the scope and approach of the engagement.
- Flexibility to accommodate client travel requirements.

Technical Skills:
- Good exposure to AI/NLP/ML algorithms and building models/chatbots/solutions.
- Expert in developing AI solutions using cloud AI services (AWS, GCP, Azure).
- Strong understanding of the entire ML project lifecycle, from business issue identification and data audit to model maintenance in production.
- Experience in conducting client workshops and AI use-case development and prioritization.
- Strong knowledge of AI frameworks and algorithms.
- Strong and relevant experience in the end-to-end AI lifecycle: data capture, data preparation, model planning, model selection, model build, train and test, and model deployment.
- Expert in designing AI pipelines, feature engineering, feature selection, labeling, training, and optimizing models.
- Good understanding of scaling/industrializing AI, Enterprise AI, and Lean AI concepts, frameworks, and tools.
- Familiarity with distributed storage frameworks and architectures.
- Machine learning languages: Python/Scala/R/SAS.
- Virtual agents: NLU, NLG, text-to-speech.
- ML libraries: scikit-learn, Stanford NLP, NLTK, NumPy, PyTorch, etc.
- Familiarity with cloud AI and data offerings.
- Databases: HDFS, NoSQL, in-memory (Spark), Neo4j.
- Experience with any of the platforms like IBM Watson, Microsoft ML, Google Cloud AI, or AWS ML.

Posted 3 months ago

Apply

6.0 - 11.0 years

4 - 9 Lacs

Gurugram

Work from Office

Job Description: We are looking for a skilled and detail-oriented Big Data QA Engineer with 4-8 years of experience in data testing and QA processes. The ideal candidate should have strong expertise in Big Data tools, SQL, and Linux, and be capable of both manual and automated testing.

Key Responsibilities:
- Perform end-to-end QA for Big Data applications, including data pipelines and batch processing systems.
- Validate data flow and processing across HDFS, Hive, and Kafka (a minimal validation sketch follows this description).
- Write and execute complex SQL queries for data validation.
- Utilize Linux commands for log analysis and environment verification.
- Ensure test coverage, accuracy, and maintainability of test cases and plans.
- Collaborate with development, DevOps, and data engineering teams.
- Contribute to test automation frameworks and scripting (if applicable).
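
One way the "validate data flow and processing" responsibility is often exercised is with simple reconciliation checks in PySpark. The following is only a sketch: the source path, target table, and key column are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("pipeline-validation")
         .enableHiveSupport()
         .getOrCreate())

# Placeholder source files and curated Hive table for a typical ingest validation.
source = spark.read.parquet("/data/landing/customers")
target = spark.table("curated_db.customers")

# 1. Row-count reconciliation between the landing files and the Hive table.
src_count, tgt_count = source.count(), target.count()
assert src_count == tgt_count, f"Row count mismatch: {src_count} vs {tgt_count}"

# 2. Mandatory key column must not contain nulls.
null_keys = target.filter(F.col("customer_id").isNull()).count()
assert null_keys == 0, f"{null_keys} rows have a null customer_id"

# 3. Primary key must be unique.
dupes = target.groupBy("customer_id").count().filter("count > 1").count()
assert dupes == 0, f"{dupes} duplicate customer_id values found"

print("All validation checks passed")
```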

Posted 3 months ago

Apply

12.0 - 20.0 years

35 - 40 Lacs

Navi Mumbai

Work from Office

Job Title: Big Data Developer and Project Support & Mentorship

Position Overview: We are seeking a skilled Big Data Developer to join our growing delivery team, with a dual focus on hands-on project support and mentoring junior engineers. This role is ideal for a developer who not only thrives in a technical, fast-paced environment but is also passionate about coaching and developing the next generation of talent. You will work on live client projects, provide technical support, contribute to solution delivery, and serve as a go-to technical mentor for less experienced team members.

Key Responsibilities:
- Perform hands-on Big Data development work, including coding, testing, troubleshooting, and deploying solutions.
- Support ongoing client projects, addressing technical challenges and ensuring smooth delivery.
- Collaborate with junior engineers to guide them on coding standards, best practices, debugging, and project execution.
- Review code and provide feedback to junior engineers to maintain high-quality and scalable solutions.
- Assist in designing and implementing solutions using Hadoop, Spark, Hive, HDFS, and Kafka.
- Lead by example in object-oriented development, particularly using Scala and Java.
- Translate complex requirements into clear, actionable technical tasks for the team.
- Contribute to the development of ETL processes for integrating data from various sources.
- Document technical approaches, best practices, and workflows for knowledge sharing within the team.

Required Skills and Qualifications:
- 8+ years of professional experience in Big Data development and engineering.
- Strong hands-on expertise with Hadoop, Hive, HDFS, Apache Spark, and Kafka.
- Solid object-oriented development experience with Scala and Java.
- Strong SQL skills with experience working with large data sets.
- Practical experience designing, installing, configuring, and supporting Big Data clusters.
- Deep understanding of ETL processes and data integration strategies.
- Proven experience mentoring or supporting junior engineers in a team setting.
- Strong problem-solving, troubleshooting, and analytical skills.
- Excellent communication and interpersonal skills.

Preferred Qualifications:
- Professional certifications in Big Data technologies (Cloudera, Databricks, AWS Big Data Specialty, etc.).
- Experience with cloud Big Data platforms (AWS EMR, Azure HDInsight, or GCP Dataproc).
- Exposure to Agile or DevOps practices in Big Data project environments.

What We Offer:
- Opportunity to work on challenging, high-impact Big Data projects.
- Leadership role in shaping and mentoring the next generation of engineers.
- Supportive and collaborative team culture.
- Flexible working environment.
- Competitive compensation and professional growth opportunities.

Posted 3 months ago

Apply

7.0 - 12.0 years

9 - 15 Lacs

Bengaluru

Work from Office

We are looking for Lead or Principal Software Engineers to join our Data Cloud team. Our Data Cloud team is responsible for the Zeta Identity Graph platform, which captures billions of behavioural, demographic, environmental, and transactional signals for people-based marketing. As part of this team, the data engineer will be designing and growing our existing data infrastructure to democratize data access, enable complex data analyses, and automate optimization workflows for business and marketing operations.

Job Description / Essential Responsibilities: As a Lead or Principal Data Engineer, your responsibilities will include:
- Building, refining, tuning, and maintaining our real-time and batch data infrastructure.
- Daily use of technologies such as HDFS, Spark, Snowflake, Hive, HBase, Scylla, Django, FastAPI, etc.
- Maintaining data quality and accuracy across production data systems.
- Working with Data Engineers to optimize data models and workflows.
- Working with Data Analysts to develop ETL processes for analysis and reporting.
- Working with Product Managers to design and build data products.
- Working with our DevOps team to scale and optimize our data infrastructure.
- Participating in architecture discussions, influencing the road map, and taking ownership of and responsibility for new projects.
- Participating in a 24/7 on-call rotation (be available by phone or email in case something goes wrong).

Desired Characteristics:
- Minimum 7 years of software engineering experience.
- Proven long-term experience and enthusiasm for distributed data processing at scale, and eagerness to learn new things.
- Expertise in designing and architecting distributed, low-latency, scalable solutions in either cloud or on-premises environments.
- Exposure to the whole software development lifecycle from inception to production and monitoring.
- Fluency in Python, or solid experience in Scala or Java.
- Proficiency with relational databases and advanced SQL.
- Expert in the use of services like Spark, HDFS, Hive, and HBase.
- Experience with a scheduler such as Apache Airflow, Apache Luigi, Chronos, etc. (a minimal Airflow sketch follows this description).
- Experience using cloud services (AWS) at scale.
- Experience in agile software development processes.
- Excellent interpersonal and communication skills.

Nice to have:
- Experience with large-scale / multi-tenant distributed systems.
- Experience with columnar / NoSQL databases: Vertica, Snowflake, HBase, Scylla, Couchbase.
- Experience with real-time streaming frameworks: Flink, Storm.
- Experience with web frameworks such as Flask and Django.
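
As a rough illustration of the scheduler experience mentioned above, here is a minimal Airflow 2.x DAG; the DAG id, spark-submit commands, and paths are placeholders rather than anything from the posting.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

# A minimal daily DAG; commands and paths are hypothetical.
default_args = {
    "owner": "data-platform",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="identity_graph_daily_batch",
    default_args=default_args,
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 2 * * *",   # run at 02:00 every day
    catchup=False,
) as dag:

    extract = BashOperator(
        task_id="extract_signals",
        bash_command="spark-submit /opt/jobs/extract_signals.py --date {{ ds }}",
    )

    build_graph = BashOperator(
        task_id="build_identity_graph",
        bash_command="spark-submit /opt/jobs/build_graph.py --date {{ ds }}",
    )

    # Run extraction first, then the graph build, passing the execution date via templating.
    extract >> build_graph
```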

Posted 3 months ago

Apply

4.0 - 9.0 years

7 - 17 Lacs

Hyderabad

Hybrid

Mega Walk-in Drive for Lead Analyst / Senior Software Engineer - Data Engineer
Shift Timings: General shift (5 days WFO for the initial 8 weeks)

Your future duties and responsibilities / Job Overview: CGI is looking for a talented and motivated Data Engineer with strong expertise in Python, Apache Spark, HDFS, and MongoDB to build and manage scalable, efficient, and reliable data pipelines and infrastructure. You'll play a key role in transforming raw data into actionable insights, working closely with data scientists, analysts, and business teams.

Key Responsibilities:
- Design, develop, and maintain scalable data pipelines using Python and Spark.
- Ingest, process, and transform large datasets from various sources into usable formats.
- Manage and optimize data storage using HDFS and MongoDB (a short MongoDB indexing/query sketch follows this description).
- Ensure high availability and performance of data infrastructure.
- Implement data quality checks, validations, and monitoring processes.
- Collaborate with cross-functional teams to understand data needs and deliver solutions.
- Write reusable and maintainable code with strong documentation practices.
- Optimize performance of data workflows and troubleshoot bottlenecks.
- Maintain data governance, privacy, and security best practices.

Required qualifications to be successful in this role:
- Minimum 6 years of experience as a Data Engineer or in a similar role.
- Strong proficiency in Python for data manipulation and pipeline development.
- Hands-on experience with Apache Spark for large-scale data processing.
- Experience with HDFS and distributed data storage systems.
- Proficiency in working with MongoDB, including data modeling, indexing, and querying.
- Strong understanding of data architecture, data modeling, and performance tuning.
- Familiarity with version control tools like Git.
- Experience with workflow orchestration tools (e.g., Airflow, Luigi) is a plus.
- Knowledge of cloud services (AWS, GCP, or Azure) is preferred.
- Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.

Preferred Skills:
- Experience with containerization (Docker, Kubernetes).
- Knowledge of real-time data streaming tools like Kafka.
- Familiarity with data visualization tools (e.g., Power BI, Tableau).
- Exposure to Agile/Scrum methodologies.

Skills: English, Oracle, Python, Java
Notice Period: 0-45 days
Prerequisites: copy of Aadhaar card, copy of PAN card, UAN
Disclaimer: The selected candidates will initially be required to work from the office for 8 weeks before transitioning to a hybrid model with 2 days of work from the office each week.
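
A small sketch of the MongoDB indexing and querying skills called out above, using pymongo; the connection string, database, collection, and field names are invented for illustration.

```python
from pymongo import MongoClient, ASCENDING

# Connection string, database, and collection names are placeholders.
client = MongoClient("mongodb://localhost:27017")
db = client["pipeline_db"]
events = db["events"]

# Create a compound index to support the query pattern below.
events.create_index([("account_id", ASCENDING), ("event_ts", ASCENDING)])

# Insert a processed record (e.g. the output of a Spark job) and query it back.
events.insert_one({"account_id": "A-100", "event_ts": "2024-06-01T10:00:00Z", "amount": 42.5})

recent = (events.find(
              {"account_id": "A-100"},
              {"_id": 0, "event_ts": 1, "amount": 1})
          .sort("event_ts", -1)
          .limit(10))

for doc in recent:
    print(doc)
```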

Posted 3 months ago

Apply

6.0 - 11.0 years

18 - 25 Lacs

Hyderabad

Work from Office

SUMMARY
Data Modeling Professional
Location: Hyderabad/Pune

Experience: The ideal candidate should possess at least 6 years of relevant experience in data modeling, with proficiency in SQL, Python, PySpark, Hive, ETL, Unix, and Control-M (or similar scheduling tools).

Key Responsibilities:
- Develop and configure data pipelines across various platforms and technologies.
- Write complex SQL queries for data analysis on databases such as SQL Server, Oracle, and Hive.
- Create solutions to support AI/ML models and generative AI.
- Work independently on specialized assignments within project deliverables.
- Provide solutions and tools to enhance engineering efficiencies.
- Design processes, systems, and operational models for end-to-end execution of data pipelines.

Preferred Skills: Experience with GCP, particularly Airflow, Dataproc, and BigQuery, is advantageous.

Requirements:
- Minimum 6 years of experience in data modeling with SQL, Python, PySpark, Hive, ETL, Unix, and Control-M (or similar scheduling tools).
- Proficiency in writing complex SQL queries for data analysis.
- Strong problem-solving and analytical abilities.
- Excellent communication and presentation skills.
- Ability to deliver high-quality materials against tight deadlines and to work effectively under pressure with rapidly changing priorities.

Note: The ability to communicate efficiently at a global level is paramount.

Posted 3 months ago

Apply

4.0 - 8.0 years

11 - 12 Lacs

Gurugram

Work from Office

Big Data Tester Requirements:
• Experience: 4-8 years.
• Good knowledge and hands-on experience of Big Data (HDFS, Hive, Kafka) testing (must).
• Good knowledge and hands-on experience of SQL (must).
• Good knowledge and hands-on experience of Linux (must).
• Well versed with QA methodologies.
• Manual + automation testing will work.
• Knowledge of DBT, AWS, or automation testing will be a plus.

Posted 3 months ago

Apply

5.0 - 8.0 years

15 - 25 Lacs

Pune

Work from Office

Design, develop & maintain scalable data pipelines & systems using PySpark and other big data tools. Monitor, troubleshoot & resolve issues in data workflows & pipelines. Implement best practices for data processing, security & storage. Required Candidate profile: Strong programming skills in PySpark & Python. Experience with big data frameworks like Hadoop, Spark, or Kafka. Proficiency in working with cloud platforms. Experience with data modeling & working with databases.

Posted 3 months ago

Apply

5.0 - 9.0 years

8 - 12 Lacs

Chennai

Work from Office

Core technical skills in Big Data (HDFS, Hive, Spark, HDP/CDP, ETL pipelines, SQL, Ranger, Python), Cloud (AWS or Azure, preferably both) services (S3/ADLS, Delta Lake, KeyVault, HashiCorp, Splunk), DevOps, preferably Data Quality & Governance knowledge, and preferably hands-on experience in tools such as Dataiku/Dremio or similar, or knowledge of any such tools. Should be able to lead the project and report status in a timely manner. Should ensure smooth release management.

Strategy: Responsibilities include development, testing, and support required for the project.
Business: IT-Projects-CPBB Data Technology
Processes: As per SCB governance
People & Talent: Applicable to SCB guidelines
Risk Management: Applicable to SCB standards

Key Responsibilities: Regulatory & Business Conduct
- Display exemplary conduct and live by the Group's Values and Code of Conduct.
- Take personal responsibility for embedding the highest standards of ethics, including regulatory and business conduct, across Standard Chartered Bank. This includes understanding and ensuring compliance with, in letter and spirit, all applicable laws, regulations, guidelines, and the Group Code of Conduct.
- Lead the team to achieve the outcomes set out in the Bank's Conduct Principles: Fair Outcomes for Clients; Effective Financial Markets; Financial Crime Compliance; The Right Environment.
- Effectively and collaboratively identify, escalate, mitigate, and resolve risk, conduct, and compliance matters.

Key stakeholders: Athena Program
Other Responsibilities: Analysis, development, testing, and support; leading the team; release management.
Skills and Experience: Hadoop, SQL, HDFS, Hive, Python, ETL process, ADO & Confluence, data warehouse concepts, delivery process knowledge.
Qualifications: Hadoop, HDFS, HBase, Spark, Scala, ADO & Confluence, ETL process, SQL (expert), Dremio (entry), Dataiku (entry).

About Standard Chartered: We're an international bank, nimble enough to act, big enough for impact. For more than 170 years, we've worked to make a positive difference for our clients, communities, and each other. We question the status quo, love a challenge and enjoy finding new opportunities to grow and do better than before. If you're looking for a career with purpose and you want to work for a bank making a difference, we want to hear from you. You can count on us to celebrate your unique talents and we can't wait to see the talents you can bring us. Our purpose, to drive commerce and prosperity through our unique diversity, together with our brand promise, to be here for good, are achieved by how we each live our valued behaviours. When you work with us, you'll see how we value difference and advocate inclusion. Together we:
- Do the right thing and are assertive, challenge one another, and live with integrity, while putting the client at the heart of what we do.
- Never settle, continuously striving to improve and innovate, keeping things simple and learning from doing well, and not so well.
- Are better together: we can be ourselves, be inclusive, see more good in others, and work collectively to build for the long term.

What we offer: In line with our Fair Pay Charter, we offer a competitive salary and benefits to support your mental, physical, financial and social wellbeing.
- Core bank funding for retirement savings, medical and life insurance, with flexible and voluntary benefits available in some locations.
- Time-off including annual leave, parental/maternity (20 weeks), sabbatical (12 months maximum) and volunteering leave (3 days), along with minimum global standards for annual and public holiday, which is combined to 30 days minimum.
- Flexible working options based around home and office locations, with flexible working patterns.
- Proactive wellbeing support through Unmind, a market-leading digital wellbeing platform, development courses for resilience and other human skills, a global Employee Assistance Programme, sick leave, mental health first-aiders and all sorts of self-help toolkits.
- A continuous learning culture to support your growth, with opportunities to reskill and upskill and access to physical, virtual and digital learning.
- Being part of an inclusive and values-driven organisation, one that embraces and celebrates our unique diversity, across our teams, business functions and geographies, where everyone feels respected and can realise their full potential.

www.sc.com/careers 29430

Posted 3 months ago

Apply

6.0 - 11.0 years

8 - 13 Lacs

Bengaluru

Work from Office

About the Role: This role is responsible for managing and maintaining complex, distributed big data ecosystems. It ensures the reliability, scalability, and security of large-scale production infrastructure. Key responsibilities include automating processes, optimizing workflows, troubleshooting production issues, and driving system improvements across multiple business verticals. Roles and Responsibilities: Manage, maintain, and support incremental changes to Linux/Unix environments. Lead on-call rotations and incident responses, conducting root cause analysis and driving postmortem processes. Design and implement automation systems for managing big data infrastructure, including provisioning, scaling, upgrades, and patching clusters. Troubleshoot and resolve complex production issues while identifying root causes and implementing mitigating strategies. Design and review scalable and reliable system architectures. Collaborate with teams to optimize overall system performance. Enforce security standards across systems and infrastructure. Set technical direction, drive standardization, and operate independently. Ensure availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolve, analyze, and respond to system outages and disruptions and implement measures to prevent similar incidents from recurring. Develop tools and scripts to automate operational processes, reducing manual workload, increasing efficiency and improving system resilience. Monitor and optimize system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaborate with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle. Stay informed of industry technology trends and innovations, and actively contribute to the organization's technology communities. Develop and enforce SRE best practices and principles. Align across functional teams on priorities and deliverables. Drive automation to enhance operational efficiency. Skills Required: Over 6 years of experience managing and maintaining distributed big data ecosystems. Strong expertise in Linux including IP, Iptables, and IPsec. Proficiency in scripting/programming with languages like Perl, Golang, or Python. Hands-on experience with the Hadoop stack (HDFS, HBase, Airflow, YARN, Ranger, Kafka, Pinot). Familiarity with open-source configuration management and deployment tools such as Puppet, Salt, Chef, or Ansible. Solid understanding of networking, open-source technologies, and related tools. Excellent communication and collaboration skills. DevOps tools: Saltstack, Ansible, docker, Git. SRE Logging and monitoring tools: ELK stack, Grafana, Prometheus, opentsdb, Open Telemetry. Good to Have: Experience managing infrastructure on public cloud platforms (AWS, Azure, GCP). Experience in designing and reviewing system architectures for scalability and reliability. Experience with observability tools to visualize and alert on system performance.

Posted 3 months ago

Apply

5.0 - 7.0 years

15 - 25 Lacs

Chennai

Work from Office

Job Summary: We are seeking a skilled Big Data Tester & Developer to design, develop, and validate data pipelines and applications on large-scale data platforms. You will work on data ingestion, transformation, and testing workflows using tools from the Hadoop ecosystem and modern data engineering stacks.
Experience: 6-12 years
Key Responsibilities:
• Develop and test Big Data pipelines using Spark, Hive, Hadoop, and Kafka.
• Write and optimize PySpark/Scala code for data processing.
• Design test cases for data validation, quality, and integrity.
• Automate testing using Python/Java and tools like Apache NiFi, Airflow, or DBT.
• Collaborate with data engineers, analysts, and QA teams.
Key Skills:
• Strong hands-on experience in Big Data tools: Spark, Hive, HDFS, Kafka.
• Proficient in PySpark, Scala, or Java.
• Experience in data testing, ETL validation, and data quality checks.
• Familiarity with SQL, NoSQL, and data lakes.
• Knowledge of CI/CD, Git, and automation frameworks.

We are also looking for a skilled PostgreSQL Developer/DBA to design, implement, optimize, and maintain our PostgreSQL database systems. You will work closely with developers and data teams to ensure high performance, scalability, and data integrity (a short indexing/EXPLAIN sketch follows this description).
Experience: 6 to 12 years
Key Responsibilities:
• Develop complex SQL queries, stored procedures, and functions.
• Optimize query performance and database indexing.
• Manage backups, replication, and security.
• Monitor and tune database performance.
• Support schema design and data migrations.
Key Skills:
• Strong hands-on experience with PostgreSQL.
• Proficient in SQL and PL/pgSQL scripting.
• Experience in performance tuning, query optimization, and indexing.
• Familiarity with logical replication, partitioning, and extensions.
• Exposure to tools like pgAdmin, psql, or PgBouncer.
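
For the PostgreSQL indexing and query-optimization duties, a minimal Python/psycopg2 sketch is shown below; the connection parameters, table, and columns are placeholders, and EXPLAIN ANALYZE is used only to confirm the index is picked up.

```python
import psycopg2

# Connection parameters are placeholders for a local development database.
conn = psycopg2.connect(host="localhost", dbname="appdb", user="app", password="secret")
conn.autocommit = True

with conn.cursor() as cur:
    # Add an index that supports the frequent lookup pattern below.
    cur.execute(
        "CREATE INDEX IF NOT EXISTS idx_orders_customer "
        "ON orders (customer_id, created_at)"
    )

    # Inspect the plan to confirm the index is actually used.
    cur.execute(
        "EXPLAIN ANALYZE "
        "SELECT * FROM orders WHERE customer_id = %s "
        "ORDER BY created_at DESC LIMIT 20",
        (42,),
    )
    for line in cur.fetchall():
        print(line[0])

conn.close()
```

Note that EXPLAIN ANALYZE actually executes the query, so on production systems a plain EXPLAIN is often the safer first step.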

Posted 3 months ago

Apply

6.0 - 11.0 years

5 - 15 Lacs

Hyderabad

Hybrid

Dear Candidates,
We are conducting a face-to-face drive on 7th June 2025. If you are interested in the F2F drive, kindly share your updated resume as soon as possible. Here are the JD details:

Role: Data Engineer with Python, Apache Spark, HDFS
Experience: 6 to 12 years
Location: Hyderabad
Shift Timings: General shift

Job Overview / Key Responsibilities:
• Design, develop, and maintain scalable data pipelines using Python and Spark.
• Ingest, process, and transform large datasets from various sources into usable formats.
• Manage and optimize data storage using HDFS and MongoDB.
• Ensure high availability and performance of data infrastructure.
• Implement data quality checks, validations, and monitoring processes.
• Collaborate with cross-functional teams to understand data needs and deliver solutions.
• Write reusable and maintainable code with strong documentation practices.
• Optimize performance of data workflows and troubleshoot bottlenecks.
• Maintain data governance, privacy, and security best practices.

Required qualifications to be successful in this role:
• Minimum 6 years of experience as a Data Engineer or in a similar role.
• Strong proficiency in Python for data manipulation and pipeline development.
• Hands-on experience with Apache Spark for large-scale data processing.
• Experience with HDFS and distributed data storage systems.
• Strong understanding of data architecture, data modeling, and performance tuning.
• Familiarity with version control tools like Git.
• Experience with workflow orchestration tools (e.g., Airflow, Luigi) is a plus.
• Knowledge of cloud services (AWS, GCP, or Azure) is preferred.
• Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.

Preferred Skills:
• Experience with containerization (Docker, Kubernetes).
• Knowledge of real-time data streaming tools like Kafka.
• Familiarity with data visualization tools (e.g., Power BI, Tableau).
• Exposure to Agile/Scrum methodologies.

Note: If interested, please share your updated resume with jamshira@srinav.net along with the details below as soon as possible.
Details Needed: Full Name, Mail ID, Contact Number, Current Experience, Relevant Experience, CTC, Expected CTC/month, Current Location, Relocation (Yes/No), Official Notice Period, LWD, Holding offer in hand, Tentative DOJ, PAN ID, DOB (DD/MM/YYYY), LinkedIn profile link.

Posted 3 months ago

Apply

0.0 years

0 Lacs

Pune, Maharashtra, India

On-site

Introduction: In this role, you'll work in one of our IBM Consulting Client Innovation Centers (Delivery Centers), where we deliver deep technical and industry expertise to a wide range of public and private sector clients around the world. Our delivery centers offer our clients locally based skills and technical expertise to drive innovation and adoption of new technology.

Your role and responsibilities: As a Data Engineer at IBM, you'll play a vital role in the development and design of applications, and provide regular support and guidance to project teams on complex coding, issue resolution, and execution. Your primary responsibilities include:
- Lead the design and construction of new solutions using the latest technologies, always looking to add business value and meet user requirements.
- Strive for continuous improvement by testing the built solution and working within an agile framework.
- Discover and implement the latest technology trends to maximize and build creative solutions.

Required education: Bachelor's degree
Preferred education: Master's degree

Required technical and professional expertise:
- Experience with Apache Spark (PySpark): in-depth knowledge of Spark architecture, core APIs, and PySpark for distributed data processing.
- Big Data technologies: familiarity with Hadoop, HDFS, Kafka, and other big data tools.
- Data engineering skills: strong understanding of ETL pipelines, data modeling, and data warehousing concepts.
- Strong proficiency in Python: expertise in Python programming with a focus on data processing and manipulation.
- Data processing frameworks: knowledge of data processing libraries such as Pandas and NumPy (a short PySpark-to-pandas sketch follows this description).
- SQL proficiency: experience writing optimized SQL queries for large-scale data analysis and transformation.
- Cloud platforms: experience working with cloud platforms like AWS, Azure, or GCP, including using cloud storage systems.

Preferred technical and professional experience:
- Define, drive, and implement an architecture strategy and standards for end-to-end monitoring.
- Partner with the rest of the technology teams, including application development, enterprise architecture, testing services, and network engineering.
- Good to have: detection and prevention tools for Company products and Platform, and customer-facing.
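
A brief sketch of combining PySpark with pandas as mentioned in the expertise list: the heavy aggregation stays in Spark and only the small result is converted with toPandas(). The bucket path and column names are placeholders, and reading from s3a assumes the appropriate Hadoop AWS libraries are configured.

```python
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pyspark-pandas-demo").getOrCreate()

# File path and columns are placeholders for a typical cloud-storage dataset.
df = spark.read.parquet("s3a://analytics-bucket/clickstream/")

# Heavy lifting stays distributed in Spark...
summary = (df.groupBy("country")
             .agg(F.countDistinct("user_id").alias("users"),
                  F.avg("session_seconds").alias("avg_session")))

# ...and only the small aggregated result is pulled into pandas for local analysis.
pdf: pd.DataFrame = summary.toPandas()
print(pdf.sort_values("users", ascending=False).head(10))
```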

Posted 3 months ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.
