
344 HDFS Jobs - Page 14

Set up a Job Alert
JobPe aggregates job listings for easy access; you apply directly on the original job portal.

4 - 8 years

12 - 22 Lacs

Hyderabad, Chennai, Bengaluru

Hybrid

Warm greetings from SP Staffing!! Role: Big Data Developer. Experience required: 4 to 8 yrs. Work location: Bangalore/Chennai/Pune/Delhi/Hyderabad/Kochi. Required skills: Spark and Scala. Interested candidates can send resumes to nandhini.s@spstaffing.in.

Posted 4 months ago

Apply

5 - 7 years

0 - 0 Lacs

Thiruvananthapuram

Work from Office

Job: Data Engineer. Experience: 5+ years. Mandatory skills: Python, PySpark, Linux shell scripting. Location: Trivandrum.

Required skills & experience:
- Experience with large-scale distributed data processing systems (a short sketch follows this posting).
- Expertise in data modelling, testing, quality, access, and storage.
- Proficiency in Python and SQL, and experience with Databricks and DBT.
- Experience implementing cloud data technologies (GCP, Azure, or AWS).
- Knowledge of improving the data development lifecycle and shortening lead times.
- Agile delivery experience.

Required skills: Python, PySpark, Linux shell scripting.
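For illustration, here is a minimal PySpark sketch of the kind of batch job this posting describes: read raw data, apply a simple data-quality filter, and write partitioned Parquet. The paths and column names (order_id, amount, order_date) are hypothetical, not taken from the posting.

```python
# Minimal batch job sketch: read raw CSV, apply basic data-quality rules,
# write curated, partitioned Parquet. All paths and columns are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_quality_check").getOrCreate()

# Assumed raw input with order_id, amount, and order_date columns.
raw = spark.read.option("header", True).csv("/data/raw/orders/")

# Quality rules: drop rows missing the key, cast and reject negative amounts.
clean = (
    raw.filter(F.col("order_id").isNotNull())
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("amount") >= 0)
)

clean.write.mode("overwrite").partitionBy("order_date").parquet("/data/curated/orders/")
spark.stop()
```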

Posted 4 months ago

Apply

6.0 - 10.0 years

15 - 25 Lacs

Bengaluru

Work from Office

We're Nagarro, a Digital Product Engineering company that is scaling in a big way! We build products, services, and experiences that inspire, excite, and delight. We work at scale across all devices and digital mediums, and our people exist everywhere in the world (18,000+ experts across 38 countries, to be exact). Our work culture is dynamic and non-hierarchical. We are looking for great new colleagues. That is where you come in!

REQUIREMENTS:
- Total experience of 6+ years.
- Excellent knowledge of and experience in Big Data engineering.
- Deep understanding of big data technologies, including Spark, Scala, AWS/Azure/GCP (any one), Hadoop, Hive, Maven, and SQL Server.
- Must have experience in Hadoop, Hive, and Spark with Scala, with good experience in performance tuning and debugging issues.
- Good to have: stream processing with Spark/Scala and Kafka.
- Must have experience in the design and development of Big Data projects.
- Good knowledge of functional programming and OOP concepts, SOLID principles, and design patterns for developing scalable applications.
- Familiarity with build tools like Maven.
- Must have experience with any RDBMS and at least one NoSQL database, preferably PostgreSQL.
- Problem-solving mindset with the ability to tackle complex data engineering challenges.
- Strong communication and teamwork skills, with the ability to mentor and collaborate effectively.
- Experience with creating technical documentation and solution designs.

RESPONSIBILITIES:
- Writing and reviewing great quality code.
- Understanding the client's business use cases and technical requirements, and converting them into a technical design that elegantly meets the requirements.
- Mapping decisions with requirements and translating them for developers.
- Identifying different solutions and narrowing down the best option that meets the client's requirements.
- Defining guidelines and benchmarks for NFR considerations during project implementation.
- Writing and reviewing design documents explaining the overall architecture, framework, and high-level design of the application for the developers.
- Reviewing architecture and design on aspects such as extensibility, scalability, security, design patterns, user experience, and NFRs, and ensuring that all relevant best practices are followed.
- Developing and designing the overall solution for defined functional and non-functional requirements, and defining the technologies, patterns, and frameworks to materialize it.
- Understanding and relating technology integration scenarios and applying these learnings in projects.
- Resolving issues raised during code review through exhaustive, systematic analysis of the root cause, and being able to justify the decisions taken.
- Carrying out POCs to make sure that the suggested design/technologies meet the requirements.

Posted Date not available

Apply

4.0 - 7.0 years

9 - 13 Lacs

Gurugram, Chennai, Bengaluru

Work from Office

Skills: Big Data, PySpark, Hive, Spark optimization. Good to have: GCP.

Posted Date not available

Apply

3.0 - 8.0 years

0 - 1 Lacs

Bengaluru

Work from Office

We're looking for a Python Developer with 3+ years of experience in AI/ML development and distributed systems. The ideal candidate is skilled in Python, understands core machine learning algorithms and AI advancements, and has some experience or knowledge of working with distributed computing and storage frameworks. Knowledge of backend architecture is a strong plus.

Preferred candidate profile:
- Strong Python programming skills
- Understands databases and object storage
- Experience with ML algorithms and model development
- Familiarity with distributed systems (e.g., Spark, Dask, Ray)
- Exposure to large-scale data storage (e.g., S3, HDFS)
- Good grasp of software engineering best practices

Posted Date not available

Apply

6.0 - 10.0 years

14 - 24 Lacs

Pune, Chennai

Work from Office

Mandatory: Experience and knowledge in designing, implementing, and managing non-relational data stores (e.g., MongoDB, Cassandra, DynamoDB), focusing on flexible schema design, scalability, and performance optimization for handling large volumes of unstructured or semi-structured data. The client mainly needs a NoSQL DB, either MongoDB or HBase.

Responsibilities:
- Data pipeline development: Design, develop, test, and deploy robust, high-performance, and scalable ETL/ELT data pipelines using Scala and Apache Spark to ingest, process, and transform large volumes of structured and unstructured data from diverse sources.
- Big data expertise: Leverage expertise in the Hadoop ecosystem (HDFS, Hive, etc.) and distributed computing principles to build efficient and fault-tolerant data solutions.
- Advanced SQL: Write complex, optimized SQL queries and stored procedures.
- Performance optimization: Continuously monitor, analyze, and optimize the performance of data pipelines and data stores. Troubleshoot complex data-related issues, identify bottlenecks, and implement solutions for improved efficiency and reliability.
- Data quality & governance: Implement data quality checks, validation rules, and reconciliation processes to ensure the accuracy, completeness, and consistency of data. Contribute to data governance and security best practices.
- Automation & CI/CD: Implement automation for data pipeline deployment, monitoring, and alerting using tools like Apache Airflow, Jenkins, or similar CI/CD platforms.
- Documentation: Create and maintain comprehensive technical documentation for data architectures, pipelines, and processes.

Required skills & qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field.
- Minimum 5 years of professional experience in Data Engineering, with a strong focus on big data technologies.
- Proficiency in Scala for developing big data applications and transformations, especially with Apache Spark.
- Expert-level proficiency in SQL: ability to write complex queries, optimize performance, and understand database internals.
- Extensive hands-on experience with Apache Spark (Spark SQL, DataFrames, RDDs) for large-scale data processing and analytics.
- Mandatory: Experience designing, implementing, and managing non-relational data stores (e.g., MongoDB, Cassandra, DynamoDB), with a focus on flexible schema design, scalability, and performance optimization for large volumes of unstructured or semi-structured data.
- Solid understanding of distributed computing concepts and experience with the Hadoop ecosystem (HDFS, Hive).
- Experience with building and optimizing ETL/ELT processes and data warehousing concepts.
- Strong understanding of data modeling techniques (e.g., Star Schema, Snowflake Schema).
- Familiarity with version control systems (e.g., Git).
- Excellent problem-solving, analytical, and communication skills.
- Ability to work independently and collaboratively in an Agile team environment.

Posted Date not available

Apply

5.0 - 8.0 years

20 - 30 Lacs

Bengaluru

Work from Office

Cloud Data Engineer. Req number: R5934. Employment type: Full time. Worksite flexibility: Remote.

Who we are: CAI is a global technology services firm with over 8,500 associates worldwide and yearly revenue of $1 billion+. We have over 40 years of excellence in uniting talent and technology to power the possible for our clients, colleagues, and communities. As a privately held company, we have the freedom and focus to do what is right, whatever it takes. Our tailor-made solutions create lasting results across the public and commercial sectors, and we are trailblazers in bringing neurodiversity to the enterprise.

Job summary: We are seeking a motivated Cloud Data Engineer who has experience in building data products using Databricks and related technologies. This is a full-time, remote position.

What you'll do:
- Analyze and understand existing data warehouse implementations to support migration and consolidation efforts.
- Reverse-engineer legacy stored procedures (PL/SQL, SQL) and translate business logic into scalable Spark SQL code within Databricks notebooks (a short sketch follows this posting).
- Design and develop data lake solutions on AWS using S3 and Delta Lake architecture, leveraging Databricks for processing and transformation.
- Build and maintain robust data pipelines using ETL tools, with ingestion into S3 and processing in Databricks.
- Collaborate with data architects to implement ingestion and transformation frameworks aligned with enterprise standards.
- Evaluate and optimize data models (Star, Snowflake, Flattened) for performance and scalability in the new platform.
- Document ETL processes, data flows, and transformation logic to ensure transparency and maintainability.
- Perform foundational data administration tasks including job scheduling, error troubleshooting, performance tuning, and backup coordination.
- Work closely with cross-functional teams to ensure a smooth transition and integration of data sources into the unified platform.
- Participate in Agile ceremonies and contribute to sprint planning, retrospectives, and backlog grooming.
- Triage, debug, and fix technical issues related to data lakes.
- Maintain and manage code repositories like Git.

What you'll need:
- 5+ years of experience working with Databricks, including Spark SQL and Delta Lake implementations.
- 3+ years of experience in designing and implementing data lake architectures on Databricks.
- Strong SQL and PL/SQL skills with the ability to interpret and refactor legacy stored procedures.
- Hands-on experience with data modeling and warehouse design principles.
- Proficiency in at least one programming language (Python, Scala, Java).
- Bachelor's degree in Computer Science, Information Technology, Data Engineering, or a related field.
- Experience working in Agile environments, Agile methodology in general, and contributing to iterative development cycles.
- Databricks cloud certification is a big plus.
- Exposure to enterprise data governance and metadata management practices.

Physical demands: This role involves mostly sedentary work, with occasional movement around the office to attend meetings, and the ability to perform repetitive tasks on a computer using a mouse, keyboard, and monitor.

Reasonable accommodation statement: If you require a reasonable accommodation in completing this application, interviewing, completing any pre-employment testing, or otherwise participating in the employment selection process, please direct your inquiries to application.accommodations@cai.io or (888) 824-8111.
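As an illustration of the migration work described above, here is a hedged sketch of rewriting a legacy stored-procedure aggregation as Spark SQL in a Databricks notebook and persisting it to a Delta table on S3. The table name, columns, and S3 path are assumptions, not details from the posting; Delta support is built into Databricks, but bucket access would still need to be configured.

```python
# Sketch: express a legacy PL/SQL-style aggregation as Spark SQL and write
# the result to a Delta table on S3. Names and paths are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("legacy_proc_migration").getOrCreate()

# Equivalent of a legacy stored-procedure aggregation, as Spark SQL.
daily_revenue = spark.sql("""
    SELECT order_date,
           region,
           SUM(amount) AS total_revenue,
           COUNT(*)    AS order_count
    FROM   sales.orders              -- assumed source table in the metastore
    WHERE  order_status = 'COMPLETE'
    GROUP  BY order_date, region
""")

# Persist as a partitioned Delta table on S3 (Delta ships with Databricks).
(daily_revenue.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .save("s3://example-datalake/curated/daily_revenue/"))  # hypothetical bucket
```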

Posted Date not available

Apply

3.0 - 6.0 years

25 - 27 Lacs

Mumbai

Work from Office

Overview: Annalect is currently seeking a data engineer to join our technology team. In this role you will build Annalect products which sit atop cloud-based data infrastructure. We are looking for people who have a shared passion for technology, design & development, data, and fusing these disciplines together to build cool things. You will work on one or more software and data products in the Annalect Engineering Team, and participate in technical architecture, design, and development of software products as well as research and evaluation of new technical solutions.

Responsibilities:
- Steward data and compute environments to facilitate usage of data assets.
- Design, build, test, and deploy scalable and reusable systems that handle large amounts of data.
- Manage a small team of developers.
- Perform code reviews and provide leadership and guidance to junior developers.
- Learn and teach new technologies.

Qualifications:
- Experience designing and managing data flows.
- Experience designing systems and APIs to integrate data into applications.
- 8+ years of Linux, Bash, Python, and SQL experience.
- 4+ years using Spark and other Hadoop ecosystem software.
- 4+ years using AWS cloud services, especially EMR, Glue, Athena, and Redshift.
- 4+ years managing a team of developers.
- Passion for technology: excitement for new technology, bleeding-edge applications, and a positive attitude towards solving real-world challenges.
- Experience with modern data stack technologies such as dbt and Airbyte is desired.
- Experience building scalable data pipelines (Airflow).

Posted Date not available

Apply

5.0 - 10.0 years

6 - 16 Lacs

Pune

Work from Office

Experience: 5-12 years.
- Must have: hands-on end-to-end ETL testing experience and a strong command of process workflows (a short sketch follows this posting).
- Must have: ability to write complex SQL.
- Must have: hands-on UNIX or shell scripting experience for ETL automation work.
- Must have: Big Data - HDFS, Hive.
- Good to have: NiFi pipelines.
- Good to have: Python, Scala, or Impala.
Position available for the Pune location only.
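As a rough illustration of the kind of ETL test automation this posting asks for, here is a hedged PySpark sketch that reconciles a source Hive table against its load target by comparing row counts and a coarse column checksum. The table and column names are hypothetical, and a real suite would check many more rules.

```python
# Sketch: reconcile source vs. target after an ETL load using row counts and
# a coarse hash-based checksum. Tables and columns are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl_recon").enableHiveSupport().getOrCreate()

def profile(table: str):
    """Return (row_count, checksum) for a table, hashing two key columns."""
    df = spark.table(table)
    row = df.agg(
        F.count("*").alias("rows"),
        F.sum(F.hash("order_id", "amount")).alias("checksum"),
    ).collect()[0]
    return row["rows"], row["checksum"]

src_rows, src_sum = profile("staging.orders")    # assumed source table
tgt_rows, tgt_sum = profile("warehouse.orders")  # assumed target table

assert src_rows == tgt_rows, f"Row count mismatch: {src_rows} vs {tgt_rows}"
assert src_sum == tgt_sum, "Checksum mismatch between source and target"
print("Reconciliation passed")
spark.stop()
```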

Posted Date not available

Apply

7.0 - 12.0 years

20 - 30 Lacs

Bengaluru

Hybrid

We're Hiring: Big Data Engineer (Scala + Spark). Location: Bangalore (Whitefield), hybrid with 3 days in office per week. Budget: up to 30 LPA. Availability: immediate to a maximum of 2 weeks' notice.

Must-have skills:
- Scala & Spark (mandatory)
- Big Data & Hadoop ecosystem
- Hive, Kafka, Oozie
- Python
- Data structures & algorithms
- Design & coding skills
- CI/CD pipelines
- Problem solving & communication skills
- Strong organizational skills

Nice to have: experience in distributed data processing and performance optimization.

Why join us? At Xebia, you'll be part of a high-performance engineering team solving challenging data problems for global clients. We value innovation, ownership, and continuous learning.

How to apply: Send your updated CV to vijay.s@xebia.com with the following details: full name, total experience, current CTC, expected CTC, current location, preferred Xebia location, notice period / last working day (if serving), primary skills, and LinkedIn profile.

Posted Date not available

Apply

10.0 - 15.0 years

35 - 40 Lacs

Pune

Work from Office

Experience required: 10+ years overall, with 5+ years in Kafka infrastructure management and operations. Must have successfully deployed and maintained Kafka clusters in production environments, with proven experience in securing, monitoring, and scaling Kafka for enterprise-grade data streaming.

Overview: We are seeking an experienced Kafka Administrator to lead the deployment, configuration, and operational management of Apache Kafka clusters supporting real-time data ingestion pipelines. The role involves ensuring secure, scalable, and highly available Kafka infrastructure for streaming flow records into centralized data platforms.

Role & responsibilities:
- Architect and deploy Apache Kafka clusters with high availability.
- Implement Kafka MirrorMaker for cross-site replication and disaster recovery readiness.
- Integrate Kafka with upstream flow record sources using IPFIX-compatible plugins.
- Configure Kafka topics, partitions, replication, and retention policies based on data flow requirements (a short sketch follows this posting).
- Set up TLS/SSL encryption, Kerberos authentication, and access control using Apache Ranger.
- Monitor Kafka performance using Prometheus, Grafana, or Cloudera Manager and ensure proactive alerting.
- Perform capacity planning, cluster upgrades, patching, and performance tuning.
- Ensure audit logging, compliance with enterprise security standards, and integration with SIEM tools.
- Collaborate with solution architects and Kafka developers to align infrastructure with data pipeline needs.
- Maintain operational documentation and SOPs, and support SIT/UAT and production rollout activities.

Preferred candidate profile:
- Proven experience with Apache Kafka, Kafka Connect, Kafka Streams, and Schema Registry.
- Strong understanding of IPFIX, nProbe Cento, and network flow data ingestion.
- Hands-on experience with Apache Spark (Structured Streaming) and modern data lake or DWH platforms.
- Familiarity with Cloudera Data Platform, HDFS, YARN, Ranger, and Knox.
- Deep knowledge of data security protocols, encryption, and governance frameworks.
- Excellent communication, documentation, and stakeholder management skills.
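To illustrate the topic-provisioning task mentioned above, here is a minimal sketch using the kafka-python admin client (an assumption; the kafka-topics CLI or Cloudera tooling would serve equally well). The broker address, certificate paths, topic name, and sizing values are illustrative only.

```python
# Sketch: create a flow-record topic over TLS with explicit partition count,
# replication factor, and retention policy. All values are placeholders.
from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(
    bootstrap_servers="broker1:9093",        # hypothetical broker endpoint
    security_protocol="SSL",                 # TLS in transit
    ssl_cafile="/etc/kafka/certs/ca.pem",
    ssl_certfile="/etc/kafka/certs/client.pem",
    ssl_keyfile="/etc/kafka/certs/client.key",
)

# Partitions for throughput, replication for availability, 7-day retention.
flow_topic = NewTopic(
    name="ipfix.flow.records",
    num_partitions=12,
    replication_factor=3,
    topic_configs={
        "retention.ms": str(7 * 24 * 60 * 60 * 1000),
        "cleanup.policy": "delete",
    },
)

admin.create_topics(new_topics=[flow_topic], validate_only=False)
admin.close()
```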

Posted Date not available

Apply

4.0 - 7.0 years

9 - 13 Lacs

Gurugram, Bengaluru

Work from Office

Skills: Big Data, PySpark, Hive, Spark optimization. Good to have: GCP.

Posted Date not available

Apply

6.0 - 8.0 years

10 - 15 Lacs

Hyderabad

Hybrid

Key responsibilities:
- Design, build, and optimize large-scale data processing systems using distributed computing frameworks like Hadoop, Spark, and Kafka.
- Develop and maintain data pipelines (ETL/ELT) to support analytics, reporting, and machine learning use cases (a short sketch follows this posting).
- Integrate data from multiple sources (structured and unstructured) and ensure data quality and consistency.
- Collaborate with cross-functional teams to understand data needs and deliver data-driven solutions.
- Implement data governance, data security, and privacy best practices.
- Monitor performance and troubleshoot issues across the data infrastructure.
- Stay updated with the latest trends and technologies in big data and cloud computing.

Required qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 6+ years of experience in big data engineering or a similar role.
- Proficiency in big data technologies such as Hadoop, Apache Spark, Hive, and Kafka.
- Strong programming skills in Python.
- Experience with cloud platforms like AWS (EMR, S3, Redshift), GCP (BigQuery, Dataflow), or Azure (Data Lake, Synapse).
- Solid understanding of data modeling, ETL/ELT processes, and data warehousing concepts.
- Familiarity with CI/CD tools and practices for data engineering.

Preferred qualifications:
- Experience with orchestration tools like Apache Airflow or Prefect.
- Knowledge of real-time data processing and stream analytics.
- Exposure to containerization tools like Docker and Kubernetes.
- Certification in cloud technologies (e.g., AWS Certified Big Data - Specialty).
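As an illustration of the Spark-plus-Kafka pipeline work described above, here is a hedged Spark Structured Streaming sketch that consumes JSON events from a Kafka topic and lands them as Parquet. The topic, schema, and paths are assumptions, and the spark-sql-kafka connector package is assumed to be available on the cluster.

```python
# Sketch: stream JSON events from Kafka into a Parquet landing zone.
# Broker, topic, schema, and paths are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("events_stream").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("user_id", StringType()),
    StructField("amount", DoubleType()),
])

events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical broker
         .option("subscribe", "events")                       # hypothetical topic
         .load()
         # Kafka delivers the payload as binary; decode and parse the JSON.
         .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
         .select("e.*")
)

query = (
    events.writeStream.format("parquet")
          .option("path", "/data/landing/events/")
          .option("checkpointLocation", "/data/checkpoints/events/")
          .start()
)
query.awaitTermination()
```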

Posted Date not available

Apply

10.0 - 13.0 years

30 - 40 Lacs

Pune

Work from Office

Experience required: 10+ years overall, with 5+ years in Kafka infrastructure management and operations. Must have successfully deployed and maintained Kafka clusters in production environments, with proven experience in securing, monitoring, and scaling Kafka for enterprise-grade data streaming.

Overview: We are seeking an experienced Kafka Administrator to lead the deployment, configuration, and operational management of Apache Kafka clusters supporting real-time data ingestion pipelines. The role involves ensuring secure, scalable, and highly available Kafka infrastructure for streaming flow records into centralized data platforms.

Role & responsibilities:
- Architect and deploy Apache Kafka clusters with high availability.
- Implement Kafka MirrorMaker for cross-site replication and disaster recovery readiness.
- Integrate Kafka with upstream flow record sources using IPFIX-compatible plugins.
- Configure Kafka topics, partitions, replication, and retention policies based on data flow requirements.
- Set up TLS/SSL encryption, Kerberos authentication, and access control using Apache Ranger.
- Monitor Kafka performance using Prometheus, Grafana, or Cloudera Manager and ensure proactive alerting.
- Perform capacity planning, cluster upgrades, patching, and performance tuning.
- Ensure audit logging, compliance with enterprise security standards, and integration with SIEM tools.
- Collaborate with solution architects and Kafka developers to align infrastructure with data pipeline needs.
- Maintain operational documentation and SOPs, and support SIT/UAT and production rollout activities.

Preferred candidate profile:
- Proven experience with Apache Kafka, Kafka Connect, Kafka Streams, and Schema Registry.
- Strong understanding of IPFIX, nProbe Cento, and network flow data ingestion.
- Hands-on experience with Apache Spark (Structured Streaming) and modern data lake or DWH platforms.
- Familiarity with Cloudera Data Platform, HDFS, YARN, Ranger, and Knox.
- Deep knowledge of data security protocols, encryption, and governance frameworks.
- Excellent communication, documentation, and stakeholder management skills.

Posted Date not available

Apply

4.0 - 7.0 years

12 - 16 Lacs

Hyderabad, Bengaluru, India

Hybrid

Experience: 4 to 7 yrs. Location: Bangalore or Hyderabad. Position: permanent FTE. Must-have skills: Python, PySpark, Hadoop, Hive, big data technologies, Scala, SQL, Airflow, Kafka. Required candidate profile: looking for strong immediate joiners, or candidates with a last working date in August, from Hyderabad, Bangalore, or South India for Big Data Developer positions with a major US banking client.

Posted Date not available

Apply

6.0 - 10.0 years

25 - 40 Lacs

Noida

Work from Office

- Data pipeline development: Design and develop efficient big data pipelines (batch as well as streaming) using Apache Spark and Trino, ensuring timely and accurate data delivery (a short sketch follows this posting).
- Collaboration and communication: Work closely with data scientists, analysts, and stakeholders to understand data requirements, perform exploratory data analysis to recommend the best feature attributes and data models for AI model training, and deliver high-quality data solutions.
- Data exploration: Analyse customer data and patterns and suggest BFSI use cases such as insights, AI, and GenAI use cases.
- Data quality and security: Ensure data quality, integrity, and security across all data platforms, maintaining robust data governance practices.
- Documentation and troubleshooting: Own and document data pipelines and data lineage, monitoring and troubleshooting data pipeline issues to ensure timely and accurate data delivery.
- Big data processing: Hands-on experience with Apache Spark, Trino, MinIO/S3, ClickHouse, and SQL for batch and streaming data pipelines. Design and develop scalable data pipelines using Apache Spark (PySpark) and Python. Implement batch and real-time data processing solutions on large datasets. Work with various SQL and NoSQL databases (e.g., PostgreSQL, MySQL, MongoDB, Cassandra, DynamoDB).
- Data migration & warehousing: Expertise in metadata management, ETL processes, and cloud-based data solutions.
- Data analysis & exploration: Strong skills in dataset analysis, pattern recognition, and data visualization using Tableau or Power BI.
- Collaborate with data architects, analysts, and other engineers to understand business requirements and translate them into technical solutions.
- Optimize and troubleshoot Spark jobs for performance, scalability, and reliability.
- Exposure to containerization technologies like Docker and orchestration tools like Kubernetes.
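For illustration, here is a minimal PySpark batch sketch in line with the stack above: aggregate customer transactions into feature attributes and write the result to S3/MinIO as Parquet. The bucket, paths, and columns are hypothetical, and the s3a connector plus credentials are assumed to be configured on the cluster.

```python
# Sketch: derive customer-level feature attributes from raw transactions and
# write them back to object storage. All names and paths are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("customer_features").getOrCreate()

# Assumed raw transactions with customer_id, amount, and txn_ts columns.
txns = spark.read.parquet("s3a://example-bucket/raw/transactions/")

features = (
    txns.groupBy("customer_id")
        .agg(
            F.count("*").alias("txn_count"),
            F.sum("amount").alias("total_spend"),
            F.avg("amount").alias("avg_ticket_size"),
            F.max("txn_ts").alias("last_txn_ts"),
        )
)

features.write.mode("overwrite").parquet("s3a://example-bucket/features/customer/")
spark.stop()
```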

Posted Date not available

Apply

6.0 - 10.0 years

18 - 32 Lacs

Pune, Chennai

Work from Office

Mandatory: Experience and knowledge in designing, implementing, and managing non-relational data stores (e.g., MongoDB, Cassandra, DynamoDB), focusing on flexible schema design, scalability, and performance optimization for handling large volumes of unstructured or semi-structured data. The client mainly needs a NoSQL DB, either MongoDB or HBase.

Responsibilities:
- Data pipeline development: Design, develop, test, and deploy robust, high-performance, and scalable ETL/ELT data pipelines using Scala and Apache Spark to ingest, process, and transform large volumes of structured and unstructured data from diverse sources.
- Big data expertise: Leverage expertise in the Hadoop ecosystem (HDFS, Hive, etc.) and distributed computing principles to build efficient and fault-tolerant data solutions.
- Advanced SQL: Write complex, optimized SQL queries and stored procedures.
- Performance optimization: Continuously monitor, analyze, and optimize the performance of data pipelines and data stores. Troubleshoot complex data-related issues, identify bottlenecks, and implement solutions for improved efficiency and reliability.
- Data quality & governance: Implement data quality checks, validation rules, and reconciliation processes to ensure the accuracy, completeness, and consistency of data. Contribute to data governance and security best practices.
- Automation & CI/CD: Implement automation for data pipeline deployment, monitoring, and alerting using tools like Apache Airflow, Jenkins, or similar CI/CD platforms.
- Documentation: Create and maintain comprehensive technical documentation for data architectures, pipelines, and processes.

Required skills & qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field.
- Minimum 5 years of professional experience in Data Engineering, with a strong focus on big data technologies.
- Proficiency in Scala for developing big data applications and transformations, especially with Apache Spark.
- Expert-level proficiency in SQL: ability to write complex queries, optimize performance, and understand database internals.
- Extensive hands-on experience with Apache Spark (Spark SQL, DataFrames, RDDs) for large-scale data processing and analytics.
- Mandatory: Experience designing, implementing, and managing non-relational data stores (e.g., MongoDB, Cassandra, DynamoDB), with a focus on flexible schema design, scalability, and performance optimization for large volumes of unstructured or semi-structured data.
- Solid understanding of distributed computing concepts and experience with the Hadoop ecosystem (HDFS, Hive).
- Experience with building and optimizing ETL/ELT processes and data warehousing concepts.
- Strong understanding of data modeling techniques (e.g., Star Schema, Snowflake Schema).
- Familiarity with version control systems (e.g., Git).
- Excellent problem-solving, analytical, and communication skills.
- Ability to work independently and collaboratively in an Agile team environment.

Posted Date not available

Apply

2.0 - 5.0 years

4 - 7 Lacs

Pune

Work from Office

Learn more about our diversity, equity, and inclusion efforts and the networks ZS supports to assist our ZSers in cultivating community spaces, obtaining the resources they need to thrive, and sharing the messages they are passionate about.

ZS's Platform Development team designs, implements, tests, and supports ZS's ZAIDYN Platform, which helps drive superior customer experiences and revenue outcomes through integrated products and analytics. Whether writing distributed optimization algorithms or advanced mapping and visualization interfaces, you will have an opportunity to solve challenging problems, make an immediate impact, and contribute to better health outcomes.

What you'll do:
- As part of our full-stack product engineering team, build multi-tenant cloud-based software products/platforms and internal assets that leverage cutting-edge technologies on the Amazon AWS cloud platform.
- Pair program, write unit tests, lead code reviews, and collaborate with QA analysts to ensure you develop the highest-quality multi-tenant software that can be productized.
- Work with junior developers to implement large features that are on the cutting edge of Big Data.
- Be a technical leader to your team and help them improve their technical skills.
- Stand up for engineering practices that ensure quality products: automated testing, unit testing, agile development, continuous integration, code reviews, and technical design.
- Work with product managers and architects to design product architecture and to work on POCs.
- Take immediate responsibility for project deliverables.
- Understand client business issues and design features that meet client needs.
- Undergo on-the-job and formal training and certifications, and constantly advance your knowledge and problem-solving skills.

What you'll bring:
- 1-3 years of experience in developing software, ideally building SaaS products and services.
- Bachelor's degree in CS, IT, or a related discipline.
- Strong analytic, problem-solving, and programming ability.
- Good hands-on experience with AWS services (EC2, EMR, S3, serverless stack, RDS, SageMaker, IAM, EKS, etc.).
- Experience coding in an object-oriented language such as Python, Java, or C#.
- Hands-on experience with Apache Spark, EMR, Hadoop, HDFS, or other big data technologies.
- Experience with development on the AWS (Amazon Web Services) platform is preferable.
- Experience in Linux shell or PowerShell scripting is preferable.
- Experience in HTML5, JavaScript, and JavaScript libraries is preferable.
- Good to have: Pharma domain understanding.
- Initiative and drive to contribute.
- Excellent organizational and task management skills.
- Strong communication skills.
- Ability to work in global cross-office teams.
- ZS is a global firm; fluency in English is required.

Posted Date not available

Apply

4.0 - 6.0 years

32 - 35 Lacs

Bengaluru

Work from Office

Overview: Annalect is currently seeking a data engineering lead to join our technology team. In this role you will build data pipelines and develop data set processes. We are looking for people who have a shared passion for technology, design & development, data, and fusing these disciplines together to build cool things. You will lead teams working on one or more software and data products in the Annalect Engineering Team, participate in technical architecture, design, and development of software products as well as research and evaluation of new technical solutions, and help drive the vision forward for data engineering projects by extending and improving standards and by building high-performing, collaborative, enthusiastic teams.

Responsibilities:
- Evaluate architectural designs, technical tradeoffs, and solutions to problems to guide teams in delivering sustainable data and compute environments that facilitate usage of data assets.
- Coach engineers to improve their technical and soft skills.
- Align teams with Annalect's technical standards.
- Ensure compliance with critical security requirements.
- Lead collaboration with product and design to ensure delivery deadlines are met.
- Design, build, test, and deploy scalable and reusable systems that handle large amounts of data.

Qualifications:
- 6+ years of experience designing and managing data flows.
- Experience designing systems and APIs to integrate data into applications.
- 9+ years of Linux, Bash, Python, and SQL experience.
- 6+ years using Spark or Hadoop ecosystem software.
- 4+ years using AWS cloud services, especially EMR, Glue, Athena, and Redshift.
- 6+ years managing a team of developers.
- Passion for technology: excitement for new technology, bleeding-edge applications, and a positive attitude towards solving real-world challenges.

Posted Date not available

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot


Apply to 20+ Portals in one click


Download the Mobile App

Instantly access job listings, apply easily, and track applications.
