
552 HBase Jobs - Page 11

Set up a Job Alert
JobPe aggregates listings for easy access; you apply directly on the original job portal.

5.0 - 10.0 years

4 - 8 Lacs

Noida

Work from Office

We are looking for a skilled Data Software Engineer with 5 to 12 years of experience in Big Data and related technologies. The ideal candidate will have expertise in distributed computing principles, Apache Spark, and hands-on programming with Python.

Roles and Responsibility: Design and implement Big Data solutions using Apache Spark and other relevant technologies. Develop and maintain large-scale data processing systems, including stream-processing systems. Collaborate with cross-functional teams to integrate data from multiple sources, such as RDBMS, ERP, and files. Optimize the performance of Spark jobs and troubleshoot issues. Lead a team efficiently and contribute to the development of Big Data solutions. Experience with native cloud data services, such as AWS or Azure Databricks.

Job Requirements: Expert-level understanding of distributed computing principles and Apache Spark. Hands-on programming experience with Python and proficiency with Hadoop v2, MapReduce, HDFS, and Sqoop. Experience building stream-processing systems using technologies like Apache Storm or Spark Streaming. Good understanding of Big Data querying tools, such as Hive and Impala. Knowledge of ETL techniques and frameworks, along with experience with NoSQL databases like HBase, Cassandra, and MongoDB. Ability to work in an Agile environment and lead a team efficiently. Strong understanding of SQL queries, joins, stored procedures, and relational schemas. Experience integrating data from multiple sources, including RDBMS (SQL Server, Oracle), ERP, and files.
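As loose context for the Spark/Python stack this posting describes, a minimal PySpark batch job might look like the sketch below. It is illustrative only: the connection settings, table, and paths are hypothetical assumptions, not details from the posting.

from pyspark.sql import SparkSession, functions as F

# Minimal sketch: join an RDBMS extract with file data and publish Parquet.
spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Hypothetical JDBC source (SQL Server, as named in the requirements).
orders = (spark.read.format("jdbc")
          .option("url", "jdbc:sqlserver://erp-host:1433;databaseName=sales")
          .option("dbtable", "dbo.orders")
          .option("user", "etl_user")
          .option("password", "***")
          .load())

customers = spark.read.parquet("hdfs:///data/customers")

# Enrich orders and aggregate daily revenue per region.
daily = (orders.join(customers, "customer_id")
         .groupBy("region", F.to_date("order_ts").alias("order_date"))
         .agg(F.sum("amount").alias("revenue")))

daily.write.mode("overwrite").partitionBy("order_date").parquet("hdfs:///curated/daily_revenue")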

Posted 1 month ago

Apply

9.0 - 14.0 years

3 - 7 Lacs

Noida

Work from Office

We are looking for a skilled Data Engineer with 9 to 15 years of experience in the field. The ideal candidate will have expertise in designing and developing data pipelines using Confluent Kafka, ksqlDB, and Apache Flink.

Roles and Responsibility: Design and develop data pipelines for real-time and batch data ingestion and processing using Confluent Kafka, ksqlDB, and Apache Flink. Build and configure Kafka Connectors to ingest data from various sources, including databases, APIs, and message queues. Develop Flink applications for complex event processing, stream enrichment, and real-time analytics. Optimize ksqlDB queries for real-time data transformations, aggregations, and filtering. Implement data quality checks and monitoring to ensure data accuracy and reliability throughout the pipeline. Monitor and troubleshoot data pipeline performance, identifying bottlenecks and implementing optimizations.

Job Requirements: Bachelor's degree or higher from a reputed university. 8 to 10 years of total experience, with a majority related to ETL/ELT, big data, and Kafka. Proficiency in developing Flink applications for stream processing and real-time analytics. Strong understanding of data streaming concepts and architectures. Extensive experience with Confluent Kafka, including Kafka Brokers, Producers, Consumers, and Schema Registry.
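As a rough illustration of the Flink side of such a pipeline, the sketch below uses the PyFlink Table API to consume a Kafka topic and run a windowed real-time aggregation. The topic, brokers, and schema are hypothetical, and the Kafka SQL connector jar is assumed to be on the classpath.

from pyflink.table import EnvironmentSettings, TableEnvironment

# Streaming TableEnvironment.
t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Hypothetical source: click events arriving on a Kafka topic as JSON.
t_env.execute_sql("""
    CREATE TABLE clicks (
        user_id STRING,
        url STRING,
        ts TIMESTAMP(3),
        WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'clicks',
        'properties.bootstrap.servers' = 'broker:9092',
        'format' = 'json',
        'scan.startup.mode' = 'earliest-offset'
    )
""")

# Real-time aggregation: clicks per user per one-minute tumbling window.
t_env.execute_sql("""
    SELECT user_id,
           TUMBLE_START(ts, INTERVAL '1' MINUTE) AS window_start,
           COUNT(*) AS clicks
    FROM clicks
    GROUP BY user_id, TUMBLE(ts, INTERVAL '1' MINUTE)
""").print()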

Posted 1 month ago

Apply

7.0 - 10.0 years

4 - 8 Lacs

Noida

Work from Office

We are looking for a skilled Java developer with 7 to 10 years of hands-on experience in Java development and expertise in Apache Spark. The ideal candidate should have strong SQL skills, familiarity with data lake/data warehouse concepts, and a good understanding of distributed computing, parallel processing, and batch processing pipelines.

Roles and Responsibility: Design, develop, and maintain large-scale Java applications using Core Java, Multithreading, and Collections. Collaborate with cross-functional teams to identify and prioritize project requirements. Develop and implement efficient algorithms and data structures to solve complex problems. Troubleshoot and debug issues in existing codebases. Participate in code reviews and contribute to improving overall code quality. Stay updated with industry trends and emerging technologies to enhance our products and services.

Job Requirements: Strong proficiency in the Java programming language with expertise in Core Java, Multithreading, and Collections. Experience working with Apache Spark, including RDDs, DataFrames, and Spark SQL. Familiarity with Hadoop ecosystems, Hive, or HBase is desirable. Strong SQL skills and understanding of data lake/data warehouse concepts. Good understanding of distributed computing, parallel processing, and batch processing pipelines. Exposure to Kafka or other messaging systems for real-time data ingestion is an added advantage. Knowledge of Agile/Scrum development practices is beneficial. Familiarity with version control systems like Git and Bitbucket, and build tools such as Maven, Gradle, and Jenkins, is preferred.

Posted 1 month ago

Apply

5.0 - 10.0 years

7 - 11 Lacs

Pune

Work from Office

As a backend software engineer at Arista, you own your project end to end. You and your project team will work with product management and customers to define the requirements and design the architecture. You'll build the backend, write automated tests, and get it deployed into production via our CD pipeline. As a senior member of the team, you'll also be expected to help mentor and grow new team members. This role demands a strong and broad software engineering background, and you won't be limited to any single aspect of the product or development process.

Requirements: BS/MS degree in Computer Science and 5+ years of relevant experience. Strong knowledge of one or more programming languages (Go, Java, Python). Experience developing distributed systems or scale-out applications for a SaaS environment. Experience developing scalable backend systems in Go is a plus. Experience with network monitoring, network protocols, machine learning, or data analytics is a plus.

Posted 1 month ago

Apply

5.0 - 10.0 years

4 - 8 Lacs

Bengaluru

Work from Office

As a backend software engineer at Arista, you own your project end to end. You and your project team will work with product management and customers to define the requirements and design the architecture. You'll build the backend, write automated tests, and get it deployed into production via our CD pipeline. As a senior member of the team, you'll also be expected to help mentor and grow new team members. This role demands a strong and broad software engineering background, and you won't be limited to any single aspect of the product or development process.

Requirements: BS/MS degree in Computer Science and 5+ years of relevant experience. Strong knowledge of one or more programming languages (Go, Python, Java). Experience developing distributed systems or scale-out applications for a SaaS environment. Experience developing scalable backend systems in Go is a plus. Experience with network monitoring, network protocols, machine learning, or data analytics is a plus.

Posted 1 month ago

Apply

5.0 - 10.0 years

35 - 40 Lacs

Pune

Work from Office

As a backend software engineer at Arista, you own your project end to end. You and your project team will work with product management and customers to define the requirements and design the architecture. You'll build the backend, write automated tests, and get it deployed into production via our CD pipeline. As a senior member of the team, you'll also be expected to help mentor and grow new team members. This role demands a strong and broad software engineering background, and you won't be limited to any single aspect of the product or development process.

Requirements: BS/MS degree in Computer Science and 5+ years of relevant experience. Strong knowledge of one or more programming languages (Go, Java, Python). Experience developing distributed systems or scale-out applications for a SaaS environment. Experience developing scalable backend systems in Go is a plus. Experience with network monitoring, network protocols, machine learning, or data analytics is a plus.

Posted 1 month ago

Apply

5.0 - 10.0 years

35 - 40 Lacs

Bengaluru

Work from Office

As a backend software engineer at Arista, you own your project end to end. You and your project team will work with product management and customers to define the requirements and design the architecture. You'll build the backend, write automated tests, and get it deployed into production via our CD pipeline. As a senior member of the team, you'll also be expected to help mentor and grow new team members. This role demands a strong and broad software engineering background, and you won't be limited to any single aspect of the product or development process.

Requirements: BS/MS degree in Computer Science and 5+ years of relevant experience. Strong knowledge of one or more programming languages (Go, Python, Java). Experience developing distributed systems or scale-out applications for a SaaS environment. Experience developing scalable backend systems in Go is a plus. Experience with network monitoring, network protocols, machine learning, or data analytics is a plus.

Posted 1 month ago

Apply

4.0 - 9.0 years

9 - 13 Lacs

Kolkata, Mumbai, New Delhi

Work from Office

Krazy Mantra Group of Companies is looking for a Big Data Engineer to join our dynamic team and embark on a rewarding career journey. The role involves: designing and implementing scalable data storage solutions, such as Hadoop and NoSQL databases. Developing and maintaining big data processing pipelines using tools such as Apache Spark and Apache Storm. Writing and testing data processing scripts using languages such as Python and Scala. Integrating big data solutions with other IT systems and data sources. Collaborating with data scientists and business stakeholders to understand data requirements and identify opportunities for data-driven decision making. Ensuring the security and privacy of sensitive data. Monitoring performance and optimizing big data systems to ensure they meet performance and availability requirements. Staying up-to-date with emerging technologies and trends in big data and data engineering. Mentoring junior team members and providing technical guidance as needed. Documenting and communicating technical designs, solutions, and best practices. Strong problem-solving and debugging skills and excellent written and verbal communication skills are required.

Posted 1 month ago

Apply

2.0 - 6.0 years

5 - 8 Lacs

Pune

Work from Office

Supports, develops, and maintains a data and analytics platform. Effectively and efficiently processes, stores, and makes data available to analysts and other consumers. Works with business and IT teams to understand requirements and best leverage technologies to enable agile data delivery at scale. Note: although the role category in the GPP is listed as Remote, the requirement is for a hybrid work model.

Key Responsibilities: Oversee the development and deployment of end-to-end data ingestion pipelines using Azure Databricks, Apache Spark, and related technologies. Design high-performance, resilient, and scalable data architectures for data ingestion and processing. Provide technical guidance and mentorship to a team of data engineers. Collaborate with data scientists, business analysts, and stakeholders to integrate various data sources into the data lake/warehouse. Optimize data pipelines for speed, reliability, and cost efficiency in an Azure environment. Enforce and advocate for best practices in coding standards, version control, testing, and documentation. Work with Azure services such as Azure Data Lake Storage, Azure SQL Data Warehouse, Azure Synapse Analytics, and Azure Blob Storage. Implement data validation and data quality checks to ensure consistency, accuracy, and integrity. Identify and resolve complex technical issues proactively. Develop reliable, efficient, and scalable data pipelines with monitoring and alert mechanisms. Use agile development methodologies, including DevOps, Scrum, and Kanban.

Technical Skills: Expertise in Spark, including optimization, debugging, and troubleshooting. Proficiency in Azure Databricks for distributed data processing. Strong coding skills in Python and Scala for data processing. Experience with SQL for handling large datasets. Knowledge of data formats such as Iceberg, Parquet, ORC, and Delta Lake. Understanding of cloud infrastructure and architecture principles, especially within Azure.

Leadership & Soft Skills: Proven ability to lead and mentor a team of data engineers. Excellent communication and interpersonal skills. Strong organizational skills with the ability to manage multiple tasks and priorities. Ability to work in a fast-paced, constantly evolving environment. Strong problem-solving, analytical, and troubleshooting abilities. Ability to collaborate effectively with cross-functional teams.

Competencies: System Requirements Engineering: uses appropriate methods to translate stakeholder needs into verifiable requirements. Collaborates: builds partnerships and works collaboratively to meet shared objectives. Communicates Effectively: delivers clear, multi-mode communications tailored to different audiences. Customer Focus: builds strong customer relationships and delivers customer-centric solutions. Decision Quality: makes good and timely decisions to keep the organization moving forward. Data Extraction: performs ETL activities and transforms data for consumption by downstream applications. Programming: writes and tests computer code, with version control and build automation. Quality Assurance Metrics: uses measurement science to assess solution effectiveness. Solution Documentation: documents information for improved productivity and knowledge transfer. Solution Validation Testing: ensures solutions meet design and customer requirements. Data Quality: identifies, understands, and corrects data flaws. Problem Solving: uses systematic analysis to address and resolve issues. Values Differences: recognizes the value that diverse perspectives bring to an organization.

Preferred Knowledge & Experience: Exposure to Big Data open-source technologies (Spark, Scala/Java, MapReduce, Hive, HBase, Kafka, etc.). Experience with SQL and working with large datasets. Clustered compute cloud-based implementation experience. Familiarity with developing applications requiring large file movement in a cloud-based environment. Exposure to Agile software development and analytical solutions. Exposure to IoT technology.

Qualifications: Bachelor's or Master's degree in Computer Science, Information Technology, Engineering, or a related field. 3 to 5 years of experience in data engineering or a related field. Strong hands-on experience with Azure Databricks, Apache Spark, Python/Scala, CI/CD, Snowflake, and Qlik for data processing. Experience working with multiple file formats like Parquet, Delta, and Iceberg. Knowledge of Kafka or similar streaming technologies. Experience with data governance and data security in Azure. Proven track record of building large-scale data ingestion and ETL pipelines in cloud environments. Deep understanding of Azure Data Services. Experience with CI/CD pipelines, version control (Git), Jenkins, and agile methodologies. Familiarity with data lakes, data warehouses, and modern data architectures. Experience with Qlik Replicate (optional).
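To make the pipeline and data-quality responsibilities above concrete, here is a minimal PySpark/Delta Lake sketch of one ingestion step. The storage paths, column names, and quality rule are hypothetical assumptions, not part of the posting.

from pyspark.sql import SparkSession, functions as F

# On Databricks a SparkSession is provided; this builder form is for local testing.
spark = SparkSession.builder.appName("ingest-events").getOrCreate()

# Hypothetical raw zone: JSON files landed in Azure Data Lake Storage.
raw = spark.read.json("abfss://raw@account.dfs.core.windows.net/events/")

# Simple data-quality gate: reject rows missing a key or timestamp.
valid = raw.filter(F.col("event_id").isNotNull() & F.col("event_ts").isNotNull())
rejected = raw.subtract(valid)

# Publish curated data as Delta; quarantine rejects for inspection.
valid.write.format("delta").mode("append").save("abfss://curated@account.dfs.core.windows.net/events/")
rejected.write.format("delta").mode("append").save("abfss://quarantine@account.dfs.core.windows.net/events/")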

Posted 1 month ago

Apply

5.0 - 10.0 years

7 - 14 Lacs

Pune

Work from Office

We are looking for a skilled Data Engineer with 5-10 years of experience to join our team in Pune. The ideal candidate will have a strong background in data engineering and excellent problem-solving skills.

Roles and Responsibility: Design, develop, and implement data pipelines and architectures. Collaborate with cross-functional teams to identify and prioritize project requirements. Develop and maintain large-scale data systems and databases. Ensure data quality, integrity, and security. Optimize data processing and analysis workflows. Participate in code reviews and contribute to improving overall code quality.

Job Requirements: Strong proficiency in programming languages such as Python or Java. Experience with big data technologies like Hadoop or Spark. Knowledge of database management systems such as MySQL or NoSQL databases. Excellent problem-solving skills and attention to detail. Ability to work collaboratively in a team environment. Strong communication and interpersonal skills. Notice period: immediate joiners preferred.

Posted 1 month ago

Apply

3.0 - 6.0 years

5 - 9 Lacs

Chennai

Work from Office

We are looking for a skilled Hadoop Developer with 3 to 6 years of experience to join our team at IDESLABS PRIVATE LIMITED. The ideal candidate will have expertise in developing and implementing big data solutions using Hadoop technologies.

Roles and Responsibility: Design, develop, and deploy scalable big data applications using Hadoop. Collaborate with cross-functional teams to identify business requirements and develop solutions. Develop and maintain large-scale data processing systems using Hadoop MapReduce. Troubleshoot and optimize performance issues in existing Hadoop applications. Participate in code reviews to ensure high-quality code standards. Stay updated with the latest trends and technologies in big data development.

Job Requirements: Strong understanding of the Hadoop ecosystem, including HDFS, YARN, and Oozie. Experience with programming languages such as Java or Python. Knowledge of database management systems such as MySQL or NoSQL databases. Familiarity with agile development methodologies and version control systems like Git. Excellent problem-solving skills and attention to detail. Ability to work collaboratively in a team environment and communicate effectively with stakeholders.
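For illustration of Hadoop MapReduce development in Python, a job can be expressed as a mapper and reducer run under Hadoop Streaming. The word-count pair below is a generic, minimal sketch rather than anything from the posting; it would typically be launched with the hadoop-streaming jar (hadoop jar hadoop-streaming.jar -mapper mapper.py -reducer reducer.py -input /in -output /out).

#!/usr/bin/env python3
# mapper.py - emit (word, 1) pairs, one per line.
import sys

for line in sys.stdin:
    for word in line.strip().split():
        print(f"{word}\t1")

#!/usr/bin/env python3
# reducer.py - sum counts; Hadoop Streaming sorts mapper output by key first.
import sys

current_word, count = None, 0
for line in sys.stdin:
    word, value = line.rstrip("\n").split("\t", 1)
    if word != current_word:
        if current_word is not None:
            print(f"{current_word}\t{count}")
        current_word, count = word, 0
    count += int(value)
if current_word is not None:
    print(f"{current_word}\t{count}")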

Posted 1 month ago

Apply

3.0 - 5.0 years

9 - 13 Lacs

Bengaluru

Work from Office

At Allstate, great things happen when our people work together to protect families and their belongings from life's uncertainties. And for more than 90 years our innovative drive has kept us a step ahead of our customers' evolving needs: from advocating for seat belts, air bags, and graduated driving laws, to being an industry leader in pricing sophistication, telematics, and, more recently, device and identity protection.

This role is responsible for executing multiple tracks of work to deliver Big Data solutions enabling advanced data science and analytics. This includes working with the team on new Big Data systems for analyzing data; the coding and development of advanced analytics solutions to make and optimize business decisions and processes; and integrating new tools to improve descriptive, predictive, and prescriptive analytics. This role contributes to the structured and unstructured Big Data / Data Science tools of Allstate, from traditional to emerging analytics technologies and methods, and assists in the selection and development of other team members.

Key Responsibilities: Participate in the development of moderately complex and occasionally complex technical solutions using Big Data techniques in data and analytics processes. Develop innovative solutions within the team. Participate in the development of moderately complex and occasionally complex prototypes and department applications that integrate Big Data and advanced analytics to make business decisions. Use new areas of Big Data technologies (ingestion, processing, distribution) and research delivery methods that can solve business problems. Understand Big Data-related problems and requirements to identify the correct technical approach. Take coaching from key team members to ensure efforts within owned tracks of work will meet their needs. Execute moderately complex and occasionally complex functional work tracks for the team. Partner with Allstate Technology teams on Big Data efforts. Partner closely with team members on Big Data solutions for our data science community and analytic users. Leverage Big Data best practices and lessons learned to develop technical solutions.

Education: 4-year Bachelor's degree (preferred). Experience: 2 or more years of experience (preferred). Supervisory Responsibilities: this job does not have supervisory duties. Education & Experience (in lieu): in lieu of the above education requirements, an equivalent combination of education and experience may be considered. Primary Skills: Big Data Engineering, Big Data Systems, Big Data Technologies, Data Science, Influencing Others. Recruiter Info: Annapurna Jhaajhat@allstate.com

About Allstate: The Allstate Corporation is one of the largest publicly held insurance providers in the United States. Ranked No. 84 in the 2023 Fortune 500 list of the largest United States corporations by total revenue, The Allstate Corporation owns and operates 18 companies in the United States, Canada, Northern Ireland, and India. Allstate India Private Limited, also known as Allstate India, is a subsidiary of The Allstate Corporation. The India talent center was set up in 2012 and operates under the corporation's Good Hands promise. As it innovates operations and technology, Allstate India has evolved beyond its technology functions to be the critical strategic business services arm of the corporation. With offices in Bengaluru and Pune, the company offers expertise to the parent organization's business areas, including technology and innovation, accounting and imaging services, policy administration, transformation solution design and support services, transformation of property liability service design, global operations and integration, and training and transition. Learn more about Allstate India here.

Posted 1 month ago

Apply

5.0 - 7.0 years

6 - 10 Lacs

Noida

Work from Office

R1 is a leading provider of technology-driven solutions that help hospitals and health systems manage their financial systems and improve the patient experience. We are the one company that combines the deep expertise of a global workforce of revenue cycle professionals with the industry's most advanced technology platform, encompassing sophisticated analytics, AI, intelligent automation, and workflow orchestration. R1 is a place where we think boldly to create opportunities for everyone to innovate and grow; a place where we partner with purpose through transparency and inclusion. We are a global community of engineers, front-line associates, healthcare operators, and RCM experts who work together to go beyond for all those we serve, because we know that all this adds up to something more: a place where we're all together better.

R1 India is proud to be recognized amongst the Top 25 Best Companies to Work For 2024 by the Great Place to Work Institute. This is our second consecutive recognition on this prestigious Best Workplaces list, building on the Top 50 recognition we achieved in 2023. Our focus on employee wellbeing, inclusion, and diversity is demonstrated through prestigious recognitions, with R1 India ranked amongst Best in Healthcare and Top 100 Best Companies for Women by Avtar & Seramount, and amongst the Top 10 Best Workplaces in Health & Wellness. We are committed to transforming the healthcare industry with our innovative revenue cycle management services. Our goal is to make healthcare work better for all by enabling efficiency for healthcare systems, hospitals, and physician practices. With over 30,000 employees globally, we are about 16,000+ strong in India, with a presence in Delhi NCR, Hyderabad, Bangalore, and Chennai. Our inclusive culture ensures that every employee feels valued, respected, and appreciated, with a robust set of employee benefits and engagement activities. R1 RCM Inc. is a leading provider of technology-enabled revenue cycle management services which transform and solve challenges across health systems, hospitals, and physician practices. Headquartered in Chicago, R1 is a publicly traded organization with employees throughout the US and multiple India locations. Our mission is to be the one trusted partner to manage revenue, so providers and patients can focus on what matters most. Our priority is to always do what is best for our clients, patients, and each other. With our proven and scalable operating model, we complement a healthcare organization's infrastructure, quickly driving sustainable improvements to net patient revenue and cash flows while reducing operating costs and enhancing the patient experience.

Description: We are seeking a Data Engineer with 5-7 years of experience to join our Data Platform team. This role will report to the Manager of Data Engineering and be involved in the planning, design, and implementation of our centralized data warehouse solution for ETL, reporting, and analytics across all applications within the company.

Qualifications: Deep knowledge of and experience working with Python/Scala and Spark. Experienced in Azure Data Factory, Azure Databricks, Azure Blob Storage, Azure Data Lake, and Delta Lake. Experience working with Unity Catalog and Apache Parquet. Experience with Azure cloud environments. Experience acquiring and preparing data from primary and secondary disparate data sources. Experience working on large-scale data product implementations and being responsible for technical delivery. Experience working with agile methodology preferred. Healthcare industry experience preferred.

Responsibilities: Collaborate with and across Agile teams to design, develop, test, implement, and support technical solutions. Work with other teams with deep experience in ETL processes, distributed microservices, and data science domains to understand how to centralize their data. Share your passion for experimenting with and learning new technologies. Perform thorough data analysis, uncover opportunities, and address business problems.

Working in an evolving healthcare setting, we use our shared expertise to deliver innovative solutions. Our fast-growing team has opportunities to learn and grow through rewarding interactions, collaboration, and the freedom to explore professional interests. Our associates are given valuable opportunities to contribute, to innovate, and to create meaningful work that makes an impact in the communities we serve around the world. We also offer a culture of excellence that drives customer success and improves patient care. We believe in giving back to the community and offer a competitive benefits package. To learn more, visit r1rcm.com.

Posted 1 month ago

Apply

4.0 - 8.0 years

10 - 14 Lacs

Bengaluru

Work from Office

About the role: We are seeking a highly skilled Domain Expert in Condition Monitoring to join our team and play a pivotal role in advancing predictive maintenance strategies for electrical equipment. This position focuses on leveraging cutting-edge machine learning and data analytics techniques to design and implement scalable solutions that optimize maintenance processes, enhance equipment reliability, and support operational efficiency. As part of this role, you will apply your expertise in predictive modeling, supervised and unsupervised learning, and advanced data analysis to uncover actionable insights from high-dimensional datasets. You will collaborate with cross-functional teams to translate business requirements into data-driven solutions that surpass customer expectations. If you have a passion for innovation and sustainability in the industrial domain, this is an opportunity to make a meaningful impact.

Key Responsibilities: Develop and implement predictive maintenance models using a variety of supervised and unsupervised learning techniques (see the sketch after this listing). Analyze high-dimensional datasets to identify patterns and correlations that can inform maintenance strategies. Utilize linear methods for regression and classification, as well as advanced techniques such as splines, wavelets, and kernel methods. Conduct model assessment and selection, focusing on bias, variance, overfitting, and cross-validation. Apply ensemble learning techniques, including Random Forest and Boosting, to improve model accuracy and robustness. Implement structured methods for supervised learning, including additive models, trees, neural networks, and support vector machines. Explore unsupervised learning methods such as cluster analysis, principal component analysis, and self-organizing maps to uncover insights from data. Engage in directed and undirected graph modeling to represent and analyze complex relationships within the data. Collaborate with cross-functional teams to translate business requirements into data-driven solutions. Communicate findings and insights to stakeholders, providing actionable recommendations for maintenance optimization.

Mandatory: Master's degree or Ph.D. in Data Science, Statistics, Computer Science, Engineering, or a related field. Proven experience in predictive modeling and machine learning, particularly in the context of predictive maintenance. Strong programming skills in languages such as Python, R, or similar, with experience in relevant libraries (e.g., scikit-learn, TensorFlow, Keras). Familiarity with data visualization tools and techniques to effectively communicate complex data insights. Experience with big data technologies and frameworks (e.g., Hadoop, Spark) is a plus. Excellent problem-solving skills and the ability to work independently as well as part of a team. Strong communication skills, with the ability to convey technical concepts to non-technical stakeholders.

Good to Have: Experience in industrial software and enterprise solutions.

Preferred Skills & Attributes: Strong understanding of modern software architectures and DevOps principles. Ability to analyze complex problems and develop effective solutions. Excellent communication and teamwork skills, with experience in cross-functional collaboration. Self-motivated and capable of working independently on complex projects.

About the Team: Become a part of our mission for sustainability: clean energy for generations to come. We are a global team of diverse colleagues who share a passion for renewable energy and have a culture of trust and empowerment to make our own ideas a reality. We focus on personal and professional development to grow internally within our organization.

Who is Siemens Energy? At Siemens Energy, we are more than just an energy technology company. We meet the growing energy demand across 90+ countries while ensuring our climate is protected. With more than 96,000 dedicated employees, we not only generate electricity for over 16% of the global community, but we're also using our technology to help protect people and the environment. Our global team is committed to making sustainable, reliable, and affordable energy a reality by pushing the boundaries of what is possible. We uphold a 150-year legacy of innovation that encourages our search for people who will support our focus on decarbonization, new technologies, and energy transformation.
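As a small illustration of the ensemble-learning and cross-validation work described above, the sketch below trains a random-forest classifier on synthetic sensor-style features to flag equipment at risk; the features, label rule, and data are invented for the example.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)

# Hypothetical condition-monitoring features: temperature, vibration, current draw.
X = rng.normal(size=(1000, 3))
# Toy failure label loosely tied to vibration plus noise (placeholder for real data).
y = (X[:, 1] + 0.5 * rng.normal(size=1000) > 1.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
# Cross-validation guards against the overfitting concerns the posting highlights.
print("CV accuracy:", cross_val_score(model, X_train, y_train, cv=5).mean())

model.fit(X_train, y_train)
print("Held-out accuracy:", model.score(X_test, y_test))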

Posted 1 month ago

Apply

3.0 - 8.0 years

11 - 16 Lacs

Bengaluru

Work from Office

As a Data Engineer, you are required to: design, build, and maintain data pipelines that efficiently process and transport data from various sources to storage systems or processing environments, while ensuring data integrity, consistency, and accuracy across the entire data pipeline. Integrate data from different systems, often involving data cleaning, transformation (ETL), and validation. Design the structure of databases and data storage systems, including the design of schemas, tables, and relationships between datasets to enable efficient querying. Work closely with data scientists, analysts, and other stakeholders to understand their data needs and ensure that the data is structured in a way that makes it accessible and usable. Stay up-to-date with the latest trends and technologies in the data engineering space, such as new data storage solutions, processing frameworks, and cloud technologies. Evaluate and implement new tools to improve data engineering processes.

Qualification: Bachelor's or Master's in Computer Science & Engineering, or equivalent. A professional degree in Data Science or Engineering is desirable. Experience level: at least 3-5 years of hands-on experience in Data Engineering.

Desired Knowledge & Experience: Spark: Spark 3.x, RDD/DataFrames/SQL, batch/Structured Streaming; knowledge of Spark internals (Catalyst/Tungsten/Photon). Databricks: Workflows, SQL Warehouses/Endpoints, DLT, Pipelines, Unity, Autoloader. IDE: IntelliJ/PyCharm, Git, Azure DevOps, GitHub Copilot. Test: pytest, Great Expectations. CI/CD: YAML Azure Pipelines, continuous delivery, acceptance testing. Big Data design: Lakehouse/Medallion architecture, Parquet/Delta, partitioning, distribution, data skew, compaction. Languages: Python/functional programming (FP). SQL: T-SQL/Spark SQL/HiveQL. Storage: data lake and Big Data storage design. Additionally, it is helpful to know the basics of: data pipelines (ADF/Synapse Pipelines/Oozie/Airflow); languages (Scala, Java); NoSQL (Cosmos, Mongo, Cassandra); cubes (SSAS: ROLAP, HOLAP, MOLAP; AAS; Tabular Model); SQL Server (T-SQL, stored procedures); Hadoop (HDInsight/MapReduce/HDFS/YARN/Oozie/Hive/HBase/Ambari/Ranger/Atlas/Kafka); data catalogs (Azure Purview, Apache Atlas, Informatica).

Required Soft Skills & Other Capabilities: Great attention to detail and good analytical abilities. Good planning and organizational skills. A collaborative approach to sharing ideas and finding solutions. Ability to work independently and also in a global team environment.
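To illustrate the Spark 3.x Structured Streaming and Delta skills listed above, here is a minimal streaming sketch that parses JSON telemetry from Kafka and writes a Delta table. The topic, schema, and paths are hypothetical, and the Kafka and Delta packages are assumed to be available (as they are on Databricks).

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("stream-demo").getOrCreate()

schema = StructType([
    StructField("device_id", StringType()),
    StructField("reading", DoubleType()),
    StructField("ts", TimestampType()),
])

# Hypothetical Kafka source of JSON telemetry events.
events = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "telemetry")
          .load()
          .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Checkpointing gives the stream restartable, exactly-once file output.
query = (events.writeStream.format("delta")
         .option("checkpointLocation", "/chk/telemetry")
         .start("/delta/telemetry"))
query.awaitTermination()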

Posted 1 month ago

Apply

8.0 - 13.0 years

13 - 17 Lacs

Noida

Work from Office

Join Us in Transforming Healthcare with the Power of Data & AI. At Innovaccer, we're on a mission to build the most advanced Healthcare Intelligence Platform ever created. Grounded in an AI-first design philosophy, our platform turns complex health data into real-time intelligence, empowering healthcare systems to make faster, smarter decisions. We are building a unified, end-to-end data platform that spans data acquisition & integration, master data management, data classification & governance, an advanced analytics & AI studio, an app marketplace, AI-as-BI capabilities, and more. All of this is powered by an agent-first approach, enabling customers to build solutions dynamically and at scale. You'll have the opportunity to define and develop platform capabilities that help healthcare organizations tackle some of the industry's most pressing challenges, such as kidney disease management, clinical trials optimization for pharmaceutical companies, supply chain intelligence for pharmacies, and many more real-world applications. We're looking for talented engineers and platform thinkers who thrive on solving large-scale, complex, and meaningful problems. If you're excited about working at the intersection of healthcare, AI, and cutting-edge platform engineering, we'd love to hear from you.

About the Role: We are looking for a Staff Engineer to design and develop highly scalable, low-latency data platforms and processing engines. This role is ideal for engineers who enjoy building core systems and infrastructure that enable mission-critical analytics at scale. You'll work on solving some of the toughest data engineering challenges in healthcare.

A Day in the Life: Architect, design, and build scalable data tools and frameworks. Collaborate with cross-functional teams to ensure data compliance, security, and usability. Lead initiatives around metadata management, data lineage, and data cataloging. Define and evangelize standards and best practices across data engineering teams. Own the end-to-end lifecycle of tooling, from prototyping to production deployment. Mentor and guide junior engineers and contribute to technical leadership across the organization. Drive innovation in privacy-by-design, regulatory compliance (e.g., HIPAA), and data observability solutions.

What You Need: 8+ years of experience in software engineering with strong experience building distributed systems. Proficiency in backend development (Python, Java, Scala, or Go) and familiarity with RESTful API design. Expertise in modern data stacks: Kafka, Spark, Airflow, Snowflake, etc. Experience with open-source data governance frameworks like Apache Atlas, Amundsen, or DataHub is a big plus. Familiarity with cloud platforms (AWS, Azure, GCP) and their native data governance offerings. Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Posted 1 month ago

Apply

7.0 - 12.0 years

4 - 8 Lacs

Bengaluru

Work from Office

About the Role: We are seeking a highly skilled Data Engineer with deep expertise in PySpark and the Cloudera Data Platform (CDP) to join our data engineering team. As a Data Engineer, you will be responsible for designing, developing, and maintaining scalable data pipelines that ensure high data quality and availability across the organization. This role requires a strong background in big data ecosystems, cloud-native tools, and advanced data processing techniques. The ideal candidate has hands-on experience with data ingestion, transformation, and optimization on the Cloudera Data Platform, along with a proven track record of implementing data engineering best practices. You will work closely with other data engineers to build solutions that drive impactful business insights.

Responsibilities: Data Pipeline Development: design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark on the Cloudera Data Platform, ensuring data integrity and accuracy. Data Ingestion: implement and manage data ingestion processes from a variety of sources (e.g., relational databases, APIs, file systems) to the data lake or data warehouse on CDP. Data Transformation and Processing: use PySpark to process, cleanse, and transform large datasets into meaningful formats that support analytical needs and business requirements. Performance Optimization: conduct performance tuning of PySpark code and Cloudera components, optimizing resource utilization and reducing the runtime of ETL processes. Data Quality and Validation: implement data quality checks, monitoring, and validation routines to ensure data accuracy and reliability throughout the pipeline. Automation and Orchestration: automate data workflows using tools like Apache Oozie, Airflow, or similar orchestration tools within the Cloudera ecosystem.

Education and Experience: Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field. 3+ years of experience as a Data Engineer, with a strong focus on PySpark and the Cloudera Data Platform.

Technical Skills: PySpark: advanced proficiency, including working with RDDs, DataFrames, and optimization techniques. Cloudera Data Platform: strong experience with CDP components, including Cloudera Manager, Hive, Impala, HDFS, and HBase. Data Warehousing: knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala). Big Data Technologies: familiarity with Hadoop, Kafka, and other distributed computing tools. Orchestration and Scheduling: experience with Apache Oozie, Airflow, or similar orchestration frameworks. Scripting and Automation: strong scripting skills in Linux.
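As one concrete, purely illustrative example of a CDP-style PySpark step, the sketch below reads a metastore-managed Hive table, cleanses it, and publishes a curated table. The database and table names are hypothetical.

from pyspark.sql import SparkSession, functions as F

# Hive support lets Spark read and write tables managed by the cluster metastore.
spark = (SparkSession.builder
         .appName("claims-curation")
         .enableHiveSupport()
         .getOrCreate())

# Hypothetical raw Hive table populated by an ingestion job.
claims = spark.table("raw_db.claims")

# Cleanse and standardize before publishing to the curated zone.
curated = (claims
           .dropDuplicates(["claim_id"])
           .withColumn("claim_date", F.to_date("claim_ts"))
           .filter(F.col("amount") > 0))

curated.write.mode("overwrite").saveAsTable("curated_db.claims")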

Posted 1 month ago

Apply

4.0 - 9.0 years

8 - 13 Lacs

Hyderabad

Work from Office

We are looking for a PySpark solutions developer and data engineer who can design and build solutions for one of our Fortune 500 client programs, which aims at building data standardization and curation capabilities on a Hadoop cluster. This high-visibility, fast-paced key initiative will integrate data across internal and external sources, provide analytical insights, and integrate with the customer's critical systems.

Key Responsibilities: Ability to design, build, and unit test applications on the Spark framework in Python. Build PySpark-based applications for both batch and streaming requirements, which will require in-depth knowledge of the majority of Hadoop and NoSQL databases as well. Develop and execute data pipeline testing processes and validate business rules and policies. Optimize performance of the built Spark applications in Hadoop using configurations around Spark Context, Spark-SQL, DataFrames, and Pair RDDs. Optimize performance for data access requirements by choosing the appropriate native Hadoop file formats (Avro, Parquet, ORC, etc.) and compression codecs. Build integrated solutions leveraging Unix shell scripting, RDBMS, Hive, the HDFS file system, HDFS file types, and HDFS compression codecs. Build data tokenization libraries and integrate them with Hive and Spark for column-level obfuscation. Experience processing large amounts of structured and unstructured data, including integrating data from multiple sources. Create and maintain an integration and regression testing framework on Jenkins integrated with Bitbucket and/or Git repositories. Participate in the agile development process, and document and communicate issues and bugs relative to data standards in scrum meetings. Work collaboratively with onsite and offshore teams. Develop and review technical documentation for artifacts delivered. Ability to solve complex data-driven scenarios and triage towards defects and production issues. Ability to learn-unlearn-relearn concepts with an open and analytical mindset. Participate in code releases and production deployments. Challenge and inspire team members to achieve business results in a fast-paced and quickly changing environment.

Preferred Qualifications: BE/B.Tech/B.Sc. in Computer Science/Statistics from an accredited college or university. Minimum 3 years of extensive experience in the design, build, and deployment of PySpark-based applications. Expertise in handling complex large-scale Big Data environments, preferably 20 TB+. Minimum 3 years of experience in the following: Hive, YARN, HDFS. Hands-on experience writing complex SQL queries and exporting and importing large amounts of data using utilities. Ability to build abstracted, modularized, reusable code components. Prior experience with ETL tools, preferably Informatica PowerCenter, is advantageous. Able to quickly adapt and learn. Able to jump into an ambiguous situation and take the lead on resolution. Able to communicate and coordinate across various teams. Comfortable tackling new challenges and new ways of working. Ready to move from traditional methods and adapt to agile ones. Comfortable challenging your peers and leadership team. Can prove yourself quickly and decisively. Excellent communication skills and good customer centricity. Strong target and high solution orientation.
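For the file-format and compression-codec choices mentioned above, the snippet below writes the same DataFrame as Parquet, ORC, and Avro with explicit codecs. The paths are placeholders, and the Avro writer assumes the spark-avro package is on the classpath.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("format-demo").getOrCreate()
df = spark.range(1_000_000).withColumnRenamed("id", "record_id")

# Columnar formats suit analytical scans; the codec trades CPU for file size.
df.write.mode("overwrite").option("compression", "snappy").parquet("/tmp/demo_parquet")
df.write.mode("overwrite").option("compression", "zlib").orc("/tmp/demo_orc")

# Avro is row-oriented and common as a landing/interchange format
# (requires the org.apache.spark:spark-avro package).
df.write.mode("overwrite").format("avro").save("/tmp/demo_avro")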

Posted 1 month ago

Apply

8.0 - 13.0 years

5 - 10 Lacs

Hyderabad

Work from Office

6+ years of experience with Java Spark. Strong understanding of distributed computing, big data principles, and batch/stream processing. Proficiency in working with AWS services such as S3, EMR, Glue, Lambda, and Athena. Experience with Data Lake architectures and handling large volumes of structured and unstructured data. Familiarity with various data formats. Strong problem-solving and analytical skills. Excellent communication and collaboration abilities.

Responsibilities: Design, develop, and optimize large-scale data processing pipelines using Java Spark. Build scalable solutions to manage data ingestion, transformation, and storage in AWS-based Data Lake environments. Collaborate with data architects and analysts to implement data models and workflows aligned with business requirements. Ensure performance tuning, fault tolerance, and reliability of distributed data processing systems.
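A minimal sketch of the S3-based pipeline step this role describes appears below. The posting calls for Java Spark; for consistency with the other sketches on this page the example uses the equivalent PySpark API, and the bucket names are hypothetical.

from pyspark.sql import SparkSession, functions as F

# On EMR, S3 access through the s3:// scheme is preconfigured.
spark = SparkSession.builder.appName("s3-pipeline").getOrCreate()

# Hypothetical raw zone in a data lake bucket.
raw = spark.read.json("s3://example-datalake/raw/transactions/")

cleaned = (raw.filter(F.col("amount").isNotNull())
           .withColumn("tx_date", F.to_date("tx_ts")))

# Partitioned Parquet output is directly queryable from Athena/Glue.
(cleaned.write.mode("overwrite")
 .partitionBy("tx_date")
 .parquet("s3://example-datalake/curated/transactions/"))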

Posted 1 month ago

Apply

8.0 - 13.0 years

5 - 10 Lacs

Bengaluru

Work from Office

6+ years of experience with Java Spark. Strong understanding of distributed computing, big data principles, and batch/stream processing. Proficiency in working with AWS services such as S3, EMR, Glue, Lambda, and Athena. Experience with Data Lake architectures and handling large volumes of structured and unstructured data. Familiarity with various data formats. Strong problem-solving and analytical skills. Excellent communication and collaboration abilities.

Responsibilities: Design, develop, and optimize large-scale data processing pipelines using Java Spark. Build scalable solutions to manage data ingestion, transformation, and storage in AWS-based Data Lake environments. Collaborate with data architects and analysts to implement data models and workflows aligned with business requirements. Ensure performance tuning, fault tolerance, and reliability of distributed data processing systems.

Posted 1 month ago

Apply

8.0 - 13.0 years

8 - 12 Lacs

Hyderabad

Work from Office

10+ years of experience with Java Spark. Strong understanding of distributed computing, big data principles, and batch/stream processing. Proficiency in working with AWS services such as S3, EMR, Glue, Lambda, and Athena. Experience with Data Lake architectures and handling large volumes of structured and unstructured data. Familiarity with various data formats. Strong problem-solving and analytical skills. Excellent communication and collaboration abilities.

Responsibilities: Design, develop, and optimize large-scale data processing pipelines using Java Spark. Build scalable solutions to manage data ingestion, transformation, and storage in AWS-based Data Lake environments. Collaborate with data architects and analysts to implement data models and workflows aligned with business requirements.

Posted 1 month ago

Apply

8.0 - 13.0 years

10 - 14 Lacs

Bengaluru

Work from Office

Educational Requirements: Bachelor of Engineering. Service Line: Strategic Technology Group.

Responsibilities: Power Programmer is an important initiative within Global Delivery to develop a team of Full Stack Developers who will work on complex engineering projects, platforms, and marketplaces for our clients using emerging technologies. They will be ahead of the technology curve and will be constantly enabled and trained to be polyglots. They are go-getters with a drive to solve end-customer challenges, and they will spend most of their time designing and coding. The role involves end-to-end contribution to technology-oriented development projects; providing solutions with minimum system requirements, in Agile mode; and collaborating with other Power Programmers, the open-source community, and tech user groups. Opportunities include custom development of new platforms and solutions, work on large-scale digital platforms and marketplaces, work on complex engineering projects using cloud-native architecture, work with innovative Fortune 500 companies on cutting-edge technologies, co-creating and developing new products and platforms for our clients, contributing to open source and continuously upskilling in the latest technology areas, and incubating tech user groups.

Technical and Professional Requirements: Big Data - Spark, Scala, Hive, Kafka.

Preferred Skills: Technology-Big Data-HBase; Technology-Big Data-Sqoop; Technology-Functional Programming-Scala; Technology-Big Data - Data Processing-Spark-SparkSQL.

Posted 1 month ago

Apply

5.0 - 8.0 years

8 - 13 Lacs

Bengaluru

Work from Office

Educational Requirements: Bachelor of Engineering. Service Line: Strategic Technology Group.

Responsibilities: Power Programmer is an important initiative within Global Delivery to develop a team of Full Stack Developers who will work on complex engineering projects, platforms, and marketplaces for our clients using emerging technologies. They will be ahead of the technology curve and will be constantly enabled and trained to be polyglots. They are go-getters with a drive to solve end-customer challenges, and they will spend most of their time designing and coding. The role involves end-to-end contribution to technology-oriented development projects; providing solutions with minimum system requirements, in Agile mode; and collaborating with other Power Programmers, the open-source community, and tech user groups. Opportunities include custom development of new platforms and solutions, work on large-scale digital platforms and marketplaces, work on complex engineering projects using cloud-native architecture, work with innovative Fortune 500 companies on cutting-edge technologies, co-creating and developing new products and platforms for our clients, contributing to open source and continuously upskilling in the latest technology areas, and incubating tech user groups.

Technical and Professional Requirements: Big Data - Spark, Scala, Hive, Kafka.

Preferred Skills: Technology-Big Data - Hadoop-Hadoop-Hive; Technology-Big Data-Sqoop; Technology-Functional Programming-Scala; Technology-Big Data - Data Processing-Spark-SparkSQL.

Posted 1 month ago

Apply

6.0 - 11.0 years

3 - 7 Lacs

Pune

Work from Office

Experience: 7-9 years. Experience with AWS services is a must, including S3, Lambda, Airflow, Glue, Athena, Lake Formation, Step Functions, etc. Experience programming in Java and Python. Experience performing data analysis (not data science) on AWS platforms. Nice to have: experience with Big Data technologies (Teradata, Snowflake, Spark, Redshift, Kafka, etc.); experience with data management processes on AWS is a huge plus; experience implementing complex ETL transformations on AWS using Glue; familiarity with relational database environments (Oracle, Teradata, etc.), leveraging databases, tables/views, stored procedures, agent jobs, etc.
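A Glue ETL job of the kind described here typically follows the pattern in the sketch below. The catalog database, table, and output path are hypothetical, and the awsglue modules are only available inside the Glue job runtime.

import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue boilerplate: wrap the SparkContext and initialize the job.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from the Glue Data Catalog (hypothetical database/table).
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders")

# Plain Spark DataFrame operations handle the transformation itself.
df = dyf.toDF().filter("amount > 0")

# Write curated Parquet back to S3 for Athena to query.
df.write.mode("overwrite").parquet("s3://example-bucket/curated/orders/")
job.commit()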

Posted 1 month ago

Apply

8.0 - 13.0 years

4 - 8 Lacs

Hyderabad

Work from Office

This role will be instrumental in building and maintaining robust, scalable, and reliable data pipelines using Confluent Kafka, ksqlDB, Kafka Connect, and Apache Flink. The ideal candidate will have a strong understanding of data streaming concepts, experience with real-time data processing, and a passion for building high-performance data solutions. This role requires excellent analytical skills, attention to detail, and the ability to work collaboratively in a fast-paced environment.

Essential Responsibilities: Design and develop data pipelines for real-time and batch data ingestion and processing using Confluent Kafka, ksqlDB, Kafka Connect, and Apache Flink. Build and configure Kafka Connectors to ingest data from various sources (databases, APIs, message queues, etc.) into Kafka. Develop Flink applications for complex event processing, stream enrichment, and real-time analytics. Develop and optimize ksqlDB queries for real-time data transformations, aggregations, and filtering. Implement data quality checks and monitoring to ensure data accuracy and reliability throughout the pipeline. Monitor and troubleshoot data pipeline performance, identify bottlenecks, and implement optimizations. Automate data pipeline deployment, monitoring, and maintenance tasks. Stay up-to-date with the latest advancements in data streaming technologies and best practices. Contribute to the development of data engineering standards and best practices within the organization. Participate in code reviews and contribute to a collaborative and supportive team environment. Work closely with other architects and tech leads in India and the US, and create POCs and MVPs. Provide regular updates on tasks, status, and risks to the project manager.

Required: Bachelor's degree or higher from a reputed university. 8 to 10 years of total experience, with the majority related to ETL/ELT, big data, Kafka, etc. Proficiency in developing Flink applications for stream processing and real-time analytics. Strong understanding of data streaming concepts and architectures. Extensive experience with Confluent Kafka, including Kafka Brokers, Producers, Consumers, and Schema Registry. Hands-on experience with ksqlDB for real-time data transformations and stream processing. Experience with Kafka Connect and building custom connectors. Extensive experience implementing large-scale data ingestion and curation solutions. Good hands-on experience with a big data technology stack on any cloud platform. Excellent problem-solving, analytical, and communication skills. Ability to work independently and as part of a team.

Good to have: Experience with Google Cloud. Healthcare industry experience. Experience with Agile.
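Registering a Kafka Connector, as described above, is normally done through the Kafka Connect REST API. The Python sketch below posts a JDBC source connector configuration; the host, database settings, and credentials are hypothetical placeholders.

import json
import requests

# Hypothetical JDBC source: stream new rows from a Postgres table into Kafka.
connector = {
    "name": "orders-jdbc-source",
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:postgresql://db-host:5432/shop",
        "connection.user": "connect_user",
        "connection.password": "***",
        "table.whitelist": "orders",
        "mode": "incrementing",
        "incrementing.column.name": "order_id",
        "topic.prefix": "pg.",
    },
}

# Kafka Connect conventionally listens on port 8083.
resp = requests.post(
    "http://connect-host:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
resp.raise_for_status()
print(resp.json())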

Posted 1 month ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.
