
280 Apache Spark Jobs - Page 8

Set up a Job Alert
JobPe aggregates listings for easy access, but applications are submitted directly on the original job portal.

5.0 - 7.0 years

7 - 9 Lacs

Coimbatore

Work from Office

About the job: Experience: 5+ years | Notice period: Immediate to 15 days | Interview: 3 rounds (virtual)
Mandatory skills: Apache Spark, Hive, Hadoop, Scala, Databricks

The Role:
- Designing and building optimized data pipelines using cutting-edge technologies in a cloud environment to drive analytical insights.
- Constructing infrastructure for efficient ETL processes from various sources and storage systems.
- Leading the implementation of algorithms and prototypes to transform raw data into useful information.
- Architecting, designing, and maintaining database pipeline architectures, ensuring readiness for AI/ML transformations.
- Creating innovative data validation methods and data analysis tools.
- Ensuring compliance with data governance and security policies.
- Interpreting data trends and patterns to establish operational alerts.
- Developing analytical tools, programs, and reporting mechanisms.
- Conducting complex data analysis and presenting results effectively.
- Preparing data for prescriptive and predictive modeling.
- Continuously exploring opportunities to enhance data quality and reliability.
- Applying strong programming and problem-solving skills to develop scalable solutions.

Requirements:
- Experience with Big Data technologies (Hadoop, Spark, NiFi, Impala).
- 5+ years of hands-on experience designing, building, deploying, testing, maintaining, monitoring, and owning scalable, resilient, and distributed data pipelines.
- High proficiency in Scala/Java and Spark for large-scale data processing.
- Expertise with big data technologies, including Spark, Data Lake, and Hive.
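For context, a minimal sketch of the kind of cloud ETL pipeline this listing describes. It is shown in PySpark for brevity (the posting itself asks for Scala/Java); the source path, columns, and sink are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw data from an illustrative source location.
orders = spark.read.option("header", True).csv("s3a://raw-bucket/orders/")

# Transform: type-cast, drop bad rows, and aggregate for downstream analytics.
daily_revenue = (
    orders
    .withColumn("amount", F.col("amount").cast("double"))
    .filter(F.col("amount").isNotNull())
    .groupBy("order_date")
    .agg(F.sum("amount").alias("revenue"), F.count("*").alias("order_count"))
)

# Load: write partitioned Parquet ready for Hive/Databricks consumption.
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://curated-bucket/daily_revenue/"
)
```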

Posted 3 weeks ago

Apply

5.0 - 9.0 years

20 - 35 Lacs

Bengaluru

Hybrid

Job Description: Data Development Engineer for the Data Initiative at Global Link. The ideal candidate will:
- Work with the team to define high-level technical requirements and architecture for back-end services, data components, and data monetization components.
- Develop new application features and enhance existing ones.
- Develop relevant documentation and diagrams.
- Work with other teams on deployment, testing, training, and production support.
- Integrate with Data Engineering teams.
- Ensure that development, coding, privacy, and security standards are adhered to.
- Write clean, quality code and be ready to work on new technologies as business demands.
- Bring strong communication skills and work ethics.

Core/Must-have skills:
- Out of the total years of experience, a minimum of 5+ years of professional experience in Python development, with a focus on data-intensive applications.
- Proven experience with Apache Spark and PySpark for large-scale data processing.
- Solid understanding of SQL and experience working with relational databases (e.g., Oracle, Spark SQL) and query optimization.
- Experience in the SDLC, particularly in applying software development best practices and methodologies.
- Experience creating and maintaining unit tests, integration tests, and performance tests for data pipelines and systems.
- Experience with the Databricks big data platform.
- Experience building data-intensive applications and data products, with a good understanding of data pipelines (feature data engineering, data transformation, data lineage, data quality).
- Experience with cloud platforms such as AWS for data infrastructure and services is preferred.
This is a hands-on developer position within a small, elite development team that moves very fast; the role will evolve into tech leadership for the Data Initiative.

Good-to-have skills:
- Knowledge of the FX business / capital markets domain.
- Knowledge of data formats like Avro and Parquet, and working with complex data types.
- Experience with Apache Kafka for real-time data streaming and Kafka Streams for processing data streams.
- Experience with Airflow for orchestrating complex data workflows and pipelines.
- Expertise or interest in Linux.
- Exposure to data governance and security best practices in data management.
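As an illustration of the "unit tests for data pipelines" requirement above, a minimal sketch (not the employer's code) of testing a PySpark transformation with pytest and a local SparkSession; the function and column names are hypothetical.

```python
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window


def dedupe_latest(df, key_col, ts_col):
    """Keep only the most recent row per key."""
    w = Window.partitionBy(key_col).orderBy(F.col(ts_col).desc())
    return df.withColumn("_rn", F.row_number().over(w)).filter(F.col("_rn") == 1).drop("_rn")


@pytest.fixture(scope="module")
def spark():
    # Local single-threaded session is enough for pipeline unit tests.
    return SparkSession.builder.master("local[1]").appName("pipeline-tests").getOrCreate()


def test_dedupe_latest_keeps_newest_row(spark):
    df = spark.createDataFrame(
        [("a", 1, 10.0), ("a", 2, 20.0), ("b", 1, 5.0)],
        ["id", "ts", "value"],
    )
    result = dedupe_latest(df, "id", "ts").orderBy("id").collect()
    assert [(r.id, r.value) for r in result] == [("a", 20.0), ("b", 5.0)]
```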

Posted 4 weeks ago

Apply

6.0 - 8.0 years

8 - 10 Lacs

Bengaluru

Work from Office

About the Role
Love deep data? Does innovative thinking describe you? Then you may be our next Lead Data Scientist. In this role you'll be the Dumbledore to our team of wizards, our junior data scientists. You will be responsible for transforming scattered pieces of information into valuable data that can be used to achieve goals effectively. You will extract and mine critical bits of information and drive insightful discussions that result in app innovations.

What you will do:
- Own and deliver solutions across multiple charters by formulating well-scoped problem statements and driving them to execution with measurable impact.
- Mentor a team of data scientists (DS2s and DS3s), helping them with project planning, execution, and on-call issue resolution.
- Design and optimize key user journeys (e.g., Reseller Experience, Search, Fraud Systems) by identifying user intents and behavioral patterns from large-scale data.
- Collaborate with machine learning engineers and big data teams to build scalable ML pipelines and improve inference performance.
- Continuously track and improve model performance using state-of-the-art (SOTA) techniques and libraries.
- Lead experimental design for usability improvements and user growth, leveraging statistical rigor.
- Contribute to system-level thinking by enhancing internal tools, frameworks, and libraries to improve team efficiency and code quality.
- Partner with engineering to ensure data reliability, compliance with security/PII guidelines, and integration of models into production systems.
- Proactively explore new areas of opportunity through research, data mining, and academic collaboration, including publishing and attending top-tier conferences.
- Communicate findings, plans, and results clearly with DS, product, and tech stakeholders, and create technical documentation consumable by both DS and engineering teams.
- Conduct research collaborations with premier colleges and universities.
- Attend conferences and publish research papers.

What you will need:
- A Bachelor's degree in Computer Science, Data Science, or a related field; a Master's is a plus.
- 6-8 years of experience in data science with a strong track record of building and deploying ML solutions at scale.
- Deep understanding of core ML techniques (supervised, unsupervised, and semi-supervised learning) along with strong foundations in statistics and linear algebra.
- Exposure to deep learning concepts and architectures (e.g., CNNs, RNNs, Transformers) and their practical applications.
- Proficiency in Python and SQL, with experience in building data pipelines and analytical workflows.
- Hands-on experience with large-scale data processing using Apache Spark, Hadoop/Hive, or cloud platforms such as GCP.
- Strong programming fundamentals and experience writing clean, maintainable, and production-ready code.
- Excellent analytical and problem-solving skills, with the ability to extract actionable insights from messy, high-volume data.
- Solid grasp of statistical testing, hypothesis validation, and common pitfalls in experimental design.
- Experience designing and interpreting A/B tests, including uplift measurement and segmentation.
- Ability to work closely with product and engineering teams to translate business goals into scalable ML or data solutions.

Bonus points for:
- Experience with reinforcement learning or sequence modeling techniques.
- Contributions to ML libraries, internal tools, or research publications.
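As an illustration of the A/B-test analysis this role calls for, a minimal sketch of a two-proportion z-test on conversion counts using statsmodels; the counts are made up and the significance threshold is an assumption.

```python
from statsmodels.stats.proportion import proportions_ztest

conversions = [1320, 1410]   # control, treatment (illustrative numbers)
exposures = [24000, 23800]

z_stat, p_value = proportions_ztest(count=conversions, nobs=exposures)
uplift = conversions[1] / exposures[1] - conversions[0] / exposures[0]

print(f"absolute uplift = {uplift:.4%}, z = {z_stat:.2f}, p = {p_value:.4f}")
# Declare a winner only if p is below the pre-registered significance level
# (e.g. 0.05) agreed before the experiment started.
```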

Posted 4 weeks ago

Apply

5.0 - 8.0 years

22 - 32 Lacs

Bengaluru

Work from Office

Work with the team to define high-level technical requirements and architecture for back-end services, data components, and data monetization components. Develop new application features and enhance existing ones. Develop relevant documentation and diagrams.

Required candidate profile: Minimum 5+ years of experience in Python development with a focus on data-intensive applications; experience with Apache Spark and PySpark for large-scale data processing; understanding of SQL and experience working with relational databases.

Posted 4 weeks ago

Apply

3.0 - 5.0 years

10 - 14 Lacs

Bengaluru

Work from Office

Key Responsibilities:
- Design, build, and maintain scalable and robust data pipelines and ETL workflows using GCP services.
- Work extensively with BigQuery, Cloud Storage, Cloud Dataflow, and other GCP components to ingest, process, and transform large datasets.
- Leverage big data frameworks such as Apache Spark and Hadoop to process structured and unstructured data efficiently.
- Develop and optimize SQL queries and Python scripts for data transformation and automation.
- Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions.
- Implement best practices for data quality, monitoring, and alerting for data workflows.
- Ensure compliance with data governance policies, including data privacy, security, and regulatory standards.
- Continuously improve system performance and reliability by identifying and resolving bottlenecks or inefficiencies.
- Participate in code reviews, architecture discussions, and technical planning.

Required Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related technical field.
- 3+ years of hands-on experience in data engineering, with a focus on large-scale data processing and cloud technologies.
- Strong expertise in Google Cloud Platform (GCP), particularly BigQuery, Cloud Composer, Cloud Storage, Dataflow, and Pub/Sub.
- Solid knowledge of SQL, Python, and scripting for automation and data manipulation.
- Practical experience with Apache Spark, Hadoop, or similar distributed computing frameworks.
- Familiarity with data modeling, warehousing concepts, and data pipeline orchestration.
- Understanding of data privacy, security, and governance in cloud environments.
- Excellent problem-solving skills and ability to work in a collaborative team environment.

Preferred Skills (Good to Have):
- GCP Certification (e.g., Professional Data Engineer)
- Experience with CI/CD pipelines and infrastructure-as-code tools (e.g., Terraform)
- Exposure to Airflow or Cloud Composer for orchestration
- Experience working in Agile development environments
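A minimal sketch of the kind of Dataflow ingestion pipeline this listing describes, using the Apache Beam Python SDK; the project, bucket, table, and field names are hypothetical.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",          # swap to "DirectRunner" for local testing
    project="my-gcp-project",
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadRaw" >> beam.io.ReadFromText("gs://my-bucket/raw/events-*.json")
        | "Parse" >> beam.Map(json.loads)
        | "KeepValid" >> beam.Filter(lambda e: e.get("user_id") is not None)
        | "WriteToBQ" >> beam.io.WriteToBigQuery(
            "my-gcp-project:analytics.events",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
        )
    )
```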

Posted 1 month ago

Apply

8.0 - 10.0 years

5 - 6 Lacs

Navi Mumbai, SBI Belapur

Work from Office

ISA Non-captive, RTH-Y
Notes:
1. This position requires the candidate to work from the client's office starting from day one.
2. Ensure that you perform basic validation and gauge the interest level of the candidate before uploading their profile to our system.
3. The candidate's band will be counted as per their relevant experience; profiles with less experience will not be entertained for a higher band.
4. Full BGV of the candidate is required before onboarding.
5. If required, the candidate will be regularized after 6 months; hence a 6-month NOC is required from the date of joining.
Mode of Interview: Face to Face (mandatory).

JOB DESCRIPTION
Total Years of Experience: 8-10 years
Relevant Years of Experience: 8-10 years
Mandatory Skills: Kafka Platform SME/Admin
Detailed JD: We are seeking a senior developer with strong Kafka knowledge and Apache Spark experience to join our data engineering team. The ideal candidate will have hands-on experience developing, managing, and optimizing big data pipelines on Cloudera platforms using Apache Spark, and will be responsible for building large-scale, distributed data processing solutions that ensure high-performance data processing and analytics capabilities across the organization. Key skills include message serialization (Avro, JSON), the Kafka Streams API for stream processing, and Kafka Connect for system integration.
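A minimal sketch of the Kafka-to-Spark path this listing touches on: Spark Structured Streaming reading a topic and deserializing JSON messages. The broker, topic, schema, and paths are hypothetical, and the Kafka source requires the spark-sql-kafka connector package on the cluster.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType, StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

schema = StructType([
    StructField("txn_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")
    .option("subscribe", "transactions")
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers bytes; cast to string and parse the JSON payload.
parsed = raw.select(F.from_json(F.col("value").cast("string"), schema).alias("msg")).select("msg.*")

query = (
    parsed.writeStream.format("parquet")
    .option("path", "/data/landing/transactions")
    .option("checkpointLocation", "/data/checkpoints/transactions")
    .start()
)
query.awaitTermination()
```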

Posted 1 month ago

Apply

6.0 - 11.0 years

15 - 30 Lacs

Bengaluru

Work from Office

We are seeking an experienced Databricks Developer / Data Engineer to design, develop, and optimize data pipelines, ETL workflows, and big data solutions using Databricks. The ideal candidate should have expertise in Apache Spark, PySpark, SQL, and cloud-based data platforms (Azure, AWS, GCP). This role involves working with large-scale datasets, data lakes, and data warehouses to drive business intelligence and analytics.

Key Responsibilities:
- Design, build, and optimize ETL and ELT pipelines using Databricks and Apache Spark.
- Work with big data processing frameworks (PySpark, Scala, SQL) for data transformation and analytics.
- Implement Delta Lake architecture for data reliability, ACID transactions, and schema evolution.
- Integrate Databricks with cloud services like Azure Data Lake, AWS S3, GCP BigQuery, and Snowflake.
- Develop and maintain data models, data lakes, and data warehouse solutions.
- Optimize Spark performance tuning, job scheduling, and cluster configurations.
- Work with Azure Synapse, AWS Glue, or GCP Dataflow to enable seamless data integration.
- Implement CI/CD automation for data pipelines using Azure DevOps, GitHub Actions, or Jenkins.
- Perform data quality checks, validation, and governance using Databricks Unity Catalog.
- Collaborate with data scientists, analysts, and business teams to support analytics and AI/ML models.

Required Skills & Qualifications:
- 6+ years of experience in data engineering and big data technologies.
- Strong expertise in Databricks, Apache Spark, and PySpark/Scala.
- Hands-on experience with SQL, NoSQL, and structured/unstructured data processing.
- Experience with cloud platforms (Azure, AWS, GCP) and their data services.
- Proficiency in Python, SQL, and Spark optimizations.
- Experience with Delta Lake, Lakehouse architecture, and metadata management.
- Strong understanding of ETL/ELT processes, data lakes, and warehousing concepts.
- Experience with streaming data processing (Kafka, Event Hubs, Kinesis, etc.).
- Knowledge of security best practices, role-based access control (RBAC), and compliance.
- Experience in Agile methodologies and working in cross-functional teams.

Preferred Qualifications:
- Databricks certifications (Databricks Certified Data Engineer Associate/Professional).
- Experience with Machine Learning and AI/ML pipelines on Databricks.
- Hands-on experience with Terraform, CloudFormation, or Infrastructure as Code (IaC).
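A minimal sketch of the Delta Lake upsert pattern implied by the responsibilities above (ACID merges on a lakehouse table), using the delta-spark API; table paths and the join key are hypothetical, and the cluster must have Delta Lake configured.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta-upsert").getOrCreate()

updates = spark.read.parquet("/mnt/landing/customers/")        # incoming batch
target = DeltaTable.forPath(spark, "/mnt/curated/customers")   # existing Delta table

(
    target.alias("t")
    .merge(updates.alias("u"), "t.customer_id = u.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
# ACID guarantees come from the Delta transaction log; schema evolution during
# merges can be allowed via spark.databricks.delta.schema.autoMerge.enabled.
```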

Posted 1 month ago

Apply

8.0 - 13.0 years

5 - 14 Lacs

Pune

Work from Office

Role & Responsibilities
Mandatory skills: API, Java, Databricks, and AWS

Detailed JD (roles and responsibilities), technical:
- Two or more years of API development experience (specifically REST APIs using Java, Spring Boot, Hibernate).
- Two or more years of data engineering experience with the respective tools and technologies (e.g., Apache Spark, Databricks, SQL databases, NoSQL databases, Data Lake concepts).
- Working knowledge of test-driven development.
- Working knowledge of DevOps and lean development principles such as Continuous Integration and Continuous Delivery/Deployment, using tools like Git.
- Working knowledge of ETL, data modeling, data warehousing, and working with large-scale datasets.
- Working knowledge of AWS services such as Lambda, RDS, ECS, DynamoDB, API Gateway, S3, etc.

Good to have:
- AWS Developer certification or working experience in AWS or other cloud technologies.
- Passionate, creative, and eager to learn new, complex technical areas.
- Accountable, curious, and collaborative, with an intense focus on product quality.
- Skilled in interpersonal communication and able to explain complex topics to non-technical audiences.
- Experience working in an agile team environment.

Interested candidates can share their updated resume at recruiter.wtr26@walkingtree.in
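Purely as an illustration of the AWS serverless services listed above (the posting's API work is Java/Spring Boot; this Python sketch only shows the Lambda/DynamoDB side), a minimal handler persisting an API Gateway payload; the table and field names are hypothetical.

```python
import json

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("orders")  # hypothetical DynamoDB table


def handler(event, context):
    """Lambda handler for an API Gateway proxy integration."""
    body = json.loads(event.get("body", "{}"))
    table.put_item(Item={"order_id": body["order_id"], "status": body.get("status", "NEW")})
    return {"statusCode": 201, "body": json.dumps({"order_id": body["order_id"]})}
```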

Posted 1 month ago

Apply

10.0 - 15.0 years

96 - 108 Lacs

Bengaluru

Work from Office

Responsibilities:
- Design data solutions using Java, Python, and Apache Spark.
- Collaborate with cross-functional teams on Azure cloud projects.
- Ensure data security through Redis caching and HDFS storage.

Posted 1 month ago

Apply

5.0 - 10.0 years

18 - 30 Lacs

Pune

Hybrid

Title: Big Data (Apache Spark)
Experience: 4+ years
Location: Pune (Hybrid)

JD:
- True hands-on developer in programming languages like Java or Scala.
- Expertise in Apache Spark.
- Database modelling and working with SQL or NoSQL databases is a must.
- Working knowledge of scripting languages like shell/Python.
- Experience working with Cloudera is preferred.
- Orchestration tools like Airflow or Oozie would be a value addition.
- Knowledge of table formats like Delta or Iceberg is a plus.
- Working experience with version control (Git) and build tools like Maven is recommended.
- Software development experience alongside data engineering experience is good to have.

Qualifications:
- A bachelor's or master's degree in computer science, engineering, or a related discipline.
- Ability to communicate effectively across multiple audiences, including firm-wide business units, senior leaders, associates, and clients.
- Exceptional interpersonal skills, including teamwork, facilitation, and negotiation.
- Strong planning and organizational skills.
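A minimal sketch of the Airflow orchestration mentioned above as a value addition: a daily DAG submitting a Spark job via spark-submit. The DAG id, schedule, jar path, and main class are hypothetical.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_spark_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 2 * * *",   # run at 02:00 every day
    catchup=False,
) as dag:
    run_etl = BashOperator(
        task_id="spark_submit_etl",
        bash_command=(
            "spark-submit --master yarn --deploy-mode cluster "
            "--class com.example.DailyEtl /opt/jobs/daily-etl.jar {{ ds }}"
        ),
    )
```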

Posted 1 month ago

Apply

4.0 - 9.0 years

10 - 20 Lacs

Bengaluru

Remote

Job Title: Software Engineer - GCP Data Engineering
Work Mode: Remote
Base Location: Bengaluru
Experience Required: 4 to 6 years

Job Summary: We are seeking a Software Engineer with a strong background in GCP Data Engineering and a solid understanding of how to build scalable data processing frameworks. The ideal candidate will be proficient in data ingestion, transformation, and orchestration using modern cloud-native tools and technologies. This role requires hands-on experience in designing and optimizing ETL pipelines, managing big data workloads, and supporting data quality initiatives.

Key Responsibilities:
- Design and develop scalable data processing solutions using Apache Beam, Spark, and other modern frameworks.
- Build and manage data pipelines on Google Cloud Platform (GCP) using services like Dataflow, Dataproc, Composer (Airflow), and BigQuery.
- Collaborate with data architects and analysts to understand data models and implement efficient ETL solutions.
- Leverage DevOps and CI/CD best practices for code management, testing, and deployment using tools like GitHub and Cloud Build.
- Ensure data quality, performance tuning, and reliability of data processing systems.
- Work with cross-functional teams to understand business requirements and deliver robust data infrastructure to support analytical use cases.

Required Skills:
- 4 to 6 years of professional experience as a Data Engineer working on cloud platforms, preferably GCP.
- Proficiency in Java and Python with strong problem-solving and analytical skills.
- Hands-on experience with Apache Beam, Apache Spark, Dataflow, Dataproc, Composer (Airflow), and BigQuery.
- Strong understanding of data warehousing concepts and ETL pipeline optimization techniques.
- Experience in cloud-based architectures and DevOps practices.
- Familiarity with version control (GitHub) and CI/CD pipelines.

Preferred Skills:
- Exposure to modern ETL tools and data integration platforms.
- Experience with data governance, data quality frameworks, and metadata management.
- Familiarity with performance tuning in distributed data processing systems.

Tech Stack:
- Cloud: GCP (Dataflow, BigQuery, Dataproc, Composer)
- Programming: Java, Python
- Frameworks: Apache Beam, Apache Spark
- DevOps: GitHub, CI/CD tools, Composer (Airflow)
- ETL/Data Tools: Data ingestion, transformation, and warehousing on GCP
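A minimal sketch of a BigQuery transformation step of the kind this role covers, using the google-cloud-bigquery client; the project, dataset, and SQL are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")

# Materialize a daily aggregate; CREATE OR REPLACE keeps the step idempotent.
sql = """
    CREATE OR REPLACE TABLE `my-gcp-project.analytics.daily_sessions` AS
    SELECT user_id, DATE(event_ts) AS session_date, COUNT(*) AS events
    FROM `my-gcp-project.raw.events`
    GROUP BY user_id, session_date
"""

client.query(sql).result()  # blocks until the query job finishes
```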

Posted 1 month ago

Apply

7.0 - 12.0 years

25 - 35 Lacs

Hyderabad, Pune, Bengaluru

Hybrid

Must have: Scala, Spark, Azure Databricks, Kubernetes. Note: Quantexa certification is a must.
Good to have: Python, PySpark, Elastic, RESTful APIs

ROLE PURPOSE
The purpose of the Data Engineer is to design, build, and unit test data pipelines and jobs for projects and programmes on the Azure platform. This role is for the Quantexa fraud platform programme; a Quantexa certified engineer is preferred.

KEY ACCOUNTABILITIES
- Analyse business requirements and support and maintain the Quantexa platform.
- Build and deploy new or changed data mappings, sessions, and workflows on the Azure Cloud Platform; the key focus area is the Quantexa platform on Azure.
- Develop both batch (using Azure Databricks) and real-time (Kafka and Kubernetes) pipelines and jobs to extract, transform, and load data to the platform.
- Perform ETL routine performance tuning, troubleshooting, support, and capacity estimation.
- Conduct thorough testing of ETL code changes to ensure quality deliverables.
- Provide day-to-day support and mentoring to end users who are interacting with the data.
- Profile and understand the large amounts of available source data, including structured and semi-structured/web activity data.
- Analyse defects and provide fixes.
- Provide release notes for deployments and support release activities.
- Bring a problem-solving attitude.
- Keep up to date with new skills and develop technology skills in other areas of the platform.

FUNCTIONAL / TECHNICAL SKILLS
Skills and Experience:
- Exposure to fraud, financial crime, customer insights, or compliance-based projects that utilize detection and prediction models.
- Experience with ETL tools like Databricks (Spark) and data projects.
- Experience with Kubernetes to deliver real-time data ingestion and transformation using Scala.
- Scala knowledge is highly desirable; Python knowledge is a plus.
- Strong knowledge of SQL and strong analytical skills.
- Azure DevOps knowledge.
- Experience with a local IDE, design documentation, and unit testing.

Posted 1 month ago

Apply

5.0 - 10.0 years

15 - 27 Lacs

Pune

Work from Office

Hi, Wishes from GSN!
Pleasure connecting with you. We have been in corporate search services, identifying and bringing in stellar, talented professionals for our reputed IT / non-IT clients in India, and have been successfully meeting our clients' needs for the last 20 years. We have been mandated by one of our prestigious MNC clients to identify Scala Developer (Pune) professionals. Kindly find the required details below.

******** Looking for SHORT JOINERs ********

Position: Permanent
Mandatory Skill: Scala Developer
Experience Range: 5+ years
Job Role: Senior Developer / Tech Lead
Location: Pune only
Work Mode: WFO, all 5 days

Job Description:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field.
- Minimum 5 years of professional experience in Data Engineering, with a strong focus on big data technologies.
- Proficiency in Scala for developing big data applications and transformations, especially with Apache Spark.
- Expert-level proficiency in SQL; ability to write complex queries, optimize performance, and understand database internals.
- Extensive hands-on experience with Apache Spark (Spark SQL, DataFrames, RDDs) for large-scale data processing and analytics.

******** Looking for SHORT JOINERs ********

Kindly apply ONLINE for an IMMEDIATE response.

Thanks & Regards,
KAVIYA
GSN HR Pvt Ltd
Mob: 9150016092
Email: Shobana@gsnhr.net
Web: www.gsnhr.net
Google review: https://g.co/kgs/UAsF9W

Posted 1 month ago

Apply

8.0 - 10.0 years

20 - 25 Lacs

Hyderabad, Pune, Chennai

Hybrid

Please note: notice period should be 0-15 days.

Primary Responsibilities:
- Create and maintain data storage solutions including Azure SQL Database, Azure Data Lake, and Azure Blob Storage.
- Design, implement, and maintain data pipelines for data ingestion, processing, and transformation in Azure.
- Create data models for analytics purposes.
- Create and maintain ETL (Extract, Transform, Load) operations using Azure Data Factory or comparable technologies.
- Use Azure Data Factory and Databricks to assemble large, complex data sets.
- Implement data validation and cleansing procedures to ensure the quality, integrity, and dependability of the data.
- Ensure data security and compliance.
- Collaborate with data engineers and other stakeholders to understand requirements and translate them into scalable and reliable data platform architectures.

Required skills:
- A blend of technical expertise, analytical problem-solving, and collaboration with cross-functional teams
- Azure DevOps
- Apache Spark, Python
- SQL proficiency
- Azure Databricks knowledge
- Big data technologies
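A minimal sketch of the validation and cleansing step mentioned above, in PySpark; the table paths, column names, and rejection threshold are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("quality-checks").getOrCreate()

df = spark.read.format("delta").load("/mnt/datalake/bronze/customers")

# Cleansing: deduplicate, drop rows without a key, and normalize a text column.
cleaned = (
    df.dropDuplicates(["customer_id"])
      .filter(F.col("customer_id").isNotNull())
      .withColumn("email", F.lower(F.trim(F.col("email"))))
)

# Simple quality gate: fail the pipeline if too many rows were rejected.
rejected_ratio = 1 - cleaned.count() / max(df.count(), 1)
if rejected_ratio > 0.05:
    raise ValueError(f"Rejected {rejected_ratio:.1%} of rows; threshold is 5%")

cleaned.write.format("delta").mode("overwrite").save("/mnt/datalake/silver/customers")
```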

Posted 1 month ago

Apply

4.0 - 9.0 years

6 - 11 Lacs

Mumbai

Work from Office

Role: Senior Databricks Engineer

As a Mid Databricks Engineer, you will play a pivotal role in designing, implementing, and optimizing data processing pipelines and analytics solutions on the Databricks platform. You will collaborate closely with cross-functional teams to understand business requirements, architect scalable solutions, and ensure the reliability and performance of our data infrastructure. This role requires deep expertise in Databricks, strong programming skills, and a passion for solving complex engineering challenges.

What you'll do:
- Design and develop data processing pipelines and analytics solutions using Databricks.
- Architect scalable and efficient data models and storage solutions on the Databricks platform.
- Collaborate with architects and other teams to migrate the current solution to use Databricks.
- Optimize performance and reliability of Databricks clusters and jobs to meet SLAs and business requirements.
- Use best practices for data governance, security, and compliance on the Databricks platform.
- Mentor junior engineers and provide technical guidance.
- Stay current with emerging technologies and trends in data engineering and analytics to drive continuous improvement.

You'll be expected to have:
- Bachelor's or master's degree in computer science, engineering, or a related field.
- 5 to 8 years of overall experience and 2+ years of experience designing and implementing data solutions on the Databricks platform.
- Proficiency in programming languages such as Python, Scala, or SQL.
- Strong understanding of distributed computing principles and experience with big data technologies such as Apache Spark.
- Experience with cloud platforms such as AWS, Azure, or GCP, and their associated data services.
- Proven track record of delivering scalable and reliable data solutions in a fast-paced environment.
- Excellent problem-solving skills and attention to detail.
- Strong communication and collaboration skills with the ability to work effectively in cross-functional teams.
- Good to have: experience with containerization technologies such as Docker and Kubernetes.
- Knowledge of DevOps practices for automated deployment and monitoring of data pipelines.
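A minimal sketch of common Spark tuning moves implied by the cluster/job optimization duties above (broadcast joins, shuffle-partition sizing, selective caching); table paths and names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning-demo").getOrCreate()
spark.conf.set("spark.sql.shuffle.partitions", "200")  # right-size shuffles for the workload

facts = spark.read.parquet("/mnt/curated/transactions")   # large fact table
dims = spark.read.parquet("/mnt/curated/merchants")       # small dimension table

# Broadcast the small side to avoid a shuffle-heavy sort-merge join.
joined = facts.join(broadcast(dims), "merchant_id")

# Cache only when the result is reused by several downstream actions.
joined.cache()
summary = joined.groupBy("merchant_category").count()
summary.write.mode("overwrite").parquet("/mnt/gold/merchant_counts")
```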

Posted 1 month ago

Apply

5.0 - 8.0 years

32 - 35 Lacs

Bengaluru

Remote

We are seeking an MLOps Engineer to design, implement, and manage scalable machine learning infrastructure and automation pipelines. The ideal candidate will have deep hands-on expertise in Azure, AKS, Infrastructure as Code, and CI/CD, with a passion for enabling efficient and reliable deployment of machine learning models in production environments.

Responsibilities:
- Architect & Deploy: Design and manage scalable ML infrastructure on Azure (AKS), leveraging Infrastructure as Code principles.
- Automate & Accelerate: Build and optimize CI/CD pipelines with GitHub Actions for seamless software, data, and model delivery.
- Engineer Performance: Develop efficient and reliable data pipelines using Python and distributed computing frameworks.
- Ensure Reliability: Implement solutions for deploying and maintaining ML models in production.
- Collaborate & Innovate: Partner with data scientists and engineers to continuously enhance existing MLOps capabilities.

Expertise:
- Azure & AKS: Deep hands-on experience.
- IaC & CI/CD: Mastery of Terraform/Bicep and GitHub Actions.
- Data Engineering: Advanced Python and Spark for complex pipelines.
- ML Operations: Proven ability in model serving and monitoring.
- Problem Solver: Adept at navigating complex technical challenges and delivering solutions.

Nice to Have:
- Experience with Kubeflow, MLflow, or similar MLOps tools.
- Exposure to other cloud platforms (AWS, GCP) in addition to Azure.
- Familiarity with security, compliance, and cost optimization for ML workloads.
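A minimal sketch of the model-tracking piece that sits behind the serving and monitoring duties above, using MLflow (listed as nice-to-have); the experiment name, model, and metric are illustrative, not this employer's stack.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=8, random_state=42)

mlflow.set_experiment("churn-model")  # hypothetical experiment name
with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, "model")

# Later, the logged model can be pulled behind a serving endpoint, e.g.:
# loaded = mlflow.pyfunc.load_model("runs:/<run_id>/model")
```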

Posted 1 month ago

Apply

4.0 - 9.0 years

6 - 11 Lacs

Chennai

Work from Office

As a Mid Databricks Engineer, you will play a pivotal role in designing, implementing, and optimizing data processing pipelines and analytics solutions on the Databricks platform. You will collaborate closely with cross-functional teams to understand business requirements, architect scalable solutions, and ensure the reliability and performance of our data infrastructure. This role requires deep expertise in Databricks, strong programming skills, and a passion for solving complex engineering challenges.

What you'll do:
- Design and develop data processing pipelines and analytics solutions using Databricks.
- Architect scalable and efficient data models and storage solutions on the Databricks platform.
- Collaborate with architects and other teams to migrate the current solution to use Databricks.
- Optimize performance and reliability of Databricks clusters and jobs to meet SLAs and business requirements.
- Use best practices for data governance, security, and compliance on the Databricks platform.
- Mentor junior engineers and provide technical guidance.
- Stay current with emerging technologies and trends in data engineering and analytics to drive continuous improvement.

You'll be expected to have:
- Bachelor's or master's degree in computer science, engineering, or a related field.
- 5 to 8 years of overall experience and 2+ years of experience designing and implementing data solutions on the Databricks platform.
- Proficiency in programming languages such as Python, Scala, or SQL.
- Strong understanding of distributed computing principles and experience with big data technologies such as Apache Spark.
- Experience with cloud platforms such as AWS, Azure, or GCP, and their associated data services.
- Proven track record of delivering scalable and reliable data solutions in a fast-paced environment.
- Excellent problem-solving skills and attention to detail.
- Strong communication and collaboration skills with the ability to work effectively in cross-functional teams.
- Good to have: experience with containerization technologies such as Docker and Kubernetes.
- Knowledge of DevOps practices for automated deployment and monitoring of data pipelines.

Posted 1 month ago

Apply

4.0 - 9.0 years

8 - 13 Lacs

Bengaluru

Work from Office

As a Mid Databricks Engineer, you will play a pivotal role in designing, implementing, and optimizing data processing pipelines and analytics solutions on the Databricks platform. You will collaborate closely with cross-functional teams to understand business requirements, architect scalable solutions, and ensure the reliability and performance of our data infrastructure. This role requires deep expertise in Databricks, strong programming skills, and a passion for solving complex engineering challenges.

What you'll do:
- Design and develop data processing pipelines and analytics solutions using Databricks.
- Architect scalable and efficient data models and storage solutions on the Databricks platform.
- Collaborate with architects and other teams to migrate the current solution to use Databricks.
- Optimize performance and reliability of Databricks clusters and jobs to meet SLAs and business requirements.
- Use best practices for data governance, security, and compliance on the Databricks platform.
- Mentor junior engineers and provide technical guidance.
- Stay current with emerging technologies and trends in data engineering and analytics to drive continuous improvement.

You'll be expected to have:
- Bachelor's or master's degree in computer science, engineering, or a related field.
- 5 to 8 years of overall experience and 2+ years of experience designing and implementing data solutions on the Databricks platform.
- Proficiency in programming languages such as Python, Scala, or SQL.
- Strong understanding of distributed computing principles and experience with big data technologies such as Apache Spark.
- Experience with cloud platforms such as AWS, Azure, or GCP, and their associated data services.
- Proven track record of delivering scalable and reliable data solutions in a fast-paced environment.
- Excellent problem-solving skills and attention to detail.
- Strong communication and collaboration skills with the ability to work effectively in cross-functional teams.
- Good to have: experience with containerization technologies such as Docker and Kubernetes.
- Knowledge of DevOps practices for automated deployment and monitoring of data pipelines.

Posted 1 month ago

Apply

4.0 - 9.0 years

8 - 13 Lacs

Kolkata

Work from Office

Role: Senior Databricks Engineer

As a Mid Databricks Engineer, you will play a pivotal role in designing, implementing, and optimizing data processing pipelines and analytics solutions on the Databricks platform. You will collaborate closely with cross-functional teams to understand business requirements, architect scalable solutions, and ensure the reliability and performance of our data infrastructure. This role requires deep expertise in Databricks, strong programming skills, and a passion for solving complex engineering challenges.

What you'll do:
- Design and develop data processing pipelines and analytics solutions using Databricks.
- Architect scalable and efficient data models and storage solutions on the Databricks platform.
- Collaborate with architects and other teams to migrate the current solution to use Databricks.
- Optimize performance and reliability of Databricks clusters and jobs to meet SLAs and business requirements.
- Use best practices for data governance, security, and compliance on the Databricks platform.
- Mentor junior engineers and provide technical guidance.
- Stay current with emerging technologies and trends in data engineering and analytics to drive continuous improvement.

You'll be expected to have:
- Bachelor's or master's degree in computer science, engineering, or a related field.
- 5 to 8 years of overall experience and 2+ years of experience designing and implementing data solutions on the Databricks platform.
- Proficiency in programming languages such as Python, Scala, or SQL.
- Strong understanding of distributed computing principles and experience with big data technologies such as Apache Spark.
- Experience with cloud platforms such as AWS, Azure, or GCP, and their associated data services.
- Proven track record of delivering scalable and reliable data solutions in a fast-paced environment.
- Excellent problem-solving skills and attention to detail.
- Strong communication and collaboration skills with the ability to work effectively in cross-functional teams.
- Good to have: experience with containerization technologies such as Docker and Kubernetes.
- Knowledge of DevOps practices for automated deployment and monitoring of data pipelines.

Posted 1 month ago

Apply

1.0 - 4.0 years

5 - 9 Lacs

Mumbai

Work from Office

The Role:
Company Overview: Kennect is the modern sales compensation platform designed for enterprises. We are leading the way in Sales Performance Management, with everything businesses need to harness the power of sales incentives. Our mission is to deliver customised enterprise-level commission calculation and tracking software for the most innovative businesses around the world.

Key Responsibilities:
- Writing well-designed, testable, and efficient code.
- Building reusable components and libraries for the future.
- Troubleshooting and debugging to optimize performance.
- Providing code documentation and other inputs to technical documents, and participating in code reviews.
- Working on big projects with a full understanding of the system, managing end-to-end system development.

Our Ideal Candidate:
- Strong understanding of system design for atomic features, Node and MongoDB, Docker and CI/CD, AWS and Linux.
- Proven mastery of distributed systems and scalable databases.
- In-depth knowledge of Kubernetes (or anything similar), HTML, JavaScript, Vue.js, and Apache Spark.
- Familiarity with Git and Linux (shell scripting); mechanisms such as auth and encryption.
- Preferred: experience in PHP.

Posted 1 month ago

Apply

5.0 - 10.0 years

20 - 25 Lacs

Bengaluru

Work from Office

The Platform Data Engineer will be responsible for designing and implementing robust data platform architectures, integrating diverse data technologies, and ensuring scalability, reliability, performance, and security across the platform. The role involves setting up and managing infrastructure for data pipelines, storage, and processing, developing internal tools to enhance platform usability, implementing monitoring and observability, collaborating with software engineering teams for seamless integration, and driving capacity planning and cost optimization initiatives.

Posted 1 month ago

Apply

5.0 - 7.0 years

11 - 15 Lacs

Coimbatore

Work from Office

Mandatory skills: Apache Spark, Hive, Hadoop, Scala, Databricks

The Role:
- Designing and building optimized data pipelines using cutting-edge technologies in a cloud environment to drive analytical insights.
- Constructing infrastructure for efficient ETL processes from various sources and storage systems.
- Leading the implementation of algorithms and prototypes to transform raw data into useful information.
- Architecting, designing, and maintaining database pipeline architectures, ensuring readiness for AI/ML transformations.
- Creating innovative data validation methods and data analysis tools.
- Ensuring compliance with data governance and security policies.
- Interpreting data trends and patterns to establish operational alerts.
- Developing analytical tools, programs, and reporting mechanisms.
- Conducting complex data analysis and presenting results effectively.
- Preparing data for prescriptive and predictive modeling.
- Continuously exploring opportunities to enhance data quality and reliability.
- Applying strong programming and problem-solving skills to develop scalable solutions.

Requirements:
- Experience with Big Data technologies (Hadoop, Spark, NiFi, Impala).
- 5+ years of hands-on experience designing, building, deploying, testing, maintaining, monitoring, and owning scalable, resilient, and distributed data pipelines.
- High proficiency in Scala/Java and Spark for large-scale data processing.
- Expertise with big data technologies, including Spark, Data Lake, and Hive.

Posted 1 month ago

Apply

2.0 - 4.0 years

4 - 8 Lacs

Bengaluru

Hybrid

About the Role
Love deep data? Love discussing solutions instead of problems? Then you could be our next Data Scientist. In a nutshell, your primary responsibility will be enhancing the productivity and utilization of the generated data. Other things you will do are:
- Work closely with the business stakeholders.
- Transform scattered pieces of information into valuable data.
- Share and present your valuable insights with peers.

What You Will Do:
- Develop models and run experiments to infer insights from hard data.
- Improve our product usability and identify new growth opportunities.
- Understand reseller preferences to provide them with the most relevant products.
- Design discount programs to help our resellers sell more.
- Help resellers better recognize end-customer preferences to improve their revenue.
- Use data to identify bottlenecks that will help our suppliers meet their SLA requirements.
- Model seasonal demand to predict key organizational metrics.
- Mentor junior data scientists in the team.

What You Will Need:
- Bachelor's/Master's degree in computer science (or a similar degree).
- 2-4 years of experience as a Data Scientist in a fast-paced organization, preferably B2C.
- Familiarity with Neural Networks, Machine Learning, etc.
- Familiarity with tools like SQL, R, Python, etc.
- Strong understanding of Statistics and Linear Algebra.
- Strong understanding of hypothesis/model testing and the ability to identify common model testing errors.
- Experience designing and running A/B tests and drawing insights from them.
- Proficiency in machine learning algorithms.
- Excellent analytical skills to fetch data from reliable sources and generate accurate insights.
- Experience in tech and product teams is a plus.

Bonus points for:
- Experience working on personalization or other ML problems.
- Familiarity with Big Data tech stacks like Apache Spark, Hadoop, Redshift, etc.
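As an illustration of the seasonal-demand modelling mentioned above, a minimal sketch using statsmodels on a synthetic weekly-seasonal series; in practice the series would come from SQL rather than being generated.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic daily demand with a weekly cycle, standing in for real order data.
rng = pd.date_range("2024-01-01", periods=120, freq="D")
demand = 100 + 10 * np.sin(2 * np.pi * np.arange(120) / 7) + np.random.normal(0, 2, 120)
series = pd.Series(demand, index=rng)

result = seasonal_decompose(series, model="additive", period=7)
print(result.seasonal.head(7))      # the recurring weekly pattern
print(result.trend.dropna().tail()) # the underlying trend component
```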

Posted 1 month ago

Apply

5.0 - 7.0 years

15 - 18 Lacs

Bengaluru

Work from Office

We are seeking a highly skilled GCP Data Engineer with experience in designing and developing data ingestion frameworks, real-time processing solutions, and data transformation frameworks using open-source tools. The role involves operationalizing open-source data-analytic tools for enterprise use, ensuring adherence to data governance policies, and performing root-cause analysis on data-related issues. The ideal candidate should have a strong understanding of cloud platforms, especially GCP, with hands-on expertise in tools such as Kafka, Apache Spark, Python, Hadoop, and Hive. Experience with data governance and DevOps practices, along with GCP certifications, is preferred.

Posted 1 month ago

Apply

5.0 - 9.0 years

0 - 0 Lacs

Mumbai, Pune, Bengaluru

Hybrid

Data Engineer
Experience: 5 to 10 years
Location: Pune (Yeravda), hybrid
Primary skills: Scala coding, Spark SQL

Key Responsibilities:
- Design and implement high-performance data pipelines using Apache Spark and Scala.
- Optimize Spark jobs for efficiency and scalability.
- Collaborate with diverse data sources and teams to deliver valuable insights.
- Monitor and troubleshoot production pipelines to ensure smooth operations.
- Maintain thorough documentation for all systems and code.

Required Skills & Qualifications:
- Minimum of 3 years of hands-on experience with Apache Spark and Scala.
- Strong grasp of distributed computing principles and Spark internals.
- Proficiency in working with big data technologies like HDFS, Hive, Kafka, and HBase.
- Ability to write optimized Spark jobs using Scala effectively.

Posted 1 month ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.
