6.0 - 7.0 years
5 - 9 Lacs
Navi Mumbai
Work from Office
Collaborating closely with diverse teams, you'll play an important role in deciding the most suitable data management systems and identifying the crucial data required for insightful analysis. As a Data Engineer, you'll tackle obstacles related to database integration and untangle complex, unstructured data sets.

In this role, your responsibilities may include:
- Implementing and validating predictive models, and creating and maintaining statistical models with a focus on big data, incorporating a variety of statistical and machine learning techniques
- Designing and implementing enterprise search applications such as Elasticsearch and Splunk to client requirements
- Working in an Agile, collaborative environment, partnering with scientists, engineers, consultants, and database administrators of all backgrounds and disciplines to bring analytical rigor and statistical methods to the challenges of predicting behaviours
- Building teams or writing programs to cleanse and integrate data in an efficient and reusable manner, developing predictive or prescriptive models, and evaluating modeling results

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Total experience 6-7 years (relevant 4-5 years)
- Mandatory skills: Azure Databricks, Python/PySpark, SQL, GitHub, Azure DevOps, Azure Blob
- Ability to use programming languages such as Java, Python, or Scala to build pipelines that extract and transform data from a repository to a data consumer
- Ability to use Extract, Transform, Load (ETL) tools and/or data integration or federation tools to prepare and transform data as needed
- Ability to use leading-edge tools such as Linux, SQL, Python, Spark, Hadoop, and Java

Preferred technical and professional experience:
- You thrive on teamwork and have excellent verbal and written communication skills
- Ability to communicate with internal and external clients to understand and define business needs, providing analytical solutions
- Ability to communicate results to technical and non-technical audiences
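For a sense of the day-to-day work, here is a minimal PySpark extract-transform-load sketch of the kind this role describes: read CSV from Azure Blob Storage, cleanse and derive columns, and write curated Parquet. The storage account, container, and column names are illustrative placeholders, not details from the posting.

```python
# Minimal PySpark ETL sketch: extract CSV from Azure Blob Storage, transform, load Parquet.
# Container, storage account, and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("blob-etl").getOrCreate()

# Extract: read raw CSV from a Blob container via the wasbs:// connector
raw = (spark.read
       .option("header", True)
       .option("inferSchema", True)
       .csv("wasbs://raw@examplestorage.blob.core.windows.net/orders/"))

# Transform: cleanse nulls and derive columns
clean = (raw.dropna(subset=["order_id"])
         .withColumn("order_date", F.to_date("order_date"))
         .withColumn("total", F.col("quantity") * F.col("unit_price")))

# Load: write partitioned Parquet for downstream consumers
(clean.write.mode("overwrite")
 .partitionBy("order_date")
 .parquet("wasbs://curated@examplestorage.blob.core.windows.net/orders/"))
```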
Posted 1 week ago
4.0 - 7.0 years
14 - 17 Lacs
Bengaluru
Work from Office
A Data Engineer specializing in enterprise data platforms, experienced in building, managing, and optimizing data pipelines for large-scale environments, with expertise in big data technologies, distributed computing, data ingestion, and transformation frameworks. Proficient in Apache Spark, PySpark, Kafka, and Iceberg tables, and able to design and implement scalable, high-performance data processing solutions.

What you'll do as a Data Engineer – Data Platform Services:

Data Ingestion & Processing
- Designing and developing data pipelines to migrate workloads from IIAS to Cloudera Data Lake
- Implementing streaming and batch data ingestion frameworks using Kafka and Apache Spark (PySpark)
- Working with IBM CDC and Universal Data Mover to manage data replication and movement

Big Data & Data Lakehouse Management
- Implementing Apache Iceberg tables for efficient data storage and retrieval
- Managing distributed data processing with Cloudera Data Platform (CDP)
- Ensuring data lineage, cataloging, and governance for compliance with bank/regulatory policies

Optimization & Performance Tuning
- Optimizing Spark and PySpark jobs for performance and scalability
- Implementing data partitioning, indexing, and caching to enhance query performance
- Monitoring and troubleshooting pipeline failures and performance bottlenecks

Security & Compliance
- Ensuring secure data access, encryption, and masking using Thales CipherTrust
- Implementing role-based access controls (RBAC) and data governance policies
- Supporting metadata management and data quality initiatives

Collaboration & Automation
- Working closely with Data Scientists, Analysts, and DevOps teams to integrate data solutions
- Automating data workflows using Airflow and implementing CI/CD pipelines with GitLab and Sonatype Nexus
- Supporting Denodo-based data virtualization for seamless data access

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- 4-7 years of experience in big data engineering, data integration, and distributed computing
- Strong skills in Apache Spark, PySpark, Kafka, SQL, and Cloudera Data Platform (CDP)
- Proficiency in Python or Scala for data processing
- Experience with data pipeline orchestration tools (Apache Airflow, Stonebranch UDM)
- Understanding of data security, encryption, and compliance frameworks

Preferred technical and professional experience:
- Experience in banking or financial services data platforms
- Exposure to Denodo for data virtualization and DGraph for graph-based insights
- Familiarity with cloud data platforms (AWS, Azure, GCP)
- Certifications in Cloudera Data Engineering, IBM Data Engineering, or AWS Data Analytics
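As an illustration of the streaming-ingestion responsibility above, the following is a hedged sketch of a Kafka-to-Iceberg pipeline with Spark Structured Streaming. The broker address, topic, schema, and table names are assumptions, and the job needs the Kafka and Iceberg Spark packages on the classpath.

```python
# Sketch: streaming ingestion from Kafka into an Apache Iceberg table with PySpark.
# Broker, topic, schema, catalog, and table names are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-to-iceberg").getOrCreate()

schema = StructType([
    StructField("txn_id", StringType()),
    StructField("account", StringType()),
    StructField("amount", DoubleType()),
])

# Read a Kafka topic as an unbounded streaming DataFrame
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "transactions")
          .load())

# Parse the JSON payload into typed columns
parsed = (events
          .select(F.from_json(F.col("value").cast("string"), schema).alias("t"))
          .select("t.*"))

# Append into an Iceberg table, with checkpointing for exactly-once progress tracking
(parsed.writeStream
 .format("iceberg")
 .outputMode("append")
 .option("checkpointLocation", "/tmp/checkpoints/transactions")
 .toTable("lakehouse.db.transactions")
 .awaitTermination())
```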
Posted 1 week ago
6.0 - 7.0 years
14 - 18 Lacs
Kochi
Work from Office
As an Associate Software Developer at IBM you will harness the power of data to unveil captivating stories and intricate patterns. You'll contribute to data gathering, storage, and both batch and real-time processing. Collaborating closely with diverse teams, you'll play an important role in deciding the most suitable data management systems and identifying the crucial data required for insightful analysis. As a Data Engineer, you'll tackle obstacles related to database integration and untangle complex, unstructured data sets.

In this role, your responsibilities may include:
- Implementing and validating predictive models, and creating and maintaining statistical models with a focus on big data, incorporating a variety of statistical and machine learning techniques
- Designing and implementing enterprise search applications such as Elasticsearch and Splunk to client requirements
- Working in an Agile, collaborative environment, partnering with scientists, engineers, consultants, and database administrators of all backgrounds and disciplines to bring analytical rigor and statistical methods to the challenges of predicting behaviors
- Building teams or writing programs to cleanse and integrate data in an efficient and reusable manner, developing predictive or prescriptive models, and evaluating modeling results

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Total experience 6-7 years (relevant 4-5 years)
- Mandatory skills: Azure Databricks, Python/PySpark, SQL, GitHub, Azure DevOps, Azure Blob
- Ability to use programming languages such as Java, Python, or Scala to build pipelines that extract and transform data from a repository to a data consumer
- Ability to use Extract, Transform, Load (ETL) tools and/or data integration or federation tools to prepare and transform data as needed
- Ability to use leading-edge tools such as Linux, SQL, Python, Spark, Hadoop, and Java

Preferred technical and professional experience:
- You thrive on teamwork and have excellent verbal and written communication skills
- Ability to communicate with internal and external clients to understand and define business needs, providing analytical solutions
- Ability to communicate results to technical and non-technical audiences
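The predictive-modeling responsibility above can be pictured with a short Spark MLlib sketch: fit a classifier on big data and validate it on a held-out split. The feature table and column names are hypothetical.

```python
# Sketch of fitting and validating a predictive model with Spark MLlib.
# The input table, feature columns, and label are assumed for illustration.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator

spark = SparkSession.builder.appName("churn-model").getOrCreate()
df = spark.table("analytics.customer_features")  # assumed feature table

train, test = df.randomSplit([0.8, 0.2], seed=42)

assembler = VectorAssembler(
    inputCols=["tenure", "monthly_spend", "support_calls"], outputCol="features")
lr = LogisticRegression(featuresCol="features", labelCol="churned")
model = Pipeline(stages=[assembler, lr]).fit(train)

# Validate: area under ROC on the held-out split
auc = BinaryClassificationEvaluator(labelCol="churned").evaluate(model.transform(test))
print(f"AUC = {auc:.3f}")
```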
Posted 1 week ago
5.0 - 10.0 years
14 - 17 Lacs
Pune
Work from Office
As a Big Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in data engineering activities like creating pipelines/workflows from source to target and implementing solutions that tackle the client's needs.

Your primary responsibilities include:
- Design, build, optimize, and support new and existing data models and ETL processes based on our clients' business requirements
- Build, deploy, and manage data infrastructure that can adequately handle the needs of a rapidly growing data-driven organization
- Coordinate data access and security to enable data scientists and analysts to easily access data whenever they need it

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Must have 5+ years of experience in Big Data: Hadoop, Spark, Scala, Python, HBase, Hive
- Good to have: AWS (S3, Athena, DynamoDB, Lambda), Jenkins, Git
- Experience developing Python and PySpark programs for data analysis
- Good working experience using Python to develop custom frameworks for generating rules (like a rules engine)
- Experience developing Python code to gather data from HBase and designing solutions implemented with PySpark
- Experience applying business transformations with Apache Spark DataFrames/RDDs and using Hive context objects to perform read/write operations

Preferred technical and professional experience:
- Understanding of DevOps
- Experience in building scalable end-to-end data ingestion and processing solutions
- Experience with object-oriented and/or functional programming languages, such as Python, Java, and Scala
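The "custom framework for generating rules" requirement suggests a small rule-engine pattern. Here is one possible sketch in PySpark, where data-quality rules are declared as (name, condition) pairs and applied to a Hive-backed table; all table, column, and rule names are invented for illustration.

```python
# Sketch of a rule-engine-style framework in PySpark: rules are declared as
# (name, condition) pairs, rows are flagged with the rules they violate.
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F

spark = SparkSession.builder.enableHiveSupport().appName("rules").getOrCreate()

RULES = [
    ("non_null_id", F.col("customer_id").isNotNull()),
    ("positive_amount", F.col("amount") > 0),
    ("valid_country", F.col("country").isin("IN", "US", "UK")),
]

def apply_rules(df: DataFrame) -> DataFrame:
    """Attach an array of violated rule names to each row."""
    flags = F.array(*[F.when(~cond, F.lit(name)) for name, cond in RULES])
    # Drop the nulls left by rules that passed (Spark 3.1+ higher-order filter)
    return df.withColumn("violations", F.filter(flags, lambda x: x.isNotNull()))

# Hive-backed source and sink, matching the posting's Hadoop/Hive stack
flagged = apply_rules(spark.table("staging.transactions"))
flagged.filter(F.size("violations") == 0) \
       .drop("violations") \
       .write.mode("append").saveAsTable("curated.transactions")
```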
Posted 1 week ago
3.0 - 8.0 years
9 - 13 Lacs
Mumbai
Work from Office
Role Overview:
As a Big Data Engineer, you'll design and build robust data pipelines on Cloudera using Spark (Scala/PySpark) for ingestion, transformation, and processing of high-volume data from banking systems.

Key Responsibilities:
- Build scalable batch and real-time ETL pipelines using Spark and Hive
- Integrate structured and unstructured data sources
- Perform performance tuning and code optimization
- Support orchestration and job scheduling (NiFi, Airflow)

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Experience: 3-15 years
- Proficiency in PySpark/Scala with Hive/Impala
- Experience with data partitioning, bucketing, and optimization
- Familiarity with Kafka, Iceberg, and NiFi is a must
- Knowledge of banking or financial datasets is a plus
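The partitioning and bucketing skills listed above might look like the following in practice; a minimal sketch assuming a Hive-backed Cloudera table with invented names.

```python
# Sketch: writing a high-volume table with partitioning and bucketing to speed up
# downstream scans and joins. Table and column names are assumed for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().appName("optimize-writes").getOrCreate()

txns = spark.table("raw.card_transactions")

# Partition by business date (prunes whole directories for date-bounded queries) and
# bucket by account_id (co-locates rows for joins against account dimensions).
# bucketBy requires saveAsTable, since bucket metadata lives in the metastore.
(txns.write
 .mode("overwrite")
 .partitionBy("txn_date")
 .bucketBy(64, "account_id")
 .sortBy("account_id")
 .saveAsTable("curated.card_transactions"))
```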
Posted 1 week ago
0 years
0 Lacs
Pune, Maharashtra, India
On-site
Date: Jun 7, 2025 | Location: Pune, MH, IN | Company: HMH

HMH is a learning technology company committed to delivering connected solutions that engage learners, empower educators and improve student outcomes. As a leading provider of K–12 core curriculum, supplemental and intervention solutions, and professional learning services, HMH partners with educators and school districts to uncover solutions that unlock students' potential and extend teachers' capabilities. HMH serves more than 50 million students and 4 million educators in 150 countries. HMH Technology India Pvt. Ltd. is our technology and innovation arm in India focused on developing novel products and solutions using cutting-edge technology to better serve our clients globally. HMH aims to help employees grow as people, and not just as professionals. For more information, visit www.hmhco.com

The data architect is responsible for designing, creating, and managing an organization's data architecture. This role is critical in establishing a solid foundation for data management within an organization, ensuring that data is organized, accessible, secure, and aligned with business objectives. The data architect designs data models, warehouses, file systems and databases, and defines how data will be collected and organized.

Responsibilities:
- Interprets and delivers impactful strategic plans improving data integration, data quality, and data delivery in support of business initiatives and roadmaps
- Designs the structure and layout of data systems, including databases, warehouses, and lakes
- Selects and designs database management systems that meet the organization's needs by defining data schemas, optimizing data storage, and establishing data access controls and security measures
- Defines and implements the long-term technology strategy and innovation roadmaps across analytics, data engineering, and data platforms
- Designs ETL processes that move data from various sources into the organization's data systems
- Translates high-level business requirements into data models and appropriate metadata, test data, and data quality standards
- Manages senior business stakeholders to secure strong engagement and ensures that project delivery aligns with longer-term strategic roadmaps
- Simplifies the existing data architecture, delivering reusable services and cost-saving opportunities in line with the policies and standards of the company
- Leads and participates in the peer review and quality assurance of project architectural artifacts across the EA group through governance forums
- Defines and manages standards, guidelines, and processes to ensure data quality
- Works with IT teams, business analysts, and data analytics teams to understand data consumers' needs and develop solutions
- Evaluates and recommends emerging technologies for data management, storage, and analytics
- Designs, creates, and implements logical and physical data models for both IT and business solutions to capture the structure, relationships, and constraints of relevant datasets
- Builds and operationalizes complex data solutions, corrects problems, applies transformations, and recommends data cleansing/quality solutions
- Collaborates and communicates effectively with various stakeholders to understand data and business requirements and translate them into data models
- Creates entity-relationship diagrams (ERDs), data flow diagrams, and other visualizations to represent data models
- Collaborates with database administrators and software engineers to implement and maintain data models in databases, data warehouses, and data lakes
- Develops data modeling best practices, and uses these standards to identify and resolve data modeling issues and conflicts
- Conducts performance tuning and optimization of data models for efficient data access and retrieval
- Incorporates core data management competencies, including data governance, data security and data quality

Education
- A bachelor's degree in computer science, data science, engineering, or a related field

Experience
- At least five years of relevant experience in design and implementation of data models for enterprise data warehouse initiatives
- Experience leading projects involving data warehousing, data modeling, and data analysis
- Design experience in Azure Databricks, PySpark, and Power BI/Tableau

Skills
- Ability in programming languages such as Java, Python, and C/C++
- Ability in data science languages/tools such as SQL, R, SAS, or Excel
- Proficiency in the design and implementation of modern data architectures and concepts such as cloud services (AWS, Azure, GCP), real-time data distribution (Kafka, Dataflow), and modern data warehouse tools (Snowflake, Databricks)
- Experience with database technologies such as SQL, NoSQL, Oracle, Hadoop, or Teradata
- Understanding of entity-relationship modeling, metadata systems, and data quality tools and techniques
- Ability to think strategically and relate architectural decisions and recommendations to business needs and client culture
- Ability to assess traditional and modern data architecture components based on business needs
- Experience with business intelligence tools and technologies such as ETL, Power BI, and Tableau
- Ability to regularly learn and adopt new technology, especially in the ML/AI realm
- Strong analytical and problem-solving skills
- Ability to synthesize and clearly communicate large volumes of complex information to senior management across varying levels of technical understanding
- Ability to collaborate and excel in complex, cross-functional teams involving data scientists, business analysts, and stakeholders
- Ability to guide solution design and architecture to meet business needs
- Expert knowledge of data modeling concepts, methodologies, and best practices
- Proficiency in data modeling tools such as Erwin or ER/Studio
- Knowledge of relational databases and database design principles
- Familiarity with dimensional modeling and data warehousing concepts
- Strong SQL skills for data querying, manipulation, and optimization, plus knowledge of other data science languages, including JavaScript, Python, and R
- Ability to collaborate with cross-functional teams and stakeholders to gather requirements and align on data models
- Excellent analytical and problem-solving skills to identify and resolve data modeling issues
- Strong communication and documentation skills to effectively convey complex data modeling concepts to technical and business stakeholders

HMH Technology Private Limited is an Equal Opportunity Employer and considers applicants for all positions without regard to race, colour, religion or belief, sex, age, national origin, citizenship status, marital status, military/veteran status, genetic information, sexual orientation, gender identity, physical or mental disability or any other characteristic protected by applicable laws. We are committed to creating a dynamic work environment that values diversity and inclusion, respect and integrity, customer focus, and innovation. For more information, visit https://careers.hmhco.com/.
Follow us on Twitter, Facebook, LinkedIn, and YouTube.
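To make the dimensional-modeling work above concrete, here is one possible PySpark sketch that materializes a small star schema (a customer dimension plus an orders fact). The staging tables, keys, and measures are assumptions for illustration, not HMH's actual model.

```python
# Sketch: building a star schema (dimension + fact) with PySpark.
# Source tables, keys, and measures are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("star-schema").getOrCreate()

orders = spark.table("staging.orders")        # assumed raw sources
customers = spark.table("staging.customers")

# Dimension: one row per customer with a surrogate key
dim_customer = (customers
                .select("customer_id", "name", "segment", "country")
                .withColumn("customer_sk", F.monotonically_increasing_id()))

# Fact: additive measures keyed by surrogate key and date
fact_orders = (orders
               .join(dim_customer, "customer_id")
               .select("customer_sk",
                       F.to_date("order_ts").alias("order_date"),
                       "quantity",
                       (F.col("quantity") * F.col("unit_price")).alias("revenue")))

dim_customer.write.mode("overwrite").saveAsTable("warehouse.dim_customer")
(fact_orders.write.mode("overwrite")
 .partitionBy("order_date")
 .saveAsTable("warehouse.fact_orders"))
```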
Posted 1 week ago
5.0 - 7.0 years
12 - 16 Lacs
Bengaluru
Work from Office
As a Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in developing data solutions using the Spark framework with Python or Scala on Hadoop and the AWS Cloud Data Platform.

- Experienced in building data pipelines to ingest, process, and transform data from files, streams, and databases
- Experienced in processing data with Spark, Python, PySpark, Scala, and Hive, HBase, or other NoSQL databases on Cloud Data Platforms (AWS) or HDFS
- Experienced in developing efficient software code for multiple use cases leveraging the Spark framework with Python or Scala and big data technologies
- Experienced in developing streaming pipelines
- Experienced in working with Hadoop/AWS ecosystem components to implement scalable solutions that meet ever-increasing data volumes, using big data and cloud technologies such as Apache Spark and Kafka

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Total 5-7+ years of experience in data management (DW, DL, data platform, lakehouse) and data engineering skills
- Minimum 4+ years of experience in big data technologies, with extensive data engineering experience in Spark with Python or Scala
- Minimum 3 years of experience on cloud data platforms on AWS; exposure to streaming solutions and message brokers like Kafka
- Experience in AWS EMR / AWS Glue / Databricks, AWS Redshift, DynamoDB
- Good to excellent SQL skills

Preferred technical and professional experience:
- Certification in AWS, and Databricks- or Cloudera-certified Spark developers
- AWS S3, Redshift, and EMR for data storage and distributed processing
- AWS Lambda, AWS Step Functions, and AWS Glue to build serverless, event-driven data workflows and orchestrate ETL processes
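A streaming pipeline of the sort mentioned above could look like this hedged sketch: a windowed, watermarked aggregation over a Kafka topic written to S3 as Parquet. Broker, topic, bucket, and column names are placeholders, and the job assumes the Kafka connector package is available.

```python
# Sketch: streaming aggregation on AWS, reading Kafka and writing windowed counts to S3.
# Broker, topic, bucket, and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("stream-agg").getOrCreate()

clicks = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "clickstream")
          .load()
          .selectExpr("CAST(value AS STRING) AS page", "timestamp"))

# Tumbling 5-minute windows; the watermark bounds how late data may arrive
counts = (clicks
          .withWatermark("timestamp", "10 minutes")
          .groupBy(F.window("timestamp", "5 minutes"), "page")
          .count())

(counts.writeStream
 .format("parquet")
 .option("path", "s3a://example-bucket/clickstream-counts/")
 .option("checkpointLocation", "s3a://example-bucket/checkpoints/clickstream/")
 .outputMode("append")
 .start()
 .awaitTermination())
```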
Posted 1 week ago
3.0 - 6.0 years
14 - 18 Lacs
Bengaluru
Work from Office
- Establish and implement best practices for dbt workflows, ensuring efficiency, reliability, and maintainability
- Collaborate with data analysts, engineers, and business teams to align data transformations with business needs
- Monitor and troubleshoot data pipelines to ensure accuracy and performance
- Work with Azure-based cloud technologies to support data storage, transformation, and processing

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Strong MS SQL and Azure Databricks experience
- Implement and manage data models in dbt, with data transformations aligned to business requirements
- Ingest raw, unstructured data into structured datasets in cloud object storage
- Utilize dbt to convert raw, unstructured data into structured datasets, enabling efficient analysis and reporting
- Write and optimize SQL queries within dbt to enhance data transformation processes and improve overall performance

Preferred technical and professional experience:
- Establish dbt best practices to improve performance, scalability, and reliability
- Design, develop, and maintain scalable data models and transformations using dbt in conjunction with Databricks
- Proven interpersonal skills while contributing to team effort by accomplishing related results as required
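For orchestrating dbt runs from Python, dbt-core 1.5+ exposes a programmatic runner; something like the sketch below could run and test a Databricks-backed model and surface failures to a scheduler. The model selector is a hypothetical example.

```python
# Sketch: invoking dbt programmatically (dbt-core 1.5+) to run and test one model
# and its downstream dependents. The selector "stg_orders+" is illustrative.
from dbt.cli.main import dbtRunner, dbtRunnerResult

dbt = dbtRunner()

for args in (["run", "--select", "stg_orders+"],
             ["test", "--select", "stg_orders+"]):
    res: dbtRunnerResult = dbt.invoke(args)
    if not res.success:
        # Raise so an orchestrator (Airflow, ADF, etc.) marks the task failed
        raise RuntimeError(f"dbt {args[0]} failed")
```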
Posted 1 week ago
5.0 - 10.0 years
14 - 17 Lacs
Mumbai
Work from Office
As a Big Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in data engineering activities like creating pipelines/workflows from source to target and implementing solutions that tackle the client's needs.

Your primary responsibilities include:
- Design, build, optimize, and support new and existing data models and ETL processes based on our clients' business requirements
- Build, deploy, and manage data infrastructure that can adequately handle the needs of a rapidly growing data-driven organization
- Coordinate data access and security to enable data scientists and analysts to easily access data whenever they need it

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Must have 5+ years of experience in Big Data: Hadoop, Spark, Scala, Python, HBase, Hive
- Good to have: AWS (S3, Athena, DynamoDB, Lambda), Jenkins, Git
- Experience developing Python and PySpark programs for data analysis
- Good working experience using Python to develop custom frameworks for generating rules (like a rules engine)
- Experience developing Python code to gather data from HBase and designing solutions implemented with PySpark
- Experience applying business transformations with Apache Spark DataFrames/RDDs and using Hive context objects to perform read/write operations

Preferred technical and professional experience:
- Understanding of DevOps
- Experience in building scalable end-to-end data ingestion and processing solutions
- Experience with object-oriented and/or functional programming languages, such as Python, Java, and Scala
Posted 1 week ago
15.0 - 20.0 years
5 - 9 Lacs
Mumbai
Work from Office
Location: Mumbai

Role Overview:
As a Big Data Engineer, you'll design and build robust data pipelines on Cloudera using Spark (Scala/PySpark) for ingestion, transformation, and processing of high-volume data from banking systems.

Key Responsibilities:
- Build scalable batch and real-time ETL pipelines using Spark and Hive
- Integrate structured and unstructured data sources
- Perform performance tuning and code optimization
- Support orchestration and job scheduling (NiFi, Airflow)

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Experience: 3-15 years
- Proficiency in PySpark/Scala with Hive/Impala
- Experience with data partitioning, bucketing, and optimization
- Familiarity with Kafka, Iceberg, and NiFi is a must
- Knowledge of banking or financial datasets is a plus
Posted 1 week ago
2.0 - 5.0 years
14 - 17 Lacs
Pune
Work from Office
As a Data Engineer at IBM, you'll play a vital role in the development and design of applications, providing regular support and guidance to project teams on complex coding, issue resolution, and execution.

Your primary responsibilities include:
- Lead the design and construction of new solutions using the latest technologies, always looking to add business value and meet user requirements
- Strive for continuous improvement by testing the built solution and working under an agile framework
- Discover and implement the latest technology trends to maximize and build creative solutions

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Experience with Apache Spark (PySpark): in-depth knowledge of Spark's architecture, core APIs, and PySpark for distributed data processing
- Big data technologies: familiarity with Hadoop, HDFS, Kafka, and other big data tools
- Data engineering skills: strong understanding of ETL pipelines, data modeling, and data warehousing concepts
- Strong proficiency in Python: expertise in Python programming with a focus on data processing and manipulation
- Data processing frameworks: knowledge of data processing libraries such as Pandas and NumPy
- SQL proficiency: experience writing optimized SQL queries for large-scale data analysis and transformation
- Cloud platforms: experience working with cloud platforms like AWS, Azure, or GCP, including using cloud storage systems

Preferred technical and professional experience:
- Define, drive, and implement an architecture strategy and standards for end-to-end monitoring
- Partner with the rest of the technology teams, including application development, enterprise architecture, testing services, and network engineering
- Good to have: experience with detection and prevention tools for company products and platform, and customer-facing work
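One way the Spark, Pandas, and NumPy skills above combine in practice is a vectorized (Pandas) UDF; a minimal sketch follows, with an assumed source table. Note the function operates per Arrow batch, so the z-score here is batch-local rather than global.

```python
# Sketch: a vectorized (Pandas) UDF combining Spark's distributed execution with
# pandas per-batch processing. Table and column names are illustrative.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.appName("pandas-udf-demo").getOrCreate()

@pandas_udf(DoubleType())
def zscore(v: pd.Series) -> pd.Series:
    # Each invocation receives one Arrow batch as a pandas Series
    return (v - v.mean()) / v.std()

df = spark.table("metrics.readings")  # assumed source table
df.withColumn("value_z", zscore(F.col("value"))).show()
```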
Posted 1 week ago
5.0 - 10.0 years
14 - 18 Lacs
Bengaluru
Work from Office
As a Data Engineer at IBM you will harness the power of data to unveil captivating stories and intricate patterns. You'll contribute to data gathering, storage, and both batch and real-time processing. Collaborating closely with diverse teams, you'll play an important role in deciding the most suitable data management systems and identifying the crucial data required for insightful analysis. As a Data Engineer, you'll tackle obstacles related to database integration and untangle complex, unstructured data sets.

In this role, your responsibilities may include:
- Implementing and validating predictive models, and creating and maintaining statistical models with a focus on big data, incorporating a variety of statistical and machine learning techniques
- Designing and implementing enterprise search applications such as Elasticsearch and Splunk to client requirements
- Working in an Agile, collaborative environment, partnering with scientists, engineers, consultants, and database administrators of all backgrounds and disciplines to bring analytical rigor and statistical methods to the challenges of predicting behaviours
- Building teams or writing programs to cleanse and integrate data in an efficient and reusable manner, developing predictive or prescriptive models, and evaluating modeling results

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- We are seeking a skilled Azure Data Engineer with 5+ years of experience, including 3+ years of hands-on experience with ADF/Databricks
- The ideal candidate has Databricks, Data Lake, and Python programming skills
- Experience deploying to Databricks
- Familiarity with Azure Data Factory

Preferred technical and professional experience:
- Good communication skills
- 3+ years of experience with ADF/Databricks/Data Lake
- Ability to communicate results to technical and non-technical audiences
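A simple illustration of the ADF/Databricks stack above: read raw files from Azure Data Lake Storage and persist them as a Delta table. Storage paths and table names are placeholders, and the snippet assumes a Databricks (or otherwise Delta-enabled) Spark session.

```python
# Sketch: reading raw CSV from Azure Data Lake Storage Gen2 and writing a Delta table.
# Storage account, container, and table names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("adls-to-delta").getOrCreate()

raw = (spark.read
       .option("header", True)
       .csv("abfss://landing@examplelake.dfs.core.windows.net/sales/"))

curated = raw.withColumn("ingested_at", F.current_timestamp())

# Delta adds ACID transactions, time travel, and schema enforcement on the lake
(curated.write
 .format("delta")
 .mode("overwrite")
 .saveAsTable("lakehouse.sales_raw"))
```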
Posted 1 week ago
4.0 - 7.0 years
14 - 17 Lacs
Gurugram
Work from Office
A Data Engineer specializing in enterprise data platforms, experienced in building, managing, and optimizing data pipelines for large-scale environments, with expertise in big data technologies, distributed computing, data ingestion, and transformation frameworks. Proficient in Apache Spark, PySpark, Kafka, and Iceberg tables, and able to design and implement scalable, high-performance data processing solutions.

What you'll do as a Data Engineer – Data Platform Services:

Data Ingestion & Processing
- Designing and developing data pipelines to migrate workloads from IIAS to Cloudera Data Lake
- Implementing streaming and batch data ingestion frameworks using Kafka and Apache Spark (PySpark)
- Working with IBM CDC and Universal Data Mover to manage data replication and movement

Big Data & Data Lakehouse Management
- Implementing Apache Iceberg tables for efficient data storage and retrieval
- Managing distributed data processing with Cloudera Data Platform (CDP)
- Ensuring data lineage, cataloging, and governance for compliance with bank/regulatory policies

Optimization & Performance Tuning
- Optimizing Spark and PySpark jobs for performance and scalability
- Implementing data partitioning, indexing, and caching to enhance query performance
- Monitoring and troubleshooting pipeline failures and performance bottlenecks

Security & Compliance
- Ensuring secure data access, encryption, and masking using Thales CipherTrust
- Implementing role-based access controls (RBAC) and data governance policies
- Supporting metadata management and data quality initiatives

Collaboration & Automation
- Working closely with Data Scientists, Analysts, and DevOps teams to integrate data solutions
- Automating data workflows using Airflow and implementing CI/CD pipelines with GitLab and Sonatype Nexus
- Supporting Denodo-based data virtualization for seamless data access

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- 4-7 years of experience in big data engineering, data integration, and distributed computing
- Strong skills in Apache Spark, PySpark, Kafka, SQL, and Cloudera Data Platform (CDP)
- Proficiency in Python or Scala for data processing
- Experience with data pipeline orchestration tools (Apache Airflow, Stonebranch UDM)
- Understanding of data security, encryption, and compliance frameworks

Preferred technical and professional experience:
- Experience in banking or financial services data platforms
- Exposure to Denodo for data virtualization and DGraph for graph-based insights
- Familiarity with cloud data platforms (AWS, Azure, GCP)
- Certifications in Cloudera Data Engineering, IBM Data Engineering, or AWS Data Analytics
Posted 1 week ago
5.0 - 10.0 years
14 - 17 Lacs
Navi Mumbai
Work from Office
As a Big Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in data engineering activities like creating pipelines/workflows from source to target and implementing solutions that tackle the client's needs.

Your primary responsibilities include:
- Design, build, optimize, and support new and existing data models and ETL processes based on our clients' business requirements
- Build, deploy, and manage data infrastructure that can adequately handle the needs of a rapidly growing data-driven organization
- Coordinate data access and security to enable data scientists and analysts to easily access data whenever they need it

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Must have 5+ years of experience in Big Data: Hadoop, Spark, Scala, Python, HBase, Hive
- Good to have: AWS (S3, Athena, DynamoDB, Lambda), Jenkins, Git
- Experience developing Python and PySpark programs for data analysis
- Good working experience using Python to develop custom frameworks for generating rules (like a rules engine)
- Experience developing Python code to gather data from HBase and designing solutions implemented with PySpark
- Experience applying business transformations with Apache Spark DataFrames/RDDs and using Hive context objects to perform read/write operations

Preferred technical and professional experience:
- Understanding of DevOps
- Experience in building scalable end-to-end data ingestion and processing solutions
- Experience with object-oriented and/or functional programming languages, such as Python, Java, and Scala
Posted 1 week ago
6.0 - 11.0 years
14 - 17 Lacs
Pune
Work from Office
As an Application Developer, you will lead IBM into the future by translating system requirements into the design and development of customized systems in an agile environment. The success of IBM is in your hands as you transform vital business needs into code and drive innovation. Your work will power IBM and its clients globally, collaborating and integrating code into enterprise systems. You will have access to the latest education, tools and technology, and a limitless career path with the world's technology leader. Come to IBM and make a global impact.

Responsibilities:
- Manage end-to-end feature development and resolve challenges faced in implementing it
- Learn new technologies and apply them to feature development within the time frame provided
- Manage debugging, root-cause analysis, and fixing of issues reported on the Content Management back-end software system

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise:
- Overall more than 6 years of experience, with 4+ years of strong hands-on experience in Python and Spark
- Strong technical ability to understand, design, write, and debug applications in Python and PySpark
- Strong problem-solving skills

Preferred technical and professional experience:
- Good to have: hands-on experience with cloud technology (AWS/GCP/Azure)
Posted 1 week ago
1.0 - 3.0 years
3 - 7 Lacs
Chennai
Hybrid
- Strong experience in Python
- Good experience in Databricks
- Experience working on the AWS/Azure cloud platforms
- Experience working with REST APIs and services, and messaging and event technologies
- Experience with ETL or data pipeline tools
- Experience with streaming platforms such as Kafka
- Demonstrated experience working with large and complex data sets
- Ability to document data pipeline architecture and design
- Experience in Airflow is nice to have
- Ability to build complex Delta Lake solutions
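The REST-API-plus-Delta-Lake combination above might be sketched as follows; the endpoint, response shape, and table name are assumptions for illustration.

```python
# Sketch: pulling records from a REST API and landing them in a Delta table.
# The endpoint and schema are hypothetical; assumes a Delta-enabled Spark session.
import requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("api-ingest").getOrCreate()

resp = requests.get("https://api.example.com/v1/events", timeout=30)
resp.raise_for_status()
records = resp.json()  # assumed: a JSON array of flat objects

# Land the batch in a bronze table for later transformation
df = spark.createDataFrame(records)
df.write.format("delta").mode("append").saveAsTable("bronze.api_events")
```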
Posted 1 week ago
4.0 - 9.0 years
2 - 6 Lacs
Bengaluru
Work from Office
Roles and Responsibilities:
- 4+ years of experience as a data developer using Python
- Knowledge of Spark and PySpark preferable but not mandatory
- Azure cloud experience preferred; alternate cloud experience is fine
- Preferred experience with the Azure platform, including Azure Data Lake, Databricks, and Data Factory
- Working knowledge of different file formats such as JSON, Parquet, and CSV
- Familiarity with data encryption and data masking
- Database experience in SQL Server is preferable; experience with NoSQL databases like MongoDB is preferred
- Team player: reliable, self-motivated, and self-disciplined
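Touching the file-format and data-masking points above, a minimal sketch: read JSON, mask a PII column with a one-way hash, and write Parquet. Paths and column names are illustrative.

```python
# Sketch: JSON-to-Parquet conversion with masking of a sensitive column.
# Storage paths and the "email" column are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("format-mask").getOrCreate()

users = spark.read.json("abfss://landing@examplelake.dfs.core.windows.net/users/")

# Mask PII with a one-way SHA-256 hash before data leaves the landing zone
masked = users.withColumn("email", F.sha2(F.col("email"), 256))

(masked.write
 .mode("overwrite")
 .parquet("abfss://curated@examplelake.dfs.core.windows.net/users/"))
```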
Posted 1 week ago
1.0 - 3.0 years
2 - 5 Lacs
Chennai
Work from Office
Mandatory Skills: AWS, Python, SQL, Spark, Airflow, Snowflake

Responsibilities:
- Create and manage cloud resources in AWS
- Ingest data from different data sources that expose data using different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time-series data from various proprietary systems
- Implement data ingestion and processing with the help of big data technologies
- Process and transform data using technologies such as Spark and cloud services; you will need to understand your part of the business logic and implement it using the language supported by the base data platform
- Develop automated data quality checks to make sure the right data enters the platform and to verify the results of calculations
- Develop an infrastructure to collect, transform, combine, and publish/distribute customer data
- Define process improvement opportunities to optimize data collection, insights, and displays
- Ensure data and results are accessible, scalable, efficient, accurate, complete, and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders
- Be a key participant in regular Scrum ceremonies with the agile teams
- Be proficient at developing queries, writing reports, and presenting findings
- Mentor junior members and bring best industry practices
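The Airflow piece of this stack could be wired roughly as below: a minimal DAG (Airflow 2.x syntax) chaining ingestion, a Spark transform, and a data-quality check. Task bodies and IDs are placeholders.

```python
# Sketch: a minimal Airflow 2.x DAG chaining ingest -> transform -> quality check.
# The task callables are stubs standing in for real pipeline steps.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def ingest(): ...          # pull from source systems into S3
def transform(): ...       # run the Spark job
def quality_check(): ...   # validate row counts / nulls before publishing

with DAG(
    dag_id="customer_data_pipeline",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",     # Airflow 2.4+ keyword; earlier versions use schedule_interval
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="ingest", python_callable=ingest)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="quality_check", python_callable=quality_check)
    t1 >> t2 >> t3
```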
Posted 1 week ago
5.0 - 10.0 years
5 - 9 Lacs
Mumbai
Work from Office
Roles & Responsibilities:
- Must have 5+ years of hands-on experience in Azure cloud development (ADF + Databricks) - mandatory
- Strong in Azure SQL; good to have knowledge of Synapse/Analytics
- Experience working on Agile projects and familiarity with Scrum/SAFe ceremonies
- Good communication skills, written and verbal; can work directly with the customer
- Ready to work in the 2nd shift; flexible
- Defines, designs, develops, and tests software components/applications using Microsoft Azure: Databricks, ADF, ADL, Hive, Python, Spark SQL, PySpark
- Expertise in Azure Databricks, ADF, ADL, Hive, Python, Spark, PySpark
- Strong T-SQL skills with experience in Azure SQL DW
- Experience handling structured and unstructured datasets
- Experience in data modeling and advanced SQL techniques
- Experience implementing Azure Data Factory pipelines using the latest technologies and techniques
- Good exposure to application development
- The candidate should work independently with minimal supervision
Posted 1 week ago
5.0 years
0 Lacs
Bengaluru, Karnataka, India
On-site
Hello, Truecaller is calling you from Bangalore, India! Ready to pick up?

Our goal is to make communication smarter, safer, and more efficient, all while building trust everywhere. We're all about bringing you smart services with a big social impact, keeping you safe from fraud, harassment, and scam calls or messages, so you can focus on the conversations that matter. Truecaller is one of the top 20 most downloaded apps globally, and the world's #1 caller ID and spam-blocking service for Android and iOS, with extensive AI capabilities and more than 400 million active users per month. Founded in 2009, Truecaller is listed on Nasdaq OMX Stockholm and is categorized as a Large Cap. Our focus on innovation, operational excellence, sustainable growth, and collaboration has resulted in consistently high profitability and strong EBITDA margins. We are a team of 400 people from ~35 different nationalities spread across our headquarters in Stockholm and offices in Bangalore, Mumbai, Gurgaon and Tel Aviv, with high ambitions.

We in the Insights Team are responsible for SMS categorization, fraud detection, and other Smart SMS features within the Truecaller app. OTP and bank notifications and bill and travel reminder alerts are some examples of the Smart SMS features. The team has developed a patented offline text parser that powers all these features, and is also exploring cutting-edge technologies like LLMs to enhance them. The team's mission is to become the world's most loved and trusted SMS app, aligned with Truecaller's vision to make communication safe and efficient. Smart SMS is used by over 90M users every day.

As a Senior Data Scientist, you will be responsible for collecting, organizing, analyzing, and interpreting Truecaller data with a focus on NLP. In this role, you will be pivotal in advancing our work with large language models and on-device models across diverse regions. Your expertise will enhance our natural language processing, machine learning, and predictive analytics capabilities.

What you bring in:
- 5+ years of experience in designing, developing, and deploying ML models at scale, with a focus on NLP-driven solutions
- Strong background in Natural Language Processing (NLP), including text classification, entity recognition, language modeling, and transformer-based architectures
- Experience building and deploying models at scale, handling millions of messages efficiently while maintaining performance and accuracy; also working with on-device models
- Ability to not only build ML models but also take ownership of deploying them into production, ensuring scalability, reliability, and monitoring
- Knowledge of anomaly detection, adversarial ML techniques, and risk modeling to identify and prevent spam and fraudulent messaging activities
- Strong ability to take ML models from research and experimentation to production, working closely with ML engineers and data engineers
- Expertise in machine learning libraries such as TensorFlow, PyTorch, pandas, and scikit-learn, along with NLP-specific tools like Hugging Face Transformers and spaCy, with experience in TFLite and ONNX
- Hands-on experience fine-tuning LLMs, including transformer-based architectures (BERT, GPT, LLaMA, T5, etc.) for domain-specific applications, including knowledge distillation, quantization, and model compression for efficiency
- Strong ability to design, refine, and optimize prompts for LLM-based applications, ensuring high-quality responses and reduced model hallucinations
- Ability to leverage data-driven decisions through experimentation and statistical analysis to improve models and business outcomes
- Strong understanding of designing, testing, and optimizing prompts for LLM-based applications to improve model accuracy and efficiency
- Programming knowledge in at least one language, such as Python or R, preferably Python
- Expert knowledge of machine learning algorithms
- Familiarity with database modelling and data warehousing principles, with a working knowledge of SQL
- Experience in building and optimizing large-scale data processing systems using Spark/PySpark
- Strong ability to work cross-functionally with engineers, product managers, and business stakeholders to align ML solutions with company objectives

The impact you will create:
- Take a loosely defined business problem and break it into tractable data problems; for each, clearly articulate the value of solving it, its impact, and its complexity
- Collaborate with Product and Engineering to scope, design, and implement systems that solve complex business problems, ensuring they are delivered on time and within scope
- Design, develop, and optimize state-of-the-art NLP models for large-scale message classification, fraud detection, and spam filtering, impacting millions of users globally
- Take full ownership of ML model development, deployment, and monitoring, ensuring models are production-ready, scalable, and cost-efficient
- Lead data science projects from ideation to deployment, ensuring alignment with business objectives and timelines
- Manage and analyze large datasets collected from multiple countries, ensuring data integrity and consistency
- Stay updated on industry best practices and emerging technologies to drive innovation within the Data Team
- Work collaboratively across systems and teams to solve user and business problems; help define success, design and build the systems to achieve it, and work with Product to decide on priorities, set direction, design solutions, and help the team implement them

It would be great if you also have:
- Understanding of conversational AI
- Experience deploying NLP models in production
- Working knowledge of GCP components

Life at Truecaller - Behind the code: https://www.instagram.com/lifeattruecaller/

Sounds like your dream job? We will fill the position as soon as we find the right candidate, so please send your application as soon as possible. As part of the recruitment process, we will conduct a background check. This position is based in Bangalore, India. We only accept applications in English.

What we offer:
- A smart, talented and agile team: an international team where ~35 nationalities are working together in several locations and time zones with a learning, sharing and fun environment.
- A great compensation package: competitive salary, 30 days of paid vacation, flexible working hours, private health insurance, parental leave, telephone bill reimbursement, Udemy membership to keep learning and improving, and a wellness allowance.
- Great tech tools: pick the computer and phone that you fancy the most within our budget ranges.
- Office life: we strongly believe in in-person collaboration and follow an office-first approach while offering some flexibility. Enjoy your days with great colleagues with loads of good stuff to learn from, daily lunch and breakfast, and a wide range of healthy snacks and beverages.
In addition, every now and then check out the playroom for a fun break, or join our exciting parties and team activities such as Lab days, sports meetups, etc. There is something for everyone!

Come as you are: Truecaller is diverse, equal and inclusive. We need a wide variety of backgrounds, perspectives, beliefs and experiences in order to keep building our great products. No matter where you are based, which language you speak, your accent, race, religion, color, nationality, gender, sexual orientation, age, marital status, etc. All those things make you who you are, and that's why we would love to meet you.
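As a rough illustration of the SMS-classification work this role centres on, here is a minimal Hugging Face Transformers inference sketch. The checkpoint is a public sentiment demo model standing in for a purpose-trained SMS classifier; none of this reflects Truecaller's actual models.

```python
# Sketch: transformer-based text classification via the Transformers pipeline API.
# The checkpoint is a public demo model; labels are illustrative, not SMS categories.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

msgs = [
    "Your OTP is 482913. Do not share it with anyone.",
    "Congratulations! You won a free cruise. Click here to claim.",
]
for msg, pred in zip(msgs, classifier(msgs)):
    print(f"{pred['label']:>8}  {pred['score']:.2f}  {msg[:50]}")
```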
Posted 1 week ago
9.0 - 12.0 years
35 - 40 Lacs
Bengaluru
Work from Office
We are seeking an experienced AWS Architect with a strong background in designing and implementing cloud-native data platforms. The ideal candidate should possess deep expertise in AWS services such as S3, Redshift, Aurora, Glue, and Lambda, along with hands-on experience in data engineering and orchestration tools. Strong communication and stakeholder management skills are essential for this role.

Key Responsibilities:
- Design and implement end-to-end data platforms leveraging AWS services
- Lead architecture discussions and ensure scalability, reliability, and cost-effectiveness
- Develop and optimize solutions using Redshift, including stored procedures, federated queries, and the Redshift Data API
- Utilize AWS Glue and Lambda functions to build ETL/ELT pipelines
- Write efficient Python code and data frame transformations, along with unit testing
- Manage orchestration tools such as AWS Step Functions and Airflow
- Perform Redshift performance tuning to ensure optimal query execution
- Collaborate with stakeholders to understand requirements and communicate technical solutions clearly

Required Skills & Qualifications:
- Minimum 9 years of IT experience with proven AWS expertise
- Hands-on experience with AWS services: S3, Redshift, Aurora, Glue, and Lambda
- Mandatory experience working with AWS Redshift, including stored procedures and performance tuning
- Experience building end-to-end data platforms on AWS
- Proficiency in Python, especially working with data frames and writing testable, production-grade code
- Familiarity with orchestration tools like Airflow or AWS Step Functions
- Excellent problem-solving skills and a collaborative mindset
- Strong verbal and written communication and stakeholder management abilities

Nice to Have:
- Experience with CI/CD for data pipelines
- Knowledge of AWS Lake Formation and data governance practices
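The Redshift Data API mentioned above is asynchronous; a hedged boto3 sketch of the submit-poll-fetch pattern follows, with an invented serverless workgroup, database, and query.

```python
# Sketch: running a statement through the Redshift Data API with boto3.
# Workgroup, database, and query are placeholders.
import time
import boto3

client = boto3.client("redshift-data", region_name="ap-south-1")

resp = client.execute_statement(
    WorkgroupName="example-serverless-wg",  # or ClusterIdentifier= for provisioned
    Database="analytics",
    Sql="SELECT order_date, SUM(revenue) FROM fact_orders "
        "GROUP BY 1 ORDER BY 1 DESC LIMIT 7;",
)

# The Data API is asynchronous: poll until the statement finishes
while True:
    desc = client.describe_statement(Id=resp["Id"])
    if desc["Status"] in ("FINISHED", "FAILED", "ABORTED"):
        break
    time.sleep(1)

if desc["Status"] == "FINISHED":
    rows = client.get_statement_result(Id=resp["Id"])["Records"]
    print(rows[:3])
```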
Posted 1 week ago
50.0 years
0 Lacs
Mumbai, Maharashtra, India
On-site
Company Description
Anuva is a specialized consulting firm that assists businesses in navigating complexity across commercial compliance, data solutions, and internal audit automation. With over 50 years of combined professional experience, Anuva empowers businesses to make informed decisions.

Role Description
This is a full-time on-site role for a Sr. Power BI Developer at Anuva, located in Mumbai. The Sr. Power BI Developer will be responsible for developing data models, creating dashboards, applying analytical skills, managing data warehouses, and performing Extract, Transform, Load (ETL) processes.

Qualifications
- Master's or Bachelor's degree in Computer Science, Information Systems, or a related field
- Minimum 5 years of experience with Power BI and SQL
- Proficiency with Python (PySpark, PySQL, etc.)
- Data modelling and data warehousing skills
- Experience with Extract, Transform, Load (ETL) processes
- Knowledge of and experience with cloud computing is a plus
- Strong problem-solving and critical thinking abilities
- Excellent communication and collaboration skills
Posted 1 week ago
4.0 years
0 Lacs
Surat, Gujarat, India
On-site
Company Name: Logicserve Digital Consultancy Services Pvt. Ltd.
Location: Surat
Experience: 4+ Years
Working Days: 6 Days a Week

Key Responsibilities:
- Requirement gathering: collaborate with stakeholders to gather and document data requirements; translate business needs into technical specifications for data solutions
- Data pipeline development: design, implement, and maintain data pipelines using Microsoft Fabric; create and manage ETL processes, ensuring efficient data flow
- SSIS integration: develop and maintain ETL processes using SQL Server Integration Services (SSIS); optimize SSIS packages for performance and reliability
- CDC pipelines: implement Change Data Capture (CDC) pipelines to track and manage data changes, ensuring timely and accurate data updates across systems
- PySpark proficiency: utilize PySpark for data transformation and processing tasks; develop data processing scripts to handle large datasets efficiently
- Collaboration: work closely with data analysts, data scientists, and other teams to ensure data integrity; participate in cross-functional projects to enhance data accessibility and usability

Required Skills:
- Technical proficiency: strong experience with Azure Data Factory, Azure Synapse, and Microsoft Fabric; proficiency in PySpark for data processing and transformation; solid understanding of SQL Server and experience writing complex queries
- Data modeling: knowledge of data modeling concepts and best practices; experience designing data architectures that support business needs
- Cloud services: familiarity with cloud computing concepts and services, particularly within the Azure ecosystem; understanding of data storage solutions such as Azure Data Lake and Blob Storage
- Problem-solving: strong analytical skills to troubleshoot data-related issues; ability to optimize data workflows for performance and efficiency

Preferred Qualifications:
- Experience with data warehousing concepts and tools
- Knowledge of data governance and compliance standards
- Familiarity with version control systems and CI/CD practices
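A CDC pipeline ultimately has to apply inserts, updates, and deletes to a target; one common pattern is a Delta Lake merge, sketched below with an assumed change-feed table carrying an op column (I/U/D). Table names and the change schema are illustrative.

```python
# Sketch: applying a CDC batch (inserts/updates/deletes) to a Delta table via merge.
# Assumes a Delta-enabled Spark session; names and the op column are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("cdc-apply").getOrCreate()

# Change feed: customer columns plus op, where op is one of 'I', 'U', 'D'
changes = spark.table("staging.customer_changes")
target = DeltaTable.forName(spark, "curated.customers")

(target.alias("t")
 .merge(changes.alias("c"), "t.customer_id = c.customer_id")
 .whenMatchedDelete(condition="c.op = 'D'")
 .whenMatchedUpdateAll(condition="c.op = 'U'")
 .whenNotMatchedInsertAll(condition="c.op = 'I'")
 .execute())
```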
Posted 1 week ago
3.0 - 7.0 years
0 Lacs
Greater Kolkata Area
On-site
About This Opportunity
Ericsson is a leading provider of telecommunications equipment and services to mobile and fixed network operators globally. We are seeking a highly skilled and experienced Data Scientist to join our dynamic team at Ericsson. As a Data Scientist, you will be responsible for leveraging advanced analytics and machine learning techniques to drive actionable insights and solutions for our telecom domain. This role requires a deep understanding of data science methodologies, strong programming skills, and proficiency in cloud-based environments.

What You Will Do
- Develop and deploy machine learning models for various applications including chatbots, XGBoost, random forests, NLP, computer vision, and generative AI
- Utilize Python for data manipulation, analysis, and modeling tasks
- Use SQL proficiently for querying and analyzing large datasets
- Use Docker and Kubernetes for containerization and orchestration of applications
- Apply basic knowledge of PySpark for distributed computing and data processing
- Collaborate with cross-functional teams to understand business requirements and translate them into analytical solutions
- Deploy machine learning models into production environments and ensure scalability and reliability
- Preferably, work with Google Cloud Platform (GCP) services for data storage, processing, and deployment
- Analyse complex problems and translate them into algorithms
- Develop REST API backends using Flask or FastAPI
- Deploy with CI/CD pipelines
- Handle data sets and data pre-processing through PySpark
- Write queries targeting Cassandra and PostgreSQL databases
- Apply design principles in application development
- Work with service-oriented architecture (SOA, web services, REST)
- Work in agile development and with GCP BigQuery
- Use general tools and techniques such as Docker, K8s, Git, and Argo Workflows

The Skills You Bring
- Bachelor's degree in Computer Science, Statistics, Mathematics, or a related field; a Master's degree or PhD is preferred
- 3-7 years of experience in data science and machine learning roles, preferably within the telecommunications or a related industry
- Proven experience in model development, evaluation, and deployment
- Strong programming skills in Python and SQL
- Familiarity with Docker, Kubernetes, and PySpark
- Solid understanding of machine learning techniques and algorithms
- Experience working with cloud platforms, preferably GCP
- Excellent problem-solving skills and ability to work independently as well as part of a team
- Strong communication and presentation skills, with the ability to explain complex analytical concepts to non-technical stakeholders

Why join Ericsson?
At Ericsson, you'll have an outstanding opportunity. The chance to use your skills and imagination to push the boundaries of what's possible. To build solutions never seen before to some of the world's toughest problems. You'll be challenged, but you won't be alone. You'll be joining a team of diverse innovators, all driven to go beyond the status quo to craft what comes next.

Encouraging a diverse and inclusive organization is core to our values at Ericsson; that's why we champion it in everything we do. We truly believe that by collaborating with people with different experiences we drive innovation, which is essential for our future growth.
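The Flask/FastAPI backend requirement above might look like this minimal FastAPI sketch serving a pre-trained model; the model file and feature names are placeholders.

```python
# Sketch: a minimal FastAPI endpoint serving a pre-trained model.
# model.pkl and the feature schema are hypothetical placeholders.
import pickle
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
with open("model.pkl", "rb") as f:  # assumed scikit-learn/XGBoost classifier
    model = pickle.load(f)

class Features(BaseModel):
    tenure: float
    monthly_spend: float
    support_calls: int

@app.post("/predict")
def predict(x: Features):
    # predict_proba follows the scikit-learn estimator interface
    proba = model.predict_proba([[x.tenure, x.monthly_spend, x.support_calls]])[0][1]
    return {"churn_probability": round(float(proba), 4)}
```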
We encourage people from all backgrounds to apply and realize their full potential as part of our Ericsson team. Ericsson is proud to be an Equal Opportunity Employer.

Primary country and city: India (IN) || Noida
Req ID: 759817
Posted 1 week ago
5.0 years
0 Lacs
Bengaluru, Karnataka, India
On-site
We are looking for an immediate joiner and experienced Big Data Developer with a strong background in Kafka, PySpark, Python/Scala, Spark, SQL, and the Hadoop ecosystem. The ideal candidate should have over 5 years of experience and be ready to join immediately. This role requires hands-on expertise in big data technologies and the ability to design and implement robust data processing solutions.

Responsibilities:
- Design, develop, and maintain scalable data processing pipelines using Kafka, PySpark, Python/Scala, and Spark
- Work extensively with the Kafka and Hadoop ecosystem, including HDFS, Hive, and other related technologies
- Write efficient SQL queries for data extraction, transformation, and analysis
- Implement and manage Kafka streams for real-time data processing
- Utilize scheduling tools to automate data workflows and processes
- Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions
- Ensure data quality and integrity by implementing robust data validation processes
- Optimize existing data processes for performance and scalability

Requirements:
- Experience with GCP
- Knowledge of data warehousing concepts and best practices
- Familiarity with machine learning and data analysis tools
- Understanding of data governance and compliance standards

This job was posted by Arun Kumar K from krtrimaIQ Cognitive Solutions.
Posted 1 week ago
PySpark, the Python API for the Apache Spark data processing engine, is in high demand in the job market in India. With the increasing need for big data processing and analysis, companies are actively seeking professionals with PySpark skills to join their teams. If you are a job seeker looking to excel in the field of big data and analytics, exploring PySpark jobs in India could be a great career move.
Here are 5 major cities in India where companies are actively hiring for PySpark roles:
1. Bangalore
2. Pune
3. Hyderabad
4. Mumbai
5. Delhi
The estimated salary range for PySpark professionals in India varies based on experience levels. Entry-level positions can expect to earn around INR 6-8 lakhs per annum, while experienced professionals can earn upwards of INR 15 lakhs per annum.
In the field of PySpark, a typical career progression may look like this:
1. Junior Developer
2. Data Engineer
3. Senior Developer
4. Tech Lead
5. Data Architect
In addition to PySpark, professionals in this field are often expected to have or develop skills in:
- Python programming
- Apache Spark
- Big data technologies (Hadoop, Hive, etc.)
- SQL
- Data visualization tools (Tableau, Power BI)
As you explore PySpark jobs in India, remember to prepare thoroughly for interviews and showcase your expertise confidently. With the right skills and knowledge, you can excel in this field and advance your career in the world of big data and analytics. Good luck!