Home
Jobs

5397 Pyspark Jobs - Page 40

Filter
Filter Interviews
Min: 0 years
Max: 25 years
Min: ₹0
Max: ₹10000000
Setup a job Alert
JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

9.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

Linkedin logo

Who We Are With Citi’s Analytics & Information Management (AIM) group, you will do meaningful work from Day 1. Our collaborative and respectful culture lets people grow and make a difference in one of the world’s leading Financial Services Organizations. The purpose of the group is to use Citi’s data assets to analyze information and create actionable intelligence for our business leaders. We value what makes you unique so that you have opportunity to shine. You also get the opportunity to work with the best minds and top leaders in the analytics space. What The Role Is The role of Officer – Data Management/Information Analyst will be part of AIM team based out of Chennai/Bengaluru, India supporting the Global Workforce Optimization unit. The GWFO team supports capacity planning across the organization. The primary responsibility of the GWFO team is to forecast future demand (Inbound /Outbound Call Volume, back-office process volume, etc.) and the capacity required to fulfill the demand. It also includes forecasting short-term demand (daily/ hourly) and scheduling the agent accordingly. The GWFO team is also responsible for collaborating with multiple stakeholders and coming up with optimal hiring plans to ensure adequate capacity as well as optimize the operational budget. In this role, you will work along a highly talented team of analysts to build data solutions to track key business metrics and support the workforce optimization activities. You will be responsible for understanding and mapping out the data landscape for current and new businesses that are onboarded by GWFO and design the data store and pipes needed to provide for capacity planning, reporting and analytics, as well as real-time monitoring. You would be responsible for ensuring that the partner needs are met in a timely manner, by identifying creative short term and long-term approaches, and leveraging your expertise and background in selecting the right tools and techniques to get the missing accomplished. You would work very closely with GWFO’s technology partners in getting these solutions implemented in a compliant environment. You will also help align all stakeholders and keeping senior leadership updated on the plans and progress. You will be a senior SME in the group, supporting and mentoring the team on technical solutioning and should be ready to lend your hand to juniors when needed. Who You Are Data Driven. A proven track record of enabling decision making and problem solving with data. Conceptual thinking skills must be complemented by a strong quantitative orientation and data driven approach. Excellent Problem Solver. You are a critical thinker, able to ask right questions, make sense of a situation and come up with intelligent solutions. Strong Team Player. You build a trusted relationships with your team members. You are ready to offer unconditional assistance, will listen, share knowledge, and are always ready to provide support as needed. Strong Communicator. You can communicate verbally and through written communication with clarity and can structure and present your work to your partners & leadership. Clear Results Orientation. You display a keen focus on achieving both short and long-term goals and have experience driving and executing an agenda in a demanding and fast-paced environment, with an eye on risks & controls. Innovative. You are always challenging yourself and your team to find better and faster ways of doing things. What You Do Data Exploration. Understand underlying data sources by dwelling into multiple platforms scattered across the organization. You do what it takes to gather information by connecting with people across business teams and technology. Build Data Assets. You have a strong data design background and are capable of developing and building multi-dimensional data assets and pipes that captures abundant information about various line of business. Process & Controls Orientation. You develop strong processes, and indestructible controls to address risk and seek to propagate that culture to become the core value of your team. Dashboarding and Visualization. You develop insightful, visually compelling and engaging dashboards that supports decision making and drive adoption. Flawless Execution. You manage and sequence delivery of reporting and data needs by actively managing requests against available bandwidth and identify opportunities for improved productivity. Be an Enabler. You support your team and help them accomplish their goals with empathy. You act as a facilitator and remove blockers and create a positive atmosphere for then to be innovative and productive at work. Qualifications Must have 9-10+ years of work experience largely in the Data Management / engineering space. Must have expertise working with SQL. Must have expertise working with PySpark/Python for data extraction and deep dive activities Prior experience in an Operations role is desirable Good to have working experience on MS Office Package (Excel, Outlook, PowerPoint, etc. with VBA) and/or BI Visualization tools like Tableau is a plus. Must have expertise working with Tableau and other visualization / reporting tools Minimum of 5 years of experience mentoring juniors and driving complex projects independently ------------------------------------------------------ Job Family Group: Decision Management ------------------------------------------------------ Job Family: Data/Information Management ------------------------------------------------------ Time Type: ------------------------------------------------------ Citi is an equal opportunity and affirmative action employer. Qualified applicants will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran. Citigroup Inc. and its subsidiaries ("Citi”) invite all qualified interested applicants to apply for career opportunities. If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity review Accessibility at Citi . View the " EEO is the Law " poster. View the EEO is the Law Supplement . View the EEO Policy Statement . View the Pay Transparency Posting Show more Show less

Posted 1 week ago

Apply

7.0 - 11.0 years

9 - 13 Lacs

Bengaluru

Work from Office

Naukri logo

Skill required: Data Management - PySpark Designation: Data Eng, Mgmt & Governance Specialist Qualifications: BE/BTech Years of Experience: 7 to 11 years About Accenture Combining unmatched experience and specialized skills across more than 40 industries, we offer Strategy and Consulting, Technology and Operations services, and Accenture Song all powered by the worlds largest network of Advanced Technology and Intelligent Operations centers. Our 699,000 people deliver on the promise of technology and human ingenuity every day, serving clients in more than 120 countries. Visit us at www.accenture.com What would you do Data & AIUnderstand Pyspark interface and integration in handling complexities of multiprocessing, such as distributing the data, distributing code and collecting output from the workers on a cluster of machines. What are we looking for PySpark AWS Architecture Adaptable and flexible Strong analytical skills Prioritization of workload Ability to manage multiple stakeholders Agility for quick learning Roles and Responsibilities: In this role you are required to do analysis and solving of moderately complex problems May create new solutions, leveraging and, where needed, adapting existing methods and procedures The person would require understanding of the strategic direction set by senior management as it relates to team goals Primary upward interaction is with direct supervisor May interact with peers and/or management levels at a client and/or within Accenture Guidance would be provided when determining methods and procedures on new assignments Decisions made by you will often impact the team in which they reside Individual would manage small teams and/or work efforts (if in an individual contributor role) at a client or within Accenture Qualification BE,BTech

Posted 1 week ago

Apply

5.0 - 8.0 years

9 - 13 Lacs

Bengaluru

Work from Office

Naukri logo

Skill required: Data Management - PySpark Designation: Data Eng, Mgmt & Governance Sr Analyst Qualifications: BE/BTech Years of Experience: 5 to 8 years About Accenture Combining unmatched experience and specialized skills across more than 40 industries, we offer Strategy and Consulting, Technology and Operations services, and Accenture Song all powered by the worlds largest network of Advanced Technology and Intelligent Operations centers. Our 699,000 people deliver on the promise of technology and human ingenuity every day, serving clients in more than 120 countries. Visit us at www.accenture.com What would you do Data & AIUnderstand Pyspark interface and integration in handling complexities of multiprocessing, such as distributing the data, distributing code and collecting output from the workers on a cluster of machines. What are we looking for PySpark AWS Architecture Adaptable and flexible Strong analytical skills Agility for quick learning Ability to work well in a team Commitment to quality Roles and Responsibilities: In this role you are required to do analysis and solving of increasingly complex problems Your day-to-day interactions are with peers within Accenture You are likely to have some interaction with clients and/or Accenture management You will be given minimal instruction on daily work/tasks and a moderate level of instruction on new assignments Decisions that are made by you impact your own work and may impact the work of others In this role you would be an individual contributor and/or oversee a small work effort and/or team Qualification BE,BTech

Posted 1 week ago

Apply

15.0 - 20.0 years

5 - 9 Lacs

Chennai

Work from Office

Naukri logo

Project Role : Application Developer Project Role Description : Design, build and configure applications to meet business process and application requirements. Must have skills : PySpark Good to have skills : NAMinimum 3 year(s) of experience is required Educational Qualification : 15 years full time education Summary :As an Application Developer, you will engage in the design, construction, and configuration of applications tailored to fulfill specific business processes and application requirements. Your typical day will involve collaborating with team members to understand project needs, developing innovative solutions, and ensuring that applications are optimized for performance and usability. You will also participate in testing and debugging processes to ensure the applications function as intended, while continuously seeking ways to enhance application efficiency and user experience. Roles & Responsibilities:- Expected to perform independently and become an SME.- Required active participation/contribution in team discussions.- Contribute in providing solutions to work related problems.- Assist in the documentation of application specifications and user guides.- Engage in code reviews to ensure adherence to best practices and standards. Professional & Technical Skills: - Must To Have Skills: Proficiency in Databricks Unified Data Analytics Platform.- Strong understanding of data integration and ETL processes.- Experience with cloud computing platforms and services.- Familiarity with programming languages such as Python or Scala.- Knowledge of data visualization techniques and tools. Additional Information:- The candidate should have minimum 2 years of experience in Databricks Unified Data Analytics Platform.- This position is based at our Chennai office.- A 15 years full time education is required. Qualification 15 years full time education

Posted 1 week ago

Apply

15.0 - 20.0 years

10 - 14 Lacs

Bengaluru

Work from Office

Naukri logo

Project Role : Application Lead Project Role Description : Lead the effort to design, build and configure applications, acting as the primary point of contact. Must have skills : PySpark Good to have skills : NAMinimum 5 year(s) of experience is required Educational Qualification : 15 years full time education Summary :As an Application Lead, you will lead the effort to design, build, and configure applications, acting as the primary point of contact. Your typical day will involve collaborating with various teams to ensure that application development aligns with business objectives, overseeing project timelines, and facilitating communication among stakeholders to drive project success. You will also engage in problem-solving activities, providing guidance and support to your team while ensuring that best practices are followed throughout the development process. Your role will be pivotal in shaping the direction of application projects and ensuring that they meet the needs of the organization and its clients. Roles & Responsibilities:- Expected to be an SME.- Collaborate and manage the team to perform.- Responsible for team decisions.- Engage with multiple teams and contribute on key decisions.- Provide solutions to problems for their immediate team and across multiple teams.- Facilitate training and development opportunities for team members to enhance their skills.- Monitor project progress and implement necessary adjustments to ensure timely delivery. Professional & Technical Skills: - Must To Have Skills: Proficiency in Databricks Unified Data Analytics Platform.- Good To Have Skills: Experience with cloud computing platforms such as AWS or Azure.- Strong understanding of data engineering principles and practices.- Experience in application development using modern programming languages.- Familiarity with Agile methodologies and project management tools. Additional Information:- The candidate should have minimum 5 years of experience in Databricks Unified Data Analytics Platform.- This position is based at our Bengaluru office.- A 15 years full time education is required. Qualification 15 years full time education

Posted 1 week ago

Apply

10.0 - 12.0 years

6 - 9 Lacs

Chennai, Bengaluru

Work from Office

Naukri logo

Extract data from source system using Data Factory pipelines Massaging and Cleansing the data Transform data based on business rules Expose the data for reporting needs and exchange data with downstream applications. Standardize the various integration flows (e.g decom ALDML Init integration, simplify ALDML Delta integration).

Posted 1 week ago

Apply

4.0 - 6.0 years

12 - 16 Lacs

Chennai

Work from Office

Naukri logo

We are seeking a skilled Data Engineer who can function as a Data Architect, designing scalable data pipelines, table structures, and ETL workflows. The ideal candidate will be responsible for recommending cost-effective and high-performance data architecture solutions, collaborating with cross-functional teams to enable efficient analytics and data science initiatives. Key Responsibilities: Design and implement ETL workflows, data pipelines, and table structures to support business analytics and data science. Optimize data storage, retrieval, and processing for cost-efficiency and high performance. Collaborate with Analytics and Data Science teams for feature engineering and KPI computations. Develop and maintain data models for structured and unstructured data. Ensure data quality, integrity, and security across systems. Work with cloud platforms (AWS/ Azure/ GCP) to design and manage scalable data architectures. Technical Skills Required: SQL & Python Strong proficiency in writing optimized queries and scripts. PySpark Hands-on experience with distributed data processing. Cloud Technologies (AWS/ Azure/ GCP) Experience with cloud-based data solutions. Spark & Airflow Experience with big data frameworks and workflow orchestration. Gen AI (Preferred) Exposure to generative AI applications is a plus. Preferred Qualifications: Experience in data modeling, ETL optimization, and performance tuning. Strong problem-solving skills and ability to work in a fast-paced environment. Prior experience working with large-scale data processing.

Posted 1 week ago

Apply

12.0 - 15.0 years

13 - 17 Lacs

Mumbai

Work from Office

Naukri logo

12+ Years experience in Big data Space across Architecture, Design, Development, testing & Deployment, full understanding in SDLC. 1. Experience of Hadoop and related technology stack experience 2. Experience of the Hadoop Eco-system(HDP+CDP) / Big Data (especially HIVE) Hand on experience with programming languages such as Java/Scala/python Hand-on experience/knowledge on Spark3. Being responsible and focusing on uptime and reliable running of all or ingestion/ETL jobs4. Good SQL and used to work in a Unix/Linux environment is a must.5. Create and maintain optimal data pipeline architecture.6. Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.7. Good to have cloud experience8. Good to have experience for Hadoop integration with data visualization tools like PowerBI. Location Mumbai, Pune, Chennai, Hyderabad, Coimbatore, Kolkata

Posted 1 week ago

Apply

8.0 - 13.0 years

1 - 4 Lacs

Pune

Work from Office

Naukri logo

Roles & Responsibilities: Provides expert level development system analysis design and implementation of applications using AWS services specifically using Python for Lambda Translates technical specifications and/or design models into code for new or enhancement projects (for internal or external clients). Develops code that reuses objects is well-structured includes sufficient comments and is easy to maintain Provides follow up Production support when needed. Submits change control requests and documents. Participates in design code and test inspections throughout the life cycle to identify issues and ensure methodology compliance. Participates in systems analysis activities including system requirements analysis and definition e.g. prototyping. Participates in other meetings such as those for use case creation and analysis. Performs unit testing and writes appropriate unit test plans to ensure requirements are satisfied. Assists in integration systems acceptance and other related testing as needed. Ensures developed code is optimized in order to meet client performance specifications associated with page rendering time by completing page performance tests. Technical Skills Required Experience in building large scale batch and data pipelines with data processing frameworks in AWS cloud platform using PySpark (on EMR) & Glue ETL Deep experience in developing data processing data manipulation tasks using PySpark such as reading data from external sources merge data perform data enrichment and load in to target data destinations. Experience in deployment and operationalizing the code using CI/CD tools Bit bucket and Bamboo Strong AWS cloud computing experience. Extensive experience in Lambda S3 EMR Redshift Should have worked on Data Warehouse/Database technologies for at least 8 years. 7. Any AWS certification will be an added advantage.

Posted 1 week ago

Apply

1.0 - 3.0 years

2 - 6 Lacs

Indore, Pune, Bengaluru

Work from Office

Naukri logo

LocationsPune, Bangalore, Indore Work modeWork from Office Informatica data quality - idq Azure databricks Azure data lake Azure Data Factory Api integration

Posted 1 week ago

Apply

6.0 - 7.0 years

14 - 18 Lacs

Bengaluru

Work from Office

Naukri logo

As Data Engineer, you will develop, maintain, evaluate and test big data solutions. You will be involved in the development of data solutions using Spark Framework with Python or Scala on Hadoop and Azure Cloud Data Platform Responsibilities: Experienced in building data pipelines to Ingest, process, and transform data from files, streams and databases. Process the data with Spark, Python, PySpark and Hive, Hbase or other NoSQL databases on Azure Cloud Data Platform or HDFS Experienced in develop efficient software code for multiple use cases leveraging Spark Framework / using Python or Scala and Big Data technologies for various use cases built on the platform Experience in developing streaming pipelines Experience to work with Hadoop / Azure eco system components to implement scalable solutions to meet the ever-increasing data volumes, using big data/cloud technologies Apache Spark, Kafka, any Cloud computing etc Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Total 6 - 7+ years of experience in Data Management (DW, DL, Data Platform, Lakehouse) and Data Engineering skills Minimum 4+ years of experience in Big Data technologies with extensive data engineering experience in Spark / Python or Scala; Minimum 3 years of experience on Cloud Data Platforms on Azure; Experience in DataBricks / Azure HDInsight / Azure Data Factory, Synapse, SQL Server DB Good to excellent SQL skills Preferred technical and professional experience Certification in Azure and Data Bricks or Cloudera Spark Certified developers

Posted 1 week ago

Apply

1.0 - 4.0 years

2 - 6 Lacs

Mumbai, Pune, Chennai

Work from Office

Naukri logo

Graph data Engineer required for a complex Supplier Chain Project. Key required Skills Graph data modelling (Experience with graph data models (LPG, RDF) and graph language (Cypher), exposure to various graph data modelling techniques) Experience with neo4j Aura, Optimizing complex queries. Experience with GCP stacks like BigQuery, GCS, Dataproc. Experience in PySpark, SparkSQL is desirable. Experience in exposing Graph data to visualisation tools such as Neo Dash, Tableau and PowerBI The Expertise You Have: Bachelors or Masters Degree in a technology related field (e.g. Engineering, Computer Science, etc.). Demonstrable experience in implementing data solutions in Graph DB space. Hands-on experience with graph databases (Neo4j(Preferred), or any other). Experience Tuning Graph databases. Understanding of graph data model paradigms (LPG, RDF) and graph language, hands-on experience with Cypher is required. Solid understanding of graph data modelling, graph schema development, graph data design. Relational databases experience, hands-on SQL experience is required. Desirable (Optional) skills: Data ingestion technologies (ETL/ELT), Messaging/Streaming Technologies (GCP data fusion, Kinesis/Kafka), API and in-memory technologies. Understanding of developing highly scalable distributed systems using Open-source technologies. Experience in Supply Chain Data is desirable but not essential. Location: Pune, Mumbai, Chennai, Bangalore, Hyderabad

Posted 1 week ago

Apply

2.0 - 5.0 years

14 - 17 Lacs

Bengaluru

Work from Office

Naukri logo

As a BigData Engineer at IBM you will harness the power of data to unveil captivating stories and intricate patterns. You'll contribute to data gathering, storage, and both batch and real-time processing. Collaborating closely with diverse teams, you'll play an important role in deciding the most suitable data management systems and identifying the crucial data required for insightful analysis. As a Data Engineer, you'll tackle obstacles related to database integration and untangle complex, unstructured data sets. In this role, your responsibilities may include: As a Big Data Engineer, you will develop, maintain, evaluate, and test big data solutions. You will be involved in data engineering activities like creating pipelines/workflows for Source to Target and implementing solutions that tackle the clients needs Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Big Data Developer, Hadoop, Hive, Spark, PySpark, Strong SQL. Ability to incorporate a variety of statistical and machine learning techniques. Basic understanding of Cloud (AWS,Azure, etc). Ability to use programming languages like Java, Python, Scala, etc., to build pipelines to extract and transform data from a repository to a data consumer Ability to use Extract, Transform, and Load (ETL) tools and/or data integration, or federation tools to prepare and transform data as needed. Ability to use leading edge tools such as Linux, SQL, Python, Spark, Hadoop and Java Preferred technical and professional experience Basic understanding or experience with predictive/prescriptive modeling skills You thrive on teamwork and have excellent verbal and written communication skills. Ability to communicate with internal and external clients to understand and define business needs, providing analytical solutions

Posted 1 week ago

Apply

2.0 - 5.0 years

14 - 17 Lacs

Navi Mumbai

Work from Office

Naukri logo

As a Data Engineer at IBM, you'll play a vital role in the development, design of application, provide regular support/guidance to project teams on complex coding, issue resolution and execution. Your primary responsibilities include: Lead the design and construction of new solutions using the latest technologies, always looking to add business value and meet user requirements. Strive for continuous improvements by testing the build solution and working under an agile framework. Discover and implement the latest technologies trends to maximize and build creative solutions Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Experience with Apache Spark (PySpark)In-depth knowledge of Spark’s architecture, core APIs, and PySpark for distributed data processing. Big Data TechnologiesFamiliarity with Hadoop, HDFS, Kafka, and other big data tools. Data Engineering Skills: Strong understanding of ETL pipelines, data modeling, and data warehousing concepts. Strong proficiency in PythonExpertise in Python programming with a focus on data processing and manipulation. Data Processing FrameworksKnowledge of data processing libraries such as Pandas, NumPy. SQL ProficiencyExperience writing optimized SQL queries for large-scale data analysis and transformation. Cloud PlatformsExperience working with cloud platforms like AWS, Azure, or GCP, including using cloud storage systems Preferred technical and professional experience Define, drive, and implement an architecture strategy and standards for end-to-end monitoring. Partner with the rest of the technology teams including application development, enterprise architecture, testing services, network engineering, Good to have detection and prevention tools for Company products and Platform and customer-facing

Posted 1 week ago

Apply

3.0 years

0 Lacs

Gurugram, Haryana, India

On-site

Linkedin logo

Line of Service Advisory Industry/Sector Not Applicable Specialism Data, Analytics & AI Management Level Senior Associate Job Description & Summary At PwC, our people in data and analytics focus on leveraging data to drive insights and make informed business decisions. They utilise advanced analytics techniques to help clients optimise their operations and achieve their strategic goals. In business intelligence at PwC, you will focus on leveraging data and analytics to provide strategic insights and drive informed decision-making for clients. You will develop and implement innovative solutions to optimise business performance and enhance competitive advantage. Why PWC At PwC, you will be part of a vibrant community of solvers that leads with trust and creates distinctive outcomes for our clients and communities. This purpose-led and values-driven work, powered by technology in an environment that drives innovation, will enable you to make a tangible impact in the real world. We reward your contributions, support your wellbeing, and offer inclusive benefits, flexibility programmes and mentorship that will help you thrive in work and life. Together, we grow, learn, care, collaborate, and create a future of infinite experiences for each other. Learn more about us. At PwC, we believe in providing equal employment opportunities, without any discrimination on the grounds of gender, ethnic background, age, disability, marital status, sexual orientation, pregnancy, gender identity or expression, religion or other beliefs, perceived differences and status protected by law. We strive to create an environment where each one of our people can bring their true selves and contribute to their personal growth and the firm’s growth. To enable this, we have zero tolerance for any discrimination and harassment based on the above considerations. " Responsibilities Hands-on experience in Pyspark, preferably more than 3 years - familiarity with RDD level programming as well as coding using Spark DataFrames API. Ability to develop and maintain data pipelines and ETLs using Python and Pyspark. Should have good knowledge of Python and Spark concepts like Driver vs Worker, actions and transforms, data partitioning and bucketing etc. Experience in data management activities and data lifecycle management - for example activities related to integrating source databases to a data lake via full batch or incremental data updates. Ability to design, implement, and optimize Spark jobs for performance and scalability. Perform data analysis and troubleshooting to ensure data quality and reliability. Experience in cloud technologies - esp. object storage for data lakes is a must. Experience in Spark Streaming, GraphX is a plus. Mandatory Skill Sets Pyspark Preferred Skill Sets Pyspark Years Of Experience Required 5+ Education Qualification BE/BTech/MBA/MCA Education (if blank, degree and/or field of study not specified) Degrees/Field of Study required: Bachelor of Technology, Master of Business Administration, Bachelor of Engineering Degrees/Field Of Study Preferred Certifications (if blank, certifications not specified) Required Skills PySpark Optional Skills Accepting Feedback, Accepting Feedback, Active Listening, Analytical Thinking, Business Case Development, Business Data Analytics, Business Intelligence and Reporting Tools (BIRT), Business Intelligence Development Studio, Communication, Competitive Advantage, Continuous Process Improvement, Creativity, Data Analysis and Interpretation, Data Architecture, Database Management System (DBMS), Data Collection, Data Pipeline, Data Quality, Data Science, Data Visualization, Embracing Change, Emotional Regulation, Empathy, Inclusion, Industry Trend Analysis {+ 16 more} Desired Languages (If blank, desired languages not specified) Travel Requirements Not Specified Available for Work Visa Sponsorship? No Government Clearance Required? No Job Posting End Date Show more Show less

Posted 1 week ago

Apply

6.0 - 11.0 years

14 - 17 Lacs

Mysuru

Work from Office

Naukri logo

As an Application Developer, you will lead IBM into the future by translating system requirements into the design and development of customized systems in an agile environment. The success of IBM is in your hands as you transform vital business needs into code and drive innovation. Your work will power IBM and its clients globally, collaborating and integrating code into enterprise systems. You will have access to the latest education, tools and technology, and a limitless career path with the world’s technology leader. Come to IBM and make a global impact Responsibilities: Responsible to manage end to end feature development and resolve challenges faced in implementing the same Learn new technologies and implement the same in feature development within the time frame provided Manage debugging, finding root cause analysis and fixing the issues reported on Content Management back end software system fixing the issues reported on Content Management back end software system Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Overall, more than 6 years of experience with more than 4+ years of Strong Hands on experience in Python and Spark Strong technical abilities to understand, design, write and debug to develop applications on Python and Pyspark. Good to Have;- Hands on Experience on cloud technology AWS/GCP/Azure strong problem-solving skill Preferred technical and professional experience Good to Have;- Hands on Experience on cloud technology AWS/GCP/Azure

Posted 1 week ago

Apply

2.0 - 5.0 years

14 - 17 Lacs

Mumbai

Work from Office

Naukri logo

As a Data Engineer at IBM, you'll play a vital role in the development, design of application, provide regular support/guidance to project teams on complex coding, issue resolution and execution. Your primary responsibilities include: Lead the design and construction of new solutions using the latest technologies, always looking to add business value and meet user requirements. Strive for continuous improvements by testing the build solution and working under an agile framework. Discover and implement the latest technologies trends to maximize and build creative solutions Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Experience with Apache Spark (PySpark)In-depth knowledge of Spark’s architecture, core APIs, and PySpark for distributed data processing. Big Data TechnologiesFamiliarity with Hadoop, HDFS, Kafka, and other big data tools. Data Engineering Skills: Strong understanding of ETL pipelines, data modeling, and data warehousing concepts. Strong proficiency in PythonExpertise in Python programming with a focus on data processing and manipulation. Data Processing FrameworksKnowledge of data processing libraries such as Pandas, NumPy. SQL ProficiencyExperience writing optimized SQL queries for large-scale data analysis and transformation. Cloud PlatformsExperience working with cloud platforms like AWS, Azure, or GCP, including using cloud storage systems Preferred technical and professional experience Define, drive, and implement an architecture strategy and standards for end-to-end monitoring. Partner with the rest of the technology teams including application development, enterprise architecture, testing services, network engineering, Good to have detection and prevention tools for Company products and Platform and customer-facing

Posted 1 week ago

Apply

2.0 - 5.0 years

14 - 17 Lacs

Mumbai

Work from Office

Naukri logo

Experience with Scala object-oriented/object function Strong SQL background. Experience in Spark SQL, Hive, Data Engineer. SQL Experience with data pipelines & Data Lake Strong background in distributed comp. Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise SQL Experience with data pipelines & Data Lake Strong background in distributed comp Experience with Scala object-oriented/object function Strong SQL background Preferred technical and professional experience Core Scala Development Experience

Posted 1 week ago

Apply

5.0 - 7.0 years

14 - 18 Lacs

Bengaluru

Work from Office

Naukri logo

As Data Engineer, you will develop, maintain, evaluate and test big data solutions. You will be involved in the development of data solutions using Spark Framework with Python or Scala on Hadoop and Azure Cloud Data Platform Responsibilities Experienced in building data pipelines to Ingest, process, and transform data from files, streams and databases. Process the data with Spark, Python, PySpark and Hive, Hbase or other NoSQL databases on Azure Cloud Data Platform or HDFS Experienced in develop efficient software code for multiple use cases leveraging Spark Framework / using Python or Scala and Big Data technologies for various use cases built on the platform Experience in developing streaming pipelines Experience to work with Hadoop / Azure eco system components to implement scalable solutions to meet the ever-increasing data volumes, using big data/cloud technologies Apache Spark, Kafka, any Cloud computing etc Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Total 5 - 7+ years of experience in Data Management (DW, DL, Data Platform, Lakehouse) and Data Engineering skills Minimum 4+ years of experience in Big Data technologies with extensive data engineering experience in Spark / Python or Scala. Minimum 3 years of experience on Cloud Data Platforms on Azure Experience in DataBricks / Azure HDInsight / Azure Data Factory, Synapse, SQL Server DB Exposure to streaming solutions and message brokers like Kafka technologies Experience Unix / Linux Commands and basic work experience in Shell Scripting Preferred technical and professional experience Certification in Azure and Data Bricks or Cloudera Spark Certified developers

Posted 1 week ago

Apply

2.0 - 5.0 years

14 - 17 Lacs

Kochi

Work from Office

Naukri logo

As a Data Engineer at IBM, you'll play a vital role in the development, design of application, provide regular support/guidance to project teams on complex coding, issue resolution and execution Your primary responsibilities include: Lead the design and construction of new solutions using the latest technologies, always looking to add business value and meet user requirements. Strive for continuous improvements by testing the build solution and working under an agile framework. Discover and implement the latest technologies trends to maximize and build creative solutions Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Experience with Apache Spark (PySpark)In-depth knowledge of Spark’s architecture, core APIs, and PySpark for distributed data processing. Big Data TechnologiesFamiliarity with Hadoop, HDFS, Kafka, and other big data tools. Data Engineering Skills: Strong understanding of ETL pipelines, data modeling, and data warehousing concepts. * Strong proficiency in PythonExpertise in Python programming with a focus on data processing and manipulation. Data Processing FrameworksKnowledge of data processing libraries such as Pandas, NumPy. SQL ProficiencyExperience writing optimized SQL queries for large-scale data analysis and transformation. Cloud PlatformsExperience working with cloud platforms like AWS, Azure, or GCP, including using cloud storage systems Preferred technical and professional experience Define, drive, and implement an architecture strategy and standards for end-to-end monitoring. Partner with the rest of the technology teams including application development, enterprise architecture, testing services, network engineering, Good to have detection and prevention tools for Company products and Platform and customer-facing

Posted 1 week ago

Apply

5.0 - 7.0 years

14 - 18 Lacs

Mumbai

Work from Office

Naukri logo

Work with broader team to build, analyze and improve the AI solutions. You will also work with our software developers in consuming different enterprise applications Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Resource should have 5-7 years of experience. Sound knowledge of Python and should know how to use the ML related services. Proficient in Python with focus on Data Analytics Packages. Strategy Analyse large, complex data sets and provide actionable insights to inform business decisions. Strategy Design and implementing data models that help in identifying patterns and trends. Collaboration Work with data engineers to optimize and maintain data pipelines. Perform quantitative analyses that translate data into actionable insights and provide analytical, data-driven decision-making. Identify and recommend process improvements to enhance the efficiency of the data platform. Develop and maintain data models, algorithms, and statistical models Preferred technical and professional experience Experience with conversation analytics. Experience with cloud technologies Experience with data exploration tools such as Tableu

Posted 1 week ago

Apply

2.0 - 6.0 years

12 - 16 Lacs

Bengaluru

Work from Office

Naukri logo

As Data Engineer, you will develop, maintain, evaluate and test big data solutions. You will be involved in the development of data solutions using Spark Framework with Python or Scala on Hadoop and AWS Cloud Data Platform Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise AWS Data Vault 2.0 development mechanism for agile data ingestion, storage and scaling Databricks for complex queries on transformation, aggregation, business logic implementation aggregation, business logic implementation AWS Redshift and Redshift spectrum, for complex queries on transformation, aggregation, business logic implementation DWH Concept on star schema, Materialize view concept. Strong SQL and data manipulation/transformation skills Preferred technical and professional experience Robust and Scalable Cloud Infrastructure End-to-End Data Engineering Pipeline Versatile Programming Capabilities

Posted 1 week ago

Apply

3.0 - 7.0 years

14 - 18 Lacs

Bengaluru

Work from Office

Naukri logo

An AI Data Scientist at IBM is not just a job title - it’s a mindset. You’ll leverage the watsonx,AWS Sagemaker,Azure Open AI platform to co-create AI value with clients, focusing on technology patterns to enhance repeatability and delight clients. We are seeking an experienced and innovative AI Data Scientist to be specialized in foundation models and large language models. In this role, you will be responsible for architecting and delivering AI solutions using cutting-edge technologies, with a strong focus on foundation models and large language models. You will work closely with customers, product managers, and development teams to understand business requirements and design custom AI solutions that address complex challenges. Experience with tools like Github Copilot, Amazon Code Whisperer etc. is desirable. Success is our passion, and your accomplishments will reflect this, driving your career forward, propelling your team to success, and helping our clients to thrive. Day-to-Day Duties: Proof of Concept (POC) DevelopmentDevelop POCs to validate and showcase the feasibility and effectiveness of the proposed AI solutions. Collaborate with development teams to implement and iterate on POCs, ensuring alignment with customer requirements and expectations. Help in showcasing the ability of Gen AI code assistant to refactor/rewrite and document code from one language to another, particularly COBOL to JAVA through rapid prototypes/ PoC Documentation and Knowledge SharingDocument solution architectures, design decisions, implementation details, and lessons learned. Create technical documentation, white papers, and best practice guides. Contribute to internal knowledge sharing initiatives and mentor new team members. Industry Trends and InnovationStay up to date with the latest trends and advancements in AI, foundation models, and large language models. Evaluate emerging technologies, tools, and frameworks to assess their potential impact on solution design and implementation Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Strong programming skills, with proficiency in Python and experience with AI frameworks such as TensorFlow, PyTorch, Keras or Hugging Face. Understanding in the usage of libraries such as SciKit Learn, Pandas, Matplotlib, etc. Familiarity with cloud platforms (e.g. Kubernetes, AWS, Azure, GCP) and related services is a plus. Experience and working knowledge in COBOL & JAVA would be preferred o Having experience in Code generation, code matching & code translation leveraging LLM capabilities would be a Big plus (e.g. Amazon Code Whisperer, Github Copilot etc.) Soft Skills: Excellent interpersonal and communication skills. Engage with stakeholders for analysis and implementation. Commitment to continuous learning and staying updated with advancements in the field of AI. Growth mindsetDemonstrate a growth mindset to understand clients' business processes and challenges. Experience in python and pyspark will be added advantage Preferred technical and professional experience ExperienceProven experience in designing and delivering AI solutions, with a focus on foundation models, large language models, exposure to open source, or similar technologies. Experience in natural language processing (NLP) and text analytics is highly desirable. Understanding of machine learning and deep learning algorithms. Strong track record in scientific publications or open-source communities Experience in full AI project lifecycle, from research and prototyping to deployment in production environments

Posted 1 week ago

Apply

4.0 - 9.0 years

12 - 16 Lacs

Kochi

Work from Office

Naukri logo

As Data Engineer, you will develop, maintain, evaluate and test big data solutions. You will be involved in the development of data solutions using Spark Framework with Python or Scala on Hadoop and AWS Cloud Data Platform Responsibilities: Experienced in building data pipelines to Ingest, process, and transform data from files, streams and databases. Process the data with Spark, Python, PySpark, Scala, and Hive, Hbase or other NoSQL databases on Cloud Data Platforms (AWS) or HDFS Experienced in develop efficient software code for multiple use cases leveraging Spark Framework / using Python or Scala and Big Data technologies for various use cases built on the platform Experience in developing streaming pipelines Experience to work with Hadoop / AWS eco system components to implement scalable solutions to meet the ever-increasing data volumes, using big data/cloud technologies Apache Spark, Kafka, any Cloud computing etc Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Minimum 4+ years of experience in Big Data technologies with extensive data engineering experience in Spark / Python or Scala. Minimum 3 years of experience on Cloud Data Platforms on AWS; Experience in AWS EMR / AWS Glue / DataBricks, AWS RedShift, DynamoDB Good to excellent SQL skills Exposure to streaming solutions and message brokers like Kafka technologies. Preferred technical and professional experience Certification in AWS and Data Bricks or Cloudera Spark Certified developers.

Posted 1 week ago

Apply

8.0 - 13.0 years

5 - 8 Lacs

Mumbai

Work from Office

Naukri logo

Role Overview : Seeking an experienced Apache Airflow specialist to design and manage data orchestration pipelines for batch/streaming workflows in a Cloudera environment. Key Responsibilities : Design, schedule, and monitor DAGs for ETL/ELT pipelines Integrate Airflow with Cloudera services and external APIs Implement retries, alerts, logging, and failure recovery Collaborate with data engineers and DevOps teams Required education Bachelor's Degree Preferred education Master's Degree Required technical and professional expertise Skills Required : Experience 3–8 years Expertise in Airflow 2.x, Python, Bash Knowledge of CI/CD for Airflow DAGs Proven experience with Cloudera CDP, Spark/Hive-based data pipelines Integration with Kafka, REST APIs, databases

Posted 1 week ago

Apply

Exploring PySpark Jobs in India

PySpark, a powerful data processing framework built on top of Apache Spark and Python, is in high demand in the job market in India. With the increasing need for big data processing and analysis, companies are actively seeking professionals with PySpark skills to join their teams. If you are a job seeker looking to excel in the field of big data and analytics, exploring PySpark jobs in India could be a great career move.

Top Hiring Locations in India

Here are 5 major cities in India where companies are actively hiring for PySpark roles: 1. Bangalore 2. Pune 3. Hyderabad 4. Mumbai 5. Delhi

Average Salary Range

The estimated salary range for PySpark professionals in India varies based on experience levels. Entry-level positions can expect to earn around INR 6-8 lakhs per annum, while experienced professionals can earn upwards of INR 15 lakhs per annum.

Career Path

In the field of PySpark, a typical career progression may look like this: 1. Junior Developer 2. Data Engineer 3. Senior Developer 4. Tech Lead 5. Data Architect

Related Skills

In addition to PySpark, professionals in this field are often expected to have or develop skills in: - Python programming - Apache Spark - Big data technologies (Hadoop, Hive, etc.) - SQL - Data visualization tools (Tableau, Power BI)

Interview Questions

Here are 25 interview questions you may encounter when applying for PySpark roles:

  • Explain what PySpark is and its main features (basic)
  • What are the advantages of using PySpark over other big data processing frameworks? (medium)
  • How do you handle missing or null values in PySpark? (medium)
  • What is RDD in PySpark? (basic)
  • What is a DataFrame in PySpark and how is it different from an RDD? (medium)
  • How can you optimize performance in PySpark jobs? (advanced)
  • Explain the difference between map and flatMap transformations in PySpark (basic)
  • What is the role of a SparkContext in PySpark? (basic)
  • How do you handle schema inference in PySpark? (medium)
  • What is a SparkSession in PySpark? (basic)
  • How do you join DataFrames in PySpark? (medium)
  • Explain the concept of partitioning in PySpark (medium)
  • What is a UDF in PySpark? (medium)
  • How do you cache DataFrames in PySpark for optimization? (medium)
  • Explain the concept of lazy evaluation in PySpark (medium)
  • How do you handle skewed data in PySpark? (advanced)
  • What is checkpointing in PySpark and how does it help in fault tolerance? (advanced)
  • How do you tune the performance of a PySpark application? (advanced)
  • Explain the use of Accumulators in PySpark (advanced)
  • How do you handle broadcast variables in PySpark? (advanced)
  • What are the different data sources supported by PySpark? (medium)
  • How can you run PySpark on a cluster? (medium)
  • What is the purpose of the PySpark MLlib library? (medium)
  • How do you handle serialization and deserialization in PySpark? (advanced)
  • What are the best practices for deploying PySpark applications in production? (advanced)

Closing Remark

As you explore PySpark jobs in India, remember to prepare thoroughly for interviews and showcase your expertise confidently. With the right skills and knowledge, you can excel in this field and advance your career in the world of big data and analytics. Good luck!

cta

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies