
5397 PySpark Jobs - Page 38

JobPe aggregates listings for easy access; you apply directly on the original job portal.

0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Source: LinkedIn

Job Title: Backend Developer - Python
Job Type: Full-time
Location: On-site, Hyderabad, Telangana, India

Job Summary: Join the team of one of our top customers as a Backend Developer and help drive scalable, high-performance solutions at the intersection of machine learning and data engineering. You'll collaborate with skilled professionals to design, implement, and maintain backend systems powering advanced AI/ML applications in a dynamic, on-site environment.

Key Responsibilities:
• Develop, test, and deploy robust backend components and microservices using Python and PySpark.
• Implement and optimize data pipelines leveraging Databricks and distributed computing frameworks.
• Design and maintain efficient databases with MySQL, ensuring data integrity and high availability.
• Integrate machine learning models into production-ready backend systems supporting AI-driven features.
• Collaborate closely with data scientists and engineers to deliver end-to-end solutions aligned with business goals.
• Monitor, troubleshoot, and enhance system performance, utilizing Redis for caching and improved scalability.
• Write clear and maintainable documentation, and communicate effectively with team members both verbally and in writing.

Required Skills and Qualifications:
• Proficiency in Python programming for backend development.
• Hands-on experience with Databricks and PySpark in a production environment.
• Strong understanding of MySQL database design, querying, and performance tuning.
• Practical background in machine learning concepts and deploying ML models.
• Experience with Redis for caching and state management.
• Excellent written and verbal communication skills, with keen attention to detail.
• Demonstrated ability to work effectively in an on-site, collaborative setting in Hyderabad.

Preferred Qualifications:
• Previous experience in high-growth AI/ML or data engineering projects.
• Familiarity with additional backend technologies or cloud platforms.
• Demonstrated leadership or mentorship in technical teams.
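For orientation, a minimal sketch of the PySpark-plus-Redis pattern this role describes: aggregate in Spark, then cache the small result in Redis for the backend to serve. The data, column names, and Redis address are illustrative, not from the posting.

```python
import json

import redis
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("backend-aggregates").getOrCreate()

# Illustrative event data standing in for a real source table.
events = spark.createDataFrame(
    [("p1", 3), ("p1", 5), ("p2", 7)],
    ["product_id", "quantity"],
)

# Aggregate in Spark, then cache the small result set in Redis so the
# backend API can serve it without re-running the job.
totals = events.groupBy("product_id").agg(F.sum("quantity").alias("total"))

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
for row in totals.collect():
    # A 10-minute TTL keeps the cache fresh between pipeline runs.
    r.setex(f"product_total:{row['product_id']}", 600, json.dumps(row.asDict()))
```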

Posted 1 week ago

Apply

6.0 - 9.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Source: LinkedIn

About Client: Our client is a market-leading company with over 30 years of experience in the industry. As one of the world's leading professional services firms, with $19.7B in revenue and 333,640 associates worldwide, it helps clients modernize technology, reimagine processes, and transform experiences, enabling them to remain competitive in our fast-paced world. Their specialties include Intelligent Process Automation, Digital Engineering, Industry & Platform Solutions, Internet of Things, Artificial Intelligence, Cloud, Data, Healthcare, Banking, Finance, Fintech, Manufacturing, Retail, Technology, and Salesforce.

Job Title: SDET
Key Skills: SDET, Java, Selenium, Cucumber, API Automation, SQL, Rest Assured, REST API, API Testing
Job Locations: Hyderabad, Pune, Bengaluru, Kolkata
Experience: 6-9 years
Education Qualification: Any Graduation
Work Mode: Hybrid
Employment Type: Contract
Notice Period: Immediate

Job Description:
What SDETs Do: A Software Development Engineer in Test (SDET) role combines software development and testing responsibilities, focusing on ensuring high-quality software through automation and collaboration. SDETs design, develop, and maintain automated test scripts for various aspects of software, including data validation, API testing, and UI testing. They work with development and product teams to identify test requirements and ensure test environments are properly set up, and they create and maintain automated test scripts using tools and frameworks such as Selenium, Playwright, Rest Assured, and Python frameworks like Pandas, PySpark, and Pytest.

Responsibilities:
• Develop and maintain automated test scripts using Java and Selenium to ensure software quality.
• Collaborate with development teams to understand application features and create comprehensive test plans.
• Execute automated test cases and analyze results to identify defects and ensure product stability.
• Provide detailed reports on test results and work with developers to resolve issues.
• Oversee the integration of automated tests into the continuous integration and delivery pipeline.
• Ensure that all test environments are properly set up and maintained for testing activities.
• Conduct code reviews and provide feedback to improve the quality of test scripts and frameworks.
• Mentor junior SDETs and provide guidance on best practices in test automation.
• Participate in requirement analysis and provide input on testability and quality risks.
• Work closely with product management to understand user requirements and ensure they are met.
• Stay updated with the latest industry trends and technologies in test automation.
• Contribute to the improvement of testing processes and methodologies.
• Ensure that all testing activities comply with company policies and standards.

Qualifications:
• Must have strong experience in automation testing using Java and Selenium.
• Should have a solid understanding of the software development life cycle and testing methodologies.
• Must possess excellent problem-solving skills and attention to detail.
• Should have experience with continuous integration and delivery tools.
• Nice to have: experience in program management.
• Must have strong communication and collaboration skills.
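Since the posting names Pytest among its frameworks, here is a minimal sketch of the kind of API test automation an SDET writes; the base URL and endpoints are placeholders for a real service under test.

```python
import requests

BASE_URL = "https://api.example.com"  # hypothetical service under test


def test_create_user_returns_201_and_echoes_payload():
    payload = {"name": "Asha", "role": "tester"}
    resp = requests.post(f"{BASE_URL}/users", json=payload, timeout=10)
    assert resp.status_code == 201
    body = resp.json()
    # The API is expected to echo the submitted fields back.
    assert body["name"] == payload["name"]


def test_get_unknown_user_returns_404():
    resp = requests.get(f"{BASE_URL}/users/does-not-exist", timeout=10)
    assert resp.status_code == 404
```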

Posted 1 week ago

Apply

0 years

0 Lacs

Pune, Maharashtra, India

On-site

Source: LinkedIn

We are seeking an experienced and strategic Data Architect to design, build, and optimize scalable, secure, and high-performance data solutions. You will play a pivotal role in shaping our data infrastructure, working with technologies such as Databricks, Azure Data Factory, SQL, and PySpark.

Key Responsibilities:
• Design and develop scalable data pipelines using Databricks and the Medallion architecture (Bronze, Silver, Gold layers).
• Write efficient PySpark and SQL code for data transformation, cleansing, and enrichment.
• Build and manage data workflows in Azure Data Factory (ADF), including triggers, linked services, and integration runtimes.
• Optimize queries and data structures for performance and cost-efficiency.
• Develop and maintain CI/CD pipelines using GitHub for automated deployment and version control.
• Collaborate with cross-functional teams to define data strategies and drive data quality initiatives.
• Implement best practices for DevOps, CI/CD, and infrastructure-as-code in data engineering.
• Troubleshoot and resolve performance bottlenecks across Spark, ADF, and Databricks pipelines.

Requirements:
• Bachelor's or master's degree in computer science, information systems, or a related field.
• Proven experience as a Data Architect or Senior Data Engineer.
• Strong knowledge of Databricks, Azure Data Factory, Spark (PySpark), and SQL.
• Hands-on experience with data governance, security frameworks, and catalog management.
• Proficiency in cloud platforms (preferably Azure).
• Experience with CI/CD tools and version control systems like GitHub.
• Strong communication and collaboration skills.
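A compact sketch of the Bronze-to-Silver hop in the Medallion pattern the responsibilities describe, assuming a Databricks-style environment with Delta Lake; paths and column names are illustrative.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("medallion-silver").getOrCreate()

# Bronze: raw ingested records, kept as-delivered.
bronze = spark.read.format("delta").load("/mnt/lake/bronze/orders")

# Silver: cleansed and conformed - typed columns, deduplicated keys,
# and obviously bad rows filtered out.
silver = (
    bronze
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    .filter(F.col("amount") > 0)
    .dropDuplicates(["order_id"])
)

silver.write.format("delta").mode("overwrite").save("/mnt/lake/silver/orders")
```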

Posted 1 week ago

Apply

8.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Source: LinkedIn

About Client: Our client is a global IT services company headquartered in Southborough, Massachusetts, USA. Founded in 1996, with revenue of $1.8B and 35,000+ associates worldwide, it specializes in digital engineering and IT services, helping clients modernize their technology infrastructure, adopt cloud and AI solutions, and accelerate innovation. It partners with major firms in banking, healthcare, telecom, and media, and is known for combining deep industry expertise with agile development practices, enabling scalable and cost-effective digital transformation. The company operates in over 50 locations across more than 25 countries, has delivery centers in Asia, Europe, and North America, and is backed by Baring Private Equity Asia.

Job Title: AWS Data Engineer
Location: Pan India
Experience: 6-8 Years
Job Type: Contract to Hire
Notice Period: Immediate Joiners
Mandatory Skills: AWS services (S3, Lambda, Redshift, Glue), Python, PySpark, SQL

Job Description:
At Storable, we're on a mission to power the future of storage. Our innovative platform helps businesses manage, track, and grow their self-storage operations, and we're looking for a Data Manager to join our data-driven team. Storable is committed to leveraging cutting-edge technologies to improve the efficiency, accessibility, and insights derived from data, empowering our team to make smarter decisions and foster impactful growth.

As a Data Manager, you will play a pivotal role in overseeing and shaping our data operations, ensuring that our data is organized, accessible, and effectively managed across the organization. You will lead a talented team, work closely with cross-functional teams, and drive the development of strategies to enhance data quality, availability, and security.

Key Responsibilities:
• Lead Data Management Strategy: Define and execute the data management vision, strategy, and best practices, ensuring alignment with Storable's business goals and objectives.
• Oversee Data Pipelines: Design, implement, and maintain scalable data pipelines using industry-standard tools to efficiently process and manage large-scale datasets.
• Ensure Data Quality & Governance: Implement data governance policies and frameworks to ensure data accuracy, consistency, and compliance across the organization.
• Manage Cross-Functional Collaboration: Partner with engineering, product, and business teams to make data accessible and actionable, and ensure it drives informed decision-making.
• Optimize Data Infrastructure: Leverage modern data tools and platforms (AWS, Apache Airflow, Apache Iceberg) to create an efficient, reliable, and scalable data infrastructure.
• Monitor & Improve Performance.
• Mentorship & Leadership: Lead and develop a team of data engineers and analysts, fostering a collaborative environment where innovation and continuous improvement are valued.

Qualifications:
• Proven Expertise in Data Management: Significant experience in managing data infrastructure, data governance, and optimizing data pipelines at scale.
• Technical Proficiency: Strong hands-on experience with data tools and platforms such as Apache Airflow, Apache Iceberg, and AWS services (S3, Lambda, Redshift, Glue).
• Data Pipeline Mastery: Familiarity with designing, implementing, and optimizing data pipelines and workflows in Python or other languages for data processing.
• Experience with Data Governance: Solid understanding of data privacy, quality control, and governance best practices.
• Leadership Skills: Ability to lead and mentor teams, influence stakeholders, and drive data initiatives across the organization.
• Analytical Mindset: Strong problem-solving abilities and a data-driven approach to improving business operations.
• Excellent Communication: Ability to communicate complex data concepts to both technical and non-technical stakeholders effectively.
• Bonus Points: Experience with visualization tools (Looker, Tableau) and reporting frameworks to provide actionable insights.
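For context, a minimal AWS Glue job skeleton in PySpark of the kind the mandatory skills imply; the catalog database, table name, and S3 bucket are hypothetical.

```python
import sys

from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from the Glue Data Catalog (hypothetical database/table).
source = glue_context.create_dynamic_frame.from_catalog(
    database="analytics", table_name="raw_events"
)

# Drop records missing a primary key, then land the result in S3 as Parquet.
cleaned = DynamicFrame.fromDF(
    source.toDF().dropna(subset=["event_id"]), glue_context, "cleaned"
)
glue_context.write_dynamic_frame.from_options(
    frame=cleaned,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/curated/events/"},
    format="parquet",
)
job.commit()
```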

Posted 1 week ago

Apply

5.0 years

0 Lacs

Indore, Madhya Pradesh, India

On-site

Source: LinkedIn

Project Role: Application Developer
Project Role Description: Design, build and configure applications to meet business process and application requirements.
Must Have Skills: AWS Glue
Good to Have Skills: NA
Minimum Experience Required: 5 Year(s)
Educational Qualification: BTech

Summary: As an Application Developer, you will be responsible for designing, building, and configuring applications to meet business process and application requirements using AWS Glue. Your typical day will involve working with PySpark and collaborating with cross-functional teams to deliver impactful data-driven solutions.

Roles & Responsibilities:
• Design, build, and configure applications using AWS Glue to meet business process and application requirements.
• Collaborate with cross-functional teams to deliver impactful data-driven solutions.
• Develop and maintain technical documentation related to application development.
• Troubleshoot and debug application issues to ensure optimal performance and functionality.

Professional & Technical Skills:
• Must-Have Skills: Experience in AWS Glue.
• Good-to-Have Skills: Experience in PySpark.
• Strong understanding of application development principles and methodologies.
• Experience in troubleshooting and debugging application issues.
• Experience in developing and maintaining technical documentation related to application development.

Additional Information:
• The candidate should have a minimum of 5 years of experience in AWS Glue.
• The ideal candidate will possess a strong educational background in software engineering, computer science, or a related field.
• This position is based at our Bengaluru office.
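A small sketch of driving a Glue job from Python with boto3, complementary to authoring the job itself; the job name, region, and argument are hypothetical.

```python
import time

import boto3

glue = boto3.client("glue", region_name="ap-south-1")

# Start a run of a (hypothetical) deployed Glue job with one job argument.
run = glue.start_job_run(
    JobName="nightly-etl", Arguments={"--target_date": "2025-06-01"}
)
run_id = run["JobRunId"]

# Poll until the run reaches a terminal state.
while True:
    state = glue.get_job_run(JobName="nightly-etl", RunId=run_id)["JobRun"][
        "JobRunState"
    ]
    if state in ("SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"):
        break
    time.sleep(30)

print(f"Run {run_id} finished with state {state}")
```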

Posted 1 week ago

Apply

2.0 - 3.0 years

0 Lacs

Pune, Maharashtra, India

Remote

Source: LinkedIn

Description: The Data Engineer supports, develops, and maintains a data and analytics platform to efficiently process, store, and make data available to analysts and other consumers. This role collaborates with Business and IT teams to understand requirements and best leverage technologies for agile data delivery at scale.

Note: Even though the role is categorized as Remote, it will follow a hybrid work model.

Key Responsibilities:
• Implement and automate deployment of distributed systems for ingesting and transforming data from various sources (relational, event-based, unstructured).
• Develop and operate large-scale data storage and processing solutions using cloud-based platforms (e.g., Data Lakes, Hadoop, HBase, Cassandra, MongoDB, DynamoDB).
• Ensure data quality and integrity through continuous monitoring and troubleshooting.
• Implement data governance processes, managing metadata, access, and data retention.
• Develop scalable, efficient, and quality data pipelines with monitoring and alert mechanisms.
• Design and implement physical data models and storage architectures based on best practices.
• Analyze complex data elements and systems, data flow, dependencies, and relationships to contribute to conceptual, physical, and logical data models.
• Participate in testing and troubleshooting of data pipelines.
• Utilize agile development technologies such as DevOps, Scrum, and Kanban for continuous improvement in data-driven applications.

Qualifications, Skills, and Experience:

Must-Have:
• 2-3 years of experience in data engineering with expertise in Azure Databricks and Scala/Python.
• Hands-on experience with Spark (Scala/PySpark) and SQL.
• Strong understanding of Spark Streaming, Spark internals, and query optimization.
• Proficiency in Azure Cloud Services.
• Agile development experience.
• Experience in unit testing of ETL pipelines.
• Expertise in creating ETL pipelines integrating ML models.
• Knowledge of Big Data storage strategies (optimization and performance).
• Strong problem-solving skills.
• Basic understanding of data models (SQL/NoSQL), including Delta Lake or Lakehouse.
• Exposure to Agile software development methodologies.
• Quick learner with adaptability to new technologies.

Nice-to-Have:
• Understanding of the ML lifecycle.
• Exposure to Big Data open-source technologies.
• Experience with clustered compute cloud-based implementations.
• Familiarity with developing applications requiring large file movement in cloud environments.
• Experience in building analytical solutions.
• Exposure to IoT technology.

Competencies:
• System Requirements Engineering: Translates stakeholder needs into verifiable requirements.
• Collaborates: Builds partnerships and works collaboratively with others.
• Communicates Effectively: Develops and delivers clear communications for various audiences.
• Customer Focus: Builds strong customer relationships and delivers customer-centric solutions.
• Decision Quality: Makes timely and informed decisions to drive progress.
• Data Extraction: Performs ETL activities from various sources using appropriate tools and technologies.
• Programming: Writes and tests computer code using industry standards, tools, and automation.
• Quality Assurance Metrics: Applies measurement science to assess solution effectiveness.
• Solution Documentation: Documents and communicates solutions to enable knowledge transfer.
• Solution Validation Testing: Ensures configuration changes meet design and customer requirements.
• Data Quality: Identifies and corrects data flaws to support governance and decision-making.
• Problem Solving: Uses systematic analysis to identify and resolve issues effectively.
• Values Differences: Recognizes and values diverse perspectives and cultures.

Qualifications (Education, Licenses, and Certifications): College, university, or equivalent degree in a relevant technical discipline, or equivalent experience required. This position may require licensing for compliance with export controls or sanctions regulations.

Work Schedule: Work primarily with stakeholders in the US, requiring a 2-3 hour overlap during EST hours as needed.

Job: Systems/Information Technology
Organization: Cummins Inc.
Role Category: Remote
Job Type: Exempt - Experienced
ReqID: 2411641
Relocation Package: No
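A compact Structured Streaming sketch of the Spark Streaming work described above, assuming a Databricks/Delta environment; paths and the telemetry schema are illustrative.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("stream-etl").getOrCreate()

# Incrementally pick up new JSON files as they land (schema given explicitly,
# since streaming reads cannot infer it at runtime).
raw = (
    spark.readStream
    .schema("device_id STRING, temp DOUBLE, ts TIMESTAMP")
    .json("/mnt/landing/telemetry/")
)

# Windowed aggregation with a watermark to bound state for late data.
per_device = (
    raw.withWatermark("ts", "10 minutes")
    .groupBy(F.window("ts", "5 minutes"), "device_id")
    .agg(F.avg("temp").alias("avg_temp"))
)

query = (
    per_device.writeStream
    .outputMode("append")
    .option("checkpointLocation", "/mnt/chk/telemetry/")
    .format("delta")
    .start("/mnt/lake/silver/telemetry_5m")
)
```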

Posted 1 week ago

Apply

5.0 years

0 Lacs

Ahmedabad, Gujarat, India

On-site

Source: LinkedIn

Project Role: Application Developer
Project Role Description: Design, build and configure applications to meet business process and application requirements.
Must Have Skills: Databricks Unified Data Analytics Platform
Good to Have Skills: PySpark, Microsoft Azure Databricks
Minimum Experience Required: 5 Year(s)
Educational Qualification: 15 years full-time education

Summary: As an Application Developer, you will be responsible for designing, building, and configuring applications to meet business process and application requirements. Your typical day will involve collaborating with teams to develop innovative solutions and streamline processes.

Roles & Responsibilities:
• Expected to be an SME.
• Collaborate and manage the team to perform.
• Responsible for team decisions.
• Engage with multiple teams and contribute on key decisions.
• Provide solutions to problems for their immediate team and across multiple teams.
• Lead the development and implementation of new applications.
• Conduct code reviews and ensure coding standards are met.
• Stay updated on industry trends and best practices.

Professional & Technical Skills:
• Must-Have Skills: Proficiency in Databricks Unified Data Analytics Platform.
• Good-to-Have Skills: Experience with PySpark.
• Strong understanding of data engineering concepts.
• Experience in building and optimizing data pipelines.
• Knowledge of cloud platforms like Microsoft Azure.
• Familiarity with data governance and security practices.

Additional Information:
• The candidate should have a minimum of 5 years of experience in Databricks Unified Data Analytics Platform.
• This position is based at our Bengaluru office.
• 15 years of full-time education is required.

Posted 1 week ago

Apply

5.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Source: LinkedIn

Skills: Glue, Python, Lambda, DynamoDB, SQL, Django, AWS

Job Title: Python Developer (Contract + Extension)
Experience: 5+ years
Location: Bangalore (Hybrid)
Notice Period: Immediate Joiners Only

Roles and Responsibilities:
• Design, develop, and maintain scalable backend systems using Python and AWS services.
• Collaborate with cross-functional teams to identify requirements and develop solutions.
• Work with AWS services such as Glue, Lambda, DynamoDB, S3, and PySpark to build data pipelines, process large datasets, and develop serverless applications.
• Ensure seamless integration of AWS services with existing systems.
• Identify and resolve production issues efficiently, using strong debugging skills.
• Collaborate with QA teams to identify and fix defects.
• Design, develop, and maintain database schemas for MySQL and NoSQL databases.
• Ensure data consistency, integrity, and performance.
• Follow coding best practices, ensuring high-quality code and adherence to coding standards.
• Participate in code reviews and contribute to the improvement of the codebase.

Requirements:
• 5+ years of experience in Python development, with a strong focus on backend development.
• Strong knowledge of AWS services, including Glue, Lambda, DynamoDB, S3, and PySpark.
• Experience with MySQL and NoSQL databases.
• Excellent debugging skills and problem-solving abilities.
• Strong communication and collaboration skills.
• Ability to work in a hybrid environment, with flexibility to adapt to changing requirements.
• Strong analytical and problem-solving skills.
• Bachelor's degree in Computer Science, Engineering, or a related field.
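A minimal sketch of the serverless pattern in this stack: an AWS Lambda handler reading from DynamoDB. The table name and the API Gateway-style event shape are assumptions for illustration.

```python
import json

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("orders")  # hypothetical table


def lambda_handler(event, context):
    # Assumes an API Gateway proxy integration event with a path parameter.
    order_id = event["pathParameters"]["order_id"]
    resp = table.get_item(Key={"order_id": order_id})
    item = resp.get("Item")
    if item is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    # default=str handles DynamoDB's Decimal values during serialization.
    return {"statusCode": 200, "body": json.dumps(item, default=str)}
```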

Posted 1 week ago

Apply

5.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Source: LinkedIn

Skills: Python, Django, CRON, AWS, CI/CD, MySQL

Greetings from ALIQAN Technologies! We are hiring a Python Developer for one of our clients.

Job Title: Python Developer
Experience: 5+ Years
Job Type: 6-month contract + extension
Location: Bangalore (Hybrid)
Notice Period: Immediate Joiners Only

Job Description:
• Python development, backend experience.
• Strong knowledge of AWS services (Glue, Lambda, DynamoDB, S3, PySpark).
• Excellent debugging skills to resolve production issues.
• Experience with MySQL and NoSQL databases.

Optional Skills:
• Experience with Django and CRON jobs.
• Familiarity with data lakes, big data tools, and CI/CD.

Posted 1 week ago

Apply

1.0 years

0 Lacs

Bengaluru, Karnataka, India

Remote

Source: LinkedIn

As part of the Astellas commitment to delivering value for our patients, our organization is currently undergoing transformation to achieve this critical goal. This is an opportunity to work on digital transformation and make a real impact within a company dedicated to improving lives. DigitalX, our new information technology function, is spearheading this value-driven transformation across Astellas. We are looking for people who excel in embracing change, manage technical challenges, and have exceptional communication skills. This position is based in Bengaluru and will require some on-site work.

Purpose and Scope:
As a Junior Data Engineer, you will play a crucial role in assisting in the design, build, and maintenance of our data infrastructure, focusing on BI and DWH capabilities. Working with the Senior Data Engineer, your foundational expertise in BI, Databricks, PySpark, SQL, Talend, and other related technologies will be instrumental in driving data-driven decision-making across the organization. You will play a pivotal role in building, maintaining, and enhancing our systems across the organization. This is a fantastic global opportunity to use your proven agile delivery skills across a diverse range of initiatives, utilize your development skills, and contribute to the continuous improvement/delivery of critical IT solutions.

Essential Job Responsibilities:
• Collaborate with FoundationX Engineers to design and maintain scalable data systems.
• Assist in building robust infrastructure using technologies like PowerBI, Qlik or alternatives, Databricks, PySpark, and SQL.
• Contribute to ensuring system reliability by incorporating accurate business-driving data.
• Gain experience in BI engineering through hands-on projects.
• Data Modelling and Integration: Collaborate with cross-functional teams to analyse requirements and create technical designs, data models, and migration strategies. Design, build, and maintain physical databases, dimensional data models, and ETL processes specific to pharmaceutical data.
• Cloud Expertise: Evaluate and influence the selection of cloud-based technologies such as Azure, AWS, or Google Cloud. Implement data warehousing solutions in a cloud environment, ensuring scalability and security.
• BI Expertise: Leverage and create PowerBI, Qlik, or equivalent technology for data visualization, dashboards, and self-service analytics.
• Data Pipeline Development: Design, build, and optimize data pipelines using Databricks and PySpark. Ensure data quality, reliability, and scalability.
• Application Transition: Support the migration of internal applications to Databricks (or equivalent) based solutions. Collaborate with application teams to ensure a seamless transition.
• Mentorship and Leadership: Lead and mentor junior data engineers. Share best practices, provide technical guidance, and foster a culture of continuous learning.
• Data Strategy Contribution: Contribute to the organization's data strategy by identifying opportunities for data-driven insights and improvements.
• Participate in smaller focused mission teams to deliver value-driven solutions aligned to our global and bold move priority initiatives and beyond.
• Design, develop, and implement robust and scalable data analytics using modern technologies.
• Collaborate with cross-functional teams and practices across the organization, including Commercial, Manufacturing, Medical, DataX, GrowthX, and support other X (transformation) Hubs and Practices as appropriate, to understand user needs and translate them into technical solutions.
• Provide technical support to internal users, troubleshooting complex issues and restoring system uptime as soon as possible.
• Champion continuous improvement initiatives, identifying opportunities to optimise performance, security, and maintainability of existing data and platform architecture and other technology investments.
• Participate in the continuous delivery pipeline, adhering to DevOps best practices for version control, automation, and deployment. Ensure effective management of the FoundationX backlog.
• Leverage your knowledge of data engineering principles to integrate with existing data pipelines and explore new possibilities for data utilization.
• Stay up to date on the latest trends and technologies in data engineering and cloud platforms.

Qualifications (Required):
• Bachelor's degree in computer science, information technology, or a related field (master's preferred), or equivalent experience.
• 1-3+ years of experience in data engineering with a strong understanding of BI technologies, PySpark, and SQL, building data pipelines and optimization.
• 1-3+ years of experience with data engineering and integration tools (e.g., Databricks, Change Data Capture).
• 1-3+ years of experience utilizing cloud platforms (AWS, Azure, GCP). A deeper understanding/certification of AWS and Azure is considered a plus.
• Experience with relational and non-relational databases.
• Any relevant cloud-based integration certification at foundational level or above (any Qlik or BI certification, AWS Certified DevOps Engineer, AWS Certified Developer, any Microsoft Certified Azure qualification, proficiency in RESTful APIs, AWS, CDMP, MDM, DBA, SQL, SAP, TOGAF, API, CISSP, VCP, or any relevant certification).
• Experience in MuleSoft (Anypoint platform, its components, designing and managing API-led connectivity solutions).
• Experience in AWS (environment, services, and tools), developing code in at least one high-level programming language.
• Experience with continuous integration and continuous delivery (CI/CD) methodologies and tools.
• Experience with Azure services related to computing, networking, storage, and security.
• Understanding of cloud integration patterns and Azure integration services such as Logic Apps, Service Bus, and API Management.

Preferred:
• Subject Matter Expertise: a strong understanding of data architecture/engineering/operations/reporting within the Life Sciences/Pharma industry across Commercial, Manufacturing, and Medical domains. Other complex and highly regulated industry experience will be considered across diverse areas like Commercial, Manufacturing, and Medical.
• Data Analysis and Automation Skills: proficient in identifying, standardizing, and automating critical reporting metrics and modelling tools.
• Analytical Thinking: demonstrated ability to lead ad hoc analyses, identify performance gaps, and foster a culture of continuous improvement.
• Technical Proficiency: strong coding skills in SQL, R, and/or Python, coupled with expertise in machine learning techniques, statistical analysis, and data visualization.
• Agile Champion: adherence to DevOps principles and a proven track record with CI/CD pipelines for continuous delivery.

Working Environment:
At Astellas we recognize the importance of work/life balance, and we are proud to offer a hybrid working solution allowing time to connect with colleagues at the office with the flexibility to also work from home. We believe this will optimize the most productive work environment for all employees to succeed and deliver. Hybrid work from certain locations may be permitted in accordance with Astellas' Responsible Flexibility Guidelines.

Category: FoundationX

Astellas is committed to equality of opportunity in all aspects of employment. EOE including Disability/Protected Veterans

Posted 1 week ago

Apply

5.0 years

0 Lacs

Bangalore Urban, Karnataka, India

On-site

Source: LinkedIn

Skills: AWS, MySQL, Backend Development, CI/CD, Agile Methodologies, Python, SQL

Job Title: Python Developer
Experience: 5+ Years
Job Type: 6-month contract + extension
Location: Bangalore (Hybrid)
Notice Period: Immediate Joiners Only

Job Description:
• Python development, backend experience.
• Strong knowledge of AWS services (Glue, Lambda, DynamoDB, S3, PySpark).
• Excellent debugging skills to resolve production issues.
• Experience with MySQL and NoSQL databases.

Optional Skills:
• Experience with Django and CRON jobs.
• Familiarity with data lakes, big data tools, and CI/CD.

Posted 1 week ago

Apply

8.0 - 10.0 years

30 - 35 Lacs

Pune, Chennai, Bengaluru

Work from Office

Source: Naukri

Role & Responsibilities: AWS Architect
Primary skills: AWS (Redshift, Glue, Lambda, ETL, and Aurora), advanced SQL and Python, PySpark
Note: Aurora Database is a mandatory skill
Experience: 8+ yrs
Notice period: Immediate joiner
Location: Any Brillio location (preferred is Bangalore)

Job Description:
• 8+ years of IT experience with deep expertise in S3, Redshift, Aurora, Glue, and Lambda services.
• At least one instance of proven experience in developing a data platform end to end using AWS.
• Hands-on programming experience with Data Frames, Python, and unit testing the Python as well as Glue code.
• Experience in orchestration mechanisms like Airflow, Step Functions, etc.
• Experience working on AWS Redshift is mandatory. Must have experience writing stored procedures, an understanding of the Redshift Data API, and experience writing federated queries.
• Experience in Redshift performance tuning.
• Good communication and problem-solving skills; very good stakeholder communication and management.
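A sketch of the Redshift Data API usage the description calls for, using boto3's redshift-data client; the cluster, database, user, and stored procedure are placeholders.

```python
import time

import boto3

client = boto3.client("redshift-data", region_name="ap-south-1")

# Submit a statement; here, calling a hypothetical stored procedure.
stmt = client.execute_statement(
    ClusterIdentifier="analytics-cluster",
    Database="dev",
    DbUser="etl_user",
    Sql="CALL refresh_daily_sales();",
)

# The Data API is asynchronous: poll until the statement finishes.
while True:
    desc = client.describe_statement(Id=stmt["Id"])
    if desc["Status"] in ("FINISHED", "FAILED", "ABORTED"):
        break
    time.sleep(2)

print(desc["Status"])
```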

Posted 1 week ago

Apply

0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Source: LinkedIn

Palantir Tech Lead
Skills: Python, PySpark, and Palantir
Need a strong hands-on lead engineer. Onsite in Hyderabad. Experience: 15+ years

Tasks and Responsibilities:
• Leads data engineering activities on moderate to complex data and analytics-centric problems which have broad impact and require in-depth analysis to obtain desired results; assemble, enhance, maintain, and optimize current data assets to enable cost savings and meet individual project or enterprise maturity objectives.
• Advanced working knowledge of SQL, Python, and PySpark.
• Experience using tools such as Git/Bitbucket, Jenkins/CodeBuild, and CodePipeline.
• Experience with platform monitoring and alerting tools.
• Work closely with Subject Matter Experts (SMEs) to design and develop Foundry front-end applications with the ontology (data model) and data pipelines supporting the applications.
• Implement data transformations to derive new datasets or create Foundry Ontology Objects necessary for business applications.
• Implement operational applications using Foundry tools (Workshop, Map, and/or Slate).
• Actively participate in agile/scrum ceremonies (stand-ups, planning, retrospectives, etc.).
• Create and maintain documentation describing the data catalog and data objects.
• Maintain applications as usage grows and requirements change.
• Promote a continuous improvement mindset by engaging in after-action reviews and sharing learnings.
• Use communication skills, especially for explaining technical concepts to non-technical business leaders.
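A minimal sketch of a Foundry Python transform of the kind described, runnable only inside a Foundry code repository; the dataset paths are placeholders.

```python
from pyspark.sql import functions as F
from transforms.api import Input, Output, transform_df


@transform_df(
    Output("/Company/ontology/clean_customers"),
    raw=Input("/Company/raw/customers"),
)
def clean_customers(raw):
    # Derive a conformed dataset backing a hypothetical ontology object type.
    return (
        raw.filter(F.col("customer_id").isNotNull())
        .withColumn("email", F.lower("email"))
        .dropDuplicates(["customer_id"])
    )
```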

Posted 1 week ago

Apply

5.0 - 8.0 years

0 - 1 Lacs

Hyderabad

Hybrid

Source: Naukri

Location: Hyderabad (Hybrid)
Please share your resume with +91 9361912009

Roles and Responsibilities:
• Deep understanding of Linux, networking, and security fundamentals.
• Experience working with the AWS cloud platform and infrastructure.
• Experience working with infrastructure as code using Terraform or Ansible.
• Experience managing large BigData clusters in production (at least one of Cloudera, Hortonworks, EMR).
• Excellent knowledge and solid work experience providing observability for BigData platforms using tools like Prometheus, InfluxDB, Dynatrace, Grafana, Splunk, etc.
• Expert knowledge of the Hadoop Distributed File System (HDFS) and Hadoop YARN.
• Decent knowledge of various Hadoop file formats like ORC, Parquet, Avro, etc.
• Deep understanding of Hive (Tez), Hive LLAP, Presto, and Spark compute engines.
• Ability to understand query plans and optimize performance for complex SQL queries on Hive and Spark.
• Experience supporting Spark with Python (PySpark) and R (sparklyr, SparkR) languages.
• Solid professional coding experience with at least one scripting language, such as Shell or Python.
• Experience working with Data Analysts, Data Scientists, and at least one related analytical application like SAS, RStudio, JupyterHub, H2O, etc.
• Able to read and understand code (Java, Python, R, Scala), with expertise in at least one scripting language like Python or Shell.

Nice-to-have skills:
• Experience with workflow management tools like Airflow, Oozie, etc.
• Knowledge of analytical libraries like Pandas, NumPy, SciPy, PyTorch, etc.
• Implementation history of Packer, Chef, Jenkins, or other similar tooling.
• Prior working knowledge of Active Directory and Windows OS-based VDI platforms like Citrix, AWS Workspaces, etc.

Posted 1 week ago

Apply

4.0 - 6.0 years

8 - 18 Lacs

Bengaluru

Work from Office

Source: Naukri

We are seeking a skilled Data Engineer & Data Analyst with over 4 years of experience to design, build, and maintain scalable data pipelines and perform advanced data analysis to support business intelligence and data-driven decision-making. The ideal candidate will have a strong foundation in computer science principles, extensive experience with SQL and big data tools, and proficiency in cloud platforms and data visualization tools.

Key Responsibilities:
• Design, develop, and maintain robust, scalable ETL pipelines using Apache Airflow, DBT, Composer (GCP), Control-M, Cron, Luigi, and similar tools.
• Build and optimize data architectures including data lakes and data warehouses.
• Integrate data from multiple sources, ensuring data quality and consistency.
• Collaborate with data scientists, analysts, and stakeholders to translate business requirements into technical solutions.
• Analyze complex datasets to identify trends, generate actionable insights, and support decision-making.
• Develop and maintain dashboards and reports using Tableau, Power BI, and Jupyter Notebooks for visualization and pipeline validation.
• Manage and optimize relational and NoSQL databases such as MySQL, PostgreSQL, Oracle, MongoDB, and DynamoDB.
• Work with big data tools and frameworks including Hadoop, Spark, Hive, Kafka, Informatica, Talend, SSIS, and Dataflow.
• Utilize cloud data services and warehouses like AWS Glue, GCP Dataflow, Azure Data Factory, Snowflake, Redshift, and BigQuery.
• Support CI/CD pipelines and DevOps workflows using Git, Docker, Terraform, and related tools.
• Ensure data governance, security, and compliance standards are met.
• Participate in Agile and DevOps processes to enhance data engineering workflows.

Required Qualifications:
• 4+ years of professional experience in data engineering and data analysis roles.
• Strong proficiency in SQL and experience with database management systems such as MySQL, PostgreSQL, Oracle, and MongoDB.
• Hands-on experience with big data tools like Hadoop and Apache Spark.
• Proficient in Python programming.
• Experience with data visualization tools such as Tableau, Power BI, and Jupyter Notebooks.
• Proven ability to design, build, and maintain scalable ETL pipelines using tools like Apache Airflow, DBT, Composer (GCP), Control-M, Cron, and Luigi.
• Familiarity with data engineering tools including Hive, Kafka, Informatica, Talend, SSIS, and Dataflow.
• Experience working with cloud data warehouses and services (Snowflake, Redshift, BigQuery, AWS Glue, GCP Dataflow, Azure Data Factory).
• Understanding of data modeling concepts and data lake/data warehouse architectures.
• Experience supporting CI/CD practices with Git, Docker, Terraform, and DevOps workflows.
• Knowledge of both relational and NoSQL databases, including PostgreSQL, BigQuery, MongoDB, and DynamoDB.
• Exposure to Agile and DevOps methodologies.
• Experience with at least one cloud platform:
  Google Cloud Platform (BigQuery, Dataflow, Composer, Cloud Storage, Pub/Sub)
  Amazon Web Services (S3, Glue, Redshift, Lambda, Athena)
  Microsoft Azure (Data Factory, Synapse Analytics, Blob Storage)

Preferred Skills:
• Strong problem-solving and communication skills.
• Ability to work independently and collaboratively in a team environment.
• Experience with service development, REST APIs, and automation testing is a plus.
• Familiarity with version control systems and workflow automation.
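A compact Airflow DAG sketch of the ETL orchestration listed first among the responsibilities; the task callables are stubs and the daily schedule (Airflow 2.4+ `schedule` argument) is an assumption.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():  # stub: pull from a source system
    ...


def transform():  # stub: clean and conform
    ...


def load():  # stub: publish to the warehouse
    ...


with DAG(
    dag_id="daily_sales_etl",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)
    # Linear dependency chain: extract, then transform, then load.
    t1 >> t2 >> t3
```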

Posted 1 week ago

Apply

5.0 - 9.0 years

10 - 20 Lacs

Hyderabad, Pune, Bengaluru

Hybrid

Source: Naukri

We are looking for Azure Data Engineer resources with a minimum of 5 to 9 years of experience.

To apply, use the link below:
https://career.infosys.com/jobdesc?jobReferenceCode=INFSYS-EXTERNAL-210775&rc=0

Role & Responsibilities:
• Blend technical expertise (5 to 9 years of experience) with analytical problem-solving and collaboration with cross-functional teams.
• Design and implement Azure data engineering solutions (ingestion & curation).
• Create and maintain Azure data solutions including Azure SQL Database, Azure Data Lake, and Azure Blob Storage.
• Design, implement, and maintain data pipelines for data ingestion, processing, and transformation in Azure.
• Create and maintain ETL (Extract, Transform, Load) operations utilizing Azure Data Factory or comparable technologies.
• Use Azure Data Factory and Databricks to assemble large, complex data sets.
• Ensure the quality, integrity, and dependability of the data by implementing data validation and cleansing procedures.
• Ensure data quality/security and compliance.
• Optimize Azure SQL databases for efficient query performance.
• Collaborate with data engineers and other stakeholders to understand requirements and translate them into scalable and reliable data platform architectures.

Posted 1 week ago

Apply

8.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

Source: LinkedIn

Kyndryl Data Science
Bengaluru, Karnataka, India; Chennai, Tamil Nadu, India
Posted on Jun 9, 2025

Who We Are
At Kyndryl, we design, build, manage and modernize the mission-critical technology systems that the world depends on every day. So why work at Kyndryl? We are always moving forward – always pushing ourselves to go further in our efforts to build a more equitable, inclusive world for our employees, our customers and our communities.

The Role
As a GCP Data Engineer at Kyndryl, you will be responsible for designing and developing data pipelines, participating in architectural discussions, and implementing data solutions in a cloud environment using GCP data services. You will collaborate with global architects and business teams to design and deploy innovative solutions, supporting data analytics, automation, and transformation needs.

Responsibilities
• Design, develop, and maintain scalable data pipelines using GCP services such as BigQuery, Dataflow, Pub/Sub, and Cloud Storage.
• Participate in architectural discussions, conduct system analysis, and suggest optimal solutions that are scalable, future-proof, and aligned with business requirements.
• Collaborate with stakeholders to gather requirements and create high-level and detailed technical designs.
• Design data models suitable for both transactional and big data environments, supporting Machine Learning workflows.
• Build and optimize ETL/ELT infrastructure using a variety of data sources and GCP services.
• Develop and maintain Python/PySpark code for data processing and integrate with GCP services for seamless data operations.
• Develop and optimize SQL queries for data analysis and reporting.
• Monitor and troubleshoot data pipeline issues to ensure timely resolution.
• Implement data governance and security best practices within GCP.
• Perform data quality checks and validation to ensure accuracy and consistency.
• Support DevOps automation efforts to ensure smooth integration and deployment of data pipelines.
• Provide design expertise in Master Data Management (MDM), Data Quality, and Metadata Management.
• Provide technical support and guidance to junior data engineers and other team members.
• Participate in code reviews and contribute to continuous improvement of data engineering practices.
• Implement best practices for cost management and resource utilization within GCP.

If you're ready to embrace the power of data to transform our business and embark on an epic data adventure, then join us at Kyndryl. Together, let's redefine what's possible and unleash your potential.

Your Future at Kyndryl
Every position at Kyndryl offers a way forward to grow your career. We have opportunities that you won't find anywhere else, including hands-on experience, learning opportunities, and the chance to certify in all four major platforms. Whether you want to broaden your knowledge base or narrow your scope and specialize in a specific sector, you can find your opportunity here.

Who You Are
You're good at what you do and possess the required experience to prove it. However, equally as important – you have a growth mindset, keen to drive your own personal and professional development. You are customer-focused – someone who prioritizes customer success in their work. And finally, you're open and borderless – naturally inclusive in how you work with others.

Required Technical and Professional Experience
• Bachelor's or master's degree in computer science, engineering, or a related field with over 8 years of experience in data engineering.
• More than 3 years of experience with the GCP data ecosystem.
• Hands-on experience and strong proficiency in GCP components such as Dataflow, Dataproc, BigQuery, Cloud Functions, Composer, and Data Fusion.
• Excellent command of SQL, with the ability to write complex queries and perform advanced data transformation.
• Strong programming skills in PySpark and/or Python, specifically for building cloud-native data pipelines.
• Familiarity with GCP tools like Looker, Airflow DAGs, Data Studio, App Maker, etc.
• Hands-on experience implementing enterprise-wide cloud data lake and data warehouse solutions on GCP.
• Knowledge of data governance, security, and compliance best practices.
• Experience with private and public cloud architectures, pros/cons, and migration considerations.
• Excellent problem-solving, analytical, and critical thinking skills.
• Ability to manage multiple projects simultaneously while maintaining a high level of attention to detail.
• Communication skills: must be able to communicate with both technical and non-technical audiences and derive technical requirements with stakeholders.
• Ability to work independently and in agile teams.

Preferred Technical and Professional Experience
• GCP Data Engineer Certification is highly preferred.
• Professional certification, e.g., Open Certified Technical Specialist with Data Engineering Specialization.
• Experience working as a Data Engineer and/or in cloud modernization.
• Knowledge of Databricks and Snowflake for data analytics.
• Experience with NoSQL databases.
• Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
• Familiarity with BI dashboards and Google Data Studio is a plus.

Being You
Diversity is a whole lot more than what we look like or where we come from, it's how we think and who we are. We welcome people of all cultures, backgrounds, and experiences. But we're not doing it single-handedly: our Kyndryl Inclusion Networks are only one of many ways we create a workplace where all Kyndryls can find and provide support and advice. This dedication to welcoming everyone into our company means that Kyndryl gives you – and everyone next to you – the ability to bring your whole self to work, individually and collectively, and support the activation of our equitable culture. That's the Kyndryl Way.

What You Can Expect
With state-of-the-art resources and Fortune 100 clients, every day is an opportunity to innovate, build new capabilities, new relationships, new processes, and new value. Kyndryl cares about your well-being and prides itself on offering benefits that give you choice, reflect the diversity of our employees, and support you and your family through the moments that matter – wherever you are in your life journey. Our employee learning programs give you access to the best learning in the industry to receive certifications, including Microsoft, Google, Amazon, Skillsoft, and many more. Through our company-wide volunteering and giving platform, you can donate, start fundraisers, volunteer, and search over 2 million non-profit organizations. At Kyndryl, we invest heavily in you; we want you to succeed so that together, we will all succeed.

Get Referred!
If you know someone who works at Kyndryl, when asked "How Did You Hear About Us" during the application process, select "Employee Referral" and enter your contact's Kyndryl email address.
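A small sketch of the BigQuery-from-Spark pattern these responsibilities describe, assuming the spark-bigquery connector is available (it is preinstalled on Dataproc); project, dataset, and bucket names are placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("gcp-pipeline").getOrCreate()

# Read a BigQuery table through the connector (hypothetical table).
orders = (
    spark.read.format("bigquery")
    .option("table", "my-project.sales.orders")
    .load()
)

daily = orders.groupBy(F.to_date("created_at").alias("day")).count()

# Writing back requires a GCS bucket for the connector's temporary files.
(
    daily.write.format("bigquery")
    .option("table", "my-project.sales.orders_daily")
    .option("temporaryGcsBucket", "my-temp-bucket")
    .mode("overwrite")
    .save()
)
```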

Posted 1 week ago

Apply

5.0 - 8.0 years

5 - 9 Lacs

Bengaluru

Work from Office

Source: Naukri

Wipro Limited (NYSE: WIT, BSE: 507685, NSE: WIPRO) is a leading technology services and consulting company focused on building innovative solutions that address clients' most complex digital transformation needs. Leveraging our holistic portfolio of capabilities in consulting, design, engineering, and operations, we help clients realize their boldest ambitions and build future-ready, sustainable businesses. With over 230,000 employees and business partners across 65 countries, we deliver on the promise of helping our customers, colleagues, and communities thrive in an ever-changing world. For additional information, visit us at www.wipro.com.

About The Role
Role Purpose: The purpose of the role is to support process delivery by ensuring daily performance of the Production Specialists, resolving technical escalations, and developing technical capability within the Production Specialists.

Do:
• Oversee and support the process by reviewing daily transactions on performance parameters.
• Review the performance dashboard and the scores for the team.
• Support the team in improving performance parameters by providing technical support and process guidance.
• Record, track, and document all queries received, problem-solving steps taken, and total successful and unsuccessful resolutions.
• Ensure standard processes and procedures are followed to resolve all client queries.
• Resolve client queries as per the SLAs defined in the contract.
• Develop an understanding of the process/product for the team members to facilitate better client interaction and troubleshooting.
• Document and analyze call logs to spot the most frequently occurring trends to prevent future problems.
• Identify red flags and escalate serious client issues to the Team Leader in cases of untimely resolution.
• Ensure all product information and disclosures are given to clients before and after the call/email requests.
• Avoid legal challenges by monitoring compliance with service agreements.

Handle technical escalations through effective diagnosis and troubleshooting of client queries:
• Manage and resolve technical roadblocks/escalations as per SLA and quality requirements.
• If unable to resolve the issues, escalate the issues to TA & SES in a timely manner.
• Provide product support and resolution to clients by performing a question diagnosis while guiding users through step-by-step solutions.
• Troubleshoot all client queries in a user-friendly, courteous, and professional manner.
• Offer alternative solutions to clients (where appropriate) with the objective of retaining customers' and clients' business.
• Organize ideas and effectively communicate oral messages appropriate to listeners and situations.
• Follow up and make scheduled call-backs to customers to record feedback and ensure compliance with contract SLAs.

Build people capability to ensure operational excellence and maintain superior customer service levels of the existing account/client:
• Mentor and guide Production Specialists on improving technical knowledge.
• Collate trainings to be conducted as triage to bridge the skill gaps identified through interviews with the Production Specialists.
• Develop and conduct trainings (triages) within products for Production Specialists as per target.
• Inform the client about the triages being conducted.
• Undertake product trainings to stay current with product features, changes, and updates.
• Enroll in product-specific and any other trainings per client requirements/recommendations.
• Identify and document the most common problems and recommend appropriate resolutions to the team.
• Update job knowledge by participating in self-learning opportunities and maintaining personal networks.

Deliver:
No. | Performance Parameter | Measure
1 | Process | No. of cases resolved per day, compliance to process and quality standards, meeting process-level SLAs, Pulse score, customer feedback, NSAT/ESAT
2 | Team Management | Productivity, efficiency, absenteeism
3 | Capability Development | Triages completed, Technical Test performance

Mandatory Skills: Hadoop
Experience: 5-8 Years

Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention. Of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA - as the world around us changes, so do we. Join a business powered by purpose and a place that empowers you to design your own reinvention. Come to Wipro. Realize your ambitions. Applications from people with disabilities are explicitly welcome.

Posted 1 week ago

Apply

5.0 - 10.0 years

15 - 30 Lacs

Bengaluru

Remote

Source: Naukri

Job Requirement for Offshore Data Engineer (with ML expertise)
Work Mode: Remote
Base Location: Bengaluru
Experience: 5+ Years

Technical Skills & Expertise:

PySpark & Apache Spark:
• Extensive experience with PySpark and Spark for big data processing and transformation.
• Strong understanding of Spark architecture, optimization techniques, and performance tuning.
• Ability to work with Spark jobs in distributed computing environments like Databricks.

Data Mining & Transformation:
• Hands-on experience in designing and implementing data mining workflows.
• Expertise in data transformation processes, including ETL (Extract, Transform, Load) pipelines.
• Experience in large-scale data ingestion, aggregation, and cleaning.

Programming Languages:
• Python & Scala: Proficient in Python for data engineering tasks, including libraries like Pandas and NumPy. Scala proficiency is preferred for Spark job development.

Big Data Concepts:
• In-depth knowledge of big data frameworks and paradigms, such as distributed file systems, parallel computing, and data partitioning.

Big Data Technologies:
• Cassandra & Hadoop: Experience with NoSQL databases like Cassandra and distributed storage systems like Hadoop.
• Data Warehousing Tools: Proficiency with Hive for data warehousing solutions and querying.
• ETL Tools: Experience with Beam architecture and other ETL tools for large-scale data workflows.

Cloud Technologies (GCP):
• Expertise in Google Cloud Platform (GCP), including core services like Cloud Storage, BigQuery, and Dataflow.
• Experience with Dataflow jobs for batch and stream processing.
• Familiarity with managing workflows using Airflow for task scheduling and orchestration in GCP.

Machine Learning & AI:
• GenAI Experience: Familiarity with Generative AI and its applications in ML pipelines.
• ML Model Development: Knowledge of basic ML model building using tools like Pandas and NumPy, and visualization with Matplotlib.
• MLOps Pipeline: Experience in managing end-to-end MLOps pipelines for deploying models in production, particularly LLM (Large Language Model) deployments.
• RAG Architecture: Understanding and experience in building pipelines using Retrieval-Augmented Generation (RAG) architecture to enhance model performance and output.

Tech stack: Spark, PySpark, Python, Scala, GCP Dataflow, Cloud Composer (Airflow), ETL, Databricks, Hadoop, Hive, GenAI, basic ML modeling knowledge, MLOps experience, LLM deployment, RAG.

Posted 1 week ago

Apply

5.0 - 10.0 years

20 - 35 Lacs

Bengaluru

Remote

Source: Naukri

Job Title: Senior Machine Learning Engineer
Work Mode: Remote
Base Location: Bengaluru
Experience: 5+ Years

• Strong problem-solving skills and ability to work in a fast-paced, collaborative environment.
• Strong programming skills in Python and experience with ML frameworks.
• Proficiency in containerization (Docker) and orchestration (Kubernetes) technologies.
• Solid understanding of CI/CD principles and tools (e.g., Jenkins, GitLab CI, GitHub Actions).
• Knowledge of data engineering concepts and experience building data pipelines.
• Strong understanding of compute, storage, and orchestration resources on cloud platforms.
• Deploying and managing ML models, especially on GCP (though cloud-platform agnostic) services such as Cloud Run, Cloud Functions, and Vertex AI.
• Implementing MLOps best practices, including model version tracking, governance, and monitoring for performance degradation and drift.
• Creating and using benchmarks, metrics, and monitoring to measure and improve services.
• Collaborating with data scientists and engineers to integrate ML workflows from onboarding to decommissioning.
• Experience with MLOps tools like Kubeflow, MLflow, and Data Version Control (DVC).
• Manage ML models on any of the following: AWS (SageMaker), Azure (Machine Learning), and GCP (Vertex AI).

Tech Stack: AWS, GCP, or Azure experience (GCP preferred); PySpark is a must; Databricks is good to have. ML experience, Docker, and Kubernetes.
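A minimal MLflow tracking sketch of the model version tracking practice named above; the experiment name, model, and metric are illustrative.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data standing in for a real training set.
X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("churn-baseline")  # hypothetical experiment
with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    # Params and metrics are versioned per run for later comparison.
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", acc)
    # Logging the artifact makes the run reproducible and promotable.
    mlflow.sklearn.log_model(model, "model")
```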

Posted 1 week ago

Apply

4.0 - 10.0 years

11 - 15 Lacs

Bengaluru

Work from Office

Source: Naukri

About The Role
Job Title: AI Scientist

Position Overview: We are seeking a talented and experienced Core AI Algorithm Developer to join our Lab45 AI Platform Team, Wipro. We are looking for candidates with 4 to 10 years of hands-on experience in developing cutting-edge AI algorithms, such as in Generative AI, LLMs, Deep Learning, Unsupervised AI, etc., along with expertise in Python, TensorFlow, PyTorch, PySpark, distributed computing, statistics, and cloud technologies. Candidates should have a strong foundation in AI and good coding skills.

Key Responsibilities:
• Develop and implement state-of-the-art AI algorithms and models to solve complex problems in diverse domains.
• Collaborate with cross-functional teams to understand business requirements and translate them into scalable, production-grade AI solutions.
• Work with large datasets to extract insights, optimize algorithms, and enhance model performance.
• Contribute to the creation of intellectual property (IP) through patents, research papers, and innovative solutions.
• Stay abreast of the latest advancements in AI research and technologies and apply them to enhance our AI offerings.
• Collaborate with cross-functional teams to gather feedback and iterate on AI solutions.

Qualifications:
• Master's or Ph.D. degree (preferred) in Computer Science, Artificial Intelligence, Machine Learning, or a related field.
• 4 to 10 years of proven experience in developing cutting-edge AI algorithms and solutions.
• Strong proficiency in Python programming and familiarity with TensorFlow, PyTorch, PySpark, etc.
• Experience with distributed computing and cloud platforms (e.g., Azure, AWS, GCP).
• Demonstrated ability to work with large datasets and optimize algorithms for scalability and efficiency.
• Excellent problem-solving skills and a strong understanding of AI concepts and techniques.
• Proven track record of delivering high-quality, innovative solutions and contributing to IP creation (e.g., patents, research papers).
• Strong communication, interpersonal, and collaboration skills, with the ability to work effectively in a team environment.

Posted 1 week ago

Apply

3.0 - 9.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site


||Greetings from TCS|| We are looking for Big Data candidates for our Mumbai, Pune, Chennai, and Bangalore locations. Experience: 3 to 9 years.
Must-Have
• PySpark
• Hive
Good-to-Have
• Spark
• HBase
• DQ tool
• Agile Scrum experience
• Exposure to data ingestion from disparate sources onto a Big Data platform (a minimal ingestion sketch follows below)
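As an illustration of the ingestion exposure this posting asks for, here is a minimal PySpark sketch that lands data from two disparate sources into Hive. It assumes a Hive-enabled cluster; the paths, connection details, and table names are hypothetical placeholders.

```python
# A minimal PySpark ingestion sketch: combine data from two disparate
# sources (CSV files and a JDBC database) into a Hive table. All paths,
# credentials, and table names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("ingest-to-hive")
         .enableHiveSupport()  # requires a Hive-configured cluster
         .getOrCreate())

# Source 1: CSV files on distributed storage
csv_df = spark.read.option("header", True).csv("/landing/sales/*.csv")

# Source 2: a relational table read over JDBC
jdbc_df = (spark.read.format("jdbc")
           .option("url", "jdbc:postgresql://db-host:5432/sales")  # placeholder
           .option("dbtable", "public.orders")
           .option("user", "etl_user")
           .option("password", "***")
           .load())

# Align schemas, union the sources, and write a partitioned Hive table
combined = csv_df.select("order_id", "country", "amount").unionByName(
    jdbc_df.select("order_id", "country", "amount"))

(combined.write
 .mode("append")
 .partitionBy("country")
 .saveAsTable("analytics.orders_raw"))  # hypothetical Hive table
```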

Posted 1 week ago

Apply

4.0 - 7.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site


Job Category: Data Engineer Job Type: Hybrid Job Location: Bangalore Job Experience: 4-7 years We are seeking a Data Engineer to join our growing team. The Data Engineer will be responsible for designing, developing, and maintaining our ETL pipelines and managing our database systems. The ideal candidate should have a strong background in SQL, database design, and ETL processes. Key Responsibilities: Analyse the different source systems, profile data, and understand, document, and fix data quality issues. Gather requirements and business process knowledge to transform the data in a way that is geared towards the needs of end users. Write complex SQL to extract and format source data for ETL/data pipelines. Design, implement, and maintain systems that collect and analyze business intelligence data. Design and architect an analytical data store or cluster for the enterprise, and implement data pipelines that extract, transform, and load data into an information product that helps the organization reach strategic goals. Create design documents, source-to-target mapping documents, and any supporting documents needed for deployment/migration. Design, develop, and test ETL/data pipelines. Design and build the metadata-based frameworks needed for data pipelines (see the sketch after this posting). Write unit test cases, execute unit testing, and document unit test results. Manage and maintain the database, warehouse, and cluster, along with other dependent infrastructure. Perform data cleaning, transformation, and validation to ensure accuracy and consistency across various data sources. Required Skills: Expertise in managing and optimizing Spark clusters, along with other implementations of Spark. Strong programming skills in Python and PySpark. Strong proficiency in SQL and experience with relational databases (PostgreSQL, MySQL, Oracle, etc.) and NoSQL databases (MongoDB, Cassandra, DynamoDB). Knowledge of data modelling techniques such as star/snowflake and data vault. Knowledge of semantic modelling. Strong problem-solving skills. Business acumen, with the capacity to move between macro business strategy and micro, tangible data and AI products. Technologies preferred: Azure, Databricks. Eligibility Criteria: Education: B.E/B.Tech in any specialization, BCA, M.Tech in any specialization, MCA. Primary Skills: SQL; Databricks; any one cloud (AWS, Azure, GCP); Python, PySpark. Management Skills: Ability to handle given tasks and projects simultaneously in an organized and timely manner. Soft Skills: Good communication skills, verbal and written. Attention to detail. Positive attitude and confidence.
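The metadata-based framework this posting calls for is a common pattern: table-level configuration entries drive one generic loader rather than a bespoke script per table. A minimal sketch, with hypothetical config values:

```python
# A minimal metadata-driven pipeline sketch: each config entry describes a
# source and target, and one generic loop executes them all. The paths,
# formats, and table names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("metadata-driven-etl").getOrCreate()

# In practice this metadata would live in a control table or config store
pipeline_config = [
    {"source_path": "/landing/customers/", "format": "parquet",
     "target_table": "silver.customers", "keys": ["customer_id"]},
    {"source_path": "/landing/orders/", "format": "json",
     "target_table": "silver.orders", "keys": ["order_id"]},
]

for entry in pipeline_config:
    df = spark.read.format(entry["format"]).load(entry["source_path"])
    # Generic, config-driven deduplication on the declared business keys
    df = df.dropDuplicates(entry["keys"])
    df.write.mode("overwrite").saveAsTable(entry["target_table"])
```

Adding a new feed then means adding one config entry, not writing new pipeline code.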

Posted 1 week ago

Apply

0.0 - 2.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site


The Data Analytics Analyst 2 is a developing professional role. Applies specialty area knowledge in monitoring, assessing, analyzing and/or evaluating processes and data. Identifies policy gaps and formulates policies. Interprets data and makes recommendations. Researches and interprets factual information. Identifies inconsistencies in data or results, defines business issues and formulates recommendations on policies, procedures or practices. Integrates established disciplinary knowledge within own specialty area with basic understanding of related industry practices. Good understanding of how the team interacts with others in accomplishing the objectives of the area. Develops working knowledge of industry practices and standards. Limited but direct impact on the business through the quality of the tasks/services provided. Impact of the job holder is restricted to own team.
Responsibilities: Identifies policy gaps and formulates policies. Interprets data and makes recommendations. Integrates established disciplinary knowledge within own specialty area with a basic understanding of related industry practices. Makes judgments and recommendations based on analysis and specialty area knowledge. Researches and interprets factual information. Identifies inconsistencies in data or results, defines business issues, and formulates recommendations on policies, procedures, or practices. Exchanges information in a concise and logical way and is sensitive to audience diversity. Appropriately assesses risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency.
Qualifications: 0-2 years of experience using tools for statistical modeling of large data sets. 2-4 years of experience as a Python/Java developer with expertise in automation testing, to design, develop, and automate robust software solutions and testing frameworks such as Pytest and Behave. 2-4 years of experience as a Big Data Engineer, developing, optimizing, and managing large-scale data processing systems and analytics platforms. 3-4 years of experience in distributed data processing and near-real-time data analytics using Spring Boot/PySpark. 2-5 years of experience in designing and automating Kafka-based messaging systems for real-time data streaming (a minimal streaming sketch follows below). Familiarity with CI/CD pipelines, version control systems (e.g., Git), and DevOps practices.
Education: Bachelor's/University degree or equivalent experience.
This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
------------------------------------------------------ Job Family Group: Technology ------------------------------------------------------ Job Family: Data Analytics ------------------------------------------------------ Time Type: Full time ------------------------------------------------------ Most Relevant Skills: Please see the requirements listed above. ------------------------------------------------------ Other Relevant Skills: For complementary skills, please see above and/or contact the recruiter.
------------------------------------------------------ Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law. If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity review Accessibility at Citi. View Citi’s EEO Policy Statement and the Know Your Rights poster.
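For context on the Kafka plus PySpark requirement above, here is a minimal Spark Structured Streaming sketch that consumes a Kafka topic. It is a generic illustration, not Citi's system; the broker address, topic name, and checkpoint path are hypothetical placeholders.

```python
# A minimal Spark Structured Streaming sketch: consume a Kafka topic and
# write a running aggregate to the console. Broker, topic, and checkpoint
# location are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-stream").getOrCreate()
# Requires the Kafka connector on the classpath, e.g.
# --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.0

events = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
          .option("subscribe", "transactions")               # placeholder topic
          .load())

# Kafka delivers key/value as binary; cast the value to string and count
counts = (events
          .select(F.col("value").cast("string").alias("payload"))
          .groupBy("payload")
          .count())

query = (counts.writeStream
         .outputMode("complete")
         .format("console")
         .option("checkpointLocation", "/tmp/checkpoints/kafka-stream")
         .start())
query.awaitTermination()
```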

Posted 1 week ago

Apply

6.0 - 10.0 years

25 - 32 Lacs

Pune, Chennai

Hybrid


Please fill in the details below and share them with snidafazli@altimetrik.com
Company: Altimetrik
Client: Citi Bank
JD: Python, PySpark, SQL, Django, AWS, Flask, Docker
Name (as per Aadhaar card):
Number:
Email ID:
Current CTC:
Fixed CTC:
Expected CTC:
Holding any offers:
Current Company:
Payroll Company:
Notice Period:
Mention exact LWD:
Current Location:
Preferred Location:
Total Experience:
Relevant experience (please mention in years below):
Python:
PySpark:
SQL:
Django:
AWS:
Flask:
Docker:
GenAI:
ML:

Posted 1 week ago

Apply

Exploring PySpark Jobs in India

PySpark, the Python API for the Apache Spark distributed data processing engine, is in high demand in the Indian job market. With the increasing need for big data processing and analysis, companies are actively seeking professionals with PySpark skills to join their teams. If you are a job seeker looking to excel in the field of big data and analytics, exploring PySpark jobs in India could be a great career move.

Top Hiring Locations in India

Here are five major cities in India where companies are actively hiring for PySpark roles:
1. Bangalore
2. Pune
3. Hyderabad
4. Mumbai
5. Delhi

Average Salary Range

The estimated salary range for PySpark professionals in India varies based on experience levels. Entry-level positions can expect to earn around INR 6-8 lakhs per annum, while experienced professionals can earn upwards of INR 15 lakhs per annum.

Career Path

In the field of PySpark, a typical career progression may look like this:
1. Junior Developer
2. Data Engineer
3. Senior Developer
4. Tech Lead
5. Data Architect

Related Skills

In addition to PySpark, professionals in this field are often expected to have or develop skills in:
- Python programming
- Apache Spark
- Big data technologies (Hadoop, Hive, etc.)
- SQL
- Data visualization tools (Tableau, Power BI)

Interview Questions

Here are 25 interview questions you may encounter when applying for PySpark roles (a short code sketch after the list illustrates several of these concepts):

  • Explain what PySpark is and its main features (basic)
  • What are the advantages of using PySpark over other big data processing frameworks? (medium)
  • How do you handle missing or null values in PySpark? (medium)
  • What is RDD in PySpark? (basic)
  • What is a DataFrame in PySpark and how is it different from an RDD? (medium)
  • How can you optimize performance in PySpark jobs? (advanced)
  • Explain the difference between map and flatMap transformations in PySpark (basic)
  • What is the role of a SparkContext in PySpark? (basic)
  • How do you handle schema inference in PySpark? (medium)
  • What is a SparkSession in PySpark? (basic)
  • How do you join DataFrames in PySpark? (medium)
  • Explain the concept of partitioning in PySpark (medium)
  • What is a UDF in PySpark? (medium)
  • How do you cache DataFrames in PySpark for optimization? (medium)
  • Explain the concept of lazy evaluation in PySpark (medium)
  • How do you handle skewed data in PySpark? (advanced)
  • What is checkpointing in PySpark and how does it help in fault tolerance? (advanced)
  • How do you tune the performance of a PySpark application? (advanced)
  • Explain the use of Accumulators in PySpark (advanced)
  • How do you handle broadcast variables in PySpark? (advanced)
  • What are the different data sources supported by PySpark? (medium)
  • How can you run PySpark on a cluster? (medium)
  • What is the purpose of the PySpark MLlib library? (medium)
  • How do you handle serialization and deserialization in PySpark? (advanced)
  • What are the best practices for deploying PySpark applications in production? (advanced)
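Several of these questions are easiest to answer with code in hand. The following is a minimal, self-contained sketch touching a few of the topics above: null handling, a UDF, caching, and a broadcast join. All data, column names, and values are illustrative, and built-in functions are generally preferred over UDFs in real workloads.

```python
# A minimal PySpark sketch of common interview topics: SparkSession setup,
# null handling, a UDF, caching, and a broadcast join. All data here is
# synthetic and illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("interview-prep").getOrCreate()

# DataFrame creation and null handling (fillna replaces missing values)
orders = spark.createDataFrame(
    [(1, "IN", 120.0), (2, "US", None), (3, "IN", 80.0)],
    ["order_id", "country", "amount"],
)
orders = orders.fillna({"amount": 0.0})

# A simple UDF (built-in functions are faster; UDFs serialize to Python)
label = F.udf(lambda c: "domestic" if c == "IN" else "international", StringType())
orders = orders.withColumn("segment", label(F.col("country")))

# Caching avoids recomputing the lineage when a DataFrame is reused
orders.cache()

# Broadcast join: efficient when one side is small enough to ship to executors
countries = spark.createDataFrame(
    [("IN", "India"), ("US", "United States")],
    ["country", "country_name"],
)
result = orders.join(F.broadcast(countries), on="country", how="left")

result.show()
spark.stop()
```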

Closing Remark

As you explore PySpark jobs in India, remember to prepare thoroughly for interviews and showcase your expertise confidently. With the right skills and knowledge, you can excel in this field and advance your career in the world of big data and analytics. Good luck!


Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies