
4025 PySpark Jobs - Page 28

JobPe aggregates listings for easy access, but applications are submitted directly on the original job portal.

0 years

0 Lacs

Mumbai, Maharashtra, India

On-site

Source: LinkedIn

Introduction: In this role, you'll work in one of our IBM Consulting Client Innovation Centers (Delivery Centers), where we deliver deep technical and industry expertise to a wide range of public and private sector clients around the world. Our delivery centers offer our clients locally based skills and technical expertise to drive innovation and adoption of new technology.

Your Role and Responsibilities: As a Data Engineer at IBM, you'll play a vital role in the development and design of applications, providing regular support and guidance to project teams on complex coding, issue resolution, and execution. Your primary responsibilities include: leading the design and construction of new solutions using the latest technologies, always looking to add business value and meet user requirements; striving for continuous improvement by testing the built solution and working under an agile framework; and discovering and implementing the latest technology trends to build creative solutions.

Preferred Education: Master's Degree.

Required Technical and Professional Expertise: Experience with Apache Spark (PySpark): in-depth knowledge of Spark's architecture, core APIs, and PySpark for distributed data processing. Big Data Technologies: familiarity with Hadoop, HDFS, Kafka, and other big data tools. Data Engineering Skills: strong understanding of ETL pipelines, data modeling, and data warehousing concepts. Strong proficiency in Python: expertise in Python programming with a focus on data processing and manipulation. Data Processing Frameworks: knowledge of data processing libraries such as Pandas and NumPy. SQL Proficiency: experience writing optimized SQL queries for large-scale data analysis and transformation. Cloud Platforms: experience working with cloud platforms like AWS, Azure, or GCP, including cloud storage systems.

Preferred Technical and Professional Experience: Define, drive, and implement an architecture strategy and standards for end-to-end monitoring. Partner with other technology teams, including application development, enterprise architecture, testing services, and network engineering. Experience with detection and prevention tools for company products, platforms, and customer-facing systems is good to have.
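The Spark expertise this role asks for centers on the DataFrame API for distributed processing. As a rough, minimal PySpark sketch of that kind of work (the input path and column names are hypothetical):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Hypothetical raw order events landed in HDFS or cloud storage.
orders = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("/data/raw/orders/")
)

# Core DataFrame API: drop invalid rows, then aggregate revenue per customer per day.
daily_revenue = (
    orders
    .filter(F.col("amount") > 0)
    .groupBy("customer_id", F.to_date("order_ts").alias("order_date"))
    .agg(F.sum("amount").alias("revenue"), F.count(F.lit(1)).alias("order_count"))
)

# Partitioned Parquet lets downstream jobs prune by date.
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "/data/curated/daily_revenue/"
)
```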

Posted 6 days ago

Apply

7.0 years

0 Lacs

Mumbai, Maharashtra, India

On-site

Source: LinkedIn

Job Description: The candidate must possess knowledge relevant to the functional area, act as a subject matter expert in providing advice in the area of expertise, and focus on continuous improvement for maximum efficiency. It is vital to focus on a high standard of delivery excellence, provide top-notch service quality, and develop successful long-term business partnerships with internal and external customers by identifying and fulfilling customer needs. He/she should be able to break down complex problems into logical and manageable parts in a systematic way, generate and compare multiple options, and set priorities to resolve problems. The ideal candidate must be proactive and go beyond expectations to achieve job results and create new opportunities. He/she must positively influence the team, motivate high performance, promote a friendly climate, give constructive feedback, provide development opportunities, and manage the career aspirations of direct reports. Communication skills are key here, to explain organizational objectives, assignments, and the big picture to the team, and to articulate team vision and clear objectives.

Senior Process Manager Roles and Responsibilities: We are seeking a talented and motivated Data Engineer to join our dynamic team. The ideal candidate will have a deep understanding of data integration processes and experience in developing and managing data pipelines using Python, SQL, and PySpark within Databricks. You will be responsible for designing robust backend solutions, implementing CI/CD processes, and ensuring data quality and consistency.

Data Pipeline Development: Use Databricks features to explore raw datasets and understand their structure. Create and optimize Spark-based workflows. Build end-to-end data processing pipelines, including ingesting raw data, transforming it, and running analyses on the processed data. Create and maintain data pipelines using Python and SQL.

Solution Design and Architecture: Design and architect backend solutions for data integration, ensuring they are robust, scalable, and aligned with business requirements. Implement data processing pipelines using various technologies, including cloud platforms, big data tools, and streaming frameworks.

Automation and Scheduling: Automate data integration processes and schedule jobs on servers to ensure seamless data flow.

Data Quality and Monitoring: Develop and implement data quality checks and monitoring systems to ensure data accuracy and consistency.

CI/CD Implementation: Use Jenkins and Bitbucket to create and maintain metadata and job files. Implement continuous integration and continuous deployment (CI/CD) processes in both development and production environments to deploy data pipelines efficiently.

Collaboration and Documentation: Work effectively with cross-functional teams, including software engineers, data scientists, and DevOps, to ensure successful project delivery. Document data pipelines and architecture to ensure knowledge transfer and maintainability. Participate in stakeholder interviews, workshops, and design reviews to define data models, pipelines, and workflows.

Technical and Functional Skills: Education and Experience: Bachelor's degree with 7+ years of experience, including at least 3+ years of hands-on experience in SQL and Python. Technical Proficiency: Proficiency in writing and optimizing SQL queries in MySQL and SQL Server. Expertise in Python for writing reusable components and enhancing existing ETL scripts.
Solid understanding of ETL concepts and data pipeline architecture, including CDC, incremental loads, and slowly changing dimensions (SCDs). Hands-on experience with PySpark. Knowledge of and experience with Databricks is a bonus. Familiarity with data warehousing solutions and ETL processes. Understanding of data architecture and backend solution design. Cloud and CI/CD Experience: Experience with cloud platforms such as AWS, Azure, or Google Cloud. Familiarity with Jenkins and Bitbucket for CI/CD processes. Additional Skills: Ability to work independently and manage multiple projects simultaneously.

About Us: At eClerx, we serve some of the largest global companies, including 50 of the Fortune 500. Our clients call upon us to solve their most complex problems and deliver transformative insights. Across roles and levels, you get the opportunity to build expertise, challenge the status quo, think bolder, and help our clients seize value.

About the Team: eClerx is a global leader in productized services, bringing together people, technology, and domain expertise to amplify business results. Our mission is to set the benchmark for client service and success in our industry. Our vision is to be the innovation partner of choice for technology, data analytics, and process management services. Since our inception in 2000, we've partnered with top companies across various industries, including financial services, telecommunications, retail, and high-tech. Our innovative solutions and domain expertise help businesses optimize operations, improve efficiency, and drive growth. With over 18,000 employees worldwide, eClerx is dedicated to delivering excellence through smart automation and data-driven insights. At eClerx, we believe in nurturing talent and providing hands-on experience.

eClerx is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability or protected veteran status, or any other legally protected basis, per applicable law.
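As a hedged illustration of the CDC and incremental-load skills this posting lists, here is a minimal Delta Lake MERGE upsert of the kind commonly run on Databricks; the table and column names are hypothetical, and the `delta` package (bundled with Databricks runtimes) is assumed to be available:

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()  # preconfigured on Databricks

# Hypothetical CDC batch captured since the last load.
updates = spark.read.parquet("/mnt/landing/customer_changes/")

target = DeltaTable.forName(spark, "silver.customers")

# Incremental upsert: update matched keys, insert new ones. A full SCD Type 2
# would additionally close out the old row with effective/end-date columns
# instead of overwriting it in place.
(
    target.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```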

Posted 6 days ago

Apply

10.0 - 15.0 years

12 - 18 Lacs

Hyderabad, Gurugram, Bengaluru

Work from Office

Source: Naukri

Location: Bangalore/Gurgaon/Hyderabad/Mumbai. Must-have skills: Data Scientist / Transformation Leader with at least 5 years in Telecom Analytics. Good-to-have skills: Gen AI, Agentic AI.

Job Summary: About Global Network Data & AI: The Accenture Strategy & Consulting Global Network Data & AI practice helps our clients grow their business in entirely new ways. Analytics enables our clients to achieve high performance through insights from data, insights that inform better decisions and strengthen customer relationships. From strategy to execution, Accenture works with organizations to develop analytic capabilities, from accessing and reporting on data to predictive modelling, to outperform the competition.

About the Comms & Media practice: Comms & Media (C&M) is one of the Industry Practices within Accenture's S&C Global Network team. It focuses on serving clients across specific industries: Communications, Media & Entertainment. Communications focuses primarily on industries related to telecommunications and information & communication technology (ICT); this team serves most of the world's leading wireline, wireless, cable, and satellite communications and service providers. Media & Entertainment focuses on industries like broadcast, entertainment, print, and publishing. Globally, the Accenture Comms & Media practice works to develop value growth strategies for its clients and infuses AI & GenAI to help deliver on their top business imperatives, i.e., revenue growth and cost reduction. From multi-year Data & AI transformation projects to shorter, more agile engagements, we have a rapidly expanding portfolio of hyper-growth clients and an increasing footprint with next-gen solutions and industry practices.

Roles & Responsibilities: A data science consultant experienced in the Telco domain is responsible for helping clients design and deliver AI solutions. He/she should be strong in the Telco domain and AI fundamentals, and should have good hands-on experience working with the following: the ability to work with large data sets and present conclusions to key stakeholders; data management using SQL. Propose solutions to the client, based on gap analysis of existing Telco platforms, that can generate long-term and sustainable value for the client. Gather business requirements from client stakeholders via interactions like interviews and workshops. Track down and read all previous information on the problem or issue in question; explore obvious and known avenues thoroughly; ask a series of probing questions to get to the root of a problem. Understand the as-is process, understand issues with the processes that can be resolved through Data & AI or process solutions, and design the detailed to-be state. Understand customer needs and identify and translate them into business requirements (business requirement definition), business process flows, and functional requirements, and be able to inform the best approach to the problem. Adopt a clear and systematic approach to complex issues (i.e., A leads to B leads to C). Analyze relationships between several parts of a problem or situation. Anticipate obstacles and identify a critical path for a project. Independently deliver products and services that empower clients to implement effective solutions. Make specific changes and improvements to processes or own work to achieve more. Work with other team members and make deliberate efforts to keep others up to date.
Establish a consistent and collaborative presence with clients and act as the primary point of contact for assigned clients; escalate, track, and solve client issues. Partner with clients to understand end clients' business goals, marketing objectives, and competitive constraints. Storytelling: crunch the data and numbers to craft a story to be presented to senior client stakeholders.

Professional & Technical Skills: Overall 10+ years of experience in Data Science and at least 5 years in Telecom Analytics. Master's (MBA/MSc/MTech) from a Tier 1/Tier 2 school and an Engineering degree from a Tier 1 school. Demonstrated experience in solving real-world data problems through Data & AI. Direct onsite experience (i.e., experience facing clients inside client offices in India or abroad) is mandatory; please note we are looking for client-facing roles. Proficiency with data mining, mathematics, and statistical analysis. Advanced pattern recognition and predictive modeling experience; knowledge of advanced analytical fields such as text mining, image recognition, video analytics, IoT, etc. Execution-level understanding of econometric/statistical modeling packages: traditional techniques like linear/logistic regression, multivariate statistical analysis, time series techniques, and fixed/random effects modelling; machine learning techniques like Random Forest, Gradient Boosting, XGBoost, decision trees, clustering, etc.; knowledge of deep learning modeling techniques like RNN, CNN, etc. Experience using digital and statistical modeling software (one or more of): Python, R, PySpark, SQL, BigQuery, Vertex AI. Proficient in Excel, MS Word, PowerPoint, and corporate soft skills. Knowledge of dashboard creation platforms: Excel, Tableau, Power BI, etc. Excellent written and oral communication skills with the ability to clearly communicate ideas and results to non-technical stakeholders. Strong analytical and problem-solving skills and good communication skills. Self-starter with the ability to work independently across multiple projects and set priorities. Strong team player. Proactive and solution-oriented, able to guide junior team members. Execution knowledge of optimization techniques is a good-to-have: exact optimization (linear and non-linear optimization techniques) and evolutionary optimization (both population- and search-based algorithms). Cloud platform certification and experience in computer vision are good-to-haves.

Qualification and Experience: Overall 10+ years of experience in Data Science and at least 5 years in Telecom. Educational Qualification: Master's (MBA/MSc/MTech) from a Tier 1/Tier 2 school and an Engineering degree from a Tier 1 school.
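As a rough sketch of the kind of churn modelling such a telecom analytics role involves, here is a minimal PySpark MLlib pipeline; the table name and feature columns are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import RandomForestClassifier
from pyspark.ml.evaluation import BinaryClassificationEvaluator

spark = SparkSession.builder.appName("telco-churn").getOrCreate()

# Hypothetical telecom usage table with a binary churn label.
df = spark.table("analytics.telco_usage")  # avg_mins, data_gb, tickets, tenure_m, churn

assembler = VectorAssembler(
    inputCols=["avg_mins", "data_gb", "tickets", "tenure_m"],
    outputCol="features",
)
rf = RandomForestClassifier(labelCol="churn", featuresCol="features", numTrees=200)

train, test = df.randomSplit([0.8, 0.2], seed=42)
model = Pipeline(stages=[assembler, rf]).fit(train)

# Default metric is area under the ROC curve.
auc = BinaryClassificationEvaluator(labelCol="churn").evaluate(model.transform(test))
print(f"Test AUC: {auc:.3f}")
```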

Posted 6 days ago

Apply

3.0 years

0 Lacs

Pune, Maharashtra, India

On-site

Source: LinkedIn

Company Overview: Viraaj HR Solutions is dedicated to connecting top talent with forward-thinking companies. Our mission is to provide exceptional talent acquisition services while fostering a culture of trust, integrity, and collaboration. We prioritize our clients' needs and work tirelessly to ensure the ideal candidate-job match. Join us in our commitment to excellence and become part of a dynamic team focused on driving success for individuals and organizations alike.

Role Responsibilities: Design, develop, and implement data pipelines using Azure Data Factory. Create and maintain data models for structured and unstructured data. Extract, transform, and load (ETL) data from various sources into data warehouses. Develop analytical solutions and dashboards using Azure Databricks. Perform data integration and migration tasks with Azure tools. Ensure optimal performance and scalability of data solutions. Collaborate with cross-functional teams to understand data requirements. Utilize SQL Server for database management and data queries. Implement data quality checks and ensure data integrity (a sketch of such checks follows this posting). Work on data governance and compliance initiatives. Monitor and troubleshoot data pipeline issues to ensure reliability. Document data processes and architecture for future reference. Stay current with industry trends and Azure advancements. Train and mentor junior data engineers and team members. Participate in design reviews and provide feedback for process improvements.

Qualifications: Bachelor's degree in Computer Science, Information Technology, or a related field. 3+ years of experience in a data engineering role. Strong expertise in Azure Data Factory and Azure Databricks. Proficient in SQL for data manipulation and querying. Experience with data warehousing concepts and practices. Familiarity with ETL tools and processes. Knowledge of Python or other programming languages for data processing. Ability to design scalable cloud architecture. Experience with data modeling and database design. Effective communication and collaboration skills. Strong analytical and problem-solving abilities. Familiarity with performance tuning and optimization techniques. Knowledge of data visualization tools is a plus. Experience with Agile methodologies. Ability to work independently and manage multiple tasks. Willingness to learn and adapt to new technologies.

Skills: etl, azure databricks, sql server, azure, data governance, azure data factory, python, data warehousing, data engineer, data integration, performance tuning, python scripting, sql, data modeling, data migration, data visualization, analytical solutions, pyspark, agile methodologies, data quality checks
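A minimal sketch of the data quality checks mentioned above, assuming a hypothetical dataset and key columns:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.read.parquet("/mnt/curated/customers/")  # hypothetical dataset

# Null rate per critical column.
null_rates = df.select(
    *[F.avg(F.col(c).isNull().cast("int")).alias(f"{c}_null_rate")
      for c in ["customer_id", "email", "created_at"]]
).first()

# Duplicate keys violate the table's grain.
dupes = df.groupBy("customer_id").count().filter("count > 1").count()

# Fail fast so the run is flagged instead of loading bad data downstream.
assert null_rates["customer_id_null_rate"] == 0.0, "customer_id contains nulls"
assert dupes == 0, f"{dupes} duplicate customer_id values found"
```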

Posted 6 days ago

Apply

5.0 - 10.0 years

16 - 25 Lacs

Hyderabad, Bengaluru

Work from Office

Source: Naukri

PySpark Data Engineer - Job Description:
1. API Development: Design, develop, and maintain robust APIs using FastAPI and RESTful principles for scalable backend systems.
2. Big Data Processing: Leverage PySpark to process and analyze large datasets efficiently, ensuring optimal performance in big data environments.
3. Full-Stack Integration: Develop seamless backend-to-frontend feature integrations, collaborating with front-end developers for cohesive user experiences.
4. CI/CD Pipelines: Implement and manage CI/CD pipelines using GitHub Actions and Azure DevOps to streamline deployments and ensure system reliability.
5. Containerization: Utilize Docker for building and deploying containerized applications in development and production environments.
6. Team Leadership: Lead and mentor a team of developers, providing guidance, code reviews, and support to junior team members to ensure high-quality deliverables.
7. Code Optimization: Write clean, maintainable, and efficient Python code, with a focus on scalability, reusability, and performance.
8. Cloud Deployment: Deploy and manage applications on cloud platforms like Azure, ensuring high availability and fault tolerance.
9. Collaboration: Work closely with cross-functional teams, including product managers and designers, to translate business requirements into technical solutions.
10. Documentation: Maintain thorough documentation for APIs, processes, and systems to ensure transparency and ease of maintenance.

Highlighted Skillset:
Big Data: Strong PySpark skills for processing large datasets.
DevOps: Proficiency in GitHub Actions, CI/CD pipelines, Azure DevOps, and Docker.
Integration: Experience in backend-to-frontend feature connectivity.
Leadership: Proven ability to lead and mentor development teams.
Cloud: Knowledge of deploying and managing applications in Azure or other cloud environments.
Team Collaboration: Strong interpersonal and communication skills for working in cross-functional teams.
Best Practices: Emphasis on clean code, performance optimization, and robust documentation.

Share your updated resume at siddhi.pandey@adecco.com or WhatsApp at 6366783349.
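A minimal FastAPI sketch of the API-development side of this role; the endpoint, request model, and in-memory store are hypothetical stand-ins for results a PySpark job would have materialized:

```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="metrics-api")

class MetricRequest(BaseModel):
    dataset: str
    metric: str

# Stand-in for a Delta/warehouse table populated by a PySpark job.
RESULTS = {("orders", "revenue"): 1_234_567.0}

@app.post("/metrics")
def get_metric(req: MetricRequest) -> dict:
    key = (req.dataset, req.metric)
    if key not in RESULTS:
        raise HTTPException(status_code=404, detail="metric not found")
    return {"dataset": req.dataset, "metric": req.metric, "value": RESULTS[key]}
```

Run locally with, for example, `uvicorn main:app --reload`.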

Posted 6 days ago

Apply

4.0 - 9.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

Source: LinkedIn

Experience: 4 to 9 years. Required Skills: Python, SQL, PySpark, AWS (knowledge), Databricks (good to have). Responsibilities: Data Pipeline Development; Data Integration and Transformation; Performance Optimization; Automation and Workflow Management; Data Quality and Validation; Cloud Platform Management; Migration and Upgrades; Cost Optimization; Data Security and Compliance; Collaboration and Support. Life sciences/Pharma experience is an added advantage.

Posted 6 days ago

Apply

0 years

0 Lacs

Gurugram, Haryana, India

On-site

Source: LinkedIn

Gurgaon/Bangalore, India. AXA XL recognizes data and information as critical business assets, both in terms of managing risk and enabling new business opportunities. This data should not only be high quality, but also actionable, enabling AXA XL's executive leadership team to maximize benefits and facilitate sustained competitive advantage. Our Chief Data Office, also known as our Innovation, Data Intelligence & Analytics (IDA) team, is focused on driving innovation by optimizing how we leverage data to drive strategy and create a new business model, disrupting the insurance market. As we develop an enterprise-wide data and digital strategy that moves us toward a greater focus on the use of data and data-driven insights, we are seeking a Data Engineer. The role will support the team's efforts towards creating, enhancing, and stabilizing the Enterprise data lake through the development of data pipelines. This role requires a person who is a team player and can work well with team members from other disciplines to deliver data in an efficient and strategic manner.

What You'll Be DOING - What will your essential responsibilities include? Act as a data engineering expert and partner to Global Technology and data consumers in controlling the complexity and cost of the data platform, whilst enabling performance, governance, and maintainability of the estate. Understand current and future data consumption patterns and architecture at a granular level, and partner with Architects to ensure optimal design of data layers. Apply best practices in data architecture, for example: the balance between materialization and virtualization, the optimal level of de-normalization, caching and partitioning strategies, choice of storage and querying technology, and performance tuning. Lead and execute hands-on research into new technologies, formulating frameworks for assessing new technology versus business benefit and the implications for data consumers. Act as a best-practice expert and blueprint creator for ways of working such as testing, logging, CI/CD, observability, and release, enabling rapid growth in data inventory and utilization of the Data Science Platform. Design prototypes and work in a fast-paced, iterative solution delivery model. Design, develop, and maintain ETL pipelines using PySpark in Azure Databricks with Delta tables (a sketch follows this posting), and use Harness for deployment pipelines. Monitor the performance of ETL jobs, resolve any issues that arise, and improve performance metrics as needed. Diagnose system performance issues related to data processing and implement solutions to address them. Collaborate with other teams to ensure successful integration of data pipelines into the larger system architecture. Maintain integrity and quality across all pipelines and environments. Understand and follow secure coding practices to ensure code is not vulnerable. You will report to the Technical Lead.

What You Will BRING - We're looking for someone who has these abilities and skills: Required Skills and Abilities: Effective communication skills. Bachelor's degree in computer science, Mathematics, Statistics, Finance, a related technical field, or equivalent work experience. Relevant years of extensive work experience in various data engineering and modeling techniques (relational, data warehouse, semi-structured, etc.), application development, and advanced data querying skills. Relevant years of programming experience using Databricks. Relevant years of experience using the Microsoft Azure suite of products (ADF, Synapse, and ADLS).
Solid knowledge of network and firewall concepts. Solid experience writing, optimizing, and analyzing SQL. Relevant years of experience with Python. Ability to break down complex data requirements and architect solutions into achievable targets. Robust familiarity with Software Development Life Cycle (SDLC) processes and workflows, especially Agile. Experience using Harness. Technical lead responsible for both individual and team deliveries.

Desired Skills and Abilities: Worked on big data migration projects. Worked on performance tuning at both the database and big data platform levels. Ability to interpret complex data requirements and architect solutions. Distinctive problem-solving and analytical skills combined with robust business acumen. Excellent grasp of the basics of Parquet files and Delta files. Effective knowledge of the Azure cloud computing platform. Familiarity with reporting software (Power BI) is a plus. Familiarity with DBT is a plus. Passion for data and experience working within a data-driven organization. You care about what you do, and what we do.

Who WE are: AXA XL, the P&C and specialty risk division of AXA, is known for solving complex risks. For mid-sized companies, multinationals, and even some inspirational individuals, we don't just provide re/insurance, we reinvent it. How? By combining a comprehensive and efficient capital platform, data-driven insights, leading technology, and the best talent in an agile and inclusive workspace, empowered to deliver top client service across all our lines of business: property, casualty, professional, financial lines, and specialty. With an innovative and flexible approach to risk solutions, we partner with those who move the world forward. Learn more at axaxl.com.

What we OFFER - Inclusion: AXA XL is committed to equal employment opportunity and will consider applicants regardless of gender, sexual orientation, age, ethnicity and origins, marital status, religion, disability, or any other protected characteristic. At AXA XL, we know that an inclusive culture enables business growth and is critical to our success. That's why we have made a strategic commitment to attract, develop, advance, and retain the most inclusive workforce possible, and to create a culture where everyone can bring their full selves to work and reach their highest potential. It's about helping one another, and our business, to move forward and succeed. Five Business Resource Groups focused on gender, LGBTQ+, ethnicity and origins, disability, and inclusion, with 20 chapters around the globe. Robust support for flexible working arrangements. Enhanced family-friendly leave benefits. Named to the Diversity Best Practices Index. Signatory to the UK Women in Finance Charter. Learn more at axaxl.com/about-us/inclusion-and-diversity.

Total Rewards: AXA XL's Reward program is designed to take care of what matters most to you, covering the full picture of your health, wellbeing, lifestyle, and financial security. It provides competitive compensation and personalized, inclusive benefits that evolve as you do. We're committed to rewarding your contribution for the long term, so you can be your best self today and look forward to the future with confidence.

Sustainability: At AXA XL, sustainability is integral to our business strategy. In an ever-changing world, AXA XL protects what matters most for our clients and communities. We know that sustainability is at the root of a more resilient future.
Our 2023-26 sustainability strategy, called "Roots of Resilience", focuses on protecting natural ecosystems, addressing climate change, and embedding sustainable practices across our operations.

Our Pillars - Valuing nature: How we impact nature affects how nature impacts us. Resilient ecosystems, the foundation of a sustainable planet and society, are essential to our future. We're committed to protecting and restoring nature, from mangrove forests to the bees in our backyard, by increasing biodiversity awareness and inspiring clients and colleagues to put nature at the heart of their plans. Addressing climate change: The effects of a changing climate are far-reaching and significant. Unpredictable weather, increasing temperatures, and rising sea levels cause both social inequalities and environmental disruption. We're building a net zero strategy, developing insurance products and services, and mobilizing to advance thought leadership and investment in societal-led solutions. Integrating ESG: All companies have a role to play in building a more resilient future. Incorporating ESG considerations into our internal processes and practices builds resilience from the roots of our business. We're training our colleagues, engaging our external partners, and evolving our sustainability governance and reporting. AXA Hearts in Action: We have established volunteering and charitable giving programs to help colleagues support causes that matter most to them, known as AXA XL's "Hearts in Action" programs. These include our Matching Gifts program, Volunteering Leave, and our annual volunteering day, the Global Day of Giving. For more information, please see axaxl.com/sustainability.
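A minimal sketch of the Delta-table ETL described in the responsibilities above (PySpark in Azure Databricks); the mount paths and columns are hypothetical, and `spark` is assumed to be preconfigured by the Databricks runtime:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical ADLS landing zone mounted at /mnt/landing.
raw = spark.read.json("/mnt/landing/policies/2024/")

clean = (
    raw.dropDuplicates(["policy_id"])
       .withColumn("ingest_date", F.current_date())
       .filter(F.col("premium") >= 0)
)

# Delta tables give ACID writes and time travel; partitioning by ingest_date
# lets downstream queries prune files.
(clean.write
      .format("delta")
      .mode("append")
      .partitionBy("ingest_date")
      .saveAsTable("lake.policies"))
```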

Posted 6 days ago

Apply

4.0 years

0 Lacs

Gurugram, Haryana, India

On-site

Source: LinkedIn

Achieving our goals starts with supporting yours. Grow your career, access top-tier health and wellness benefits, build lasting connections with your team and our customers, and travel the world using our extensive route network. Come join us to create what's next. Let's define tomorrow, together.

Description: United's Digital Technology team designs, develops, and maintains massively scaling technology solutions brought to life with innovative architectures, data analytics, and digital solutions. Find your future at United! We're reinventing what our industry looks like, and what an airline can be, from the planes we fly to the people who fly them. When you join us, you're joining a global team of 100,000+ connected by a shared passion with a wide spectrum of experience and skills to lead the way forward. Achieving our ambitions starts with supporting yours. Evolve your career and find your next opportunity. Get the care you need with industry-leading health plans and best-in-class programs to support your emotional, physical, and financial wellness. Expand your horizons with travel across the world's biggest route network. Connect outside your team through employee-led Business Resource Groups. Create what's next with us. Let's define tomorrow together.

Job Overview and Responsibilities: The Data Scientist will work on United's predictive maintenance use cases. This will include collaboration with aircraft engineers to plan installation of new aircraft sensors and avionics, enhancements to existing aircraft monitoring software, deployment of new predictive models, collaboration with OEMs, integration of health monitoring into United's approved maintenance programs, and technology to provide aircraft data directly to technicians. Liaise with stakeholders in the Aircraft Health Monitoring team, Aircraft Engineering, and other data scientists to understand the importance of the various predictive maintenance use cases and their execution feasibility, and perform cost-benefit analyses to identify the highest-impact use cases to prioritize. Work with complex sensor data coming out of aircraft systems like ACARS and QAR and parse out the relevant information. Analyze aircraft maintenance and operations data to identify issues with critical aircraft systems. Build, test, deploy, and monitor complex machine learning algorithms that predict aircraft failures. Maintain and enhance existing model deployments to make sure they stay valuable and relevant. Mentor junior members of the team and stay up to date with ML advances.

This position is offered on local terms and conditions. Expatriate assignments and sponsorship for employment visas, even on a time-limited visa status, will not be awarded. This position is for United Airlines Business Services Pvt. Ltd, a wholly owned subsidiary of United Airlines Inc.
Qualifications - What's needed to succeed (Minimum Qualifications): Bachelor's degree in a quantitative field like Math, Statistics, Operations Research, Computer Science, Engineering, Physics, or a related field. At least 4 years of experience in Data Science/Machine Learning/advanced analytics/optimization. Deep understanding of data structures, relationships, and efficient transformations. Familiarity with all parts of the data ecosystem: acquisition, engineering, storage, management, analysis, visualization, and model deployment and monitoring. Strong knowledge of database querying and the ability to write complex queries to extract data. Fluency in Python/PySpark. Strong knowledge of statistical modelling, feature engineering, and testing. Ability to propose novel modeling approaches based on the problem being solved. Must be legally authorized to work in India for any employer without sponsorship. Must be fluent in English (written and spoken). Successful completion of an interview is required to meet job qualifications. Reliable, punctual attendance is an essential function of the position.

What will help you propel from the pack (Preferred Qualifications): Master's degree in a quantitative field. At least 2 years of experience in a project leadership role preferred. Airline experience or knowledge of airline operations preferred. Experience with anomaly detection, imbalanced classification, and time series analysis is strongly preferred. Able to comprehend new data sources quickly, ask the right questions, and quickly get comfortable with new jargon. GGN00001950
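As a toy illustration of the anomaly-detection work this posting prefers, here is a rolling z-score pass over a hypothetical sensor feed; a real predictive-maintenance model would be far richer than this baseline, and the file and column names are made up:

```python
import pandas as pd

# Hypothetical QAR exhaust-gas-temperature feed sampled at 1 Hz.
df = pd.read_parquet("sensor_egt.parquet")  # columns: ts, egt_c

win = 300  # five minutes of 1 Hz readings
mu = df["egt_c"].rolling(win).mean()
sigma = df["egt_c"].rolling(win).std()

# Flag readings far outside the recent operating band.
df["zscore"] = (df["egt_c"] - mu) / sigma
anomalies = df[df["zscore"].abs() > 4]

print(f"{len(anomalies)} anomalous readings flagged for engineering review")
```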

Posted 6 days ago

Apply

2.0 years

0 Lacs

Gurugram, Haryana, India

On-site

Source: LinkedIn

Achieving our goals starts with supporting yours. Grow your career, access top-tier health and wellness benefits, build lasting connections with your team and our customers, and travel the world using our extensive route network. Come join us to create what's next. Let's define tomorrow, together.

Description: United's Kinective Media Data Engineering team designs, develops, and maintains massively scaling ad-technology solutions brought to life with innovative architectures, data analytics, and digital solutions.

Our Values: At United Airlines, we believe that inclusion propels innovation and is the foundation of all that we do. Our Shared Purpose, "Connecting people. Uniting the world.", drives us to be the best airline for our employees, customers, and everyone we serve, and we can only do that with a truly diverse and inclusive workforce. Our team spans the globe and is made up of diverse individuals all working together with cutting-edge technology to build the best airline in the history of aviation. With multiple employee-run "Business Resource Group" communities and world-class benefits like health insurance, parental leave, and space-available travel, United is truly a one-of-a-kind place to work that will make you feel welcome and accepted. Come join our team and help us make a positive impact on the world.

Job Overview and Responsibilities: The Data Engineering organization is responsible for driving data-driven insights and innovation to support the data needs of commercial projects with a digital focus. The Data Engineer will be responsible for partnering with various teams to define and execute data acquisition, transformation, and processing, and for making data actionable for operational and analytics initiatives that create sustainable revenue and share growth. Execute unit tests and validate expected results to ensure the accuracy and integrity of data and applications through analysis, coding, clear documentation, and problem resolution. This role will also drive the adoption of data processing and analysis within the AWS environment and help cross-train other members of the team. Leverage strategic and analytical skills to understand and solve customer- and business-centric questions. Coordinate and guide cross-functional projects that involve team members across all areas of the enterprise, vendors, external agencies, and partners. Leverage data from a variety of sources to develop data marts and insights that provide a comprehensive understanding of the business. Develop and implement innovative solutions leading to automation. Use Agile methodologies to manage projects. Mentor and train junior engineers.

This position is offered on local terms and conditions. Expatriate assignments and sponsorship for employment visas, even on a time-limited visa status, will not be awarded. This position is for United Airlines Business Services Pvt. Ltd, a wholly owned subsidiary of United Airlines Inc.

Qualifications - Required: BS/BA in computer science or a related STEM field. 2+ years of IT experience in software development. 2+ years of development experience using Java, Python, or Scala. 2+ years of experience with big data technologies like PySpark, Hadoop, Hive, HBase, Kafka, and NiFi (a streaming sketch follows this posting). 2+ years of experience with database systems like Redshift, MS SQL Server, Oracle, and Teradata.
Creative, driven, detail-oriented individuals who enjoy tackling tough problems with data and insights, and who have a natural curiosity and desire to solve problems, are encouraged to apply. Must be legally authorized to work in India for any employer without sponsorship. Must be fluent in English (written and spoken). Successful completion of an interview is required to meet job qualifications. Reliable, punctual attendance is an essential function of the position.

Preferred: Master's in computer science or a related STEM field. Experience with cloud-based systems like AWS, Azure, or Google Cloud. Certified Developer/Architect on AWS. Strong experience with continuous integration and delivery using Agile methodologies. Data engineering experience in the transportation/airline industry. Strong problem-solving skills. Strong knowledge in big data. GGN00002011
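As referenced above, a minimal Structured Streaming sketch of Kafka-to-lake ingestion; the broker, topic, paths, and schema are hypothetical, and the job needs the spark-sql-kafka connector package on the classpath:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StringType, DoubleType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

schema = (StructType()
          .add("event_id", StringType())
          .add("amount", DoubleType()))

# Hypothetical Kafka topic of JSON events.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "booking-events")
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Checkpointing lets the stream recover without duplicating output.
query = (events.writeStream
         .format("parquet")
         .option("path", "/data/streams/bookings/")
         .option("checkpointLocation", "/chk/bookings/")
         .start())
query.awaitTermination()
```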

Posted 6 days ago

Apply

3.0 - 6.0 years

0 Lacs

Gurgaon, Haryana, India

On-site

Source: LinkedIn

dunnhumby is the global leader in Customer Data Science, empowering businesses everywhere to compete and thrive in the modern data-driven economy. We always put the Customer First. Our mission: to enable businesses to grow and reimagine themselves by becoming advocates and champions for their Customers. With deep heritage and expertise in retail, one of the world's most competitive markets with a deluge of multi-dimensional data, dunnhumby today enables businesses all over the world, across industries, to be Customer First. dunnhumby employs nearly 2,500 experts in offices throughout Europe, Asia, Africa, and the Americas, working for transformative, iconic brands such as Tesco, Coca-Cola, Meijer, Procter & Gamble and Metro. Most companies try to meet expectations; dunnhumby exists to defy them. Using big data, deep expertise and AI-driven platforms to decode the 21st century human experience, then redefine it in meaningful and surprising ways that put customers first. Across digital, mobile and retail. For brands like Tesco, Coca-Cola, Procter & Gamble and PepsiCo.

We're looking for a Big Data Engineer to join our strategic Loyalty and Personalisation team, which builds products retailers can use to find the optimal customer segments and send personalised offers and digital recommendations to the consumer. These products are strategic assets for retailers to improve the loyalty of their consumers, which makes them very important for retailers and therefore for dunnhumby.

What We Expect From You: 3 to 6 years of experience in software development using Python. Hands-on experience with Python OOP, design patterns, dependency injection, data libraries (Pandas), and data structures. Exposure to Spark: PySpark, the architecture of Spark, and best practices to optimize jobs. Experience in the Hadoop ecosystem: HDFS, Hive, or YARN. Experience with orchestration tools: Airflow, Argo Workflows, Kubernetes. Experience with cloud-native services (GCP/Azure/AWS), preferably GCP. Database knowledge: SQL, NoSQL. Hands-on exposure to CI/CD pipelines for data engineering workflows. Testing: pytest for unit testing, pytest-spark to create a test Spark session (see the sketch after this posting), and Spark UI for performance tuning and monitoring. Good to have: Scala.

What You Can Expect From Us: We won't just meet your expectations. We'll defy them. So you'll enjoy the comprehensive rewards package you'd expect from a leading technology company. But also, a degree of personal flexibility you might not expect. Plus, thoughtful perks, like flexible working hours and your birthday off. You'll also benefit from an investment in cutting-edge technology that reflects our global ambition. But with a nimble, small-business feel that gives you the freedom to play, experiment and learn. And we don't just talk about diversity and inclusion. We live it every day, with thriving networks including dh Gender Equality Network, dh Proud, dh Family, dh One and dh Thrive as the living proof. Everyone's invited.

Our approach to Flexible Working: At dunnhumby, we value and respect difference and are committed to building an inclusive culture by creating an environment where you can balance a successful career with your commitments and interests outside of work. We believe that you will do your best at work if you have a work/life balance. Some roles lend themselves to flexible options more than others, so if this is important to you please raise this with your recruiter, as we are open to discussing agile working opportunities during the hiring process.
We want everyone to have the opportunity to shine and perform at their best throughout our recruitment process. Please let us know how we can make this process work best for you. For further information about how we collect and use your personal information please see our Privacy Notice which can be found (here).
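As referenced in the skills list above, a minimal pytest sketch using the `spark_session` fixture that the pytest-spark plugin provides; the transformation under test is a hypothetical example:

```python
# test_transforms.py -- run with `pytest` after installing pytest-spark
from pyspark.sql import functions as F

def add_revenue(df):
    """Hypothetical transformation under test: revenue = price * qty."""
    return df.withColumn("revenue", F.col("price") * F.col("qty"))

def test_add_revenue(spark_session):
    df = spark_session.createDataFrame([(2.0, 3), (5.0, 1)], ["price", "qty"])
    out = add_revenue(df).collect()
    assert [row["revenue"] for row in out] == [6.0, 5.0]
```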

Posted 6 days ago

Apply

0 years

0 Lacs

Pune, Maharashtra, India

On-site

Source: LinkedIn

Company Overview: Viraaj HR Solutions is a leading recruitment firm in India, dedicated to connecting top talent with industry-leading companies. We focus on understanding the unique needs of each client, providing tailored HR solutions that enhance their workforce capabilities. Our mission is to empower organizations by bridging the gap between talent and opportunity. We value integrity, collaboration, and excellence in service delivery, ensuring a seamless experience for both candidates and employers.

Job Title: PySpark Data Engineer. Work Mode: On-Site. Location: India.

Role Responsibilities: Design, develop, and maintain data pipelines using PySpark. Collaborate with data scientists and analysts to gather data requirements. Optimize data processing workflows for efficiency and performance. Implement ETL processes to integrate data from various sources. Create and maintain data models that support analytical reporting. Ensure data quality and accuracy through rigorous testing and validation. Monitor and troubleshoot production data pipelines to resolve issues. Work with SQL databases to extract and manipulate data as needed. Utilize cloud technologies for data storage and processing solutions. Participate in code reviews and provide constructive feedback. Document technical specifications and processes clearly for team reference. Stay updated with industry trends and emerging technologies in big data. Collaborate with cross-functional teams to deliver data solutions. Support data governance initiatives to ensure compliance. Provide training and mentorship to junior data engineers.

Qualifications: Bachelor's degree in Computer Science, Information Technology, or a related field. Proven experience as a Data Engineer, preferably with PySpark. Strong understanding of data warehousing concepts and architecture. Hands-on experience with ETL tools and frameworks. Proficiency in SQL and NoSQL databases. Familiarity with cloud platforms like AWS, Azure, or Google Cloud. Experience with Python programming for data manipulation. Knowledge of data modeling techniques and best practices. Ability to work in a fast-paced environment and juggle multiple tasks. Excellent problem-solving skills and attention to detail. Strong communication and interpersonal skills. Ability to work independently and as part of a team. Experience in Agile methodologies and practices. Knowledge of data governance and compliance standards. Familiarity with BI tools such as Tableau or Power BI is a plus.

Skills: data modeling, python programming, pyspark, bi tools, sql proficiency, sql, cloud technologies, nosql databases, etl processes, data warehousing, agile methodologies, cloud computing, data engineer

Posted 6 days ago

Apply

7.0 - 10.0 years

0 Lacs

Pune, Maharashtra, India

On-site

Source: LinkedIn

What do we do? The TTS Analytics team provides analytical insights to the Product, Pricing, Client Experience, and Sales functions within the global Treasury & Trade Services business. The team works on business problems focused on driving acquisitions, cross-sell, revenue growth, and improvements in client experience. The team extracts relevant insights, identifies business opportunities, converts business problems into analytical frameworks, uses big data tools and machine learning algorithms to build predictive models and other solutions, and designs go-to-market strategies for a huge variety of business problems.

Role Description: The role will be Asst. Vice President (C12) in the TTS Analytics team. The role will report to the VP or SVP leading the team. The role will involve working on multiple analyses through the year on business problems across the client life cycle (acquisition, engagement, client experience, and retention) for the TTS business. This will involve leveraging multiple analytical approaches, tools, and techniques, and working with multiple data sources (client profile and engagement data, transactions and revenue data, digital data, unstructured data like call transcripts, etc.) to provide data-driven insights to business and functional stakeholders.

Qualifications - Experience: Bachelor's Degree with 7-10 years of experience in data analytics, or Master's Degree with 6-10 years of experience in data analytics, or PhD. Must have: Marketing analytics experience. Experience with business problems around sales/marketing strategy optimization, pricing optimization, client experience, cross-sell, and retention. Experience across different analytical methods like hypothesis testing, segmentation, time series forecasting, test vs. control comparison (illustrated after this posting), etc. Predictive modeling using machine learning. Experience with unstructured data analysis, e.g. call transcripts, using Natural Language Processing (NLP)/text mining. Good to have: Experience in financial services. Experience working with data from different sources and of different complexity.

Skills - Analytical Skills: Ability to understand complex business problems, break them down into simpler solvable parts, and develop analytical approaches to tackle them. Strong logical reasoning and problem-solving ability. Proficient in converting business problems into analytical tasks, and analytical findings into business insights. Proficient in identifying trends and patterns in data.

Tools and Platforms: Proficient in Python/R and SQL. Experience in Hive. Proficient in MS Excel and PowerPoint. Good to have: Experience with PySpark. Experience with Tableau.

Soft Skills: Ability to drive clear and comprehensive communication between business stakeholders and the team in both directions, translating analytical findings into key insights and actionable recommendations. Ability to coach and mentor other members of the team on an ongoing basis. Ability to drive ideation on analytical projects to tackle strategic business priorities. Comfort working with ambiguity and open-ended questions. Strong process/project management skills. Contribute to organizational initiatives in wide-ranging areas including competency development, training, organizational building activities, etc.
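A toy illustration of the test vs. control comparison listed above, using Welch's t-test on simulated data (all numbers are made up):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Simulated revenue per client: the test group saw a pricing change, control did not.
test = rng.normal(105, 20, 400)
control = rng.normal(100, 20, 400)

# Welch's t-test does not assume equal variances across groups.
t_stat, p_value = stats.ttest_ind(test, control, equal_var=False)
lift = test.mean() / control.mean() - 1

print(f"lift = {lift:.1%}, t = {t_stat:.2f}, p = {p_value:.4f}")
# A p-value below the pre-agreed threshold (commonly 0.05) supports rollout.
```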
Job Family Group: Decision Management. Job Family: Business Analysis. Time Type: Full time. Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law. If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi. View Citi's EEO Policy Statement and the Know Your Rights poster.

Posted 6 days ago

Apply

4.0 - 9.0 years

14 - 20 Lacs

Hyderabad, Bengaluru

Work from Office

Source: Naukri

Job Description: PySpark Data Engineer.
1. API Development: Design, develop, and maintain robust APIs using FastAPI and RESTful principles for scalable backend systems.
2. Big Data Processing: Leverage PySpark to process and analyze large datasets efficiently, ensuring optimal performance in big data environments.
3. Full-Stack Integration: Develop seamless backend-to-frontend feature integrations, collaborating with front-end developers for cohesive user experiences.
4. CI/CD Pipelines: Implement and manage CI/CD pipelines using GitHub Actions and Azure DevOps to streamline deployments and ensure system reliability.
5. Containerization: Utilize Docker for building and deploying containerized applications in development and production environments.
6. Team Leadership: Lead and mentor a team of developers, providing guidance, code reviews, and support to junior team members to ensure high-quality deliverables.
7. Code Optimization: Write clean, maintainable, and efficient Python code, with a focus on scalability, reusability, and performance.
8. Cloud Deployment: Deploy and manage applications on cloud platforms like Azure, ensuring high availability and fault tolerance.
9. Collaboration: Work closely with cross-functional teams, including product managers and designers, to translate business requirements into technical solutions.
10. Documentation: Maintain thorough documentation for APIs, processes, and systems to ensure transparency and ease of maintenance.

Highlighted Skillset:
Big Data: Strong PySpark skills for processing large datasets.
DevOps: Proficiency in GitHub Actions, CI/CD pipelines, Azure DevOps, and Docker.
Integration: Experience in backend-to-frontend feature connectivity.
Leadership: Proven ability to lead and mentor development teams.
Cloud: Knowledge of deploying and managing applications in Azure or other cloud environments.
Team Collaboration: Strong interpersonal and communication skills for working in cross-functional teams.
Best Practices: Emphasis on clean code, performance optimization, and robust documentation.

Posted 6 days ago

Apply

5.0 years

0 Lacs

Pune, Maharashtra, India

On-site

Source: LinkedIn

Our Purpose: Mastercard powers economies and empowers people in 200+ countries and territories worldwide. Together with our customers, we're helping build a sustainable economy where everyone can prosper. We support a wide range of digital payments choices, making transactions secure, simple, smart and accessible. Our technology and innovation, partnerships and networks combine to deliver a unique set of products and services that help people, businesses and governments realize their greatest potential.

Title and Summary: Lead, Data Engineer. Who is Mastercard? Mastercard is a global technology company in the payments industry. Our mission is to connect and power an inclusive, digital economy that benefits everyone, everywhere by making transactions safe, simple, smart, and accessible. Using secure data and networks, partnerships and passion, our innovations and solutions help individuals, financial institutions, governments, and businesses realize their greatest potential. Our decency quotient, or DQ, drives our culture and everything we do inside and outside of our company. With connections across more than 210 countries and territories, we are building a sustainable world that unlocks priceless possibilities for all.

Overview: The Mastercard Services Technology team is looking for a Lead in Data Engineering to drive our mission to unlock the potential of data assets by consistently innovating, eliminating friction in how we manage big data assets, store those assets, and make data accessible, and by enforcing standards and principles in the big data space, both on public cloud and on-premises setups. We are looking for a hands-on, passionate Data Engineer who is not only technically strong in PySpark, cloud platforms, and building modern data architectures, but also deeply committed to learning, growing, and lifting others. This person will play a key role in designing and building scalable data solutions, shaping our engineering culture, and mentoring team members. This is a role for builders and collaborators: engineers who love clean data pipelines, cloud-native design, and helping teammates succeed.

Role: Design and build scalable, cloud-native data platforms using PySpark, Python, and modern data engineering practices. Mentor and guide other engineers, sharing knowledge, reviewing code, and fostering a culture of curiosity, growth, and continuous improvement. Create robust, maintainable ETL/ELT pipelines that integrate with diverse systems and serve business-critical use cases. Lead by example: write high-quality, testable code and participate in architecture and design discussions with a long-term view in mind. Decompose complex problems into modular, efficient, and scalable components that align with platform and product goals. Champion best practices in data engineering, including testing, version control, documentation, and performance tuning. Drive collaboration across teams, working closely with product managers, data scientists, and other engineers to deliver high-impact solutions. Support data governance and quality efforts, ensuring data lineage, cataloging, and access management are built into the platform. Continuously learn and apply new technologies, frameworks, and tools to improve team productivity and platform reliability. Own and optimize cloud infrastructure components related to data engineering workflows, storage, processing, and orchestration.
Participate in architectural discussions, iteration planning, and feature sizing meetings. Adhere to Agile processes and participate actively in agile ceremonies. Stakeholder management skills.

All About You: 5+ years of hands-on experience in data engineering with strong PySpark and Python skills. Solid experience designing and implementing data models, pipelines, and batch/stream processing systems. Proven ability to work with cloud platforms (AWS, Azure, or GCP), especially data-related services like S3, Glue, Data Factory, Databricks, etc. Strong foundation in data modeling, database design, and performance optimization. Understanding of modern data architectures (e.g., lakehouse, medallion; a sketch of the medallion pattern follows this posting) and data lifecycle management. Comfortable with CI/CD practices, version control (e.g., Git), and automated testing. Demonstrated ability to mentor and uplift junior engineers; strong communication and collaboration skills. Bachelor's degree in computer science, engineering, or a related field, or equivalent hands-on experience. Comfortable working in Agile/Scrum development environments. Curious, adaptable, and driven by problem-solving and continuous improvement.

Good to Have: Experience integrating heterogeneous systems and building resilient data pipelines across cloud environments. Familiarity with orchestration tools (e.g., Airflow, dbt, Step Functions, etc.). Exposure to data governance tools and practices (e.g., Lake Formation, Purview, or Atlan). Experience with containerization and infrastructure automation (e.g., Docker, Terraform) is a good addition. A Master's degree, relevant certifications (e.g., AWS Certified Data Analytics, Azure Data Engineer), or demonstrable contributions to open-source/data-engineering communities will be a bonus. Exposure to machine learning data pipelines or MLOps is a plus.

Corporate Security Responsibility: All activities involving access to Mastercard assets, information, and networks come with an inherent risk to the organization, and it is therefore expected that every person working for, or on behalf of, Mastercard is responsible for information security and must: abide by Mastercard's security policies and practices; ensure the confidentiality and integrity of the information being accessed; report any suspected information security violation or breach; and complete all periodic mandatory security trainings in accordance with Mastercard's guidelines. R-251380
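A minimal sketch of the medallion (bronze/silver/gold) layering named in the qualifications above; paths and columns are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: raw events persisted as-is so loads can be replayed.
bronze = spark.read.json("/lake/bronze/transactions/")

# Silver: validated, deduplicated, conformed types.
silver = (bronze.dropDuplicates(["txn_id"])
                .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
                .filter(F.col("txn_id").isNotNull()))
silver.write.format("delta").mode("overwrite").save("/lake/silver/transactions/")

# Gold: business-level aggregate served to analysts and dashboards.
gold = silver.groupBy("merchant_id").agg(F.sum("amount").alias("total_spend"))
gold.write.format("delta").mode("overwrite").save("/lake/gold/merchant_spend/")
```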

Posted 6 days ago

Apply

5.0 - 9.0 years

0 Lacs

Pune, Maharashtra, India

On-site

Linkedin logo

Position: Azure Data Engineer
Location: Pune
Mandatory Skills: Azure Databricks, PySpark
Experience: 5 to 9 years
Notice Period: 0 to 30 days / immediate joiner / serving notice period

Must-have experience:

Strong design and data solutioning skills.
Hands-on PySpark experience with complex transformations and large dataset handling.
Good command of and hands-on experience in Python, including the following concepts, packages, and tools: object-oriented and functional programming; NumPy, Pandas, Matplotlib, requests, pytest; Jupyter, PyCharm, and IDLE; Conda and virtual environments.
Working experience with Hive, HBase, or similar is a must.

Azure skills:

Working experience in Azure Data Lake, Azure Data Factory, Azure Databricks, and Azure SQL Database is a must.
Azure DevOps.
Azure AD integration, service principals, pass-through login, etc.
Networking: VNet, private links, service connections, etc.
Integrations: Event Grid, Service Bus, etc.

Database skills:

Experience with at least one of Oracle, Postgres, or SQL Server.
Oracle PL/SQL or T-SQL experience.
Data modelling.

Thank you

Posted 6 days ago

Apply

2.0 - 5.0 years

4 - 7 Lacs

Hyderabad

Work from Office

Naukri logo

We are seeking an MDM Associate Data Engineer with 2-5 years of experience to support and enhance our enterprise MDM (Master Data Management) platforms using Informatica/Reltio. This role is critical in delivering high-quality master data solutions across the organization, utilizing modern tools like Databricks and AWS to drive insights and ensure data reliability. The ideal candidate will have strong SQL and data profiling skills and experience working with cross-functional teams in a pharma environment.

To succeed in this role, the candidate must have strong data engineering experience along with MDM knowledge; candidates with only MDM experience are not eligible. The candidate must have data engineering experience with technologies such as SQL, Python, PySpark, Databricks, and AWS, along with knowledge of MDM (Master Data Management).

Roles & Responsibilities:

Analyze and manage customer master data using Reltio or Informatica MDM solutions.
Perform advanced SQL queries and data analysis to validate and ensure master data integrity.
Leverage Python, PySpark, and Databricks for scalable data processing and automation.
Collaborate with business and data engineering teams for continuous improvement in MDM solutions.
Implement data stewardship processes and workflows, including approval and DCR mechanisms.
Utilize AWS cloud services for data storage and compute processes related to MDM.
Contribute to metadata and data modeling activities.
Track and manage data issues using tools such as JIRA and document processes in Confluence.
Apply Life Sciences/Pharma industry context to ensure data standards and compliance.

Basic Qualifications and Experience:

Master's degree with 1-3 years of experience in Business, Engineering, IT, or a related field; OR
Bachelor's degree with 2-5 years of experience in Business, Engineering, IT, or a related field; OR
Diploma with 6-8 years of experience in Business, Engineering, IT, or a related field.

Functional Skills:

Must-Have Skills:

Advanced SQL expertise and data wrangling.
Strong experience in Python and PySpark for data transformation workflows.
Strong experience with Databricks and AWS architecture.
Knowledge of MDM, data governance, stewardship, and profiling practices.
In addition to the above, candidates with experience on Informatica or Reltio MDM platforms will be preferred.

Good-to-Have Skills:

Experience with IDQ, data modeling, and approval workflows/DCR.
Background in Life Sciences/Pharma industries.
Familiarity with project tools like JIRA and Confluence.
Strong grip on data engineering concepts.

Professional Certifications:

Any ETL certification (e.g., Informatica).
Any data analysis certification (SQL, Python, Databricks).
Any cloud certification (AWS or Azure).

Soft Skills:

Strong analytical abilities to assess and improve master data processes and solutions.
Excellent verbal and written communication skills, with the ability to convey complex data concepts clearly to technical and non-technical stakeholders.
Effective problem-solving skills to address data-related issues and implement scalable solutions.
Ability to work effectively with global, virtual teams.

We will ensure that individuals with disabilities are provided with reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request an accommodation.

Posted 6 days ago

Apply

5.0 - 8.0 years

15 - 25 Lacs

Gurugram, Bengaluru

Hybrid

Naukri logo

Warm Greetings from SP Staffing!!

Role: Azure Data Engineer
Experience Required: 5 to 8 yrs
Work Location: Bangalore/Gurgaon
Required Skills: Azure Databricks, ADF, PySpark/SQL

Interested candidates can send resumes to nandhini.spstaffing@gmail.com

Posted 6 days ago

Apply

6.0 - 8.0 years

0 - 1 Lacs

Hyderabad

Hybrid

Naukri logo

ML Engineer | RAG, LLM, AWS, Databricks | 6–8 Yrs Exp | Build scalable ML systems with GenAI, pipelines & cloud integration

Posted 6 days ago

Apply

89.0 years

0 Lacs

Mumbai, Maharashtra, India

On-site

Linkedin logo

Responsibilities

Job Description

We are looking for a Python Developer responsible for developing a Python-based API automation framework and ensuring high performance, who can design, develop, and maintain automated test scripts for backend services and RESTful/SOAP APIs. The ideal candidate will be well-versed in test automation frameworks, CI/CD pipelines, and modern software development practices.

Responsibilities/Required Skills:

Tech stack: Python, SQL, CI/CD, DB2, Greenplum.
Working knowledge of JSON, XML, Requests, Pandas, XML parsing, Python scripting, and pytest.
Design and implement automated test suites for RESTful and SOAP APIs using Python.
Validate request/response schemas, status codes, headers, and response times.
Collaborate closely with developers, QA, and DevOps to ensure high quality and timely releases.
Perform root cause analysis and debug complex issues in automation or service logic.
Maintain test data and configuration across multiple environments.
Experience with data collection using APIs and HTTP requests.
Design and implement low-latency, high-availability, and performant test applications.
Work in a fast-paced Agile SDLC framework.
Experience with API-based architectures (SOA, microservices) and message queues.
Strong proficiency in Python and experience in API test automation.
Hands-on experience with tools like Postman, pytest, Requests, and REST Assured.
Knowledge of JSON, XML, and API specs like Swagger.
Familiarity with CI/CD pipelines, Docker, and Git.
Excellent debugging and problem-solving skills.

Good to have:

Understanding of secure API authentication mechanisms; PySpark.
Design and development of interactive, user-friendly Power BI dashboards and reports.

What You Can Expect From Morgan Stanley

We are committed to maintaining the first-class service and high standard of excellence that have defined Morgan Stanley for over 89 years. Our values - putting clients first, doing the right thing, leading with exceptional ideas, committing to diversity and inclusion, and giving back - aren't just beliefs; they guide the decisions we make every day to do what's best for our clients, communities and more than 80,000 employees in 1,200 offices across 42 countries.

At Morgan Stanley, you'll find an opportunity to work alongside the best and the brightest, in an environment where you are supported and empowered. Our teams are relentless collaborators and creative thinkers, fueled by their diverse backgrounds and experiences. We are proud to support our employees and their families at every point along their work-life journey, offering some of the most attractive and comprehensive employee benefits and perks in the industry. There's also ample opportunity to move about the business for those who show passion and grit in their work.

Morgan Stanley is an equal opportunities employer. We work to provide a supportive and inclusive environment where all individuals can maximize their full potential. Our skilled and creative workforce is comprised of individuals drawn from a broad cross section of the global communities in which we operate and who reflect a variety of backgrounds, talents, perspectives, and experiences. Our strong commitment to a culture of inclusion is evident through our constant focus on recruiting, developing, and advancing individuals based on their skills and talents.

Posted 6 days ago

Apply

4.0 years

0 Lacs

Chennai, Tamil Nadu, India

Remote

Linkedin logo

We are hiring immediate joiners. This is a remote-mode job.

Job Title: GCP Data Engineer (Google Cloud Platform)
Experience: 4+ years
Location: Chennai (Hybrid)

Responsibilities

Google Cloud Platform: BigQuery, Dataflow, Dataproc, Data Fusion, Terraform, Tekton, Cloud SQL, Airflow, Postgres, PySpark, Python, APIs.
2+ years in GCP services: BigQuery, Dataflow, Dataproc, Dataplex, Data Fusion, Terraform, Tekton, Cloud SQL, Redis Memorystore, Airflow, Cloud Storage.
2+ years in data transfer utilities.
2+ years in Git or any other version control tool.
2+ years in Confluent Kafka.
1+ years of experience in API development.
2+ years in an Agile framework.
4+ years of strong experience in Python and PySpark development.
4+ years of shell scripting to develop ad hoc jobs for data importing/exporting.

Posted 6 days ago

Apply

3.0 years

0 Lacs

India

On-site

Linkedin logo

Note: Please do not apply if your salary expectations are higher than the provided salary range or your experience is less than 3 years. If you have travel-industry experience and have worked on hotel, car rental, or ferry booking before, then we can negotiate the package.

Company Description

Our company has been promoting Greece for the last 25 years through travel sites visited from all around the world, with 10 million visitors per year, such as www.greeka.com and www.ferriesingreece.com. Through the websites, we provide a range of travel services for a seamless holiday experience, such as online car rental reservations, ferry tickets, transfers, and tours.

Role Description

We are seeking a highly skilled Artificial Intelligence / Machine Learning Engineer to join our dynamic team. You will work closely with our development team and QAs to deliver cutting-edge solutions that improve our candidate screening and employee onboarding processes.

Major Responsibilities & Job Requirements:

Develop and implement NLP/LLM models.
Minimum of 3-4 years of experience as an AI/ML developer or in a similar role, with demonstrable expertise in computer vision techniques.
Develop and implement AI models using Python, TensorFlow, and PyTorch.
Proven experience in computer vision, including fine-tuning OCR models (e.g., Tesseract, LayoutLMv3, EasyOCR, PaddleOCR, or custom-trained models).
Strong understanding and hands-on experience with RAG (Retrieval-Augmented Generation) architectures and pipelines for building intelligent Q&A, document summarization, and search systems.
Experience working with LangChain, LLM agents, and chaining tools to build modular and dynamic LLM workflows.
Familiarity with agent-based frameworks and orchestration of multi-step reasoning with tools, APIs, and external data sources.
Familiarity with cloud AI solutions such as IBM, Azure, Google, and AWS.
Work on natural language processing (NLP) tasks and create language models (LLMs) for various applications.
Design and maintain SQL databases for storing and retrieving data efficiently.
Utilize machine learning and deep learning techniques to build predictive models.
Collaborate with cross-functional teams to integrate AI solutions into existing systems.
Stay updated with the latest advancements in AI technologies, including ChatGPT, Gemini, Claude, and big data solutions.
Write clean, maintainable, and efficient code when required.
Handle large datasets and perform big data analysis to extract valuable insights.
Fine-tune pre-trained LLMs using specific types of data and ensure optimal performance.
Proficiency in cloud services from Amazon AWS.
Extract and parse text from CVs, application forms, and job descriptions using advanced NLP techniques such as Word2Vec, BERT, and GPT-NER.
Develop similarity functions and matching algorithms to align candidate skills with job requirements.
Experience with microservices, Flask, FastAPI, and Node.js.
Expertise in Spark and PySpark for big data processing.
Knowledge of advanced techniques such as SVD/PCA, LSTM, and NeuralProphet.
Apply debiasing techniques to ensure fairness and accuracy in the ML pipeline.
Experience coordinating with clients to understand their needs and delivering AI solutions that meet their requirements.

Qualifications:

Bachelor's or Master's degree in Computer Science, Data Science, Artificial Intelligence, or a related field.
In-depth knowledge of NLP techniques and libraries, including Word2Vec, BERT, GPT, and others.
Experience with database technologies and vector representations of data.
Familiarity with similarity functions and distance metrics used in matching algorithms.
Ability to design and implement custom ontologies and classification models.
Excellent problem-solving skills and attention to detail.
Strong communication and collaboration skills.

Posted 6 days ago

Apply

4.0 - 7.0 years

10 - 17 Lacs

Pune

Hybrid

Naukri logo

Hi, greetings for the day!

We found your profile suitable for the below opening; kindly go through the JD and reach out to us if you are interested.

About Us

Incorporated in 2006, we are an 18-year-old recruitment and staffing company, providing junior, middle, and executive talent to some of the Fortune 500 companies.

About Client

Hiring for one of the most prestigious multinational corporations!

Job Description

Job Title: Azure Data Engineer
Qualification: Any Graduate or above
Relevant Experience: 4+ yrs
Required Skills: Azure Databricks, Python, PySpark, SQL, Azure Cloud, Power BI (Basic + Debug)
Location: Pune
CTC Range: 10-17 lakhs per annum
Mode of Work: Hybrid

Joel
IT Staff, Black and White Business Solutions Pvt Ltd
Bangalore, Karnataka, INDIA
8067432416 | joel.manivasan@blackwhite.in | www.blackwhite.in

Posted 6 days ago

Apply

3.0 - 20.0 years

0 Lacs

Hyderābād

On-site

GlassDoor logo

We're Hiring: Sr. Data Engineer or Architect (Hyderabad, India)

Altura Innovative Technologies Pvt Ltd is looking for a Data Engineer with expertise in PySpark, Python, and DSA to join our client team!

Experience: 3 to 20 years
Key Skills: PySpark, Python, Data Structures & Algorithms, ETL, SQL, Cloud (AWS/Azure/GCP) or Palantir Foundry; Foundry or GIS is a must
Location: Hyderabad, India (Onsite, 5 days per week)
Work Timings: 2 PM - 10 PM IST

If you're passionate about big data, scalable pipelines, and cloud technologies, we want to hear from you! Share your updated resume with careers@alturaitech.com or 8179033240.

Job Types: Full-time, Permanent
Pay: ₹500,000.00 - ₹3,000,000.00 per year
Benefits: Health insurance, Paid time off, Provident Fund
Schedule: Day shift / UK shift
Work Location: In person

Posted 6 days ago

Apply

40.0 years

0 Lacs

Hyderābād

On-site

GlassDoor logo

India - Hyderabad
JOB ID: R-216678
ADDITIONAL LOCATIONS: India - Hyderabad
WORK LOCATION TYPE: On Site
DATE POSTED: Jun. 12, 2025
CATEGORY: Engineering

ABOUT AMGEN

Amgen harnesses the best of biology and technology to fight the world's toughest diseases, and make people's lives easier, fuller and longer. We discover, develop, manufacture and deliver innovative medicines to help millions of patients. Amgen helped establish the biotechnology industry more than 40 years ago and remains on the cutting edge of innovation, using technology and human genetic data to push beyond what's known today.

ABOUT THE ROLE

Role Description:

We are seeking a seasoned Principal Architect - Solutions to drive the architecture, development, and implementation of data solutions for Amgen functional groups. The ideal candidate is able to work on large-scale data analytics initiatives and to engage and work alongside Business, Program Management, Data Engineering, and Analytics Engineering teams, championing the enterprise data analytics strategy, data architecture blueprints, and architectural guidelines. As a Principal Architect, you will play a crucial role in designing, building, and optimizing data solutions for Amgen functional groups such as R&D, Operations, and GCO.

Roles & Responsibilities:

Implement and manage large-scale data analytics solutions for Amgen functional groups that align with the Amgen data strategy.
Collaborate with Business, Program Management, Data Engineering, and Analytics Engineering teams to deliver data solutions.
Own the design, development, optimization, delivery, and support of data solutions on AWS and Databricks architecture.
Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions.
Provide expert guidance and mentorship to team members, fostering a culture of innovation and best practices.
Be passionate and hands-on; quickly experiment with new data-related technologies.
Define guidelines, standards, strategies, security policies, and change management policies to support the Enterprise Data platform.
Collaborate and align with EARB, Cloud Infrastructure, Security, and other technology leaders on Enterprise Data Architecture changes.
Work with different project and application groups to drive growth of the Enterprise Data Platform using effective written/verbal communication skills, and lead demos at roadmap sessions.
Manage the Enterprise Data Platform on the AWS environment to ensure that service delivery is cost-effective and business SLAs around uptime, performance, and capacity are met.
Ensure scalability, reliability, and performance of data platforms by implementing best practices for architecture, cloud resource optimization, and system tuning.
Collaborate with RunOps engineers to continuously increase our ability to push changes into production with as little manual overhead and as much speed as possible.
Maintain knowledge of market trends and developments in data integration, data management, and analytics software/tools.
Work as part of a team in a SAFe Agile/Scrum model.

Basic Qualifications and Experience:

Master's degree with 12-15 years of experience in Computer Science, IT, or a related field; OR
Bachelor's degree with 14-17 years of experience in Computer Science, IT, or a related field.

Functional Skills:

Must-Have Skills:

8+ years of hands-on experience in data integration, data management, and the BI technology stack.
Strong experience with one or more data management tools such as AWS data lake, Snowflake, or Azure Data Fabric.
Expert-level proficiency with Databricks and experience in optimizing data pipelines and workflows in Databricks environments.
Strong experience with Python, PySpark, and SQL for building scalable data workflows and pipelines.
Experience with Apache Spark, Delta Lake, and other relevant technologies for large-scale data processing.
Familiarity with BI tools, including Tableau and Power BI.
Demonstrated ability to enhance cost-efficiency, scalability, and performance of data solutions.
Strong analytical and problem-solving skills to address complex data solutions.

Good-to-Have Skills:

Experience in life sciences, tech, or consultative solution architecture roles.
Experience working with agile development methodologies such as Scaled Agile.

Professional Certifications:

AWS Certified Data Engineer preferred.
Databricks certification preferred.

Soft Skills:

Excellent analytical and troubleshooting skills.
Strong verbal and written communication skills.
Ability to work effectively with global, virtual teams.
High degree of initiative and self-motivation.
Ability to manage multiple priorities successfully.
Team-oriented, with a focus on achieving team goals.
Strong presentation and public speaking skills.

EQUAL OPPORTUNITY STATEMENT

Amgen is an Equal Opportunity employer and will consider you without regard to your race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, or disability status. We will ensure that individuals with disabilities are provided with reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request an accommodation.

Posted 6 days ago

Apply

5.0 - 8.0 years

0 Lacs

Gurugram, Haryana, India

On-site

Linkedin logo

Job Title: Senior PySpark Data Engineer (share only quality profiles)
Location: Pune/Hybrid
Experience: 5-8 years
Budget: 8-11 LPA
Notice Period: Immediate to 15 days

Mandatory Skills:

Python
SQL
ETL
Informatica PowerCenter
AWS/Azure

Good to Have:

IDMC

Tech Stack Table (fill in experience and a self-rating out of 10 for each skill):

Skills                  | Experience | Rating out of 10
Python                  |            |
SQL                     |            |
ETL                     |            |
Informatica PowerCenter |            |
AWS/Azure               |            |

Job Summary:

We are seeking a Senior PySpark Data Engineer with extensive experience in developing, optimizing, and maintaining data processing jobs using PySpark. The ideal candidate will possess a robust background in SQL and ETL processes, along with proficiency in cloud platforms such as AWS or Azure. This role requires excellent analytical skills and the ability to communicate effectively with both technical and non-technical stakeholders.

Key Responsibilities:

Design, develop, and optimize PySpark jobs for enhanced performance and scalability.
Collaborate with data architects and business analysts to understand data requirements and translate them into technical specifications.
Redesign and maintain complex SQL queries and stored procedures to support data extraction and transformation processes.
Utilize ETL tools, specifically Informatica PowerCenter, to build effective data pipelines.
Troubleshoot and resolve data quality issues and performance bottlenecks.
Mentor and provide technical guidance to a team of developers to enhance productivity and code quality.
Stay updated with new technologies and practices to continually improve data processing capabilities.

Qualifications:

Education: Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
Experience: 5-8 years of experience in data engineering, with a strong focus on PySpark and ETL processes.

Technical Skills:

Must-Have:

Extensive experience with PySpark, focusing on job optimization techniques.
Proficiency in SQL, with experience in SQL Server, MySQL, or other relational databases.
Strong knowledge of ETL concepts and tools, particularly Informatica PowerCenter and IDMC.
Excellent analytical and troubleshooting skills.
Strong communication skills for effective collaboration.

Good to Have:

Basic knowledge of Unix commands and shell scripting.
Experience in leading and mentoring development teams.
Familiarity with Azure/Fabric.

Kindly share profiles only in the tracker format below, with the tracker attached in the body of the email. Profiles without the tracker format and the Tech Stack Table will not be considered.

Tracker columns: S.No | Date | Position | Name of the Candidate | Mobile Number | Email ID | Total Experience | Relevant Experience | Current CTC | Expected CTC | Notice Period / On Paper | Current Organisation | Current Location | Address with Pin Code | Reason for Leaving | DOB | Offer in Hand | Vendor Name

Regards,
Damodar
+91-8976334593
info@d-techworks.com
D-TechWorks Pvt Ltd
USA | INDIA
www.d-techworks.com
Information Technology Services
Technology | Consulting | Development | Staff Augmentation

Posted 6 days ago

Apply

Exploring PySpark Jobs in India

PySpark, the Python API for the Apache Spark distributed data processing engine, is in high demand in India's job market. With the growing need for big data processing and analysis, companies are actively seeking professionals with PySpark skills to join their teams. If you are a job seeker looking to excel in big data and analytics, exploring PySpark jobs in India could be a great career move.
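For readers new to the framework, here is a minimal sketch of what a PySpark job looks like. It assumes a local installation of the pyspark package (pip install pyspark); the data and column names are invented purely for illustration:

```python
from pyspark.sql import SparkSession

# Every PySpark application starts from a SparkSession.
spark = SparkSession.builder.appName("IntroExample").getOrCreate()

# Build a small DataFrame in memory and run a simple aggregation.
df = spark.createDataFrame(
    [("Bangalore", 12), ("Pune", 9), ("Hyderabad", 15)],
    ["city", "openings"],
)
df.groupBy("city").sum("openings").show()

spark.stop()
```

The same code runs unchanged on a laptop or a multi-node cluster; Spark handles the distribution.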

Top Hiring Locations in India

Here are five major cities in India where companies are actively hiring for PySpark roles:

1. Bangalore
2. Pune
3. Hyderabad
4. Mumbai
5. Delhi

Average Salary Range

The estimated salary range for PySpark professionals in India varies based on experience levels. Entry-level positions can expect to earn around INR 6-8 lakhs per annum, while experienced professionals can earn upwards of INR 15 lakhs per annum.

Career Path

In the field of PySpark, a typical career progression may look like this:

1. Junior Developer
2. Data Engineer
3. Senior Developer
4. Tech Lead
5. Data Architect

Related Skills

In addition to PySpark, professionals in this field are often expected to have or develop skills in:

- Python programming
- Apache Spark
- Big data technologies (Hadoop, Hive, etc.)
- SQL
- Data visualization tools (Tableau, Power BI)
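SQL and Python skills in particular go hand in hand in PySpark work. The small sketch below shows how the framework bridges the two; the table and column names are made up for illustration:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("SqlInterop").getOrCreate()

jobs = spark.createDataFrame(
    [("Data Engineer", "Mumbai", 8.0),
     ("Tech Lead", "Delhi", 18.5),
     ("Data Architect", "Mumbai", 25.0)],
    ["title", "city", "ctc_lakhs"],
)

# Registering a temporary view lets plain SQL run against the DataFrame.
jobs.createOrReplaceTempView("jobs")
spark.sql(
    "SELECT city, ROUND(AVG(ctc_lakhs), 1) AS avg_ctc FROM jobs GROUP BY city"
).show()

spark.stop()
```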

Interview Questions

Here are 25 interview questions you may encounter when applying for PySpark roles; a short code sketch illustrating several of these concepts follows the list:

  • Explain what PySpark is and its main features (basic)
  • What are the advantages of using PySpark over other big data processing frameworks? (medium)
  • How do you handle missing or null values in PySpark? (medium)
  • What is RDD in PySpark? (basic)
  • What is a DataFrame in PySpark and how is it different from an RDD? (medium)
  • How can you optimize performance in PySpark jobs? (advanced)
  • Explain the difference between map and flatMap transformations in PySpark (basic)
  • What is the role of a SparkContext in PySpark? (basic)
  • How do you handle schema inference in PySpark? (medium)
  • What is a SparkSession in PySpark? (basic)
  • How do you join DataFrames in PySpark? (medium)
  • Explain the concept of partitioning in PySpark (medium)
  • What is a UDF in PySpark? (medium)
  • How do you cache DataFrames in PySpark for optimization? (medium)
  • Explain the concept of lazy evaluation in PySpark (medium)
  • How do you handle skewed data in PySpark? (advanced)
  • What is checkpointing in PySpark and how does it help in fault tolerance? (advanced)
  • How do you tune the performance of a PySpark application? (advanced)
  • Explain the use of Accumulators in PySpark (advanced)
  • How do you handle broadcast variables in PySpark? (advanced)
  • What are the different data sources supported by PySpark? (medium)
  • How can you run PySpark on a cluster? (medium)
  • What is the purpose of the PySpark MLlib library? (medium)
  • How do you handle serialization and deserialization in PySpark? (advanced)
  • What are the best practices for deploying PySpark applications in production? (advanced)
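The sketch below, referenced in the list's introduction, walks through a few of these topics: map vs flatMap, lazy evaluation, caching, UDFs, and broadcast joins. It assumes a local PySpark installation, and all data and names are illustrative:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast, udf
from pyspark.sql.types import IntegerType

spark = SparkSession.builder.appName("InterviewPrep").getOrCreate()
sc = spark.sparkContext

# map emits exactly one output per input; flatMap flattens the results.
rdd = sc.parallelize(["big data", "pyspark jobs"])
print(rdd.map(lambda s: s.split(" ")).collect())      # [['big', 'data'], ['pyspark', 'jobs']]
print(rdd.flatMap(lambda s: s.split(" ")).collect())  # ['big', 'data', 'pyspark', 'jobs']

people = spark.createDataFrame([("alice", 3), ("bob", 7)], ["name", "score"])

# Transformations are lazy: this line only builds a plan; nothing executes yet.
doubled = people.withColumn("score2", people["score"] * 2)

# cache() marks the result for reuse; the first action materializes it.
doubled.cache()
doubled.show()

# A UDF wraps plain Python so it can run on executors (slower than
# built-in functions, so prefer native column operations where possible).
@udf(returnType=IntegerType())
def name_length(name):
    return len(name)

doubled.withColumn("name_len", name_length("name")).show()

# Broadcasting a small lookup table avoids a shuffle during the join.
cities = spark.createDataFrame([("alice", "Pune"), ("bob", "Mumbai")], ["name", "city"])
doubled.join(broadcast(cities), on="name").show()

spark.stop()
```

Being able to explain not just what each call does but when it triggers execution (only the actions, collect and show, run the plan here) is a common differentiator in these interviews.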

Closing Remark

As you explore PySpark jobs in India, remember to prepare thoroughly for interviews and showcase your expertise confidently. With the right skills and knowledge, you can excel in this field and advance your career in the world of big data and analytics. Good luck!
