
3895 PySpark Jobs - Page 33

JobPe aggregates listings for easy access; applications are submitted directly on the original job portal.

8.0 - 12.0 years

30 - 35 Lacs

Hyderabad

Work from Office


Job Summary: We are seeking a highly skilled Data Engineer with expertise in leveraging Data Lake architecture and the Azure cloud platform to develop, deploy, and optimise data-driven solutions. You will play a pivotal role in transforming raw data into actionable insights, supporting strategic decision-making across the organisation. Responsibilities: Design and implement scalable data science solutions using Azure Data Lake, Azure Databricks, Azure Data Factory, and related Azure services. Develop, train, and deploy machine learning models to address business challenges. Collaborate with data engineering teams to optimise data pipelines and ensure seamless data integration within Azure cloud infrastructure. Conduct exploratory data analysis (EDA) to identify trends, patterns, and insights. Build predictive and prescriptive models to support decision-making processes. Expertise in developing the end-to-end machine learning lifecycle using CRISP-DM, covering data collection, cleansing, visualization, preprocessing, model development, model validation, and model retraining. Proficiency in building and implementing RAG systems that enhance the accuracy and relevance of model outputs by integrating retrieval mechanisms with generative models. Ensure data security, compliance, and governance within the Azure cloud ecosystem. Monitor and optimise model performance and scalability in production environments. Prepare clear and concise documentation for developed models and workflows. Skills Required: Good experience using PySpark, Python, MLOps (optional), MLflow (optional), Azure Data Lake Storage, and Unity Catalog. Worked with and utilized data from various RDBMS like MySQL, SQL Server, and Postgres, NoSQL databases like MongoDB, Cassandra, and Redis, and graph databases like Neo4j and Grakn. Proven experience as a Data Engineer with a strong focus on the Azure cloud platform and Data Lake architecture. Proficiency in Python and PySpark. Hands-on experience with Azure services such as Azure Data Lake, Azure Synapse Analytics, Azure Machine Learning, Azure Databricks, and Azure Functions. Strong knowledge of SQL and experience in querying large datasets from Data Lakes. Familiarity with data engineering tools and frameworks for data ingestion and transformation in Azure. Experience with version control systems (e.g., Git) and CI/CD pipelines for machine learning projects. Excellent problem-solving skills and the ability to work collaboratively in a team environment.
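
For orientation only, a minimal PySpark sketch of the kind of Azure Data Lake pipeline this posting describes; the storage account, container, and column names are hypothetical and not taken from the listing.

```python
# Minimal sketch of an Azure Data Lake curation step in PySpark.
# Storage account, container and column names are placeholders, not details from the posting.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("adls-curation").getOrCreate()

raw_path = "abfss://raw@examplestorage.dfs.core.windows.net/sales/"          # hypothetical path
curated_path = "abfss://curated@examplestorage.dfs.core.windows.net/sales_daily/"

raw = spark.read.option("header", True).csv(raw_path)

daily = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .groupBy("order_date", "region")
       .agg(F.sum("amount").alias("total_amount"),
            F.countDistinct("order_id").alias("orders"))
)

# On Databricks the curated layer would typically be written as a Delta table.
daily.write.format("delta").mode("overwrite").save(curated_path)
```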

Posted 6 days ago

Apply

8.0 - 15.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site


Position Summary... What you'll do... Role: Staff, Data Scientist Experience: 8-15 years Location: Chennai About the EBS team: Enterprise Business Services is invested in building a compact, robust organization that includes service operations and technology solutions for Finance, People, and Associate Digital Experience. Our team is responsible for the design and development of solutions that know our consumers' needs better than ever by predicting what they want based on unconstrained demand, and that efficiently unlock strategic growth, economic profit, and wallet share by orchestrating intelligent, connected planning and decisioning across all functions. We interact with multiple teams across the company to provide scalable, robust technical solutions. This role will play a crucial role in overseeing the planning, execution, and delivery of complex projects within the team. Walmart's Enterprise Business Services (EBS) is a powerhouse of several exceptional teams delivering world-class technology solutions and services, making a profound impact at every level of Walmart. As a key part of Walmart Global Tech, our teams set the bar for operational excellence and leverage emerging technology to support millions of customers, associates, and stakeholders worldwide. Each time an associate turns on their laptop, a customer makes a purchase, a new supplier is onboarded, the company closes the books, physical and legal risk is avoided, and when we pay our associates consistently and accurately, that is EBS. Joining EBS means embarking on a journey of limitless growth, relentless innovation, and the chance to set new industry standards that shape the future of Walmart. About the Team The data science team at the Enterprise Business Services pillar at Walmart Global Tech focuses on using the latest research in machine learning, statistics, and optimization to solve business problems. We mine data, distill insights, extract information, build analytical models, deploy machine learning algorithms, and use the latest algorithms and technology to empower business decision-making. In addition, we work with engineers to build reference architectures and machine learning pipelines in a big data ecosystem to productize our solutions. Advanced analytical algorithms driven by our team help Walmart optimize business operations and business practices and change the way our customers shop. The data science community at Walmart Global Tech is active in most Hack events, utilizing the petabytes of data at our disposal to build some of the coolest ideas. All the work we do at Walmart Labs will eventually benefit our operations & our associates, helping Customers Save Money to Live Better. What You Will Do As a Staff Data Scientist for Walmart Global Tech, you'll have the opportunity to: Drive data-derived insights across a wide range of retail & Finance divisions by developing advanced statistical models, machine learning algorithms, and computational algorithms based on business initiatives. Direct the gathering of data, assess data validity, and synthesize data into large analytics datasets to support project goals. Utilize big data analytics and advanced data science techniques to identify trends, patterns, and discrepancies in data.
Determine additional data needed to support insights. Build and train AI/ML models for replication in future projects. Deploy and maintain the data science solutions. Communicate recommendations to business partners and influence future plans based on insights. Consult with business stakeholders regarding algorithm-based recommendations and be a thought leader in developing these into business actions. Closely partner with the Senior Manager & Director of Data Science to drive data science adoption in the domain. Guide data scientists, senior data scientists, and staff data scientists across multiple sub-domains to ensure on-time delivery of ML products. Drive efficiency across the domain in terms of DS and ML best practices, MLOps practices, resource utilization, reusability, and multi-tenancy. Lead multiple complex ML products and guide senior tech leads in the domain in efficiently leading their products. Drive synergies across different products in terms of algorithmic innovation and sharing of best practices. Proactively identify complex business problems that can be solved using advanced ML, finding opportunities and gaps in the current business domain. Evaluate proposed business cases for projects and initiatives. What You Will Bring Master's with > 10 years OR Ph.D. with > 8 years of relevant experience. Educational qualification should be in Computer Science/Statistics/Mathematics or a related area. Minimum 6 years of experience as a data science technical lead. Ability to lead multiple data science projects end to end. Deep experience in building data science solutions in areas like fraud prevention, forecasting, shrink and waste reduction, inventory management, recommendation, assortment, and price optimization. Deep experience in simultaneously leading multiple data science initiatives end to end – from translating business needs to analytical asks, to leading the process of building solutions, to the eventual deployment and maintenance of them. Strong experience in machine learning: classification models, regression models, NLP, forecasting, unsupervised models, optimization, graph ML, causal inference, causal ML, statistical learning, experimentation, and Gen-AI. In Gen-AI, it is desirable to have experience in embedding generation from training materials, storage and retrieval from vector databases, set-up and provisioning of managed LLM gateways, development of retrieval-augmented generation (RAG) based LLM agents, model selection, iterative prompt engineering and finetuning based on accuracy and user feedback, and monitoring and governance. Ability to scale and deploy data science solutions. Strong experience with one or more of Python and R. Experience in GCP/Azure. Strong experience in Python, PySpark, Google Cloud Platform, Vertex AI, Kubeflow, and model deployment. Strong experience with big data platforms – Hadoop (Hive, MapReduce, HQL, Scala). Experience with GPU/CUDA for computational efficiency. About Walmart Global Tech Imagine working in an environment where one line of code can make life easier for hundreds of millions of people. That's what we do at Walmart Global Tech. We're a team of software engineers, data scientists, cybersecurity experts, and service professionals within the world's leading retailer who make an epic impact and are at the forefront of the next retail disruption. People are why we innovate, and people power our innovations. We are people-led and tech-empowered. We train our team in the skillsets of the future and bring in experts like you to help us grow.
We have roles for those chasing their first opportunity as well as those looking for the opportunity that will define their career. Here, you can kickstart a great career in tech, gain new skills and experience for virtually every industry, or leverage your expertise to innovate at scale, impact millions and reimagine the future of retail. Flexible, hybrid work We use a hybrid way of working, with primary in-office presence coupled with an optimal mix of virtual presence. We use our campuses to collaborate and be together in person, as business needs require and for development and networking opportunities. This approach helps us make quicker decisions, remove location barriers across our global team, and be more flexible in our personal lives. Benefits Beyond our great compensation package, you can receive incentive awards for your performance. Other great perks include a host of best-in-class benefits: maternity and parental leave, PTO, health benefits, and much more. Belonging We aim to create a culture where every associate feels valued for who they are, rooted in respect for the individual. Our goal is to foster a sense of belonging, to create opportunities for all our associates, customers and suppliers, and to be a Walmart for everyone. At Walmart, our vision is "everyone included." By fostering a workplace culture where everyone is—and feels—included, everyone wins. Our associates and customers reflect the makeup of all 19 countries where we operate. By making Walmart a welcoming place where all people feel like they belong, we’re able to engage associates, strengthen our business, improve our ability to serve customers, and support the communities where we operate. Equal Opportunity Employer Walmart, Inc., is an Equal Opportunities Employer – By Choice. We believe we are best equipped to help our associates, customers and the communities we serve live better when we really know them. That means understanding, respecting and valuing unique styles, experiences, identities, ideas and opinions – while being inclusive of all people. Minimum Qualifications... Outlined below are the required minimum qualifications for this position. If none are listed, there are no minimum qualifications. Minimum Qualifications: Option 1: Bachelor's degree in Statistics, Economics, Analytics, Mathematics, Computer Science, Information Technology or a related field and 4 years' experience in an analytics-related field. Option 2: Master's degree in Statistics, Economics, Analytics, Mathematics, Computer Science, Information Technology or a related field and 2 years' experience in an analytics-related field. Option 3: 6 years' experience in an analytics or related field. Preferred Qualifications... Outlined below are the optional preferred qualifications for this position. If none are listed, there are no preferred qualifications. Primary Location... RMZ Millenia Business Park, No 143, Campus 1B (1st-6th Floor), Dr. MGR Road (North Veeranam Salai), Perungudi, India R-2182242
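
As an aside, the Gen-AI qualifications above mention embedding generation and retrieval for RAG; the sketch below illustrates only the retrieval step, using an in-memory cosine-similarity search. The embedding model name and documents are placeholders, not part of the role.

```python
# Illustrative sketch of the retrieval step in a RAG workflow (not Walmart's actual stack).
# A managed vector database would normally replace this in-memory similarity search.
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Return policy: items can be returned within 90 days with a receipt.",
    "Associates are paid bi-weekly through the payroll system.",
    "Suppliers are onboarded after compliance and risk checks.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")            # assumed embedding model
doc_emb = model.encode(docs, normalize_embeddings=True)

query = "How often do associates get paid?"
q_emb = model.encode([query], normalize_embeddings=True)[0]

scores = doc_emb @ q_emb                                   # cosine similarity (vectors normalized)
best = docs[int(np.argmax(scores))]
print(best)                                                # context that would be passed to the LLM
```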

Posted 6 days ago

Apply

2.0 - 7.0 years

0 Lacs

Pune, Maharashtra, India

On-site


Position Summary AI & Data In this age of disruption, organizations need to navigate the future with confidence, embracing decision-making with clear, data-driven choices that deliver enterprise value in a dynamic business environment. The AI & Data team leverages the power of data, analytics, robotics, science, and cognitive technologies to uncover hidden relationships from vast troves of data, generate insights, and inform decision-making. Together with the AI & Engineering (AI&E) practice, our AI & Data offering helps clients transform their business by architecting organizational intelligence programs and differentiated strategies to win in their chosen markets. AI & Data will work with our clients to: Implement large-scale data ecosystems, including data management, governance, and the integration of structured and unstructured data, to generate insights leveraging cloud-based platforms. Leverage automation, cognitive, and science-based techniques to manage data, predict scenarios, and prescribe action. Job Title: Data Scientist/Machine Learning Engineer Job Summary: We are seeking a Data Scientist with experience in leveraging data, machine learning, statistics, and AI technologies to generate insights and inform decision-making. You will work on large-scale data ecosystems and collaborate with a team to implement data-driven solutions. Key Responsibilities: Deliver large-scale DS/ML end-to-end projects across multiple industries and domains. Liaise with on-site and client teams to understand various business problem statements, use cases, and project requirements. Work with a team of Data Engineers, ML/AI Engineers, DevOps, and other Data & AI professionals to deliver projects from inception to implementation. Utilize maths/stats, AI, and cognitive techniques to analyze and process data, predict scenarios, and prescribe actions. Drive a human-led culture of Inclusion & Diversity by caring deeply for all team members. Qualifications: 2-7 years of relevant hands-on experience in Data Science, Machine Learning, and Statistical Modeling. Bachelor’s or Master’s degree in a quantitative field. Must have strong hands-on experience with programming languages like Python, PySpark, and SQL, and frameworks such as NumPy, pandas, scikit-learn, etc. Expertise in classification, regression, time series, decision trees, optimization, etc. Hands-on knowledge of Docker containerization, Git, and Tableau or Power BI. Model deployment on cloud or on-prem will be an added advantage. Familiarity with Databricks, Snowflake, or hyperscalers (AWS/Azure/GCP/NVIDIA). Should follow research papers, and comprehend and innovate/present the best approaches/solutions related to DS/ML. AI/Cloud certification from a premier institute is preferred. Recruiting tips From developing a standout resume to putting your best foot forward in the interview, we want you to feel prepared and confident as you explore opportunities at Deloitte. Check out recruiting tips from Deloitte recruiters. Benefits At Deloitte, we know that great people make a great organization. We value our people and offer employees a broad range of benefits. Learn more about what working at Deloitte can mean for you. Our people and culture Our inclusive culture empowers our people to be who they are, contribute their unique perspectives, and make a difference individually and collectively. It enables us to leverage different ideas and perspectives, and bring more creativity and innovation to help solve our clients' most complex challenges.
This makes Deloitte one of the most rewarding places to work. Our purpose Deloitte’s purpose is to make an impact that matters for our people, clients, and communities. At Deloitte, purpose is synonymous with how we work every day. It defines who we are. Our purpose comes through in our work with clients that enables impact and value in their organizations, as well as through our own investments, commitments, and actions across areas that help drive positive outcomes for our communities. Professional development From entry-level employees to senior leaders, we believe there’s always room to learn. We offer opportunities to build new skills, take on leadership opportunities and connect and grow through mentorship. From on-the-job learning experiences to formal development programs, our professionals have a variety of opportunities to continue to grow throughout their career. Requisition code: 300100
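
For illustration, a toy scikit-learn pipeline covering the classification and evaluation skills listed in the qualifications; it uses a bundled demo dataset rather than client data.

```python
# Toy sketch of the classification / model-evaluation skills this role lists (scikit-learn).
# Uses a bundled demo dataset; real engagements would start from client data instead.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

print(classification_report(y_test, model.predict(X_test)))
```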

Posted 6 days ago

Apply

2.0 - 7.0 years

0 Lacs

Mumbai, Maharashtra, India

On-site


Position Summary AI & Data In this age of disruption, organizations need to navigate the future with confidence, embracing decision-making with clear, data-driven choices that deliver enterprise value in a dynamic business environment. The AI & Data team leverages the power of data, analytics, robotics, science, and cognitive technologies to uncover hidden relationships from vast troves of data, generate insights, and inform decision-making. Together with the AI & Engineering (AI&E) practice, our AI & Data offering helps clients transform their business by architecting organizational intelligence programs and differentiated strategies to win in their chosen markets. AI & Data will work with our clients to: Implement large-scale data ecosystems, including data management, governance, and the integration of structured and unstructured data, to generate insights leveraging cloud-based platforms. Leverage automation, cognitive, and science-based techniques to manage data, predict scenarios, and prescribe action. Job Title: Data Scientist/Machine Learning Engineer Job Summary: We are seeking a Data Scientist with experience in leveraging data, machine learning, statistics, and AI technologies to generate insights and inform decision-making. You will work on large-scale data ecosystems and collaborate with a team to implement data-driven solutions. Key Responsibilities: Deliver large-scale DS/ML end-to-end projects across multiple industries and domains. Liaise with on-site and client teams to understand various business problem statements, use cases, and project requirements. Work with a team of Data Engineers, ML/AI Engineers, DevOps, and other Data & AI professionals to deliver projects from inception to implementation. Utilize maths/stats, AI, and cognitive techniques to analyze and process data, predict scenarios, and prescribe actions. Drive a human-led culture of Inclusion & Diversity by caring deeply for all team members. Qualifications: 2-7 years of relevant hands-on experience in Data Science, Machine Learning, and Statistical Modeling. Bachelor’s or Master’s degree in a quantitative field. Must have strong hands-on experience with programming languages like Python, PySpark, and SQL, and frameworks such as NumPy, pandas, scikit-learn, etc. Expertise in classification, regression, time series, decision trees, optimization, etc. Hands-on knowledge of Docker containerization, Git, and Tableau or Power BI. Model deployment on cloud or on-prem will be an added advantage. Familiarity with Databricks, Snowflake, or hyperscalers (AWS/Azure/GCP/NVIDIA). Should follow research papers, and comprehend and innovate/present the best approaches/solutions related to DS/ML. AI/Cloud certification from a premier institute is preferred. Recruiting tips From developing a standout resume to putting your best foot forward in the interview, we want you to feel prepared and confident as you explore opportunities at Deloitte. Check out recruiting tips from Deloitte recruiters. Benefits At Deloitte, we know that great people make a great organization. We value our people and offer employees a broad range of benefits. Learn more about what working at Deloitte can mean for you. Our people and culture Our inclusive culture empowers our people to be who they are, contribute their unique perspectives, and make a difference individually and collectively. It enables us to leverage different ideas and perspectives, and bring more creativity and innovation to help solve our clients' most complex challenges.
This makes Deloitte one of the most rewarding places to work. Our purpose Deloitte’s purpose is to make an impact that matters for our people, clients, and communities. At Deloitte, purpose is synonymous with how we work every day. It defines who we are. Our purpose comes through in our work with clients that enables impact and value in their organizations, as well as through our own investments, commitments, and actions across areas that help drive positive outcomes for our communities. Professional development From entry-level employees to senior leaders, we believe there’s always room to learn. We offer opportunities to build new skills, take on leadership opportunities and connect and grow through mentorship. From on-the-job learning experiences to formal development programs, our professionals have a variety of opportunities to continue to grow throughout their career. Requisition code: 300100

Posted 6 days ago

Apply

2.0 - 7.0 years

40 - 45 Lacs

Chandigarh, Bengaluru

Work from Office


As the Data Engineer, you will play a pivotal role in shaping our data infrastructure and executing against our strategy. You will ideate alongside engineering, data, and our clients to deploy data products with an innovative and meaningful impact for clients. You will design, build, and maintain scalable data pipelines and workflows on AWS. Additionally, your expertise in AI and machine learning will enhance our ability to deliver smarter, more predictive solutions. Key Responsibilities Collaborate with other engineers and customers to brainstorm and develop impactful data products tailored to our clients. Leverage AI and machine learning techniques to integrate intelligent features into our offerings. Develop and optimize end-to-end data pipelines on AWS. Follow best practices in software architecture and development. Implement effective cost management and performance optimization strategies. Develop and maintain systems using Python, SQL, PySpark, and Django for front-end development. Work directly with clients and end-users and address their data needs. Utilize databases and tools including, but not limited to, Postgres, Redshift, Airflow, and MongoDB to support our data ecosystem. Leverage AI frameworks and libraries to integrate advanced analytics into our solutions. Qualifications Experience: Minimum of 3 years of experience in data engineering, software development, or related roles. Proven track record in designing and deploying AWS cloud infrastructure solutions. At least 2 years in data analysis and mining techniques to aid in descriptive and diagnostic insights. Extensive hands-on experience with Postgres, Redshift, Airflow, MongoDB, and real-time data workflows. Technical Skills: Expertise in Python, SQL, and PySpark. Strong background in software architecture and scalable development practices. Experience with Tableau, Metabase, or similar visualization tools. Working knowledge of AI frameworks and libraries is a plus. Leadership & Communication: Demonstrates ownership and accountability for delivery with a strong commitment to quality. Excellent communication skills with a history of effective client and end-user engagement. Startup & Fintech Mindset: Adaptability and agility to thrive in a fast-paced, early-stage startup environment. Passion for fintech innovation and a strong desire to make a meaningful impact on the future of finance.
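
A hypothetical Airflow 2.x sketch of the orchestration pattern this posting names (Airflow coordinating Postgres and Redshift); the DAG id, schedule, and task bodies are placeholders.

```python
# Hypothetical Airflow DAG illustrating the Postgres -> Redshift style of orchestration
# this posting mentions; table names and the load logic are placeholders.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_orders(**context):
    # e.g. read incremental rows from Postgres for the execution date
    print("extracting orders for", context["ds"])

def load_to_redshift(**context):
    # e.g. COPY staged files from S3 into a Redshift table
    print("loading staged orders into Redshift")

with DAG(
    dag_id="orders_daily_load",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_to_redshift", python_callable=load_to_redshift)
    extract >> load
```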

Posted 6 days ago

Apply

10.0 - 15.0 years

12 - 17 Lacs

Mumbai, Maharashtra

Work from Office


About the Role: Grade Level (for internal use): 11 The Team You will be an expert contributor and part of the Ratings organization's Data Services Product Engineering Team. This team, which has broad and expert knowledge of the Ratings organization's critical data domains, technology stacks, and architectural patterns, fosters knowledge sharing and collaboration that results in a unified strategy. All Data Services team members provide leadership, innovation, timely delivery, and the ability to articulate business value. Be a part of a unique opportunity to build and evolve S&P Ratings' next-gen analytics platform. Responsibilities: Architect, design, and implement innovative software solutions to enhance S&P Ratings' cloud-based analytics platform. Mentor a team of engineers (as required), fostering a culture of trust, continuous growth, and collaborative problem-solving. Collaborate with business partners to understand requirements, ensuring technical solutions align with business goals. Manage and improve existing software solutions, ensuring high performance and scalability. Participate actively in all Agile scrum ceremonies, contributing to the continuous improvement of team processes. Produce comprehensive technical design documents and conduct technical walkthroughs. Experience & Qualifications: Bachelor's degree in Computer Science, Information Systems, Engineering, or equivalent is required. Proficient with software development lifecycle (SDLC) methodologies like Agile and test-driven development. 10+ years of experience, with 4+ years designing/developing enterprise products, modern tech stacks, and data platforms. 4+ years of hands-on experience contributing to application architecture and designs, proven software/enterprise integration design patterns, and full-stack knowledge including modern distributed front-end and back-end technology stacks. 5+ years of full-stack development experience in modern web development technologies, Java/J2EE, UI frameworks like Angular and React, SQL, Oracle, and NoSQL databases like MongoDB. Experience designing transactional/data warehouse/data lake solutions and data integrations with the big data ecosystem leveraging AWS cloud technologies. Thorough understanding of distributed computing. Passionate, smart, and articulate developer. Quality-first mindset with a strong background and experience with developing products for a global audience at scale. Excellent analytical thinking, interpersonal, oral, and written communication skills with a strong ability to influence both IT and business partners. Superior knowledge of system architecture, object-oriented design, and design patterns. Good work ethic, self-starter, and results-oriented. Excellent communication skills are essential, with strong verbal and writing proficiencies. Experience with Delta Lake systems like Databricks using AWS cloud technologies and PySpark is a plus. Additional Preferred Qualifications: Experience working with AWS. Experience with the SAFe Agile framework. Bachelor's/PG degree in Computer Science, Information Systems, or equivalent. Hands-on experience contributing to application architecture and designs and proven software/enterprise integration design principles. Ability to prioritize and manage work to critical project timelines in a fast-paced environment. Excellent analytical and communication skills are essential, with strong verbal and writing proficiencies. Ability to train and mentor.
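
Since the posting lists Delta Lake with PySpark as a plus, here is a minimal, assumed sketch of writing and reading a Delta table; the S3 path and schema are illustrative, and a cluster with the delta-spark package configured is assumed.

```python
# Minimal Delta Lake sketch (assumes a Spark cluster with the delta-spark package configured).
# The S3 path and schema are illustrative, not from the posting.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("ratings-delta-demo").getOrCreate()

ratings = spark.createDataFrame(
    [("ACME Corp", "AA-", "2024-05-01"), ("Globex", "BBB+", "2024-05-02")],
    ["issuer", "rating", "as_of_date"],
)

path = "s3://example-bucket/analytics/ratings_delta"       # hypothetical location
ratings.write.format("delta").mode("append").save(path)

latest = (
    spark.read.format("delta").load(path)
         .groupBy("issuer")
         .agg(F.max("as_of_date").alias("latest_rating_date"))
)
latest.show()
```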

Posted 6 days ago

Apply

2.0 - 4.0 years

10 - 15 Lacs

Pune

Work from Office


Role & responsibilities Develop and Maintain Data Pipelines: Design, develop, and manage scalable ETL pipelines to process large datasets using PySpark, Databricks, and other big data technologies. Data Integration and Transformation: Work with various structured and unstructured data sources to build efficient data workflows and integrate them into a central data warehouse. Collaborate with Data Scientists & Analysts: Work closely with the data science and business intelligence teams to ensure the right data is available for advanced analytics, machine learning, and reporting. Optimize Performance: Optimize and tune data pipelines and ETL processes to improve data throughput and reduce latency, ensuring timely delivery of high-quality data. Automation and Monitoring: Implement automated workflows and monitoring tools to ensure data pipelines are running smoothly and issues are proactively addressed. Ensure Data Quality: Build and maintain validation mechanisms to ensure the accuracy and consistency of the data. Data Storage and Access: Work with data storage solutions (e.g., Azure, AWS, Google Cloud) to ensure effective data storage and fast access for downstream users. Documentation and Reporting: Maintain proper documentation for all data processes and architectures to facilitate easier understanding and onboarding of new team members. Skills and Qualifications: Experience: 5+ years of experience as a Data Engineer or in a similar role, with hands-on experience in designing, building, and maintaining ETL pipelines. Technologies: Proficient in PySpark for large-scale data processing. Strong programming experience in Python, particularly for data engineering tasks. Experience working with Databricks for big data processing and collaboration. Hands-on experience with data storage solutions (e.g., AWS S3, Azure Data Lake, or Google Cloud Storage). Solid understanding of ETL concepts, tools, and best practices. Familiarity with SQL for querying and manipulating data in relational databases. Experience working with data orchestration tools such as Apache Airflow or Luigi is a plus. Data Modeling & Warehousing: Experience with data warehousing concepts and technologies (e.g., Redshift, Snowflake, or BigQuery). Knowledge of data modeling, data transformations, and dimensional modeling. Soft Skills: Strong analytical and problem-solving skills. Excellent communication skills, capable of explaining complex data processes to non-technical stakeholders. Ability to work in a fast-paced, collaborative environment and manage multiple priorities. Preferred Qualifications: Bachelor's or master's degree in Computer Science, Engineering, or a related field. Certification or experience with cloud platforms like AWS, Azure, or Google Cloud. Experience with Apache Kafka or other stream-processing technologies.
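
To illustrate the "Ensure Data Quality" responsibility, a small PySpark validation sketch; the rules, paths, and column names are hypothetical, and a real pipeline would log or quarantine failures rather than simply raise.

```python
# Illustrative PySpark data-quality check of the kind the "Ensure Data Quality" duty describes.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
orders = spark.read.parquet("/mnt/raw/orders")             # assumed input location

checks = {
    "null_order_id": orders.filter(F.col("order_id").isNull()).count(),
    "negative_amount": orders.filter(F.col("amount") < 0).count(),
    "duplicate_order_id": orders.count() - orders.dropDuplicates(["order_id"]).count(),
}

failed = {name: n for name, n in checks.items() if n > 0}
if failed:
    raise ValueError(f"Data quality checks failed: {failed}")

orders.write.mode("overwrite").parquet("/mnt/validated/orders")
```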

Posted 6 days ago

Apply

3.0 - 15.0 years

0 Lacs

Bangalore Urban, Karnataka, India

On-site


Technology & Transformation: EAD: Azure Data Engineer - Consultant/Senior Consultant/Manager Your potential, unleashed. India’s impact on the global economy has increased at an exponential rate, and Deloitte presents an opportunity to unleash and realize your potential amongst cutting-edge leaders and organizations shaping the future of the region, and indeed, the world beyond. At Deloitte, bring your whole self to work, every day. Combine that with our drive to propel with purpose and you have the perfect playground to collaborate, innovate, grow, and make an impact that matters. The Team Deloitte’s Technology & Transformation practice can help you uncover and unlock the value buried deep inside vast amounts of data. Our global network provides strategic guidance and implementation services to help companies manage data from disparate sources and convert it into accurate, actionable information that can support fact-driven decision-making and generate an insight-driven advantage. Our practice addresses the continuum of opportunities in business intelligence & visualization, data management, performance management, and next-generation analytics and technologies, including big data, cloud, cognitive, and machine learning. Your work profile: As a Consultant/Senior Consultant/Manager in our T&T team, you’ll build and nurture positive working relationships with teams and clients with the intention to exceed client expectations: - Design, develop, and deploy solutions using different tools, design principles, and conventions. Configure robotics processes and objects using core workflow principles in an efficient way; ensure they are easily maintainable and easy to understand. Understand existing processes and facilitate change requirements as part of a structured change control process. Solve day-to-day issues arising while running robotics processes and provide timely resolutions. Maintain proper documentation for the solutions, test procedures, and scenarios during the UAT and Production phases. Coordinate with process owners and the business to understand the as-is process and design the automation process flow. Desired Qualifications 3-15 years of hands-on experience implementing Azure cloud data warehouses, Azure and NoSQL databases, and hybrid data scenarios. Experience developing Azure Data Factory (covering Azure Functions, Logic Apps, Triggers, IR), Databricks (PySpark, Scala), Stream Analytics, Event Hub, and HDInsight components. Experience working on data lake and DW solutions on Azure. Experience managing Azure DevOps pipelines (CI/CD). Experience managing source data access security, using Vault, configuring authentication and authorization, and enforcing data policies and standards. UG: B.Tech/B.E. in any specialization. Location and way of working: Base location: Pan India. This profile involves occasional travelling to client locations. Hybrid is our default way of working. Each domain has customized the hybrid approach to their unique needs. Your role as a Consultant/Senior Consultant/Manager: We expect our people to embrace and live our purpose by challenging themselves to identify issues that are most important for our clients, our people, and for society. In addition to living our purpose, Consultants/Senior Consultants/Managers across our organization must strive to be: Inspiring - Leading with integrity to build inclusion and motivation. Committed to creating purpose - Creating a sense of vision and purpose. Agile - Achieving high-quality results through collaboration and team unity.
Skilled at building diverse capability - Developing diverse capabilities for the future. Persuasive / Influencing - Persuading and influencing stakeholders. Collaborating - Partnering to build new solutions. Delivering value - Showing commercial acumen. Committed to expanding business - Leveraging new business opportunities. Analytical Acumen - Leveraging data to recommend impactful approaches and solutions through the power of analysis and visualization. Effective communication - Must be able to have well-structured and well-articulated conversations to achieve win-win possibilities. Engagement Management / Delivery Excellence - Effectively managing engagement(s) to ensure timely and proactive execution as well as course correction for the success of engagement(s). Managing change - Responding to a changing environment with resilience. Managing Quality & Risk - Delivering high-quality results and mitigating risks with utmost integrity and precision. Strategic Thinking & Problem Solving - Applying a strategic mindset to solve business issues and complex problems. Tech Savvy - Leveraging ethical technology practices to deliver high impact for clients and for Deloitte. Empathetic leadership and inclusivity - Creating a safe and thriving environment where everyone is valued for who they are, using empathy to understand others and adapting our behaviours and attitudes to become more inclusive. How you’ll grow Connect for impact Our exceptional team of professionals across the globe are solving some of the world’s most complex business problems, as well as directly supporting our communities, the planet, and each other. Know more in our Global Impact Report and our India Impact Report. Empower to lead You can be a leader irrespective of your career level. Our colleagues are characterised by their ability to inspire, support, and provide opportunities for people to deliver their best and grow both as professionals and human beings. Know more about Deloitte and our One Young World partnership. Inclusion for all At Deloitte, people are valued and respected for who they are and are trusted to add value to their clients, teams and communities in a way that reflects their own unique capabilities. Know more about everyday steps that you can take to be more inclusive. At Deloitte, we believe in the unique skills, attitude and potential each and every one of us brings to the table to make an impact that matters. Drive your career At Deloitte, you are encouraged to take ownership of your career. We recognise there is no one-size-fits-all career path, and global, cross-business mobility and up/re-skilling are all within the range of possibilities to shape a unique and fulfilling career. Know more about Life at Deloitte. Everyone’s welcome… entrust your happiness to us Our workspaces and initiatives are geared towards your 360-degree happiness. This includes specific needs you may have in terms of accessibility, flexibility, safety and security, and caregiving. Here’s a glimpse of things that are in store for you. Interview tips We want job seekers exploring opportunities at Deloitte to feel prepared, confident and comfortable. To help you with your interview, we suggest that you do your research, know some background about the organisation and the business area you’re applying to. Check out recruiting tips from Deloitte professionals.

Posted 6 days ago

Apply

7.0 - 10.0 years

8 - 14 Lacs

Hyderabad

Hybrid


Responsibilities of the Candidate: - Be responsible for the design and development of big data solutions. Partner with domain experts, product managers, analysts, and data scientists to develop Big Data pipelines in Hadoop - Be responsible for moving all legacy workloads to a cloud platform - Work with data scientists to build client pipelines using heterogeneous sources and provide engineering services for PySpark data science applications - Ensure automation through CI/CD across platforms both in the cloud and on-premises - Define needs around maintainability, testability, performance, security, quality, and usability for the data platform - Drive implementation, consistent patterns, reusable components, and coding standards for data engineering processes - Convert SAS-based pipelines into languages like PySpark and Scala to execute on Hadoop and non-Hadoop ecosystems - Tune big data applications on Hadoop and non-Hadoop platforms for optimal performance - Apply an in-depth understanding of how data analytics collectively integrate within the sub-function, as well as coordinate and contribute to the objectives of the entire function - Produce a detailed analysis of issues where the best course of action is not evident from the information available, but actions must be recommended/taken - Assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients, and assets, by driving compliance with applicable laws, rules, and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct, and business practices, and escalating, managing, and reporting control issues with transparency Requirements: - 6+ years of total IT experience - 3+ years of experience with Hadoop (Cloudera)/big data technologies - Knowledge of the Hadoop ecosystem and Big Data technologies - Hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, Solr) - Experience in designing and developing data pipelines for data ingestion or transformation using Java, Scala, or Python - Experience with Spark programming (PySpark, Scala, or Java) - Hands-on experience with Python/PySpark/Scala and basic libraries for machine learning is required - Proficiency in programming in Java or Python, with prior Apache Beam/Spark experience a plus - Hands-on experience in CI/CD, scheduling, and scripting - Ensure automation through CI/CD across platforms both in the cloud and on-premises - System-level understanding - data structures, algorithms, distributed storage & compute - Can-do attitude on solving complex business problems, good interpersonal and teamwork skills
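
A brief, hypothetical sketch of the "convert SAS-based pipelines to PySpark" responsibility: a SAS PROC MEANS-style grouped summary expressed as a PySpark aggregation. The table and column names are invented.

```python
# Illustrative port of a SAS-style summary (PROC MEANS by group) to PySpark.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("sas-to-pyspark").enableHiveSupport().getOrCreate()

txns = spark.table("legacy_db.card_transactions")          # assumed Hive table

# Equivalent of: PROC MEANS DATA=txns SUM MEAN; CLASS region; VAR amount;
summary = (
    txns.groupBy("region")
        .agg(F.sum("amount").alias("total_amount"),
             F.avg("amount").alias("avg_amount"),
             F.count("*").alias("n_txns"))
)
summary.write.mode("overwrite").saveAsTable("analytics_db.txn_summary_by_region")
```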

Posted 6 days ago

Apply

4.0 years

0 Lacs

India

Remote


GCP Data Engineer (Remote). Type: Full-time. Rate: Market. Client: Telus. Required Skills: ● 4+ years of industry experience in software development, data engineering, business intelligence, or a related field, with experience in manipulating, processing, and extracting value from datasets. ● Design, build, and deploy internal applications to support our technology life cycle, collaboration and spaces, service delivery management, and data and business intelligence, among others. ● Build modular code for reusable pipelines or any kind of complex ingestion framework that eases the job of loading data into the data lake or data warehouse from multiple sources. ● Work closely with analysts and business process owners to translate business requirements into technical solutions. ● Coding experience in scripting and languages (Python, SQL, PySpark). ● Expertise in Google Cloud Platform (GCP) technologies in the data warehousing space (BigQuery, Cloud Composer, Airflow, Cloud SQL, PostgreSQL, Oracle, GCP Workflows, Dataflow, Cloud Scheduler, Secret Manager, Batch, Cloud Logging, Cloud SDK, Google Cloud Storage, IAM, Vertex AI). ● Maintain the highest levels of development practices, including: technical design, solution development, systems configuration, test documentation/execution, issue identification and resolution, and writing clean, modular, and self-sustaining code with repeatable quality and predictability. ● Understanding of CI/CD processes using Pulumi, GitHub, Cloud Build, Cloud SDK, and Docker.
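
An illustrative ingestion step using the official google-cloud-bigquery client, matching the BigQuery-centred stack above; the project, bucket, and table names are placeholders.

```python
# Hypothetical sketch of a small batch load into BigQuery with the google-cloud-bigquery client.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")            # assumed GCP project

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
    write_disposition="WRITE_APPEND",
)

load_job = client.load_table_from_uri(
    "gs://example-bucket/raw/events_*.csv",                    # hypothetical GCS path
    "example-project.analytics.events",                        # hypothetical target table
    job_config=job_config,
)
load_job.result()                                              # wait for the load to finish

rows = client.query(
    "SELECT event_type, COUNT(*) AS n FROM `example-project.analytics.events` "
    "GROUP BY event_type ORDER BY n DESC"
).result()
for row in rows:
    print(row.event_type, row.n)
```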

Posted 6 days ago

Apply

8.0 - 12.0 years

13 - 20 Lacs

Chennai

Work from Office


Mandatory skills required: Strong Python coding and development. Good-to-have skills: Cloud, SQL, and data analysis. Job Description: We are seeking a highly skilled and experienced Python Lead to join our team. The ideal candidate will have strong expertise in Python coding and development, along with good-to-have skills in cloud technologies, SQL, and data analysis. Key Responsibilities: - Lead the development of high-quality, scalable, and robust Python applications. - Collaborate with cross-functional teams to define, design, and ship new features. - Ensure the performance, quality, and responsiveness of applications. - Develop RESTful applications using frameworks like Flask, Django, or FastAPI. - Utilize Databricks, PySpark SQL, and strong data analysis skills to drive data solutions. - Implement and manage modern data solutions using Azure Data Factory, Data Lake, and Databricks. Mandatory Skills: - Proven experience with cloud platforms (e.g., AWS) - Strong proficiency in Python, PySpark, and R, and familiarity with additional programming languages such as C++, Rust, or Java. - Expertise in designing ETL architectures for batch and streaming processes, database technologies (OLTP/OLAP), and SQL. - Experience with Apache Spark and multi-cloud platforms (AWS, GCP, Azure). - Knowledge of data governance and GxP data contexts; familiarity with the Pharma value chain is a plus. Good to Have Skills: - Experience with modern data solutions via Azure. - Knowledge of principles summarized in the Microsoft Cloud Adoption Framework. - Additional expertise in SQL and data analysis. Educational Qualifications: Bachelor's/Master's degree or equivalent with a focus on software engineering. If you are a passionate Python developer with a knack for cloud technologies and data analysis, we would love to hear from you. Join us in driving innovation and building cutting-edge solutions!
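
Since the role builds RESTful services with Flask, Django, or FastAPI, a minimal FastAPI sketch is shown below; the route and payload model are hypothetical, not part of the actual role.

```python
# Minimal FastAPI sketch of the kind of RESTful service the responsibilities mention.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="analytics-api")

class ScoreRequest(BaseModel):
    customer_id: str
    monthly_spend: float

@app.post("/score")
def score(req: ScoreRequest) -> dict:
    # Placeholder rule standing in for a real model or Databricks-backed lookup.
    risk = "high" if req.monthly_spend > 10_000 else "low"
    return {"customer_id": req.customer_id, "risk": risk}

# Run locally with: uvicorn main:app --reload
```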

Posted 6 days ago

Apply

7.0 - 10.0 years

17 - 27 Lacs

Gurugram

Hybrid


Primary Responsibilities: Design and develop applications and services running on Azure, with a strong emphasis on Azure Databricks, ensuring optimal performance, scalability, and security. Build and maintain data pipelines using Azure Databricks and other Azure data integration tools. Write, read, and debug Spark, Scala, and Python code to process and analyze large datasets. Write extensive queries in SQL and Snowflake. Implement security and access control measures and regularly audit the Azure platform and infrastructure to ensure compliance. Create, understand, and validate designs and estimated effort for a given module/task, and be able to justify them. Possess solid troubleshooting skills and perform troubleshooting of issues in different technologies and environments. Implement and adhere to best engineering practices like design, unit testing, functional testing automation, continuous integration, and delivery. Maintain code quality by writing clean, maintainable, and testable code. Monitor performance and optimize resources to ensure cost-effectiveness and high availability. Define and document best practices and strategies regarding application deployment and infrastructure maintenance. Provide technical support and consultation for infrastructure questions. Help develop, manage, and monitor continuous integration and delivery systems. Take accountability and ownership of features and teamwork. Comply with the terms and conditions of the employment contract, company policies and procedures, and any directives. Required Qualifications: B.Tech/MCA (minimum 16 years of formal education). Overall 7+ years of experience. Minimum of 3 years of experience in Azure (ADF), Databricks, and DevOps. 5 years of experience in writing advanced-level SQL. 2-3 years of experience in writing, reading, and debugging Spark, Scala, and Python code. 3 or more years of experience in architecting, designing, developing, and implementing cloud solutions on Azure. Proficiency in programming languages and scripting tools. Understanding of cloud data storage and database technologies such as SQL and NoSQL. Proven ability to collaborate with multidisciplinary teams of business analysts, developers, data scientists, and subject-matter experts. Familiarity with DevOps practices and tools, such as continuous integration and continuous deployment (CI/CD) and Terraform. Proven proactive approach to spotting problems, areas for improvement, and performance bottlenecks. Proven excellent communication, writing, and presentation skills. Experience in interacting with international customers to gather requirements and convert them into solutions using relevant skills. Preferred Qualifications: Knowledge of AI/ML or LLMs (GenAI). Knowledge of the US Healthcare domain and experience with healthcare data. Experience and skills with Snowflake.

Posted 6 days ago

Apply

5.0 - 10.0 years

32 Lacs

Bengaluru

Work from Office


Responsibilities: Ability to design and build a Python-based code generation framework and runtime engine that reads a Business Rules repository. Requirements: Minimum 5 years of experience in the build and deployment of big data applications using Spark SQL and Spark Streaming in Python; Expertise in graph algorithms and advanced recursion techniques; Minimum 5 years of extensive experience in the design, build, and deployment of Python-based applications; Minimum 3 years of experience in the following: Hive, YARN, Kafka, HBase, MongoDB; Hands-on experience in generating/parsing XML and JSON documents and REST API requests/responses; Bachelor's degree in a quantitative field (such as Engineering, Computer Science, Statistics, Econometrics) and a minimum of 5 years of experience; Expertise in handling complex large-scale Big Data environments (preferably 20 TB+); Hands-on experience writing complex SQL queries, and exporting and importing large amounts of data using utilities.
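
One way to read the code-generation responsibility: rules stored as data are rendered into Spark SQL at runtime. The sketch below is a hypothetical illustration of that idea, with invented rule and table names.

```python
# Illustrative sketch of generating Spark SQL from a business-rules repository:
# each rule is data, and the framework renders it into SQL at runtime.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rule-engine-demo").getOrCreate()

rules = [
    {"name": "high_value_orders", "table": "orders", "filter": "amount > 1000"},
    {"name": "inactive_customers", "table": "customers", "filter": "last_order_date < '2023-01-01'"},
]

def render_sql(rule: dict) -> str:
    """Turn a declarative rule into a Spark SQL statement."""
    return f"SELECT * FROM {rule['table']} WHERE {rule['filter']}"

for rule in rules:
    sql = render_sql(rule)
    result = spark.sql(sql)                          # assumes the referenced tables exist
    result.createOrReplaceTempView(rule["name"])     # downstream steps can reference the view
    print(rule["name"], "->", sql)
```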

Posted 6 days ago

Apply

4.0 years

0 Lacs

Ahmedabad, Gujarat, India

On-site


This role is for one of Weekday's clients. Min Experience: 4 years. Location: Ahmedabad. Job Type: Full-time. We are seeking a highly skilled Senior Database Administrator with 5-8 years of experience in data engineering and database management. The ideal candidate will have a strong foundation in data architecture, modeling, and pipeline orchestration. Hands-on experience with modern database technologies and exposure to generative AI tools in production environments will be a significant advantage. This role involves leading efforts to streamline data workflows, improve automation, and deliver high-impact insights across the organization. Requirements Key Responsibilities: Design, develop, and manage scalable and efficient data pipelines (ETL/ELT) across multiple database systems. Architect and maintain high-availability, secure, and scalable data storage solutions. Utilize generative AI tools to automate data workflows and enhance system capabilities. Collaborate with engineering, analytics, and data science teams to fulfill data requirements and optimize data delivery. Implement and monitor data quality standards, governance practices, and compliance protocols. Document data architectures, systems, and processes for transparency and maintainability. Apply data modeling best practices to support optimal storage and querying performance. Continuously research and integrate emerging technologies to advance the data infrastructure. Qualifications: Bachelor's or Master's degree in Computer Science, Information Technology, or a related field. 5-8 years of experience in database administration and data engineering for large-scale systems. Proven experience in designing and managing relational and non-relational databases. Mandatory Skills: SQL - Proficient in advanced queries, performance tuning, and database management. NoSQL - Experience with at least one NoSQL database such as MongoDB, Cassandra, or Cosmos DB. Hands-on experience with at least one of the following cloud data warehouses: Snowflake, Redshift, BigQuery, or Microsoft Fabric. Cloud expertise - Strong experience with Azure and its data services. Working knowledge of Python for scripting and data processing (e.g., pandas, PySpark). Experience with ETL tools such as Apache Airflow, Microsoft Fabric, Informatica, or Talend. Familiarity with generative AI tools and their integration into data pipelines. Preferred Skills & Competencies: Deep understanding of database performance, tuning, backup, recovery, and security. Strong knowledge of data governance, data quality management, and metadata handling. Experience with Git or other version control systems. Familiarity with AI/ML-driven data solutions is a plus. Excellent problem-solving skills and the ability to resolve complex database issues. Strong communication skills to collaborate with cross-functional teams and stakeholders. Demonstrated ability to manage projects and mentor junior team members. Passion for staying updated with the latest trends and best practices in database and data engineering technologies.

Posted 6 days ago

Apply

6.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site


About the job A little about us... LTIMindtree is a global technology consulting and digital solutions company that enables enterprises across industries to reimagine business models, accelerate innovation, and maximize growth by harnessing digital technologies. As a digital transformation partner to more than 750 clients, LTIMindtree brings extensive domain and technology expertise to help drive superior competitive differentiation, customer experiences, and business outcomes in a converging world. Powered by nearly 90,000 talented and entrepreneurial professionals across 35 countries, LTIMindtree — a Larsen & Toubro Group company — combines the industry-acclaimed strengths of erstwhile Larsen and Toubro Infotech and Mindtree in solving the most complex business challenges and delivering transformation at scale. For more info, please visit www.ltimindtree.com Job Details: We are holding a weekend hiring drive for a Data Scientist role at our Bangalore office. Date - 14th June. Experience - 4 to 12 years. Location - LTIMindtree Office, Bangalore Whitefield. Notice Period - Immediate to 60 days only. Mandatory Skills - Gen-AI, Data Science, Python, RAG, and Cloud (AWS/Azure). Secondary (any) - Machine Learning, Deep Learning, ChatGPT, LangChain, prompt engineering, vector stores, RAG, Llama, computer vision, OCR, transformers, regression, forecasting, classification, hyperparameter tuning, MLOps, inference, model training, model deployment. Generic JD: More than 6 years of experience in the Data Engineering, Data Science, and AI/ML domain. Excellent understanding of machine learning techniques and algorithms, such as GPTs, CNN, RNN, k-NN, Naive Bayes, SVM, Decision Forests, etc. Experience using business intelligence tools (e.g., Tableau, Power BI) and data frameworks (e.g., Hadoop). Experience with cloud-native skills. Knowledge of SQL and Python; familiarity with Scala, Java, or C++ is an asset. Analytical mind, business acumen, and strong math skills (e.g., statistics, algebra). Experience with common data science toolkits, such as TensorFlow, Keras, PyTorch, pandas, Microsoft CNTK, NumPy, etc. Deep expertise in at least one of these is highly desirable. Experience with NLP, NLG, and Large Language Models like BERT, LLaMA, LaMDA, GPT, BLOOM, PaLM, DALL-E, etc. Great communication and presentation skills. Should have experience working in a fast-paced team culture. Experience with AI/ML and Big Data technologies like AWS SageMaker, Azure Cognitive Services, Google Colab, Jupyter Notebook, Hadoop, PySpark, Hive, AWS EMR, etc. Experience with NoSQL databases, such as MongoDB, Cassandra, HBase, and vector databases. Good understanding of applied statistics skills, such as distributions, statistical testing, regression, etc. Should be a data-oriented person with an analytical mind and business acumen.
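
An illustrative Hugging Face transformers snippet touching the NLP/LLM skills in the JD; the summarization model id is an assumed public checkpoint, not one named in the posting.

```python
# Illustrative snippet of the NLP / transformer skills in the JD, using the transformers pipeline API.
# Model choice and example text are placeholders.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")            # downloads a default small model
print(classifier("The delivery was late and the package arrived damaged."))
# e.g. [{'label': 'NEGATIVE', 'score': ...}]

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")  # assumed model id
print(summarizer(
    "LTIMindtree is a global technology consulting and digital solutions company that enables "
    "enterprises to reimagine business models and accelerate innovation.",
    max_length=25, min_length=5, do_sample=False,
))
```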

Posted 6 days ago

Apply

0 years

0 Lacs

India

On-site


Company Description ThreatXIntel is a startup cybersecurity company dedicated to providing customized, affordable solutions to protect businesses and organizations from cyber threats. Our services include cloud security, web and mobile security testing, cloud security assessment, and DevSecOps. We take a proactive approach to security, continuously monitoring and testing our clients' digital environments to identify vulnerabilities before they can be exploited. Role Description We are looking for a freelance Data Engineer with strong experience in PySpark and AWS data services, particularly S3 and Redshift. The ideal candidate will also have some familiarity with integrating or handling data from Salesforce. This role focuses on building scalable data pipelines, transforming large datasets, and enabling efficient data analytics and reporting. Key Responsibilities: Develop and optimize ETL/ELT data pipelines using PySpark for large-scale data processing. Manage data ingestion, storage, and transformation across AWS S3 and Redshift. Design data flows and schemas to support reporting, analytics, and business intelligence needs. Perform incremental loads, partitioning, and performance tuning in distributed environments. Extract and integrate relevant datasets from Salesforce for downstream processing. Ensure data quality, consistency, and availability for analytics teams. Collaborate with data analysts, platform engineers, and business stakeholders. Required Skills: Strong hands-on experience with PySpark for large-scale distributed data processing. Proven track record working with AWS S3 (data lake) and Amazon Redshift (data warehouse). Ability to write complex SQL queries for transformation and reporting. Basic understanding of or experience integrating data from Salesforce (APIs or exports). Experience with performance optimization, partitioning strategies, and efficient schema design. Knowledge of version control and collaborative development tools (e.g., Git). Nice to Have: Experience with AWS Glue or Lambda for orchestration. Familiarity with Salesforce objects, SOQL, or ETL tools like Talend, Informatica, or Airflow. Understanding of data governance and security best practices in cloud environments.
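
A hypothetical PySpark sketch of the S3 incremental-load pattern this role describes; bucket names, columns, and the load-date value are placeholders.

```python
# Hypothetical sketch of the S3 -> Redshift-staging pattern described above: read raw files
# from S3 with PySpark, apply an incremental filter, and write partitioned output back to S3
# for a later Redshift COPY.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("s3-incremental-load").getOrCreate()

raw = spark.read.json("s3a://example-raw-bucket/salesforce/opportunities/")

# Incremental slice for a single load date (the value would come from the orchestrator).
load_date = "2024-06-01"
incremental = raw.filter(F.col("last_modified_date") >= load_date)

(incremental
    .withColumn("load_date", F.lit(load_date))
    .write.mode("append")
    .partitionBy("load_date")
    .parquet("s3a://example-curated-bucket/opportunities/"))
# A Redshift COPY (or a Spectrum external table) would then pick up the new partition.
```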

Posted 6 days ago

Apply

3.0 years

0 Lacs

Pune, Maharashtra, India

On-site


Job Summary: We are seeking a highly skilled and innovative Data Scientist to join our team and drive data-centric initiatives by leveraging AI/ML models, Big Data technologies, and cloud platforms like AWS. The ideal candidate will be proficient in Python, experienced in designing end-to-end machine learning pipelines, and comfortable working with large-scale data systems. Key Responsibilities: Design, develop, and deploy machine learning models and AI-based solutions for business problems. Build robust ETL pipelines to process structured and unstructured data using tools like PySpark, Airflow, or Glue. Work with AWS cloud services (e.g., S3, Lambda, SageMaker, Redshift, EMR) to build scalable data science solutions. Perform exploratory data analysis (EDA) and statistical modeling to uncover actionable insights. Collaborate with data engineers, product managers, and stakeholders to identify use cases and deliver impactful data-driven solutions. Optimize model performance and ensure model explainability, fairness, and reproducibility. Maintain and improve existing data science solutions through MLOps practices (e.g., model monitoring, retraining, CI/CD for ML). Required Skills and Qualifications: Bachelor’s or Master’s degree in Computer Science, Statistics, Data Science, or a related field. 3+ years of experience in data science or machine learning roles. Strong programming skills in Python and experience with libraries like pandas, NumPy, scikit-learn, TensorFlow, or PyTorch.

Posted 6 days ago

Apply

6.0 - 10.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Linkedin logo

Every day, your work will make an impact that matters, while you thrive in a dynamic culture of inclusion, collaboration, and high performance. As the undisputed leader in professional services, Deloitte is where you will find unrivaled opportunities to succeed and realize your full potential.

The Team

We are seeking highly skilled Databricks Data Engineers to join our data modernization team. You will play a pivotal role in designing, developing, and maintaining robust data solutions on the Databricks platform. Your experience in data engineering, along with a deep understanding of Databricks, will be instrumental in building solutions to drive data-driven decision-making across a variety of customers.

Location: Bangalore/Mumbai/Pune/Delhi/Hyderabad/Coimbatore/Kolkata/Chennai

Work you’ll do

Responsibilities
- Design, develop, and optimize data workflows and notebooks using Databricks to ingest, transform, and load data from various sources into the data lake.
- Build and maintain scalable and efficient data processing workflows using Spark (PySpark or Spark SQL) by following coding standards and best practices.
- Collaborate with technical and business stakeholders to understand data requirements and translate them into technical solutions.
- Develop data models and schemas to support reporting and analytics needs.
- Ensure data quality, integrity, and security by implementing appropriate checks and controls.
- Monitor and optimize data processing performance, identifying and resolving bottlenecks.
- Stay up to date with the latest advancements in data engineering and Databricks technologies.

Qualifications
- Bachelor’s or master’s degree in any field
- 6-10 years of experience in designing, implementing, and maintaining data solutions on Databricks
- Experience with at least one of the popular cloud platforms – Azure, AWS or GCP
- Experience with ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes
- Knowledge of data warehousing and data modelling concepts
- Experience with Python or SQL
- Experience with Delta Lake
- Understanding of DevOps principles and practices
- Excellent problem-solving and troubleshooting skills
- Strong communication and teamwork skills

How you will grow

At Deloitte, our professional development plan focuses on helping people at every level of their career to identify and use their strengths to do their best work every day. From entry-level employees to senior leaders, we believe there is always room to learn. We offer opportunities to help build excellent skills in addition to hands-on experience in the global, fast-changing business world. From on-the-job learning experiences to formal development programs at Deloitte University, our professionals have a variety of opportunities to continue to grow throughout their career. Explore Deloitte University, The Leadership Centre.

Benefits

At Deloitte, we know that great people make a great organization. We value our people and offer employees a broad range of benefits. Learn more about what working at Deloitte can mean for you.

Our purpose

Deloitte is led by a purpose: to make an impact that matters. Every day, Deloitte people are making a real impact in the places they live and work. We pride ourselves on doing not only what is good for clients, but also what is good for our people and the communities in which we live and work—always striving to be an organization that is held up as a role model of quality, integrity, and positive change. Learn more about Deloitte's impact on the world.

Recruiter tips

We want job seekers exploring opportunities at Deloitte to feel prepared and confident. To help you with your interview, we suggest that you do your research: know some background about the organization and the business area you’re applying to. Check out recruiting tips from Deloitte professionals.

Posted 6 days ago

Apply

7.0 - 12.0 years

25 - 30 Lacs

Hyderabad, Bengaluru

Hybrid

Naukri logo

Cloud Data Engineer

The Cloud Data Engineer will be responsible for developing the data lake platform and all applications on Azure cloud. Proficiency in data engineering, data modeling, SQL, and Python programming is essential. The Data Engineer will provide design and development solutions for applications in the cloud.

Essential Job Functions:
- Understand requirements and collaborate with the team to design and deliver projects.
- Design and implement data lakehouse projects within Azure.
- Develop the application lifecycle utilizing Microsoft Azure technologies.
- Participate in design, planning, and necessary documentation.
- Engage in Agile ceremonies including daily standups, scrum, retrospectives, demos, and code reviews.
- Apply hands-on experience with Python/SQL development and Azure data pipelines.
- Collaborate with the team to develop and deliver cross-functional products.

Key Skills:
a. Data Engineering and SQL
b. Python
c. PySpark
d. Azure Data Lake and ADF
e. Databricks
f. CI/CD
g. Strong communication

Other Responsibilities:
- Document and maintain project artifacts.
- Maintain comprehensive knowledge of industry standards, methodologies, processes, and best practices.
- Complete training as required for Privacy, Code of Conduct, etc.
- Promptly report any known or suspected loss, theft, or unauthorized disclosure or use of PI to the General Counsel/Chief Compliance Officer or Chief Information Officer.
- Adhere to the company's compliance program.
- Safeguard the company's intellectual property, information, and assets.
- Other duties as assigned.

Minimum Qualifications and Job Requirements:
- Bachelor's degree in Computer Science.
- 7 years of hands-on experience in designing and developing distributed data pipelines.
- 5 years of hands-on experience in Azure data service technologies.
- 5 years of hands-on experience in Python, SQL, object-oriented programming, ETL, and unit testing.
- Experience with data integration with APIs, web services, and queues.
- Experience with Azure DevOps and CI/CD, as well as agile tools and processes including JIRA and Confluence.

Required: Azure Data Engineer Associate and Databricks data engineering certifications.

Posted 6 days ago

Apply

0 years

0 Lacs

India

On-site

Linkedin logo

Job Title: Applied Data Scientist at Poiro, based out of Bengaluru (on-site role)

Company Details

The best way to predict the future is to invent it. And the best way to invent the future is to get the best minds to work on an idea whose time has come. Poiro (poiro.ai) builds AI systems and agents to supercharge marketing workflows and bring brands closer to consumers. Poiro’s AI systems can be trained on both structured and unstructured marketing data for a brand - from social media content and e-commerce marketplace data to 1st-party customer data - to build a comprehensive knowledge representation of a brand and its category. On top of that, Poiro’s AI agents seamlessly perform data analytics and data science workflows to generate actionable insights and guide marketing execution.

Leading brands are using Poiro to:
- Identify content whitespaces, through analysis of social content across their category, and generate highly engaging organic and ad content
- Get tailored creator recommendations that maximize ROI for a particular product and content brief
- Comprehensively audit a creator's on- and off-platform behavior to safeguard themselves from commercial and reputation risks
- And many more use cases

Poiro is a subsidiary of Evam Labs (www.evamlabs.ai) - a Singapore-headquartered holding company with offices in Bangalore and San Jose, building the next generation of high-impact AI-powered enterprise solutions. From Asia, for the World. Evam Labs was founded by ex-founders, academics and investors with over a decade of experience in building data and AI products and scaling companies from 0 to IPO. The founders are IIT/IIM/CMU alumni and have cumulatively raised over $500M, invested in 30+ startups and hold 20+ patents. The rest of the Evam team comprises alumni of IITs, IIMs, CMU, IISc, NUS with rich experience across multiple industries.

Job Roles & Responsibilities
- Develop and implement data-driven models using Python, TensorFlow, Large Language Models (LLMs), and Scikit-learn to enhance content monetization for creators.
- Collaborate with cross-functional teams to analyze and interpret large datasets using NumPy, Pandas, and PySpark.
- Design and optimize machine learning algorithms and solutions to improve user engagement and revenue potential.
- Explore and integrate AI technologies to support and automate creator monetization pathways.
- Monitor model performance and iteratively refine based on business and technical feedback.
- Stay updated with the latest advancements in AI and data science to apply innovative solutions at Poiro.

Cultural Expectations
- Collaborate openly with team members to enhance AI-driven creator content tools
- Embrace innovation, continuously exploring and adapting to cutting-edge AI technologies
- Respect diverse ideas, fostering a creative and inclusive workplace
- Display a proactive mindset in problem-solving and process improvements
- Communicate clearly and effectively, contributing to a positive team dynamic
- Uphold accountability, meeting project deadlines with precision and reliability

Hiring Process
- Profile shortlisting
- Theory/problem-solving and coding/hands-on round
- Resume/experience-based discussion
- Cultural-fit discussion with the founders

Posted 6 days ago

Apply

4.0 years

0 Lacs

Kochi, Kerala, India

On-site

Linkedin logo

Introduction

In this role, you'll work in one of our IBM Consulting Client Innovation Centers (Delivery Centers), where we deliver deep technical and industry expertise to a wide range of public and private sector clients around the world. Our delivery centers offer our clients locally based skills and technical expertise to drive innovation and adoption of new technology.

Your Role And Responsibilities

As a Data Engineer, you will develop, maintain, evaluate and test big data solutions. You will be involved in the development of data solutions using the Spark framework with Python or Scala on Hadoop and the AWS Cloud Data Platform.

Responsibilities
- Build data pipelines to ingest, process, and transform data from files, streams and databases.
- Process data with Spark, Python, PySpark, Scala, and Hive, HBase or other NoSQL databases on Cloud Data Platforms (AWS) or HDFS.
- Develop efficient software code for multiple use cases leveraging the Spark framework with Python or Scala and Big Data technologies, for various use cases built on the platform.
- Develop streaming pipelines.
- Work with Hadoop / AWS ecosystem components to implement scalable solutions that meet ever-increasing data volumes, using big data and cloud technologies such as Apache Spark and Kafka.

Preferred Education

Master's Degree

Required Technical And Professional Expertise
- Minimum 4+ years of experience in Big Data technologies with extensive data engineering experience in Spark with Python or Scala
- Minimum 3 years of experience on Cloud Data Platforms on AWS
- Experience in AWS EMR / AWS Glue / Databricks, AWS Redshift, DynamoDB
- Good to excellent SQL skills
- Exposure to streaming solutions and message brokers such as Kafka

Preferred Technical And Professional Experience
- Certification in AWS and Databricks, or Cloudera Certified Spark Developer

Posted 6 days ago

Apply

2.0 years

0 Lacs

Gurugram, Haryana, India

On-site

Linkedin logo

Company Overview

Viraaj HR Solutions is a leading recruitment firm in India, committed to providing exceptional HR services to organizations across various industries. We pride ourselves on our ability to connect skilled individuals with top-tier companies, aligning talent with the right opportunities. Our mission is to foster professional growth through innovation and collaboration, ensuring our clients and candidates thrive in an ever-changing job market. We value integrity, respect, and excellence, making Viraaj HR Solutions a trusted partner in the recruitment process.

Role Responsibilities
- Develop and maintain scalable big data solutions using Hadoop, Spark, and other tools.
- Implement data ingestion processes through ETL tools and techniques.
- Design and optimize complex SQL queries for efficient data retrieval.
- Collaborate with data scientists to integrate machine learning algorithms into big data solutions.
- Ensure data quality and integrity through rigorous testing methods.
- Perform data mining techniques to uncover insights from large datasets.
- Utilize NoSQL databases to manage unstructured data sources.
- Write efficient code in Java and Python for data processing tasks.
- Participate in the architecture and design of data models for analytical purposes.
- Monitor and troubleshoot big data applications to optimize performance.
- Work alongside cross-functional teams to understand data requirements and provide solutions.
- Document the development processes and maintain technical documentation.
- Stay updated with emerging big data technologies and trends.
- Train junior developers on big data best practices and tools.
- Contribute to the continuous improvement of data operations.

Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 2+ years of experience in big data development.
- Strong understanding of Hadoop ecosystem components.
- Experience in Spark programming is essential.
- Proficiency in Java and Python programming languages.
- Knowledge of data warehousing and ETL concepts.
- Hands-on experience with SQL and NoSQL databases.
- Familiarity with cloud platforms such as AWS or Azure.
- Understanding of data mining and machine learning techniques.
- Excellent analytical and problem-solving skills.
- Strong collaboration and communication abilities.
- Ability to work effectively in a fast-paced environment.
- Detail-oriented approach to data processing and development.
- Willingness to learn new technologies and adapt to changes.
- Team player with a positive attitude.
- Prior experience in an Agile environment is a plus.

Join Viraaj HR Solutions as a Big Data Developer and take your career to the next level while contributing to innovative data solutions in a collaborative work environment. Apply today!

Skills: pyspark, spark, java, azure, etl tools, hadoop, sql, python, big data, data mining, gcp, machine learning, nosql databases, aws

Posted 1 week ago

Apply

2.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

Linkedin logo

Company Overview

Viraaj HR Solutions is a leading recruitment firm in India, committed to providing exceptional HR services to organizations across various industries. We pride ourselves on our ability to connect skilled individuals with top-tier companies, aligning talent with the right opportunities. Our mission is to foster professional growth through innovation and collaboration, ensuring our clients and candidates thrive in an ever-changing job market. We value integrity, respect, and excellence, making Viraaj HR Solutions a trusted partner in the recruitment process.

Role Responsibilities
- Develop and maintain scalable big data solutions using Hadoop, Spark, and other tools.
- Implement data ingestion processes through ETL tools and techniques.
- Design and optimize complex SQL queries for efficient data retrieval.
- Collaborate with data scientists to integrate machine learning algorithms into big data solutions.
- Ensure data quality and integrity through rigorous testing methods.
- Perform data mining techniques to uncover insights from large datasets.
- Utilize NoSQL databases to manage unstructured data sources.
- Write efficient code in Java and Python for data processing tasks.
- Participate in the architecture and design of data models for analytical purposes.
- Monitor and troubleshoot big data applications to optimize performance.
- Work alongside cross-functional teams to understand data requirements and provide solutions.
- Document the development processes and maintain technical documentation.
- Stay updated with emerging big data technologies and trends.
- Train junior developers on big data best practices and tools.
- Contribute to the continuous improvement of data operations.

Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 2+ years of experience in big data development.
- Strong understanding of Hadoop ecosystem components.
- Experience in Spark programming is essential.
- Proficiency in Java and Python programming languages.
- Knowledge of data warehousing and ETL concepts.
- Hands-on experience with SQL and NoSQL databases.
- Familiarity with cloud platforms such as AWS or Azure.
- Understanding of data mining and machine learning techniques.
- Excellent analytical and problem-solving skills.
- Strong collaboration and communication abilities.
- Ability to work effectively in a fast-paced environment.
- Detail-oriented approach to data processing and development.
- Willingness to learn new technologies and adapt to changes.
- Team player with a positive attitude.
- Prior experience in an Agile environment is a plus.

Join Viraaj HR Solutions as a Big Data Developer and take your career to the next level while contributing to innovative data solutions in a collaborative work environment. Apply today!

Skills: pyspark, spark, java, azure, etl tools, hadoop, sql, python, big data, data mining, gcp, machine learning, nosql databases, aws

Posted 1 week ago

Apply

3.0 - 6.0 years

0 Lacs

Navi Mumbai, Maharashtra, India

On-site

Linkedin logo

Job Title: Data Scientist
Location: Navi Mumbai
Experience: 3-6 Years
Duration: Full-time

Job Summary:

We are looking for a highly skilled Data Scientist with deep expertise in time series forecasting, particularly in demand forecasting and customer lifetime value (CLV) analytics. The ideal candidate will be proficient in Python or PySpark, have hands-on experience with tools like Prophet and ARIMA, and be comfortable working in Databricks environments. Familiarity with classic ML models and optimization techniques is a plus.

Key Responsibilities
• Develop, deploy, and maintain time series forecasting models (Prophet, ARIMA, etc.) for demand forecasting and customer behavior modeling.
• Design and implement Customer Lifetime Value (CLV) models to drive customer retention and engagement strategies.
• Process and analyze large datasets using PySpark or Python (Pandas).
• Partner with cross-functional teams to identify business needs and translate them into data science solutions.
• Leverage classic ML techniques (classification, regression) and boosting algorithms (e.g., XGBoost, LightGBM) to support broader analytics use cases.
• Use Databricks for collaborative development, data pipelines, and model orchestration.
• Apply optimization techniques where relevant to improve forecast accuracy and business decision-making.
• Present actionable insights and communicate model results effectively to technical and non-technical stakeholders.

Required Qualifications
• Strong experience in time series forecasting, with hands-on knowledge of Prophet, ARIMA, or equivalent – Mandatory.
• Proven track record in demand forecasting – Highly Preferred.
• Experience in modeling Customer Lifetime Value (CLV) or similar customer analytics use cases – Highly Preferred.
• Proficiency in Python (Pandas) or PySpark – Mandatory.
• Experience with Databricks – Mandatory.
• Solid foundation in statistics, predictive modeling, and machine learning.

Posted 1 week ago

Apply

2.0 years

0 Lacs

Pune, Maharashtra, India

On-site

Linkedin logo

Company Overview

Viraaj HR Solutions is a leading recruitment firm in India, committed to providing exceptional HR services to organizations across various industries. We pride ourselves on our ability to connect skilled individuals with top-tier companies, aligning talent with the right opportunities. Our mission is to foster professional growth through innovation and collaboration, ensuring our clients and candidates thrive in an ever-changing job market. We value integrity, respect, and excellence, making Viraaj HR Solutions a trusted partner in the recruitment process.

Role Responsibilities
- Develop and maintain scalable big data solutions using Hadoop, Spark, and other tools.
- Implement data ingestion processes through ETL tools and techniques.
- Design and optimize complex SQL queries for efficient data retrieval.
- Collaborate with data scientists to integrate machine learning algorithms into big data solutions.
- Ensure data quality and integrity through rigorous testing methods.
- Perform data mining techniques to uncover insights from large datasets.
- Utilize NoSQL databases to manage unstructured data sources.
- Write efficient code in Java and Python for data processing tasks.
- Participate in the architecture and design of data models for analytical purposes.
- Monitor and troubleshoot big data applications to optimize performance.
- Work alongside cross-functional teams to understand data requirements and provide solutions.
- Document the development processes and maintain technical documentation.
- Stay updated with emerging big data technologies and trends.
- Train junior developers on big data best practices and tools.
- Contribute to the continuous improvement of data operations.

Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 2+ years of experience in big data development.
- Strong understanding of Hadoop ecosystem components.
- Experience in Spark programming is essential.
- Proficiency in Java and Python programming languages.
- Knowledge of data warehousing and ETL concepts.
- Hands-on experience with SQL and NoSQL databases.
- Familiarity with cloud platforms such as AWS or Azure.
- Understanding of data mining and machine learning techniques.
- Excellent analytical and problem-solving skills.
- Strong collaboration and communication abilities.
- Ability to work effectively in a fast-paced environment.
- Detail-oriented approach to data processing and development.
- Willingness to learn new technologies and adapt to changes.
- Team player with a positive attitude.
- Prior experience in an Agile environment is a plus.

Join Viraaj HR Solutions as a Big Data Developer and take your career to the next level while contributing to innovative data solutions in a collaborative work environment. Apply today!

Skills: pyspark, spark, java, azure, etl tools, hadoop, sql, python, big data, data mining, gcp, machine learning, nosql databases, aws

Posted 1 week ago

Apply

Exploring PySpark Jobs in India

PySpark, the Python API for Apache Spark's distributed data processing engine, is in high demand in the Indian job market. With the growing need for big data processing and analysis, companies are actively seeking professionals with PySpark skills to join their teams. If you are a job seeker looking to build a career in big data and analytics, exploring PySpark jobs in India could be a strong career move.
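For readers new to the framework, a minimal sketch of everyday PySpark code is shown below; the file path and column names are placeholders chosen purely for illustration, not taken from any listing on this page.

# Minimal PySpark sketch: read a CSV, transform it, and aggregate.
# The path and column names ("/data/orders.csv", "city", "amount") are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pyspark-intro").getOrCreate()

orders = spark.read.csv("/data/orders.csv", header=True, inferSchema=True)

top_cities = (
    orders
    .filter(F.col("amount") > 0)                 # keep only valid rows
    .groupBy("city")
    .agg(F.sum("amount").alias("total_amount"))  # total amount per city
    .orderBy(F.desc("total_amount"))
)

top_cities.show(5)
spark.stop()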

Top Hiring Locations in India

Here are 5 major cities in India where companies are actively hiring for PySpark roles:

1. Bangalore
2. Pune
3. Hyderabad
4. Mumbai
5. Delhi

Average Salary Range

The estimated salary range for PySpark professionals in India varies based on experience levels. Entry-level positions can expect to earn around INR 6-8 lakhs per annum, while experienced professionals can earn upwards of INR 15 lakhs per annum.

Career Path

In the field of PySpark, a typical career progression may look like this:

1. Junior Developer
2. Data Engineer
3. Senior Developer
4. Tech Lead
5. Data Architect

Related Skills

In addition to PySpark, professionals in this field are often expected to have or develop skills in:

- Python programming
- Apache Spark
- Big data technologies (Hadoop, Hive, etc.)
- SQL
- Data visualization tools (Tableau, Power BI)
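To make the list concrete, here is a small, hypothetical sketch combining two of these skills, PySpark DataFrames and SQL; the table name and values are invented for illustration.

# Hypothetical sketch: querying a PySpark DataFrame with SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("pyspark-sql-demo").getOrCreate()

# Made-up sales data; column names are illustrative.
sales = spark.createDataFrame(
    [("Bangalore", 1200.0), ("Pune", 800.0), ("Bangalore", 450.0)],
    ["city", "amount"],
)

# Register the DataFrame as a temporary view so it can be queried with SQL.
sales.createOrReplaceTempView("sales")

spark.sql(
    "SELECT city, SUM(amount) AS total_amount "
    "FROM sales GROUP BY city ORDER BY total_amount DESC"
).show()

spark.stop()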

Interview Questions

Here are 25 interview questions you may encounter when applying for PySpark roles (a short code sketch after the list illustrates a few of the basic ones):

  • Explain what PySpark is and its main features (basic)
  • What are the advantages of using PySpark over other big data processing frameworks? (medium)
  • How do you handle missing or null values in PySpark? (medium)
  • What is RDD in PySpark? (basic)
  • What is a DataFrame in PySpark and how is it different from an RDD? (medium)
  • How can you optimize performance in PySpark jobs? (advanced)
  • Explain the difference between map and flatMap transformations in PySpark (basic)
  • What is the role of a SparkContext in PySpark? (basic)
  • How do you handle schema inference in PySpark? (medium)
  • What is a SparkSession in PySpark? (basic)
  • How do you join DataFrames in PySpark? (medium)
  • Explain the concept of partitioning in PySpark (medium)
  • What is a UDF in PySpark? (medium)
  • How do you cache DataFrames in PySpark for optimization? (medium)
  • Explain the concept of lazy evaluation in PySpark (medium)
  • How do you handle skewed data in PySpark? (advanced)
  • What is checkpointing in PySpark and how does it help in fault tolerance? (advanced)
  • How do you tune the performance of a PySpark application? (advanced)
  • Explain the use of Accumulators in PySpark (advanced)
  • How do you handle broadcast variables in PySpark? (advanced)
  • What are the different data sources supported by PySpark? (medium)
  • How can you run PySpark on a cluster? (medium)
  • What is the purpose of the PySpark MLlib library? (medium)
  • How do you handle serialization and deserialization in PySpark? (advanced)
  • What are the best practices for deploying PySpark applications in production? (advanced)
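As quick preparation, the sketch below works through a few of the basic questions above (map vs flatMap, handling nulls, and caching under lazy evaluation); the data is invented purely for illustration.

# Illustrative answers to a few of the basic questions above; data is made up.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("interview-prep").getOrCreate()

# map vs flatMap: map yields one output element per input, flatMap can yield many.
rdd = spark.sparkContext.parallelize(["spark is fast", "pyspark api"])
print(rdd.map(lambda s: s.split(" ")).collect())      # [['spark', 'is', 'fast'], ['pyspark', 'api']]
print(rdd.flatMap(lambda s: s.split(" ")).collect())  # ['spark', 'is', 'fast', 'pyspark', 'api']

# Handling missing values in a DataFrame.
df = spark.createDataFrame([(1, None), (2, "ok")], ["id", "status"])
filled = df.fillna({"status": "unknown"})    # replace nulls with a default
dropped = df.dropna(subset=["status"])       # or drop rows where status is null

# Lazy evaluation and caching: transformations only build a plan; actions trigger work.
# cache() keeps the computed result in memory for reuse across later actions.
filled.cache()
print(filled.count())   # first action materializes and caches the result
filled.show()           # reuses the cached data

spark.stop()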

Closing Remark

As you explore PySpark jobs in India, remember to prepare thoroughly for interviews and showcase your expertise confidently. With the right skills and knowledge, you can excel in this field and advance your career in the world of big data and analytics. Good luck!
