Home
Jobs

3895 PySpark Jobs - Page 22

JobPe aggregates listings for easy access; applications are submitted directly on the original job portal.

10.0 years

0 Lacs

Pune, Maharashtra, India

On-site

Source: LinkedIn

Data Ops Capability Deployment - Analyst is a seasoned professional role. Applies in-depth disciplinary knowledge, contributing to the development of new solutions/frameworks/techniques and the improvement of processes and workflows for the Enterprise Data function. Integrates subject matter and industry expertise within a defined area. Requires an in-depth understanding of how areas collectively integrate within the sub-function, as well as coordinating and contributing to the objectives of the function and overall business. The primary purpose of this role is to perform data analytics and data analysis across different asset classes, and to build data science/tooling capabilities within the team. This will involve working closely with the wider Enterprise Data team, in particular the front-to-back leads, to deliver business priorities. The role sits within the B & I Data Capabilities team within Enterprise Data. The team manages the Data Quality/Metrics/Controls program in addition to a broad remit to implement and embed improved data governance and data management practices throughout the region. The Data Quality program is centered on enhancing Citi's approach to data risk and addressing regulatory commitments in this area.

Key Responsibilities:
Hands-on data engineering background with a thorough understanding of distributed data platforms and cloud services.
Sound understanding of data architecture and data integration with enterprise applications.
Research and evaluate new data technologies, data mesh architecture, and self-service data platforms.
Work closely with the Enterprise Architecture team on the definition and refinement of the overall data strategy.
Address performance bottlenecks, design batch orchestrations, and deliver reporting capabilities.
Perform complex data analytics (data cleansing, transformation, joins, aggregation, etc.) on large, complex datasets.
Build analytics dashboards and data science capabilities for Enterprise Data platforms.
Communicate complicated findings and propose solutions to a variety of stakeholders.
Understand business and functional requirements provided by business analysts and convert them into technical design documents.
Work closely with cross-functional teams, e.g. Business Analysis, Product Assurance, Platforms and Infrastructure, Business Office, Control and Production Support.
Prepare handover documents and manage SIT, UAT and implementation.
Demonstrate an in-depth understanding of how the development function integrates within the overall business/technology to achieve objectives; requires a good understanding of the banking industry.
Perform other duties and functions as assigned.
Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency.

Skills & Qualifications:
10+ years of active development background; experience in Financial Services or Finance IT is required.
Experience with Data Quality/Data Tracing/Data Lineage/Metadata Management tools.
Hands-on experience with ETL using PySpark on distributed platforms, along with data ingestion, Spark optimization, resource utilization, capacity planning and batch orchestration.
In-depth understanding of Hive, HDFS, Airflow and job schedulers.
Strong programming skills in Python with experience in data manipulation and analysis libraries (Pandas, NumPy).
Able to write complex SQL and stored procedures.
Experience with DevOps, Jenkins/Lightspeed, Git, and Copilot.
Strong knowledge of one or more BI visualization tools such as Tableau or Power BI.
Proven experience implementing a data lake/data warehouse for enterprise use cases.
Exposure to analytical tools and AI/ML is desired.

Education: Bachelor's/University degree or master's degree in Information Systems, Business Analysis or Computer Science.

Job Family Group: Data Governance
Job Family: Data Governance Foundation
Time Type: Full time

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law. If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi. View Citi's EEO Policy Statement and the Know Your Rights poster.
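The listing above asks for hands-on PySpark ETL experience (data cleansing, transformation, joins, aggregation) on distributed platforms. As a purely illustrative, hedged sketch of that kind of work, the snippet below reads two hypothetical Hive tables, cleanses and joins them, and writes an aggregated summary; every table and column name is an assumption, not something from the posting.

```python
# Illustrative only: a minimal PySpark ETL sketch (cleanse, join, aggregate).
# Table and column names (raw.trades, ref.counterparties, notional, asset_class)
# are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("data-quality-etl").getOrCreate()

# Ingest two hypothetical Hive tables
trades = spark.table("raw.trades").dropDuplicates(["trade_id"])
counterparties = spark.table("ref.counterparties")

# Cleanse, enrich, and aggregate by asset class
clean = trades.filter(F.col("notional").isNotNull())
enriched = clean.join(counterparties, on="counterparty_id", how="left")
summary = (
    enriched.groupBy("asset_class")
    .agg(F.count("*").alias("trade_count"), F.sum("notional").alias("total_notional"))
)

summary.write.mode("overwrite").saveAsTable("curated.asset_class_summary")
```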

Posted 5 days ago

Apply

3.0 years

0 Lacs

Greater Chennai Area

On-site

Source: LinkedIn

Chennai / Bangalore / Hyderabad

Who We Are
Tiger Analytics is a global leader in AI and analytics, helping Fortune 1000 companies solve their toughest challenges. We offer full-stack AI and analytics services and solutions to empower businesses to achieve real outcomes and value at scale. We are on a mission to push the boundaries of what AI and analytics can do to help enterprises navigate uncertainty and move forward decisively. Our purpose is to provide certainty to shape a better tomorrow. Our team of 4000+ technologists and consultants are based in the US, Canada, the UK, India, Singapore and Australia, working closely with clients across CPG, Retail, Insurance, BFS, Manufacturing, Life Sciences, and Healthcare. Many of our team leaders rank in Top 10 and 40 Under 40 lists, exemplifying our dedication to innovation and excellence. We are Great Place to Work-Certified™ (2022-24) and recognized by analyst firms such as Forrester, Gartner, HFS, Everest, ISG and others. We have been ranked among the 'Best' and 'Fastest Growing' analytics firms by Inc., Financial Times, Economic Times and Analytics India Magazine.

Curious about the role? What would your typical day look like?
We are looking for a Senior Analyst or Machine Learning Engineer who will work on a broad range of cutting-edge data analytics and machine learning problems across a variety of industries. More specifically, you will:
Engage with clients to understand their business context.
Translate business problems and technical constraints into technical requirements for the desired analytics solution.
Collaborate with a team of data scientists and engineers to embed AI and analytics into the business decision processes.

What do we expect?
3+ years of experience, with at least 1+ years of relevant data science experience.
Proficient in structured Python, PySpark, and machine learning (experience productionizing models).
Proficiency in AWS cloud technologies is mandatory.
Experience with and a good understanding of SageMaker/Databricks.
Experience with MLOps frameworks (e.g. MLflow or Kubeflow).
Follows good software engineering practices and has an interest in building reliable and robust software.
Good understanding of data science concepts and the DS model lifecycle.
Working knowledge of Linux or Unix environments, ideally in a cloud environment.
Model deployment / model monitoring experience (preferably in the banking domain).
CI/CD pipeline creation is good to have.
Excellent written and verbal communication skills.
B.Tech from a Tier-1 college / M.S. or M.Tech is preferred.

You are important to us, let's stay connected!
Every individual comes with a different set of skills and qualities, so even if you don't tick all the boxes for the role today, we urge you to apply as there might be a suitable/unique role for you tomorrow. We are an equal-opportunity employer. Our diverse and inclusive culture and values guide us to listen, trust, respect, and encourage people to grow the way they desire.

Note: The designation will be commensurate with expertise and experience. Compensation packages are among the best in the industry.

Additional Benefits: Health insurance (self & family), virtual wellness platform, and knowledge communities.
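The role above emphasizes productionizing models and MLOps frameworks such as MLflow or Kubeflow. Below is a minimal, illustrative MLflow tracking sketch (not part of the posting): it trains a toy classifier and logs a parameter, a metric, and the model artifact. The experiment name, data, and model choice are hypothetical.

```python
# Illustrative only: a minimal MLflow experiment-tracking sketch.
# Dataset, features, and experiment name are hypothetical.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

mlflow.set_experiment("churn-model")          # hypothetical experiment name
with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=200, random_state=42)
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    mlflow.log_param("n_estimators", 200)
    mlflow.log_metric("auc", auc)
    mlflow.sklearn.log_model(model, "model")  # artifact path within the run
```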

Posted 5 days ago

Apply

4.0 - 8.0 years

6 - 10 Lacs

Pune, Gurugram

Work from Office

Source: Naukri

ZS is a place where passion changes lives. As a management consulting and technology firm focused on improving life and how we live it, our most valuable asset is our people. Here you'll work side-by-side with a powerful collective of thinkers and experts shaping life-changing solutions for patients, caregivers and consumers, worldwide. ZSers drive impact by bringing a client-first mentality to each and every engagement. We partner collaboratively with our clients to develop custom solutions and technology products that create value and deliver company results across critical areas of their business. Bring your curiosity for learning, bold ideas, courage and passion to drive life-changing impact to ZS.

Our most valuable asset is our people. At ZS we honor the visible and invisible elements of our identities, personal experiences and belief systems, the ones that comprise us as individuals, shape who we are and make us unique. We believe your personal interests, identities, and desire to learn are part of your success here. Learn more about our diversity, equity, and inclusion efforts and the networks ZS supports to assist our ZSers in cultivating community spaces, obtaining the resources they need to thrive, and sharing the messages they are passionate about.

Business Technology
ZS's Technology group focuses on scalable strategies, assets and accelerators that deliver enterprise-wide transformation to our clients via cutting-edge technology. We leverage digital and technology solutions to optimize business processes, enhance decision-making, and drive innovation. Our services include, but are not limited to, Digital and Technology advisory, Product and Platform development, and Data, Analytics and AI implementation.

What you'll do
Undertake complete ownership in accomplishing activities and assigned responsibilities across all phases of the project lifecycle to solve business problems across one or more client engagements.
Apply appropriate development methodologies (e.g. agile, waterfall) and best practices (e.g. mid-development client reviews, embedded QA procedures, unit testing) to ensure successful and timely completion of assignments.
Collaborate with other team members to leverage expertise and ensure seamless transitions.
Exhibit flexibility in undertaking new and challenging problems and demonstrate excellent task management.
Assist in creating project outputs such as business case development, solution vision and design, user requirements, prototypes, technical architecture (if needed), test cases, and operations management.
Bring transparency in driving assigned tasks to completion and report accurate status.
Bring a consulting mindset to problem solving and innovation by leveraging technical and business knowledge/expertise and collaborating across other teams.
Assist senior team members and delivery leads in project management responsibilities.

What you'll bring
Big Data Technologies: Proficiency in working with big data technologies, particularly in the context of Azure Databricks, which may include Apache Spark for distributed data processing.
Azure Databricks: In-depth knowledge of Azure Databricks for data engineering tasks, including data transformations, ETL processes, and job scheduling.
SQL and Query Optimization: Strong SQL skills for data manipulation and retrieval, along with the ability to optimize queries for performance in Snowflake.
ETL (Extract, Transform, Load): Expertise in designing and implementing ETL processes to move and transform data between systems, utilizing tools and frameworks available in Azure Databricks.
Data Integration: Experience with integrating diverse data sources into a cohesive and usable format, ensuring data quality and integrity.
Python/PySpark: Knowledge of programming languages like Python and PySpark for scripting and extending the functionality of Azure Databricks notebooks.
Version Control: Familiarity with version control systems, such as Git, for managing code and configurations in a collaborative environment.
Monitoring and Optimization: Ability to monitor data pipelines, identify bottlenecks, and optimize performance for both Azure Data Factory and Azure Databricks.
Security and Compliance: Understanding of security best practices and compliance considerations when working with sensitive data in Azure and Snowflake environments.
Snowflake Data Warehouse: Experience in designing, implementing, and optimizing data warehouses using Snowflake, including schema design, performance tuning, and query optimization.
Healthcare Domain Knowledge: Familiarity with US health plan terminologies and datasets is essential.
Programming/Scripting Languages: Proficiency in Python, SQL, and PySpark is required.
Cloud Platforms: Experience with AWS or Azure, specifically in building data pipelines, is needed.
Cloud-Based Data Platforms: Working knowledge of Snowflake and Databricks is preferred.
Data Pipeline Orchestration: Experience with Azure Data Factory and AWS Glue for orchestrating data pipelines is necessary.
Relational Databases: Competency with relational databases such as PostgreSQL and MySQL is required, while experience with NoSQL databases is a plus.
BI Tools: Knowledge of BI tools such as Tableau and Power BI is expected.
Version Control: Proficiency with Git, including branching, merging, and pull requests, is required.
CI/CD for Data Pipelines: Experience in implementing continuous integration and delivery for data workflows using tools like Azure DevOps is essential.

Additional Skills
Experience with front-end technologies such as SQL, JavaScript, HTML, CSS, and Angular is advantageous. Familiarity with web development frameworks like Flask, Django, and FastAPI is beneficial. Basic knowledge of AWS CI/CD practices is a plus. Strong verbal and written communication skills with the ability to articulate results and issues to internal and client teams. Proven ability to work creatively and analytically in a problem-solving environment. Willingness to travel to other global offices as needed to work with client or other internal project teams.

Perks & Benefits
ZS offers a comprehensive total rewards package including health and well-being, financial planning, annual leave, personal growth and professional development. Our robust skills development programs, multiple career progression options, internal mobility paths and collaborative culture empower you to thrive as an individual and global team member.

We are committed to giving our employees a flexible and connected way of working. A flexible and connected ZS allows us to combine work from home and on-site presence at clients/ZS offices for the majority of our week. The magic of ZS culture and innovation thrives in both planned and spontaneous face-to-face connections.

Travel
Travel is a requirement at ZS for client-facing ZSers; the business needs of your project and client are the priority. While some projects may be local, all client-facing ZSers should be prepared to travel as needed. Travel provides opportunities to strengthen client relationships, gain diverse experiences, and enhance professional growth by working in different environments and cultures.

Considering applying?
At ZS, we're building a diverse and inclusive company where people bring their passions to inspire life-changing impact and deliver better outcomes for all. We are most interested in finding the best candidate for the job and recognize the value that candidates with all backgrounds, including non-traditional ones, bring. If you are interested in joining us, we encourage you to apply even if you don't meet 100% of the requirements listed above.

To Complete Your Application
Candidates must possess or be able to obtain work authorization for their intended country of employment. An online application, including a full set of transcripts (official or unofficial), is required to be considered.
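Several of the requirements above revolve around Azure Databricks ETL: reading raw data, transforming it with PySpark, and managing performance. The sketch below is an illustrative example of that pattern under assumed storage paths and column names (it writes a partitioned Delta table); it is not code from ZS or from this posting.

```python
# Illustrative only: a hedged sketch of a Databricks-style ETL step -
# read raw files, transform with PySpark, and write a partitioned Delta table.
# Mount paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("claims-etl").getOrCreate()

raw = spark.read.format("parquet").load("/mnt/raw/claims/")  # hypothetical mount
transformed = (
    raw.withColumn("claim_date", F.to_date("claim_date"))
       .filter(F.col("claim_amount") > 0)
       .withColumn("load_ts", F.current_timestamp())
)

(transformed.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("claim_date")
    .save("/mnt/curated/claims/"))
```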

Posted 5 days ago

Apply

4.0 - 9.0 years

6 - 12 Lacs

Hyderabad

Work from Office

Source: Naukri

ABOUT THE ROLE

Role Description: We are seeking an experienced MDM Senior Data Engineer with 6-9 years of experience and expertise in backend engineering to work closely with the business on the development and operations of our Master Data Management (MDM) platforms, with hands-on experience in Informatica or Reltio plus data engineering experience. This role will also involve guiding junior data engineers/analysts and quality experts to deliver high-performance, scalable, and governed MDM solutions that align with the enterprise data strategy. To succeed in this role, the candidate must have strong data engineering experience along with MDM knowledge; candidates with only MDM experience are not eligible. The candidate must have data engineering experience with technologies such as SQL, Python, PySpark, Databricks, AWS and API integrations, along with knowledge of Master Data Management (MDM).

Roles & Responsibilities:
Develop the MDM backend solutions and implement ETL and data engineering pipelines using Databricks, AWS, Python/PySpark, SQL, etc.
Lead the implementation and optimization of MDM solutions using Informatica or Reltio platforms.
Perform data profiling and identify the data quality (DQ) rules needed.
Define and drive enterprise-wide MDM architecture, including IDQ, data stewardship, and metadata workflows.
Manage cloud-based infrastructure using AWS and Databricks to ensure scalability and performance.
Ensure data integrity, lineage, and traceability across MDM pipelines and solutions.
Provide mentorship and technical leadership to junior team members and ensure project delivery timelines.
Help the custom UI team integrate with backend data using APIs or other integration methods for a better data stewardship user experience.

Basic Qualifications and Experience:
Master's degree with 4-6 years of experience in Business, Engineering, IT or a related field; OR Bachelor's degree with 6-9 years of experience in Business, Engineering, IT or a related field; OR Diploma with 10-12 years of experience in Business, Engineering, IT or a related field.

Functional Skills - Must-Have:
Strong understanding and hands-on experience of Databricks and AWS cloud services.
Proficiency in Python, PySpark, SQL, and Unix for data processing and orchestration.
Deep knowledge of MDM tools (Informatica, Reltio) and data quality frameworks (IDQ).
Knowledge of customer master data (HCP, HCO, etc.).
Experience with data modeling, governance, and DCR lifecycle management.
Able to implement end-to-end integrations, including API-based, batch, and flat file-based integrations.
Strong experience with external data enrichment such as D&B.
Strong experience with match/merge and survivorship rules implementations.
Very good understanding of reference data and its integration with MDM.
Hands-on experience with custom workflows or building data pipelines/orchestrations.

Good-to-Have Skills:
Experience with Tableau or Power BI for reporting MDM insights.
Exposure to or knowledge of data science and GenAI capabilities.
Exposure to Agile practices and tools (JIRA, Confluence).
Prior experience in Pharma/Life Sciences.
Understanding of compliance and regulatory considerations in master data.

Professional Certifications:
Any MDM certification (e.g. Informatica, Reltio).
Databricks certifications (Data Engineer or Architect).
Any cloud certification (AWS or Azure).

Soft Skills:
Strong analytical abilities to assess and improve master data processes and solutions.
Excellent verbal and written communication skills, with the ability to convey complex data concepts clearly to technical and non-technical stakeholders.
Effective problem-solving skills to address data-related issues and implement scalable solutions.
Ability to work effectively with global, virtual teams.

Posted 5 days ago

Apply

0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

Source: LinkedIn

Role: MLOps Engineer
Location: Chennai (CKC)
Mode of Interview: In person
Date: 7th June 2025 (Saturday)

Key Skills:
AWS SageMaker, Azure ML Studio, GCP Vertex AI
PySpark, Azure Databricks
MLflow, Kubeflow, Airflow, GitHub Actions, AWS CodePipeline
Kubernetes, AKS, Terraform, FastAPI

Responsibilities:
Model deployment, model monitoring, model retraining.
Deployment pipeline, inference pipeline, monitoring pipeline, retraining pipeline.
Drift detection, data drift, model drift.
Experiment tracking.
MLOps architecture.
REST API publishing.

Job Responsibilities:
Research and implement MLOps tools, frameworks and platforms for our Data Science projects.
Work on a backlog of activities to raise MLOps maturity in the organization.
Proactively introduce a modern, agile and automated approach to Data Science.
Conduct internal training and presentations about MLOps tools' benefits and usage.

Required Experience and Qualifications:
Wide experience with Kubernetes.
Experience in operationalization of Data Science projects (MLOps) using at least one of the popular frameworks or platforms (e.g. Kubeflow, AWS SageMaker, Google AI Platform, Azure Machine Learning, DataRobot, DKube).
Good understanding of ML and AI concepts.
Hands-on experience in ML model development.
Proficiency in Python used both for ML and automation tasks.
Good knowledge of Bash and the Unix command line toolkit.
Experience in CI/CD/CT pipeline implementation.
Experience with cloud platforms, preferably AWS, would be an advantage.
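Drift detection is one of the responsibilities listed above. As a hedged illustration only, the snippet below runs a two-sample Kolmogorov-Smirnov test to flag distribution drift for a single numeric feature; the threshold, feature, and data are hypothetical, and real deployments would typically lean on a monitoring framework or pipeline rather than a standalone script.

```python
# Illustrative only: a minimal data-drift check using a two-sample KS test.
# Threshold and feature data are hypothetical placeholders.
import numpy as np
from scipy.stats import ks_2samp

def detect_drift(reference: np.ndarray, current: np.ndarray, alpha: float = 0.05) -> bool:
    """Return True if the current feature distribution drifts from the reference."""
    statistic, p_value = ks_2samp(reference, current)
    return p_value < alpha

# Hypothetical usage with one numeric feature
reference_scores = np.random.normal(0.0, 1.0, size=5_000)
current_scores = np.random.normal(0.3, 1.0, size=5_000)
print("Drift detected:", detect_drift(reference_scores, current_scores))
```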

Posted 5 days ago

Apply

0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

Source: LinkedIn

Company Overview
Viraaj HR Solutions is a leading recruitment firm in India, dedicated to connecting top talent with industry-leading companies. We focus on understanding the unique needs of each client, providing tailored HR solutions that enhance their workforce capabilities. Our mission is to empower organizations by bridging the gap between talent and opportunity. We value integrity, collaboration, and excellence in service delivery, ensuring a seamless experience for both candidates and employers.

Job Title: PySpark Data Engineer
Work Mode: On-Site
Location: India

Role Responsibilities
Design, develop, and maintain data pipelines using PySpark. Collaborate with data scientists and analysts to gather data requirements. Optimize data processing workflows for efficiency and performance. Implement ETL processes to integrate data from various sources. Create and maintain data models that support analytical reporting. Ensure data quality and accuracy through rigorous testing and validation. Monitor and troubleshoot production data pipelines to resolve issues. Work with SQL databases to extract and manipulate data as needed. Utilize cloud technologies for data storage and processing solutions. Participate in code reviews and provide constructive feedback. Document technical specifications and processes clearly for team reference. Stay updated with industry trends and emerging technologies in big data. Collaborate with cross-functional teams to deliver data solutions. Support the data governance initiatives to ensure compliance. Provide training and mentorship to junior data engineers.

Qualifications
Bachelor's degree in Computer Science, Information Technology, or a related field. Proven experience as a Data Engineer, preferably with PySpark. Strong understanding of data warehousing concepts and architecture. Hands-on experience with ETL tools and frameworks. Proficiency in SQL and NoSQL databases. Familiarity with cloud platforms like AWS, Azure, or Google Cloud. Experience with Python programming for data manipulation. Knowledge of data modeling techniques and best practices. Ability to work in a fast-paced environment and juggle multiple tasks. Excellent problem-solving skills and attention to detail. Strong communication and interpersonal skills. Ability to work independently and as part of a team. Experience in Agile methodologies and practices. Knowledge of data governance and compliance standards. Familiarity with BI tools such as Tableau or Power BI is a plus.

Skills: data modeling, Python programming, PySpark, BI tools, SQL proficiency, SQL, cloud technologies, NoSQL databases, ETL processes, data warehousing, Agile methodologies, cloud computing, data engineering

Posted 5 days ago

Apply

2.0 - 4.0 years

4 - 6 Lacs

Hyderabad

Work from Office

Source: Naukri

Overview
The Data Science team works on developing Machine Learning (ML) and Artificial Intelligence (AI) projects. The specific scope of this role is to develop ML solutions in support of ML/AI projects using big analytics toolsets in a CI/CD environment. Analytics toolsets may include DS tools/Spark/Databricks and other technologies offered by Microsoft Azure or open-source toolsets. This role will also help automate the end-to-end cycle with Azure Pipelines. You will be part of a collaborative interdisciplinary team around data, where you will be responsible for our continuous delivery of statistical/ML models. You will work closely with process owners, product owners and final business users. This will give you the right visibility and understanding of the criticality of your developments.

Responsibilities
Delivery of key Advanced Analytics/Data Science projects within time and budget, particularly around DevOps/MLOps and Machine Learning models in scope.
Active contributor to code and development in projects and services.
Partner with data engineers to ensure data access for discovery and that proper data is prepared for model consumption.
Partner with ML engineers working on industrialization.
Communicate with business stakeholders in the process of service design, training and knowledge transfer.
Support large-scale experimentation and build data-driven models.
Refine requirements into modelling problems.
Influence product teams through data-based recommendations.
Research state-of-the-art methodologies.
Create documentation for learnings and knowledge transfer.
Create reusable packages or libraries.
Ensure on-time and on-budget delivery which satisfies project requirements, while adhering to enterprise architecture standards.
Leverage big data technologies to help process data and build scaled data pipelines (batch to real time).
Implement the end-to-end ML lifecycle with Azure Databricks and Azure Pipelines.
Automate ML model deployments.

Qualifications
BE/B.Tech in Computer Science, Maths or related technical fields.
Overall 2-4 years of experience working as a Data Scientist.
2+ years of experience building solutions in the commercial or supply chain space.
2+ years working in a team to deliver production-level analytic solutions.
Fluent in Git (version control); understanding of Jenkins and Docker is a plus.
Fluent in SQL syntax.
2+ years of experience in statistical/ML techniques to solve supervised (regression, classification) and unsupervised problems.
2+ years of experience in developing business-problem-related statistical/ML modeling with industry tools, with a primary focus on Python or PySpark development.
Data Science: hands-on experience and strong knowledge of building supervised and unsupervised machine learning models; knowledge of time series/demand forecast models is a plus.
Programming Skills: hands-on experience in statistical programming languages like Python and PySpark, and database query languages like SQL.
Statistics: good applied statistical skills, including knowledge of statistical tests, distributions, regression and maximum likelihood estimators.
Cloud (Azure): experience in Databricks and ADF is desirable; familiarity with Spark, Hive and Pig is an added advantage.
Business storytelling and communicating data insights in a business-consumable format; fluent in one visualization tool.
Strong communication and organizational skills, with the ability to deal with ambiguity while juggling multiple priorities.
Experience with Agile methodology for teamwork and analytics product creation.
Experience in Reinforcement Learning is a plus.
Experience in simulation and optimization problems in any space is a plus.
Experience with Bayesian methods is a plus.
Experience with causal inference is a plus.
Experience with NLP is a plus.
Experience with Responsible AI is a plus.
Experience with distributed machine learning is a plus.
Experience in DevOps, with hands-on experience with one or more cloud service providers (AWS, GCP, Azure preferred).
Model deployment experience is a plus.
Experience with version control systems like GitHub and CI/CD tools.
Experience in exploratory data analysis.
Knowledge of MLOps/DevOps and deploying ML models is preferred.
Experience using MLflow, Kubeflow, etc. is preferred.
Experience executing and contributing to MLOps automation infrastructure is good to have.
Exceptional analytical and problem-solving skills.
Stakeholder engagement with BUs and vendors.
Experience building statistical models in the Retail or Supply Chain space is a plus.

Posted 5 days ago

Apply

4.0 - 9.0 years

6 - 11 Lacs

Hyderabad

Work from Office

Source: Naukri

Overview
Provide data science/analytics support for the Perfect Store group, which works with AMESA Sectors as part of the broader Global Capability Center in Hyderabad, India. This role will help enable accelerated growth for PepsiCo by building the Retailer Value Offer and Shopper Value Offer, aligning data, and applying advanced analytics approaches to drive actionable insights at the Business Unit and store level. Key responsibilities will be to build and manage advanced analytics deep dives in a cloud environment, and to manage and prepare data for use in advanced analytics, artificial intelligence, machine learning, and deep learning projects.

Responsibilities
Support the Perfect Store (Demand Accelerator) team with delivery of the Retail Value Offer and Shopper Value Offer framework for the AMESA sector.
Work within a cloud environment (e.g., Microsoft Azure).
Build and maintain code for use in advanced analytics, artificial intelligence, and machine learning projects.
Clean and prepare data for use in advanced analytics, artificial intelligence, and machine learning projects.
Build deep-dive analysis reports in the cloud environment (using PySpark and Python) to support BU asks.
Develop, maintain, and apply statistical techniques to business questions, including distributions, outliers, visualizations, etc.
Support relationships with the key end-user stakeholders within Business Units-AMESA.
Own flawless execution and quality checking of analytics exercises.
Manage multiple priorities, deadlines and deliverables.
Lead communication with business partners and potentially end users on matters such as available capacity, changes of scope of existing projects and planning of future projects.
Deliver outputs in line with the agreed timelines and formats while updating existing project management tools.
Flag and monitor any business risks related to delivering the requested outputs.

Qualifications
An experienced analytics professional with 4+ years of experience.
Education: B.Tech or any bachelor's degree; a master's is optional.
Proficient with Python, SQL, Excel and Power BI.
Knowledge of machine learning algorithms is a plus.
Retail experience is good to have.
Strong collaborator: interested in and motivated by working with others; owns full responsibility for deliverables, quality-checks thoroughly, and looks for and works on improvements in the process.
Actively creates and participates in opportunities to co-create solutions across markets.
Willing and able to embrace Responsive Ways of Working.

Posted 5 days ago

Apply

0 years

0 Lacs

Bhubaneswar, Odisha, India

On-site

Source: LinkedIn

Company Overview
Viraaj HR Solutions is a leading recruitment firm in India, dedicated to connecting top talent with industry-leading companies. We focus on understanding the unique needs of each client, providing tailored HR solutions that enhance their workforce capabilities. Our mission is to empower organizations by bridging the gap between talent and opportunity. We value integrity, collaboration, and excellence in service delivery, ensuring a seamless experience for both candidates and employers.

Job Title: PySpark Data Engineer
Work Mode: On-Site
Location: India

Role Responsibilities
Design, develop, and maintain data pipelines using PySpark. Collaborate with data scientists and analysts to gather data requirements. Optimize data processing workflows for efficiency and performance. Implement ETL processes to integrate data from various sources. Create and maintain data models that support analytical reporting. Ensure data quality and accuracy through rigorous testing and validation. Monitor and troubleshoot production data pipelines to resolve issues. Work with SQL databases to extract and manipulate data as needed. Utilize cloud technologies for data storage and processing solutions. Participate in code reviews and provide constructive feedback. Document technical specifications and processes clearly for team reference. Stay updated with industry trends and emerging technologies in big data. Collaborate with cross-functional teams to deliver data solutions. Support the data governance initiatives to ensure compliance. Provide training and mentorship to junior data engineers.

Qualifications
Bachelor's degree in Computer Science, Information Technology, or a related field. Proven experience as a Data Engineer, preferably with PySpark. Strong understanding of data warehousing concepts and architecture. Hands-on experience with ETL tools and frameworks. Proficiency in SQL and NoSQL databases. Familiarity with cloud platforms like AWS, Azure, or Google Cloud. Experience with Python programming for data manipulation. Knowledge of data modeling techniques and best practices. Ability to work in a fast-paced environment and juggle multiple tasks. Excellent problem-solving skills and attention to detail. Strong communication and interpersonal skills. Ability to work independently and as part of a team. Experience in Agile methodologies and practices. Knowledge of data governance and compliance standards. Familiarity with BI tools such as Tableau or Power BI is a plus.

Skills: data modeling, Python programming, PySpark, BI tools, SQL proficiency, SQL, cloud technologies, NoSQL databases, ETL processes, data warehousing, Agile methodologies, cloud computing, data engineering

Posted 5 days ago

Apply

6.0 - 10.0 years

8 - 14 Lacs

Bengaluru

Work from Office

Source: Naukri

Work Location: Bangalore (CV Raman Nagar)
Notice Period: Immediate to 30 days
Mandatory Skills: Big Data, Python, SQL, Spark/PySpark, AWS Cloud

JD and required skills & responsibilities:
- Actively participate in all phases of the software development lifecycle, including requirements gathering, functional and technical design, development, testing, roll-out, and support.
- Solve complex business problems by utilizing a disciplined development methodology.
- Produce scalable, flexible, efficient, and supportable solutions using appropriate technologies.
- Analyse the source and target system data. Map the transformations that meet the requirements.
- Interact with the client and onsite coordinators during different phases of a project.
- Design and implement product features in collaboration with business and technology stakeholders.
- Anticipate, identify, and solve issues concerning data management to improve data quality.
- Clean, prepare, and optimize data at scale for ingestion and consumption.
- Support the implementation of new data management projects and restructure the current data architecture.
- Implement automated workflows and routines using workflow scheduling tools.
- Understand and use continuous integration, test-driven development, and production deployment frameworks.
- Participate in design, code, test plans, and dataset implementation performed by other data engineers in support of maintaining data engineering standards.
- Analyze and profile data for the purpose of designing scalable solutions.
- Troubleshoot straightforward data issues and perform root cause analysis to proactively resolve product issues.

Required Skills:
- 5+ years of relevant experience developing data and analytics solutions.
- Experience building data lake solutions leveraging one or more of the following: AWS, EMR, S3, Hive & PySpark.
- Experience with relational SQL.
- Experience with scripting languages such as Python.
- Experience with source control tools such as GitHub and related dev processes.
- Experience with workflow scheduling tools such as Airflow.
- In-depth knowledge of AWS Cloud (S3, EMR, Databricks).
- A passion for data solutions.
- A strong problem-solving and analytical mindset.
- Working experience in the design, development, and testing of data pipelines.
- Experience working with Agile teams.
- Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders.
- Able to quickly pick up new programming languages, technologies, and frameworks.
- Bachelor's degree in Computer Science.
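The role above calls for implementing automated workflows with a scheduler such as Airflow. The following is an illustrative Airflow DAG sketch (assuming Airflow 2.x), with a hypothetical DAG id, schedule, and spark-submit command, showing how a daily PySpark ingest might be orchestrated.

```python
# Illustrative only: a minimal Airflow 2.x DAG submitting a daily PySpark job.
# The DAG id, schedule, and script path are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_sales_ingest",
    start_date=datetime(2025, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    ingest = BashOperator(
        task_id="run_pyspark_ingest",
        # {{ ds }} is Airflow's templated execution date
        bash_command="spark-submit /opt/jobs/ingest_sales.py --date {{ ds }}",
    )
```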

Posted 5 days ago

Apply

6.0 - 7.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Source: LinkedIn

Designation: Senior Consultant
Experience: 6 to 7 years
Location: Bengaluru
Skills required: Python, SQL, Databricks, ADF; within Databricks - DLT, PySpark, Structured Streaming, performance and cost optimization.

Roles and Responsibilities:
Capture business problems, value drivers, and functional/non-functional requirements and translate them into functionality.
Assess the risks, feasibility, opportunities, and business impact.
Assess and model processes, data flows, and technology to understand the current value and issues, and identify opportunities for improvement.
Create/update clear documentation of requirements to align with the solution over the project lifecycle.
Ensure traceability of requirements from business needs through testing and scope changes to the final solution.
Interact with software suppliers, designers and developers to understand software limitations, deliver elements of system and database design, and ensure that business requirements and use cases are handled.
Configure and document software and processes, using agreed standards and tools.
Create acceptance criteria and validate that solutions meet business needs, through defining and coordinating testing.
Create and present compelling business cases to justify solution value and establish approval, funding and prioritization.
Initiate, plan, execute, monitor, and control Business Analysis activities on projects within agreed parameters of cost, time and quality.
Lead stakeholder management activities and large design sessions.
Lead teams to complete business analysis on projects.
Agile project experience: understands Agile frameworks and tools and has worked in Agile.
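The skills list above includes Databricks, DLT, PySpark, and Structured Streaming. As an illustrative sketch only (written as a plain Structured Streaming job rather than a DLT pipeline), the snippet below reads a landing folder with Databricks Auto Loader, computes a watermarked hourly aggregate, and appends to a Delta table; all paths and column names are assumptions.

```python
# Illustrative only: a hedged Structured Streaming sketch for a Databricks
# environment. Paths, columns, and checkpoint locations are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-stream").getOrCreate()

orders = (
    spark.readStream.format("cloudFiles")          # Databricks Auto Loader
         .option("cloudFiles.format", "json")
         .load("/mnt/landing/orders/")
)

hourly = (
    orders.withColumn("event_time", F.to_timestamp("event_time"))
          .withWatermark("event_time", "1 hour")
          .groupBy(F.window("event_time", "1 hour"), "country")
          .agg(F.sum("amount").alias("total_amount"))
)

(hourly.writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "/mnt/checkpoints/orders_hourly/")
    .start("/mnt/curated/orders_hourly/"))
```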

Posted 5 days ago

Apply

8.0 - 13.0 years

16 - 22 Lacs

Chennai, Bengaluru, Delhi / NCR

Work from Office

Source: Naukri

About the job:
Role: Senior Databricks Engineer / Databricks Technical Lead / Data Architect
Experience: 8-15 years
Location: Bangalore, Chennai, Delhi, Pune

Primary Roles and Responsibilities:
- Develop Modern Data Warehouse solutions using Databricks and the AWS/Azure stack.
- Provide forward-thinking solutions in the data engineering and analytics space.
- Collaborate with DW/BI leads to understand new ETL pipeline development requirements.
- Triage issues to find gaps in existing pipelines and fix them.
- Work with the business to understand reporting-layer needs and develop data models to fulfill reporting requirements.
- Help junior team members resolve issues and technical challenges.
- Drive technical discussions with client architects and team members.
- Orchestrate the data pipelines in a scheduler via Airflow.

Skills and Qualifications:
- Bachelor's and/or master's degree in Computer Science or equivalent experience.
- Must have 6+ years of total IT experience and 3+ years of experience in Data Warehouse/ETL projects.
- Deep understanding of Star and Snowflake dimensional modelling.
- Strong knowledge of Data Management principles.
- Good understanding of the Databricks Data & AI platform and Databricks Delta Lake architecture.
- Hands-on experience in SQL, Python and Spark (PySpark).
- Experience in the AWS/Azure stack.
- Desirable to have ETL with batch and streaming (Kinesis).
- Experience in building ETL/data warehouse transformation processes.
- Experience with Apache Kafka for use with streaming/event-based data.
- Experience with other open-source big data products, e.g. Hadoop (incl. Hive, Pig, Impala).
- Experience with open-source non-relational/NoSQL data repositories (incl. MongoDB, Cassandra, Neo4j).
- Experience working with structured and unstructured data, including imaging and geospatial data.
- Experience working in a DevOps environment with tools such as Terraform, CircleCI, Git.
- Proficiency in RDBMS, complex SQL, PL/SQL, Unix shell scripting, performance tuning and troubleshooting.
- Databricks Certified Data Engineer Associate/Professional certification (desirable).
- Comfortable working in a dynamic, fast-paced, innovative environment with several concurrent projects.
- Experience working in Agile methodology.
- Strong verbal and written communication skills.
- Strong analytical and problem-solving skills with high attention to detail.

Location: Bangalore, Chennai, Delhi/NCR, Pune

Posted 5 days ago

Apply

8.0 - 13.0 years

25 - 40 Lacs

Bengaluru

Work from Office

Source: Naukri

Must-Have Skills:
- Azure Databricks / PySpark hands-on
- SQL/PL-SQL, advanced level
- Snowflake: 2+ years
- Spark/data pipeline development: 2+ years
- Azure Repos / GitHub, Azure DevOps
- Unix shell scripting
- Cloud technology experience

Key Responsibilities:
1. Design, build, and manage data pipelines using Azure Databricks, PySpark, and Snowflake.
2. Analyze and resolve production issues (Tier 2 support with weekend/on-call rotation).
3. Write and optimize complex SQL/PL-SQL queries.
4. Collaborate on low-level and high-level design for data solutions.
5. Document all project deliverables and support deployment.

Good to Have:
Knowledge of Oracle, Qlik Replicate, GoldenGate, Hadoop.
Job scheduler tools like Control-M or Airflow.

Behavioral:
Strong problem-solving and communication skills.

Posted 5 days ago

Apply

3.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Source: LinkedIn

Company Overview
Viraaj HR Solutions is dedicated to connecting top talent with forward-thinking companies. Our mission is to provide exceptional talent acquisition services while fostering a culture of trust, integrity, and collaboration. We prioritize our clients' needs and work tirelessly to ensure the ideal candidate-job match. Join us in our commitment to excellence and become part of a dynamic team focused on driving success for individuals and organizations alike.

Role Responsibilities
Design, develop, and implement data pipelines using Azure Data Factory. Create and maintain data models for structured and unstructured data. Extract, transform, and load (ETL) data from various sources into data warehouses. Develop analytical solutions and dashboards using Azure Databricks. Perform data integration and migration tasks with Azure tools. Ensure optimal performance and scalability of data solutions. Collaborate with cross-functional teams to understand data requirements. Utilize SQL Server for database management and data queries. Implement data quality checks and ensure data integrity. Work on data governance and compliance initiatives. Monitor and troubleshoot data pipeline issues to ensure reliability. Document data processes and architecture for future reference. Stay current with industry trends and Azure advancements. Train and mentor junior data engineers and team members. Participate in design reviews and provide feedback for process improvements.

Qualifications
Bachelor's degree in Computer Science, Information Technology, or a related field. 3+ years of experience in a data engineering role. Strong expertise in Azure Data Factory and Azure Databricks. Proficient in SQL for data manipulation and querying. Experience with data warehousing concepts and practices. Familiarity with ETL tools and processes. Knowledge of Python or other programming languages for data processing. Ability to design scalable cloud architecture. Experience with data modeling and database design. Effective communication and collaboration skills. Strong analytical and problem-solving abilities. Familiarity with performance tuning and optimization techniques. Knowledge of data visualization tools is a plus. Experience with Agile methodologies. Ability to work independently and manage multiple tasks. Willingness to learn and adapt to new technologies.

Skills: ETL, Azure Databricks, SQL Server, Azure, data governance, Azure Data Factory, Python, data warehousing, data engineering, data integration, performance tuning, Python scripting, SQL, data modeling, data migration, data visualization, analytical solutions, PySpark, Agile methodologies, data quality checks
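One of the responsibilities above is implementing data quality checks and ensuring data integrity. The snippet below is a hedged, minimal example of such checks in PySpark against a hypothetical staging table; the table name, rule names, and failure handling are assumptions, not part of the posting.

```python
# Illustrative only: simple PySpark data-quality checks on a hypothetical table.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
customers = spark.table("staging.customers")   # hypothetical table

checks = {
    "null_customer_id": customers.filter(F.col("customer_id").isNull()).count(),
    "duplicate_customer_id": customers.count()
        - customers.dropDuplicates(["customer_id"]).count(),
    "negative_balance": customers.filter(F.col("balance") < 0).count(),
}

failures = {name: n for name, n in checks.items() if n > 0}
if failures:
    # In a real pipeline this might write to a DQ log table or fail the job
    raise ValueError(f"Data quality checks failed: {failures}")
```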

Posted 5 days ago

Apply

0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Source: LinkedIn

Company Overview
Viraaj HR Solutions is a leading recruitment firm in India, dedicated to connecting top talent with industry-leading companies. We focus on understanding the unique needs of each client, providing tailored HR solutions that enhance their workforce capabilities. Our mission is to empower organizations by bridging the gap between talent and opportunity. We value integrity, collaboration, and excellence in service delivery, ensuring a seamless experience for both candidates and employers.

Job Title: PySpark Data Engineer
Work Mode: On-Site
Location: India

Role Responsibilities
Design, develop, and maintain data pipelines using PySpark. Collaborate with data scientists and analysts to gather data requirements. Optimize data processing workflows for efficiency and performance. Implement ETL processes to integrate data from various sources. Create and maintain data models that support analytical reporting. Ensure data quality and accuracy through rigorous testing and validation. Monitor and troubleshoot production data pipelines to resolve issues. Work with SQL databases to extract and manipulate data as needed. Utilize cloud technologies for data storage and processing solutions. Participate in code reviews and provide constructive feedback. Document technical specifications and processes clearly for team reference. Stay updated with industry trends and emerging technologies in big data. Collaborate with cross-functional teams to deliver data solutions. Support the data governance initiatives to ensure compliance. Provide training and mentorship to junior data engineers.

Qualifications
Bachelor's degree in Computer Science, Information Technology, or a related field. Proven experience as a Data Engineer, preferably with PySpark. Strong understanding of data warehousing concepts and architecture. Hands-on experience with ETL tools and frameworks. Proficiency in SQL and NoSQL databases. Familiarity with cloud platforms like AWS, Azure, or Google Cloud. Experience with Python programming for data manipulation. Knowledge of data modeling techniques and best practices. Ability to work in a fast-paced environment and juggle multiple tasks. Excellent problem-solving skills and attention to detail. Strong communication and interpersonal skills. Ability to work independently and as part of a team. Experience in Agile methodologies and practices. Knowledge of data governance and compliance standards. Familiarity with BI tools such as Tableau or Power BI is a plus.

Skills: data modeling, Python programming, PySpark, BI tools, SQL proficiency, SQL, cloud technologies, NoSQL databases, ETL processes, data warehousing, Agile methodologies, cloud computing, data engineering

Posted 5 days ago

Apply

3.0 - 6.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Source: LinkedIn

Responsibilities:
Develop and execute test scripts to validate data pipelines, transformations, and integrations.
Formulate and maintain test strategies, including smoke, performance, functional, and regression testing, to ensure data processing and ETL jobs meet requirements.
Collaborate with development teams to assess changes in data workflows and update test cases to preserve data integrity.
Design and run tests for data validation, storage, and retrieval using Azure services like Data Lake, Synapse, and Data Factory, adhering to industry standards.
Continuously enhance automated tests as new features are developed, ensuring timely delivery per defined quality standards.
Participate in data reconciliation and verify Data Quality frameworks to maintain data accuracy, completeness, and consistency across the platform.
Share knowledge and best practices by collaborating with business analysts and technology teams to document testing processes and findings.
Communicate testing progress effectively with stakeholders, highlighting issues or blockers, and ensuring alignment with business objectives.
Maintain a comprehensive understanding of the Azure Data Lake platform's data landscape to ensure thorough testing coverage.

Skills & Experience:
3-6 years of QA experience with a strong focus on Big Data testing, particularly in Data Lake environments on Azure's cloud platform.
Proficient in Azure Data Factory, Azure Synapse Analytics and Databricks for big data processing and scaled data quality checks.
Proficiency in SQL, capable of writing and optimizing both simple and complex queries for data validation and testing purposes.
Proficient in PySpark, with experience in data manipulation and transformation, and a demonstrated ability to write and execute test scripts for data processing and validation.
Hands-on experience with functional and system integration testing in big data environments, ensuring seamless data flow and accuracy across multiple systems.
Knowledge and ability to design and execute test cases in a behaviour-driven development environment.
Fluency in Agile methodologies, with active participation in Scrum ceremonies and a strong understanding of Agile principles.
Familiarity with tools like Jira, including experience with X-Ray or Jira Zephyr for defect management and test case management.
Proven experience working on high-traffic and large-scale software products, ensuring data quality, reliability, and performance under demanding conditions.
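The posting above asks for PySpark test scripts for data processing and validation. The following is an illustrative pytest-style sketch that spins up a local Spark session and checks a hypothetical transformation; the function under test and its column names are assumptions introduced only for the example.

```python
# Illustrative only: a pytest-style test of a hypothetical PySpark transformation.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

def add_order_total(df):
    """Hypothetical transformation under test."""
    return df.withColumn("order_total", F.col("quantity") * F.col("unit_price"))

@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()

def test_add_order_total(spark):
    source = spark.createDataFrame(
        [(1, 2, 10.0), (2, 3, 5.0)], ["order_id", "quantity", "unit_price"]
    )
    result = add_order_total(source).select("order_id", "order_total").collect()
    assert {(r["order_id"], r["order_total"]) for r in result} == {(1, 20.0), (2, 15.0)}
```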

Posted 5 days ago

Apply

6.0 - 8.0 years

8 - 10 Lacs

Kolkata

Work from Office

Source: Naukri

Job Summary:
We are seeking an experienced Data Engineer with strong expertise in Databricks, Python, PySpark, and Power BI, along with a solid background in data integration and the modern Azure ecosystem. The ideal candidate will play a critical role in designing, developing, and implementing scalable data engineering solutions and pipelines.

Key Responsibilities:
- Design, develop, and implement robust data solutions using Azure Data Factory, Databricks, and related data engineering tools.
- Build and maintain scalable ETL/ELT pipelines with a focus on performance and reliability.
- Write efficient and reusable code using Python and PySpark.
- Perform data cleansing, transformation, and migration across various platforms.
- Work hands-on with Azure Data Factory (ADF); at least 1.5 to 2 years of ADF experience.
- Develop and optimize SQL queries and stored procedures, and manage large data sets using SQL Server, T-SQL, PL/SQL, etc.
- Collaborate with cross-functional teams to understand business requirements and provide data-driven solutions.
- Engage directly with clients and business stakeholders to gather requirements, suggest optimal solutions, and ensure successful delivery.
- Work with Power BI for basic reporting and data visualization tasks.
- Apply strong knowledge of data warehousing concepts, modern data platforms, and cloud-based analytics.
- Adhere to coding standards and best practices, including thorough documentation and testing (unit, integration, performance).
- Support the operations, maintenance, and enhancement of existing data pipelines and architecture.
- Estimate tasks and plan release cycles effectively.

Required Technical Skills:
- Languages & Frameworks: Python, PySpark
- Cloud & Tools: Azure Data Factory, Databricks, Azure ecosystem
- Databases: SQL Server, T-SQL, PL/SQL
- Reporting & BI Tools: Power BI
- Data Concepts: Data Warehousing, ETL/ELT, Data Cleansing, Data Migration
- Other: Version control, Agile methodologies, good problem-solving skills

Preferred Qualifications:
- Experience with coding in Pysense within Databricks (added advantage)
- Solid understanding of cloud data architecture and analytics processes
- Ability to independently initiate and lead conversations with business stakeholders

Posted 5 days ago

Apply

0 years

0 Lacs

Kochi, Kerala, India

On-site

Source: LinkedIn

Company Overview
Viraaj HR Solutions is a leading recruitment firm in India, dedicated to connecting top talent with industry-leading companies. We focus on understanding the unique needs of each client, providing tailored HR solutions that enhance their workforce capabilities. Our mission is to empower organizations by bridging the gap between talent and opportunity. We value integrity, collaboration, and excellence in service delivery, ensuring a seamless experience for both candidates and employers.

Job Title: PySpark Data Engineer
Work Mode: On-Site
Location: India

Role Responsibilities
Design, develop, and maintain data pipelines using PySpark. Collaborate with data scientists and analysts to gather data requirements. Optimize data processing workflows for efficiency and performance. Implement ETL processes to integrate data from various sources. Create and maintain data models that support analytical reporting. Ensure data quality and accuracy through rigorous testing and validation. Monitor and troubleshoot production data pipelines to resolve issues. Work with SQL databases to extract and manipulate data as needed. Utilize cloud technologies for data storage and processing solutions. Participate in code reviews and provide constructive feedback. Document technical specifications and processes clearly for team reference. Stay updated with industry trends and emerging technologies in big data. Collaborate with cross-functional teams to deliver data solutions. Support the data governance initiatives to ensure compliance. Provide training and mentorship to junior data engineers.

Qualifications
Bachelor's degree in Computer Science, Information Technology, or a related field. Proven experience as a Data Engineer, preferably with PySpark. Strong understanding of data warehousing concepts and architecture. Hands-on experience with ETL tools and frameworks. Proficiency in SQL and NoSQL databases. Familiarity with cloud platforms like AWS, Azure, or Google Cloud. Experience with Python programming for data manipulation. Knowledge of data modeling techniques and best practices. Ability to work in a fast-paced environment and juggle multiple tasks. Excellent problem-solving skills and attention to detail. Strong communication and interpersonal skills. Ability to work independently and as part of a team. Experience in Agile methodologies and practices. Knowledge of data governance and compliance standards. Familiarity with BI tools such as Tableau or Power BI is a plus.

Skills: data modeling, Python programming, PySpark, BI tools, SQL proficiency, SQL, cloud technologies, NoSQL databases, ETL processes, data warehousing, Agile methodologies, cloud computing, data engineering

Posted 5 days ago

Apply

5.0 - 10.0 years

16 - 25 Lacs

Hyderabad, Bengaluru

Work from Office

Source: Naukri

Urgent hiring for a PySpark Data Engineer.
Job Location: Bangalore and Hyderabad
Experience: 5-9 years
Share your CV at Mohini.sharma@adecco.com or call 9740521948.

Job Description:
1. API Development: Design, develop, and maintain robust APIs using FastAPI and RESTful principles for scalable backend systems.
2. Big Data Processing: Leverage PySpark to process and analyze large datasets efficiently, ensuring optimal performance in big data environments.
3. Full-Stack Integration: Develop seamless backend-to-frontend feature integrations, collaborating with front-end developers for cohesive user experiences.
4. CI/CD Pipelines: Implement and manage CI/CD pipelines using GitHub Actions and Azure DevOps to streamline deployments and ensure system reliability.
5. Containerization: Utilize Docker for building and deploying containerized applications in development and production environments.
6. Team Leadership: Lead and mentor a team of developers, providing guidance, code reviews, and support to junior team members to ensure high-quality deliverables.
7. Code Optimization: Write clean, maintainable, and efficient Python code, with a focus on scalability, reusability, and performance.
8. Cloud Deployment: Deploy and manage applications on cloud platforms like Azure, ensuring high availability and fault tolerance.
9. Collaboration: Work closely with cross-functional teams, including product managers and designers, to translate business requirements into technical solutions.
10. Documentation: Maintain thorough documentation for APIs, processes, and systems to ensure transparency and ease of maintenance.

Highlighted Skillset:
Big Data: Strong PySpark skills for processing large datasets.
DevOps: Proficiency in GitHub Actions, CI/CD pipelines, Azure DevOps, and Docker.
Integration: Experience in backend-to-frontend feature connectivity.
Leadership: Proven ability to lead and mentor development teams.
Cloud: Knowledge of deploying and managing applications in Azure or other cloud environments.
Team Collaboration: Strong interpersonal and communication skills for working in cross-functional teams.
Best Practices: Emphasis on clean code, performance optimization, and robust documentation.
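API development with FastAPI is the first responsibility listed above. Below is a minimal, illustrative FastAPI endpoint with a placeholder scoring rule standing in for a real model call; the route, request schema, and field names are assumptions made for the example, and it could be run locally with uvicorn.

```python
# Illustrative only: a minimal FastAPI scoring endpoint with a placeholder rule.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="scoring-service")

class ScoreRequest(BaseModel):
    customer_id: str
    monthly_spend: float

@app.post("/score")
def score(req: ScoreRequest) -> dict:
    # Placeholder rule standing in for a real model inference call
    risk = "high" if req.monthly_spend > 10_000 else "low"
    return {"customer_id": req.customer_id, "risk": risk}

# Run locally (assumption: uvicorn installed and this file saved as app.py):
#   uvicorn app:app --reload
```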

Posted 5 days ago

Apply

4.0 - 9.0 years

8 - 18 Lacs

Navi Mumbai, Pune, Mumbai (All Areas)

Hybrid

Job Description

Job Overview: We are seeking a highly skilled Data Engineer with expertise in SQL, Python, Data Warehousing, AWS, Airflow, ETL, and Data Modeling. The ideal candidate will be responsible for designing, developing, and maintaining robust data pipelines, ensuring efficient data processing and integration across various platforms. This role requires strong problem-solving skills, an analytical mindset, and a deep understanding of modern data engineering frameworks.

Key Responsibilities:

  • Design, develop, and optimize scalable data pipelines and ETL processes to support business intelligence, analytics, and operational data needs.
  • Build and maintain data models (conceptual, logical, and physical) to enhance data storage, retrieval, and transformation efficiency.
  • Develop, test, and optimize complex SQL queries for efficient data extraction, transformation, and loading (ETL).
  • Implement and manage data warehousing solutions (e.g., Snowflake, Redshift, BigQuery) for structured and unstructured data storage.
  • Work with AWS, Azure, and cloud-based data solutions to build high-performance data ecosystems.
  • Utilize Apache Airflow for orchestrating workflows and automating data pipeline execution.
  • Collaborate with cross-functional teams to understand business data requirements and ensure alignment with data strategies.
  • Ensure data integrity, security, and compliance with governance policies and best practices.
  • Monitor, troubleshoot, and improve the performance of existing data systems for scalability and reliability.
  • Stay updated with emerging data engineering technologies, frameworks, and best practices to drive continuous improvement.

Required Skills & Qualifications:

  • Proficiency in SQL for query development, performance tuning, and optimization.
  • Strong Python programming skills for data processing, automation, and scripting.
  • Hands-on experience with ETL development, data integration, and transformation workflows.
  • Expertise in data modeling for efficient database and data warehouse design.
  • Experience with cloud platforms such as AWS (S3, Redshift, Lambda), Azure, or GCP.
  • Working knowledge of Airflow or similar workflow orchestration tools.
  • Familiarity with big data frameworks like Hadoop or Spark (preferred but not mandatory).
  • Strong problem-solving skills and ability to work in a fast-paced, dynamic environment.

Posted 5 days ago

Apply

0 years

0 Lacs

Mumbai, Maharashtra, India

On-site

Introduction

In this role, you'll work in one of our IBM Consulting Client Innovation Centers (Delivery Centers), where we deliver deep technical and industry expertise to a wide range of public and private sector clients around the world. Our delivery centers offer our clients locally based skills and technical expertise to drive innovation and adoption of new technology.

Your Role And Responsibilities

As a Data Engineer at IBM, you'll play a vital role in the development and design of applications, providing regular support and guidance to project teams on complex coding, issue resolution, and execution. Your primary responsibilities include:

  • Lead the design and construction of new solutions using the latest technologies, always looking to add business value and meet user requirements.
  • Strive for continuous improvement by testing the built solution and working under an agile framework.
  • Discover and implement the latest technology trends to maximize value and build creative solutions.

Preferred Education

Master's Degree

Required Technical And Professional Expertise

  • Experience with Apache Spark (PySpark): in-depth knowledge of Spark's architecture, core APIs, and PySpark for distributed data processing.
  • Big data technologies: familiarity with Hadoop, HDFS, Kafka, and other big data tools.
  • Data engineering skills: strong understanding of ETL pipelines, data modeling, and data warehousing concepts.
  • Strong proficiency in Python: expertise in Python programming with a focus on data processing and manipulation.
  • Data processing frameworks: knowledge of data processing libraries such as Pandas and NumPy.
  • SQL proficiency: experience writing optimized SQL queries for large-scale data analysis and transformation.
  • Cloud platforms: experience working with cloud platforms like AWS, Azure, or GCP, including cloud storage systems.

Preferred Technical And Professional Experience

  • Define, drive, and implement an architecture strategy and standards for end-to-end monitoring.
  • Partner with the rest of the technology teams, including application development, enterprise architecture, testing services, and network engineering.
  • Good to have: experience with detection and prevention tools for company products and platforms, as well as customer-facing systems.

Posted 5 days ago

Apply

7.0 years

0 Lacs

Mumbai, Maharashtra, India

On-site

Job Description

The candidate must possess knowledge relevant to the functional area, act as a subject matter expert in providing advice in the area of expertise, and focus on continuous improvement for maximum efficiency. It is vital to focus on a high standard of delivery excellence, provide top-notch service quality, and develop successful long-term business partnerships with internal/external customers by identifying and fulfilling customer needs. He/she should be able to break down complex problems into logical and manageable parts in a systematic way, generate and compare multiple options, and set priorities to resolve problems. The ideal candidate must be proactive and go beyond expectations to achieve job results and create new opportunities. He/she must positively influence the team, motivate high performance, promote a friendly climate, give constructive feedback, provide development opportunities, and manage career aspirations of direct reports. Communication skills are key here, to explain organizational objectives, assignments, and the big picture to the team, and to articulate team vision and clear objectives.

Senior Process Manager Roles And Responsibilities

We are seeking a talented and motivated Data Engineer to join our dynamic team. The ideal candidate will have a deep understanding of data integration processes and experience in developing and managing data pipelines using Python, SQL, and PySpark within Databricks. You will be responsible for designing robust backend solutions, implementing CI/CD processes, and ensuring data quality and consistency.

  • Data Pipeline Development: Use Databricks features to explore raw datasets and understand their structure; create and optimize Spark-based workflows; build end-to-end data processing pipelines, including ingesting raw data, transforming it, and running analyses on the processed data; create and maintain data pipelines using Python and SQL.
  • Solution Design and Architecture: Design and architect backend solutions for data integration, ensuring they are robust, scalable, and aligned with business requirements; implement data processing pipelines using various technologies, including cloud platforms, big data tools, and streaming frameworks.
  • Automation and Scheduling: Automate data integration processes and schedule jobs on servers to ensure seamless data flow.
  • Data Quality and Monitoring: Develop and implement data quality checks and monitoring systems to ensure data accuracy and consistency.
  • CI/CD Implementation: Use Jenkins and Bitbucket to create and maintain metadata and job files; implement continuous integration and continuous deployment (CI/CD) processes in both development and production environments to deploy data pipelines efficiently.
  • Collaboration and Documentation: Work effectively with cross-functional teams, including software engineers, data scientists, and DevOps, to ensure successful project delivery; document data pipelines and architecture to ensure knowledge transfer and maintainability; participate in stakeholder interviews, workshops, and design reviews to define data models, pipelines, and workflows.

Technical And Functional Skills

  • Education and Experience: Bachelor's degree with 7+ years of experience, including at least 3+ years of hands-on experience in SQL and Python.
  • Technical Proficiency: Proficiency in writing and optimizing SQL queries in MySQL and SQL Server; expertise in Python for writing reusable components and enhancing existing ETL scripts.
  • Solid understanding of ETL concepts and data pipeline architecture, including CDC, incremental loads, and slowly changing dimensions (SCDs).
  • Hands-on experience with PySpark; knowledge of and experience with Databricks is a bonus.
  • Familiarity with data warehousing solutions and ETL processes.
  • Understanding of data architecture and backend solution design.
  • Cloud and CI/CD Experience: Experience with cloud platforms such as AWS, Azure, or Google Cloud; familiarity with Jenkins and Bitbucket for CI/CD processes.
  • Additional Skills: Ability to work independently and manage multiple projects simultaneously.

About Us

At eClerx, we serve some of the largest global companies – 50 of the Fortune 500 clients. Our clients call upon us to solve their most complex problems and deliver transformative insights. Across roles and levels, you get the opportunity to build expertise, challenge the status quo, think bolder, and help our clients seize value.

About The Team

eClerx is a global leader in productized services, bringing together people, technology, and domain expertise to amplify business results. Our mission is to set the benchmark for client service and success in our industry. Our vision is to be the innovation partner of choice for technology, data analytics, and process management services. Since our inception in 2000, we've partnered with top companies across various industries, including financial services, telecommunications, retail, and high-tech. Our innovative solutions and domain expertise help businesses optimize operations, improve efficiency, and drive growth. With over 18,000 employees worldwide, eClerx is dedicated to delivering excellence through smart automation and data-driven insights. At eClerx, we believe in nurturing talent and providing hands-on experience.

eClerx is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability or protected veteran status, or any other legally protected basis, per applicable law.

Posted 5 days ago

Apply

10.0 - 15.0 years

12 - 18 Lacs

Hyderabad, Gurugram, Bengaluru

Work from Office

Location: Bangalore/Gurgaon/Hyderabad/Mumbai

Must have skills: Data Scientist / Transformation Leader with at least 5 years in Telecom Analytics
Good to have skills: GenAI, Agentic AI

Job Summary:

About Global Network Data & AI: Accenture Strategy & Consulting's Global Network Data & AI practice helps our clients grow their business in entirely new ways. Analytics enables our clients to achieve high performance through insights from data - insights that inform better decisions and strengthen customer relationships. From strategy to execution, Accenture works with organizations to develop analytic capabilities - from accessing and reporting on data to predictive modelling - to outperform the competition.

About Comms & Media practice: Comms & Media (C&M) is one of the industry practices within Accenture's S&C Global Network team. It focuses on serving clients across specific industries: Communications, Media & Entertainment. Communications focuses primarily on industries related to telecommunications and information & communication technology (ICT); this team serves most of the world's leading wireline, wireless, cable and satellite communications and service providers. Media & Entertainment focuses on industries like broadcast, entertainment, print and publishing. Globally, the Accenture Comms & Media practice works to develop value growth strategies for its clients and infuse AI & GenAI to help deliver their top business imperatives, i.e., revenue growth and cost reduction. From multi-year Data & AI transformation projects to shorter, more agile engagements, we have a rapidly expanding portfolio of hyper-growth clients and an increasing footprint with next-gen solutions and industry practices.

Roles & Responsibilities:

A Telco-domain-experienced data science consultant is responsible for helping clients design and deliver AI solutions. He/she should be strong in the Telco domain and AI fundamentals, and should have good hands-on experience with the following:

  • Ability to work with large data sets and present conclusions to key stakeholders; data management using SQL.
  • Propose solutions to the client, based on gap analysis of the existing Telco platforms, that can generate long-term and sustainable value for the client.
  • Gather business requirements from client stakeholders via interactions such as interviews and workshops with all stakeholders.
  • Track down and read all previous information on the problem or issue in question; explore obvious and known avenues thoroughly; ask a series of probing questions to get to the root of a problem.
  • Understand the as-is process, understand issues with the processes that can be resolved either through Data & AI or process solutions, and design the to-be state at a detailed level.
  • Understand customer needs and identify/translate them into business requirements (business requirement definition), business process flows, and functional requirements, and be able to inform the best approach to the problem.
  • Adopt a clear and systematic approach to complex issues (i.e., A leads to B leads to C); analyze relationships between several parts of a problem or situation; anticipate obstacles and identify a critical path for a project.
  • Independently deliver products and services that empower clients to implement effective solutions; make specific changes and improvements to processes or own work to achieve more.
  • Work with other team members and make deliberate efforts to keep others up to date.
  • Establish a consistent and collaborative presence with clients and act as the primary point of contact for assigned clients; escalate, track, and solve client issues.
  • Partner with clients to understand end clients' business goals, marketing objectives, and competitive constraints.
  • Storytelling: Crunch the data and numbers to craft a story to be presented to senior client stakeholders.

Professional & Technical Skills:

  • Overall 10+ years of experience in Data Science and at least 5 years in Telecom Analytics.
  • Masters (MBA/MSc/MTech) from a Tier 1/Tier 2 school and Engineering from a Tier 1 school.
  • Demonstrated experience in solving real-world data problems through Data & AI.
  • Direct onsite experience (i.e., experience facing clients inside client offices in India or abroad) is mandatory; please note we are looking for client-facing roles.
  • Proficiency with data mining, mathematics, and statistical analysis.
  • Advanced pattern recognition and predictive modeling experience; knowledge of advanced analytical fields such as text mining, image recognition, video analytics, IoT, etc.
  • Execution-level understanding of econometric/statistical modeling packages.
  • Traditional techniques like linear/logistic regression, multivariate statistical analysis, time series techniques, and fixed/random effect modelling.
  • Machine learning techniques like Random Forest, Gradient Boosting, XGBoost, decision trees, clustering, etc.
  • Knowledge of deep learning modeling techniques like RNN, CNN, etc.
  • Experience using digital and statistical modeling software (one or more): Python, R, PySpark, SQL, BigQuery, Vertex AI.
  • Proficient in Excel, MS Word, PowerPoint, and corporate soft skills.
  • Knowledge of dashboard creation platforms: Excel, Tableau, Power BI, etc.
  • Excellent written and oral communication skills with the ability to clearly communicate ideas and results to non-technical stakeholders.
  • Strong analytical and problem-solving skills and good communication skills.
  • Self-starter with the ability to work independently across multiple projects and set priorities; strong team player.
  • Proactive and solution-oriented, able to guide junior team members.
  • Execution knowledge of optimization techniques is a good-to-have: exact optimization (linear and non-linear optimization techniques) and evolutionary optimization (both population- and search-based algorithms).
  • Cloud platform certification and experience in Computer Vision are good-to-haves.

Qualification

Experience: Overall 10+ years of experience in Data Science and at least 5 years in Telecom Analytics.
Educational Qualification: Masters (MBA/MSc/MTech) from a Tier 1/Tier 2 school and Engineering from a Tier 1 school.

Posted 5 days ago

Apply

3.0 years

0 Lacs

Pune, Maharashtra, India

On-site

Company Overview

Viraaj HR Solutions is dedicated to connecting top talent with forward-thinking companies. Our mission is to provide exceptional talent acquisition services while fostering a culture of trust, integrity, and collaboration. We prioritize our clients' needs and work tirelessly to ensure the ideal candidate-job match. Join us in our commitment to excellence and become part of a dynamic team focused on driving success for individuals and organizations alike.

Role Responsibilities

  • Design, develop, and implement data pipelines using Azure Data Factory.
  • Create and maintain data models for structured and unstructured data.
  • Extract, transform, and load (ETL) data from various sources into data warehouses.
  • Develop analytical solutions and dashboards using Azure Databricks.
  • Perform data integration and migration tasks with Azure tools.
  • Ensure optimal performance and scalability of data solutions.
  • Collaborate with cross-functional teams to understand data requirements.
  • Utilize SQL Server for database management and data queries.
  • Implement data quality checks and ensure data integrity.
  • Work on data governance and compliance initiatives.
  • Monitor and troubleshoot data pipeline issues to ensure reliability.
  • Document data processes and architecture for future reference.
  • Stay current with industry trends and Azure advancements.
  • Train and mentor junior data engineers and team members.
  • Participate in design reviews and provide feedback for process improvements.

Qualifications

  • Bachelor's degree in Computer Science, Information Technology, or a related field.
  • 3+ years of experience in a data engineering role.
  • Strong expertise in Azure Data Factory and Azure Databricks.
  • Proficient in SQL for data manipulation and querying.
  • Experience with data warehousing concepts and practices.
  • Familiarity with ETL tools and processes.
  • Knowledge of Python or other programming languages for data processing.
  • Ability to design scalable cloud architecture.
  • Experience with data modeling and database design.
  • Effective communication and collaboration skills.
  • Strong analytical and problem-solving abilities.
  • Familiarity with performance tuning and optimization techniques.
  • Knowledge of data visualization tools is a plus.
  • Experience with Agile methodologies.
  • Ability to work independently and manage multiple tasks.
  • Willingness to learn and adapt to new technologies.

Skills: ETL, Azure Data Factory, Azure Databricks, PySpark, Python, SQL, SQL Server, data modeling, data warehousing, data integration, data migration, data governance, data quality checks, performance tuning, data visualization, analytical solutions, Agile methodologies

Posted 5 days ago

Apply

5.0 - 10.0 years

16 - 25 Lacs

Hyderabad, Bengaluru

Work from Office

PySpark Data Engineer

Job Description:

1. API Development: Design, develop, and maintain robust APIs using FastAPI and RESTful principles for scalable backend systems.
2. Big Data Processing: Leverage PySpark to process and analyze large datasets efficiently, ensuring optimal performance in big data environments.
3. Full-Stack Integration: Develop seamless backend-to-frontend feature integrations, collaborating with front-end developers for cohesive user experiences.
4. CI/CD Pipelines: Implement and manage CI/CD pipelines using GitHub Actions and Azure DevOps to streamline deployments and ensure system reliability.
5. Containerization: Utilize Docker for building and deploying containerized applications in development and production environments.
6. Team Leadership: Lead and mentor a team of developers, providing guidance, code reviews, and support to junior team members to ensure high-quality deliverables.
7. Code Optimization: Write clean, maintainable, and efficient Python code, with a focus on scalability, reusability, and performance.
8. Cloud Deployment: Deploy and manage applications on cloud platforms like Azure, ensuring high availability and fault tolerance.
9. Collaboration: Work closely with cross-functional teams, including product managers and designers, to translate business requirements into technical solutions.
10. Documentation: Maintain thorough documentation for APIs, processes, and systems to ensure transparency and ease of maintenance.

Highlighted Skillset:

  • Big Data: Strong PySpark skills for processing large datasets.
  • DevOps: Proficiency in GitHub Actions, CI/CD pipelines, Azure DevOps, and Docker.
  • Integration: Experience in backend-to-frontend feature connectivity.
  • Leadership: Proven ability to lead and mentor development teams.
  • Cloud: Knowledge of deploying and managing applications in Azure or other cloud environments.
  • Team Collaboration: Strong interpersonal and communication skills for working in cross-functional teams.
  • Best Practices: Emphasis on clean code, performance optimization, and robust documentation.

Share updated resume at siddhi.pandey@adecco.com or WhatsApp at 6366783349.

Posted 5 days ago

Apply

Exploring PySpark Jobs in India

PySpark, a powerful data processing framework built on top of Apache Spark and Python, is in high demand in the job market in India. With the increasing need for big data processing and analysis, companies are actively seeking professionals with PySpark skills to join their teams. If you are a job seeker looking to excel in the field of big data and analytics, exploring PySpark jobs in India could be a great career move.
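
For readers who are new to the framework, here is a minimal, illustrative PySpark snippet; the file name, columns, and values are hypothetical, but it shows the kind of DataFrame-based ETL work that PySpark roles typically involve:

    # Minimal, illustrative PySpark example (file name and columns are hypothetical).
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Every PySpark application starts from a SparkSession.
    spark = SparkSession.builder.appName("example-etl").getOrCreate()

    # Read a hypothetical CSV of orders, letting Spark infer the schema.
    orders = spark.read.csv("orders.csv", header=True, inferSchema=True)

    # A typical transformation chain: filter, derive a column, aggregate.
    revenue_by_city = (
        orders
        .filter(F.col("status") == "COMPLETED")
        .withColumn("revenue", F.col("quantity") * F.col("unit_price"))
        .groupBy("city")
        .agg(F.sum("revenue").alias("total_revenue"))
    )

    revenue_by_city.show()
    spark.stop()

Transformations such as filter, withColumn, and groupBy are lazy; Spark only executes the plan when an action such as show() or count() is called.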

Top Hiring Locations in India

Here are 5 major cities in India where companies are actively hiring for PySpark roles:

1. Bangalore
2. Pune
3. Hyderabad
4. Mumbai
5. Delhi

Average Salary Range

The estimated salary range for PySpark professionals in India varies based on experience levels. Entry-level positions can expect to earn around INR 6-8 lakhs per annum, while experienced professionals can earn upwards of INR 15 lakhs per annum.

Career Path

In the field of PySpark, a typical career progression may look like this:

1. Junior Developer
2. Data Engineer
3. Senior Developer
4. Tech Lead
5. Data Architect

Related Skills

In addition to PySpark, professionals in this field are often expected to have or develop skills in:

  • Python programming
  • Apache Spark
  • Big data technologies (Hadoop, Hive, etc.)
  • SQL
  • Data visualization tools (Tableau, Power BI)

Interview Questions

Here are 25 interview questions you may encounter when applying for PySpark roles (a short, illustrative code sketch follows the list to ground a few of the core concepts):

  • Explain what PySpark is and its main features (basic)
  • What are the advantages of using PySpark over other big data processing frameworks? (medium)
  • How do you handle missing or null values in PySpark? (medium)
  • What is RDD in PySpark? (basic)
  • What is a DataFrame in PySpark and how is it different from an RDD? (medium)
  • How can you optimize performance in PySpark jobs? (advanced)
  • Explain the difference between map and flatMap transformations in PySpark (basic)
  • What is the role of a SparkContext in PySpark? (basic)
  • How do you handle schema inference in PySpark? (medium)
  • What is a SparkSession in PySpark? (basic)
  • How do you join DataFrames in PySpark? (medium)
  • Explain the concept of partitioning in PySpark (medium)
  • What is a UDF in PySpark? (medium)
  • How do you cache DataFrames in PySpark for optimization? (medium)
  • Explain the concept of lazy evaluation in PySpark (medium)
  • How do you handle skewed data in PySpark? (advanced)
  • What is checkpointing in PySpark and how does it help in fault tolerance? (advanced)
  • How do you tune the performance of a PySpark application? (advanced)
  • Explain the use of Accumulators in PySpark (advanced)
  • How do you handle broadcast variables in PySpark? (advanced)
  • What are the different data sources supported by PySpark? (medium)
  • How can you run PySpark on a cluster? (medium)
  • What is the purpose of the PySpark MLlib library? (medium)
  • How do you handle serialization and deserialization in PySpark? (advanced)
  • What are the best practices for deploying PySpark applications in production? (advanced)
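
To ground a few of these concepts, the sketch below touches map vs flatMap on an RDD, null handling, a simple UDF, caching, and a broadcast join; all data, column names, and values are made up for illustration:

    # Illustrative snippets for a few common PySpark interview topics (made-up data).
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.appName("interview-prep").getOrCreate()
    sc = spark.sparkContext

    # map vs flatMap on an RDD: map keeps one output per input, flatMap flattens the results.
    lines = sc.parallelize(["hello world", "pyspark interview"])
    print(lines.map(lambda s: s.split(" ")).collect())      # [['hello', 'world'], ['pyspark', 'interview']]
    print(lines.flatMap(lambda s: s.split(" ")).collect())  # ['hello', 'world', 'pyspark', 'interview']

    # Handling nulls in a DataFrame: fill defaults (or use na.drop() to discard rows).
    df = spark.createDataFrame([(1, "IN"), (2, None)], ["id", "country"])
    cleaned = df.na.fill({"country": "UNKNOWN"})

    # A simple Python UDF (built-in functions are preferred where possible for performance).
    shout = F.udf(lambda s: s.upper() if s else None, StringType())
    cleaned = cleaned.withColumn("country_upper", shout(F.col("country")))

    # Caching: persist a DataFrame that will be reused across several actions.
    cleaned.cache()
    print(cleaned.count())

    # Broadcast join: hint Spark to ship a small lookup table to every executor.
    lookup = spark.createDataFrame([("IN", "India"), ("UNKNOWN", "Unknown")], ["code", "name"])
    joined = cleaned.join(F.broadcast(lookup), cleaned.country == lookup.code, "left")
    joined.show()

    spark.stop()

Note that built-in functions from pyspark.sql.functions generally outperform Python UDFs, and broadcast joins only help when the smaller table fits comfortably in executor memory.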

Closing Remark

As you explore PySpark jobs in India, remember to prepare thoroughly for interviews and showcase your expertise confidently. With the right skills and knowledge, you can excel in this field and advance your career in the world of big data and analytics. Good luck!

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies