
361 Apache Spark Jobs - Page 3

JobPe aggregates job listings for easy access, but applications are submitted directly on the original job portal.

3.0 - 7.0 years

0 Lacs

Hyderabad, Telangana

On-site

The Retail Specialized Data Scientist will play a pivotal role in utilizing advanced analytics, machine learning, and statistical modeling techniques to help the retail business make data-driven decisions. You will work closely with teams across marketing, product management, supply chain, and customer insights to drive business strategies and innovations. The ideal candidate should have experience in retail analytics and the ability to translate data into actionable insights.

Key Responsibilities:
- Leverage Retail Knowledge: Utilize your deep understanding of the retail industry (merchandising, customer behavior, product lifecycle) to design AI solutions that address critical retail business needs.
- Gather and clean data from various retail sources, such as sales transactions, customer interactions, inventory management, website traffic, and marketing campaigns.
- Apply machine learning algorithms, such as classification, clustering, regression, and deep learning, to enhance predictive models.
- Use AI-driven techniques for personalization, demand forecasting, and fraud detection.
- Utilize advanced statistical methods to optimize existing use cases and build new products to serve new challenges and use cases.
- Stay updated on the latest trends in data science and retail technology.
- Collaborate with executives, product managers, and marketing teams to translate insights into business actions.

Professional & Technical Skills:
- Strong analytical and statistical skills.
- Expertise in machine learning and AI.
- Experience with retail-specific datasets and KPIs.
- Proficiency in data visualization and reporting tools.
- Ability to work with large datasets and complex data structures.
- Strong communication skills to interact with both technical and non-technical stakeholders.
- A solid understanding of the retail business and consumer behavior.
- Programming Languages: Python, R, SQL, Scala
- Data Analysis Tools: Pandas, NumPy, Scikit-learn, TensorFlow, Keras
- Visualization Tools: Tableau, Power BI, Matplotlib, Seaborn
- Big Data Technologies: Hadoop, Spark, AWS, Google Cloud
- Databases: SQL, NoSQL (MongoDB, Cassandra)

Additional Information:
- Job Title: Retail Specialized Data Scientist
- Management Level: 09 - Consultant
- Location: Bangalore / Gurgaon / Mumbai / Chennai / Pune / Hyderabad / Kolkata
- Company: Accenture

This position requires a solid understanding of retail industry dynamics, strong communication skills, proficiency in Python for data manipulation, statistical analysis, and machine learning, as well as familiarity with big data processing platforms and ETL processes. The Retail Specialized Data Scientist will be responsible for gathering, cleaning, and analyzing data to provide valuable insights for business decision-making and optimization of pricing strategies based on market demand and customer behavior.
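
For illustration only (not part of the posting): a minimal scikit-learn sketch of the kind of demand-forecasting model this role describes. The file name and column names (sales_history.csv, price, promo_flag, week_of_year, units_sold) are hypothetical placeholders.

```python
# Hypothetical demand-forecasting sketch; file and column names are assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

sales = pd.read_csv("sales_history.csv")  # hypothetical extract of sales transactions
features = sales[["price", "promo_flag", "week_of_year"]]
target = sales["units_sold"]

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Report mean absolute error on the held-out split.
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```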

Posted 3 weeks ago

Apply

3.0 - 8.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

This is a data engineer position in which you will be responsible for designing, developing, implementing, and maintaining data flow channels and data processing systems to support the collection, storage, batch and real-time processing, and analysis of information in a scalable, repeatable, and secure manner, in coordination with the Data & Analytics team. Your main objective will be to define optimal solutions for data collection, processing, and warehousing, particularly within the banking & finance domain. You must have expertise in Spark Java development for big data processing, Python, and Apache Spark. You will be involved in designing, coding, and testing data systems and integrating them into the internal infrastructure.

Your responsibilities will include:
- Ensuring high-quality software development with complete documentation
- Developing and optimizing scalable Spark Java-based data pipelines
- Designing and implementing distributed computing solutions for risk modeling, pricing, and regulatory compliance
- Ensuring efficient data storage and retrieval using Big Data
- Implementing best practices for Spark performance tuning
- Maintaining high code quality through testing, CI/CD pipelines, and version control
- Working on batch processing frameworks for Market risk analytics
- Promoting unit/functional testing and code inspection processes
- Collaborating with business stakeholders, Business Analysts, and other data scientists to understand and interpret complex datasets

Qualifications:
- 5-8 years of experience working in data ecosystems
- 4-5 years of hands-on experience in Hadoop, Scala, Java, Spark, Hive, Kafka, Impala, Unix Scripting, and other Big Data frameworks
- 3+ years of experience with relational SQL and NoSQL databases such as Oracle, MongoDB, HBase
- Strong proficiency in Python and Spark Java with knowledge of core Spark concepts (RDDs, Dataframes, Spark Streaming, etc.), Scala, and SQL
- Data integration, migration, and large-scale ETL experience
- Data modeling experience
- Experience building and optimizing big data pipelines, architectures, and datasets
- Strong analytic skills and experience working with unstructured datasets
- Experience with various technologies like Confluent Kafka, Redhat JBPM, CI/CD build pipelines, Git, BitBucket, Jira, external cloud platforms, container technologies, and supporting frameworks
- Highly effective interpersonal and communication skills
- Experience with the software development life cycle

Education:
- Bachelor's/University degree or equivalent experience in computer science, engineering, or a similar domain

This is a full-time position in the Data Architecture job family group within the Technology sector.

Posted 3 weeks ago

Apply

5.0 - 10.0 years

0 Lacs

Haryana

On-site

The Tech Consultant - Data & Cloud role involves supporting a leading international client with expertise in data engineering, cloud platforms, and big data technologies. As a skilled professional, you will contribute to large-scale data initiatives, implement cloud-based solutions, and collaborate with stakeholders to drive data-driven innovation. You will design scalable data architectures, optimize ETL processes, and leverage cloud technologies to deliver impactful business solutions.

Key Responsibilities:
- Data Engineering & ETL: Develop and optimize data pipelines using Apache Spark, Airflow, Sqoop, and Databricks for seamless data transformation and integration.
- Cloud & Infrastructure Management: Design and implement cloud-native solutions using AWS, GCP, or Azure, ensuring scalability, security, and performance.
- Big Data & Analytics: Work with Hadoop, Snowflake, Data Lake, and Hive to enable advanced analytics and business intelligence capabilities.
- Technical Excellence: Utilize Python, SQL, and cloud data warehousing solutions to drive efficiency in data processing and analytics.
- Agile & DevOps Best Practices: Implement CI/CD pipelines, DevOps methodologies, and Agile workflows for seamless development and deployment.
- Stakeholder Collaboration: Work closely with business and technology teams to translate complex data challenges into business-driven solutions.

Required Qualifications & Skills:
- 5 - 10 years of experience in data engineering, analytics, and cloud-based solutions.
- Strong knowledge of Big Data technologies (Hadoop, Spark, Snowflake, Hive, Databricks, Airflow, AWS).
- Experience with ETL pipelines, data lakes, and large-scale data processing.
- Proficiency in Python, SQL, and cloud data warehousing solutions.
- Hands-on experience in cloud platforms (AWS, Azure, GCP) and infrastructure as code (Terraform, CloudFormation).
- Familiarity with containerization (Docker, Kubernetes) and BI tools (Tableau, Power BI).
- Understanding of Agile, Scrum, and DevOps best practices.
- Strong communication, problem-solving, and collaboration skills.

Why Join Us:
- Work on impactful global data projects for a leading international client.
- Lucrative Retention Bonus: Up to 20% bonus at the end of the first year, based on performance.
- Career Growth & Training: Access to world-class learning in advanced cloud, AI, and analytics technologies.
- Collaborative & High-Performance Culture: Work in a dynamic environment that fosters innovation, leadership, and technical excellence.

About Us:
We are a trusted technology partner specializing in enterprise data solutions, cloud transformation, and analytics-driven decision-making. Our expertise in big data, AI, and cloud infrastructure enables us to deliver scalable, high-value solutions to global enterprises.

Posted 3 weeks ago

Apply

6.0 - 10.0 years

0 - 0 Lacs

Coimbatore, Tamil Nadu

On-site

As a Big Data Engineer at KGIS, you will be an integral part of the team dedicated to building cutting-edge digital and analytics solutions for global enterprises. With a focus on designing, developing, and optimizing large-scale data processing systems, you will lead the way in creating scalable data pipelines, driving performance tuning, and spearheading cloud-native big data initiatives. Your responsibilities will include designing and developing robust Big Data solutions using Apache Spark, building both batch and real-time data pipelines utilizing technologies like Spark, Spark Streaming, Kafka, and RabbitMQ, implementing ETL processes for data ingestion and transformation, and optimizing Spark jobs for enhanced performance and scalability. You will also work with NoSQL technologies such as HBase, Cassandra, or MongoDB, query large datasets using tools like Hive and Impala, ensure seamless integration of data from various sources, and lead a team of data engineers while following Agile methodologies. To excel in this role, you must possess deep expertise in Apache Spark and distributed computing, strong programming skills in Python, solid experience with Hadoop v2, MapReduce, HDFS, and Sqoop, proficiency in real-time stream processing using Apache Storm or Spark Streaming, and familiarity with messaging systems like Kafka or RabbitMQ. Additionally, you should have SQL mastery, hands-on experience with NoSQL databases, knowledge of cloud-native services in AWS or Azure, a strong understanding of ETL tools and performance tuning, an Agile mindset, and excellent problem-solving skills. While not mandatory, exposure to data lake and lakehouse architectures, familiarity with DevOps tools for CI/CD and data pipeline monitoring, and certifications in cloud or big data technologies are considered advantageous. Joining KGIS will provide you with the opportunity to work on innovative projects with Fortune 500 clients, be part of a fast-paced and meritocratic culture that values ownership, gain access to cutting-edge tools and technologies, and thrive in a collaborative and growth-focused environment. If you are ready to elevate your Big Data career and contribute to our digital transformation journey, apply now and embark on this exciting opportunity at KGIS.,
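
As an illustration of the batch/real-time pipeline work mentioned above (not taken from the job description): a minimal PySpark Structured Streaming sketch that consumes a hypothetical Kafka topic and appends the raw events to Parquet. The broker address, topic name, and paths are placeholders, and the spark-sql-kafka connector package must be available on the cluster.

```python
# Hedged sketch: stream a hypothetical "orders" topic from Kafka into Parquet.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("orders-stream").getOrCreate()

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "orders")                      # placeholder topic
    .load()
    .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "timestamp")
)

query = (
    events.writeStream
    .format("parquet")
    .option("path", "/data/orders/raw")                 # placeholder sink path
    .option("checkpointLocation", "/data/orders/_chk")
    .outputMode("append")
    .start()
)

query.awaitTermination()
```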

Posted 3 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

Ahmedabad, Gujarat

On-site

You will be responsible for designing and developing data solutions using Elasticsearch/OpenSearch, integrating with various data sources and systems. Your role will involve architecting, implementing, and optimizing data solutions, along with applying your expertise in machine learning to develop models, algorithms, and pipelines for data analysis, prediction, and anomaly detection within Elasticsearch/OpenSearch environments. Additionally, you will design and implement data ingestion pipelines to collect, cleanse, and transform data from diverse sources, ensuring data quality and integrity. As part of your responsibilities, you will manage and administer Elasticsearch/OpenSearch clusters, including configuration, performance tuning, index optimization, and monitoring. You will work on optimizing complex queries and search operations in Elasticsearch/OpenSearch to ensure efficient and accurate retrieval of data. Troubleshooting and resolving issues related to Elasticsearch/OpenSearch performance, scalability, and reliability will be a key aspect of your role, requiring close collaboration with DevOps and Infrastructure teams. Collaboration with cross-functional teams, including data scientists, software engineers, and business stakeholders, will be essential to understand requirements and deliver effective data solutions. You will also be responsible for documenting technical designs, processes, and best practices related to Elasticsearch/OpenSearch and machine learning integration, providing guidance and mentorship to junior team members. To qualify for this position, you should hold a Bachelor's or Master's degree in Computer Science, Data Science, or a related field. Strong experience in designing, implementing, and managing large-scale Elasticsearch/OpenSearch clusters is required, along with expertise in machine learning techniques and frameworks such as TensorFlow, PyTorch, or scikit-learn. Proficiency in programming languages like Python, Java, or Scala, and experience with data processing frameworks and distributed computing are necessary. A solid understanding of data engineering concepts, cloud platforms, and containerization technologies is highly desirable. The ideal candidate will possess strong analytical and problem-solving skills, with the ability to work effectively in a fast-paced, collaborative environment. Excellent communication skills are crucial, enabling you to translate complex technical concepts into clear explanations for both technical and non-technical stakeholders. A proven track record of successfully delivering data engineering projects on time and within budget is also expected. If you have 5+ years of experience in Data Ingestion and Transformation, Elastic Search/Open Search Administration, Machine Learning Integration, and related areas, we invite you to send your CV to careers@eventussecurity.com. Join us in Ahmedabad and be part of our SOC - Excellence team.,

Posted 3 weeks ago

Apply

10.0 - 14.0 years

0 Lacs

Karnataka

On-site

Capgemini Invent is the digital innovation, consulting, and transformation brand of the Capgemini Group. As a global business line, Capgemini Invent combines expertise in strategy, technology, data science, and creative design to assist CxOs in envisioning and constructing what's next for their businesses. In this role, you will be responsible for developing and maintaining scalable data pipelines using AWS services. Your tasks will include optimizing data storage and retrieval processes, ensuring data security and compliance with industry standards, and handling large volumes of data while maintaining accuracy, security, and accessibility. Additionally, you will be involved in developing data set processes for data modeling, mining, and production, implementing data quality and validation processes, and collaborating closely with data scientists, analysts, and IT departments to understand data requirements. You will work with data architects, modelers, and IT team members on project goals, monitor and troubleshoot data pipeline issues, conduct performance tuning and optimization of data solutions, and implement disaster recovery procedures. Your role will also involve ensuring the seamless integration of HR data from various sources into the cloud environment, researching opportunities for data acquisition and new uses for existing data, and staying up to date with the latest cloud technologies and best practices. You will be expected to recommend ways to improve data reliability, efficiency, and quality. To be successful in this position, you should have 10+ years of experience in cloud data engineering and proficiency in cloud platforms such as AWS, Azure, or Google Cloud. Experience with data pipeline tools like Apache Spark and AWS Glue, strong programming skills in languages such as Python, SQL, Java, or Scala, and familiarity with Snowflake or Informatica are advantageous. Knowledge of data privacy laws, security best practices, database technologies, and a demonstrated learner attitude are also essential. Strong communication, teamwork skills, and the ability to work in an Agile framework while managing multiple projects simultaneously will be key to excelling in this role. At Capgemini, we value flexible work arrangements to support a healthy work-life balance. We offer various career growth programs and diverse professions to help you explore a world of opportunities. Additionally, you will have the opportunity to equip yourself with valuable certifications in the latest technologies such as Generative AI. Capgemini is a global business and technology transformation partner, dedicated to helping organizations accelerate their transition to a digital and sustainable world. With a diverse team of over 340,000 members in more than 50 countries, Capgemini leverages its strong heritage and expertise in AI, cloud, and data to address clients' business needs comprehensively. We are committed to unlocking technology's value and creating tangible impact for enterprises and society.

Posted 3 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

Haryana

On-site

As a Data Scientist at GlobalLogic, you will be responsible for working as a Full-stack AI Engineer. You must have proficiency in programming languages like Python, Java/Scala, and experience with data processing libraries such as Pandas, NumPy, and Scikit-learn. Additionally, you should be proficient in distributed computing platforms like Apache Spark (PySpark, Scala), and Torch. It is essential to have expertise in API development using Fast API, Spring Boot, and a good understanding of O&M - logging, monitoring, fault management, security, etc. Furthermore, it would be beneficial to have hands-on experience with deployment and orchestration tools like Docker, Kubernetes, Helm. Experience with cloud platforms such as AWS (Sagemaker/ Bedrock), GCP, or Azure is also advantageous. Strong programming skills in TensorFlow, PyTorch, or similar ML frameworks for training and deployment are considered good-to-have qualities for this role. At GlobalLogic, we prioritize a culture of caring, where you will experience an inclusive environment of acceptance and belonging. Continuous learning and development opportunities are provided to help you grow personally and professionally. You will have the chance to work on interesting and meaningful projects that make an impact for clients worldwide. We believe in the importance of work-life balance and flexibility, offering various career areas, roles, and work arrangements to help you achieve the perfect balance. As a high-trust organization, integrity is key, and you can trust GlobalLogic to provide a safe, reliable, and ethical work environment. By joining us, you become part of a team that values truthfulness, candor, and integrity in everything we do. GlobalLogic, a Hitachi Group Company, is a trusted digital engineering partner known for collaborating with some of the world's largest and most innovative companies. Since 2000, we have been at the forefront of the digital revolution, creating innovative digital products and experiences. Join us in transforming businesses and redefining industries through intelligent products, platforms, and services.,

Posted 3 weeks ago

Apply

2.0 - 6.0 years

0 Lacs

Karnataka

On-site

You will be joining one of the Big Four companies in India at either Bangalore or Mumbai. As a Spark/Scala Developer specializing in Big Data, you will play a key role in designing and implementing scalable data solutions while ensuring optimal performance. Your responsibilities will include translating business requirements into technical deliverables and contributing to the overall success of the team. To excel in this role, you should have 3 to 5 years of experience as a Big Data Engineer or in a similar position. Additionally, a minimum of 2 years of experience in Scala programming and SQL is required. You will be expected to design, modify, and implement solutions for handling data in Hadoop Data Lake for both batch and streaming workloads using Scala & Apache Spark. Alongside this, debugging, optimization, and performance tuning of Spark jobs will be a part of your daily tasks. Your ability to translate functional requirements and user-stories into technical solutions will be crucial. Furthermore, your expertise in developing and debugging complex SQL queries to extract valuable business insights will be highly beneficial. While not mandatory, any prior development experience with cloud services such as AWS, Azure, or GCP will be considered advantageous for this role.,

Posted 3 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

The Content and Data Analytics team at Elsevier is an integral part of Global Operations, delivering data analysis services using Databricks to product owners and data scientists. As a Senior Data Analyst, you will work independently to provide advanced insights and recommendations, leading analytics efforts with high complexity. Your responsibilities will include supporting data scientists within the Research Data Platform, performing various analytical activities such as diving into large datasets, data preparation, and evaluating data science algorithms. A keen eye for detail, strong analytical skills, and expertise in data analysis systems are essential, along with curiosity and dedication to high-quality work in the scientific research domain. Requirements for this role include a minimum of 5 years of work experience, coding skills in Python and SQL, familiarity with string manipulation functions like regular expressions, and experience with data analysis tools such as Pandas or Apache Spark/Databricks. Knowledge of basic statistics, visualization tools like Tableau/Power BI, and Agile tools like JIRA are advantageous. You will be expected to build and maintain relationships with stakeholders, present achievements and project updates effectively, and collaborate well within a team. Taking initiative, driving for results, and demonstrating strong stakeholder management skills are key competencies for success in this role. In addition to work-life balance initiatives, Elsevier offers comprehensive health insurance, flexible working arrangements, employee assistance programs, and various leave options. Your well-being and happiness are prioritized, with benefits including group life insurance, modern family support, and subsidized meals. Join us at Elsevier, a global leader in information and analytics supporting science, research, and healthcare. Your work will contribute to addressing global challenges and promoting a sustainable future through innovative technologies and partnerships. We are committed to fair and accessible hiring practices, ensuring a safe and inclusive workplace for all candidates.,

Posted 3 weeks ago

Apply

10.0 - 18.0 years

0 Lacs

Pune, Maharashtra

On-site

We are looking for a seasoned Senior Data Architect with extensive knowledge in Databricks and Microsoft Fabric to join our team. In this role, you will be responsible for leading the design and implementation of scalable data solutions for BFSI and HLS clients. As a Senior Data Architect specializing in Databricks and Microsoft Fabric, you will play a crucial role in architecting and implementing secure, high-performance data solutions on the Databricks and Azure Fabric platforms. Your responsibilities will include leading discovery workshops, designing end-to-end data pipelines, optimizing workloads for performance and cost efficiency, and ensuring compliance with data governance, security, and privacy policies. You will collaborate with client stakeholders and internal teams to deliver technical engagements and provide guidance on best practices for Databricks and Microsoft Azure. Additionally, you will stay updated on the latest industry developments and recommend new data architectures, technologies, and standards to enhance our solutions. As a subject matter expert in Databricks and Azure Fabric, you will be responsible for delivering workshops, webinars, and technical presentations, as well as developing white papers and reusable artifacts to showcase our company's value proposition. You will also work closely with Databricks partnership teams to contribute to co-marketing and joint go-to-market strategies. In terms of business development support, you will collaborate with sales and pre-sales teams to provide technical guidance during RFP responses and identify upsell and cross-sell opportunities within existing accounts. To be successful in this role, you should have a minimum of 10+ years of experience in data architecture, engineering, or analytics roles, with specific expertise in Databricks and Azure Fabric. You should also possess strong communication and presentation skills, as well as the ability to collaborate effectively with diverse teams. Additionally, certifications in cloud platforms such as AWS and Microsoft Azure will be advantageous. In return, we offer a competitive salary and benefits package, a culture focused on talent development, and opportunities to work with cutting-edge technologies. At Persistent, we are committed to fostering diversity and inclusion in the workplace and invite applications from all qualified individuals. We provide a supportive and inclusive environment where all employees can thrive and unleash their full potential. Join us at Persistent and accelerate your growth professionally and personally while making a positive impact on the world with the latest technologies.,

Posted 3 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

Pune, Maharashtra

On-site

Are you a seasoned engineer with a passion for data science and AI We are seeking a talented individual to lead our AM Analytics platform, a cutting-edge data science and AI platform that empowers our business users to develop innovative data models using the latest technologies. Your role will involve collaborating with business stakeholders across UBS Asset Management to gather and refine requirements, ensuring their successful delivery through technology solutions. You will act as a trusted advisor by identifying opportunities and guiding stakeholders on what's possible through technology. Additionally, you will lead the development of custom capabilities and workflows within the AM Analytics solutions, overseeing the full lifecycle from functional design through development and testing. It will be your responsibility to ensure the AM Analytics platform is secure, scalable, and resilient, supporting business continuity and long-term growth. You will also serve as a techno-functional expert with a strong grasp of both business needs and technical architecture, effectively communicating with diverse audiences, managing stakeholder relationships, and translating complex technical insights into actionable business value. Furthermore, you will have the opportunity to join our Certified Engineer Development Program and participate in various technology guilds focused on data, artificial intelligence, and other domains aligned with your interests. These communities offer a platform to contribute across UBS, foster innovation, and expand your expertise. You will be joining Asset Management Technology, a truly global and diverse organization of around 1,200 professionals, and take on a leadership role within the AM Technology Data Services agile crew. Our mission at Asset Management Technology is to drive sustainable investment outcomes and empower our teams through innovation. We are a client-focused, forward-thinking organization that values creativity and collaboration. Our culture fosters individual growth and team excellence, supported by data-driven, scalable platforms that align with business goals. We take pride in our work, celebrate our achievements, and thrive in a dynamic, high-performing environment. To excel in this role, you should have proven, hands-on experience in data engineering with strong proficiency in Python, SQL, Azure Kubernetes, and Azure Cloud. Exposure to AI, DevOps practices, and web service integration is highly valued. A solid background in relational databases such as PostgreSQL, Oracle SQL, Microsoft SQL Server, and MySQL is required; experience with non-relational databases is a plus. You should possess a strong knowledge of data engineering practices, including data profiling and ETL/ELT pipeline development, as well as experience with big data platforms such as Databricks, Cassandra, Apache Spark, and Hadoop. Skills in working with distributed systems, clustering, and replication technologies are essential, along with practical experience with machine learning frameworks including TensorFlow, PyTorch, and scikit-learn. Familiarity with natural language processing (NLP) and AI model development and deployment workflows (e.g., MLOps) is advantageous. You should be comfortable working in a collaborative, multi-site environment using Agile software development methodologies. 
A Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience is required, along with over 5 years of diverse experience in designing and managing innovative analytics and AI solutions; experience in the financial or fintech industry is a strong advantage. UBS is the world's largest and the only truly global wealth manager, operating through four business divisions: Global Wealth Management, Personal & Corporate Banking, Asset Management, and the Investment Bank. With a presence in all major financial centers in more than 50 countries, UBS stands out due to its global reach and the breadth of expertise, setting it apart from competitors. UBS is an Equal Opportunity Employer that respects and seeks to empower each individual, supporting the diverse cultures, perspectives, skills, and experiences within its workforce.,

Posted 3 weeks ago

Apply

5.0 - 12.0 years

0 Lacs

Coimbatore, Tamil Nadu

On-site

You should have 5-12 years of experience in Big Data and related technologies. Your expertise should include a deep understanding of distributed computing principles and strong knowledge of Apache Spark. Proficiency in Python programming is required, along with experience using technologies such as Hadoop v2, MapReduce, HDFS, Sqoop, Apache Storm, and Spark Streaming for building stream-processing systems. You should have a good understanding of Big Data querying tools like Hive and Impala, as well as experience in integrating data from various sources such as RDBMS, ERP, and files. Knowledge of SQL queries, joins, stored procedures, and relational schemas is essential. Experience with NoSQL databases like HBase, Cassandra, and MongoDB, along with ETL techniques and frameworks, is also expected. The role requires performance tuning of Spark jobs, experience with Azure Databricks, and the ability to efficiently lead a team. Designing and implementing Big Data solutions, as well as following Agile methodology, are key aspects of this position.

Posted 3 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

Karnataka

On-site

As an AI Ops Expert, you will be responsible for taking full ownership of deliverables with defined quality standards, timelines, and budget constraints. Your primary role will involve designing, implementing, and managing AIops solutions to automate and optimize AI/ML workflows. Collaborating with data scientists, engineers, and stakeholders is essential to ensure the seamless integration of AI/ML models into production environments. Your duties will also include monitoring and maintaining the health and performance of AI/ML systems, developing and maintaining CI/CD pipelines specifically tailored for AI/ML models, and implementing best practices for model versioning, testing, and deployment. In case of issues related to AI/ML infrastructure or workflows, you will troubleshoot and resolve them effectively. To excel in this role, you are expected to stay abreast of the latest AIops, MLOps, and Kubernetes tools and technologies. Your strong skills should include proficiency in Python with experience in Fast API, hands-on expertise in Docker and Kubernetes (or AKS), familiarity with MS Azure and its AI/ML services like Azure ML Flow, and the ability to use DevContainer for development purposes. Furthermore, you should possess knowledge of CI/CD tools such as Jenkins, Argo CD, Helm, GitHub Actions, or Azure DevOps, experience with containerization and orchestration tools like Docker and Kubernetes, proficiency in Infrastructure as code (Terraform or equivalent), familiarity with machine learning frameworks like TensorFlow, PyTorch, or scikit-learn, and exposure to data engineering tools such as Apache Kafka, Apache Spark, or similar technologies.,

Posted 3 weeks ago

Apply

2.0 - 6.0 years

0 Lacs

Hyderabad, Telangana

On-site

As a Pyspark Developer at Viraaj HR Solutions, you will be responsible for developing and maintaining scalable Pyspark applications for data processing. Your role will involve collaborating with data engineers to design and implement ETL pipelines for large datasets. Additionally, you will perform data analysis and build data models using Pyspark to derive insights. It will be your responsibility to ensure data quality and integrity by implementing data cleansing routines and leveraging SQL to query databases effectively. You will also create comprehensive data reports and visualizations for stakeholders, optimize existing data processing jobs for performance and efficiency, and implement new features and enhancements as required by project specifications. Participation in code reviews to ensure adherence to best practices, troubleshooting technical issues with team members, and maintaining documentation of data processes and system configurations will be part of your daily tasks. To excel in this role, you should possess a Bachelor's degree in Computer Science, Information Technology, or a related field, along with proven experience as a Pyspark Developer or in a similar role. Strong programming skills in Pyspark and Python, a solid understanding of the Spark framework and its APIs, and proficiency in SQL for managing and querying databases are essential qualifications. Experience with ETL tools and processes, knowledge of data visualization techniques and tools, and familiarity with cloud platforms such as AWS and Azure are also required. Your problem-solving and analytical skills, along with excellent communication skills (both verbal and written), will be crucial for success in this role. You should be able to work effectively in a team environment, adapt to new technologies and methodologies, and have experience in Agile and Scrum methodologies. Prior experience in data processing on large datasets and an understanding of data governance and compliance standards will be beneficial. Key Skills: agile methodologies, data analysis, team collaboration, Python, Scrum, Pyspark, data visualization, problem-solving, ETL tools, Python scripting, Apache Spark, Spark framework, cloud platforms (AWS, Azure), SQL, cloud technologies, data processing.,
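
For illustration only: a small PySpark ETL sketch of the ingest-cleanse-query pattern this posting describes. The input path, column names, and output location are assumptions, not details from the role.

```python
# Hypothetical PySpark ETL: ingest CSV, cleanse, aggregate with SQL, write Parquet.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("transactions-etl").getOrCreate()

raw = spark.read.option("header", True).csv("/landing/transactions.csv")

cleaned = (
    raw.dropDuplicates(["transaction_id"])  # assumed key column
    .filter(F.col("amount").isNotNull())
    .withColumn("amount", F.col("amount").cast("double"))
)

cleaned.createOrReplaceTempView("transactions")
daily = spark.sql(
    "SELECT transaction_date, SUM(amount) AS total_amount "
    "FROM transactions GROUP BY transaction_date"
)

daily.write.mode("overwrite").parquet("/curated/daily_totals")
```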

Posted 3 weeks ago

Apply

2.0 - 6.0 years

0 Lacs

Navi Mumbai, Maharashtra

On-site

As a Data Engineer, you will be responsible for collaborating with data scientists, software engineers, and business stakeholders to comprehend data requirements and develop efficient data models. Your key tasks will include designing and implementing robust data pipelines, ETL processes, and data integration solutions to ensure scalability and reliability. You will play a crucial role in the extraction, transformation, and loading of data from multiple sources while maintaining data quality, integrity, and consistency. It will be essential to optimize data processing and storage systems to effectively handle large volumes of structured and unstructured data. Additionally, you will be expected to conduct data cleaning, normalization, and enrichment activities to prepare datasets for analysis and modeling purposes. Monitoring data flows and processes, as well as identifying and resolving data-related issues and bottlenecks, will be part of your daily responsibilities. Furthermore, contributing to enhancing data engineering practices and standards within the organization and staying abreast of industry trends and emerging technologies will be vital for your success in this role. In terms of qualifications, you should possess a strong passion for data engineering, artificial intelligence, and problem-solving. A solid understanding of data engineering concepts, data modeling, and data integration techniques is essential. Proficiency in programming languages like Python, SQL, and Web Scraping is required, and familiarity with databases such as NoSQL, relational databases, and technologies like MongoDB, Redis, and Apache Spark would be advantageous. Knowledge of distributed computing frameworks and big data technologies, such as Hadoop and Spark, will be considered a plus. Excellent analytical and problem-solving skills, attention to detail, and strong communication and collaboration abilities are also key attributes for this role. Being self-motivated, a quick learner, and adaptable to changing priorities and technologies are qualities that will help you excel in this position.,

Posted 3 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

Pune, Maharashtra

On-site

You are a highly skilled and experienced Senior Data Scientist with a strong background in Artificial Intelligence (AI) and Machine Learning (ML). You will be joining our team as an innovative, analytical, and collaborative team player with a proven track record in end-to-end AI/ML project delivery. This includes expertise in data processing, modeling, and model deployment. With a minimum of 5-7 years of experience in data science, your focus will be on AI/ML applications. Your technical skills will include proficiency in a wide range of ML algorithms such as regression, classification, clustering, decision trees, neural networks, and deep learning architectures (e.g., CNNs, RNNs, GANs). Strong programming skills in Python, R, or Scala are required, along with experience in ML libraries like TensorFlow, PyTorch, and Scikit-Learn. You should have experience in data wrangling, cleaning, and feature engineering, with familiarity in SQL and data processing frameworks like Apache Spark. Model deployment using tools like Docker, Kubernetes, and cloud services (AWS, GCP, or Azure) should be part of your skill set. A strong foundational knowledge in statistics, probability, and mathematical concepts used in AI/ML is essential, along with proficiency in data visualization tools such as Tableau, Power BI, or matplotlib. Preferred qualifications include familiarity with big data tools like Hadoop, Hive, and distributed computing. Hands-on experience in NLP techniques like text mining, sentiment analysis, and transformers is a plus. Expertise in analyzing and forecasting time-series data, as well as familiarity with CI/CD pipelines for ML, model versioning, and performance monitoring, are also preferred. Leadership skills such as leading cross-functional project teams or managing data science projects in a production setting are valued. Your personal attributes should include problem-solving skills to break down complex problems and design innovative, data-driven solutions. Strong written and verbal communication skills are necessary to convey technical insights clearly to diverse audiences. A keen interest in staying updated with the latest advancements in AI and ML, along with the ability to quickly learn and implement new technologies, is expected from you.,

Posted 3 weeks ago

Apply

3.0 - 7.0 years

0 Lacs

Hyderabad, Telangana

On-site

NTT DATA, a global innovator in business and technology services, is seeking a Snowflake Engineer - Digital Solution Consultant Sr. Analyst to join its team in Hyderabad, Telangana (IN-TG), India (IN). The ideal candidate is expected to have experience with cloud data warehousing solutions, knowledge of big data technologies like Apache Spark and Hadoop, and familiarity with CI/CD pipelines, DevOps practices, and data visualization tools. NTT DATA, a trusted global innovator with $30 billion in revenue, serves 75% of the Fortune Global 100 companies. Committed to helping clients innovate, optimize, and transform for long-term success, NTT DATA has a diverse team of experts in more than 50 countries and a robust partner ecosystem. Its services range from business and technology consulting to data and artificial intelligence, industry solutions, and the development, implementation, and management of applications, infrastructure, and connectivity. As a leading provider of digital and AI infrastructure globally, NTT DATA is part of the NTT Group, which invests over $3.6 billion annually in R&D to help organizations and society move confidently into the digital future. To learn more, visit us at us.nttdata.com.

Posted 3 weeks ago

Apply

9.0 - 13.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

We are seeking a Lead - Python Developer / Tech Lead to take charge of backend development and oversee a team handling enterprise-grade, data-driven applications. In this role, you will have the opportunity to work with cutting-edge technologies such as FastAPI, Apache Spark, and Lakehouse architectures. Your responsibilities will include leading the team, making technical decisions, and ensuring timely project delivery in a dynamic work environment. Your primary duties will involve mentoring and guiding a group of Python developers, managing task assignments, maintaining code quality, and overseeing technical delivery. You will be responsible for designing and implementing scalable RESTful APIs using Python and FastAPI, as well as managing extensive data processing tasks using Pandas, NumPy, and Apache Spark. Additionally, you will drive the implementation of Lakehouse architectures and data pipelines, conduct code reviews, enforce coding best practices, and promote clean, testable code. Collaboration with cross-functional teams, including DevOps and Data Engineering, will be essential. Furthermore, you will be expected to contribute to CI/CD processes, operate in Linux-based environments, and potentially work with Kubernetes or MLOps tools. To excel in this role, you should have 9-12 years of total experience in software development, with a strong command of Python, FastAPI, and contemporary backend frameworks. A profound understanding of data engineering workflows, Spark, and distributed systems is crucial. Experience leading agile teams or serving as a tech lead is beneficial. Proficiency in unit testing, Linux, and working in cloud/data environments is required, while exposure to Kubernetes, ML pipelines, or MLOps would be advantageous.
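
For illustration (not taken from the role): a minimal FastAPI endpoint of the kind a backend lead here might review, combining a pydantic request model with a small pandas aggregation. The route and payload fields are invented for the example.

```python
# Hypothetical FastAPI service exposing a tiny pandas-based aggregation endpoint.
from fastapi import FastAPI
from pydantic import BaseModel
import pandas as pd

app = FastAPI()

class ScoreRequest(BaseModel):
    values: list[float]

@app.post("/aggregate")
def aggregate(req: ScoreRequest) -> dict:
    series = pd.Series(req.values)
    # Return simple summary statistics computed with pandas.
    return {"count": int(series.count()), "mean": float(series.mean())}
```

Run locally with, for example, `uvicorn main:app --reload` (assuming the file is saved as main.py).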

Posted 3 weeks ago

Apply

3.0 - 7.0 years

0 Lacs

Noida, Uttar Pradesh

On-site

As a Research Scientist at Adobe, you will have the opportunity to engage in cutting-edge research within the Media and Data Science Research Laboratory. Your role will involve designing, implementing, and optimizing machine learning algorithms to address real-world problems related to understanding user behavior and enhancing marketing performance. You will also be responsible for developing scalable data generation techniques, analyzing complex data from various sources, and running experiments to study data quality. Collaboration will be a key aspect of your role, as you will work closely with product, design, and engineering teams to prototype and transition research concepts into production. Your expertise in areas such as Large Language Models, Computer Vision, Natural Language Processing, and Recommendation Systems will be crucial in driving innovative solutions that redefine how businesses operate. To thrive in this role, you should have a proven track record of empirical research, experience deploying solutions in production environments, and the ability to derive actionable insights from large datasets. A degree in Computer Science, Statistics, Economics, or a related field is required, along with a knack for taking research risks and solving complex problems independently. At Adobe, we prioritize diversity, respect, and equal opportunity, recognizing that valuable insights can come from any team member. If you are a motivated and versatile individual with a passion for transforming digital experiences, we encourage you to join our ambitious team and contribute to the future of technology innovation.,

Posted 3 weeks ago

Apply

10.0 - 14.0 years

0 Lacs

Dehradun, Uttarakhand

On-site

As a Data Modeler, your primary responsibility will be to design and develop conceptual, logical, and physical data models supporting enterprise data initiatives. You will work with modern storage formats like Parquet and ORC, and build and optimize data models within Databricks Unity Catalog. Collaborating with data engineers, architects, analysts, and stakeholders, you will ensure alignment with ingestion pipelines and business goals. Translating business and reporting requirements into robust data architecture, you will follow best practices in data warehousing and Lakehouse design. Your role will involve maintaining metadata artifacts, enforcing data governance, quality, and security protocols, and continuously improving modeling processes. You should have over 10 years of hands-on experience in data modeling within Big Data environments. Your expertise should include OLTP, OLAP, dimensional modeling, and enterprise data warehouse practices. Proficiency in modeling methodologies like Kimball, Inmon, and Data Vault is essential. Hands-on experience with modeling tools such as ER/Studio, ERwin, PowerDesigner, SQLDBM, dbt, or Lucidchart is preferred. Experience in Databricks with Unity Catalog and Delta Lake is required, along with a strong command of SQL and Apache Spark for querying and transformation. Familiarity with the Azure Data Platform, including Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, and Azure SQL Database, is beneficial. Exposure to Azure Purview or similar data cataloging tools is a plus. Strong communication and documentation skills are necessary for this role, as well as the ability to work in cross-functional agile environments. A Bachelor's or Master's degree in Computer Science, Information Systems, Data Engineering, or a related field is required. Certifications such as Microsoft DP-203: Data Engineering on Microsoft Azure are a plus. Experience working in agile/scrum environments and exposure to enterprise data security and regulatory compliance frameworks like GDPR and HIPAA are advantageous.,

Posted 3 weeks ago

Apply

4.0 - 8.0 years

0 Lacs

Pune, Maharashtra

On-site

As a Big Data Architect specializing in Databricks at Codvo, a global empathy-led technology services company, your role is critical in designing sophisticated data solutions that drive business value for enterprise clients and power internal AI products. Your expertise will be instrumental in architecting scalable, high-performance data lakehouse platforms and end-to-end data pipelines, making you the go-to expert for modern data architecture in a cloud-first world. Your key responsibilities will include designing and documenting robust, end-to-end big data solutions on cloud platforms (AWS, Azure, GCP) with a focus on the Databricks Lakehouse Platform. You will provide technical guidance and oversight to data engineering teams on best practices for data ingestion, transformation, and processing using Spark. Additionally, you will design and implement effective data models and establish data governance policies for data quality, security, and compliance within the lakehouse. Evaluating and recommending appropriate data technologies, tools, and frameworks to meet project requirements and collaborating closely with various stakeholders to translate complex business requirements into tangible technical architecture will also be part of your role. Leading and building Proof of Concepts (PoCs) to validate architectural approaches and new technologies in the big data and AI space will be crucial. To excel in this role, you should have 10+ years of experience in data engineering, data warehousing, or software engineering, with at least 4+ years in a dedicated Data Architect role. Deep, hands-on expertise with Apache Spark and the Databricks platform is mandatory, including Delta Lake, Unity Catalog, and Structured Streaming. Proven experience architecting and deploying data solutions on major cloud providers, proficiency in Python or Scala, expert-level SQL skills, strong understanding of modern AI concepts, and in-depth knowledge of data warehousing concepts and modern Lakehouse patterns are essential. This position is remote and based in India with working hours from 2:30 PM to 11:30 PM. Join us at Codvo and be a part of a team that values Product innovation, mature software engineering, and core values like Respect, Fairness, Growth, Agility, and Inclusiveness each day to offer expertise, outside-the-box thinking, and measurable results.,

Posted 3 weeks ago

Apply

15.0 - 19.0 years

0 Lacs

Hyderabad, Telangana

On-site

As a Technical Lead / Data Architect, you will play a crucial role in our organization by leveraging your expertise in modern data architectures, cloud platforms, and analytics technologies. In this leadership position, you will be responsible for designing robust data solutions, guiding engineering teams, and ensuring successful project execution in collaboration with the project manager. Your key responsibilities will include architecting and designing end-to-end data solutions across multi-cloud environments such as AWS, Azure, and GCP. You will lead and mentor a team of data engineers, BI developers, and analysts to deliver on complex project deliverables. Additionally, you will define and enforce best practices in data engineering, data warehousing, and business intelligence. You will design scalable data pipelines using tools like Snowflake, dbt, Apache Spark, and Airflow, and act as a technical liaison with clients, providing strategic recommendations and maintaining strong relationships. To be successful in this role, you should have at least 15 years of experience in IT with a focus on data architecture, engineering, and cloud-based analytics. You must have expertise in multi-cloud environments and cloud-native technologies, along with deep knowledge of Snowflake, Data Warehousing, ETL/ELT pipelines, and BI platforms. Strong leadership and mentoring skills are essential, as well as excellent communication and interpersonal abilities to engage with both technical and non-technical stakeholders. In addition to the required qualifications, certifications in major cloud platforms and experience in enterprise data governance, security, and compliance are preferred. Familiarity with AI/ML pipeline integration would be a plus. We offer a collaborative work environment, opportunities to work with cutting-edge technologies and global clients, competitive salary and benefits, and continuous learning and professional development opportunities. Join us in driving innovation and excellence in data architecture and analytics.,

Posted 3 weeks ago

Apply

9.0 - 13.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

You should have 9+ years of experience and be located in Chennai. In-depth knowledge of Python is required, along with good experience creating APIs using FastAPI. You should have exposure to data libraries such as Pandas (DataFrames), NumPy, etc., as well as knowledge of Apache open-source components. Experience with Apache Spark, Lakehouse architecture, and open table formats is required. You should also have knowledge of automated unit testing, preferably using PyTest, and exposure to distributed computing. Experience working in a Linux environment is necessary, and working knowledge of Kubernetes would be an added advantage. Basic exposure to ML and MLOps would also be advantageous.

Posted 3 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

Karnataka

On-site

As a Data Engineer at Lifesight, you will play a crucial role in the Data and Business Intelligence organization by focusing on deep data engineering projects. Joining the data platform team in Bengaluru, you will have the opportunity to contribute to defining the technical strategy and data engineering team culture in India. Your responsibilities will include designing and constructing data platforms and services, as well as managing data infrastructure in cloud environments to support strategic business decisions across Lifesight products. You will be expected to build highly scalable distributed data processing systems, data solutions, and data pipelines that optimize data quality and are resilient to poor-quality data sources. Additionally, you will own data mapping, business logic, transformations, and data quality, while participating in architecture discussions, influencing the product roadmap, and taking ownership of new projects. The ideal candidate for this role should possess proficiency in Python and PySpark, a deep understanding of Apache Spark, experience with big data technologies such as HDFS, YARN, Map-Reduce, Hive, Kafka, Spark, Airflow, and Presto, and familiarity with distributed database systems. Experience working with various file formats like Parquet, Avro, and NoSQL databases, as well as AWS and GCP, is preferred. A minimum of 5 years of professional experience as a data or software engineer is required for this full-time position. If you are a self-starter who is passionate about data engineering, ready to work with big data technologies, and eager to collaborate with a team of engineers while mentoring others, we encourage you to apply for this exciting opportunity at Lifesight.,

Posted 3 weeks ago

Apply

6.0 - 10.0 years

0 Lacs

Pune, Maharashtra

On-site

As a Senior Data Engineer at our Pune location, you will play a critical role in designing, developing, and maintaining scalable data pipelines and architectures using Databricks on Azure/AWS cloud platforms. With 6 to 9 years of experience in the field, you will collaborate with stakeholders to integrate large datasets, optimize performance, implement ETL/ELT processes, ensure data governance, and work closely with cross-functional teams to deliver accurate solutions. Your responsibilities will include building, maintaining, and optimizing data workflows, integrating datasets from various sources, tuning pipelines for performance and scalability, implementing ETL/ELT processes using Spark and Databricks, ensuring data governance, collaborating with different teams, documenting data pipelines, and developing automated processes for continuous integration and deployment of data solutions. To excel in this role, you should have 6 to 9 years of hands-on experience as a Data Engineer, expertise in Apache Spark, Delta Lake, and Azure/AWS Databricks, proficiency in Python, Scala, or Java, advanced SQL skills, and experience with cloud data platforms, data warehousing solutions, data modeling, ETL tools, version control systems, and automation tools. Additionally, soft skills such as problem-solving, attention to detail, and the ability to work in a fast-paced environment are essential. Nice-to-have skills include experience with Databricks SQL and Databricks Delta, knowledge of machine learning concepts, and experience with CI/CD pipelines for data engineering solutions. Joining our team offers challenging work with international clients, growth opportunities, a collaborative culture, and global project involvement. We provide competitive salaries, flexible work schedules, health insurance, performance-based bonuses, and other standard benefits. If you are passionate about data engineering, possess the required skills and qualifications, and thrive in a dynamic and innovative environment, we welcome you to apply for this exciting opportunity.
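
As an illustrative sketch only (paths and table name are placeholders): one ELT step of the Databricks/Delta Lake pipeline work described above, reading raw JSON, stamping a load timestamp, and appending to a Delta table. On Databricks a SparkSession named spark is already provided; getOrCreate() simply reuses it.

```python
# Hypothetical bronze-layer load: raw JSON -> Delta table with a load timestamp.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

raw = spark.read.json("/mnt/raw/events/")  # placeholder source path

enriched = raw.withColumn("load_ts", F.current_timestamp())

(
    enriched.write
    .format("delta")
    .mode("append")
    .saveAsTable("analytics.events_bronze")  # placeholder table name
)
```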

Posted 3 weeks ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.
