Job Title: Mid-Level Data Scientist
Location: Noida
Type: Full-Time
Experience Level: 2–5 years
Industry: Artificial Intelligence, Machine Learning, Data Science

About the Role
We are looking for a self-motivated Mid-Level Data Scientist to join our AI team focused on GenAI applications. We work at the intersection of multi-modal modeling, Retrieval-Augmented Generation (RAG), and real-time machine learning systems. You’ll collaborate with a high-impact team to design, prototype, and deploy next-generation AI solutions, especially around document understanding and multi-modal tasks.

Key Responsibilities
- Design and implement state-of-the-art GenAI solutions, involving multi-modal document understanding models and agents.
- Build and optimize RAG pipelines, including knowledge of various RAG architectures.
- Develop and maintain agentic workflows using tools like LangGraph and LangChain.
- Work with large-scale datasets and ensure efficient data processing pipelines.
- Perform statistical analysis, algorithm development, and performance tuning.
- Work with open-source LLMs and deploy them on serving frameworks such as SGLang and vLLM.
- Stay up to date with the latest developments in GenAI and ML, and actively contribute to knowledge sharing.
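The RAG responsibilities above can be sketched in miniature. This is a toy retrieve-then-generate pipeline in plain Python, not the team's actual stack: the corpus, the bag-of-words "embedding", and all function names are hypothetical stand-ins for a real embedding model, vector store, and LLM.

```python
import math
from collections import Counter

# Hypothetical corpus standing in for an indexed document store.
DOCS = [
    "Invoices must include a purchase order number.",
    "Multi-modal models combine text and image inputs.",
    "RAG pipelines retrieve context before generation.",
]

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank every document against the query and keep the top k.
    q = embed(query)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def answer(query: str) -> str:
    # In a real pipeline the retrieved context would be sent to an LLM;
    # here we only splice it into the prompt string.
    context = " ".join(retrieve(query))
    return f"Context: {context}\nQuestion: {query}"

print(answer("How do RAG pipelines work?"))
```

Production variants swap `embed` for a learned embedding model, `DOCS` for a vector database, and the prompt splice for an LLM call, but the retrieve-then-generate shape is the same.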
Required Qualifications
- Bachelor’s degree (Master’s preferred) in Computer Science, Data Science, AI/ML, or a related field.
- Minimum 3 years of experience in machine learning, data science, or AI roles.
- Strong command of Python and familiarity with R or other scripting languages.
- Hands-on experience with deep learning, transformer-based models, and multi-modal learning.
- Proficiency in AI/ML frameworks and libraries (e.g., PyTorch, TensorFlow, Hugging Face Transformers).
- Strong understanding of statistics, linear algebra, and probability theory.
- Experience working with cloud environments, preferably Azure.
- Exposure to OpenAI, Anthropic, Mistral, or similar APIs, and to deployment of open-source models (LLaMA, MPT, etc.).
- Demonstrated experience in document AI, vision-language models, or OCR-based understanding systems.

Preferred Skills
- Experience with LangGraph, CrewAI, AutoGen, or similar orchestration frameworks.
- Working knowledge of vector databases (e.g., Qdrant, Weaviate, Pinecone) and embedding search techniques.
- Exposure to Kubernetes, Docker, or ML model deployment workflows.
- Curiosity-driven mindset with a passion for learning and experimenting with the latest in AI research.

Why Join Us?
- Be part of a team working on powerful AI applications
- Access to cutting-edge tools and open models
- Flexible working hours
- Supportive environment that encourages innovation, research, and upskilling
Job Title: Data Scientist
Location: Noida
Type: Full-Time
Experience Level: 1–2 years
Industry: Artificial Intelligence, Machine Learning, Data Science

About the Role
We are looking for a self-motivated Data Scientist to join our AI team focused on GenAI applications. We work at the intersection of multi-modal modeling, Retrieval-Augmented Generation (RAG), and real-time machine learning systems. You’ll collaborate with a high-impact team to design, prototype, and deploy next-generation AI solutions, especially around document understanding and multi-modal tasks.

Key Responsibilities
- Design and implement state-of-the-art GenAI solutions, involving multi-modal document understanding models and agents.
- Build and optimize RAG pipelines, including knowledge of various RAG architectures.
- Develop and maintain agentic workflows using tools like LangGraph and LangChain.
- Work with large-scale datasets and ensure efficient data processing pipelines.
- Perform statistical analysis, algorithm development, and performance tuning.
- Work with open-source LLMs and deploy them on serving frameworks such as SGLang and vLLM.
- Stay up to date with the latest developments in GenAI and ML, and actively contribute to knowledge sharing.

Required Qualifications
- Bachelor’s degree (Master’s preferred) in Computer Science, Data Science, AI/ML, or a related field.
- 1–2 years of experience in machine learning, data science, or AI roles.
- Strong command of Python and familiarity with R or other scripting languages.
- Hands-on experience with deep learning, transformer-based models, and multi-modal learning.
- Proficiency in AI/ML frameworks and libraries (e.g., PyTorch, TensorFlow, Hugging Face Transformers).
- Strong understanding of statistics, linear algebra, and probability theory.
- Experience working with cloud environments, preferably Azure.
- Exposure to OpenAI, Anthropic, Mistral, or similar APIs, and to deployment of open-source models (LLaMA, MPT, etc.).
- Demonstrated experience in document AI, vision-language models, or OCR-based understanding systems.

Preferred Skills
- Experience with LangGraph, CrewAI, AutoGen, or similar orchestration frameworks.
- Working knowledge of vector databases (e.g., Qdrant, Weaviate, Pinecone) and embedding search techniques.
- Exposure to Kubernetes, Docker, or ML model deployment workflows.
- Curiosity-driven mindset with a passion for learning and experimenting with the latest in AI research.

Why Join Us?
- Be part of a team working on powerful AI applications
- Access to cutting-edge tools and open models
- Flexible working hours
- Supportive environment that encourages innovation, research, and upskilling
Job Title: Data Scientist Intern
Duration: 3 Months
Location: Noida (Hybrid)
Internship Type: Full-Time
Joining: Immediate

About the Role:
We are looking for a Data Scientist Intern to join our team for a 3-month project-based internship. The ideal candidate should have strong Python programming skills, hands-on experience in prompt engineering, and a good understanding of Retrieval-Augmented Generation (RAG) workflows. You will work closely with our data science and AI teams to build, test, and optimize intelligent data-driven solutions.

Key Responsibilities:
- Develop and maintain Python-based data pipelines and AI workflows.
- Work on prompt engineering to fine-tune LLM responses for various business use cases.
- Implement and test RAG pipelines integrating LLMs with external data sources.
- Perform exploratory data analysis (EDA), preprocessing, and feature extraction.
- Collaborate with cross-functional teams to design experiments and validate model outputs.
- Document processes, findings, and code for reproducibility and scalability.

Required Skills & Qualifications:
- Strong proficiency in Python and libraries like Pandas, NumPy, and LangChain/LlamaIndex.
- Understanding of prompt engineering techniques for LLM optimization.
- Hands-on knowledge of RAG (Retrieval-Augmented Generation) implementation.
- Familiarity with APIs, vector databases (e.g., FAISS, Pinecone, Chroma), and embeddings.
- Good analytical, problem-solving, and communication skills.
- Ability to work independently and deliver within timelines.
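The prompt-engineering skill listed above can be illustrated with a small sketch: a few-shot prompt template in plain Python. The template wording, the example Q&A pairs, and the function name are hypothetical, chosen only to show the technique of structuring a prompt before it is sent to an LLM.

```python
# Few-shot prompt template: demonstration pairs are prepended to the
# user's question so the model sees the expected answer format.
# All content below is hypothetical illustration data.

TEMPLATE = """You are a data assistant.
{examples}
Question: {question}
Answer:"""

FEW_SHOT = [
    ("What does EDA stand for?", "Exploratory data analysis."),
    ("Name one vector database.", "FAISS."),
]

def build_prompt(question: str) -> str:
    # Render each demonstration pair, then slot in the new question.
    examples = "\n".join(f"Question: {q}\nAnswer: {a}" for q, a in FEW_SHOT)
    return TEMPLATE.format(examples=examples, question=question)

print(build_prompt("What is RAG?"))
```

Keeping the template separate from the examples makes it easy to A/B test phrasings or swap demonstration sets per business use case, which is the day-to-day core of the prompt-engineering work described above.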