AI Engineer / Data Scientist

5 years

0 Lacs

Posted: 1 month ago | Platform: LinkedIn

Work Mode: On-site

Job Type: Full Time

Job Description

About The Role

We are building next-generation AI assistants that combine real-time responsiveness, multilingual multimodal understanding, and deep personalization. As an LLM Engineer - AI Assistant & RAG Systems, you'll lead initiatives that bridge Retrieval-Augmented Generation (RAG), vector search, latency optimization, and long-term memory for highly scalable consumer-facing applications. You'll architect intelligent, efficient, and privacy-aware voice and text-based assistants.

We're proud to share that Lenskart is now our strategic investor, backing our vision to make conscious technology accessible at scale. If you thrive at the intersection of research and product, we want you on our team.

Minimum Work Experience Required

5+ years of experience in ML/NLP roles with strong hands-on expertise in building large-scale AI/LLM systems for production.

Top 3 Daily Tasks

Design and optimize LLM-powered assistant systems, including RAG, vector databases, rerankers, and latency-aware inference pipelines.

Build feedback loops and observability layers to evaluate and improve assistant quality in production.

Collaborate with product, mobile, and infra teams to enable seamless multilingual + multimodal assistant experiences with minimal latency.

Top 5 Skills You Should Possess

Proven experience working with LLMs, RAG pipelines, and vector search systems (e.g., FAISS, Qdrant, Milvus).

Deep understanding of latency optimization, streaming token responses, and caching strategies in LLM deployment.

Experience with retriever-reranker tuning, LLM evaluation metrics, prompt engineering, and hallucination mitigation techniques.

Strong foundation in Python, PyTorch/TensorFlow, FastAPI, and orchestration tools like Airflow, Docker, and Kubernetes.

Ability to design memory modules using long-term embeddings, user vectors, and strategies like memory decay and context truncation for scalable personalization.

Cross-Functional Collaboration Excellence

Work closely with front-end, infra, and product teams to deliver cohesive assistant interactions.

Collaborate with UX teams to define feedback capture, user adaptation mechanisms, and privacy-aware memory usage.

Interface with Data and MLOps teams for scalable training, evaluation, and deployment pipelines.

Bonus Points For

Experience in agentic systems, autonomous workflows, or fine-tuning LLMs with LoRA/QLoRA.

Publications or writing in the domain of LLMs, GenAI, or retrieval architectures.

Contributions to open-source projects in RAG/LLM/prompt engineering or published tools for LLM deployment.

Exposure to building voice interfaces or multimodal input pipelines using tools like Whisper or CLIP.

What You'll Be Creating

A real-time, multimodal and multilingual assistant that adapts to user preferences and evolves with usage.

A low-latency, scalable backend for LLM-powered interactions under a minimal-latency SLA.

A robust feedback and retraining loop enabling continuous improvement of LLM outputs.

A privacy-aware long-term memory system with vectorized personalization and memory decay.
