Job Title: Senior Data Scientist / Research Scientist — LLM Training & Fine-tuning (Indian Languages, Tool Calling, Speed)
Location: Bangalore
About The Role
We're looking for a hands-on Data Scientist / Research Scientist who can fine-tune and train open-source LLMs end-to-end, not just run LoRA scripts. You'll own model improvement for Indian languages + code-switching (Hinglish, etc.), instruction following, and reliable tool/function calling, with a strong focus on latency, throughput, and production deployability.

This is a builder role: you'll take models from research → experiments → evals → production.
What You'll Do (Responsibilities)
- Train and fine-tune open LLMs (continued pretraining, SFT, preference optimization like DPO/IPO/ORPO, reward modeling if needed) for:
  - Indian languages + multilingual / code-switching
  - Strong instruction following
  - Reliable tool/function calling (structured JSON, function schemas, deterministic outputs)
- Build data pipelines for high-quality training corpora:
  - Instruction datasets, tool-call traces, multilingual data, synthetic data generation
  - De-duplication, contamination control, quality filtering, safety filtering
- Develop evaluation frameworks and dashboards:
  - Offline + online evals, regression testing
  - Tool-calling accuracy, format validity, multilingual benchmarks, latency/cost metrics (see the format-validity sketch after this list)
- Optimize models for speed and serving:
  - Quantization (AWQ/GPTQ/bnb), distillation, speculative decoding, KV-cache optimizations
  - Serve via vLLM/TGI/TensorRT-LLM/ONNX where appropriate
- Improve alignment and reliability:
  - Reduce hallucinations, improve refusal behavior, enforce structured outputs
  - Prompting + training strategies for robust compliance and guardrails
- Collaborate with engineering to ship:
  - Model packaging, CI for evals, A/B testing, monitoring drift and quality
- Read papers, propose experiments, publish internal notes, and turn ideas into measurable gains
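To make the "format validity" metric above concrete, here is a minimal sketch of the kind of check an eval harness might run over raw model outputs. The `get_weather` schema and the `{"name": ..., "arguments": ...}` call format are illustrative assumptions, not our actual tool interface.

```python
import json

# Hypothetical tool schema for illustration only: one "get_weather" function
# with a single required string argument. A real registry would define many tools.
TOOL_SCHEMA = {
    "get_weather": {"required": {"city": str}},
}

def is_valid_tool_call(raw: str) -> bool:
    """Return True if `raw` parses as a well-formed call to a known tool."""
    try:
        call = json.loads(raw)                      # must be parseable JSON
    except json.JSONDecodeError:
        return False
    if not isinstance(call, dict):
        return False
    name, args = call.get("name"), call.get("arguments")
    if name not in TOOL_SCHEMA or not isinstance(args, dict):
        return False
    required = TOOL_SCHEMA[name]["required"]
    # every required argument must be present with the expected type
    return all(isinstance(args.get(k), t) for k, t in required.items())

# Format-validity rate over a batch of model outputs
outputs = [
    '{"name": "get_weather", "arguments": {"city": "Bengaluru"}}',
    '{"name": "get_weather", "arguments": {}}',
]
validity = sum(is_valid_tool_call(o) for o in outputs) / len(outputs)
print(f"format validity: {validity:.2%}")
```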
What We're Looking For (Qualifications)
Must-Have
- 4 - 6 years in ML/DS, with direct LLM training/fine-tuning experience
- Demonstrated ability to run end-to-end model improvement: data → training → eval → deployment constraints → iteration
- Strong practical knowledge of:
  - Transformers, tokenization, multilingual modeling
  - Fine-tuning methods: LoRA/QLoRA, full fine-tune, continued pretraining (a minimal LoRA setup is sketched after this list)
  - Alignment: SFT, DPO/IPO/ORPO (and when to use what)
- Experience building or improving tool/function calling and structured output reliability
- Strong coding skills in Python, deep familiarity with PyTorch
- Comfortable with distributed training and GPU stacks: DeepSpeed / FSDP, Accelerate, multi-GPU/multi-node workflows
- Solid ML fundamentals: optimization, regularization, scaling laws intuition, error analysis
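As a reference point for the fine-tuning methods above, here is a minimal LoRA setup sketch using Hugging Face Transformers + PEFT. The base checkpoint and target modules are placeholders; real recipes depend on the model family and training budget.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-3.1-8B-Instruct"        # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

lora = LoraConfig(
    r=16,                                        # adapter rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()               # sanity-check the trainable share
# SFT or preference optimization (e.g. DPO) would then run on this wrapped
# model, typically via the HF Trainer or TRL.
```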
Nice-to-Have
- Research background: MS/PhD or publications / strong applied research track record
- Experience with Indian language NLP: Indic scripts, transliteration, normalization, code-mixing, ASR/TTS text quirks
- Experience with pretraining from scratch or large-scale continued pretraining
- Practical knowledge of serving: vLLM / TGI / TensorRT-LLM, quantization + calibration, profiling (see the serving sketch after this list)
- Experience with data governance: privacy, PII redaction, dataset documentation
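For context on the serving side, a minimal sketch using vLLM's offline batch API follows; the model id is a placeholder and the `quantization="awq"` flag assumes an already-quantized checkpoint. Production serving would more likely use vLLM's OpenAI-compatible server.

```python
from vllm import LLM, SamplingParams

# Placeholder model id; quantization="awq" assumes an AWQ-quantized checkpoint.
llm = LLM(model="my-org/indic-chat-8b-awq", quantization="awq", max_model_len=4096)
params = SamplingParams(temperature=0.0, max_tokens=256)

outputs = llm.generate(["Translate to Hindi: How are you?"], params)
print(outputs[0].outputs[0].text)
```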
Tech Stack (Typical)
- PyTorch, Hugging Face Transformers/Datasets, Accelerate
- DeepSpeed / FSDP, PEFT (LoRA/QLoRA)
- Weights & Biases / MLflow
- vLLM / TGI / TensorRT-LLM
- Ray / Airflow / Spark (optional), Docker/Kubernetes
- Vector DB / RAG stack familiarity is a plus
What Success Looks Like (90-180 Days)
- Ship a fine-tuned open model that measurably improves:
  - Instruction following and tool-calling correctness
  - Indic language performance + code-switching robustness
  - Lower latency / higher throughput at equal quality
- Stand up a repeatable pipeline: dataset versioning, training recipes, eval harness, regression gates (a regression-gate sketch follows this list)
- Build a roadmap for next upgrades (distillation, preference tuning, multilingual expansion)
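To illustrate what a regression gate could look like, here is a small sketch that fails CI when a candidate model regresses past an allowed margin. The metric names, baselines, and margins are illustrative assumptions, not our actual targets.

```python
# Illustrative metric names, baselines, and margins; a real gate would pull the
# baseline from the last released model's eval report.
BASELINE = {"tool_call_validity": 0.97, "hindi_chrf": 0.62, "p95_latency_ms": 850}
MARGINS  = {"tool_call_validity": 0.01, "hindi_chrf": 0.02, "p95_latency_ms": 50}

def gate(candidate: dict) -> list[str]:
    """Return a list of regressions; an empty list means the candidate passes."""
    failures = []
    for metric, base in BASELINE.items():
        value, margin = candidate[metric], MARGINS[metric]
        if metric.endswith("_ms"):               # lower is better for latency
            if value > base + margin:
                failures.append(f"{metric}: {value} > {base + margin}")
        elif value < base - margin:              # higher is better otherwise
            failures.append(f"{metric}: {value} < {base - margin}")
    return failures

candidate = {"tool_call_validity": 0.98, "hindi_chrf": 0.63, "p95_latency_ms": 820}
assert not gate(candidate), f"regression gate failed: {gate(candidate)}"
```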
Interview Process
- 30-min intro + role fit
- Technical deep dive: prior LLM work (training/evals/production constraints)
- Take-home or live exercise: design an LLM fine-tuning + eval plan for tool calling + Indic language
- Systems round: training/serving tradeoffs, cost/latency, failure modes
- Culture + collaboration round