LLM Training & Model Development Engineer

Experience: 8 - 13 years


Platform: Naukri


Work Mode: Remote

Job Type: Full Time

Job Description

Overview

We are seeking a highly skilled LLM Training & Model Development Engineer with strong expertise in data engineering, model fine-tuning, and optimization. The role focuses on building, customizing, and optimizing home-grown Large Language Models (LLMs) tailored to domain-specific applications. You'll work closely with data scientists, prompt engineers, and infrastructure teams to deliver state-of-the-art AI models that can reason, converse, and adapt intelligently.

Key Responsibilities

  • Act as a strong data engineer with agentic AI experience, handling data extraction, transformation, feature engineering, and analytics to build AI/ML models.
  • Curate and preprocess training corpora for domain-specific instruction tuning.
  • Fine-tune open-source LLMs using LoRA, RLHF, DPO, and model distillation techniques.
  • Implement model evaluation pipelines and benchmark reporting.
  • Collaborate with Prompt & Data teams to create repeatable model tuning workflows.

1. Data Engineering & Preparation

  • Architect and implement data pipelines for large-scale text ingestion, cleaning, and transformation.
  • Perform data extraction, transformation, and feature engineering across structured and unstructured sources.
  • Develop and maintain data quality frameworks ensuring clean, diverse, and bias-mitigated datasets for model training.
  • Automate data labeling and annotation workflows using LLM-assisted or agentic tools.
  • Build domain-specific corpora for instruction tuning, conversational grounding, and retrieval-augmented training.
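To make the cleaning-and-deduplication part of these responsibilities concrete, here is a minimal dependency-free sketch; the function names are illustrative, not an existing internal API:

```python
import hashlib
import re

def normalize(text: str) -> str:
    """Strip control characters and collapse whitespace in a raw document."""
    text = re.sub(r"[\x00-\x08\x0b-\x1f]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def dedupe(corpus: list[str]) -> list[str]:
    """Drop exact duplicates after normalization, preserving first-seen order."""
    seen, clean = set(), []
    for doc in corpus:
        doc = normalize(doc)
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if doc and digest not in seen:
            seen.add(digest)
            clean.append(doc)
    return clean

docs = ["Hello   world\n", "hello world", "Hello world", ""]
print(dedupe(docs))  # ['Hello world', 'hello world']
```

At corpus scale this logic would typically run inside Spark or Airflow tasks, with near-duplicate detection (e.g. MinHash) layered on top of the exact-match hashing shown here.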

2. Model Training & Fine-Tuning

  • Fine-tune and adapt open-source LLMs (e.g., LLaMA, Mistral, Falcon, Gemma) using LoRA, QLoRA, RLHF, DPO, and model distillation.
  • Implement self-instruct and multi-turn conversational fine-tuning for agentic use cases.
  • Design training orchestration scripts for distributed GPU/TPU environments (PyTorch, DeepSpeed, Hugging Face Accelerate).
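In practice LoRA fine-tuning is done with libraries such as Hugging Face PEFT, but the core idea behind the technique named above (a frozen weight matrix plus a trainable low-rank update, y = (W + (alpha/r) * B A) x) can be sketched without any dependencies; the matrices here are toy values for illustration only:

```python
def matmul(X, Y):
    """Naive matrix multiply for small illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_forward(W, A, B, x, alpha=1.0, r=1):
    """Apply (W + (alpha/r) * B @ A) to x; W stays frozen, only A and B train."""
    delta = matmul(B, A)          # rank-r update, far fewer parameters than W
    scale = alpha / r
    W_eff = [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]
    return [sum(w + 0 for w in []) or sum(w * xi for w, xi in zip(row, x)) for row in W_eff]

# B is initialized to zeros, so training starts from the base model's behavior.
W, A, B = [[1, 0], [0, 1]], [[1, 1]], [[0], [0]]
print(lora_forward(W, A, B, [2, 3]))  # [2.0, 3.0] (identical to the base model)
```

Because only A (r x k) and B (d x r) receive gradients, the trainable parameter count scales with r rather than with the full d x k weight matrix, which is why this counts as parameter-efficient fine-tuning.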

3. Model Evaluation & Benchmarking

  • Develop evaluation frameworks for automatic and human-in-the-loop assessment of LLM performance.
  • Benchmark models against standard datasets (MMLU, HELM, ARC, TruthfulQA) and custom internal benchmarks.
  • Generate detailed performance dashboards tracking precision, hallucination rate, factual consistency, and latency.
  • Conduct A/B testing and regression analysis on model updates to ensure stable improvement.
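As a rough sketch of what such an evaluation pipeline computes, the snippet below scores exact-match accuracy and applies a simple regression gate between model versions; the tolerance value is a hypothetical choice, and real benchmarks like MMLU ship their own harnesses:

```python
def exact_match_accuracy(predictions, references):
    """Share of predictions matching the reference after case/whitespace normalization."""
    norm = lambda s: " ".join(s.lower().split())
    hits = sum(norm(p) == norm(r) for p, r in zip(predictions, references))
    return hits / len(references)

def passes_regression_gate(new_score, baseline_score, tolerance=0.01):
    """A/B regression check: block releases that drop more than `tolerance` below baseline."""
    return new_score >= baseline_score - tolerance

preds = ["Paris ", "rome", "42"]
refs = ["paris", "Berlin", "42"]
print(round(exact_match_accuracy(preds, refs), 2))  # 0.67
```

Tracked per release, the same two numbers (current score, baseline score) feed directly into the dashboards and A/B comparisons mentioned above.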

4. Collaboration & AI Workflow Automation

  • Work cross-functionally with Prompt Engineers, Data Scientists, and DevOps to operationalize model development.
  • Build repeatable pipelines for fine-tuning, version control, and continuous model improvement (MLOps).
  • Integrate agentic feedback loops for continuous self-improvement and autonomous retraining cycles.
  • Support deployment through containerized model serving (FastAPI, Triton, or Ray Serve).
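One way to make fine-tuning runs repeatable, sketched here under the assumption of a simple content-addressed registry (the helper names are hypothetical, not a specific MLOps product), is to derive each model version tag from a hash of its artifact bytes:

```python
import hashlib

def register_model(registry: dict, name: str, weights_blob: bytes) -> str:
    """Content-addressed versioning: identical artifact bytes always yield the same tag."""
    version = hashlib.sha256(weights_blob).hexdigest()[:12]
    registry[f"{name}:{version}"] = weights_blob
    return version

registry = {}
v1 = register_model(registry, "domain-llm", b"checkpoint-bytes")
v2 = register_model(registry, "domain-llm", b"checkpoint-bytes")
print(v1 == v2)  # True: re-registering identical weights is idempotent
```

Because the tag is a function of the bytes, a retraining cycle that reproduces the same checkpoint maps to the same version, which keeps rollbacks and A/B comparisons across the continuous-improvement loop unambiguous.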

5. Research & Innovation

  • Stay current with cutting-edge research in LLM fine-tuning, alignment, and model compression.
  • Contribute to internal whitepapers and experiments evaluating emerging architectures and optimization methods.
  • Prototype and publish novel training methodologies or agentic evaluation techniques.

Required Skills & Experience

  • Strong Python expertise with hands-on experience in PyTorch, Hugging Face Transformers, and LangChain.
  • Deep understanding of LLM architectures, tokenizer mechanics, and parameter-efficient fine-tuning.
  • Proficiency in data processing frameworks (Spark, Airflow, Pandas, Arrow, Dask).
  • Experience with distributed training and GPU/TPU optimization (CUDA, NCCL).
  • Knowledge of evaluation metrics and human-aligned reward modeling.
  • Experience with vector databases (FAISS, Milvus, Pinecone) for context retrieval.
  • Familiarity with cloud platforms (AWS, GCP, Azure) and container orchestration (Docker, Kubernetes).
  • Exposure to agentic AI frameworks and feedback-based continuous improvement systems is a plus.
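The context-retrieval skill listed above reduces to nearest-neighbour search over embeddings; FAISS, Milvus, and Pinecone approximate it at scale with ANN indexes, but a brute-force sketch (the embedding values here are made up) looks like this:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def top_k(query, index, k=2):
    """Exact top-k retrieval by cosine similarity over a {doc_id: embedding} index."""
    ranked = sorted(index, key=lambda doc_id: cosine(query, index[doc_id]), reverse=True)
    return ranked[:k]

index = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0], "doc_c": [1.0, 1.0]}
print(top_k([1.0, 0.1], index))  # ['doc_a', 'doc_c']
```

A vector database replaces the `sorted` scan with an approximate index (e.g. IVF or HNSW) so the same query stays fast over millions of documents.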

Preferred Qualifications

  • Prior experience contributing to open-source LLM projects.
  • Background in NLP research or applied ML.
  • Knowledge of data privacy, ethical AI, and prompt alignment techniques.
  • Master's or Ph.D. in Computer Science, AI, or a related field preferred.

What You’ll Get to Build

  • A home-grown, domain-specialized LLM trained on proprietary and public datasets.
  • A scalable fine-tuning pipeline that powers multiple downstream agents and AI applications.
  • An autonomous model training framework capable of learning from feedback in real time.

Inoptra Digital

Technology / Digital Services

Tech City
