AI/ML Engineer (GenAI & LLM Specialist)

3 - 7 years

15 - 18 Lacs

Posted: 1 day ago | Platform: Naukri

Work Mode

Hybrid

Job Type

Full Time

Job Description

Role Summary

We are seeking an experienced AI/ML Engineer to lead the development of our core intelligence engine. In this role, you will own the end-to-end lifecycle of our proprietary AI tool: selecting the optimal open-source Large Language Model (LLM), fine-tuning it for our specific domain, and architecting a RAG (Retrieval-Augmented Generation) pipeline.

Your primary mission is to build a system that can ingest complex PDF documents, comprehend historical and referential data, and provide accurate, context-aware answers to user queries.

Role & Responsibilities

1. Model Selection & Strategy

  • Evaluate Open-Source Models:

    Analyze and benchmark state-of-the-art open-source models (e.g., Llama 3, Mistral, Falcon, Mixtral) to identify the best balance of performance, inference cost, and license suitability for our specific use case.
  • Feasibility Analysis:

    Determine when to use Retrieval-Augmented Generation (RAG) versus Fine-Tuning (or a hybrid approach) to ensure the highest accuracy in answering queries based on uploaded PDFs.

2. RAG Pipeline & Data Engineering

  • Document Ingestion:

    Build robust pipelines to parse, clean, and OCR complex PDF files (handling tables, headers, and multi-column layouts) using tools like Unstructured, PyMuPDF, or LayoutLM.
  • Vector Database Management:

    Design and implement vector search architectures (using Pinecone, Milvus, ChromaDB, or Weaviate) to store and retrieve high-dimensional embeddings efficiently.
  • Context Optimization:

    Optimize "chunking" strategies and context window management so that the LLM receives the most relevant historical data and is less likely to hallucinate.
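
As a rough illustration of the chunking and retrieval work described above, here is a minimal sketch in plain Python. It assumes word-based chunks with a fixed overlap and uses bag-of-words cosine similarity as a stand-in for real embeddings and a vector database. The function names (chunk_words, cosine, top_k) and the default sizes are hypothetical and not part of any library or of our stack.

```python
import math
from collections import Counter


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors (toy substitute for embeddings)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(c.lower().split())),
        reverse=True,
    )
    return ranked[:k]
```

In a real pipeline the term counts would be replaced by model embeddings and the sort by a vector database query; the overlap exists so that a sentence falling on a chunk boundary still appears whole in at least one chunk.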

3. Model Training & Fine-Tuning

  • Fine-Tuning:

    Implement efficient fine-tuning techniques (PEFT, LoRA, QLoRA) on the selected LLM to adapt its tone and reasoning capabilities to our business domain.
  • Dataset Preparation:

    Curate and format training datasets from internal data to improve the model's ability to understand domain-specific terminology found in the PDFs.

4. Deployment & Optimization

  • Inference Optimization:

    Optimize model latency and throughput using quantization (GGML/GGUF/AWQ) or engines like vLLM and TGI.
  • API Development:

    Wrap the AI engine in a robust API (FastAPI/Flask) for integration with our front-end application.

Required Skills & Qualifications

  • Core Tech:

    Expert proficiency in Python and deep learning frameworks (PyTorch or TensorFlow).
  • LLM Ecosystem:

    Deep familiarity with the Hugging Face ecosystem (Transformers, Accelerate, PEFT, Datasets).
  • GenAI Frameworks:

    Hands-on experience with orchestration frameworks like LangChain or LlamaIndex, specifically for building RAG applications.
  • Vector Search:

    Experience working with Vector Databases (Pinecone, ChromaDB, Elasticsearch, or pgvector).
  • Document Processing:

    Experience extracting clean text from unstructured files (PDFs) using open-source OCR tools (Nougat, Surya, PaddleOCR) or, for native PDFs, Python libraries (PyMuPDF, pdfplumber).
  • Deployments:

    Experience with containerization and orchestration (Docker, Kubernetes) and with serving LLMs in air-gapped or offline environments using tools like vLLM, Ollama, or llama.cpp.
  • Mathematics:

    Solid understanding of linear algebra, probability, and how Transformer architectures (Attention mechanisms) work.

Nice-to-Have (Bonus Points)

  • Experience deploying LLMs on cloud GPUs (AWS SageMaker or RunPod).
  • Knowledge of prompt engineering techniques (Chain-of-Thought, ReAct).
  • Previous experience building "Chat with your Data" style applications.

Why Join Us?

  • High Impact:

    You will be the primary architect of the intelligence behind our product, not just a maintainer of legacy code.
  • Cutting Edge:

    You will work with the latest developments in the open-source LLM space.
  • Autonomy:

    You will have the freedom to choose the tech stack (Models, DBs, Frameworks) that best solves the problem.
