AI/GenAI Engineer

Experience: 0 - 5 years

Salary: 0 - 3 Lacs

Posted: 2 weeks ago | Platform: Naukri


Work Mode: Work from Office

Job Type: Full Time

Job Description

Position: AI/GenAI Engineer (LLM Integration Specialist)

About the Role

We're building a high-performance chat application and are looking for an AI/GenAI Engineer to lead the integration and optimization of Large Language Models (LLMs). You'll be responsible for integrating LLM APIs, implementing domain-specific fine-tuning strategies, driving prompt engineering, and ensuring optimal performance in production.

You'll work closely with our React and Python developers to create a seamless, intelligent chat experience that serves thousands of concurrent users.

Key Responsibilities

LLM Integration & Architecture

  • Integrate multiple LLM APIs (OpenAI, Anthropic Claude, Google Gemini, or open-source models)
  • Design and implement robust API wrapper services with retry logic, fallback mechanisms, and error handling
  • Implement streaming responses for real-time chat experience
  • Build rate limiting and quota management systems
  • Handle token counting, context window management, and cost optimization
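
To make the wrapper-service bullets above concrete, here is a minimal sketch, assuming the OpenAI Python SDK (v1+) and the tenacity retry library; the model names, retry limits, and fallback choice are placeholders rather than part of this role's actual stack.

```python
# Minimal sketch: streaming chat call with retries on the request and a
# secondary-model fallback. Model names and retry limits are placeholders.
from openai import OpenAI
from tenacity import retry, stop_after_attempt, wait_exponential

client = OpenAI()  # reads OPENAI_API_KEY from the environment

@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))
def _open_stream(messages, model):
    # Retries cover only the request that opens the stream; mid-stream
    # network failures would need separate handling.
    return client.chat.completions.create(model=model, messages=messages, stream=True)

def stream_chat(messages, model="gpt-4o-mini"):
    """Yield response text chunks as they arrive, for a real-time chat UI."""
    for chunk in _open_stream(messages, model):
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content

def chat_with_fallback(messages, primary="gpt-4o-mini", fallback="gpt-4o"):
    """Route to a secondary model if the primary model keeps failing."""
    try:
        yield from stream_chat(messages, model=primary)
    except Exception:
        yield from stream_chat(messages, model=fallback)

if __name__ == "__main__":
    msgs = [{"role": "user", "content": "Say hello in one sentence."}]
    for piece in chat_with_fallback(msgs):
        print(piece, end="", flush=True)
    print()
```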

Domain Customization & Fine-tuning

  • Develop domain-specific prompt engineering strategies
  • Implement RAG (Retrieval Augmented Generation) pipelines using vector databases
  • Fine-tune or adapt models for specific use cases using techniques like LoRA, prompt tuning
  • Create and maintain knowledge bases for domain-specific responses
  • Implement few-shot learning and in-context learning strategies
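
For the RAG bullet above, a bare-bones retrieval step might look like the sketch below, using ChromaDB's in-memory client and default embedding function; the collection name, documents, and prompt wording are purely illustrative.

```python
# Minimal RAG sketch: retrieve top-k chunks from a vector store and build a
# grounded prompt. Documents and prompt wording are placeholders.
import chromadb

client = chromadb.Client()
collection = client.create_collection("domain_kb")

# In a real pipeline these would be chunked documents from the knowledge base.
collection.add(
    ids=["doc1", "doc2"],
    documents=[
        "Refunds are processed within 5 business days.",
        "Premium support is available 24/7 for enterprise plans.",
    ],
)

def build_rag_prompt(question: str, k: int = 2) -> str:
    """Retrieve the k most similar chunks and inline them as context."""
    results = collection.query(query_texts=[question], n_results=k)
    context = "\n".join(results["documents"][0])
    return (
        "Answer using only the context below. If the answer is not in the "
        f"context, say you don't know.\n\nContext:\n{context}\n\nQuestion: {question}"
    )

print(build_rag_prompt("How long do refunds take?"))
```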

Performance & Optimization

  • Optimize API response times and reduce latency
  • Implement caching strategies for common queries
  • Monitor and optimize token usage to control costs
  • A/B test different models and prompts for quality improvements
  • Build fallback chains (primary/secondary model routing)
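
As an example of the caching bullet above, here is a small sketch of a Redis-backed response cache keyed on a hash of the normalized prompt; the TTL and key scheme are arbitrary choices, not a prescribed design.

```python
# Sketch of a response cache for common queries, assuming a local Redis
# instance (Redis appears in the stack listed below); TTL and key scheme
# are arbitrary.
import hashlib
import redis

cache = redis.Redis(host="localhost", port=6379, decode_responses=True)
CACHE_TTL_SECONDS = 3600

def cache_key(model: str, prompt: str) -> str:
    # Normalize whitespace and case so trivially different phrasings hit
    # the same entry.
    normalized = " ".join(prompt.lower().split())
    return "llm:" + hashlib.sha256(f"{model}:{normalized}".encode()).hexdigest()

def cached_completion(model: str, prompt: str, generate) -> str:
    """Return a cached answer if present; otherwise call `generate` and store it."""
    key = cache_key(model, prompt)
    hit = cache.get(key)
    if hit is not None:
        return hit
    answer = generate(model, prompt)  # e.g. a wrapper around the LLM API
    cache.setex(key, CACHE_TTL_SECONDS, answer)
    return answer
```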

Safety & Quality

  • Implement content moderation and safety filters
  • Build guardrails to prevent prompt injection and jailbreaking
  • Develop evaluation frameworks to measure response quality
  • Monitor and handle hallucinations and inaccuracies
  • Implement user feedback loops for continuous improvement
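
One possible shape for the guardrail work above is sketched below: a heuristic deny-list check for common prompt-injection phrases plus a moderation call, assuming the OpenAI Python SDK; the phrase list is illustrative and by no means a complete defense.

```python
# Rough sketch of input guardrails: a heuristic prompt-injection check plus a
# moderation call. The phrase list is illustrative, not exhaustive.
import re
from openai import OpenAI

client = OpenAI()

INJECTION_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"reveal (your|the) system prompt",
    r"you are now in developer mode",
]

def looks_like_injection(text: str) -> bool:
    lowered = text.lower()
    return any(re.search(pattern, lowered) for pattern in INJECTION_PATTERNS)

def is_safe_input(text: str) -> bool:
    """Reject obvious injection attempts and content flagged by moderation."""
    if looks_like_injection(text):
        return False
    result = client.moderations.create(input=text)
    return not result.results[0].flagged

# Usage: only forward user messages that pass the check to the LLM.
# if is_safe_input(user_message): ...
```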

Infrastructure

  • Design scalable architecture for handling concurrent LLM requests
  • Implement queue systems for managing high-volume API calls
  • Set up monitoring and logging for LLM interactions
  • Work with DevOps to deploy models (if self-hosted)
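
To illustrate the queueing bullet above, here is a minimal asyncio sketch that bounds concurrent LLM calls with a fixed worker pool; call_llm is a stand-in for a real async API wrapper, and a production system would more likely sit behind a broker such as Redis or RabbitMQ.

```python
# Minimal sketch of bounding concurrent LLM requests with an asyncio queue and
# a fixed worker pool; `call_llm` stands in for the real async API wrapper.
import asyncio

MAX_WORKERS = 5  # placeholder concurrency limit

async def call_llm(prompt: str) -> str:
    await asyncio.sleep(0.1)  # simulate API latency
    return f"answer to: {prompt}"

async def worker(queue: asyncio.Queue):
    while True:
        prompt, future = await queue.get()
        try:
            future.set_result(await call_llm(prompt))
        except Exception as exc:
            future.set_exception(exc)
        finally:
            queue.task_done()

async def main():
    queue: asyncio.Queue = asyncio.Queue()
    workers = [asyncio.create_task(worker(queue)) for _ in range(MAX_WORKERS)]
    loop = asyncio.get_running_loop()
    futures = []
    for i in range(20):
        fut = loop.create_future()
        await queue.put((f"question {i}", fut))
        futures.append(fut)
    print(await asyncio.gather(*futures))
    for w in workers:
        w.cancel()

asyncio.run(main())
```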

Required Skills & Experience

Must Have:

  • Experience working with LLMs and GenAI technologies
  • Strong experience with OpenAI API, Anthropic Claude, or similar LLM APIs
  • Proficiency in Python (FastAPI, LangChain, LlamaIndex preferred)
  • Strong understanding of prompt engineering techniques and best practices
  • Experience with vector databases (Pinecone, Weaviate, Qdrant, ChromaDB)
  • Knowledge of RAG (Retrieval Augmented Generation) implementation
  • Understanding of transformer architecture and attention mechanisms
  • Experience with API integration, webhooks, and streaming responses
  • Strong problem-solving skills and ability to debug complex AI systems
  • Experience with LangChain, LlamaIndex, or similar LLM frameworks
  • Knowledge of fine-tuning techniques (LoRA, QLoRA, PEFT)
  • Experience with embedding models and semantic search
  • Familiarity with the HuggingFace Transformers library
  • Experience deploying models using vLLM or TGI (Text Generation Inference)
  • Knowledge of function calling/tool use with LLMs
  • Experience with model evaluation metrics (BLEU, ROUGE, BERTScore)
  • Understanding of token economics and cost optimization
  • Experience with open-source models (Llama, Mistral, Falcon)
  • Knowledge of model quantization and optimization techniques
  • Experience with multi-modal models (vision, audio)
  • Familiarity with MLOps practices and experiment tracking (Weights & Biases, MLflow)
  • Experience with AWS SageMaker, Google Vertex AI, or Azure ML
  • Understanding of chain-of-thought prompting, ReAct, and agents
  • Experience building chatbots or conversational AI systems
  • Publications or contributions to the AI/ML community
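
Since token economics appears in the list above, here is a tiny token-counting sketch using tiktoken; the per-1K-token price is a placeholder, not an actual provider rate.

```python
# Tiny sketch of token counting for cost estimates, using tiktoken; the price
# below is a placeholder, not a real rate.
import tiktoken

PRICE_PER_1K_INPUT_TOKENS = 0.001  # placeholder, check the provider's pricing

def estimate_cost(prompt: str, encoding_name: str = "cl100k_base") -> tuple[int, float]:
    enc = tiktoken.get_encoding(encoding_name)
    n_tokens = len(enc.encode(prompt))
    return n_tokens, n_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS

tokens, cost = estimate_cost("Summarize our refund policy in two sentences.")
print(f"{tokens} tokens, ~${cost:.6f} input cost")
```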

Technical Stack You'll Work With

  • Languages: Python (primary), JavaScript/TypeScript (basic understanding)
  • LLM APIs: OpenAI, Anthropic, Google Gemini, Cohere
  • Frameworks: LangChain, LlamaIndex, FastAPI
  • Vector DBs: Pinecone, Weaviate, Qdrant, or ChromaDB
  • Infrastructure: Docker, Redis, PostgreSQL, Message Queues
  • Cloud: AWS/GCP/Azure (whatever your team uses)
  • Monitoring: Prometheus, Grafana, custom LLM analytics

Company: Smartncode
Industry: Information Technology
Location: Tech City
