Senior LLM Engineer (4+ years) — Hybrid, India
About The Opportunity
A technology consulting firm operating at the intersection of Enterprise AI, Generative AI and Cloud Engineering seeks an experienced LLM-focused engineer. You will build and productionise LLM-powered products and integrations for enterprise customers across knowledge management, search, automation, and conversational AI use-cases. This is a hybrid role based in India for candidates with strong hands-on LLM engineering experience.
Role & Responsibilities
- Own design and implementation of end-to-end LLM solutions: data ingestion → retrieval (RAG) → fine-tuning → inference and monitoring for production workloads.
- Develop robust Python microservices to serve LLM inference, retrieval, and agentic workflows using LangChain/LangGraph or equivalent toolkits.
- Implement and optimise vector search pipelines (FAISS/Pinecone/Milvus), embedding generation, chunking strategies, and relevance tuning for sub-second retrieval.
- Perform parameter-efficient fine-tuning (LoRA/adapters) and evaluation workflows; manage model versioning and automated validation for quality and safety.
- Containerise and deploy models and services with Docker and Kubernetes; integrate with cloud infra (AWS/Azure/GCP) and CI/CD for repeatable delivery.
- Establish observability, alerting, and performance SLAs for LLM services; collaborate with cross-functional teams to define success metrics and iterate rapidly.
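To give a flavour of the retrieval work described above, here is a minimal, illustrative sketch of chunking and cosine-similarity vector search in pure Python. It is a toy only: a production pipeline would use a vector store such as FAISS, Pinecone, or Milvus and learned embeddings rather than the bag-of-characters stand-in used here.

```python
import math
from collections import Counter

def chunk(text, size=40, overlap=10):
    """Split text into overlapping character chunks (toy chunking strategy)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text):
    """Toy bag-of-characters 'embedding'; real systems use learned embedding models."""
    counts = Counter(text.lower())
    return {ch: n for ch, n in counts.items() if ch.isalnum()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query embedding."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

docs = chunk("Vector search finds the chunks most similar to a query embedding. "
             "Chunking strategy and overlap strongly affect retrieval relevance.")
print(retrieve("similar chunks for a query", docs, k=1))
```

The same shape (chunk, embed, score, rank) carries over when the dict-based toy is replaced by dense vectors in a real index; chunk size and overlap are exactly the relevance-tuning knobs the role covers.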
Skills & Qualifications
Must-Have
- 4+ years engineering experience with 2+ years working directly on LLM/Generative AI projects.
- Strong Python skills and hands-on experience with PyTorch and HuggingFace/transformers libraries.
- Practical experience building RAG pipelines, vector search (FAISS/Pinecone/Milvus), and embedding workflows.
- Experience with fine-tuning strategies (LoRA/adapters) and evaluation frameworks for model quality and safety.
- Familiarity with Docker, Kubernetes, cloud deployment (AWS/Azure/GCP), and Git-based CI/CD workflows.
- Solid understanding of prompt engineering, retrieval strategies, and production monitoring of ML services.
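For context on the parameter-efficient fine-tuning requirement, a back-of-the-envelope sketch of why LoRA is attractive: instead of updating a full weight matrix, LoRA trains two low-rank factors. The dimensions below are hypothetical, chosen only to illustrate the arithmetic.

```python
def lora_params(d_in, d_out, r):
    """LoRA replaces a full d_out x d_in weight update with two low-rank
    factors B (d_out x r) and A (r x d_in), shrinking trainable params
    from d_out * d_in down to r * (d_out + d_in)."""
    full = d_out * d_in
    lora = r * (d_out + d_in)
    return full, lora

# Hypothetical 4096x4096 attention projection with rank r=8:
full, lora = lora_params(4096, 4096, r=8)
print(f"full update: {full:,} params, LoRA r=8: {lora:,} params "
      f"({100 * lora / full:.2f}% of full)")
```

The rank `r` trades adaptation capacity against trainable-parameter count, which is why evaluation workflows matter alongside the fine-tuning itself.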
Preferred
- Experience with LangChain/LangGraph, agent frameworks, or building tool-calling pipelines.
- Exposure to MLOps platforms, model registry, autoscaling low-latency inference, and cost-optimisation techniques.
- Background in productionising LLMs for enterprise use-cases (knowledge bases, search, virtual assistants).
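As a framework-agnostic illustration of the tool-calling pattern mentioned above: the model emits a structured call, and application code dispatches it to a registered function. The tool names and JSON shape here are made up for the sketch; LangChain/LangGraph and provider SDKs each define their own schemas.

```python
import json

# Toy tool registry: maps a tool name to a plain Python callable.
TOOLS = {
    "get_weather": lambda city: f"22C and clear in {city}",
    "add": lambda a, b: a + b,
}

def dispatch(model_output: str):
    """Parse a JSON tool call like {"tool": "add", "args": {"a": 2, "b": 3}}
    and invoke the matching registered function with its arguments."""
    call = json.loads(model_output)
    fn = TOOLS[call["tool"]]
    return fn(**call["args"])

print(dispatch('{"tool": "add", "args": {"a": 2, "b": 3}}'))  # → 5
```

Production versions add schema validation, error handling for unknown tools, and a loop that feeds tool results back to the model.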
Benefits & Culture Highlights
- Hybrid work model with flexible in-office collaboration and remote days; competitive market compensation.
- Opportunity to work on high-impact enterprise AI initiatives and shape production-grade GenAI patterns across customers.
- Learning-first culture: access to technical mentorship, experimentation environments, and conferences/learning stipend.
To apply: include a brief portfolio of LLM projects, links to relevant repositories or demos, and a summary of production responsibilities. This role is ideal for engineers passionate about turning cutting-edge LLM research into reliable, scalable enterprise solutions.
Skills: OpenAI, Gemini, LLMs