Primary Job Title: Machine Learning Engineer (LLM & RAG)
Industry: Enterprise AI / Software & Cloud Solutions. Sector: Large Language Model (LLM) applications, Retrieval-Augmented Generation (RAG), and production ML services for business workflows. Location: India (Remote).
About The Opportunity
Join a fast-moving engineering team building production-grade LLM-powered services and RAG pipelines that enable intelligent search, document understanding, and agentic automation for enterprise customers. You will design, implement, and operate scalable retrieval, embedding, and inference pipelines, turning research-grade models into reliable, low-latency products.
Role & Responsibilities
- Design and implement end-to-end RAG workflows: document ingestion, embedding generation, vector indexing, retrieval, and LLM inference.
- Develop robust Python services that integrate Transformers-based models, LangChain pipelines, and vector search (FAISS/Milvus) for production APIs.
- Optimize embedding strategies, retrieval quality, and prompt templates to improve relevance, latency, and cost-efficiency.
- Build scalable inference stacks with serving, batching, caching, and monitoring to meet SLA targets for throughput and latency.
- Collaborate with data scientists and product teams to evaluate model architectures, run A/B tests, and implement continuous retraining/validation loops.
- Implement observability, CI/CD, and reproducible deployments (Docker-based containers, model versioning, and automated tests).
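To make the first responsibility concrete, the RAG loop above (ingest documents, embed, index, retrieve, then pass context to the LLM) can be sketched in miniature. This is a toy illustration only: the bag-of-words "embedding" and linear scan stand in for a Transformer encoder and a FAISS/Milvus index, and the document snippets are invented examples.

```python
# Toy sketch of the RAG retrieval loop: embed -> index -> retrieve -> prompt.
# Real systems would use learned embeddings and an ANN index (FAISS/Milvus).
import math
from collections import Counter

def embed(text):
    # Bag-of-words stand-in for a Transformer embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    # Linear scan stands in for a vector-index lookup.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "Invoices are processed within 30 days of receipt.",
    "Employees accrue vacation monthly.",
]
query = "when are invoices processed"
context = retrieve(query, docs)[0]
# The retrieved context is stuffed into the LLM prompt template.
prompt = f"Context: {context}\nQuestion: {query}"
```

In production, each stage becomes a swappable component (ingestion workers, an embedding service, a vector store, a prompt/inference layer), which is what the LangChain-style orchestration mentioned below coordinates.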
Skills & Qualifications
Must-Have
- 4+ years of professional experience in ML or software engineering with hands-on LLM/RAG work.
- Strong Python programming and system-design skills for production services.
- Experience with Transformers-based models and fine-tuning/inference workflows.
- Proven experience building retrieval pipelines using vector search (FAISS, Milvus) and embeddings.
- Familiarity with LangChain or equivalent orchestration libraries for LLM workflows.
- Practical experience containerizing and deploying ML workloads (Docker, CI/CD, basic infra automation).
Preferred
- Experience with cloud ML infra (AWS, Azure or GCP) and model serving at scale.
- Familiarity with Kubernetes or other orchestration for production deployments.
- Experience with retrieval evaluation, relevance metrics, and A/B experimentation.
Benefits & Culture Highlights
- Fully remote role with flexible hours and an outcomes-driven culture.
- Opportunity to ship end-to-end LLM products and influence architecture choices.
- Mentorship-oriented environment with access to modern tools and model stacks.
Why apply: This role offers hands-on ownership of RAG systems and LLM deployment in production, ideal for engineers who want to move fast, optimize for real-world impact, and work with cutting-edge LLM tooling.
Skills: python, backend, rag, llm