Posted: 6 days ago | On-site | Full Time
Key Responsibilities:
- Develop and fine-tune LLMs (e.g., GPT-4, Claude, LLaMA, Mistral, Gemini) using instruction tuning, prompt engineering, chain-of-thought prompting, and fine-tuning techniques.
- Build RAG pipelines: implement Retrieval-Augmented Generation solutions leveraging embeddings, chunking strategies, and vector databases such as FAISS, Pinecone, Weaviate, and Qdrant.
- Implement and orchestrate agents: use frameworks such as MCP, the OpenAI Agents SDK, LangChain, LlamaIndex, Haystack, and DSPy to build dynamic multi-agent systems and serverless GenAI applications.
- Deploy models at scale: manage model deployment using Hugging Face, Azure Web Apps, vLLM, and Ollama, including handling local models with GGUF, LoRA/QLoRA, PEFT, and quantization methods.
- Integrate APIs: integrate seamlessly with APIs from OpenAI, Anthropic, Cohere, Azure, and other GenAI providers.
- Ensure security and compliance: implement guardrails, perform PII redaction, ensure secure deployments, and monitor model performance using observability tools.
- Optimize and monitor: lead LLMOps practices focused on performance monitoring, cost optimization, and model evaluation.
- Work with AWS services: hands-on use of AWS Bedrock, SageMaker, S3, Lambda, API Gateway, IAM, CloudWatch, and serverless computing to deploy and manage scalable AI solutions.
- Contribute to use cases: develop AI-driven solutions such as AI copilots, enterprise search engines, summarizers, and intelligent function-calling systems.
- Collaborate cross-functionally: work closely with product, data, and DevOps teams to deliver scalable, secure AI products.

Required Skills and Experience:
- Deep knowledge of LLMs and foundation models (GPT-4, Claude, Mistral, LLaMA, Gemini).
- Strong expertise in prompt engineering, chain-of-thought reasoning, and fine-tuning methods.
- Proven experience building RAG pipelines and working with modern vector stores (FAISS, Pinecone, Weaviate, Qdrant).
- Hands-on proficiency with the LangChain, LlamaIndex, Haystack, and DSPy frameworks.
- Model deployment skills using Hugging Face, vLLM, and Ollama, including handling LoRA/QLoRA, PEFT, and GGUF models.
- Practical experience with AWS serverless services: Lambda, S3, API Gateway, IAM, CloudWatch.
- Strong coding ability in Python or similar programming languages.
- Experience with MLOps/LLMOps for monitoring, evaluation, and cost management.
- Familiarity with security standards: guardrails, PII protection, secure API interactions.
- Use-case delivery experience: a proven record of delivering AI copilots, summarization engines, or enterprise GenAI applications.
- 6-8 years of experience in AI/ML roles, focusing on LLM agent development, data science workflows, and system deployment.
- Demonstrated experience designing domain-specific AI systems and integrating structured and unstructured data into AI models.
- Proficiency in designing scalable solutions using LangChain and vector databases.

Job Type: Full-time
Pay: ₹2,000,000.00 - ₹3,000,000.00 per year
Benefits: Health insurance
Schedule: Monday to Friday
Work Location: In person
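For candidates unfamiliar with the RAG responsibility described above, the core retrieval step can be sketched in miniature. This is a toy illustration only: the bag-of-words "embedding" stands in for a real embedding model, and an in-memory list stands in for a vector store such as FAISS; the function names (`chunk`, `embed`, `retrieve`) are illustrative, not from any library.

```python
# Toy sketch of RAG retrieval: chunk a document, "embed" each chunk,
# and return the chunk most similar to the query by cosine similarity.
# A production pipeline would use a real embedding model and a vector
# database (FAISS, Pinecone, Weaviate, Qdrant) in place of these stand-ins.
import math
import re
from collections import Counter

def chunk(text, size=7):
    """Split text into fixed-size word windows (a simple chunking strategy)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Stand-in embedding: a bag-of-words term-count vector."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

doc = ("LoRA adds low-rank adapters to frozen weights. "
       "FAISS indexes dense vectors for similarity search. "
       "Guardrails redact PII before responses leave the system.")
top = retrieve("How does FAISS search vectors?", chunk(doc))
print(top[0])  # the chunk about FAISS vector search
```

The retrieved chunk would then be inserted into the LLM prompt as grounding context, which is the "generation" half of RAG that the responsibilities above refer to.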