Posted: 3 weeks ago
Work from Office
Full Time
Role Overview

Join our Pune AI Center of Excellence to drive software and product development in the AI space. As an AI/ML Engineer, you'll build and ship core components of our AI products, owning end-to-end RAG pipelines, persona-driven fine-tuning, and scalable inference systems that power next-generation user experiences.

Key Responsibilities

Model Fine-Tuning & Persona Design
- Adapt and fine-tune open-source large language models (LLMs) such as CodeLlama and StarCoder to specific product domains.
- Define and implement “personas” (tone, knowledge scope, guardrails) at inference time to align with product requirements.

RAG Architecture & Vector Search
- Build retrieval-augmented generation systems: ingest documents, compute embeddings, and serve them with FAISS, Pinecone, or ChromaDB (see the RAG sketch below).
- Design semantic chunking strategies and optimize context-window management for product scalability.

Software Pipeline & Product Integration
- Develop production-grade Python data pipelines (ETL) for real-time vector indexing and updates.
- Containerize model services with Docker/Kubernetes and integrate them into CI/CD workflows for rapid iteration.

Inference Optimization & Monitoring
- Quantize and benchmark models for CPU/GPU efficiency; implement dynamic batching and caching to meet product SLAs.
- Instrument monitoring dashboards (Prometheus/Grafana) to track latency, throughput, error rates, and cost.

Prompt Engineering & UX Evaluation
- Craft, test, and iterate on prompts for chatbots, summarization, and content extraction within the product UI.
- Define and track evaluation metrics (ROUGE, BLEU, human feedback) to continuously improve the product’s AI outputs.

Must-Have Skills
- ML/AI Experience: 3–4 years in machine learning and generative AI, including 18 months on LLM-based products.
- Programming & Frameworks: Python, PyTorch (or TensorFlow), Hugging Face Transformers.
- RAG & Embeddings: Hands-on with FAISS, Pinecone, or ChromaDB and semantic chunking.
- Fine-Tuning & Quantization: Experience with LoRA/QLoRA, 4-bit/8-bit quantization, and the Model Context Protocol (MCP) (see the fine-tuning sketch below).
- Prompt & Persona Engineering: Deep expertise in prompt tuning and persona specification for product use cases.
- Deployment & Orchestration: Docker, Kubernetes fundamentals, CI/CD pipelines, and GPU setup.

Nice-to-Have
- Multi-modal AI combining text, images, or tabular data.
- Agentic AI systems with reasoning and planning loops.
- Knowledge-graph integration for enhanced retrieval.
- Cloud AI services (AWS SageMaker, GCP Vertex AI, or Azure Machine Learning).
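For reference, the RAG responsibilities above reduce to an ingest, chunk, embed, index, and retrieve loop. The following is a minimal illustrative sketch, assuming sentence-transformers for embeddings, FAISS for the vector index, and naive fixed-size chunking; the model name, documents, and chunk sizes are placeholders, not anything specified in this posting.

```python
# Minimal RAG ingestion/retrieval sketch (illustrative; library and model choices are assumptions).
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    # Naive fixed-window chunking; a production pipeline would chunk semantically.
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

docs = [
    "FAISS is a library for efficient similarity search over dense vectors.",
    "Pinecone and ChromaDB are hosted/embedded alternatives for vector storage.",
]
chunks = [c for d in docs for c in chunk(d)]

# Embed chunks and build an inner-product index (cosine similarity on normalized vectors).
model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model
embeddings = model.encode(chunks, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(np.asarray(embeddings, dtype="float32"))

# Retrieve the top-k chunks for a query; these would be placed in the LLM's context window.
query_vec = model.encode(["What is FAISS used for?"], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query_vec, dtype="float32"), 2)
context = "\n".join(chunks[i] for i in ids[0])
print(context)
```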
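Likewise, the LoRA/QLoRA and 4-bit quantization skills listed above correspond to a setup along these lines. This fine-tuning sketch assumes Hugging Face transformers, peft, and bitsandbytes; the base model ID, target modules, and hyperparameters are illustrative choices rather than values taken from the posting.

```python
# QLoRA-style setup sketch: 4-bit quantized base model with LoRA adapters (illustrative).
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_id = "codellama/CodeLlama-7b-hf"  # placeholder base model

# Load the base model in 4-bit NF4 precision so it fits on a single GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)

# Attach low-rank adapters; only these small matrices are trained.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```

Training itself would then proceed with a standard Trainer or SFT loop over domain- or persona-specific data.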
Smartavya Analytica