Posted: 7 hours ago
On-site
Full Time
Role Overview
We are seeking a highly skilled GenAI Lead who can drive the development, optimization, and deployment of advanced LLMs, VLMs, and multimodal AI systems. You will lead the GenAI team, translate business requirements into technical solutions, fine-tune foundation models, design retrieval architectures, and ensure all models are production-ready with optimized inference pipelines.
Key Responsibilities
• Lead the design, development, and enhancement of LLMs, VLMs, RAG systems, and multimodal generation pipelines for production use cases.
• Understand business requirements and convert them into scalable, high-performance AI model architectures and workflows.
• Fine-tune and customize Transformer-based models using proprietary datasets, advanced training strategies, and evaluation frameworks.
• Optimize tokenization, embedding generation, vector search, and retrieval flows for high-throughput applications.
• Develop high-performance inference pipelines using ONNX, TensorRT, quantization, batching, streaming, and GPU/accelerator optimizations.
• Ensure all models are production-grade—robust, scalable, monitored, and integrated into backend systems.
• Lead and mentor the GenAI engineering team, conduct code/model reviews, and drive overall technical direction.
• Research and evaluate cutting-edge architectures in multimodal models, generative AI, and retrieval-augmented techniques.
• Architect end-to-end GenAI systems, including training, fine-tuning, inference serving, and continuous model improvement.
• Work with backend teams to integrate models into scalable APIs using Triton, TensorRT, ONNX Runtime, vLLM, or custom inference engines.
• Build model evaluation pipelines—BLEU, ROUGE, alignment tests, hallucination checks, safety filters, and latency/throughput benchmarks.
• Own the roadmap for LLM/VLM improvements and drive experimentation with new architectures (Mixture-of-Experts, diffusion-based multimodal, etc.).
• Collaborate cross-functionally with product, backend, ML, and DevOps teams to deliver end-to-end GenAI features.
• Maintain documentation, ensure reproducibility, and follow best practices in model governance, versioning, and monitoring.
• Mentor the team in training deep learning models, optimizing memory/GPU usage, and deploying large-scale inference systems.
Requirements
• 4–8+ years of experience in applied machine learning, deep learning, GenAI, or multimodal systems.
• Proven expertise with Transformers, LLMs, VLMs, diffusion models, and retrieval-augmented systems.
• Hands-on experience with Python, PyTorch, TensorFlow, Hugging Face, LangChain, and modern training pipelines.
• Strong knowledge of vector databases (FAISS, Pinecone, Milvus, Chroma).
• Expert-level experience with ONNX, TensorRT, quantization, model optimization, and inference engines (vLLM, FasterTransformer, Triton).
• Solid understanding of distributed training, GPU utilization, mixed precision, and large-scale model serving.
• Ability to lead teams, plan AI architecture, review work, and deliver production-quality AI systems.
Note: We also accept international applicants.
EROS GenAI