Gen AI Lead Engineer

8 years

Posted: 7 hours ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

GenAI Lead Engineer


Role Overview

We are seeking a highly skilled GenAI Lead who can drive the development, optimization, and deployment of advanced LLMs, VLMs, and multimodal AI systems. You will lead the GenAI team, translate business requirements into technical solutions, fine-tune foundation models, design retrieval architectures, and ensure all models are production-ready with optimized inference pipelines.


Key Roles

• Lead the design, development, and enhancement of LLMs, VLMs, RAG systems, and multimodal generation pipelines for production use cases.

• Understand business requirements and convert them into scalable, high-performance AI model architectures and workflows.

• Fine-tune and customize Transformer-based models using proprietary datasets, advanced training strategies, and evaluation frameworks.

• Optimize tokenization, embedding generation, vector search, and retrieval flows for high-throughput applications.

• Develop high-performance inference pipelines using ONNX, TensorRT, quantization, batching, streaming, and GPU/accelerator optimizations.

• Ensure all models are production-grade: robust, scalable, monitored, and integrated into backend systems.

• Lead and mentor the GenAI engineering team, conduct code/model reviews, and drive overall technical direction.

• Research and evaluate cutting-edge architectures in multimodal models, generative AI, and retrieval-augmented techniques.


Responsibilities

• Architect end-to-end GenAI systems, including training, fine-tuning, inference serving, and continuous model improvement.

• Work with backend teams to integrate models into scalable APIs using Triton, TensorRT, ONNX Runtime, vLLM, or custom inference engines.

• Build model evaluation pipelines covering BLEU, ROUGE, alignment tests, hallucination checks, safety filters, and latency/throughput benchmarks.

• Own the roadmap for LLM/VLM improvements and drive experimentation with new architectures (Mixture-of-Experts, diffusion-based multimodal, etc.).

• Collaborate cross-functionally with product, backend, ML, and DevOps teams to deliver end-to-end GenAI features.

• Maintain documentation, ensure reproducibility, and follow best practices in model governance, versioning, and monitoring.

• Mentor the team in training deep learning models, optimizing memory/GPU usage, and deploying large-scale inference systems.


Required Qualifications

• 4–8+ years of experience in applied machine learning, deep learning, GenAI, or multimodal systems.

• Proven expertise with Transformers, LLMs, VLMs, diffusion models, and retrieval-augmented systems.

• Hands-on experience with Python, PyTorch, TensorFlow, Hugging Face, LangChain, and modern training pipelines.

• Strong knowledge of vector databases (FAISS, Pinecone, Milvus, Chroma).

• Expert-level experience with ONNX, TensorRT, quantization, model optimization, and inference engines (vLLM, FasterTransformer, Triton).

• Solid understanding of distributed training, GPU utilization, mixed precision, and large-scale model serving.

• Ability to lead teams, plan AI architecture, review work, and deliver production-quality AI systems.


Note: We also accept international applicants.
