Contezy builds scalable web experiences and AI-driven systems that power automation and decision-making at scale. We emphasize reproducibility, cost-effective performance, and secure deployment of machine learning infrastructure.

Position Summary

The AI / ML Developer will architect and maintain self-hosted LLM systems for retrieval-augmented generation, task-specific assistants, knowledge indexing, and real-time inference. You'll work across model selection, fine-tuning, dataset engineering, deployment, and monitoring pipelines.

Key Responsibilities

Select and benchmark LLMs based on performance, latency, and cost trade-offs.
Fine-tune and adapt models via supervised fine-tuning, LoRA, or PEFT using curated datasets.
Implement retrieval-augmented generation (RAG) with vector stores and embedding workflows.
Develop scalable model-serving APIs and inference systems (multi-GPU, quantized models, batching).
Containerize and deploy models using Docker and Kubernetes with CI/CD workflows.
Optimize inference performance with quantization, ONNX, and accelerated runtimes.
Instrument observability and performance metrics: latency, throughput, and cost monitoring.
Collaborate with cross-functional teams to integrate models into production systems.

Required Skills & Experience

2+ years of professional experience in AI/ML engineering, with hands-on LLM deployment.
Expertise in Python, PyTorch, and ML pipeline development.
Experience with self-hosting LLMs, model serving (vLLM, Text-Generation-Inference, etc.), and GPU optimization.
Strong understanding of containerization (Docker/Kubernetes) and backend APIs (FastAPI/Flask).
Knowledge of vector databases (FAISS, Milvus, Pinecone) and retrieval strategies.
Familiarity with quantization, LoRA fine-tuning, and deployment optimization for cost efficiency.

Preferred Skills

Experience with Hugging Face Transformers, Accelerate, and LangChain.
Knowledge of open models (Llama, Mistral, Falcon, Mixtral, etc.) and quantized inference frameworks (GGUF, bitsandbytes, llama.cpp).
Exposure to MLOps, model observability, and reproducible training workflows.
Experience fine-tuning or benchmarking foundation models on custom datasets.

Disclaimer: The job location mentioned in this description is based on publicly available information or company headquarters. Candidates are advised to verify the exact job location directly with the employer before applying.

AI / ML Developer (LLMs & Self-Hosting)

Location: Vadodara
Experience: 2.0 - 7.0 yrs
Salary: INR 9 - 13 Lacs
