Posted: 2 weeks ago
On-site
Full Time
ML Inference & Optimization Engineer
Location: Mumbai | Experience: 2–4 years

You will be responsible for deploying and scaling domain- and task-specific LLMs and deep learning models for real-time and batch inference. You'll work on quantization, model optimization, runtime tuning, and performance-critical serving.

What You'll Do
- Integrate models into containerized services and APIs, and build high-performance inference pipelines optimized for latency, concurrency, and cost
- Deploy and optimize LLMs using vLLM, TGI, SGLang, Triton, TensorRT, etc.
- Implement model quantization, speculative decoding, KV cache optimization, dynamic batching, etc.
- Benchmark model throughput and latency across cloud VM configurations
- Debug performance bottlenecks: VRAM usage, token sampling speed, latency, instability
- Collaborate with the infra team on scaling and observability
- Monitor and troubleshoot inference performance, ensuring system reliability and efficiency
- Stay abreast of advancements in model inference technologies and best practices

You Bring
- 3+ years of experience deploying and optimizing machine learning models in production, with 1+ years deploying deep learning models
- Experience deploying async inference APIs (FastAPI, gRPC, Ray Serve, etc.; see the sketch after this list)
- Understanding of PyTorch internals and inference-time optimization
- Familiarity with LLM runtimes: vLLM, TGI, TensorRT-LLM, ONNX Runtime, etc.
- Familiarity with GPU profiling tools (Nsight, nvtop) and model quantization pipelines
- Bonus: prior work on Elasticsearch, distributed KV cache, or custom tokenizers
- Bachelor's degree in Computer Science, Engineering, or a related field
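For candidates unfamiliar with the async serving pattern referenced above, the following is a minimal illustrative sketch, assuming FastAPI and a Hugging Face causal LM; the model name, endpoint path, and request schema are placeholders invented for this example and are not part of this role's actual stack.

```python
# Illustrative sketch only: a minimal async inference endpoint. Model name,
# endpoint path, and request schema are assumptions made up for this example.
import asyncio

import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; a real deployment would load the task-specific LLM

app = FastAPI()
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()
if torch.cuda.is_available():
    model.to("cuda")


class GenerateRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 64


class GenerateResponse(BaseModel):
    completion: str


def _generate_sync(prompt: str, max_new_tokens: int) -> str:
    # Plain greedy decoding; production serving would add dynamic batching,
    # sampling controls, and a dedicated runtime such as vLLM or TensorRT-LLM.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


@app.post("/generate", response_model=GenerateResponse)
async def generate(req: GenerateRequest) -> GenerateResponse:
    # Offload the blocking generate() call to a worker thread so the event loop
    # keeps accepting new requests while the GPU is busy.
    text = await asyncio.to_thread(_generate_sync, req.prompt, req.max_new_tokens)
    return GenerateResponse(completion=text)


# Run with, for example: uvicorn app:app --host 0.0.0.0 --port 8000
```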