AI Platform Engineer - 2-4 years exp.

Experience: 3 years

Salary: 6 Lacs

Posted: 14 hours ago | Platform: GlassDoor


Work Mode: On-site

Job Type: Full Time

Job Description

Job Specification: AI Platform Engineer

About the Role

We are seeking an AI Platform Engineer to build and scale the infrastructure that powers our production AI services. You will take cutting-edge models, ranging from speech recognition (ASR) to large language models (LLMs), and deploy them into highly available, developer-friendly APIs.

You will be responsible for building the bridge between the R&D team, who train models, and the applications that consume them. This means developing robust APIs, deploying and optimizing models on Triton Inference Server (or similar frameworks), and ensuring real-time, scalable inference.
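
To make that bridge concrete, here is a minimal sketch, assuming a FastAPI service that forwards requests to a model hosted on Triton Inference Server over HTTP. The model name ("asr_en") and tensor names ("AUDIO", "TRANSCRIPT") are illustrative placeholders, not anything specified by this posting.

```python
# Minimal sketch: a FastAPI endpoint that proxies to a Triton-hosted ASR model.
# Model and tensor names are assumptions; they depend on the actual model repository.
import numpy as np
import tritonclient.http as triton_http
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
triton = triton_http.InferenceServerClient(url="localhost:8000")

class TranscribeRequest(BaseModel):
    samples: list[float]  # raw audio samples, resampled client-side

@app.post("/v1/transcribe")
def transcribe(req: TranscribeRequest) -> dict:
    audio = np.asarray(req.samples, dtype=np.float32).reshape(1, -1)
    infer_input = triton_http.InferInput("AUDIO", list(audio.shape), "FP32")
    infer_input.set_data_from_numpy(audio)
    result = triton.infer(
        model_name="asr_en",
        inputs=[infer_input],
        outputs=[triton_http.InferRequestedOutput("TRANSCRIPT")],
    )
    text = result.as_numpy("TRANSCRIPT")[0]
    return {"text": text.decode() if isinstance(text, bytes) else str(text)}
```

A production version would add batching, streaming, and authentication; the sketch only illustrates the API-around-model pattern the role describes.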

Responsibilities

● API Development
  ○ Design, build, and maintain production-ready APIs for speech, language, and other AI models.
  ○ Provide SDKs and documentation to enable easy developer adoption.
● Model Deployment
  ○ Deploy models (ASR, LLM, and others) using Triton Inference Server or similar systems.
  ○ Optimize inference pipelines for low-latency, high-throughput workloads.
● Scalability & Reliability
  ○ Architect infrastructure for handling large-scale, concurrent inference requests.
  ○ Implement monitoring, logging, and auto-scaling for deployed services (a sketch of the monitoring piece follows this list).
● Collaboration
  ○ Work with research teams to productionize new models.
  ○ Partner with application teams to deliver AI functionality seamlessly through APIs.
● DevOps & Infrastructure
  ○ Automate CI/CD pipelines for models and APIs.
  ○ Manage GPU-based infrastructure in cloud or hybrid environments.
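
As referenced in the Scalability & Reliability item above, a hedged sketch of what the monitoring side could look like: a FastAPI middleware that exports per-route latency histograms with prometheus_client, which a dashboard or autoscaler could consume. The metric name and label are illustrative assumptions, not part of this posting.

```python
# Sketch: record inference API latency as a Prometheus histogram per route.
import time

from fastapi import FastAPI, Request
from prometheus_client import Histogram, make_asgi_app

app = FastAPI()
app.mount("/metrics", make_asgi_app())  # expose metrics for Prometheus to scrape

INFER_LATENCY = Histogram(
    "inference_request_seconds",          # hypothetical metric name
    "Latency of inference API calls",
    ["route"],
)

@app.middleware("http")
async def record_latency(request: Request, call_next):
    start = time.perf_counter()
    response = await call_next(request)
    INFER_LATENCY.labels(route=request.url.path).observe(time.perf_counter() - start)
    return response
```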

Requirements

● Core Skills
  ○ Strong programming experience in Python (FastAPI, Flask) and/or Go/Node.js for API services.
  ○ Hands-on experience with model deployment using Triton Inference Server, TorchServe, or similar.
  ○ Familiarity with both ASR frameworks and LLM frameworks (Hugging Face Transformers, TensorRT-LLM, vLLM, etc.).
● Infrastructure
  ○ Experience with Docker, Kubernetes, and managing GPU-accelerated workloads.
  ○ Deep knowledge of real-time inference systems (REST, gRPC, WebSockets, streaming).
  ○ Cloud experience (AWS, GCP, Azure).
● Bonus
  ○ Experience with model optimization (quantization, distillation, TensorRT, ONNX); a brief sketch follows this list.
  ○ Exposure to MLOps tools for deployment and monitoring.
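
As a rough illustration of one optimization path named in the Bonus items (and referenced above), the snippet below exports a placeholder PyTorch module to ONNX with a dynamic batch dimension, then applies post-training dynamic INT8 quantization via onnxruntime. The model, shapes, and file names are stand-ins, not anything prescribed by this role.

```python
# Sketch: ONNX export with a dynamic batch axis, then dynamic INT8 quantization.
import torch
from onnxruntime.quantization import QuantType, quantize_dynamic

model = torch.nn.Sequential(          # stand-in for a real ASR/LLM component
    torch.nn.Linear(512, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 128),
).eval()

dummy = torch.randn(1, 512)
torch.onnx.export(
    model,
    dummy,
    "model.onnx",
    input_names=["features"],
    output_names=["logits"],
    dynamic_axes={"features": {0: "batch"}, "logits": {0: "batch"}},  # variable batch size
    opset_version=17,
)

# Post-training dynamic quantization: weights stored as INT8 to cut memory and
# often improve CPU latency; accuracy impact must be validated per model.
quantize_dynamic("model.onnx", "model.int8.onnx", weight_type=QuantType.QInt8)
```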

Job Types: Full-time, Permanent

Pay: From ₹50,000.00 per month

Experience:

  • total work: 3 years (Preferred)

Work Location: In person
