ML Eng AI Ops & Model Infrastructure Professional

Experience: 3 - 6 years

Salary: 5 - 8 Lacs

Platform: Naukri


Work Mode: Work from Office

Job Type: Full Time

Job Description

  • Build and maintain serving infrastructure for ONNX models, Augloop, SLM-based inference, and future LLM/SLM pipelines.
  • Integrate models into scalable APIs for online prediction and retrieval-augmented generation workflows.
  • Set up and run real-time A/B experiments on production Copilot features.
  • Implement alerting, logging, and telemetry tools to monitor model drift, latency, and regressions.
  • Develop dashboards for automated quality monitoring and error detection in inference traffic.
  • Optimize inference latency and cost across CPU/GPU environments.
  • Build internal tools for performance analysis, model comparison, and troubleshooting.
  • Work on batch and streaming inference frameworks, ensuring SLA adherence.
  • Implement resource orchestration and utilization tracking across CPU/GPU workloads.
  • Contribute to tools that monitor uptime, throughput, container health, and job scaling.
  • Ensure scalability and reliability of model APIs, with clear SLAs around latency, throughput, cost, and memory footprint.
  • Profile models and infra for cold start issues, load testing, and concurrency handling.
  • Integrate Responsible AI checks for fairness, explainability, and performance variance.
  • Address AI injection attacks, inference sandboxing, and privacy guardrails.
  • Contribute to regression pipelines for SLA, PII, and compliance validation across Copilot features.
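The latency-monitoring and alerting work described above might look something like this minimal, stdlib-only sketch. The `LatencyMonitor` class, the sliding-window size, and the 250 ms p95 budget are illustrative assumptions, not details taken from this posting:

```python
import statistics
from collections import deque


class LatencyMonitor:
    """Sliding-window latency tracker with a p95 alert threshold."""

    def __init__(self, window=1000, p95_budget_ms=250.0):
        # Only the most recent `window` samples are retained.
        self.samples = deque(maxlen=window)
        self.p95_budget_ms = p95_budget_ms

    def record(self, latency_ms):
        """Record one request's observed latency in milliseconds."""
        self.samples.append(latency_ms)

    def p95(self):
        """Return the 95th-percentile latency over the current window."""
        if not self.samples:
            return 0.0
        ordered = sorted(self.samples)
        idx = min(len(ordered) - 1, int(0.95 * len(ordered)))
        return ordered[idx]

    def breached(self):
        """True when the p95 latency exceeds the configured budget."""
        return self.p95() > self.p95_budget_ms
```

In a real serving stack the `breached()` signal would feed an alerting backend such as Prometheus/Grafana rather than being polled directly.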

Required Experience

  • 3 - 6 years of hands-on experience as an ML SWE or MLOps Engineer in production AI systems.
  • Strong coding skills in Python, C++, or Go, with experience in TensorRT, ONNX Runtime, or similar.
  • Experience with ML Ops tools: Azure ML, Kubernetes, Prometheus, Grafana, MLflow, Airflow, etc.
  • Hands-on with monitoring systems, load testing tools, and infra debugging utilities.
  • Familiarity with model security, compliance frameworks, or Responsible AI practices is a plus.
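Hands-on load testing of the kind listed above can be sketched with nothing but the standard library. The `fake_predict` stub, request count, and concurrency level here are illustrative assumptions standing in for a real model endpoint:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_predict(payload):
    """Stub standing in for a call to a real model-serving endpoint."""
    time.sleep(0.001)  # simulate ~1 ms of inference latency
    return {"input": payload, "ok": True}


def load_test(predict, n_requests=200, concurrency=16):
    """Fire n_requests at `predict` with a fixed concurrency level
    and report wall-clock throughput in requests per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(predict, range(n_requests)))
    elapsed = time.perf_counter() - start
    return {
        "requests": len(results),
        "seconds": elapsed,
        "rps": len(results) / elapsed,
    }
```

Production tools (Locust, k6, etc.) add ramp-up schedules and percentile reporting, but the core loop is the same: fixed concurrency, timed wall clock, throughput derived from the two.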

Soft Expectations

  • Able to work independently and deliver production-quality infrastructure code within agile cycles.
  • Document architecture, assumptions, and SLA metrics clearly.
  • Comfortable collaborating with both AI scientists and infra/DevOps teams.
  • Availability for overlap with Prague or Redmond teams preferred.
Skills: Infra, Machine Learning, MLOps, ONNX Runtime, Python, TensorRT

Company: Rarr Technologies

Industry: Information Technology

Location: San Francisco