Posted: 21 hours ago
On-site
Full Time
Location: Mumbai, India
Experience Level: 9 Plus Years
Minimum Qualification: Master's degree in Computer Science, Engineering, or a related field.
We're looking for a strategic Senior MLOps Engineer to lead the end-to-end design, implementation, and scaling of our AI infrastructure. You'll partner with researchers, product teams, and DevOps to turn prototypes into production services that meet strict SLAs for latency, reliability, and cost efficiency.
Core MLOps Pipelines: Design and implement scalable ML pipelines (training, evaluation, deployment) for LLMs, CV, and multimodal models.
Model Serving & CI/CD: Lead efforts in model serving, versioning, automated CI/CD, and real-time monitoring of AI workflows.
Inference-as-a-Service: Build and optimize GPU-backed serving infrastructure targeting p99 latency < 100 ms, 99.9% uptime, and > 80% GPU utilization.
Governance & Drift Detection: Drive initiatives on model governance, automated drift detection (≤ 10% false positives), and data-management best practices.
Vector Search & Agent Orchestration: Integrate vector databases (Qdrant, Pinecone) for low-latency semantic retrieval, and build agentic workflows using LangChain or similar frameworks.
Enterprise Multi-Tenancy: Architect RBAC-driven, isolated ML services to securely serve 100–500+ organizations.
Observability & Logging: Design Prometheus/Grafana dashboards, ELK/Fluentd logging pipelines, and alerting for all ML workloads.
CI/CD for Inference APIs: Maintain CI/CD pipelines for Python (FastAPI) and TypeScript (NestJS) inference services.
Metrics & Cost Optimization: Define and track SLAs/SLOs, optimize cloud spend by ≥ 20% year-over-year, and ensure GPU clusters operate at > 80% utilization.
Cross-Functional Leadership: Partner with AI researchers, product managers, and legal to align MLOps standards with compliance and roadmap goals.
Mentorship & Community: Mentor junior engineers, run quarterly brown-bags, own onboarding docs (upskill 5+ engineers/quarter), and publish ≥ 1 open-source contribution or talk annually.
9–14 years in software engineering, including ≥ 4 years in MLOps or ML infrastructure
Strong expertise in cloud platforms (AWS/GCP/Azure), Kubernetes, Docker, Terraform, Helm, Kubeflow, and MLflow
Experience with inference frameworks (Triton, TensorFlow Serving, BentoML, TorchServe)
Familiarity with distributed training, workload schedulers, and GPU-cluster orchestration
Proficiency in Python, TypeScript, and infrastructure-as-code (Terraform, Helm, etc.)
Proven track record building reliable, scalable ML systems in production.
Vector DB integration (Qdrant, Pinecone)
Agent orchestration (LangChain, LlamaIndex)
Multi-tenant security and RBAC
Observability stacks (Prometheus/Grafana, ELK)
CI/CD for FastAPI/NestJS services
Master's/PhD in CS/AI and certifications such as AWS ML Specialty, Google Cloud Professional ML Engineer, or CNCF CKA/CKAD.
Prior experience at AI-focused startups or enterprises scaling ML for 100–500 orgs.
Understanding of low-latency streaming inference or agent-based LLM systems.
Excellent written and verbal communication, and a proven ability to drive consensus across functions.
Yotta Data Services Private Limited