ML - AI Ops Engineer

7 - 12 years

12 - 22 Lacs

Posted: 1 month ago | Platform: Naukri

Work Mode

Hybrid

Job Type

Full Time

Job Description

Infrastructure & Platform Development

Design and implement end-to-end MLOps pipelines for model lifecycle management, from experimentation to production deployment
Build and maintain cloud-native infrastructure for training, serving, and monitoring ML models at scale

Model Deployment & Serving

Deploy diverse model types including deep learning, traditional ML, transformers, and ensemble models
Implement real-time and batch inference systems with appropriate scaling strategies

Agentic AI Systems

Design deployment architectures for autonomous AI agents and multi-agent systems
Implement RAG (Retrieval-Augmented Generation) pipelines with vector databases

Required Qualifications & Experience

Cloud Platforms

3+ years of production experience with at least one major cloud platform (AWS, Azure, GCP)
AWS: SageMaker, EC2, Lambda, ECS/EKS, S3, CloudFormation
Azure: Azure ML, AKS, Azure Functions, Blob Storage, ARM templates
GCP: Vertex AI, GKE, Cloud Functions, Cloud Storage, Deployment Manager

Containerization & Orchestration

Advanced proficiency in Docker and Kubernetes
Experience with Helm charts, custom operators, and service mesh

Infrastructure & DevOps

Strong experience with Infrastructure as Code (Terraform, Pulumi, CDK)
Proficiency in CI/CD tools (Jenkins, GitLab CI, GitHub Actions, Azure DevOps)
Experience with observability stacks (Prometheus, Grafana, ELK, DataDog)
Knowledge of message queuing systems (Kafka, RabbitMQ, cloud-native solutions)
Understanding of networking, security, and IAM best practices

ML/AI Knowledge

Solid understanding of ML algorithms, model evaluation metrics, and optimization techniques
Experience with distributed training frameworks (Horovod, Ray, DeepSpeed)
Knowledge of model compression techniques (quantization, pruning, knowledge distillation)
Understanding of fairness, bias detection, and explainability in ML systems

Preferred Qualifications

Advanced Agentic AI Experience

Hands-on experience with LLM deployment and optimization (GPT, LLaMA, Claude APIs)
Experience with agent frameworks (LangChain, LlamaIndex, AutoGen, CrewAI)
Knowledge of vector databases (Pinecone, Weaviate, Qdrant, ChromaDB)

Mars Telecom Systems

Telecommunications

Odessa
