MLOps/DevOps Engineer

0 years

2 - 4 Lacs

Posted: 1 day ago | Platform: Glassdoor

Work Mode

Remote

Job Type

Full Time

Job Description

Superteams.ai helps businesses build agentic systems, deploy private LLMs, and power next-gen AI workflows. Our company blends AI expertise, cloud infrastructure, and open-source technologies to help clients build, scale, and maintain custom AI solutions across sectors like education, climate, and enterprise SaaS.

Our work spans a wide spectrum of Generative AI applications - from LLMs and vision models to audio synthesis and predictive analytics. If you’re passionate about infrastructure that supports cutting-edge AI systems, you’ll thrive here.

What You’ll Do

As an MLOps Engineer, you will play a critical role in deploying, managing, and scaling the infrastructure that powers AI systems. You’ll work closely with AI developers and product teams to automate deployment, monitor performance, and ensure uptime and reliability across environments.

Key Responsibilities

  • Set up and manage cloud infrastructure (AWS, GCP, or Azure) for AI and data applications.
  • Build and maintain CI/CD pipelines for deploying AI systems and services.
  • Containerize applications using Docker and orchestrate using Kubernetes.
  • Set up observability and monitoring tools such as ELK, Prometheus, Grafana, Loki, and Graylog.
  • Manage logging, alerting, and performance tracking for uptime, latency, and failure detection.
  • Automate infrastructure provisioning and updates using IaC tools and scripting.
  • Collaborate with AI engineers to enable smooth deployments of LLMs and agentic systems.
  • Assist in database performance optimization (SQL, NoSQL and Vector DBs).

Must-Have Skills

  • Very strong command of Linux/Ubuntu systems and shell scripting. You should be able to operate a remote server comfortably.
  • Hands-on experience with at least one cloud platform: AWS, GCP, or Azure. Just being able to see the dashboard on AWS isn’t enough - you should know how to manage instances, track system health, handle firewalls, and audit logs.
  • Proven expertise in CI/CD pipeline management. You should have worked with Jenkins, Ansible or any industrial-grade CI/CD platform.
  • Solid understanding of Docker and Kubernetes. You should have managed containers or Docker Swarm clusters, or handled scalability challenges in production systems.
  • Experience with monitoring, alerting, and log centralization tools like Elastic-Kibana, Prometheus, Grafana, Graylog or Loki. You should also be aware of Sentry and other monitoring systems.
  • Experience managing uptime, load balancing, and failover systems. Understanding of snapshotting, firewalls, recovery, failover and zero-downtime is a must.

Nice to Have

  • Experience with vLLM for LLM serving and handling batching, queueing, retries and error handling with LLM requests.
  • Working knowledge of Python for scripting and automation. You should be able to understand Python code, debug it to some extent and fix issues that may happen in production.
  • Deep experience with SQL databases (e.g., PostgreSQL, MySQL).
  • Interest in or exposure to AI infrastructure, model serving, and agentic system design.

What You’ll Gain

  • Exposure to the latest in Generative AI and agentic systems.
  • Hands-on experience with AI DevOps challenges: model hosting, scaling inference, observability in multimodal systems, and more.
  • Opportunity to contribute to open-source AI infrastructure tools.
  • A fast-paced, collaborative environment where learning and innovation are at the core.

If you're excited to build scalable infrastructure that enables cutting-edge AI, we’d love to hear from you.

Job Type: Full-time

Pay: ₹200,000.00 - ₹400,000.00 per year

Work Location: In person
