AI MLOps Engineer

5 - 9 years

22 - 25 Lacs

Posted: 3 weeks ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

5+ years of experience
As the Infrastructure and Ops Engineer, you will work on operations related to our UAIS AI Studio (enterprise AI/ML platform), in particular the AI/ML training initiative that supports thousands of learners on the platform. This individual contributor (IC) role requires experience working on large-scale AI/ML platforms and ensuring their stability, reliability, scalability, and performance. Experience with modern infrastructure and DevOps tools and paradigms, as well as hands-on knowledge of major cloud services such as Azure, AWS, and GCP, is a must.


Primary Responsibilities:

Continuous Support: Provide continuous SRE support to thousands of geographically distributed learners on the UAIS platform: respond to tickets, triage support requests, and liaise with customers.

Automation & DevOps: Improve existing Infrastructure as Code (IaC) according to DevOps best practices.

Systems Monitoring: Develop and maintain monitoring frameworks for the UAIS infrastructure that supports the AI/ML training program.

Security & Compliance: Collaborate with cybersecurity teams to ensure all systems and operations comply with industry standards and are secure against evolving threats.

Capacity Planning & Cost Optimization: Forecast and manage capacity requirements for the AI/ML training environment, while identifying opportunities to reduce costs without compromising performance.


Required Qualifications:

Bachelor's degree in computer science, information technology, or a related field.

5+ years of infrastructure experience: Proven experience working on large-scale, cloud-based, enterprise-level software platforms and a deep understanding of multi-cloud architectures, specifically Azure, AWS, and GCP, with hands-on experience in cloud management.

3+ years of practical experience with Infrastructure-as-Code and CI/CD tools such as Terraform, GitHub Actions, and the like.

2+ years of practical experience with containerization and orchestration technologies (Docker, Kubernetes).

2+ years of practical experience in scripting and automation: Advanced proficiency in scripting languages such as Python and Bash to support automation and system integration efforts.


Preferred Qualifications:

Security & Compliance Knowledge: Strong understanding of security best practices and experience ensuring compliance with relevant regulatory frameworks.

Machine Learning and LLM Operations: Exposure to modern tools and techniques in the MLOps and LLMOps fields. Exposure to AI/ML-specific infrastructure tools (e.g., MLflow, Kubeflow) for managing and deploying models at scale.

Exposure to a Regulated Industry: Experience working within healthcare or another regulated industry, with a solid understanding of the unique challenges and compliance requirements.

Ability to work independently, manage multiple projects simultaneously, and adapt to changing priorities in a fast-paced environment.

VAK Consulting LLC

Consulting

Atlanta
