LLM Ops Engineer - Serverless & CI/CD (AWS)

2 - 7 years

4 - 9 Lacs

Posted: 2 months ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

This isn't your average DevOps role. This isn't just about pipelines or cloud provisioning. This is about engineering the backbone of Agentic AI systems that drive the next generation of enterprise SaaS, where conversational interfaces, dynamic UIs, and intelligent agents operate seamlessly on AWS Serverless infrastructure, with deep integration into Salesforce and cross-agent protocols.

This is for builders with something to prove. For engineers who've gone beyond cloud fluency to orchestrate complex, multi-agent ecosystems and who want to shape how enterprise applications are deployed, debugged, scaled, and observed in real time.

If you're driven by deep automation, passionate about creating fault-tolerant agentic systems, and thrive where innovation is the expectation, not the exception, you're in the right place. Join us to redefine SaaS infrastructure and champion a new era of AI-powered, product-led enterprise experiences.

The Role

We are seeking a hands-on Agentic AI Ops Engineer who thrives at the intersection of cloud infrastructure, AI agent systems, and DevOps automation. In this role, you will build and maintain the CI/CD infrastructure for Agentic AI solutions using Terraform on AWS, while also developing, deploying, and debugging intelligent agents and their associated tools. This position is critical to ensuring scalable, traceable, and cost-effective delivery of agentic systems in production environments.

The Responsibilities

CI/CD Infrastructure for Agentic AI
  • Design, implement, and maintain CI/CD pipelines for Agentic AI applications using Terraform, AWS CodePipeline, CodeBuild, and related tools.
  • Automate deployment of multi-agent systems and associated tooling, ensuring version control, rollback strategies, and environment parity across dev/test/prod.

Agent Development & Debugging

  • Collaborate with ML/NLP engineers to develop and deploy modular, tool-integrated AI agents in production.
  • Lead the effort to create debuggable agent architectures, with structured logging, standardized agent behaviors, and feedback integration loops.
  • Build agent lifecycle management tools that support quick iteration, rollback, and debugging of faulty behaviors.

Monitoring, Tracing & Reliability

  • Implement end-to-end observability for agents and tools, including runtime performance metrics, tool invocation traces, and latency/accuracy tracking.
  • Design dashboards and alerting mechanisms to capture agent failures, degraded performance, and tool bottlenecks in real time.
  • Build lightweight tracing systems that help visualize agent workflows and simplify root cause analysis.

Cost Optimization & Usage Analysis

  • Monitor and manage cost metrics associated with agentic operations, including API call usage, toolchain overhead, and model inference costs.
  • Set up proactive alerts for usage anomalies, implement cost dashboards, and propose strategies for reducing operational expenses without compromising performance.

Collaboration & Continuous Improvement

  • Work closely with product, backend, and AI teams to evolve the agentic infrastructure design and tool orchestration workflows.
  • Drive the adoption of best practices for Agentic AI DevOps, including retraining automation, secure deployments, and compliance in cloud-hosted environments.
  • Participate in design reviews, postmortems, and architectural roadmap planning to continuously improve reliability and scalability.

  • 2+ years of experience in DevOps, MLOps, or Cloud Infrastructure with exposure to AI/ML systems.
  • Deep expertise in AWS serverless architecture, including hands-on experienc

Expedite Commerce

E-commerce Solutions

Commerce City
