Monitoring Engineer

3 - 5 years

0 Lacs

Posted: 4 days ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

Job Title: LLM System Monitor Site Reliability Engineer (SRE)

Location:

Type:

Compensation:

Working Hours:

Start Date:

Openings:

Required Skills & Experience

  • 3+ years of experience monitoring and responding to incidents in a globally deployed web application.
  • Strong experience with microservices architecture on Kubernetes.
  • Deep understanding of observability tools and operational metrics (Grafana, Prometheus, P99, etc.).
  • Familiarity with AWS services or any major cloud provider.
  • Excellent communication and customer service skills; must be able to clearly articulate status and updates to technical and non-technical stakeholders.
  • Ability to ramp up quickly, take ownership, and work independently in a fast-paced environment.

Job Overview

Full-time LLM System Monitor / SRE

Key Responsibilities

  • Monitor Grafana dashboards and observability tools to detect failures and performance issues.
  • Act as the primary SRE for incident response, initiating reports from automated alerts or joining active incident channels.
  • Serve as the main point of contact during incidents, delivering frequent updates to customers and incident commanders.
  • Interpret operational metrics such as quantiles, P99, and Prometheus data to assess system health.
  • Track and manage permutations of a globally deployed microservices architecture running on Kubernetes.
  • Collaborate with engineering and support teams to resolve issues quickly and efficiently.
  • Maintain strong communication and customer service throughout incident lifecycles.
  • Utilize foundational knowledge of AWS or other cloud platforms to support infrastructure monitoring.
  • Ramp up quickly on existing systems and processes.

Why Join

  • Work with cutting-edge LLM infrastructure at Cisco
  • Full-time opportunity with Insight Global
  • Hybrid flexibility: onsite in Bangalore 3 days/week
  • Immediate interviews and onboarding
  • Competitive compensation
