Service Management Practitioner

5 - 10 years

4 - 8 Lacs

Posted: 13 hours ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

Project Role:

  • Service Management Practitioner

Project Role Description:

  • Support the delivery of programs, projects or managed services. Coordinate projects through contract management and shared service coordination. Develop and maintain relationships with key stakeholders and sponsors to ensure high levels of commitment and enable strategic agenda.

Must have skills: Site Reliability Engineering

Good to have skills: Service Integration and Management (SIAM)

Minimum 7.5 year(s) of experience is required

Educational Qualification: 15 years full time education

Job summary:

Position Overview:

Key Responsibilities:

  • Log, Metrics, and Trace Analysis:

    Gather and analyze logs, metrics, and traces from operating systems, infrastructure, network components, and applications to assist in performance tuning and fault detection.
  • Enhance Observability and Alerting Capabilities:

    Implement, enhance, and maintain observability and alerting capabilities, especially those built on SLI/SLO/Error Budget principles.
  • Improve Existing Platforms:

    Analyze existing observability and alerting platforms to identify opportunities for improvement, ensuring they meet the evolving needs of the business and clients.
  • Unified Observability Stack Development:

    Help build a unified observability stack using a variety of observability tools to ensure comprehensive monitoring.
  • Automation & Self-Healing Systems:

    Drive automation to improve system self-healing capabilities and overall operational efficiency.
  • Symptom-Based Alerts:

    Build monitoring systems that alert on symptoms rather than just outages to proactively prevent failures.
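
For illustration only, the SLI/SLO/Error Budget and symptom-based alerting responsibilities above could look roughly like the sketch below. It is not taken from the employer's stack: the `Window` type, the function names, the 99.9% target, and the 14.4 burn-rate threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Request counts observed over one evaluation window (hypothetical shape)."""
    total: int
    failed: int


def error_rate(window: Window) -> float:
    """SLI expressed as the fraction of failed requests."""
    return window.failed / window.total if window.total else 0.0


def burn_rate(window: Window, slo_target: float) -> float:
    """How fast the error budget is being spent; 1.0 means exactly on budget."""
    budget = 1.0 - slo_target  # allowed failure fraction, e.g. 0.001 for 99.9%
    return error_rate(window) / budget


def should_alert(window: Window, slo_target: float, threshold: float) -> bool:
    """Alert on the user-visible symptom (elevated failures), not on a host being down."""
    return burn_rate(window, slo_target) >= threshold


if __name__ == "__main__":
    # Assumed sample data: 2% of requests failed in the last hour.
    last_hour = Window(total=100_000, failed=2_000)
    print(round(burn_rate(last_hour, slo_target=0.999), 2))
    print(should_alert(last_hour, slo_target=0.999, threshold=14.4))
```

The alert here fires on how quickly the error budget is consumed, so it can trigger during a partial degradation before any component is fully down.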

Qualifications:

  • Educational Background:

    Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or related field, or a combination of education and equivalent work experience.

Required Experience:

  • Overall Experience:

    5-8 years of professional experience in a relevant field.
  • Monitoring Tools Development and Enhancement:

    3-5 years of experience in development, including custom metric creation, and enhancing monitoring tools such as Dynatrace, AppDynamics, New Relic, Prometheus, Splunk, Sensu, Nagios, DataDog, etc.
  • Strong Knowledge of Monitoring Pillars:

    In-depth knowledge of logs, metrics, and traces as fundamental pillars of monitoring and observability.
  • Cloud Platform Experience:

    Solid working understanding of cloud platforms including AWS, Azure, and GCP.

Good-to-Have Experience:

  • SLI/SLO/Error Budget Implementation:

    Experience in implementing SLI/SLO/Error Budget-driven observability and alerting frameworks.
  • Cloud Platforms Expertise:

    Advanced proficiency with AWS, Azure, and GCP.
  • Programming Languages:

    Proficiency in one or more of the following languages: Python, Go, Java/Scala, C, or C++.
  • Multi-Tier Application Development:

    Experience with J2EE, NoSQL/SQL Datastore, Spring Boot, GCP/AWS/Azure, Docker, and Kubernetes in multi-tier application environments.
  • SRE Principles and Practices:

    Strong understanding and practical experience with SRE principles and practices.
  • Observability Strategy Implementation:

    Ability to implement effective observability strategies to improve MTTD (Mean Time to Detection) and MTTR (Mean Time to Recovery).
  • RESTful APIs and Microservices:

    Experience with RESTful APIs and microservices platforms.
  • Networking Knowledge:

    Working knowledge of the TCP/IP stack, internet routing, and load balancing.
  • Problem-Solving:

    Ability to solve complex architectural/design and business problems by simplifying processes, optimizing systems, and eliminating bottlenecks.

Key Competencies:

  • Technical Expertise:

    Strong experience in developing observability and alerting systems using modern tools and frameworks.
  • Problem-Solving:

    Critical thinking skills to identify performance issues and troubleshoot complex problems.
  • Automation:

    Focus on automation to improve operational efficiency and system self-healing capabilities.
  • Collaboration & Communication:

    Strong ability to collaborate with cross-functional teams and communicate complex technical concepts to both technical and non-technical stakeholders.
  • Continuous Improvement:

    Constantly seeking ways to improve systems, monitoring, and alerting capabilities to stay ahead of potential issues.
