Observability Support Engineer

5 - 8 years

20 - 25 Lacs

Posted: 3 weeks ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Position Purpose

The main responsibility of the Stability & Resilience division is to support the IT strategy and Production. It gathers activities that contribute directly to the stability and integrity of Production and to the resilience of the Information Systems.

Within the division, the domain Global Monitoring & Log Analytics oversees Global Production Observability Systems and provides platforms and services around Elasticsearch, Splunk and Dynatrace technologies.

This domain includes the following key services:

Global Monitoring, providing Dynatrace services

Splunk (to be decommissioned by the end of the year)

Logs As a Service, providing a log management platform as a service based on the Elastic stack (Elasticsearch, Kibana, Fleet, Elastic Agent, Logstash, Ingest pipelines) and Kafka technology.

Elastic As a Service, providing

o Elasticsearch (+Kibana) dedicated specific clusters for some applications on its servers

o Elasticsearch dedicated standard clusters on dMZR (based on an IBM Cloud product)

CyberSOC central data platform (Databus based on Kafka+Logstash, and DAP based on Elasticsearch)

Leveraging the expertise of the BNP Paribas Paris teams and ISPL IT skills, the goal is to enable flawless production for applications by providing secure and stable environments and by ensuring that all actions on production environments are done in a controlled manner.

The Observability Support Engineer will work closely with the SRE & Data Engineering team of the STA04 domain, which is in charge of:

Maintaining a monitoring/alerting system to correctly manage the infrastructure of our internal services (Log as a Service, Dedicated Elasticsearch cluster as a service, Global Monitoring)

Managing data preparation on observability metrics to get the maximum benefit from them

Creating and evolving specific alerts and dashboards on our components and services, with a high-level, top-down approach, to provide the best quality of service

Defining housekeeping procedures and surveillance, including morning and evening checks

Implementing an SRE approach (SLI/SLO for quality/performance improvement and reduction of incident rate and impact)

Responsibilities

Direct Responsibilities

Take care of the infrastructure of our internal services (Log as a Service, Dedicated Elasticsearch cluster as a service, Global Monitoring) and ensure that performance and features are satisfactory for our customers

Provide support for our customers (incident management, usage guidance)

Update and enrich our end-user and internal documentation

Identify bugs or needed improvements in our services so that our customers have an easier and richer solution and our team can reduce manual and/or recurrent actions

For a predefined scope of applications, take care of ITSM processes based on the ITIL framework:

o Incidents

o Requests

o Changes

Ensure that SLA targets are met for the above activities

Hand over to the Paris teams if the required knowledge and skills are not available in ISPL

General Responsibilities

Contribute to the knowledge transfer with Paris teams

Contribute to the definition of procedures and processes necessary for the team

Help build team spirit and integrate into BNP Paribas culture

Contribute to the regular activity reporting and KPI calculation

Contribute to continuous improvement actions

Work with cross-functional teams to ensure IT services align with business needs and service level agreements (SLAs).

Technical & Behavioral Competencies

Mandatory Skills

Elasticsearch, Kibana

Decent knowledge of modern observability practices (SRE, Log Management, SLI/SLO, Synthetic, APM, RUM)

Docker, Kubernetes

Python, Shell

Ansible

GitLab

Understanding of ITIL or similar ITSM frameworks & tools

ServiceNow

Jira, Confluence

Basic knowledge of microsegmentation (Illumio), secured environments and safes (Vault)

Good written and spoken English

Ability to measure and identify areas for improving Quality and overall Delivery

Capable of communicating efficiently

Good to have Skills

Knowledge of IT production backup and resilience setup (High Availability setup, Disaster Recovery Plan, etc.)

Basic knowledge of Red Hat Linux administration and performance management

Experience with any cloud platform (preferably IBM Cloud).

Ability to reach out to the Paris team in case of difficulties, lack of information, or any other problem where more information could help solve issues or limit risk

Good Team Player

BNP Paribas

Banking

Paris, London