Performance and Monitoring Engineer

3 - 8 years

5 - 15 Lacs

Posted: 2 days ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Position Purpose

The main responsibility of the Stability Resilience division is to support the IT Production strategy. It gathers the activities that contribute directly to the stability and integrity of Production and to the resilience of the Information Systems.

Within the division, the Global Monitoring Log Analytics domain oversees the Global Production Observability Systems and provides platforms and services around Elasticsearch, Splunk, and Dynatrace technologies.

This domain includes the following key services:

  • Global Monitoring, providing Dynatrace services
  • Splunk (decommissioning by end of the year)
  • Logs As a Service, providing a log management platform as a service based on the Elastic Stack (Elasticsearch, Kibana, Fleet, Elastic Agent, Logstash, Ingest pipelines) and Kafka technology
  • Elastic As a Service, providing:
    • Elasticsearch (+Kibana) dedicated specific clusters for some applications on its servers
    • Elasticsearch dedicated standard clusters on dMZR (based on an IBM Cloud product)
  • CyberSOC central data platform (Databus based on Kafka+Logstash, and DAP based on Elasticsearch)

Leveraging the expertise of the BNP Paribas Paris teams and ISPL IT skills, the goal is to enable flawless production for applications by providing secure and stable environments and by ensuring that all actions on production environments are carried out in a controlled manner.

The Performance and Monitoring Engineer will work closely with the members of the STA04 domain's SRE Data Engineering team, who are in charge of:

  • Maintaining a monitoring/alerting system to correctly manage the infrastructure of our internal services (Log as a Service, Dedicated Elasticsearch cluster as a service, Global Monitoring)
  • Managing data preparation on observability metrics to get the maximum benefit from them
  • Creating and evolving specific alerts and dashboards on our components and services, with a high-level, top-down approach, to provide the best quality of service
  • Defining housekeeping procedures and surveillance, including morning and evening checks
  • Implementing the SRE approach (SLI/SLO for quality/performance improvement and reduction of incident rate and impact)

Responsibilities

Direct Responsibilities
  • Help us continuously improve the monitoring/alerting system used to look after the infrastructure of our internal services (Log as a Service, Dedicated Elasticsearch cluster as a service, Global Monitoring)
  • Define and refine transformation pipelines for our metrics, when necessary, to obtain useful, high-quality monitoring data
  • Evolve existing alerts and dashboards on our components and create new ones as needed
  • Refine our housekeeping procedures and surveillance, adapting them to take into account the incidents we face (to avoid the same problem occurring twice)
  • Manage remediation (or ensure it is properly taken into account by other team members) for alerts raised and anomalies found in dashboards
  • For a predefined application scope, take care of ITSM processes based on the ITIL framework:
    • Incidents
    • Requests
    • Changes
    1. Ensure that SLA targets are met for the above activities
    2. Hand over to the Paris teams if knowledge and skills are not available in ISPL

Contributing Responsibilities
  • Contribute to the knowledge transfer with Paris teams
  • Contribute to the definition of procedures and processes necessary for the team
  • Help build team spirit and integrate into BNP Paribas culture
  • Contribute to the regular activity reporting and KPI calculation
  • Contribute to continuous improvement actions
  • Work with cross-functional teams to ensure IT services align with business needs and service level agreements (SLAs)

Technical Behavioral Competencies
  • Very good knowledge of usage and implementation of observability systems (Elasticsearch, Kibana, Grafana, Dynatrace, or others)
  • Good knowledge of modern observability practices (SRE, SLI/SLO, Synthetic, APM, RUM)
  • Good knowledge of script development (Python, Shell, PowerShell)
  • Common knowledge of CI/CD tools like GitLab, GitLab Runner, Jenkins
  • Understanding of ITIL or similar ITSM frameworks and tools
  • Experience with the ServiceNow ticketing system
  • Experience with Agile frameworks and tools (e.g., Jira, Confluence)
  • Good written and spoken English
  • Ability to measure and identify areas for improving quality and overall delivery
  • Capable of communicating efficiently

Good to have Skills
  • Knowledge of IT production backup and resilience setup (High Availability setup, Disaster Recovery Plan, etc.)
  • Understanding of key concepts of distributed systems
  • Basic knowledge of Red Hat Linux administration and performance management
  • Familiarity with Ansible and Ansible Tower
  • Familiarity with containerization technologies (Docker, Kubernetes, Nomad, OpenShift)
  • Experience with any cloud platform (preferably IBM Cloud)
  • Ability to reach out to the Paris team in case of difficulties, missing information, or any other problem where more information could help resolve issues or limit risk
  • Good team player

Specific Qualifications (if required)

Skills Referential

Behavioural Skills:

Ability to collaborate / Teamwork

Decision Making

Transversal Skills:

Ability to manage a project

Ability to understand, explain and support change

Education Level:

Bachelor's degree or equivalent

BNP Paribas

Banking

Paris London