The BNP Paribas Group's IT department (IT GROUP) aims to ensure the overall consistency of the Information System (IS) with the Group's strategic objectives, while improving the management and control of operational risks.
The role of IT includes, among other things, providing an enhanced customer experience, accelerating digitalization, and optimizing operational efficiency. Thus, IT GROUP has modernized the Information Systems by deploying digital levers to support the digitalization of the Group's Businesses and Functions:
- The delivery of digital levers (Cloud, API, Data, Digital Working, Tech watch) is effective, and the IT Market Place enables their deployment on a Group-wide scale,
- The security of the IT environment is evolving with a focus on cybersecurity risks and IT resilience to meet increasingly stringent regulatory requirements,
- The robustness of the Information Systems is improving with the implementation of standardized operational methods,
- Human capital remains at the center of our priorities,
- Operational efficiency is growing with a focus on process automation and industrialization,
- All these advances contribute to the Group's ambition to reduce its environmental footprint.
Thanks to these solid foundations, a new phase of the IT strategy is opening up, enabling the Group to address a dual challenge: accelerating the adoption of innovative solutions around data and Artificial Intelligence (AI), and ensuring the interoperability and security of Information Systems.
To meet these challenges, IT GROUP is adapting its organization and remaining in line with that of the Bank. This evolution is based on five main levers:
- Technological innovation through data and AI: ensuring excellence in innovation through research and development of skills, relying on specific governance. This involves structuring technical foundations to the state of the art in order to accelerate experimentation, the construction of innovative solutions, and their adoption on a large scale by the Business and Functions,
- Cybersecurity: continuing to protect the Bank and its customers in terms of Cybersecurity and Cyber Fraud, in line with regulatory requirements,
- Infrastructure and Production: creating synergies, accelerating industrialization, emphasizing interoperability, and improving the resilience of Information Systems, while controlling risks,
- Payment and Card Services: meeting the challenges of the industry by ensuring optimal alignment and end-to-end execution of strategic programs among all Group stakeholders,
- Operational excellence: gaining efficiency, particularly in financial terms, to continue optimizing expenses.
Position Purpose
The Stability & Resilience division supports the IT strategy and Production. It gathers the activities that contribute directly to the stability and integrity of Production and to the resilience of the Information Systems.
Within the division, the domain Global Monitoring & Log Analytics oversees Global Production Observability Systems and provides platforms and services around Elasticsearch, Splunk and Dynatrace technologies.
This domain includes the following key services:
- Global Monitoring, providing Dynatrace services
- Splunk (to be decommissioned by the end of the year)
- Logs As a Service, providing a log management platform as a service based on the Elastic Stack (Elasticsearch, Kibana, Fleet, Elastic Agent, Logstash, Ingest pipelines) and Kafka
- Elastic As a Service, providing:
  - Dedicated Elasticsearch (+Kibana) clusters, specific to certain applications, on its servers
  - Dedicated standard Elasticsearch clusters on dMZR (based on an IBM Cloud product)
- CyberSOC central data platform (Databus based on Kafka+Logstash, and DAP based on Elasticsearch)
Leveraging the expertise of the BNP Paribas Paris teams and ISPL IT skills, the goal is to enable flawless application production by providing secure and stable environments and by ensuring that all actions on production environments are performed in a controlled manner.
The Performance and Monitoring Engineer will work closely with the SRE & Data Engineering team of the STA04 domain, which is in charge of:
- Maintaining a monitoring/alerting system to correctly manage the infrastructure of our internal services (Log as a Service, Dedicated Elasticsearch cluster as a service, Global Monitoring)
- Managing data preparation on observability metrics to get the maximum benefit from them
- Creating and evolving specific alerts and dashboards on our components and services, with a high-level, top-down approach, to provide the best quality of service
- Defining housekeeping procedures and surveillance, including morning and evening checks
- Implementing an SRE approach (SLI/SLO to improve quality and performance and to reduce the rate and impact of incidents)
Responsibilities
Direct Responsibilities
- Help us continuously improve the monitoring/alerting system used to manage the infrastructure of our internal services (Log as a Service, Dedicated Elasticsearch cluster as a service, Global Monitoring)
- Define and refine transformation pipelines for our metrics, when necessary, to obtain useful, high-quality monitoring data
- Create and evolve tailored alerts and dashboards on our components
- Refine our housekeeping procedures and surveillance, and adapt them in light of the incidents we face (to avoid encountering the same problem twice)
- Manage remediations (or ensure they are properly taken into account by other team members) for alerts raised and anomalies found in dashboards
- For a predefined application scope, handle ITSM processes based on the ITIL framework:
  - Incidents
  - Requests
  - Changes
- Ensure that SLA targets are met for the above activities
- Hand over to the Paris teams if the required knowledge and skills are not available in ISPL
Contributing Responsibilities
- Contribute to knowledge transfer with the Paris teams
- Contribute to the definition of procedures and processes necessary for the team
- Help build team spirit and integrate into BNP Paribas culture
- Contribute to the regular activity reporting and KPI calculation
- Contribute to continuous improvement actions
- Work with cross-functional teams to ensure IT services align with business needs and service level agreements (SLAs).
Technical & Behavioral Competencies
- Very good knowledge of the usage and implementation of observability systems (Elasticsearch, Kibana, Grafana, Dynatrace, or others)
- Good knowledge of modern observability practices (SRE, SLI/SLO, Synthetic, APM, RUM)
- Good knowledge of script development (Python, Shell, PowerShell)
- General knowledge of CI/CD tools such as GitLab, GitLab Runner, Jenkins
- Understanding of ITIL or similar ITSM frameworks & tools
- Experience with the ServiceNow ticketing system
- Experience with Agile frameworks and tools (e.g., Jira, Confluence)
- Good written and spoken English
- Ability to measure and identify areas for improving Quality and overall Delivery
- Capable of communicating efficiently
Good to have Skills
- Knowledge of IT production backup and resilience setup (High Availability setup, Disaster Recovery Plan, etc.)
- Understanding of key concepts of distributed systems
- Basic knowledge of RedHat Linux administration and performance management
- Notions of Ansible and Ansible Tower
- Notions of containerization technologies (Docker, Kubernetes, Nomad, OpenShift)
- Experience with any cloud platform (preferably IBM Cloud).
- Ability to reach out to the Paris team in case of difficulties, missing information, or any other problem where more information could help resolve issues or limit risk
- Good Team Player