Analyst, Site Reliability Engineer, Site Reliability Engineering

2 - 7 years

12 - 14 Lacs

Posted: 3 hours ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Business Function:

Group Technology and Operations (T&O) enables and empowers the bank with an efficient, nimble and resilient infrastructure through a strategic focus on productivity, quality & control, technology, people capability and innovation. In Group T&O, we manage the majority of the Bank's operational processes and aspire to delight our business partners through our multiple banking delivery channels.

Responsibilities:

24/7 System Monitoring & Recovery: Perform continuous system monitoring and execute Standard Operating Procedures (SOPs) for incident detection and recovery to maintain high availability, participating in an on-call rotation to provide 24/7 operational support.
Site Reliability Engineering (SRE): Champion and integrate SRE principles into our operational practices and system designs.
Service Lifecycle Management: Oversee the deployment, ongoing support, and monitoring of new and existing services, platforms, and application stacks.

Incident Reduction & Performance: Proactively improve system monitoring and alerting mechanisms to significantly reduce incident resolution times.
Release & Deployment Management: Manage the end-to-end patching, release, and deployment functions, ensuring seamless delivery.
Environmental Optimization: Collaborate with engineering and application development teams to enhance system performance through strategic environment upgrades and continuous improvements.
Automation Feedback: Review and provide continuous feedback on automation in the change and release areas.

Requirements:

Experience in managing VPC, OpenShift, Kubernetes, Docker, and RHEL.
At least 2 years of general experience with DevOps CI/CD tools and their management.
Able to work and lead in a dynamic, fast-changing 24/7 support environment, with the right attitude to learn and implement.
Solid experience in container image deployment and release management with OpenShift and Kubernetes.

Strong automation and scripting skills, with proficiency in shell, Groovy, and Python.
Good knowledge of monitoring tools: Prometheus, Grafana, and ELK.
Background in large-scale system administration and familiarity with SRE principles and release engineering.
Advanced Linux system administration skills and advanced skills with configuration management systems.
In-depth knowledge of infrastructure areas such as virtual server technologies, networking, firewalls, and internet protocols.

DBS Bank

Banking and Financial Services

Singapore
