Site Reliability Professional

3 - 7 years

9 - 13 Lacs

Posted: 4 days ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

As a Site Reliability Engineer, you will work in an agile, collaborative environment to build, deploy, configure, and maintain systems for the IBM client business. In this role, you will lead the problem resolution process for our clients, from analysis and troubleshooting to deploying the latest software updates and fixes.
Your primary responsibilities include:

24x7 Observability:

Be part of a worldwide team that monitors the health of production systems and services around the clock, ensuring continuous reliability and optimal customer experience.

Cross-Functional Troubleshooting:

Collaborate with engineering teams to provide initial assessments and possible workarounds for production issues. Troubleshoot and resolve production issues effectively.

Deployment and Configuration:

Use continuous integration and continuous delivery (CI/CD) tools to deploy services and configuration changes at enterprise scale.

Security and Compliance Implementation:

Implement security measures that meet or exceed industry standards and regulations such as GDPR, SOC2, ISO 27001, PCI, HIPAA, and FBA.

Maintenance and Support:

Apply Couchbase security patches and upgrades, support Cassandra and MongoDB as part of the pager duty rotation, and collaborate with Couchbase product support to resolve issues.
Required education
Bachelor's Degree
Required technical and professional expertise

System Monitoring and Troubleshooting:

Strong skills in monitoring/observability, issue response, and troubleshooting for optimal system performance.

Automation Proficiency:

Proficiency in automation for production environment changes, streamlining processes for efficiency, and reducing toil.

Linux Proficiency:

Strong knowledge of Linux operating systems.

Operation and Support Experience:

Demonstrated experience in handling day-to-day operations, alert management, incident support, migration tasks, and break-fix support.
Experience with Infrastructure as Code (Terraform/OpenTofu).
Experience with the ELK/EFK stack (Elasticsearch, Logstash/Fluentd, and Kibana).
Preferred technical and professional experience

Kubernetes/OpenShift:

Experience working with production Kubernetes/OpenShift environments is strongly preferred.

Automation/Scripting:

In-depth experience with Ansible, Python, Terraform, and CI/CD tools such as Jenkins, IBM Continuous Delivery, and ArgoCD.

Monitoring/Observability:

Hands-on experience crafting alerts and dashboards using tools such as Instana, New Relic, and Grafana/Prometheus.
Experience working in an agile team, e.g., with Kanban.

IBM

Information Technology

Armonk