Site Reliability Engineering Professional

4 years

0 Lacs

Posted: 6 days ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Introduction

A career in IBM Software means you’ll be part of a team that transforms our customers’ challenges into solutions. Seeking new possibilities and always staying curious, we are a team dedicated to creating the world’s leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless global opportunities for our IBMers, so the door is always open for those who want to grow their career.

IBM’s product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.

Your Role And Responsibilities

As a Site Reliability Engineer, you will work in an agile, collaborative environment to build, deploy, configure, and maintain systems for the IBM client business. In this role, you will lead the problem resolution process for our clients, from analysis and troubleshooting to deploying the latest software updates and fixes.

Infrastructure & Cloud Management:

  • Design, build, and manage scalable cloud infrastructure using IBM Cloud, AWS, GCP, and Azure.
  • Implement Infrastructure as Code using Terraform.
  • Deploy and configure applications using container orchestration platforms like Kubernetes/OpenShift.

Automation & CI/CD:

  • Develop and maintain automation scripts and tools using Python, Groovy, and Ansible.
  • Build and manage robust CI/CD pipelines using tools like Jenkins, IBM Continuous Delivery, and ArgoCD.

System Monitoring & Reliability:

  • Monitor health and performance of production systems (24x7 observability).
  • Use tools like Instana, Grafana/Prometheus, and New Relic to build alerts and dashboards.
  • Troubleshoot and resolve production issues in collaboration with engineering and support teams.

Security & Compliance:

  • Perform regular patching and upgrades, and collaborate with product support to resolve issues.

Database & Middleware:

  • Manage open-source middleware and databases such as PostgreSQL, CouchDB, Redis, Kafka, and Spark.
  • Participate in incident response and on-call rotations.

Required Qualifications:

Required technical and professional expertise

  • 4+ years of experience as a DevOps or SRE Engineer.
  • Experience with at least one major public cloud provider or a large-scale private/hybrid cloud using container orchestration.
  • Experience with a modern configuration management and/or infrastructure management framework (Ansible, Puppet, Chef, Terraform, etc.).
  • Production experience with one or more monitoring frameworks (Prometheus, Nagios, etc.).
  • Strong scripting skills in at least one language (Bash, Python, Ruby, etc.).
  • Experience with source control management systems (Git, Subversion, etc.).
  • Familiarity with Kubernetes or OpenShift platforms.
  • Good understanding of networking and of CI/CD processes and tools (e.g., Jenkins).
  • Solid grasp of monitoring, observability, and troubleshooting production environments.
  • Hands-on experience with Linux & Windows systems administration.
  • Excellent collaboration, communication, and problem-solving skills.

IBM

Information Technology

Armonk