Posted: 3 days ago | Platform: Naukri

Work Mode : Work from Office

Job Type : Full Time

Job Description

About Client

Hiring for One of the Most Prestigious Multinational Corporations!



Job Title :


Qualification :


Relevant Experience :


Must Have Skills :

  • 8+ years of overall experience in roles such as Site Reliability Engineering, DevOps, or Linux Systems Engineering.
  • 5+ years of hands-on, intensive experience administering, automating, and troubleshooting Red Hat OpenShift (OCP 4.x preferred) in large-scale production environments.
  • Proven experience in a senior or lead engineering role, demonstrating ownership of complex projects and mentorship of others.

Technical Skills

  • Expert-Level OpenShift: Deep, authoritative knowledge of OCP installation (IPI/UPI), upgrades, cluster administration, node management, and disaster recovery.
  • Kubernetes Mastery: A fundamental and deep understanding of Kubernetes architecture and components (etcd, kube-apiserver, scheduler, etc.) and Operators (OLM).
  • Infrastructure as Code (IaC): Strong proficiency with Ansible and Terraform for automating infrastructure provisioning and configuration management.
  • Programming/Scripting: Advanced scripting and software development skills in Python or Go, as well as Bash.
  • Observability: Hands-on experience building and managing monitoring and logging solutions (e.g., Prometheus, Grafana, Thanos, Alertmanager, ELK Stack, Splunk, Fluentd/Vector/OTEL).
  • CI/CD & GitOps: Expertise with CI/CD tooling (e.g., Tekton, Jenkins, GitLab CI, ArgoCD, GitHub Actions).
  • Core Infrastructure: Strong proficiency in Linux/RHEL administration, networking (SDN, OVS, routing, firewalls, load balancers), and storage (Ceph, NFS, block storage, object storage).


Good to Have Skills :

  • Analytical Mindset: Exceptional problem-solving skills with the ability to diagnose complex technical issues across multiple platform layers.
  • Ownership and Accountability: A strong sense of ownership and the drive to see issues through to resolution.
  • Communication: Excellent communication and interpersonal skills, capable of explaining complex topics to both technical and non-technical audiences.
  • Composure: Ability to remain calm and effective under pressure during critical incidents.

On-Call

  • Willingness to participate in a 24x7 on-call rotation to handle critical platform incidents.

Roles and Responsibilities :

  • Define and Uphold Reliability Standards: Establish and manage Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets for the OpenShift platform and its core services.
  • Automate Everything: Design, build, and maintain robust automation to handle the full lifecycle of OpenShift clusters, including provisioning, upgrades, patching, scaling, and disaster recovery.
  • Reduce Toil: Proactively identify and eliminate manual, repetitive operational work by developing and maintaining automation scripts (Python, Go, Bash) and Ansible playbooks.
  • Incident Response and Root Cause Analysis: Lead high-severity incident response and conduct deep, blameless post-mortems to identify root causes and implement permanent fixes that prevent recurrence.
  • Proactive Health Management: Develop and implement automated health checks and self-healing capabilities to ensure cluster and application resilience.
  • Subject Matter Expertise: Serve as the top-tier technical authority for OpenShift Container Platform architecture, networking (OVN-Kubernetes, SDN), load balancing, cross-cluster management, storage (OpenShift Data Foundation/Ceph), and security.
  • Observability: Architect and manage a comprehensive observability stack (e.g., Prometheus, Grafana, ELK/Fluentd) to provide deep insights into platform and application performance.
  • CI/CD and GitOps: Engineer and optimize CI/CD pipelines for both platform components and tenant applications, championing GitOps principles for declarative configuration management.
  • Capacity and Performance: Conduct advanced performance tuning, load testing, and capacity planning to ensure the platform can meet future demand.


Location :


CTC Range :


Notice period :


Shift Timing :


Mode of Interview :


Mode of Work :


Bhuvaneshwari S

Senior Specialist

Black and White outsourcing Pvt Ltd

Bangalore, Karnataka, India.

bhuvaneshwari@blackwhite.in | www.blackwhite.in


Black And White Business Solutions

Consulting

Business City
