Posted: 2 weeks ago | Platform: LinkedIn

Work Mode: On-site

Job Type: Full Time

Job Description

Job Title: OpenShift Virtualization Engineer / Cloud Platform Engineer (OpenShift Virtualization)

Experience: 5+ Years

Location: Chennai

Job Summary:

Key Responsibilities:

  • Observability, Monitoring, Logging and Troubleshooting
  • Implement and maintain comprehensive end-to-end observability solutions (monitoring, logging, tracing) for the OpenShift Virtualization (OSV) environment, including integration with tools like Dynatrace and Prometheus/Grafana
  • Explore and implement Event-Driven Architecture (EDA) for enhanced real-time monitoring and response
  • Develop capabilities to flag and report abnormalities and identify "blind spots"
  • Perform deep-dive Root Cause Analysis (RCA), using available tooling where appropriate, to quickly identify and resolve issues across the global compute environment
  • Locate unhealthy components across the global compute environment (the "needle in a haystack") for faster time to resolution
  • Monitor VM health, resource usage, and performance metrics proactively
  • Monitor for unusual activity that might indicate a compromise or misconfiguration
  • Solution Design & Consulting
  • Provide technical consulting and expertise to application teams requiring OSV solutions
  • Design, implement, and validate custom or dedicated OSV clusters and VM solutions for critical applications with unique or complex requirements (e.g., specialized appliances)
  • Knowledge Management
  • Create, maintain, and update comprehensive internal documentation and customer-facing content on the Ford docs site to facilitate self-service and clearly articulate platform capabilities

Skills Required:

Kubernetes, OpenShift, Deployment, Services, Storage Capacity Management, Linux, Network Protocols & Standards, Python, Scripting, Dynatrace, VMware, Problem Solving, Technical Troubleshooting, Communications, Ability to communicate and work with cross-functional teams and all levels of management, Red Hat, Terraform, Tekton, Ansible, GitHub, GCP, AWS, Azure, Cloud Infrastructure, CI/CD, DevOps


Experience Required:

Required Qualifications:

  • Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
  • 5+ years of experience in IT infrastructure, with at least 2 years focused on Kubernetes and/or OpenShift.
  • Proven experience with OpenShift Virtualization (KubeVirt) in a production environment.
  • Strong understanding of Kubernetes concepts (Pods, Deployments, Services, Storage Classes, Operators, Custom Resources).
  • Experience with Linux administration and networking fundamentals.
  • Proficiency in scripting languages (e.g., Bash, Python) for automation.
  • Experience with monitoring tools (e.g., Prometheus, Grafana, Dynatrace) and logging solutions.
  • Solid understanding of virtualization concepts and technologies (e.g., KVM, VMware).
  • Excellent problem-solving skills and the ability to troubleshoot complex issues across multiple layers of the stack.
  • Strong communication and collaboration skills.

Experience Preferred:

Preferred Qualifications:

  • Red Hat Certified Specialist in OpenShift Virtualization or other relevant certifications.
  • Experience with Infrastructure as Code (IaC) tools like Ansible, Terraform, or OpenShift GitOps.
  • Familiarity with software-defined networking (SDN) and software-defined storage (SDS) solutions.
  • Experience with public cloud providers (AWS, Azure, GCP) and hybrid cloud architectures.
  • Knowledge of CI/CD pipelines and DevOps methodologies.


Additional Information:

Key Responsibilities:

  • Capacity Management
  • Conduct capacity planning and forecasting for the OpenShift Virtualization platform, including compute, memory, storage, and network resources, to ensure scalability and prevent resource exhaustion.
  • Analyze resource utilization trends and make recommendations for infrastructure scaling, consolidation, or optimization.
  • Collaborate with application teams and stakeholders to understand future demand and project capacity needs.
  • Develop and maintain capacity models and reports to support strategic planning.
  • OSV Automation & Efficiency
  • Develop automation solutions (scripts, playbooks) for repetitive OSV tasks, including configuration changes, VM management (such as snapshot removal), auditing, remediation, and integration with ticketing systems
  • Use automation to deliver operator updates and changes efficiently at scale
  • Implement Site Reliability Engineering (SRE) principles and practices to improve overall platform stability, performance, and operational efficiency
  • Role-Based Access Control (RBAC) deployment and auditing
  • Namespace and Resource Quota management (CPU, Disk and Storage)
