HPC System Engineer

HireKul

0 years

0 Lacs

India

Posted: 2 months ago

Skills Required

cutting, storage, kubernetes, orchestration, server, deployment, integration, ai, ml, data, design, support, optimization, backup, retention, scalability, monitoring, security, compliance, planning, scaling, leadership, troubleshooting, devops, networking

Work Mode

On-site

Job Type

Full Time

Job Description

We are seeking an experienced HPC (High-Performance Computing) System Engineer to design, implement, and manage cutting-edge HPC infrastructure using Dell servers, AMD GPUs (MI210), and Pure Storage systems. The ideal candidate will have expertise in Commvault backup systems, Kubernetes container orchestration, and multitenancy configurations, ensuring scalable, GPU-accelerated, and high-performance solutions tailored to enterprise and HPC workloads.

Key Responsibilities:

Dell Servers:

• Architect and deploy HPC systems using Dell PowerEdge servers, ensuring high availability and optimized performance for compute-intensive applications.

• Manage server hardware lifecycle, including deployment, upgrades, and diagnostics.

• Configure HPC cluster nodes for seamless integration with Kubernetes and GPU workloads.

AMD GPUs (MI210):

• Deploy and optimize AMD GPU-based servers to accelerate AI/ML, HPC, and data-intensive applications.

• Monitor GPU utilization, troubleshoot performance bottlenecks, and optimize workloads for GPU acceleration.

• Integrate GPUs into Kubernetes environments for containerized GPU-based applications.

Pure Storage:

• Design and manage Pure Storage solutions, including FlashBlade, to support HPC and data-intensive workloads.

• Implement multitenancy configurations for isolated, secure, and efficient resource utilization.

• Monitor storage health and ensure performance optimization for high-speed data access.

Commvault Backup:

• Architect and manage enterprise-wide Commvault backup solutions, ensuring data integrity and readiness for disaster recovery.

• Implement backup and retention policies for HPC environments, including containerized and GPU-accelerated workloads.

Kubernetes Container Management:

• Deploy and manage Kubernetes clusters for HPC applications, ensuring scalability and fault tolerance.

• Configure persistent storage for containerized workloads and integrate storage with GPUs for high-performance data processing.

• Monitor cluster performance and troubleshoot HPC-specific Kubernetes challenges.

System Optimization and Monitoring:

• Implement advanced monitoring solutions for servers, GPUs, storage, and Kubernetes clusters to ensure peak performance.

• Develop and enforce policies for system security, resource allocation, and compliance with industry standards.

• Lead capacity planning and scaling initiatives for HPC infrastructure.

Team Leadership and Collaboration:

• Mentor and guide junior engineers on HPC best practices, system design, and troubleshooting techniques.

• Collaborate with cross-functional teams, including data scientists and DevOps, to align infrastructure capabilities with organizational goals.

Qualifications:

Technical Skills:

• Extensive experience with Dell PowerEdge servers in HPC or enterprise environments.

• Proven expertise in AMD GPUs (MI210), including their integration and optimization for AI/ML and HPC workloads.

• Advanced knowledge of Pure Storage systems, including multitenancy and high-performance configurations.

• Expertise in Commvault backup systems, including design, deployment, and disaster recovery.

• Strong proficiency in Kubernetes container orchestration, particularly for GPU-accelerated applications.

• Knowledge of high-performance interconnects (e.g., RDMA, InfiniBand) and networking for HPC.
