HPC Network Engineer

5 - 7 years

7 - 9 Lacs

Posted: 2 months ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

The High-Performance Computing Network Engineer is primarily responsible for the overall health and maintenance of storage technologies in our managed services customers' environments. Our Network Engineers are valued members of the Managed Services Infrastructure Practice, responsible for Tier 3 incident management, service request management, and change management infrastructure support for all Managed Services customers.

Key Responsibilities

- Provide enterprise-level operational support to Managed Services customers for incident, problem, and change management activities
- Plan and perform maintenance activities
- Assess customer environments for performance and design issues and propose resolutions
- Work across technical teams to troubleshoot complex infrastructure issues
- Create and maintain detailed documentation
- Serve as a subject matter expert and escalation point for storage technologies
- Work with vendors to resolve storage issues
- Communicate transparently with customers and the internal team
- Participate in the on-call rotation
- Complete training and certification as assigned to further skills and knowledge

Skills Required

- Bachelor's degree or equivalent in Information Systems or a related field. Unique education, specialized experience, skills, knowledge, training, or certification may be substituted for education
- 5+ years of expert-level experience managing network infrastructure in high-performance computing environments
- Experience configuring, maintaining, and troubleshooting Nvidia/Mellanox (Cumulus OS) switches is required
- Strong knowledge of Kubernetes and its networking components (CNI, Service Mesh, etc.)
- Understanding of VPNs, load balancers, VPCs, and hybrid cloud networking
- Experience with both Ethernet and InfiniBand networking
- 1+ years working with monitoring platforms; Elastic Observability is a bonus
- 1+ years working with an enterprise ITSM system; ServiceNow is a bonus
- Familiarity with high-performance computing (HPC) schedulers (e.g., SLURM, PBS, Torque) and their interaction with data storage systems
- Experience with network containerization (Docker, Singularity) in an HPC context for data processing and application deployment
- Solid working knowledge of Linux and Python scripting is a plus
- Previous experience with network automation tools such as Ansible, Puppet, or Chef is a plus
- Experience with machine learning or data science workflows in HPC environments is a plus
- Managed Services or consulting experience is required
- Strong background in customer service
- High-level problem-solving and communication skills
- Strong oral and written communication skills
- Related network certifications are a bonus
