
5 High-Performance Computing Jobs

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

6.0 - 11.0 years

8 - 13 Lacs

Chennai, Bengaluru

Work from Office

Your Opportunity: As an HPC Architect, you will get the opportunity to architect high-performance computing solutions from scratch and to design and optimize all aspects (compute, memory, networking, storage) for a better cost of ownership.

Roles and Responsibility: As an architect, you will be responsible for designing HPC infrastructure solutions, including compute, networking, storage, and workload management components. You will work closely with cross-functional teams, including hardware, software, product management, and business stakeholders, to understand compute workloads and translate them into platform architecture and designs that meet business needs. You will create and maintain detailed system architecture diagrams and specifications. You will evaluate and select appropriate hardware and software components for HPC environments. You will install, configure, and maintain HPC systems, including hardware, software, and networking components. You will develop and implement automation scripts for system management and deployment. You will be a subject matter expert who unblocks dependent teams in the HPC domain. You will be expected to develop system benchmarks, profile systems to understand bottlenecks, and optimize workflows and processes to improve cost of ownership. You will identify and mitigate technical risks and issues throughout the HPC development life cycle. You will ensure that the compute cluster is resilient, reliable, and maintainable. You will be expected to stay abreast of the latest HPC technologies, including hardware, software, and networking solutions. Your primary focus will be to understand the compute workload and design an HPC cluster with the right combination of nodes, CPU/GPU, memory, interconnects, and storage to achieve optimum performance at minimum cost of ownership.

Our Ideal Candidate: Someone who has the drive and passion to learn quickly and the ability to multitask and switch contexts based on business needs.

Qualifications: In-depth experience with Linux system administration and hardware/software configuration. Strong knowledge of HPC technologies, including cluster computing, high-speed interconnects (InfiniBand, RoCE), and parallel filesystems (Lustre, GPFS, BeeGFS, etc.). Experience in creating and maintaining operating system images with different installation and boot schemes. Extremely good with automation tools like Ansible, Chef, and SaltStack, and with scripting languages (Python and Bash). Experience in creating and maintaining storage solutions with different RAID configurations. Ability to design storage solutions for different IOPS and access patterns (random vs. sequential read/write) and to tune storage and filesystems for better performance. Good knowledge of networking concepts, including IP addressing, routing, protocols, switch configuration for RDMA, VLAN configuration, network bonding, etc. Good knowledge of virtualization and of hardware and software hypervisors. Good knowledge of containerization technologies like Docker and Singularity. Experience in software-defined networking and storage. Experience in setting up remote management protocols like IPMI, Redfish, etc. Experience in setting up and using monitoring systems like Prometheus and Grafana. Experience in system profiling and custom tuning for target workloads to achieve higher performance and a low cost of ownership. Very good written and verbal communication skills. Very good at technical documentation meant to serve as manuals for non-experts in the field.

Additional Qualifications: Experience in HPC cluster management and workload orchestration software (e.g. SLURM, Torque, LSF). Experience in setting up deep-learning training/inference solutions. Experience in private cloud infrastructure like Kubernetes, OpenStack, CloudStack, etc. Experience in distributed high-performance computing and parallel programming frameworks. Good knowledge of low-latency and high-throughput data transfer technologies (RDMA on RoCE, InfiniBand).

Education: Bachelor's degree or higher in Computer Science or related disciplines.

Posted 1 month ago


8.0 - 12.0 years

8 - 12 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

8+ years of experience in managing Linux setups. 4+ years of experience in HPC/Linux clusters. Install, administer, and maintain hardware, system software, networking, accounts, and security measures on a VMware configuration. Diagnose and correct system issues, whether they concern correct operation or performance. Reinstate the integrity of the system as quickly as possible following an outage in order to minimize downtime. Triage and solve user-submitted tickets, especially when they relate to the infrastructure. Track resource usage using monitoring and queuing software. Actively participate in knowledge management by creating new technical documents. Patch system firmware and software as needed. Peer assistance is an added trait.

What you need to bring:

Technical Skills: Demonstrated expertise with Linux system administration, including OS, networking, storage, and security. Expertise with high-speed networking such as InfiniBand and 10/40 Gigabit Ethernet. Expertise with high-speed file transfer tools such as FileCatalyst. Familiarity with large storage systems. Some experience in a scripting language. Proven expertise in hypervisors. Knowledge of Horizon is preferred. Experience with Linux clusters. Knowledge of troubleshooting ESXi and vCenter performance issues. Knowledge of virtual machine snapshots and VMware VDP. Understanding of VMware Site Recovery Manager for disaster recovery.

Business Skills: Demonstrate strong written and verbal communication skills. Interact and collaborate across different technology teams within HPE. Must work towards achieving HPE's vision for our customers. Affinity with and a thorough understanding of support processes defined within HPE. Ability to work in a 24x7 environment in rotating shifts. Exhibit a Customer First and Customer Last attitude consistently. Ability to drive cases to closure and provide a case summary. Demonstrate a high level of technical and communication skills. Take responsibility for end-to-end problem ownership and its solutions.

Mandatory Key Skills: Ethernet, FileCatalyst, Hypervisor, VMware ESXi, VMware VDP, VMware Site Recovery Manager, networking, Linux system administration*, InfiniBand*, High-Performance Computing

Posted 1 month ago


6.0 - 10.0 years

3 - 11 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Designing software to support large-scale geometric data analysis and high-performance computing for OPC solutions. Optimizing infrastructure for distributed computing, ensuring seamless GPU integration. Collaborating with development teams to ensure efficient data handling and computational resource allocation. Debugging and troubleshooting infrastructure issues related to production line integration. Maintaining and troubleshooting the tool to meet performance and scalability requirements. Regularly contributing to the cutting edge of semiconductor development by enhancing software performance and scalability.

The Impact You Will Have: Advancing the development of high-performance silicon chips and software content. Enabling leading IC manufacturing through efficient software solutions. Contributing to the optimization of infrastructure for distributed computing. Ensuring seamless integration and operation of infrastructure components. Improving software performance and scalability for large-scale data analysis. Enhancing the overall efficiency and effectiveness of semiconductor development processes.

What You'll Need: M.S. or Ph.D. in Computer Science, Engineering, or the Physical Sciences. 6+ years of experience in software development, with a focus on computational geometry and distributed processing. Expertise in C++, Python, and distributed computing environments. Experience in debugging and troubleshooting production-related issues. Strong communication and collaboration skills to work as part of a global team.

Who You Are: A proactive problem solver with a passion for innovation. Detail-oriented with a focus on optimizing performance and scalability. An effective communicator with the ability to collaborate across teams. A self-motivated individual who can work independently with limited supervision. A sophisticated professional with advanced knowledge and wide-ranging experience.

Posted 2 months ago


6 - 8 years

18 - 20 Lacs

Bengaluru

Work from Office

High-Performance Computing (HPC), NetApp Storage Systems, Azure NetApp Files and Cloud Volumes ONTAP, Linux Systems Administration, Automation, Performance Monitoring and Optimization.

Posted 2 months ago


5 - 10 years

20 - 22 Lacs

Bengaluru

Work from Office

High-Performance Computing (HPC), Parallel Filesystems, Low Latency Networks, Cloud Platforms, Infrastructure as Code (IaC), Linux Fundamentals, Networking and Virtualization, Security and Compliance. Programming Skills: Proficiency in programming languages such as Python, C++, or Java, which are often used in HPC applications.

Posted 2 months ago


