Principal Engineer - HPC/CUDA/GPU

5 - 9 years

0 Lacs

Posted: 16 hours ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

Role Overview: You will play a crucial role in designing and implementing cutting-edge GPU computers to support deep learning, high-performance computing, and computationally intensive workloads. Your expertise will be instrumental in identifying architectural changes and new approaches to accelerate deep learning models. As a key member of the team, you will address strategic challenges related to compute, networking, and storage design for large-scale workloads, ensuring effective resource utilization in a heterogeneous computing environment.

Key Responsibilities:
- Provide leadership in designing and implementing GPU computers for demanding deep learning, high-performance computing, and computationally intensive workloads.
- Identify architectural changes and new approaches to accelerate deep learning models.
- Address strategic challenges related to compute, networking, and storage design for large-scale workloads.
- Convert business needs associated with AI-ML algorithms into product goals covering workload scenarios, end-user expectations, compute infrastructure, and time of execution.
- Benchmark and optimize computer vision algorithms and hardware accelerators against performance and quality KPIs.
- Optimize algorithms for peak performance on GPU tensor cores.
- Collaborate with various teams to drive an end-to-end workflow from data curation and training to performance optimization and deployment.
- Provide technical leadership and expertise for project deliverables.
- Lead, mentor, and manage the technical team.

Qualifications Required:
- MS or PhD in Computer Science, Electrical Engineering, or a related field.
- Strong background in deployment of complex deep learning architectures.
- 5+ years of relevant experience in machine learning with a focus on Deep Neural Networks (DNNs).
- Experience adapting and training DNNs for various tasks.
- Proficiency in developing code for DNN training frameworks such as Caffe, TensorFlow, or Torch.
- Strong knowledge of data structures and algorithms, with excellent C/C++ programming skills.
- Hands-on expertise with PyTorch, TensorRT, cuDNN, GPU computing (CUDA, OpenCL, OpenACC), and HPC (MPI, OpenMP).
- In-depth understanding of container technologies such as Docker, Singularity, Shifter, and Charliecloud.
- Proficient in Python programming and Bash scripting.
- Proficient in Windows, Ubuntu, and CentOS operating systems.
- Excellent communication and collaboration skills.
- Self-motivated, with the ability to find creative, practical solutions to problems.

Additional Details: The company is seeking individuals with hands-on experience in HPC cluster job schedulers such as Kubernetes, SLURM, and LSF; familiarity with cloud computing architectures, Software Defined Networking, and HPC cluster networking; and experience with cluster configuration management tools such as Ansible, Puppet, and Salt. An understanding of fast, distributed storage systems and Linux file systems for HPC workloads is also beneficial.
