Principal Engineer - HPC/CUDA/GPU

5 - 9 years

0 Lacs

Posted: 2 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

You will play a pivotal role in the design and implementation of cutting-edge GPU computers optimized for demanding deep learning, high-performance computing, and computationally intensive workloads. Your expertise will be essential in identifying architectural enhancements and innovative approaches to accelerate our deep learning models. Addressing strategic challenges related to compute, networking, and storage design for large-scale, high-performance workloads will be a key responsibility. Additionally, you will contribute to effective resource utilization in a heterogeneous computing environment, evolve our cloud strategy, perform capacity modeling, and plan for growth across our products and services.

As an architect, you are tasked with translating business requirements pertaining to AI-ML algorithms into a comprehensive set of product objectives encompassing workload scenarios, end-user expectations, compute infrastructure, and execution timelines. This translation should culminate in a plan to operationalize the algorithms efficiently.

Furthermore, you will be responsible for benchmarking and optimizing Computer Vision Algorithms and Hardware Accelerators based on performance and quality KPIs. Your role will involve fine-tuning algorithms for optimal performance on GPU tensor cores and collaborating with cross-functional teams to streamline workflows spanning data curation, training, optimization, and deployment.

Providing technical leadership and expertise for project deliverables is a core aspect of this position, along with leading, mentoring, and managing the technical team to ensure successful outcomes. Your contributions will be instrumental in driving innovation and achieving project milestones effectively.

Key Qualifications:
- Possess an MS or PhD in Computer Science, Electrical Engineering, or a related field.
- Demonstrated expertise in deploying complex deep learning architectures.
- Minimum of 5 years of relevant experience in areas such as Machine Learning (with a focus on Deep Neural Networks), DNN adaptation and training, code development for DNN training frameworks (e.g., Caffe, TensorFlow, Torch), numerical analysis, performance analysis, model compression, optimization, and computer architecture.
- Strong proficiency in data structures, algorithms, and C/C++ programming.
- Hands-on experience with PyTorch, TensorRT, cuDNN, GPU computing (CUDA, OpenCL, OpenACC), and HPC (MPI, OpenMP).
- Thorough understanding of container technologies like Docker, Singularity, Shifter, and Charliecloud.
- Proficient in Python programming, bash scripting, and operating systems including Windows, Ubuntu, and CentOS.
- Excellent communication, collaboration, and problem-solving skills.

Good To Have:
- Practical experience with HPC cluster job schedulers such as Kubernetes, SLURM, LSF.
- Familiarity with cloud computing architectures.
- Hands-on exposure to Software Defined Networking and HPC cluster networking.
- Working knowledge of cluster configuration management tools like Ansible, Puppet, Salt.
- Understanding of fast, distributed storage systems and Linux file systems for HPC workloads.

This role offers an exciting opportunity to contribute to cutting-edge technology solutions and make a significant impact in the field of deep learning and high-performance computing. If you are a self-motivated individual with a passion for innovation and a track record of delivering results, we encourage you to apply.
