ML Engineer - CUDA SDK

4 years

0 Lacs

Posted: 3 months ago | Platform: SimplyHired

Work Mode

Remote

Job Type

Full Time

Job Description

Job Summary

We’re looking for a highly skilled Machine Learning Engineer with extensive experience in CUDA SDK to enhance, port, and validate PyTorch-based Large Language Models (LLMs) for deployment on custom AI processors. This remote role offers the opportunity to work with cutting-edge hardware and collaborate with a cross-functional engineering team across the U.S.

Key Responsibilities:

- Port and validate PyTorch-based LLMs on proprietary AI hardware using CUDA SDK APIs
- Extend and optimize CUDA code for compatibility and performance improvements
- Debug low-level integration issues between CUDA and PyTorch environments
- Replace off-the-shelf CUDA components with custom implementations as needed
- Develop tools and frameworks for validation and testing of LLMs
- Collaborate with AI hardware teams to ensure seamless deployment and performance tuning
- Profile and tune GPU kernels for speed, memory efficiency, and system scalability

Required Qualifications:

- Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or a related field
- Strong hands-on experience with CUDA programming
- In-depth knowledge of PyTorch and large-scale deep learning models, especially LLMs
- Proficient in C++ and Python
- Experience debugging complex software and performance bottlenecks
- Solid understanding of GPU architectures and memory management
- Excellent problem-solving skills and communication abilities

Preferred Qualifications:

- Familiarity with AI accelerator architectures
- Experience with TensorFlow or other deep learning frameworks
- Exposure to performance tuning tools (e.g., Nsight, nvprof, VTune)
- Experience working in remote, distributed engineering environments

Personal Attributes:

- Passionate about AI/ML performance and innovation
- Strong attention to detail and ownership mindset
- Comfortable working independently in a fast-paced, virtual team
- Eager to solve real-world problems with cutting-edge technologies

Job Type: Full-time

Pay: ₹30.00 - ₹35.00 per hour

Experience:

- CUDA programming: 4 years (Required)
- C++ and Python: 3 years (Required)
- PyTorch and LLM: 3 years (Required)

Work Location: In person
