GPU Programming Engineer

3 years

0 Lacs

Saidapet, Chennai, Tamil Nadu

Posted: 1 month ago | Platform: Indeed

Skills Required

programming, software development, code, AI, architecture, portfolio, learning, video, processing, mobile, algorithms, CUDA, OpenCL, tuning, ML, integration, analyze, profiling, benchmarking, optimization, scalability, electrical engineering, Linux

Work Mode

On-site

Job Type

Full Time

Job Description

Job Information

Department Name: Platforms & Compilers
Job Type: Full time
Date Opened: 14/05/2025
Industry: Software Development
Minimum Experience In Years: 3
Maximum Experience In Years: 5
City: Saidapet
Province: Tamil Nadu
Country: India
Postal Code: 600089

About Us

MulticoreWare is a global software solutions and products company headquartered in San Jose, CA, USA. With offices worldwide, it serves clients and partners in the North America, EMEA and APAC regions. Started by a group of researchers, MulticoreWare has grown to serve its clients and partners on HPC and Cloud computing, GPUs, multicore and multithreaded CPUs, DSPs, FPGAs and a variety of AI hardware accelerators.

MulticoreWare was founded by a team of researchers who wanted a better way to program for heterogeneous architectures. With the advent of GPUs and the increasing prevalence of multi-core, multi-architecture platforms, our clients were struggling to use these platforms efficiently. We started as a bootstrapped services company and have since expanded our portfolio to span products and services related to compilers, machine learning, video codecs, image processing and augmented/virtual reality. Our hardware expertise has also expanded with our team; we now employ experts on HPC and Cloud Computing, GPUs, DSPs, FPGAs, and mobile and embedded platforms. We specialize in accelerating software and algorithms, so if your code targets a multi-core, heterogeneous platform, we can help.

Job Description

Job Summary

We are seeking an experienced GPU Programming Engineer to join our team. In this role, you will focus on developing, optimizing, and deploying GPU-accelerated solutions for high-performance machine learning workloads. The ideal candidate has strong expertise in GPU programming across one or more platforms (e.g., NVIDIA CUDA, AMD ROCm/HIP, or OpenCL) and is comfortable working at the intersection of parallel computing, performance tuning, and ML system integration.

Key Responsibilities

Develop, optimize, and maintain GPU-accelerated components for machine learning pipelines using frameworks such as CUDA, HIP, or OpenCL.
Analyze and improve GPU kernel performance through profiling, benchmarking, and resource optimization.
Optimize memory access, compute throughput, and kernel execution to improve overall system performance on the target GPUs.
Port existing CPU-based implementations to GPU platforms while ensuring correctness and performance scalability.
Work closely with system architects, software engineers, and domain experts to integrate GPU-accelerated solutions.

Required Qualifications

Bachelor's or master's degree in Computer Science, Electrical Engineering, or a related field.
3+ years of hands-on experience in GPU programming using CUDA, HIP, OpenCL, or other GPU compute APIs.
Strong understanding of GPU architecture, memory hierarchy, and parallel programming models.
Proficiency in C/C++ and hands-on experience developing on Linux-based systems.
Familiarity with profiling and tuning tools such as Nsight, rocprof, or Perfetto.

Preferred Qualifications

Familiarity with cuDNN, TensorRT, OpenCL, or other GPU computing libraries.
