5.0 - 8.0 years
55 - 60 Lacs
Bengaluru
Work from Office
Key Skills: CUDA, GPU Kernels, C++, Python, GPU Optimization, Triton, ROCm, PyTorch Extensions, Distributed Inference, Mixed Precision.

Roles & Responsibilities:
Develop, optimize, and maintain GPU kernels (CUDA, Triton, ROCm) for diffusion, attention, and convolution operators in generative AI models.
Profile end-to-end inference pipelines (data movement, kernel scheduling, memory transfers) to identify and resolve performance bottlenecks.
Apply optimization techniques such as operator fusion, tiling, caching, and mixed-precision compute to maximize GPU throughput.
Collaborate with research teams to productionize experimental layers or model architectures.
Build benchmarking tools and micro...
Posted 6 hours ago