8.0 - 12.0 years
75 - 80 Lacs
Bengaluru
Work from Office
Key Skills: Triton, C++, GPU Runtime Optimization, Multi-GPU Systems, TVM, XLA, MLIR, ROCm, Transformer Inference.

Roles & Responsibilities:
Architect high-performance inference runtimes, kernel dispatchers, and memory planners for large diffusion and transformer workloads.
Lead investigations into cross-GPU performance bottlenecks, communication overheads, and scheduling inefficiencies.
Drive multi-GPU parallelism strategies, including model, pipeline, and tensor parallelization.
Establish company-wide GPU optimization standards, tooling, and SLIs.
Collaborate with research teams to design scalable implementations of novel architectures.
Mentor engineers in profiling, tuning, and low-level ...
Posted 3 weeks ago