Dive into model architectures (ASR / TTS / SLMs) and optimize them for specific GPUs and hardware profiles
Build, debug, and tune kernels using CUDA / Tinygrad / AMD toolchains
Convert, optimize, and benchmark models using TensorRT, ONNX, and other inference engines
Work hands-on with PyTorch to train, fine-tune, and evaluate real-time speech models
Run large-scale experiments, manage datasets, and analyze model performance
Productionize models for ultra-low latency speech workloads
Collaborate with research, infra, and product teams to push models into production

Requirements

Strong experience with CUDA, Tinygrad, AMD GPU toolkit, or similar low-level GPU programming stacks
Hands-on proficiency with PyTorch and Python
Deep understanding of neural networks, training dynamics, and optimization
Experience handling and processing large datasets
Familiarity with production inference pipelines
Strong problem-solving skills, with the ability to dig deep into performance bottlenecks

Great to Have

Experience training speech models (ASR, TTS, SSL, etc.)
Familiarity with audio encoders, decoders, and waveform models
Experience with MLOps, experiment tracking, and deployment pipelines
Experience training or fine-tuning models for production use or published papers
Experience with TensorRT and ONNX Runtime
