Posted: 1 week ago
On-site
Full Time
What You’ll Do
Dive into model architectures (ASR / TTS / SLMs) and optimize them for specific GPUs and hardware profiles
Build, debug, and tune kernels using CUDA / Tinygrad / AMD toolchains
Convert, optimize, and benchmark models using TensorRT, ONNX, and other inference engines
Work hands-on with PyTorch to train, fine-tune, and evaluate real-time speech models
Run large-scale experiments, manage datasets, and analyze model performance
Productionize models for ultra-low latency speech workloads
Collaborate with research, infra, and product teams to push models into production
Requirements
Strong experience with CUDA, Tinygrad, AMD GPU toolkit, or similar low-level GPU programming stacks
Hands-on proficiency with PyTorch and Python
Deep understanding of neural networks, training dynamics, and optimization
Experience handling and processing large datasets
Familiarity with production inference pipelines
Strong problem-solving skills and the ability to dig deep into performance bottlenecks
Great to Have
Experience training speech models (ASR, TTS, SSL, etc.)
Familiarity with audio encoders, decoders, and waveform models
Experience with MLOps, experiment tracking, and deployment pipelines
Experience training or fine-tuning models for production or for published papers
Experience with TensorRT and ONNX Runtime
Smallest
Bengaluru, Karnataka
Experience: Not specified
Salary: Not disclosed