AI/ML Engineer – Speech & TTS

0 years

0 Lacs

Posted: 23 hours ago | Platform: LinkedIn

Work Mode: On-site

Job Type: Full Time

Job Description

We are looking for a passionate AI/ML Engineer to work on real-time Voice & TTS systems.


Key Responsibilities:


TTS Development & Optimization

  • Design, train, and fine-tune text-to-speech (TTS) models with very low inference latency suitable for real-time applications.
  • Work on end-to-end pipelines: text normalization, phoneme conversion, acoustic modeling, vocoder optimization.
  • Implement streaming & low-latency architectures for speech generation.


Multilingual Speech Systems

  • Build scalable TTS engines supporting all major Indian languages.
  • Handle code-mixing & transliteration (e.g., Hinglish, Tanglish).
  • Train multilingual or cross-lingual TTS models leveraging transfer learning.


Model Training & Deployment

  • Collect, clean, and preprocess large-scale Indian language speech datasets.
  • Train models using frameworks like PyTorch, TensorFlow, or JAX.
  • Optimize for GPU/TPU inference with libraries like ONNX, TensorRT, OpenVINO.
  • Deploy models in real-time APIs (FastAPI, gRPC, WebRTC, LiveKit, etc.).


Research & Innovation

  • Experiment with the latest speech synthesis architectures (FastSpeech2, VITS, Glow-TTS, HiFi-GAN, etc.).
  • Improve voice naturalness, prosody, and expressiveness while keeping latency under 150 ms.
  • Explore few-shot or zero-shot voice cloning for Indian voices.


Required Skills & Qualifications:

  • Strong background in AI/ML, Speech Processing, or Computational Linguistics.
  • Hands-on experience with TTS frameworks (Tacotron, FastSpeech, VITS, Glow-TTS, StyleTTS2, Vall-E, etc.).
  • Expertise in deep learning frameworks (PyTorch, TensorFlow, JAX).
  • Experience in low-latency inference optimization (ONNX, TensorRT, quantization, pruning).
  • Knowledge of Indian languages, phonetics, and linguistic diversity.
  • Experience with large dataset handling, audio preprocessing, feature extraction (MFCC, mel-spectrograms).
  • Familiarity with streaming systems and real-time audio APIs.
  • Strong programming skills in Python, C++ (for inference optimization), and cloud deployment (GCP/AWS/Azure).


Preferred / Nice-to-Have

  • Research publications or contributions in speech synthesis / TTS / ASR.
  • Experience with speech data augmentation techniques.
  • Prior work on cross-lingual speech models or Indian-language AI projects.
  • Familiarity with Rust/Go for real-time systems.
  • Experience with Voice Cloning, Emotion Modeling, and Prosody Transfer.


What We Offer

  • Opportunity to build India’s next-generation speech AI platform.
  • Work on challenging problems in real-time multilingual AI voice systems.
  • Competitive compensation with equity options (if applicable).
  • A fast-paced, collaborative startup culture with direct impact on product innovation.
