Speech AI Engineer (Voice AI & Multilingual Conversational Systems)

4 - 6 years

0 Lacs

Posted: 1 week ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

Overview

We're building the future of voice: intelligent, expressive, multilingual systems that can understand, respond, and connect with humans naturally.

As our Speech AI Engineer, you will combine deep learning, linguistics, and emotion modeling to build voice systems in Arabic, English, and Hindi.

This role is for those who thrive on pushing limits, where milliseconds matter and the voice you create will represent the next leap in human-AI interaction.

Key Responsibilities

  • Design and deploy ASR/STT systems using Whisper, NeMo, or Azure Speech.
  • Develop TTS pipelines capable of expressive, multilingual synthesis (Arabic, English, Hindi).
  • Fine-tune and customize models for dialects and prosody control, especially Emirati Arabic.
  • Build robust speech preprocessing, noise reduction, and diarization systems.
  • Integrate voice AI into live contact center flows with CRM and LLM backends.
  • Work with linguists and UX teams to build emotionally adaptive voice personas.
  • Optimize speech models for GPU inference, batch streaming, and low-latency response.

Preferred Project Experience

  • Minimum 4 years of experience delivering results in fast-paced, high-pressure project environments.
  • Delivered production-grade speech models (contact centers, IVRs, assistants).
  • Fine-tuned or trained custom TTS/ASR for specific regions or accents.
  • Integrated speech pipelines with LLMs in deployed systems.
  • Experience with voice analytics, emotion detection, or speaker ID systems.
  • Provide a sample, demo, or GitHub link showing your speech or voice-related project.

Minimum Qualification

  • Bachelor's degree in Computer Science, Electrical Engineering, or Linguistics with AI focus.

Preferred Qualification

  • Master's in Speech Processing, Computational Linguistics, or AI/ML.
  • Specialization in Signal Processing, Deep Learning for Audio, or Multilingual Speech Systems.
  • Research or thesis work in speech synthesis, accent modeling, or dialogue systems.
