AI Scientist – Conversational & Voice Intelligence

0 years

0 Lacs

Posted: 1 day ago | Platform: LinkedIn

Work Mode: On-site

Job Type: Full Time

Job Description

About Zudu AI

human-like voice automation

Role Overview

AI Scientist

Key Responsibilities

  • Research & Development
    • Develop and fine-tune speech recognition (ASR), text-to-speech (TTS), and natural language understanding (NLU) models.
    • Build multi-turn conversational AI systems capable of contextual reasoning, grounding, and emotional intelligence.
    • Explore and implement retrieval-augmented generation (RAG) and memory-based dialogue systems for long-term contextuality.
    • Research prosody, emotion, and style transfer in speech synthesis for natural human-like delivery.
    • Evaluate and integrate open-source models (e.g., Whisper, Bark, FastSpeech, VITS, GPT-family models).
  • System & Data Integration
    • Work closely with platform engineers to deploy models in low-latency, production-grade environments.
    • Optimize inference performance on cloud and edge systems using quantization, distillation, and caching strategies.
    • Collaborate with the voice pipeline team to align model outputs with telephony and CPaaS audio protocols.
  • Experimentation & Evaluation
    • Design and conduct experiments to benchmark accuracy, naturalness, latency, and engagement.
    • Lead data annotation and synthetic voice data generation projects to improve training quality.
    • Publish findings internally (and externally when possible) to maintain Zudu AI’s leadership in enterprise voice automation.

Preferred Qualifications

  • Master’s or Ph.D. in Computer Science, AI, Speech Processing, or Computational Linguistics.
  • Strong background in deep learning, transformer architectures, and reinforcement learning.
  • Hands-on experience with PyTorch, TensorFlow, or JAX.
  • Expertise in one or more of the following areas:
    • Automatic Speech Recognition (ASR)
    • Text-to-Speech (TTS)
    • Natural Language Processing (NLP)
    • Multimodal or Emotion-aware AI
  • Understanding of LLM fine-tuning, prompt engineering, and retrieval systems (RAG, FAISS, Milvus).
  • Experience deploying models with Kubernetes, TorchServe, or ONNX Runtime.
  • Familiarity with speech datasets, phoneme-level modeling, and accent adaptation.

Nice-to-Have

  • Prior experience in telephony or conversational AI products.
  • Contributions to open-source speech or NLP projects.
  • Understanding of low-latency voice streaming pipelines (e.g., gRPC, WebRTC).
  • Exposure to emotion detection, sentiment analysis, or paralinguistic research.

Soft Skills

  • Curiosity and scientific rigor.
  • Ability to translate research into deployable, scalable solutions.
  • Collaborative mindset: able to work with engineers, product leads, and linguists.
  • Excellent written and verbal communication.