AI Engineer - Voicebot

3 years

10 - 14 Lacs

Posted: 3 weeks ago | Platform: Glassdoor

Work Mode

On-site

Job Type

Full Time

Job Description

We are looking for a highly skilled Real-Time Voice AI Engineer with hands-on experience in building real-time voice bots, speech pipelines, and streaming AI systems using open-source technologies. This role involves working across the stack—speech recognition, voice generation, WebRTC streaming, ML deployment, and scalable Python microservices.

If you are passionate about building low-latency, high-performance voice experiences without relying on OpenAI, Azure, or Amazon Lex, this is the role for you.

Key Responsibilities

Real-Time Voice & Streaming Systems

  • Develop and maintain real-time voice bots using WebRTC, LiveKit, WebSockets, and telephony integrations.
  • Build and optimize low-latency voice streaming pipelines for conversational AI applications.
  • Ensure secure, robust, and scalable communication channels across all voice systems.

Speech & Generative AI Pipelines

  • Build ASR pipelines using Whisper, VAD, turn detection, and custom inference modules.
  • Implement, fine-tune, and deploy Generative AI models using Ollama, vLLM, HuggingFace, etc.
  • Optimize TTS streaming using open-source models such as Orpheus, Spark, Fish Audio, or similar.

Backend & ML Infrastructure

  • Develop Python-based inference microservices and MLOps pipelines for scalable deployment.
  • Optimize models via quantization, caching, and GPU-aware serving.
  • Ensure real-time performance with low-latency audio, fast inference, and efficient resource utilization.

Required Skills & Experience

  • 3+ years of relevant experience in real-time AI/voice streaming development (must be open-source focused).
  • Strong proficiency in Python, including multiprocessing, async programming, and microservice architecture.
  • Hands-on experience with:
      • ASR/TTS systems (Whisper, VAD, diarization, TTS models)
      • WebRTC, LiveKit, and WebSockets for real-time voice applications
      • Generative AI, model fine-tuning, quantization, and the HuggingFace ecosystem
      • MLOps tools, scalable inference systems, and production deployment
  • Experience with telephony integration (SIP, PSTN, Twilio, Asterisk, etc.) is a strong advantage.
  • Strong understanding of low-latency system design, GPU optimization, and secure streaming.

Job Types: Full-time, Permanent

Pay: ₹1,000,000.00 - ₹1,400,000.00 per year

Benefits:

  • Provident Fund

Experience:

  • Total work: 2 years (Preferred)
  • AI: 2 years (Preferred)
  • Voice streaming development: 2 years (Preferred)
