AI Engineer (Voice AI & Real-Time Communication)

2 - 7 years

15 - 30 Lacs

Posted: 1 week ago | Platform: Naukri

Work Mode: Work from Office

Job Type: Full Time

Job Description

Voice AI & Real-Time Communication Engineer (Junior to Senior)

Aivar Innovations

Location:

Job Overview

Aivar Innovations is seeking Voice AI and Real-Time Communication Engineers/Architects to join our innovative team, focusing on building production-grade conversational AI systems and voice automation solutions. This role involves designing and implementing state-of-the-art voice agents that handle real-time audio processing, speech recognition, synthesis, and natural conversations at scale.

As part of the Aivar Innovations team, candidates will architect voice systems capable of powering thousands of conversations daily, maintaining sub-second latency, and gracefully handling production edge cases across diverse industry verticals.

Key Responsibilities

  • Design, develop, and deploy production-grade voice AI agents using frameworks like Pipecat and LiveKit
  • Build and optimize real-time voice processing pipelines with Speech-to-Text (STT) and Text-to-Speech (TTS) technologies
  • Implement speaker diarization systems to identify and segment multiple speakers in conversations
  • Develop voice communication infrastructure using WebRTC, WebSocket, SIP, and RTP protocols
  • Integrate voice agents with telephony systems (Twilio, Telnyx) for inbound and outbound calling
  • Architect low-latency, high-availability voice systems handling thousands of concurrent calls
  • Build voice orchestration layers connecting STT, LLMs, and TTS with minimal latency
  • Implement Voice Activity Detection (VAD), echo cancellation, and audio processing optimizations
  • Deploy and monitor voice AI applications in production cloud environments
  • Collaborate with product and engineering teams to define voice AI use cases and implement solutions

Required Technical Skills

Voice AI & Conversational Systems

  • Hands-on experience building voice agents using Pipecat, LiveKit, or similar frameworks
  • Deep expertise in Speech-to-Text (STT) systems: Deepgram, Whisper, AssemblyAI, Google STT, Azure Speech
  • Proficiency with Text-to-Speech (TTS) platforms: ElevenLabs, Cartesia, Amazon Polly, Azure TTS
  • Experience with speaker diarization and utterance segmentation for multi-speaker scenarios
  • Knowledge of voice agent orchestration platforms (VAPI, Retell) and custom implementations

Real-Time Communication Protocols

  • Strong understanding of WebRTC architecture, including ICE, STUN, TURN, and SRTP
  • Experience with SIP (Session Initiation Protocol) and RTP/RTCP for VoIP systems
  • Proficiency in WebSocket communication for real-time bidirectional data transfer
  • Knowledge of telephony integration, call routing logic, and media servers (FreeSWITCH, Asterisk)

Audio Processing & Media

  • Experience with audio codecs (Opus, G.711, G.729) and media streaming protocols
  • Understanding of Voice Activity Detection (VAD), echo cancellation, and noise suppression
  • Knowledge of audio processing pipelines and real-time media handling

LLM Integration & AI

  • Deep knowledge of Large Language Models (LLMs) and optimization for low-latency responses
  • Experience integrating conversational AI with voice pipelines (GPT-4o, Claude, etc.)
  • Prompt engineering and conversation design for natural voice interactions

Programming & Development

  • Strong programming skills in Python (primary), TypeScript, Node.js, or Golang
  • Proficiency with AI/ML frameworks: TensorFlow, PyTorch, Scikit-learn
  • Experience with real-time streaming systems and distributed architectures

Cloud & Infrastructure (Preferred)

  • AWS Services: Lambda, EC2, EKS, S3, DynamoDB, Amazon Bedrock, Polly, Transcribe
  • Knowledge of containerization (Docker, Kubernetes) and CI/CD pipelines
  • Experience deploying voice AI systems in production with monitoring and observability

Required Qualifications

  • Specialized experience building production voice AI systems handling real customers at scale
  • Demonstrated track record with conversational AI, voice agents, and real-time communication
  • Portfolio showcasing voice AI implementations, including latency optimization and call handling
  • Experience maintaining sub-second latency and handling edge cases in production voice systems

Preferred Add-Ons

  • Experience with low-code voice platforms (VAPI, Retell) and custom infrastructure development
  • Knowledge of media servers and SFU/MCU architectures for scalable voice systems
  • Familiarity with Indian language models for STT/TTS applications
  • Experience with voice analytics, call transcription, and quality monitoring systems
  • Background in VoIP development with Asterisk, FreeSWITCH, Kamailio, or OpenSIPS
  • Certifications in speech technologies or cloud platforms (AWS, Azure)

Preferred AWS Certifications

While AWS certifications are not mandatory initially, candidates possessing relevant certifications will be given preference:

  • AWS Certified Solutions Architect - Associate
  • AWS Certified Machine Learning - Specialty
  • AWS Certified Solutions Architect - Professional
  • AWS Certified AI Practitioner

Essential Soft Skills

  • Exceptional Communication: Ability to present complex voice AI architectures to technical and non-technical stakeholders
  • Collaborative Leadership: Proven experience working with cross-functional teams including product, operations, and customer-facing roles
  • Innovative Problem-Solving: Demonstrated ability to tackle production voice system challenges with creative solutions
  • Bias Toward Action: Ship daily, measure success by business impact, and iterate rapidly
  • Adaptability: Capacity to learn emerging voice AI technologies and frameworks in fast-paced environments
