Voice AI Engineer

2 years

Posted: 3 days ago | Platform: LinkedIn

Work Mode: On-site

Job Type: Full Time

Job Description

About Lyzr


Lyzr is a full-stack agent infrastructure platform that helps enterprises build, deploy, and govern autonomous AI agents across sales, marketing, operations, support, and more. Customers use Lyzr to run production agents in their own cloud (AWS / GCP / on-prem) with enterprise-grade observability, security, and control.


Voice is becoming one of the most important “front doors” to these agents — from AI SDRs doing outbound calls to service agents handling complex customer conversations. That’s where you come in.


About the Role


Senior AI Voice Engineer


You’ll own and scale our end-to-end voice agent pipeline that powers AI SDRs, customer support agents, and internal automation agents on calls. This is a hands-on, highly technical role where you’ll design and optimize low-latency, high-reliability voice systems on top of the Lyzr agent platform.


You’ll work closely with our founders, product, and platform teams, with significant ownership over architecture, benchmarks, and how voice shows up across all Lyzr agents.



What You’ll Do


  • Own the voice stack end-to-end – from telephony / WebRTC entrypoints to STT, turn-taking, LLM reasoning, and TTS back to the caller.
  • Design for real-time – architect and optimize streaming pipelines for sub-second latency, barge-in, interruptions, and graceful recovery on bad networks.
  • Integrate and tune models – evaluate, select, and integrate STT/TTS/LLM/VAD providers (and self-hosted models) for different use-cases, balancing quality, speed, and cost.
  • Build orchestration & tooling – implement agent orchestration logic, evaluation frameworks, call simulators, and dashboards for latency, quality, and reliability.
  • Harden for production – ensure high availability, observability, and robust fault-tolerance for thousands of concurrent calls in customer VPCs.
  • Collaborate with GTM teams – work with product, sales, and customer teams to prototype new voice experiences (AI SDRs, support agents, internal hotlines) and take them from PoC to production.
  • Shape the voice roadmap – influence how voice fits into our broader Agentic OS vision (simulation, analytics, multi-agent collaboration, etc.).


You’re a Great Fit If You Have


  • 2+ years of software engineering experience (backend or full-stack) in production systems.
  • Strong experience building real-time voice agents or similar systems using:
      • STT / ASR (e.g. Whisper, Deepgram, Assembly, AWS Transcribe, GCP Speech)
      • TTS (e.g. ElevenLabs, PlayHT, AWS Polly, Azure Neural TTS)
      • VAD / turn-taking and streaming audio pipelines
      • LLMs (e.g. OpenAI, Anthropic, Gemini, local models)
  • Proven track record designing and operating low-latency, high-throughput streaming systems (WebRTC, gRPC, websockets, Kafka, etc.).
  • Hands-on experience integrating ML models into live, user-facing applications with real-time inference & monitoring.
  • Solid backend skills with Python and TypeScript/Node.js; strong fundamentals in distributed systems, concurrency, and performance optimization.
  • Experience with cloud infrastructure – especially AWS (EKS, ECS, Lambda, SQS/Kafka, API Gateway, load balancers).
  • Comfortable working in Kubernetes / Docker environments, including logging, metrics, and alerting.
  • Startup DNA – at least 2 years in an early or mid-stage startup where you shipped fast, owned outcomes, and worked close to the customer.


Nice to Have


  • Experience self-hosting AI models (ASR / TTS / LLMs) and optimizing them for latency, cost, and reliability.
  • Telephony integration experience (e.g. Twilio, Vonage, Aircall, SignalWire, or similar).
  • Experience with evaluation frameworks for conversational agents (call quality scoring, hallucination checks, compliance rules, etc.).
  • Background in speech processing, signal processing, or dialog systems.
  • Experience deploying into enterprise VPC / on-prem environments and working with security/compliance constraints.


Why Lyzr


  • Massive leverage: Your work becomes the voice of multiple agents across banks, PE firms, manufacturers, and global enterprises.
  • Greenfield voice platform: You’re not just plugging into a legacy stack; you’re shaping how voice is done in an Agentic OS from the ground up.
  • High ownership: Direct access to founders and customers. You’ll see your work ship fast and impact real revenue.
  • Deep tech + real usage: This is where cutting-edge LLMs, voice, and serious enterprise use-cases meet.

