AI/ML Engineer – ASR, Speech Enhancement, Computer Vision & Cloud Deployment

0 years

0 Lacs

Posted: 2 days ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Employment type: Full time

Location: Kochi, Kerala


FriskaAi


Job Summary


This role covers Automatic Speech Recognition (ASR), Speech Enhancement, Computer Vision (CV), and Scalable Cloud Deployment, with a focus on real-time, low-latency, speech-driven and audio-visual AI applications.


Key Responsibilities


ASR and Speech recognition

  • Design, train, and optimize ASR models (Whisper, Conformer, Wav2Vec2, SpeechBrain, etc.) with a focus on speaker adaptation and impaired speech recognition.
  • Implement domain adaptation techniques (fine-tuning, transfer learning, LoRA) for accent, dialect, multilingual, and special-use cases.
  • Develop and integrate speaker recognition, diarization, and personalization modules.
  • Work with real-time/streaming ASR, VAD, and low-latency decoding.


Signal processing and speech enhancement

  • Apply speech enhancement, noise reduction, denoising, dereverberation, and echo cancellation algorithms to improve input quality.
  • Work with feature extraction methods (MFCC, PLP, spectrogram analysis, filterbanks) for robust ASR performance.
  • Handle far-field, multi-mic audio and beamforming-based processing.


Computer vision and lip reading

  • Develop and optimize computer vision models using CNNs, Vision Transformers, YOLO, and OpenCV.
  • Build and deploy face detection, face tracking, face recognition, and lip-reading (visual speech recognition) systems.
  • Develop audio-visual speech recognition (ASR + lip reading) for noisy and real-time environments.
  • Optimize CV models for real-time video inference.


Model Optimization

  • Apply quantization, pruning, knowledge distillation, and LoRA for low-latency, resource-efficient deployment.
  • Optimize models using ONNX, TensorRT, TorchScript.
  • Balance trade-offs between accuracy, speed, and scalability.


Cloud and Deployment

  • Deploy AI/ASR/CV pipelines on Azure (AKS, Cognitive Services, Functions) and GCP (Vertex AI, Cloud Run, BigQuery).
  • Build scalable APIs and services for real-time speech and video processing.
  • Use Docker/Kubernetes for containerization and orchestration.
  • Integrate with CI/CD pipelines for automated model retraining and updates.
  • Support autoscaling, GPU scheduling, and high-availability deployments.


Backend and Engineering Practices

  • Collaborate with backend engineers to integrate ASR and CV modules with production systems.
  • Build RESTful APIs and microservices for ASR, speech enhancement, face, and lip-reading tasks.
  • Handle audio/video ingestion, buffering, chunking, and timestamp alignment.
  • Follow best practices with Git, versioning, unit testing, and code reviews.
  • Ensure system reliability, monitoring, and logging for multimodal pipelines.


Evaluation and Continuous Learning

  • Evaluate ASR models with WER, CER, SER, RTF and CV/lip-reading models with precision, recall, mAP, FPS.
  • Incorporate user feedback loops for continuous improvement.
  • Stay updated with the latest research in ASR, Speech AI, Computer Vision, and Cloud AI services.


Required skills and Qualifications

  • Strong experience with ASR frameworks (Whisper, Conformer, Wav2Vec2, NeMo, Riva, Coqui STT, Kaldi).
  • Strong experience with Computer Vision & lip-reading frameworks (OpenCV, YOLO, CNNs, Vision Transformers).
  • Solid background in speech signal processing (MFCC, PLP, spectrograms, denoising, beamforming, echo cancellation).
  • Hands-on with Deep Learning frameworks (PyTorch, TensorFlow, SpeechBrain).
  • Experience with streaming ASR, diarization, VAD, and real-time inference pipelines.
  • Cloud deployment experience: Azure AI, GCP Vertex AI, AWS (bonus).
  • Proficiency in Python (FastAPI/Flask/Django) for backend service integration.

