Artificial Intelligence Engineer
Gurugram, Haryana, India | 0 years | None | Not disclosed | On-site | Full Time

About VolumX

VolumX is India’s first and leading company for digital humans, trusted by Oscar-winning studios and global brands. Using some of the most advanced technologies, we create lifelike digital doubles for celebrities, powering face replacements, de-aging, and autonomous digital humans. Now we’re pushing into AI-driven performance synthesis, and we’re inviting talented engineers to join us in shaping the future of content and immersive interactions.

Requirements

- Strong experience in AI/ML model development (PyTorch, TensorFlow).
- Practical experience with video generation, face replacement, and digital doubles.
- Understanding of Gaussian splats / NeRF / neural rendering concepts.
- Knowledge of emotion recognition and sentiment analysis models.
- Experience with real-time inference optimization (ONNX, TensorRT, quantization).
- Background in speech processing (ASR, vocoders, prosody control) to train facial expression models.
- Familiarity with lip-sync engines (e.g., Rhubarb, Wav2Lip, custom phoneme alignment).
- Strong programming skills in Python.
- Experience in multimodal AI (audio + video + text).
- Experience deploying AI in cloud environments (AWS/GCP/Azure, Docker, Kubernetes).
- Hands-on expertise with STT/TTS frameworks (LiveKit, Whisper, Riva, Coqui TTS, Tacotron, FastSpeech, VITS, etc.).

Responsibilities

Face & Video Models
- Train person-specific models for face reenactment, face swapping, and de-aging.
- Build high-res, temporally consistent face replacement pipelines.
- Test and implement neural rendering pipelines using Gaussian splats, NeRFs, and diffusion.

Lip Sync & Emotions
- Implement or adapt lip-sync models (e.g., Wav2Lip-style, phoneme/viseme-based).
- Research and integrate emotion recognition models (from audio/text input).
- Map emotion states into facial rigs and lip-sync engines.
- Interface outputs with rigs / MetaHumans via blendshapes or bone controls.

Pipeline Engineering
- Design low-latency inference pipelines for STT → NLU → TTS.
- Optimize models for real-time streaming (GPU/TPU/Cloud deployment).
- Work with backend engineers to expose AI services via APIs/WebSockets.

Collaboration & Integration
- Partner with Unreal engineers to sync AI outputs with Pixel Streaming.
- Ensure smooth coordination between voice, facial animation, and emotional response.
