About VolumX

VolumX is India’s first and leading company for digital humans, trusted by Oscar-winning studios and global brands. Using some of the most advanced technologies, we create lifelike digital doubles of celebrities, powering face replacement, de-aging, and autonomous digital humans. Now we’re pushing into AI-driven performance synthesis, and we’re inviting talented engineers to join us in shaping the future of content and immersive interactions.

Requirements

- Strong experience in AI/ML model development (PyTorch, TensorFlow).
- Practical experience with video generation, face replacement, and digital doubles.
- Understanding of Gaussian splats / NeRF / neural rendering concepts.
- Knowledge of emotion recognition and sentiment analysis models.
- Experience with real-time inference optimization (ONNX, TensorRT, quantization).
- Background in speech processing (ASR, vocoders, prosody control) for training facial expression models.
- Familiarity with lip-sync engines (e.g., Rhubarb, Wav2Lip, custom phoneme alignment).
- Strong programming skills in Python.
- Experience in multimodal AI (audio + video + text).
- Experience deploying AI in cloud environments (AWS/GCP/Azure, Docker, Kubernetes).
- Hands-on expertise with STT/TTS frameworks (LiveKit, Whisper, Riva, Coqui TTS, Tacotron, FastSpeech, VITS, etc.).

Responsibilities

Face & Video Models
- Train person-specific models for face reenactment, face swapping, and de-aging.
- Build high-resolution, temporally consistent face replacement pipelines.
- Test and implement neural rendering pipelines using Gaussian splats, NeRFs, and diffusion models.

Lip Sync & Emotions
- Implement or adapt lip-sync models (e.g., Wav2Lip-style, phoneme/viseme-based).
- Research and integrate emotion recognition models (from audio/text input).
- Map emotion states onto facial rigs and lip-sync engines.
- Interface outputs with rigs / MetaHumans via blendshapes or bone controls.

Pipeline Engineering
- Design low-latency inference pipelines for STT → NLU → TTS.
- Optimize models for real-time streaming (GPU/TPU/cloud deployment).
- Work with backend engineers to expose AI services via APIs/WebSockets.

Collaboration & Integration
- Partner with Unreal engineers to sync AI outputs with Pixel Streaming.
- Ensure smooth coordination between voice, facial animation, and emotional response.