Company Description
TrueFan uses proprietary AI technology to connect fans and celebrities and is now focused on revolutionizing customer-business interactions with AI-powered personalized video solutions. Our platform enables brands to create unique, engaging video experiences that drive customer loyalty and deeper connections.
Company Overview
We are a cutting-edge AI company focused on developing advanced lip-syncing technology using deep neural networks. Our solutions enable seamless synchronisation of speech with facial movements in videos, creating hyper-realistic content for industries such as entertainment and marketing.
Position: MLOps Engineer
We are looking for a talented and motivated MLOps Engineer to join our team. The ideal candidate will play a crucial role in managing and scaling our machine learning models and infrastructure, enabling seamless deployment and automation of our lip-sync video generation systems.
Key Responsibilities
Model Training/Deployment Pipelines and Monitoring
-  Design, implement, and maintain scalable and automated pipelines for deploying deep neural network models.
-  Monitor and manage production models, ensuring high availability, low latency, and smooth performance.
-  Automate workflows for data preprocessing (face alignment, feature extraction, audio analysis), model retraining, and video generation.
-  Implement logging, tracking, and monitoring systems to ensure data integrity and visibility into the model lifecycle.
 
Infrastructure Management
-  Build and manage cloud-based infrastructure (AWS, GCP, or Azure) for efficient model training, deployment, and data storage.
-  Collaborate with DevOps to manage containerization (Docker, Kubernetes) and ensure robust CI/CD pipelines using GitHub and Jenkins for model delivery.
-  Monitor resource usage for GPU- and CPU-intensive tasks such as video processing, model inference, and training using Prometheus, Grafana, Alertmanager, and the ELK stack.
 
Collaboration
-  Work closely with ML engineers to integrate models into production pipelines.
-  Provide tools and frameworks for rapid experimentation and model versioning.
 
Required Skills
-  Basic proficiency in Python.
-  Strong experience with cloud platforms (AWS, GCP, Azure) and cloud-based machine learning services.
-  Expert knowledge of containerization technologies (Docker, Kubernetes) and infrastructure-as-code (Terraform, CloudFormation).
-  Understanding of deploying both synchronous and asynchronous APIs using Flask, Django, Celery, Redis, RabbitMQ, and Kafka.
-  Experience deploying and scaling AI/ML systems in production.
-  Familiarity with deep learning frameworks (TensorFlow, PyTorch).
-  Familiarity with video processing tools like FFmpeg and Dlib for handling dynamic frame data.
-  Basic understanding of ML models.
 
Preferred Qualifications
-  Experience in image and video-based deep learning tasks.
-  Familiarity with media streaming and video processing pipelines for real-time generation.
-  Experience with real-time inference and deploying models in latency-sensitive environments.
-  Strong problem-solving skills with a focus on optimising machine learning model infrastructure for scalability and performance.
 
(ref:hirist.tech)