We're looking for a passionate Machine Learning Engineer with a strong foundation in computer vision, model training and deployment, observability pipelines, and DevOps for AI systems. You'll play a pivotal role in building scalable, accurate, production-ready ML systems in a collaborative, distributed environment.

Responsibilities
- Train, fine-tune, and evaluate object detection and OCR models (YOLO, GroundingDINO, PP-OCR, TensorFlow).
- Build and manage observability pipelines for evaluating model performance in production (accuracy tracking, drift analysis).
- Develop Python-based microservices and asynchronous APIs for ML model serving and orchestration.
- Package and deploy ML services using Docker and Kubernetes across distributed environments.
- Implement and manage distributed computing workflows with NATS messaging.
- Collaborate with DevOps to configure networking (VLANs, ingress rules, reverse proxies) and firewall access for scalable deployment.
- Use Bash scripting and Linux CLI tools (e.g., sed, awk) for automation and log parsing.
- Design modular, testable Python code using OOP and software-packaging principles.
- Work with PostgreSQL, MongoDB, and TinyDB for structured and semi-structured data ingestion and persistence.
- Manage system processes using Python concurrency primitives (threading, multiprocessing, semaphores, etc.).

Required Skills
- Languages: Python (OOP, async IO, modularity), Bash
- Computer Vision: OpenCV, Label Studio, YOLO, GroundingDINO, PP-OCR
- ML & MLOps: Training pipelines, evaluation metrics, observability tooling
- DevOps & Infra: Docker, Kubernetes, NATS, ingress & firewall configs
- Data: PostgreSQL, MongoDB, TinyDB
- Networking: Subnetting, VLANs, service access, reverse proxy setup
- Tools: sed, awk, firewalls, reverse proxies, Linux process control
Looking for an MLOps & Computer Vision Engineer to join our team.

✨ What you'll work on:
- Building & fine-tuning object detection models (YOLO, TensorFlow, GroundingDINO) and OCR models (PP-OCR).
- Creating observability pipelines to track accuracy as part of the MLOps lifecycle.
- Deploying distributed systems with NATS & Kubernetes.
- Containerizing ML workloads with Docker for training & inference.
- Crafting robust Python applications (async coroutines, OOP, modular design, basic DSA).
- Managing PostgreSQL, MongoDB, and TinyDB databases.
- Automating with Bash, sed, and awk, and handling networking (VLANs, subnets, ingress rules, reverse proxies).

🛠 Tech stack highlights: Python, FastAPI, OpenCV, Kubernetes, Docker, NATS, Bash, PostgreSQL, MongoDB.

🌟 Experience: ~3 years in full-stack ML systems, with a strong ownership & problem-solving mindset.
We are looking to onboard a candidate who can support our team in conducting in-depth code reviews for AI/ML models developed using open-source frameworks. The candidate will work closely with our team, reviewing their code regularly, improving code quality, and advising on best practices for model development and deployment. We need an onsite candidate in Hyderabad with the following skills:
- Strong proficiency in Python
- Expertise in Machine Learning and Deep Learning
- Hands-on experience with TensorFlow, PyTorch, and other open-source AI libraries
- Familiarity with model explainability, fairness, and bias detection
- Experience in reviewing and optimizing model pipelines and training loops
- Solid understanding of MLOps practices, including CI/CD for ML
- Exposure to distributed model training and GPU optimization
- Experience with unit testing and performance profiling for AI models
- Knowledge of model versioning, reproducibility, and experiment tracking (e.g., using MLflow, Weights & Biases)
- Familiarity with data validation, feature engineering pipelines, and data drift detection
- Ability to suggest improvements to architecture, modularization, and documentation of code

The ideal candidate will have a proven track record of working on production-grade AI systems and be capable of mentoring our team to elevate our engineering standards.