Umami Bioworks is a leading bioplatform for the development and production of sustainable planetary biosolutions. Through the synthesis of machine learning, multi-omics biomarkers, and digital twins, UMAMI has established market-leading capability for the discovery and development of cultivated bioproducts that can seamlessly transition to manufacturing with UMAMI's modular, automated, plug-and-play production solution.
By partnering with market leaders as their biomanufacturing solution provider, UMAMI is democratizing access to sustainable blue bioeconomy solutions that address a wide range of global challenges.
Umami Bioworks is looking to hire an inquisitive, innovative, and independent Machine Learning Engineer to join our R&D team in Bangalore, India. The role is to develop scalable, modular ML infrastructure that integrates predictive and optimization models across biological and product domains, with a focus on orchestrating models for media formulation, bioprocess tuning, metabolic modeling, and sensory analysis to drive data-informed R&D.
The ideal candidate combines strong software engineering skills with experience building multi-model systems, and will collaborate closely with researchers to abstract away biological complexity and enhance predictive accuracy.
Responsibilities
- Design and build the overall architecture for a multi-model ML system that integrates distinct models (e.g., media prediction, bioprocess optimization, sensory profile, GEM-based outputs) into a unified decision pipeline
- Develop robust interfaces between sub-models to enable modularity, information flow, and cross-validation across stages (e.g., outputs of one model feeding into another)
- Implement model orchestration logic to allow conditional routing, fallback mechanisms, and ensemble strategies across different models (see the sketch after this list)
- Build and maintain pipelines for training, testing, and deploying multiple models across different data domains
- Optimize inference efficiency and reproducibility by designing clean APIs and containerized deployments
- Translate conceptual product flow into technical architecture diagrams, integration roadmaps, and modular codebases
- Implement model monitoring and versioning infrastructure to track performance drift, flag outliers, and allow comparison across iterations
- Collaborate with data engineers and researchers to abstract away biological complexity so that engineering work can stay focused on the ML layer
- Lead efforts to refactor and scale ML infrastructure for future integrations (e.g., generative layers, reinforcement learning modules)
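The sketch below illustrates the kind of conditional routing, fallback handling, and cross-stage information flow described in the responsibilities above. It is a minimal, hypothetical example: the stage names, interfaces, and default values are assumptions made for illustration, not a description of UMAMI's actual pipeline.

```python
"""Minimal sketch of a conditional multi-model pipeline with fallbacks.

Illustrative only: the stage names (media prediction, bioprocess optimization)
echo the responsibilities above, but the interfaces, thresholds, and routing
rules are hypothetical assumptions.
"""
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class ModelStage:
    """Wraps one sub-model behind a uniform predict() interface."""
    name: str
    predict: Callable[[Dict[str, Any]], Dict[str, Any]]
    # Gate deciding whether this stage runs for the current context
    condition: Callable[[Dict[str, Any]], bool] = lambda ctx: True
    # Optional fallback used when the primary predictor fails
    fallback: Optional[Callable[[Dict[str, Any]], Dict[str, Any]]] = None


def run_pipeline(stages: List[ModelStage], inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Run stages in order; each stage's outputs feed the next stage's inputs."""
    context = dict(inputs)
    for stage in stages:
        if not stage.condition(context):
            continue  # conditional routing: stage not relevant for this input
        try:
            outputs = stage.predict(context)
        except Exception:
            if stage.fallback is None:
                raise
            outputs = stage.fallback(context)  # fallback mechanism
        context.update(outputs)  # cross-stage information flow
    return context


if __name__ == "__main__":
    # Hypothetical stand-ins for trained models
    media_stage = ModelStage(
        name="media_prediction",
        predict=lambda ctx: {"media_formulation": {"glucose_g_per_l": 4.5}},
    )
    bioprocess_stage = ModelStage(
        name="bioprocess_optimization",
        predict=lambda ctx: {
            "feed_rate": 0.2 * ctx["media_formulation"]["glucose_g_per_l"]
        },
        condition=lambda ctx: "media_formulation" in ctx,
        fallback=lambda ctx: {"feed_rate": 0.1},  # conservative default
    )
    result = run_pipeline([media_stage, bioprocess_stage], {"cell_line": "demo"})
    print(result)
```

In practice, each `predict` callable would wrap a trained model loaded from a registry, and the orchestration would typically be scheduled and tracked with tools such as Prefect, Airflow, or MLflow from the stack listed below.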
Qualifications
- Bachelor's or Master's degree in Computer Science, Machine Learning, Computational Biology, Data Science, or a related field
- Proven experience developing and deploying multi-model machine learning systems in a scientific or numerical domain
- Exposure to hybrid modeling approaches and/or reinforcement learning strategies
Experience
- Experience building multi-model systems
- Experience working with numerical and scientific datasets, including multi-modal data
- Experience with hybrid modelling and/or reinforcement learning systems
Core technical skills
- Machine Learning Frameworks: PyTorch, TensorFlow, scikit-learn, XGBoost, CatBoost
- Model Orchestration: MLflow, Prefect, Airflow
- Multi-model Systems: Ensemble learning, model stacking, conditional pipelines
- Reinforcement Learning: RLlib, Stable-Baselines3
- Optimization Libraries: Optuna, Hyperopt, GPyOpt
- Numerical & Scientific Computing: NumPy, SciPy, pandas
- Containerization & Deployment: Docker, FastAPI
- Workflow Management: Snakemake, Nextflow
- ETL & Data Pipelines: pandas pipelines, PySpark
- Version Control & Data Versioning: Git
- API Design for modular ML blocks (see the sketch below)
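As a small illustration of API design for modular ML blocks with FastAPI from the stack above, the sketch below exposes one hypothetical model block behind an HTTP endpoint. The endpoint path, payload schema, and the placeholder scikit-learn model are assumptions for illustration only.

```python
"""Minimal sketch of serving one modular ML block behind a clean API.

Illustrative only: the route, request/response schemas, and the dummy model
are hypothetical, not UMAMI's actual services.
"""
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel
from sklearn.dummy import DummyRegressor

app = FastAPI(title="media-prediction-block")

# Stand-in for a trained model that would normally be loaded from a registry
model = DummyRegressor(strategy="constant", constant=1.0)
model.fit(np.zeros((1, 3)), [1.0])


class MediaRequest(BaseModel):
    # Hypothetical input features for a media-formulation prediction
    glucose_g_per_l: float
    glutamine_mM: float
    temperature_c: float


class MediaResponse(BaseModel):
    predicted_titer: float
    model_version: str


@app.post("/predict", response_model=MediaResponse)
def predict(req: MediaRequest) -> MediaResponse:
    features = np.array([[req.glucose_g_per_l, req.glutamine_mM, req.temperature_c]])
    prediction = float(model.predict(features)[0])
    return MediaResponse(predicted_titer=prediction, model_version="0.1.0")

# Run locally with: uvicorn app:app --reload
# In a containerized deployment, this module would be packaged in a Docker image.
```

Repeating this pattern per model block keeps each sub-model independently deployable and versioned, so the orchestration layer can compose blocks over well-defined HTTP interfaces.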