At Mandrake Bioworks, we’re building a new kind of biotech company: one where plants are programmable systems and AI is the design engine. Our mission is to reimagine the process of trait discovery and biological design using AI-first methods and multi-omics data. From climate-resilient crops to plants engineered for high-value compound production, our work sits at the intersection of deep-tech, biology, and planetary need.

This is not just about applying machine learning to biology. It’s about building novel infrastructure and models that don’t exist yet, starting from first principles.

You’ll be collaborating closely with a multidisciplinary team of:
- Some of the best plant biotechnologists and geneticists in the country
- Experienced operators in climate, biotech, and venture-backed deep-tech
- AI/ML engineers and bioinformaticians who’ve built production-grade models and platforms at scale

This is a rare chance to work at the confluence of real-world biological complexity and cutting-edge AI research with a team deeply committed to impact and excellence.

You’ll work as a founding member of our AI team, helping architect systems that bring plant biology into the machine learning era.

Responsibilities include:
- Build and optimize data pipelines that integrate large-scale, heterogeneous datasets from multiple sources
- Develop custom LLM workflows and domain-specific foundation models to extract actionable insights
- Fine-tune open-source language and graph transformer models for NER, entity resolution, and relationship extraction
- Design systems that blend prompt engineering, retrieval augmentation, and probabilistic reasoning to manage high-noise data environments
- Translate cutting-edge research into modular, production-ready AI tools using real-world datasets

We’re looking for someone who’s obsessed with systems-level thinking, loves working at the edge of what’s known, and is unafraid of complexity.

Must-Haves:
- Strong experience with Python, PyTorch/TensorFlow, and deep learning fundamentals
- Familiarity with LLMs, Transformers, or graph-based models
- Hands-on experience building robust data pipelines and handling multi-modal datasets
- Passion for biology and willingness to dive deep into genomic/omics datasets
- Perpetually curious, grounded in first-principles thinking, and unafraid to question how things are done
- Comfortable with ambiguity, fast learning, and ownership: you care more about solving the problem than defending a method

Bonus Points:
- Experience with scientific literature parsing, NER, or bio-NLP
- Familiarity with omics standards (e.g., FASTA, VCF, GTF, KEGG, GO terms)
- Exposure to working with large-scale biological or medical datasets
- Interest in systems biology, synthetic biology, or computational genomics

What You’ll Get:
- Monthly stipend + pathway to full-time offer
- Ownership of core systems in a high-ambition deep-tech startup
- Mentorship from leaders across AI, biology, and engineering
- Opportunity to co-author papers, build open-source tools, and ship real infrastructure
- Front-row seat to how a deep-tech moonshot is scaled toward global impact

Biology today is where the internet was in the early 2000s: fragmented, messy, and yet full of potential. At Mandrake, we’re building the models, abstractions, and infrastructure to make biology programmable at scale. Your work won’t sit in a research repo; it will directly power biological discoveries, product pipelines, and planetary-scale impact.

Experience: 0–3 years
Type: Full-time, On-site (Bengaluru)

About Mandrake Bioworks

Mandrake Bioworks is an AI-first biotechnology company reimagining how we design and engineer life itself. Our mission is to build intelligent systems that make biology programmable. We’re unlocking a new generation of gene editing technologies that will power breakthroughs in longevity, de-aging, sustainable agriculture, and climate resilience.

We’re building the foundational AI stack for biology, spanning large-scale biological data engines, foundation-model training, and generative design systems that can create new molecular tools from first principles.

If you want to build at the intersection of AI, life, and the future of civilization, and see your models transform what humanity can engineer, this is the place to do it.

What You’ll Work On
- Build, train, and evaluate foundation and generative models for biological data
- Develop data pipelines for large-scale datasets: curation, normalization, clustering, and annotation
- Work hands-on with pretraining, finetuning, and reinforcement-learning (RL/RLHF/DPO) pipelines
- Implement and optimize transformer-based architectures (language + multimodal models)
- Design evaluation and benchmarking frameworks for model performance and representation learning
- Collaborate with the biology and ML teams to deploy AI systems that directly inform experimental design

You’re a Great Fit If You
- Are strong in Python and PyTorch/JAX, and comfortable with deep-learning libraries (Lightning, HuggingFace, etc.)
- Have experience with GenAI / LLMs / diffusion models, especially pretraining or finetuning
- Can handle large datasets end-to-end (data wrangling, deduplication, clustering, sampling)
- Have a solid understanding of transformers, embeddings, attention, and training optimization
- Have hands-on experience with RL / RLHF / DPO and can implement and iterate on them
- Thrive in fast-moving environments and like taking full ownership of what you build

Why Mandrake
- Work on the frontier of AI × Biology, creating technologies that could redefine longevity, health, and the future of our planet.
- Join a small, high-velocity team where your ideas directly shape the company’s direction and breakthroughs.
- You’ll have extremely high ownership. Own what you build end to end, from idea to model to deployment.
- We operate like an applied research lab: rapid iteration, deep focus, and no bureaucracy.
- Work on real problems, not toy datasets. Build, publish, and innovate at the edge of science and engineering.
- Be part of something consequential. This isn’t just another startup; it’s the beginning of programmable life.

Passionate final-year students who can join part-time now and full-time early next year are welcome to apply.

Location: Indiranagar, Bengaluru (in-person, full-time), 6 days a week (Remote only for exceptional candidates)

PS: We don’t care about degrees or credentials. Send us proof of work!

Please apply on https://mandrakebio.notion.site/work-with-us