Language model powered digital biology
Published in arXiv, 2024
Recommended citation: Pickard, Joshua, et al. "Language model powered digital biology." arXiv preprint arXiv:2409.02864 (2024). https://arxiv.org/pdf/2409.02864
Recent advances in Large Language Models (LLMs) are transforming biology, computer science, and many other research fields, as well as impacting everyday life. While transformer-based technologies are currently being deployed in biology, no agentic system is yet available for bioinformatics workflows. We present a prototype Bioinformatics Retrieval Augmented Data (BRAD) digital assistant. BRAD is a chatbot and agentic system that integrates a suite of tools to handle bioinformatics tasks, from code execution to online search. We demonstrate its capabilities through (1) improved question answering with retrieval-augmented generation (RAG), (2) the ability to run complex software pipelines, and (3) the ability to organize and distribute tasks in agentic workflows. We use BRAD for automation, performing tasks ranging from gene enrichment and searching the archive to automatic code generation for running biomarker identification pipelines. BRAD is a step toward autonomous, self-driving labs for digital biology.
Recommended BibTeX entry:
@article{pickard2024language,
  title={Language model powered digital biology},
  author={Pickard, Joshua and Choi, Marc Andrew and Oliven, Natalie and Stansbury, Cooper and Cwycyshyn, Jillian and Galioto, Nicholas and Gorodetsky, Alex and Velasquez, Alvaro and Rajapakse, Indika},
  journal={arXiv preprint arXiv:2409.02864},
  year={2024}
}