BioByte 097: Drivers of biological progress, immunogenicity in gene therapy & a vision for AI scientists
Welcome to Decoding Bio’s BioByte: each week our writing collective highlights notable news, from the latest scientific papers to the latest funding rounds, and everything in between. All in one place.
What we read
Blogs
Levers for Biological Progress [Asimov Press]
Thoughts on Niko McCarty’s thoughts on Dario Amodei’s thoughts…
Dario Amodei’s essay “Machines of Loving Grace” envisions a future where AI dramatically accelerates biological research. Niko McCarty of Asimov Press responds by emphasizing that this vision requires more than AI advancements alone; we must first overcome fundamental biophysical bottlenecks that constrain experimentation and data generation in biology.
Two primary bottlenecks are identified: the speed of experiments and the complexity of biological systems. Experiments in biology are inherently slow due to the growth rates of organisms and the time-consuming nature of lab processes such as DNA cloning, which limits the number of hypotheses researchers can test. Arguably more importantly, our current tools provide a reductionist view of complex, dynamic systems, making it difficult to fully understand and predict biological behaviors: biological processes operate across vast scales of time and space, from molecular interactions to evolutionary changes, complicating data collection and interpretation.
To address these challenges, McCarty advocates for the development of new tools that can accelerate experiments and collect more comprehensive data. Examples given include creating faster-growing organisms, automating laboratory techniques, and employing advanced imaging methods like cryo-ET. The hope is that these tools will enable a clearer understanding of biology.
While AI has the potential to analyze complex datasets and model biological systems, its effectiveness is limited by the data available and by the need to test its predictions in real organisms. Therefore, a synergistic advancement in both AI and experimental biology is necessary to achieve the rapid progress envisioned by Amodei.
Gene therapy immunogenicity primer [Eric Minikel, Nov 2024]
Gene therapy's fundamental challenge of introducing DNA into patients' cells comes with a critical consideration: the immune response to newly expressed proteins. At the heart of this response is the Human Leukocyte Antigen (HLA) system, particularly Class I and Class II. HLA Class I, expressed on nearly all nucleated cells, presents peptide fragments from proteins to cytotoxic T cells, enabling immune surveillance throughout the body. More than 20,000 alleles exist in the population, and individuals typically carry 3-6 different variants; each HLA allele has unique peptide-binding preferences that determine how the immune system responds to therapeutic proteins.
The scientific community has developed increasingly sophisticated methods for predicting and analyzing peptide presentation by HLA molecules. These have evolved from simple biochemical assays in the 1990s into advanced mass spectrometry techniques that can analyze peptide binding to specific HLA alleles. Modern approaches include multi-allelic and mono-allelic mass spectrometry, alongside computational tools like NetMHC and hlathena, which help researchers predict potential immunogenic responses. These tools allow scientists to select the least immunogenic options early in development.
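As a rough illustration of the kind of input that computational predictors score, the sketch below enumerates every overlapping peptide of a given length from a protein sequence. The 8-11 residue window reflects the lengths typically associated with HLA Class I ligands; the function name, the example sequence, and the default lengths are illustrative assumptions, and no actual NetMHC or hlathena call is made.

```python
def candidate_peptides(protein, lengths=(8, 9, 10, 11)):
    """Return (start_index, peptide) for every overlapping window.

    `protein` is a one-letter amino acid string. The default lengths are an
    assumption based on typical HLA Class I ligand sizes.
    """
    protein = protein.strip().upper()
    peptides = []
    for k in lengths:
        # A sequence shorter than k yields an empty range and no peptides.
        for start in range(len(protein) - k + 1):
            peptides.append((start, protein[start:start + k]))
    return peptides


# Hypothetical 12-residue sequence, used only to show the windowing.
example = "MKTAYIAKQRQI"
nine_mers = candidate_peptides(example, lengths=(9,))
all_windows = candidate_peptides(example)
```

Each peptide in the resulting list would then be scored against each HLA allele of interest by whichever prediction tool is in use.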
For central nervous system-targeted gene therapies, HLA Class I presents the primary concern since neurons, which are irreplaceable, express these molecules. While HLA Class II expression is more limited in the brain, occurring mainly in microglia and possibly some neurons, the delivery method of gene therapy remains crucial. If the therapeutic material reaches dendritic cells or monocytes, it could trigger a broader immune response. This understanding has led to the development of specialized mouse models with human HLA alleles, enabling more accurate testing of potential immune responses to gene therapies.
The complexity of immune responses to gene therapy has driven significant development in prediction and testing methods. Companies now routinely use immunoaffinity enrichment followed by mass spectrometry to evaluate drug candidates, typically selecting diverse donors with various DR alleles for testing. While some proteins, like GFP, show remarkably low immunogenicity, many therapeutic proteins can trigger significant immune responses. This has led to the development of sophisticated experimental systems, including humanized mice from providers like Taconic and JAX, which carry common human HLA alleles and allow for more accurate prediction of human immune responses to gene therapies.
Papers
Empowering biomedical discovery with AI agents [Gao et al., Cell, October 2024]
In a new perspective published in Cell, Marinka Zitnik’s lab presents a vision for "AI scientists" as systems capable of skeptical learning and reasoning that can empower biomedical research through collaborative agents. The authors propose four distinct levels of autonomy for AI agents: Level 0 (no AI agent, using ML models as tools), Level 1 (AI as research assistant for narrow tasks), Level 2 (AI as collaborator capable of refining hypotheses with scientists), and Level 3 (AI as scientist capable of creative hypothesis generation and experimental design).
The paper details essential components for effective AI agents, including perception modules for processing multimodal data, interaction modules for engaging with humans and other agents, memory modules for storing both short-term and long-term knowledge, and reasoning modules for planning and decision-making. The authors describe various configurations for multi-agent systems, such as brainstorming agents, expert consultation agents, and research debate agents, each serving different roles in the research process.
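To make the taxonomy concrete, here is a minimal sketch of how the four autonomy levels and the four module types might be represented in code. It is a toy skeleton under our own assumptions, not an implementation from the paper: the class names, method names, and the trivial behavior of each method are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """The four autonomy levels described in the perspective."""
    NO_AGENT = 0      # ML models used as tools
    ASSISTANT = 1     # narrow research tasks
    COLLABORATOR = 2  # refines hypotheses with scientists
    SCIENTIST = 3     # generates hypotheses and designs experiments


@dataclass
class AgentSketch:
    """Hypothetical container with one placeholder per module type."""
    name: str
    level: AutonomyLevel
    short_term_memory: list = field(default_factory=list)
    long_term_memory: dict = field(default_factory=dict)

    def perceive(self, observation):
        # Perception module: here it only records the raw observation.
        self.short_term_memory.append(observation)

    def remember(self, key, value):
        # Memory module: persist a fact beyond the current task.
        self.long_term_memory[key] = value

    def reason(self):
        # Reasoning module: placeholder plan that lists what was observed.
        return [f"review: {item}" for item in self.short_term_memory]

    def interact(self, message):
        # Interaction module: placeholder reply to a human or another agent.
        return f"{self.name} received: {message}"


agent = AgentSketch(name="demo", level=AutonomyLevel.ASSISTANT)
agent.perceive("assay result A")
plan = agent.reason()
```

Using an ordered enum means levels can be compared directly, for example to gate an action on `agent.level >= AutonomyLevel.COLLABORATOR`.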
The framework is illustrated through specific applications in genetics, cell biology, and chemical biology. For example, in genetics, Level 3 agents could optimize experimental designs for genome-wide association studies while developing novel statistical methods for identifying causal variants. However, the authors also emphasize significant challenges that must be addressed, including issues of robustness and reliability, the need for comprehensive evaluation protocols, challenges in dataset generation, and the importance of establishing proper governance frameworks.
Successful implementation of AI agents requires careful attention to ethical considerations and maintaining meaningful human oversight. While AI agents have the potential to transform biomedical research by accelerating discovery workflows, their development must be guided by robust safety protocols and clear understanding of their limitations to ensure they complement rather than replace human scientific expertise.
RapidDock: Unlocking Proteome-scale Molecular Docking [Powalski et al., arXiv, Oct 2024]
Why it matters: Current molecular docking tools are too slow to screen drugs against the whole proteome; these tools report run-times on the scale of seconds per protein on a single GPU. Therefore, even with hundreds of GPUs, screening one million molecules against the whole proteome would take years.
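A back-of-the-envelope sketch of that estimate is below. The inputs are our own assumptions for illustration, not figures from the paper: roughly 20,000 human proteins, 1 second per protein-molecule pair, 200 GPUs, and perfect parallel scaling with no overhead.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def screening_years(n_molecules, n_proteins, seconds_per_pair, n_gpus):
    """Wall-clock years to dock every molecule against every protein.

    Assumes the work splits evenly across GPUs with no overhead.
    """
    total_gpu_seconds = n_molecules * n_proteins * seconds_per_pair
    return total_gpu_seconds / n_gpus / SECONDS_PER_YEAR


# Assumed inputs: 1 million molecules, ~20,000 proteins, 1 s per pair, 200 GPUs.
years = screening_years(1_000_000, 20_000, 1.0, 200)
print(f"{years:.1f} years")
```

Under these assumptions the screen takes a little over three years, and the cost grows linearly with the per-pair docking time.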
To address this, the authors developed RapidDock, a transformer-based model for blind molecular docking that achieves a 100-fold speed advantage over existing methods (0.04 s average inference time) without compromising on accuracy. On the PoseBusters benchmark, measured as the percentage of predictions with RMSD < 2 Å, RapidDock scored higher (52.1%) than NeuralPLexer (22.6%) and DiffDock-L (40.8%), but lower than AlphaFold3 (76.9%). With RapidDock, docking ten million molecules to all human proteins on a 512-GPU cluster would take 9 days, compared with about 20 years with DiffDock-L or 200 years with AlphaFold3.
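For readers unfamiliar with the accuracy metric, the sketch below shows how a "% RMSD < 2 Å" score can be computed from predicted and reference ligand coordinates. It is a simplified illustration under our own assumptions: atoms are listed in the same order in both poses, both poses share the same coordinate frame, and no symmetry correction is applied, which benchmark tooling typically handles. The function names and toy coordinates are hypothetical.

```python
import math


def rmsd(predicted, reference):
    """Root-mean-square deviation between two poses, in the input units.

    Each pose is a list of (x, y, z) tuples with atoms in matching order.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("poses must be non-empty and the same length")
    squared = sum(
        (p - r) ** 2
        for atom_p, atom_r in zip(predicted, reference)
        for p, r in zip(atom_p, atom_r)
    )
    return math.sqrt(squared / len(predicted))


def success_rate(rmsds, threshold=2.0):
    """Percentage of poses whose RMSD is strictly below the threshold."""
    return 100.0 * sum(r < threshold for r in rmsds) / len(rmsds)


# Toy two-atom ligand shifted by 1 Å along z relative to the reference.
pose_rmsd = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])
rate = success_rate([pose_rmsd, 2.5, 0.5, 3.0])
```

A score such as 52.1 therefore means that about half of the benchmark complexes were docked to within 2 Å of the experimental pose.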
Analysis of 10,478 cancer genomes identifies candidate driver genes and opportunities for precision oncology [Kinnersley et al., Nature Genetics, June 2024]
This new study analyzes whole-genome sequencing (WGS) data from roughly 10,000 cancer patients in the UK 100,000 Genomes Project, aiming to uncover the genetic drivers behind various cancers. The goal is to show how WGS can bring personalized treatments closer to reality. Some interesting points:
The team identified 330 genes that may drive cancer, including 74 genes previously unknown to have any connection to cancer. This discovery broadens our understanding of cancer’s genetic roots, paving the way for more tailored treatments.
They found that over half of the patients had mutations that could guide treatment choices, from predicting responses to existing drugs to qualifying for clinical trials. This shows how WGS could help more patients access treatments specifically suited to the genetics of their cancer, which is really a myriad of diseases.
Some well-known genes, like TP53, PIK3CA, and KRAS, turned up in multiple cancer types. These mutations often make patients eligible for specific therapies, suggesting a larger group of patients could benefit from treatments based on their genetic profile.
Compared to traditional gene panels, WGS provides a richer and more detailed view of clinically relevant mutations. By allowing doctors to look at the full spectrum of a tumor’s mutations, WGS supports “basket trials” that focus on treating specific mutations across various cancers.
The study also highlighted 96 previously unexplored genes as possible therapeutic targets. This chemogenomic analysis, which examined the druggability of each gene, opens new doors for treating cancers with limited existing therapies.
Overall, this study showcases how whole-genome sequencing could transform cancer treatment, providing a single comprehensive test that not only maps the genetic landscape of cancer but also points to new ways of tackling it.
Notable deals
Novo Nordisk signs a $285 million partnership with Ascendis Pharma to access their TransCon platform. With the partnership, Novo seeks to discover a GLP-1 candidate with a reduced dosing frequency, easing the patient burden and reducing the cost. For Ascendis, the deal includes an up-front payment, along with royalties, milestone payments for the lead program, and a potential additional $77.5 million for each additional program developed using its platform.
Surrozen and TCGFB announce a collaboration to find antibody targets for TGF-β for the potential treatment of patients with idiopathic pulmonary fibrosis. In addition to paying $6 million to access Surrozen’s antibody discovery services for up to two years, TCGFB will also issue the company a warrant for up to 3,380,000 shares of their common stock.
PrognomiQ raises $34 million in a Series D led by Seer, Inc. The company, founded in 2020, leverages multi-omic data to empower early disease detection. They plan to use this funding round to advance development of their lung cancer test, which they will provide first as a laboratory-developed test (LDT) and later as an in vitro diagnostic (IVD) test.
What we liked on social channels
Events
The Single-Cell and Spatial Metabolomics Online Day is a pioneering virtual conference taking place on Zoom on December 10th, 3 pm CET, or 9 am EST. Register here.
Field Trip
Did we miss anything? Would you like to contribute to Decoding Bio by writing a guest post? Drop us a note here or chat with us on Twitter: @ameekapadia @ketanyerneni @morgancheatham @pablolubroth @patricksmalone @ZahraKhwaja
Parallel Bio could be a new arrow in the quiver for testing immunogenicity of drug candidates.