  • CIS
    Length: 01:06:30
06 Jun 2023

Serafim Batzoglou, SeerBio, USA

ABSTRACT: Large Language Models (LLMs) have catalyzed transformative and unexpected advances across many domains. In this talk, we aim to illustrate why molecular biology is an optimal area for applying these models. Molecular biological processes connect our DNA to the cellular biology of the approximately 30 trillion cells in our bodies. Through interactions with our environment, these processes influence our traits, health, and susceptibility to disease. They are complex and messy, with structure that is mostly inscrutable to humans, yet they are also robust and reproducible. This combination makes deep learning, and specifically LLMs, uniquely suited to modeling molecular biology. We will highlight recent breakthroughs that LLMs have achieved in molecular biology, including gene regulation modeling, gene prediction, protein 3D structure modeling, cellular molecular process modeling, and personal genome interpretation. Some of these problems were significant unresolved challenges before the advent of deep learning. Looking to the near future, we will explore the potential of LLMs to integrate everything, from DNA to cellular molecular biology to phenotype, within the context of large-scale cohort studies such as the UK Biobank and All of Us, along with similar global initiatives. These advances promise to lead to impactful applications in biotechnology, drug development, and the diagnosis and treatment of disease.