ChatGPT for genomes: introducing a CRISPR-designing generative AI

In 2018, during her chemistry Nobel Prize lecture, Frances Arnold noted that scientists had arrived at a point where they could read, write, and edit any sequence of DNA. But composing whole genes or even whole genomes from scratch — that was something only evolution could do.

A few years later, not long after helping to launch the Arc Institute, a nonprofit research center in the Bay Area, molecular engineer Patrick Hsu wondered if it was possible to imitate the forces of evolution that Arnold had been referring to. DNA is a language, after all, and with all the advances in generative AI — chatbots that could hold eerily lifelike conversations if trained on enough text — maybe recreating all the cellular complexity contained in a genome wasn’t that far behind.

Hsu, who is also an assistant professor at the University of California, Berkeley, worked with Brian Hie, a computational biologist at Stanford University and a fellow Arc Institute member, to begin assembling a team of scientists. Their aim was to train an AI model on vast troves of biological data: 300 billion DNA letters, including long sequences from 80,000 genomes of bacteria and archaea.
