Using AI, images are fueling a new boom in cell biology

Compared to molecular techniques to study single cells, images feel a little like “old school biology,” says Anne Carpenter, an artificial intelligence and cell biology researcher at the Broad Institute. Yet images are a gold mine that can yield information as rich as the genome, once you learn how to extract it.

Carpenter is using AI and other computational methods to do exactly that, helping to propel an AI-driven boom in cell biology and medicine over the past decade. She and other scientists spoke of this visual feast at the annual STAT Summit in Boston.

The next era will merge two oceans of data, combining knowledge from molecular science with knowledge from cellular science, said Sarah Teichmann, professor of stem cell medicine at the Cambridge Stem Cell Institute. What that gives you, she said, is a new ability to understand the human cell and all its types, functions, and, importantly, reactions to things like its environment or drugs.

“We’re firmly in the spatial era for in vivo human cell and tissue characterization,” Teichmann said. “One of the exciting things is to be able to connect molecular data to more cellular morphological data and tissue morphology.”

The endeavor to deeply probe the lives of single cells is at the heart of the international Human Cell Atlas Consortium, which Teichmann co-founded. The consortium, made up of hundreds of scientific groups around the globe, set out in 2016 to painstakingly map all the types and states of all the 37 trillion cells in the healthy human body. This work is primarily carried out through single-cell RNA sequencing, a molecular technique that tells you which genes are switched on or off in any given cell and can tell you what a cell is and what it’s doing.

In the last couple of years, the Human Cell Atlas published its first draft, a map of more than a million cells. That draft has already yielded insights: for example, it helped scientists understand why GLP-1 agonist drugs such as Wegovy seem to have an effect on patients’ resting heart rate. The Atlas holds profiles of all the different kinds of cells in the human heart, including the rare pacemaker cells, which set how fast the heart beats despite making up just 1% of heart cells.

“Having the molecular fingerprint of the pacemaker cell meant that we’re able to also map all known compounds with known targets against those cells, as well as other cells in the heart,” Teichmann said. “One really unexpected finding was that the GLP-1 receptor was being expressed in those pacemaker cells. The surprising thing there was that the human cells basically can be used as a guidebook for where drugs are acting.”

Images can provide another layer of information. There’s a lot about a cell that should be visible, Carpenter said. For instance, what a cell is doing or whether it’s healthy or hurt should give off visual clues. “But it’s only in the past decade that quantifying that information in a really rich way has become equally powerful to mRNA profiling,” Carpenter said.

One example comes from a technique that Carpenter helped pioneer called “cell painting,” where biologists stain cells with six different fluorescent dyes that can label different parts of the cell. In one slide, Carpenter showed skin cells from a healthy patient on one side and cells from a patient with bipolar disorder on the other. With the dyes, the cells’ nuclei look like blue gumdrops caught in glowing green cobwebs and dusted all over with red dots.

“I don’t blame you if you don’t really see it right off the bat,” Carpenter said to the crowd. “The mitochondria, the red parts of the cell, are more dispersed” in the cells from the patient with bipolar disorder. “It’s a way when we do these perturbations at scale, we can do this for one condition, but why not scale this up to all the conditions, all the patient phenotypes. Then we can test all the drug perturbations. That’s where things really start to accelerate.”

These different domains of information are powerful on their own, and AI is beginning to make it possible to make sense of all of it together, said Kun Zhang, a principal investigator at Altos Labs. It’s similar in some ways to how large language models like ChatGPT work to integrate different kinds of information and transform it from text to video or images, he said.

“You can actually take information input from one domain and project to a different domain,” Zhang said. “A similar concept can apply to these biological data collected at different scales and different levels, from RNA to spatial arrangement to cell painting. You can ask, if I add this drug or turn on that gene or suppress the other gene, what happens?”

That kind of integration is just beginning. At the moment, the field is still collecting and organizing the right data sets, Carpenter said. Part of what her lab does is take “messy” biology problems and reduce them to mathematical ones. The methods are simple, she said, but in a field so new and moving so fast, they still push the envelope of biology and drug discovery.

“But we’re talking about a field where every year, a completely new application comes out,” she said. “Just the simplest methods work just fine to get us further than we’ve ever been before.”