Over the past few years, researchers have started using AI to capture signals from our voices in hopes of quickly diagnosing diseases. Academics and startup founders in the nascent field talked about the potential benefits, as well as the privacy implications, of this technology at the Milken Institute Future of Health Summit on Tuesday.
“There’s a very rich set of data for us to look at,” said Nate Blaylock, chief technology officer at voice AI company Canary Speech. “Our technology extracts over 12 million biomarkers per minute of speech.”
Using voice biomarkers for early disease detection is a hot research area, and it has spawned several buzzy startups. Kintsugi Labs, whose technology is used in call centers and on telehealth platforms, aims to detect signs of depression and anxiety in speech. Winterlight Labs focuses on dementia, and Canary Speech works on a range of conditions, including Parkinson's and Alzheimer's.
But while research shows correlation between voice signals and certain conditions, the field has not yet proved that this technology will meaningfully help patients. It’s unclear if providers can translate an earlier diagnosis from voice biomarker technology into better disease outcomes.
Nevertheless, panelists were optimistic about the tool’s eventual ability to diagnose conditions, potentially in conjunction with other methods. Olivier Elemento, a biophysics professor at Cornell, said unlocking voice biomarkers could dramatically expand diagnosis. There are very few barriers to recording a patient’s voice — which is mostly positive but also slightly dystopian, Elemento noted, as in the future people might be able to record and diagnose others without their consent.
“Voice is probably the cheapest biomarker I can think of,” Elemento said. “It’s extremely easy to collect. It’s also very easy to do, there’s no need for biopsies of any kind. Just recording somebody.”
Blaylock said Canary has been working with the Food and Drug Administration for over a year to design a trial testing its device. The company first plans to test how accurately it diagnoses depression and anxiety compared to the opinions of several psychiatrists, a difficult standard of care, Blaylock said, because doctors often disagree with one another. In the meantime, Blaylock said, customers use Canary to track their wellness.
“We are not using AI to diagnose,” Blaylock said. “Doctors diagnose. This is an aid for a clinician to look at.”
The tech is trained on a vast trove of data from previous voice recordings. Experts underscored the need for a diverse set of training data, including voices with different accents and intonations. Grace Chang, CEO of Kintsugi, recalled her company’s decision early on to release a consumer voice-journaling app that allowed them to build a powerful database.
“Because we had trained on such a robust data set of different languages, our models ended up being language agnostic, so people could be speaking in French and Spanish and Chinese,” Chang said.
Panelists also grappled with questions about protecting patient data, since recordings such as therapy sessions or call center conversations are sensitive. Mark Hasegawa-Johnson, an engineering professor at the University of Illinois, advocated for a clear data framework in the United States. He said the European Union's General Data Protection Regulation, which treats data protection as a fundamental right, is a step in the right direction.
“We need that framework,” he said. “It needs to be not something mysterious and science fiction. It needs to be something very practical.”