Patients’ social needs often get lost in health records. Generative AI could help

Generative AI’s earliest applications in medicine have largely focused on curing not patients, but the productivity that physicians lose to the plague of digital documentation. Now, research suggests a way that large language models like ChatGPT could benefit both patients and providers: by automatically extracting a patient’s social needs from the reams of text in their clinical records.

Factors like housing, transportation, financial stability, and community support play a critical role in patients’ health once they leave the doctor’s office. But it takes concerted effort to screen patients for gaps in these so-called social determinants of health — and even when screening occurs, this critical information is usually scattered in the rambling clinical notes that providers write each time a patient has a visit.

As a physician trying to understand a patient’s needs, “you’re trying to do a needle in a haystack type search for clinical information,” said Danielle Bitterman, a radiation oncologist and artificial intelligence researcher at Mass General Brigham. “Patients oftentimes have thousands of notes.”

Experimental natural language processing algorithms that extract social determinants of health from free text can be confused by the complexity of the language used to describe, for example, a patient’s unhoused status. The new generation of powerful large language models offers an opportunity for higher accuracy. But these models could also introduce new forms of bias into patient care.

In a study published Thursday in the journal npj Digital Medicine, Bitterman and her colleagues used several large language models to pinpoint references to SDoH in visit notes for 770 cancer patients who received radiation therapy at Dana-Farber Brigham Cancer Center. Of those patients, 48 had an adverse social determinant hidden in their clinical notes: challenges related to employment, housing, transportation, parental status, relationships, and social support. The best-performing model, fine-tuned by the researchers, identified 45 of those patients, compared with just one whose provider recorded a social need with a structured code in their health record.

“I think of this as a substantial advancement forward and a demonstration of the value of what large language models can do in this area,” said cardiologist Lucas Zier, who leads a group at UCSF and Zuckerberg San Francisco General Hospital researching AI applications to support vulnerable patients. “It’s pretty clear, at least to me and my team, that large language models are going to be the most effective source of data extraction for social determinants of health moving forward.”

These systems need significant development before they are implemented in real care settings, Bitterman emphasized. Large language models are known to perpetuate biases embedded in the data used to train them, and the study showed that the models’ SDoH predictions varied when they were given sentences that included a patient’s race and gender; fine-tuning the LLMs somewhat reduced that variation compared with the commercial versions. Any mistakes could put the most vulnerable patients at risk if automated systems are used to identify which patients need support.

But some health systems already have pilots in place to test LLMs’ ability to extract social determinants. In Mount Sinai’s emergency department, for example, researchers are studying machine learning’s ability to infer patients’ social determinants of health from where they live and free text in medical records, said Girish Nadkarni, system chief of Mount Sinai’s division of data-driven and digital medicine. “As more and more of these pilots are successful, I think people will start deploying them in production soon,” said Nadkarni, who is also an editor for npj Digital Medicine.

Those projects reflect a growing push to collect and use social determinants of health data from patients. The Office of the National Coordinator for Health IT now requires certain social determinants data to live as fields in certified electronic health records, and the Centers for Medicare and Medicaid Services is increasing pressure on federally supported health systems to collect those standardized data starting this year. This month, the Joint Commission began requiring that accredited hospitals collect and act on health-related social needs.

Many health systems are expanding their efforts to manually fill in dedicated, structured fields for social determinants to comply with these requirements. Patient-driven data collection “is the gold standard,” said Allison Bryant Mantha, Mass General Brigham’s associate chief health equity officer, because it allows a patient to describe their life in their own words. “The other thing it allows us to do is to say, ‘Do you want help with each of these?’”

But Zier and Nadkarni imagine that generative AI could help complete that task, reducing the burden on staff members to screen patients. “Potentially, you could have an organization go along as they are, and you could use the LLM to comb through the medical record and start to extract that data,” said Zier. Or an LLM-powered chatbot could shoulder some of the workload of screening, asking patients about their social determinants of health. “You could do it on the front end or the back end.”

That strategy, though, could introduce other risks. Research has shown that race and gender can influence the way clinicians document patient encounters, a bias that could carry over into the SDoH extracted by an automated system. And there’s no guarantee that an algorithm that performs well in one care setting will translate to another. The MGB study included Boston-area patients who were predominantly white and reported relatively few gaps in social and financial support. “Unmet and adverse SDoH are more prevalent in diverse populations, so that might be an issue with generalizability,” said Nadkarni.

If generative AI tools can accurately and equitably extract data about social needs, health systems could use them to share information about local resources or pair patients with community health workers. But today, “there is still a significant implementation gap,” said Zier. “The open-ended question is once you have the data, how can you effectively take that data to improve health outcomes?” Mining a patient’s medical record for social needs does no good if a health system doesn’t have a way to refer them to support systems.

It could also do harm by revealing social needs that patients believed they had discussed in confidence with their physician. “Even if they’re accurate, they can create stigma,” said Andrea Hartzler, a bioinformatics researcher at the University of Washington who advocates for the involvement of patients in technology development. She imagines, for example, an algorithm extracting a history of substance abuse and surfacing it to physicians who don’t have an existing relationship with a patient.

“We really need to understand the potential unintended consequences — potential harms, potential benefits as well — to make sure that we don’t perpetuate biases, that we don’t infuse mistrust and destroy some of those relationships that are so hard to cultivate in health care,” said Hartzler.

Because of those risks, algorithm developers, including Bitterman, emphasize that patients need to be able to consent to new tools that dig into their medical information. “Patients should always be told and have an option to opt out of having AI monitoring their records,” she said.