A replacement for race: Medical experts explore how to eliminate bias in clinical algorithms

WASHINGTON — Most of the medical community has acknowledged that racism is baked into many of its clinical tools: pulse oximeters and kidney function calculators are prime examples. But as presentations at a conference Tuesday showed, physicians remain divided on when to remove race from calculators and algorithms — and crucially, what characteristics should replace it.

“Many say that we should expunge race out of everything,” said Neil Powe, chief of medicine at San Francisco General Hospital, at one session. “That would be great. But what is the replacement, and does the replacement do more harm than good?”

The conference, hosted by the Doris Duke Foundation and others, focused on how the inclusion of race in the clinic affects health outcomes, and how to minimize the use of biased algorithms. The Doris Duke Foundation also announced more than $10 million in grants on Tuesday, divided among five medical organizations to support more consistent approaches to race in clinical research.

“There are too many algorithms for one organization to really take this on and make a difference,” said Sindy Escobar Alvarez, who leads Doris Duke’s medical research program. “We need concerted action from multiple organizations.”

The American Academy of Pediatrics will use the money to test a replacement for biased 2011 guidelines for diagnosing urinary tract infections in children up to age 2. “If you actually look at the algorithm, you come to a decision point where it asks if the child is Black or white. And because of that dichotomization, it skews the workup,” Joseph Wright, chief health equity officer for the American Academy of Pediatrics, said in a press call prior to the event. (The event was also sponsored by the Gordon and Betty Moore Foundation, which supports STAT’s reporting on artificial intelligence in health care.)

If the calculator estimates that the risk of a UTI is low, physicians are less likely to take a urine specimen to test for an infection. But the race-based decision point gave lower risk scores to Black girls, skewing recommendations against testing their urine and potentially leading to more missed UTIs.

In 2020, though, pediatricians started challenging the use of race as a clinical variable, and a year later the academy retired the UTI clinical guideline and started examining a replacement.

“Race is not a biologic proxy,” said Wright. “Race is a social construct and has no place being embedded in a clinical guideline like this.” Last year, the AAP issued a policy statement on eliminating race-based medicine as a whole.

Now, the question is which physiological variables could prove useful, and unbiased, replacements. Researchers have studied new versions of the UTI algorithm that replace race with clinical variables like history of UTI and duration of fever, but their accuracy has only been measured against historical patient records, which may not capture current infection rates or the impact of a new algorithm once it is rolled into clinical practice. To ensure the best outcomes, the new algorithms will need to be tested in the field.

“Now, let’s test the non-race based algorithm in a busy clinical setting and see, can we refine it even more?” Wright told STAT at the event. “Can we make it even better? More importantly, can we demonstrate that we are not discriminating against anybody?”

Simply removing race is not enough to ensure equitable health outcomes, as shown in a study published this month that examined predictive calculations for colorectal cancer. Researchers tested the performance of four algorithms on thousands of patient records and found that the model including race and ethnicity worked better than a model with race redacted. That doesn’t mean using race is necessarily the right way to go, but it does show the need for thoughtful, individualized alternatives in each flawed equation.

Several experts at the conference acknowledged that medicine is not yet able to remove race completely from clinical practice, largely because racism produces worse health outcomes for patients of color than for white patients, and those differences have to be taken into account.

But in some cases, like the pediatric UTI calculator, use of race is so blatantly wrong that it’s best to remove it.

“Some of these things are just low-hanging fruit,” said David Jones, professor at Harvard Medical School. “If any of these tools have really careless uses of race data, we can get rid of them.”

Experts are hopeful that artificial intelligence tools could also help the medical community come up with more complex, nuanced treatment guidelines. The tools, particularly generative AI, are growing quickly in medicine and might make algebraic clinical calculators obsolete.

“If you could integrate complex administrative datasets, and health datasets, machine learning tools could figure out nuanced ways to do this,” Jones said. “You wouldn’t be swapping race for one particular variable. You would be pulling race out and putting in a bunch of other things.”

Wright said AI tools have tremendous potential, but he believes the best approach is to start by fixing the clinical calculators. If doctors are still using equations that encourage them to assume a patient’s race and treat it as biological, those biased data points might be amplified in an AI algorithm.

“I almost feel like we need to just draw a line in the sand and start from scratch,” Wright said. “At this point moving forward, we will collect data in an unbiased fashion with race consciousness. With AI, I just worry about capturing stuff that was biased historically.”

Combating bias in AI has garnered a lot of attention, but all of the projects funded by the Doris Duke Foundation’s grants focus on racial bias in clinical calculators.

“As it pertains to research, beyond the AI-based algorithm community, getting traction to revisit existing clinical equations has been challenging,” Alvarez wrote in an email to STAT. “We thought hard about how to promote change in the space.”

The American Society of Hematology is working to make sure neutrophil reference ranges based primarily on white male patients don’t result in overtreatment of Black patients for neutropenia, while the American Heart Association is giving $1.2 million in grants to study algorithms prioritized by its clinicians. The National Academies of Sciences, Engineering, and Medicine will develop a report on the use of race and ethnicity in biomedical research, and the Coalition to End Racism in Clinical Algorithms will provide targeted support to safety net hospitals in implementing new race-redacted algorithms.

“We have a lot of work to undo,” said Toni Eyssallenne, CERCA’s senior medical advisor, pointing out that scientific rigor was often lacking in the creation of race-adjusted clinical tools. Her coalition’s project, and others, are about reintroducing that rigor. “We need to use race consciousness to reduce inequities, not to actually exacerbate them.”

This story is part of a series examining the use of artificial intelligence in health care and practices for exchanging and analyzing patient data. It is supported with funding from the Gordon and Betty Moore Foundation.