How the shakeup at OpenAI underscores the need for AI standards in health care

The leadership turmoil within OpenAI, the maker of ChatGPT, is triggering calls for stepped-up efforts to establish standards for how generative AI is used across the health care industry, where experts worry that one or two companies could end up with too much control.

Microsoft was already a driving force behind efforts to deploy generative AI in health care. But its hiring of Sam Altman and Greg Brockman, formerly top executives at OpenAI, gives the company even more power over the technology’s development, testing, and use in the care of patients.


“There is a concern that we’re entering a sort of Coke or Pepsi moment where there will be a few big players in the generative AI…market and their systems will form the backbone of a lot of what is to come,” said Glenn Cohen, director of Harvard Law School’s Petrie-Flom Center for Health Law Policy.

He added that those players — namely Microsoft and Google — will have the power to set prices and establish practices for testing generative AI and assessing its risks, exerting a level of control that is “not ideal for physicians or patients, especially when a market is just getting off the ground.”

Brett Beaulieu-Jones, a health data researcher at the University of Chicago who works with GPT platforms, noted that experts had been worried about OpenAI becoming more profit-driven even before the recent chaos.


“There are a lot of people in health care who are nervous about working with them because they feel misled by their prior shift,” Beaulieu-Jones said. “There’s a common joke that they started out as non-profit OpenAI and became ClosedAI for all profit.”

While the technology has the potential to help patients and save money on administrative tasks, it also poses significant dangers. It can perpetuate biases and skew decision-making with inaccurate, or hallucinated, outputs. Large technology businesses, including OpenAI and Microsoft, have not been transparent about the data fed into their models or the error rates of specific applications.

The turmoil within OpenAI’s leadership is an ongoing saga, and it remains to be seen how the power balance will ultimately play out between the nonprofit and Microsoft.

But the creeping consolidation harks back to a similar process that unfolded in the business of selling electronic health records, where a couple of companies, Epic and Oracle, own most of the market and can set prices and the pace of innovation. In recent months, Microsoft has aligned itself with Epic to begin embedding generative AI tools and capabilities in health systems across the country.

Those inroads, combined with the dramatic shakeup at OpenAI, have left some health AI experts with a disquieting feeling about how fast the technology is advancing at a time when there is no consensus on how to assure its safety or even measure performance.

“All of a sudden, within a span of three days, it feels like things have completely changed,” said Suresh Balu, program director of the Duke Institute for Health Innovation. “It’s even more important to have guardrails where we have an independent entity through which we can ensure safety, quality, and equity.”

No such entity exists. The Food and Drug Administration regulates artificial intelligence in health care, but it doesn’t concern itself with many of the applications contemplated for generative AI, such as automating clinician notetaking, answering patients’ questions, auditing claims, and writing letters to contest payment denials. Another federal agency, the Office of the National Coordinator for Health Information Technology, is just beginning to consider regulations for those kinds of uses.

Meanwhile, imperatives for innovation and caution are clashing not only within OpenAI and Microsoft, but also within large health systems that want to be seen as working on the cutting edge of AI in health care. At Duke, for example, clinical leaders have struck up a partnership with Microsoft to develop generative AI tools, while Balu and other prominent data scientists are emphasizing the need for standards in the use of the technology.

Duke is far from alone. New York University’s health system has also signed a deal to experiment with the GPT line of AI models, and many providers are testing a Microsoft product known as Dax Copilot that relies on generative AI to automatically document patient-doctor conversations. Mayo Clinic is testing that technology, and it has inked a long-term data and technology partnership with Google, which is also developing generative AI tools focused on medicine.

Some of Mayo’s clinical leaders are also working to develop standards for AI adoption and testing through the Coalition for Health AI, a group that includes Stanford and Johns Hopkins as well as Microsoft and federal regulators.

Brian Anderson, a co-founder of the coalition and chief digital health physician at Mitre Corp., said the sudden reshuffling of roles within Microsoft and OpenAI underscores the urgency of the coalition’s work.

“It’s critically important to have independent validation of an organization’s internally developed models,” Anderson said. He added that with Altman and Brockman, Microsoft is adding to an already deep roster of technical experts focused on the risks of building and deploying generative AI tools in safety-critical sectors like health care.

But that doesn’t mean the company, with its overarching focus on driving earnings and a pipeline of products, can objectively assess whether its generative models are being properly vetted prior to commercial use. The coalition has recommended that such work be carried out by a network of laboratories that could validate the models and help mitigate risks.

“We just don’t have that yet, and we desperately need it,” Anderson said. “We need to come together, not just to build a testing and evaluatory framework, but to build this ecosystem of labs to actually do that work.”

Mohana Ravindranath and Lizzy Lawrence contributed reporting.

This story is part of a series examining the use of artificial intelligence in health care and practices for exchanging and analyzing patient data. It is supported with funding from the Gordon and Betty Moore Foundation.