After learning the hard way that government data may not always be available or reliable, the research community is finding alternative ways to host important government health data and guidance online.
The Alt CDC Bluesky account posted about one notable archive of CDC datasets hosted on the nonprofit Internet Archive. It houses hundreds of CSV files, metadata files, zip files, PDFs of infographics, and more, all uploaded before Jan. 28, 2025, and available to download. Alt CDC also gave a shout-out to the data archivists who made it possible.
Rachel Hoopsick, PhD, MS, MPH, assistant professor of epidemiology at the University of Illinois Urbana-Champaign, said this archive has salvaged data from the Behavioral Risk Factor Surveillance System, Youth Risk Behavior Surveillance System, and Household Pulse Survey. These tools monitor health-related risk behaviors, chronic conditions, use of preventive services, and factors that contribute to illness, death, and disability among young people.
On Tuesday, a federal judge ordered federal health agencies to restore pages and datasets that had been removed to comply with President Donald Trump’s executive order to scrap language around diversity, equity, and inclusion. The judge’s temporary restraining order was part of a lawsuit brought by Doctors for America to get the pages restored. As of Wednesday, some pages had been restored, according to the Associated Press, but not all.
Hoopsick noted that many of the CDC datasets already had been restored since the initial purge last month, though “the newly available versions have been censored, most notably to remove data related to transgender people.”
“These unprecedented restrictions on scientific information are a threat not only to the integrity of the data itself, but also to the ways in which we respond to that information, including how healthcare professionals provide patient care, and ultimately, the health of populations – especially those that are already made vulnerable in so many other ways,” Hoopsick told MedPage Today.
Becky Smullin Dawson, PhD, MPH, an epidemiologist and professor at Allegheny College in Meadville, Pennsylvania, noted that the archived CDC data is a warehouse for the datasets that researchers would use for peer-reviewed publications. However, it’s “not a user-friendly site like we are used to on CDC.gov,” she added.
For selected user-friendly reports and guidelines, Dawson recommended a less comprehensive alternative: a CDC guidelines resource being put together by journalist Jessica Valenti and hosted in her “Abortion, Every Day” newsletter. In addition, the American College of Obstetricians and Gynecologists (ACOG) is now hosting PDFs of relevant government guidance it has endorsed.
Researchers noted that while these alternatives are important, they are no true replacement for government surveillance systems.
Katelyn Jetelina, PhD, MPH, author of the “Your Local Epidemiologist” newsletter, told MedPage Today that “grassroots efforts, like this one, swiftly archiving key datasets was a critical move, as it provides a reference point before any alterations occurred. The biggest thing missing is the next step: a full analysis comparing the new datasets to these archived ones to understand what changed, and how, and whether there are implications to Americans.”
Hoopsick agreed with Jetelina that comparing archived data to censored government data will determine what information is being censored — and how — which is critical for combatting disinformation and addressing health disparities.
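For readers who want a sense of what such a comparison might involve, the following is a minimal sketch and not a description of any method the researchers used. It assumes that both the archived and the restored versions of a dataset are available as CSV text. The function names and the tiny sample tables are hypothetical and made up purely for illustration; they are not real CDC data.

```python
import csv
import io


def load_rows(csv_text):
    """Parse CSV text into (column names, list of row dicts)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return list(reader.fieldnames or []), rows


def compare_datasets(archived_text, current_text):
    """Report what the archived copy contains that the current copy lacks."""
    old_cols, old_rows = load_rows(archived_text)
    new_cols, new_rows = load_rows(current_text)

    dropped_columns = [c for c in old_cols if c not in new_cols]
    added_columns = [c for c in new_cols if c not in old_cols]

    # Compare rows only on the columns both versions share, so that a
    # dropped column does not make every single row look changed.
    shared = [c for c in old_cols if c in new_cols]

    def key(row):
        return tuple(row[c] for c in shared)

    current_keys = {key(r) for r in new_rows}
    dropped_rows = [r for r in old_rows if key(r) not in current_keys]

    return {
        "dropped_columns": dropped_columns,
        "added_columns": added_columns,
        "dropped_rows": dropped_rows,
    }


# Hypothetical sample tables (illustrative values only, not real data).
archived = (
    "year,state,group,respondents\n"
    "2023,IL,A,12\n"
    "2023,IL,B,340\n"
)
current = (
    "year,state,respondents\n"
    "2023,IL,340\n"
)

report = compare_datasets(archived, current)
print(report["dropped_columns"])
print(len(report["dropped_rows"]))
```

A real analysis would need more care than this sketch, for example handling renamed columns, reordered rows, and changes in how values are coded between versions.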
“For better or worse, CDC became a clearinghouse not just of data for researchers, but also for the public,” Smullin Dawson said. “Watching what happened in the past week has me asking, do we need to have more than one warehouse? What are the costs versus benefits of locating all the data in one spot? Should we have a backup?”
Rachael Robertson is a writer on the MedPage Today enterprise and investigative team, also covering OB/GYN news. Her print, data, and audio stories have appeared in Everyday Health, Gizmodo, the Bronx Times, and multiple podcasts.