The Guardian - US
Amy Qin

‘Things were going dark left and right’: the race to save US government datasets before they’re deleted

Composite of activists that help to save US government data

André spent 2025 trying to stay one step ahead of the Trump administration.

Every morning, he woke up and downloaded as many government datasets as he could before they were deleted. He continued throughout the afternoon, and sometimes through the night, if a notification from his group chat popped up on his phone saying that a new webpage had been taken down.

“We didn’t really know what was going to go down usually until right before it happened,” said André, a computer programmer. He asked not to use his real name to protect himself and his family. “Things were going dark left and right.”

In an effort to purge the federal bureaucracy of “woke ideology”, the Trump administration has removed or altered thousands of webpages and data on climate change, reproductive health, international aid, LGBTQ people and more – information that’s used every day to keep people safe, set policy and provide essential services.

“The idea that it could be destroyed in a way that seemed really reckless, and almost pointless, really struck something deep inside me,” André said.

André is part of a group of “data rescuers” who have banded together during Trump’s second term. They have been quietly racing to save hundreds of critical government datasets before they are no longer available. Now known as the Data Rescue Project, it’s a grassroots network of more than 800 people around the world who spend up to 40 hours a week painstakingly archiving the US government’s digital footprint in their spare time.

Anyone can join, but the majority of volunteers are librarians and academics, said Lynda Kellam, a university data librarian and a founding member of the Data Rescue Project. Programmers from the open-source software community and retirees also work on the project. Some, like André, contribute anonymously. What brings everyone together is the belief that “public data should be a public good”, said Kellam.

“We want people to recognize that this is a public good, just like roads and bridges and other kinds of infrastructure.”

Sef Kloninger, a former engineer at Google, joined the Data Rescue Project because he was appalled that the US government would intentionally delete so much information that shapes people and policies. “A less informed populace is going to be easier to control,” he said.

In the early days of the Data Rescue Project, there was a mad dash to save data from any agency they thought Trump or Doge would target next. Volunteers would download every dataset from the Centers for Disease Control and Prevention (CDC) or National Oceanic and Atmospheric Administration (Noaa) webpages as backup copies in case they ever went dark.

But as the pace of deletions seemed to slow down last fall, the group had more time to consider ways to build longer-term data resiliency. The ultimate goal is not just to save data, but to make it accessible and discoverable to the broader public in the same way that libraries curate research and books, said Kellam.

All archived information is now uploaded on to a searchable public repository for at-risk government data hosted by the University of Michigan, DataLumos. Volunteers also write metadata – short descriptions to help users understand what the data measures – for every item they preserve.

“It’s one thing for one person just to download and copy some information, but that’s not really preserving it,” said Frank Donnelly, the head of geographic information systems and data services at a university library who began volunteering last winter. “You need to have a larger ecosystem where you’re saving data and the metadata that goes with it.”

They are also trying to be more intentional about what they save. It’s hard to track how many datasets the US government produces, but there are at least 500,000 federal datasets listed on Data.gov, a site that makes federal data publicly available.

“Thinking that we’ll get it all is impossible,” said Lena Bohman, a medical data librarian and co-founder of the Data Rescue Project who oversees the volunteer network. “What we can do is try to prioritize the things that we think are most at risk.”

As of late April, Data Rescue Project volunteers have archived more than 3,000 items across hundreds of government departments, according to their public tracker. Those items have been downloaded from the project’s public repository more than 18,900 times.

Some items are more than just single datasets, but entire websites and databases, like a full archive of all Nasa webpages and the entire US Fish & Wildlife Service’s Feather Atlas, which contains high-resolution images of the feathers of North American birds.

Volunteers have also managed to save several datasets before they were deleted. HIFLD Open was a collection of more than 400 maps showing critical infrastructure, like hospitals and highways, that emergency responders use during climate disasters. The Department of Homeland Security took it down last summer, but two other groups, the Public Environmental Data Partners and Fulton Ring, used HIFLD data that volunteers had archived to rebuild a public version of the deleted federal tool, which they are calling HIFLD Next.

Other items they have saved have been altered, like CDC data on queer and trans people.

For André, things have calmed down this year. He’s no longer putting out fires all day, and is instead helping with longer-term projects, like figuring out systems for data storage.

He’s also had more time to reflect. A few years ago, he quit a factory job and had severe chronic pain from a spine injury that left him housebound.

“Diving into data rescue helped me take some control and feel like I was actually doing something to help in some way,” he said. Although he’s volunteering without pay, the work has given him a newfound sense of purpose.

“It’s so inspiring to see how many people are willing to put in their time for no compensation at all to just make even the tiniest bit of difference,” he said. “But it makes all the difference when you combine all those efforts together.”

That’s how Kellam feels too. “It’s very much a social movement,” she said, adding that there were now more than 20 groups and thousands of people advocating for public data in different ways, from the Internet Archive to the Environmental Data & Governance Initiative, which is documenting changes to federal environmental data and language with the help of webpages archived on the Internet Archive.

“We’re not going to have a million-person march on Washington on public data,” Kellam said. “But being able to get people interested in a topic that is somewhat nerdy and niche – I think we’ve been really successful with that.”

The Guardian’s Deleted Data series explores how critical US government information is being deleted and what the consequences will be, and will preserve or re-create lost datasets. If you know about any datasets, webpages or government materials that have been deleted or altered in the past year, or how those changes affect you, we’d love to hear from you. Please reach out at deleted-data@theguardian.com.
