Loading...
Loading...
Major publishers are increasingly blocking the Internet Archive’s Wayback Machine crawler and limiting API access, arguing that archived news can be repurposed for scraping and AI training—especially for paywalled content. Investigations found dozens of prominent sites, plus Reddit, restricting ia_archiverbot, while outlets like The Guardian and Financial Times are selectively filtering article URLs. Digital rights groups such as the EFF warn the move won’t meaningfully stop AI but will degrade a critical public record used for journalism, research, and legal evidence, especially when articles are edited or removed. The dispute lands amid broader legal pressure on the Archive, raising fears the web is becoming effectively “unarchivable.”
Blocking the Wayback Machine reduces access to archived news and web records that journalists, researchers, and engineers rely on for verification, provenance, and reproducibility. Tech professionals building datasets, compliance tools, or integrity checks face degraded sources and legal uncertainty when public web history is incomplete.
Dossier last updated: 2026-05-10 03:58:03
Software engineer Gregor Stocks has launched the Library of Leng, a searchable database preserving writing about Magic: The Gathering, indexing about 175,000 articles spanning roughly 30 years. The project aggregates early Usenet posts, hobbyist webpages captured by the Internet Archive, and Wizards of the Coast updates that are often later removed. Rather than republishing content, the site lists headlines, snippets, and links to archived copies on the Wayback Machine/Internet Archive unless authors explicitly consent to full reprints. Stocks told 404 Media the biggest challenge was parsing inconsistent 1990s–2000s web data and metadata. Early material includes strategy writing from 1994 and more recent tournament-rule announcements. Community response has been positive; Stocks has contacted Wizards for permission to host its old content but has not received a reply.
A public petition urges major publishers — specifically The New York Times, The Atlantic, and USA Today — to stop blocking the Internet Archive’s Wayback Machine from preserving news articles. The petition argues publishers’ AI concerns are speculative and that archiving is vital for journalistic integrity, fact-checking, historical record, and resistance to censorship and authoritarian pressure. It highlights instances where outlets blocked archiving even as their reporting relies on or benefits from preserved pages, and stresses that the nonprofit Internet Archive operates with long-term stewardship and ethical restraint compared with for-profit actors. The plea calls on newsroom leadership to publicly commit to restoring access so future readers and researchers can access and verify reporting.
A petition urges major news outlets — specifically The New York Times, The Atlantic, and USA Today — to allow the Internet Archive’s Wayback Machine to continue preserving their reporting after some publishers began blocking archival crawls. Supporters argue that the archive is a neutral public good that safeguards journalism against censorship, deletion, and revision in an era of growing authoritarian pressure and AI-fueled content misuse. The petition points to recent reporting showing contradictions where outlets rely on archived material while preventing its preservation, and warns that blocking the Wayback Machine hands an advantage to unscrupulous actors who scrape content without the Archive’s ethical constraints. It calls on media leaders to publicly recommit to working with the Internet Archive.
The Internet Archive has launched a new foundation based in Switzerland to expand its global mission of preserving digital knowledge. The move establishes a European legal and operational hub intended to improve international governance, fundraising, and collaborations while offering a jurisdiction perceived as favorable for cultural preservation and data stewardship. Key players include the Internet Archive organization and its leadership, positioning the new Swiss foundation as complementary to the U.S.-based nonprofit. This matters because a European presence can ease cross-border partnerships, address legal and copyright complexities, and reassure international donors and partners about stewardship and governance of large digital collections. The step may influence how digital archives are maintained and governed globally.
A blog post reports that George Orwell’s 1939 review of Bertrand Russell’s book “Power: A New Social Analysis” had become difficult to find online and was no longer appearing in search results. The author says they previously learned from the piece, describing it as more than a standard book review, and recently tried to locate it again. After “sleuthing,” they recovered a copy via the Internet Archive and note that the review was originally published in The Adelphi in January 1939. The post also points readers to a scanned version of the original publication hosted on the Internet Archive. The item matters as an example of digital preservation and the role of archives in restoring access to historical texts.
Two hundred journalists publicly praised the Internet Archive and its Wayback Machine for preserving news, documents and rare books, signing a letter defending the service as major outlets debate restricting archival access. Signatories — including Rachel Maddow, Justin Bank, Ashley Belanger, Annalee Newitz and other reporters — cited daily use for fact-checking, tracing policy and terms-of-service changes, reconstructing altered press releases, and accessing hard-to-find scholarly works. They argue the Archive is an essential public record and forensic tool that prevents “memory-holing” of online content, likening its role to national libraries and urging protection of broad, impartial access to digital history. The endorsements spotlight tensions between publishers’ rights and public-interest archiving.
About 200 journalists have publicly praised the Internet Archive and its Wayback Machine for preserving news, historical materials, and rare books, urging support as some media outlets debate blocking archival copies of their work. Signatories — including Rachel Maddow, Justin Bank, Ashley Belanger, Annalee Newitz and other reporters — describe using the Archive daily to fact-check, recover altered press releases, compare past website versions, and access otherwise rare research materials. They argue the Archive functions as a critical public ledger and research tool that protects the historical record against revision, censorship, and link rot, making it essential for accountability reporting and cultural preservation.
The Internet Archive’s Wayback Machine, a critical web archiving service run by the nonprofit Internet Archive, is facing financial and legal pressures that threaten its ability to preserve web history. The organization has grown into a vital resource for researchers, journalists, and developers by storing billions of web pages, books, and multimedia; now rising costs, server maintenance, and copyright litigation risks imperil its operations. Key players include the Internet Archive, its founder Brewster Kahle, libraries and academic partners, and rights holders involved in lawsuits. The outcome matters for digital preservation, public access to historical internet content, and the dependability of a shared web heritage relied on by tech companies and the broader internet ecosystem.
The Internet Archive’s Wayback Machine—home to over a trillion archived web pages used by journalists, researchers, and courts—is losing access to major publishers after The New York Times and others began blocking its crawlers to prevent AI scraping. The move, driven by publisher concerns about AI models being trained on copyrighted news, risks erasing a decades‑long historical record of how stories originally appeared online. The article argues that archiving and searchable copying have established fair‑use precedent (citing past cases like Google Books) and that nonprofit preservation serves a transformative, public‑interest purpose distinct from commercial AI training. It warns that cutting off the Archive to control AI access would harm future research and the public record.