This Week's Journal Outage: AI Scraping
Posted by A. David Lewis, with resource assistance from Matthew Noe on 2025-09-20
During a recent meeting of the NNLM Region 7 COI for Graphic Medicine, one of the participants, interested in resource and publishing opportunities with the Graphic Medicine Review, noted that the site was down. Indeed, it was: the entire Janeway journal system was offline due to a mass AI bot attack.
Fortunately, the outage was resolved relatively quickly and without further issue -- except, that is, for the growing problem of AI crawlers. As Starchy Grant notes in their article for the Electronic Frontier Foundation, "scraping itself is not the problem. Automated access is a fundamental technique of archivists, computer scientists, and everyday users that we hope is here to stay—as long as it can be done non-destructively." The problem is not that our sites are being accessed and catalogued by automated programs; it's that they're being scraped haphazardly and relentlessly.
Writing for Nature, Diana Kwon notes that "the rise of generative AI has led to a deluge of bots, including many ‘bad’ ones that scrape without permission." The more ubiquitous and the less computationally intensive AI gets, the more likely such scraping becomes. NPR technology correspondent Bobby Allyn covered this growing crisis last year, and it comes down to a profit motive: "As they explode norms in search of more data, the AI firms are getting richer." And, while infrastructure firms like Cloudflare are launching new tools to block such bot floods, they are also capitalizing on them, "moving forward with a Pay Per Crawl program that lets customers charge AI companies to scrape their websites," according to Kate Knibbs of Wired.
All this is to say that GMR will continue working with the robust Janeway system to stay vigilant against such outages. But this trend may well extend beyond this journal, across all manner of academic, library, and administrative platforms. There may come a time soon when editors and stakeholders from all over the kingdoms of online content will have to come together and dig a deeper moat.
Allyn, B. (2024). Artificial intelligence web crawlers are running amok. NPR. Retrieved from https://www.npr.org/2024/07/05/nx-s1-5026932/artificial-intelligence-web-crawlers-are-running-amok
Grant, S. (2025). Keeping the web up under the weight of AI crawlers. Electronic Frontier Foundation. Retrieved from https://www.eff.org/deeplinks/2025/06/keeping-web-under-weight-ai-crawlers
Knibbs, K. (2025). Cloudflare is blocking AI crawlers by default. Wired. Retrieved from https://www.wired.com/story/cloudflare-blocks-ai-crawlers-default/
Kwon, D. (2025). Web-scraping AI bots disrupt databases and journals. Nature 642(8067): 281-282. Retrieved from https://doi.org/10.1038/d41586-025-01661-4