# What Is Client‑Side Proof‑of‑Work (Anubis) — and Does It Stop AI Scrapers?
Client‑side proof‑of‑work (PoW) tools like Anubis don’t “stop” AI scrapers outright—but they can materially raise the cost and friction of large‑scale scraping, especially for unsophisticated bots and bulk crawlers. In practice, Anubis is best understood as an economic and operational speed bump: it helps smaller sites survive abusive traffic and makes indiscriminate harvesting more expensive, but it won’t deter a determined, well‑resourced adversary that’s willing to pay for compute and engineering workarounds.
## What client‑side proof‑of‑work (Anubis) is, in plain terms
Anubis is an open‑source anti‑scraping tool that asks a visitor’s browser to perform a small amount of computation before the site serves the requested content. Instead of the server doing all the work up front—potentially at scale for a scraper—the client pays a “compute toll.”
Conceptually, it borrows from Hashcash (1997), a proof‑of‑work idea originally used to make spam more expensive. Anubis applies the same logic to web abuse: if each page view or request comes with a small required amount of CPU work, then a casual human visitor experiences a brief delay, but a scraper attempting millions of requests faces a meaningful bill.
Anubis was created by Xe Iaso (Techaro) after their infrastructure was hit by heavy crawler traffic (including a reported incident in which Amazon’s crawler hammered a git server). It has since been adopted primarily by git forges, free/open‑source communities, and small sites that need low‑cost protection against aggressive automated crawling.
## How it works technically: the challenge‑and‑solve flow
Under the hood, Anubis implements a challenge‑response pattern at the web edge:
- Initial request gets challenged. When a client hits a protected endpoint, the server responds with a challenge page or HTTP response rather than the upstream content.
- Browser computes a nonce. The client runs JavaScript that repeatedly tries values (a nonce) until it finds one that makes a SHA‑256 hash meet a difficulty target—typically something like:
  `SHA-256(hash_input || nonce)` has a required number of leading zero bits.
- Client returns solution; server verifies. Once the puzzle is solved, the client submits the nonce. The server verifies the work (verification is cheap compared to solving).
- Temporary access is granted. If valid, the server issues access to the upstream resource—often via a short‑lived token/cookie so the client isn’t forced to solve on every single request.
A notable detail is tunable difficulty. Anubis ships with a conservative default (reported as roughly five leading zeros, and configurable) intended to impose only modest work on a typical modern browser while still significantly changing the economics of mass parallel requests.
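The solve‑and‑verify flow above can be sketched in a few lines of Python. This is a simplified model of the scheme, not Anubis’s actual implementation: the real challenge format, difficulty encoding, and token issuance differ, and the function names here are invented for illustration.

```python
import hashlib
import secrets

def solve_challenge(challenge: str, difficulty_bits: int) -> int:
    """Client side: brute-force a nonce so that SHA-256(challenge || nonce)
    begins with `difficulty_bits` leading zero bits."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
        # A digest below 2^(256 - difficulty_bits), read as a big-endian
        # integer, has the required number of leading zero bits.
        if int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits)):
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash, cheap compared to the client's search."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

# The server issues a random challenge; the client burns CPU to answer it.
challenge = secrets.token_hex(16)
nonce = solve_challenge(challenge, difficulty_bits=8)
assert verify(challenge, nonce, difficulty_bits=8)
```

The asymmetry is the whole point: the client performs an expected 2^difficulty hash attempts, while the server verifies with exactly one.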
Because this is client‑side, it implicitly favors “real browsers” that can execute JavaScript and compute SHA‑256 quickly enough. That’s part of the point: a lot of simplistic crawlers and headless scripts either don’t run JS at all or aren’t engineered to handle interactive challenges efficiently.
## Why Anubis is appealing to small sites and OSS projects
For small operators, Anubis is attractive because it’s designed to be lightweight and deployable without major infrastructure changes. It can sit in front of upstream services like a web firewall/edge utility, reducing the risk that a burst of automated traffic overloads an origin server—particularly common for community infrastructure like git hosting.
It’s also transparent in a way many commercial bot defenses are not: it’s open source (TecharoHQ/anubis on GitHub) and documented, including design notes that describe how the challenge flow works. That matters for community sites that prefer tools they can audit and explain to users.
## Strengths: what Anubis blocks well
Anubis is most effective when the abuse looks like “cheap requests at scale.” Concretely, it can:
- Slow down or deter non‑JavaScript clients and basic HTTP scraping scripts that don’t execute the challenge.
- Raise the cost of high‑volume parallel scraping. If each request forces measurable CPU work, “just add more requests” becomes “add more compute,” which changes the economics of indiscriminate harvesting.
- Reduce server load during crawler spikes by pushing work to clients and gating upstream access behind a solved challenge.
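To make the economics concrete, here is a back‑of‑envelope cost model. Every number below is an illustrative assumption, not a measured Anubis benchmark: at a difficulty of *d* leading zero bits, a solve takes about 2^*d* hash attempts on average, so the attacker’s total work scales linearly with request volume.

```python
# Back-of-envelope PoW cost model (illustrative assumptions throughout).
difficulty_bits = 20                # e.g. 5 leading zero hex characters = 20 bits
expected_hashes_per_solve = 2 ** difficulty_bits
requests = 1_000_000                # a hypothetical bulk-scraping campaign
hash_rate = 5_000_000               # assumed SHA-256 hashes/sec on one CPU core

total_hashes = requests * expected_hashes_per_solve
cpu_seconds = total_hashes / hash_rate
cpu_hours = cpu_seconds / 3600
print(f"~{cpu_hours:,.0f} CPU-hours to scrape {requests:,} pages")
```

Under these assumptions, a million requests cost tens of CPU‑hours; a human visitor solving one challenge pays only a fraction of a second. Whether that toll deters anyone depends entirely on how valuable the data is to the scraper.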
This is why PoW tools often show up first in places like OSS infrastructure: the aim is less “perfect security” and more “keep the lights on” against abusive automation.
## Limits and trade‑offs: where Anubis falls short
Client‑side PoW is a deterrent, not a wall. The key limitations are baked into the threat model:
- Determined adversaries can pay the toll. A well‑resourced scraper can amortize proof‑of‑work using cloud instances, distributed botnets, or large‑scale browser automation. In that world, PoW becomes an operational expense—sometimes acceptable if the data is valuable enough.
- It can hurt legitimate users. By design, PoW shifts cost to visitors. Users on low‑power devices, constrained environments, or those using tools that block JavaScript may see degraded performance—or be unable to access the site.
- It can block “good bots” too. Archivers, research crawlers, and other legitimate automated clients may be caught in the same net unless operators provide exceptions or alternative access paths.
- Arms‑race dynamics apply. Attackers can adapt (parallelize more, run real browsers at scale, or build specialized solving pipelines), which can pressure defenders to crank difficulty—potentially worsening UX and accessibility.
If you’re already thinking “this sounds like spam filters,” that’s the right mental model: PoW helps reshape incentives, but it doesn’t end abuse.
## Why It Matters Now
The renewed interest in tools like Anubis is closely tied to the broader friction between publishers and AI firms over web‑scale scraping and automated data harvesting. As more sites experience heavy crawler traffic—sometimes overwhelming smaller infrastructure—cheap, deployable countermeasures become attractive.
At the same time, debates about how AI systems interact with developer and publishing workflows are spilling into public view. Even relatively small product changes can become contentious when they touch provenance and attribution, a theme that shows up in developer tooling discussions such as commit metadata and AI assistance (see: Why Is VS Code Adding “Co-Authored‑by: Copilot” to My Git Commits?). In that context, Anubis is part of a broader question: who bears the cost of AI-era automation—platforms, model builders, or the people running websites?
For many small operators, PoW is attractive precisely because it’s not a years‑long policy fight. It’s a practical knob they can turn today.
## Practical mitigations for publishers and developers
Anubis works best as defense in depth, not as a silver bullet:
- Pair PoW with rate limiting, and use logging to distinguish nuisance traffic from persistent scraping attempts.
- Start with conservative difficulty and monitor false positives, especially from low‑power devices.
- Provide clear documentation on why PoW is used and a path for legitimate automated clients to request access or exceptions.
- Consider escalation paths (challenge → stricter throttling → blocks) rather than jumping straight to maximum friction.
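As a sketch of such an escalation ladder (hypothetical thresholds and action names, not a built‑in Anubis feature), a per‑client request counter over a sliding window can map traffic volume to progressively stricter responses:

```python
import time
from collections import defaultdict

# Hypothetical escalation ladder: more requests per window, stricter response.
THRESHOLDS = [(100, "pow_challenge"), (500, "throttle"), (2000, "block")]

class Escalator:
    def __init__(self, window_seconds: float = 60.0):
        self.window = window_seconds
        self.hits: dict[str, list[float]] = defaultdict(list)

    def action_for(self, client_ip: str) -> str:
        """Record one request and return the response tier for this client."""
        now = time.monotonic()
        # Keep only hits inside the sliding window, then record this one.
        recent = [t for t in self.hits[client_ip] if now - t < self.window]
        recent.append(now)
        self.hits[client_ip] = recent
        # Check the strictest tier first.
        for limit, action in reversed(THRESHOLDS):
            if len(recent) > limit:
                return action
        return "allow"
```

The design intent is that most clients never leave the `allow` tier, nuisance traffic meets the PoW challenge, and only sustained abuse reaches throttling or outright blocks.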
For a broader view of where “control” measures run into real-world constraints, see AI Agents Meet Security and Control Limits.
## What to Watch
- Adoption vs. adaptation: Whether more small sites and git forges deploy PoW—and how scraper operators respond (more compute, distributed solving, browser automation).
- UX and accessibility pressure: How communities balance protection with usability for legitimate users and constrained devices.
- Next-gen alternatives: Whether improved bot detection, attestations, or other edge controls become easier to deploy than PoW—or are layered on top of it.
- Norms and governance: As the “acceptable scraping” debate evolves, watch whether technical barriers like PoW become a default baseline, or a stopgap until policy and platform rules catch up.
Sources: aitoolly.com, github.com, theregister.com, anubis.techaro.lol, xeiaso.net, en.wikipedia.org
## About the Author
yrzhe
AI Product Thinker & Builder. Curating and analyzing tech news at TechScan AI. Follow @yrzhe_top on X for daily tech insights and commentary.