What Claude Code’s undocumented hooks really do — and how to sandbox them safely

By yrzhe | May 29, 2026 | 8 min read

Claude Code’s “undocumented” hooks are better described as under-used extension points that live mostly in the reference docs. They let you attach deterministic handlers (shell commands, HTTP endpoints, or prompt/agent-style verifiers) to specific lifecycle events so you can reliably inspect, block, or augment what the agent is about to do or has just done. Unlike a CLAUDE.md instruction, which the model might ignore, hooks execute as real processes or real network calls, so they can enforce policy every time.

What the hooks can actually do (in practice)

The core capability is simple: on key events in the Claude Code loop, Claude Code emits a structured JSON payload and invokes your handler. That handler can then take enforceable actions: fail fast to block a tool call, run a formatter, scan for secrets, or emit audit logs/telemetry. The important builder consequence is that hooks turn “guidance” into enforcement: they’re executed outside the model’s stochastic reasoning path.

Common patterns shown across the official docs and community guides include:

Block destructive commands before they run (for example, denying risky bash invocations) via PreToolUse.
Protect sensitive files (like .env) by denying edits/writes that match path patterns.
Run deterministic hygiene after tool calls: formatting, linting, quick tests, or log shipping via PostToolUse.
Inject or validate context at prompt time (UserPromptSubmit / UserPromptExpansion) so the agent always starts from consistent repo state or policy constraints.
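
A minimal sketch of the first two patterns as one command-hook script. It assumes the PreToolUse payload carries `tool_name` plus a `tool_input` object with `command` (for Bash) or `file_path` (for Edit/Write), and that exit code 2 is the blocking code. The regexes and the `PROTECTED_NAMES` set are illustrative, so check all of this against the hooks reference for your version.

```python
import json
import re
import sys
from pathlib import PurePosixPath
from typing import Optional

# Illustrative patterns only; tune them for your own repo and threat model.
DESTRUCTIVE = [
    re.compile(r"\brm\s+-[a-zA-Z]*[rf]"),        # rm -rf, rm -fr, rm -r, rm -f
    re.compile(r"\bgit\s+push\b.*--force"),      # forced pushes
]
PROTECTED_NAMES = {".env"}


def check(event: dict) -> Optional[str]:
    """Return a denial reason, or None to allow the tool call."""
    tool = event.get("tool_name")
    tool_input = event.get("tool_input") or {}
    if tool == "Bash":
        command = tool_input.get("command", "")
        for pattern in DESTRUCTIVE:
            if pattern.search(command):
                return f"blocked destructive command: {command!r}"
    elif tool in {"Edit", "Write"}:
        path = tool_input.get("file_path", "")
        if PurePosixPath(path).name in PROTECTED_NAMES:
            return f"blocked write to protected file: {path}"
    return None


def main(stdin=sys.stdin, stderr=sys.stderr) -> int:
    """Read one event from stdin and return the exit code for the hook."""
    reason = check(json.load(stdin))
    if reason:
        print(reason, file=stderr)
        return 2  # assumed blocking exit code
    return 0
```

To use it as a real hook script, end the file with `raise SystemExit(main())`. Keeping `check` separate from the stdin handling makes the policy testable without a running agent.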

Hook types, events, and the decision model

Claude Code supports three practical handler styles:

Command hooks: execute a shell command; Claude Code passes event context as JSON on stdin. Your script returns an exit code (and, depending on the event, structured allow/deny decisions).
HTTP hooks: Claude Code POSTs JSON to a configured endpoint, letting you centralize policy decisions in a service.
Prompt/agent hooks (often described as verifiers): used when you want richer “decisioning” than a shell script can express, while still keeping the hook invocation deterministic.

These handlers can be attached to six groups of lifecycle events (each with documented JSON schemas and decision options in the hooks reference):

SessionStart (session begins/resumes)
Setup (init/maintenance style flows)
UserPromptSubmit and UserPromptExpansion (before Claude processes the prompt)
PreToolUse (before each tool call in the agent loop; can block)
PostToolUse (after each tool call; cannot retroactively block)
Stop / StopFailure and SessionEnd (turn/session wrap-up)

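The key constraint: PreToolUse is the enforcement choke point. If your command hook exits with code 2, the blocking exit code (or returns an explicit “deny” decision where supported), the tool call doesn’t run. Other non-zero exit codes are reported as non-blocking errors, so a handler that merely crashes does not block anything by itself. PostToolUse is for follow-ups only: you can’t prevent the preceding action because it already happened.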
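
The asymmetry between the two events can be sketched as a small response builder. The structured field names used here (`hookSpecificOutput`, `hookEventName`, `permissionDecision`, `permissionDecisionReason`) are my reading of the hooks reference, so treat them as assumptions. The function `respond` is a hypothetical helper, not part of Claude Code.

```python
import json
from typing import Optional, Tuple

BLOCKING_EVENTS = {"PreToolUse"}  # events where a deny can still stop the action


def respond(event_name: str, deny_reason: Optional[str]) -> Tuple[int, str, str]:
    """Return (exit_code, stdout_text, stderr_text) for a command hook."""
    if deny_reason is None:
        return 0, "", ""
    if event_name in BLOCKING_EVENTS:
        # Structured deny: exit 0 and put the decision on stdout as JSON.
        decision = {
            "hookSpecificOutput": {
                "hookEventName": event_name,
                "permissionDecision": "deny",
                "permissionDecisionReason": deny_reason,
            }
        }
        return 0, json.dumps(decision), ""
    # After the fact there is nothing left to block, so only report.
    return 0, "", f"note ({event_name}): {deny_reason}"
```

The design choice is that a PostToolUse finding becomes a report rather than a failure, since the tool call has already happened.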

Why hooks change how you must sandbox agents

Hooks run as actual code (a process or an HTTP call), not as “soft” model instructions. That’s exactly why they’re valuable—and why they expand your attack surface.

Mechanism-wise, a command hook is just a shell process with full access to whatever environment and filesystem you give it. An HTTP hook is a network egress path that receives rich JSON context about prompts, tool calls, file paths, and tool outputs. The moment you add hooks, you’ve introduced new components that can:

Exfiltrate data (intentionally or accidentally) if an HTTP endpoint is mis-scoped or reachable beyond localhost.
Corrupt the workspace (a buggy formatter, an over-broad cleanup script).
Escalate impact by running at every tool call (PreToolUse/PostToolUse fire frequently).

So the builder consequence is counterintuitive: hooks make the agent more governable, but they also require a stronger sandbox story than “I trust the model” because you must now trust (and constrain) your handlers.

How to sandbox hooks safely: rules that hold up under stress

A workable approach is to treat hook handlers like you would treat CI scripts that run on untrusted input: locked down by default, narrowly scoped, and logged.

Least privilege by configuration

Prefer narrow matchers so hooks run only where intended (specific tools, file patterns, or command types).
Avoid “global” command hooks that run on every repo and every tool call unless you truly need them.
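
Beyond whatever matcher you configure in Claude Code, a handler can re-check its own scope and exit early. A rough sketch, assuming the same `tool_name` and `tool_input.file_path` payload fields as above (`in_scope` is a hypothetical helper):

```python
from fnmatch import fnmatchcase
from typing import Iterable, Optional


def in_scope(event: dict, tools: Iterable[str],
             path_globs: Optional[Iterable[str]] = None) -> bool:
    """True only when the event targets a listed tool and, if globs are
    given, a matching file path."""
    if event.get("tool_name") not in set(tools):
        return False
    if path_globs is None:
        return True
    path = (event.get("tool_input") or {}).get("file_path", "")
    # fnmatchcase keeps matching case-sensitive on every platform.
    return any(fnmatchcase(path, glob) for glob in path_globs)
```

A formatter hook might start with `if not in_scope(event, {"Edit", "Write"}, ["*.py"]): return 0`, so it does nothing for shell commands or unrelated files.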

Process isolation for command hooks

Run command hooks in a container or similarly restricted execution environment: unprivileged user, minimal filesystem mounts, and only the directories required for the check. If the hook only needs to read diffs or scan files, mount the workspace read-only.
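
One way to wire this up is a thin wrapper that forwards the event JSON into a locked-down `docker run`. The image name, the `/workspace` mount point, and the hook command are placeholders, and the flag set is a reasonable starting point rather than a complete hardening profile.

```python
import os
import subprocess
from typing import List, Sequence, Tuple


def docker_argv(workspace: str, image: str, hook_cmd: Sequence[str]) -> List[str]:
    """Build a docker command line for running one hook check."""
    return [
        "docker", "run", "--rm", "-i",
        "--network=none",                 # no egress for a pure file check
        "--user", "65534:65534",          # unprivileged uid:gid (nobody)
        "--read-only",                    # read-only container filesystem
        "--cap-drop=ALL",                 # drop all Linux capabilities
        "-v", f"{os.path.abspath(workspace)}:/workspace:ro",  # read-only mount
        "-w", "/workspace",
        image, *hook_cmd,
    ]


def run_hook_in_container(event_json: str, workspace: str, image: str,
                          hook_cmd: Sequence[str],
                          timeout: float = 30.0) -> Tuple[int, str, str]:
    """Pipe the event JSON to the containerized hook; return (code, out, err)."""
    proc = subprocess.run(
        docker_argv(workspace, image, hook_cmd),
        input=event_json, text=True, capture_output=True, timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr
```

For example, `docker_argv(".", "python:3.12-slim", ["python", "/opt/hooks/check.py"])` assumes you have built the hook script into the image at a hypothetical `/opt/hooks` path, which keeps the script itself out of the agent's writable workspace.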

Network discipline for HTTP hooks

If you use HTTP hooks, decide whether the endpoint should ever be reachable off-machine. A local-only endpoint reduces exfiltration risk; a remote endpoint increases central control but is a bigger data-handling commitment because event payloads may include sensitive context.
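
A local-only endpoint mostly comes down to the bind address. The sketch below uses Python's standard `http.server` and binds to `127.0.0.1`. The response body is a made-up acknowledgement, not the response format Claude Code expects from an HTTP hook, and `hook_event_name` is an assumed payload field.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class VerifierHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_error(400, "invalid JSON")
            return
        # Hypothetical acknowledgement; replace with your real decision body.
        body = json.dumps({"received": event.get("hook_event_name")}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # silence the default access log on stderr


def make_server(port: int = 0) -> HTTPServer:
    """Bind to loopback only; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), VerifierHandler)
```

Calling `make_server(8787).serve_forever()` would listen on a port of your choosing (8787 is arbitrary). Binding to `0.0.0.0` instead is the one-line change that turns this into an off-machine endpoint, which is why the bind address deserves review.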

Auditability as a first-class feature

Log the full event payloads (or a carefully redacted subset), handler stdout/stderr, exit codes, and allow/deny decisions. Hooks are supposed to be deterministic; an append-only audit log makes that determinism debuggable and replayable.
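
A small sketch of such a log as JSON Lines, with redaction applied before anything is written. The key names in `REDACT_KEYS` are guesses at which payload fields are sensitive. The hash of the unredacted payload lets you confirm later that a replayed event is the same one, without storing its contents.

```python
import hashlib
import json
import time

REDACT_KEYS = {"prompt", "content", "tool_response"}  # assumed sensitive fields


def redact(value):
    """Recursively replace values stored under sensitive keys."""
    if isinstance(value, dict):
        return {k: "[REDACTED]" if k in REDACT_KEYS else redact(v)
                for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value


def audit_record(event: dict, exit_code: int, decision: str, stderr: str = "") -> dict:
    """Build one log entry: metadata, a payload hash, and the redacted payload."""
    canonical = json.dumps(event, sort_keys=True).encode("utf-8")
    return {
        "ts": time.time(),
        "payload_sha256": hashlib.sha256(canonical).hexdigest(),
        "payload": redact(event),
        "exit_code": exit_code,
        "decision": decision,
        "stderr": stderr,
    }


def append_audit(path: str, record: dict) -> None:
    """Append one JSON line; existing lines are never rewritten by this code."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

Append mode only stops this script from overwriting earlier entries. Making the file tamper-resistant is a separate, OS-level concern.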

For related practitioner context on agent workflow orchestration, see Claude Code goes dynamic — practical wins for agent builders.

How a solo builder can prototype a safe local equivalent

You can prototype “hook-like” enforcement without needing any special internal APIs by recreating the basic event loop:

Emit a JSON envelope that looks like the hook schema for a small subset of events you care about (start with SessionStart, PreToolUse, PostToolUse).
Pipe that JSON into a local script (command hook style) or POST it to a localhost service (HTTP hook style).
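Enforce decisions with simple semantics: exit code 0 = allow, non-zero = deny (for PreToolUse). This fails closed and is stricter than Claude Code itself, which treats exit code 2 as the blocking code.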
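
The three steps above can be sketched as a tiny harness. The envelope fields (`session_id`, `hook_event_name`, `tool_name`, `tool_input`) imitate the hook schema but are assumptions, and `guarded_tool_call` is a hypothetical name for the loop, with `run_tool` standing in for whatever actually executes the tool.

```python
import json
import subprocess
from typing import Callable, Sequence, Tuple


def make_envelope(event_name: str, tool_name: str, tool_input: dict) -> dict:
    """Build a hook-like JSON envelope for one event."""
    return {
        "session_id": "local-prototype",
        "hook_event_name": event_name,
        "tool_name": tool_name,
        "tool_input": tool_input,
    }


def run_command_hook(hook_argv: Sequence[str], envelope: dict,
                     timeout: float = 10.0) -> Tuple[int, str]:
    """Pipe the envelope to the hook on stdin; return (exit_code, stderr)."""
    proc = subprocess.run(list(hook_argv), input=json.dumps(envelope),
                          text=True, capture_output=True, timeout=timeout)
    return proc.returncode, proc.stderr.strip()


def guarded_tool_call(hook_argv: Sequence[str], tool_name: str, tool_input: dict,
                      run_tool: Callable[[dict], object]) -> dict:
    """PreToolUse may veto; PostToolUse runs afterwards and cannot."""
    code, reason = run_command_hook(
        hook_argv, make_envelope("PreToolUse", tool_name, tool_input))
    if code != 0:  # fail closed: any non-zero exit denies the call
        return {"ran": False, "exit_code": code, "reason": reason}
    result = run_tool(tool_input)
    post_code, _ = run_command_hook(
        hook_argv, make_envelope("PostToolUse", tool_name, tool_input))
    return {"ran": True, "result": result, "post_exit_code": post_code}
```

A call such as `guarded_tool_call(["python", "check.py"], "Bash", {"command": "ls"}, my_runner)` would run the hypothetical `check.py` before and after `my_runner`. Swapping the subprocess call for an HTTP POST gives the localhost-service variant.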

Containment doesn’t need to be exotic to be useful. For example:

Run your hook scripts in an ephemeral Docker container with --network=none when the hook doesn’t need egress.
Mount only the needed directory into the container, read-only when possible.
Add a --dry-run mode that prints what it would deny/modify before you flip it into blocking mode.
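
The dry-run switch in the last item can be as small as one flag and one function. This sketch uses `argparse`. The name `finalize` and the message format are illustrative, and exit code 2 is again the assumed blocking code.

```python
import argparse
import sys
from typing import Optional


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="hook guard (sketch)")
    parser.add_argument("--dry-run", action="store_true",
                        help="report what would be denied, but always allow")
    return parser.parse_args(argv)


def finalize(deny_reason: Optional[str], dry_run: bool, log=sys.stderr) -> int:
    """Turn a policy result into an exit code, honoring dry-run mode."""
    if deny_reason is None:
        return 0
    if dry_run:
        print(f"[dry-run] would deny: {deny_reason}", file=log)
        return 0  # observe only: never block while dry-run is on
    print(f"deny: {deny_reason}", file=log)
    return 2  # assumed blocking exit code
```

Running the hook with `--dry-run` for a while and reviewing the `would deny` lines gives you a false-positive rate before any tool call is actually blocked.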

This is also where a solo builder can learn quickly: because hooks fire on every tool call, your observability loop (timestamps, payload capture, exit code tracking) matters as much as the policy logic.

If you’re building toward multi-agent patterns, the orchestration angle in How Claude Code’s Dynamic Workflows Orchestrate Hundreds of Subagents — and How to Prototype a Safe Local Equivalent is a useful complement: it frames where enforcement points belong when the number of tool invocations explodes.

Concrete examples to try (with safe defaults)

Secret-scan as PreToolUse (guardrail)

Trigger: PreToolUse on write/edit actions or before shell commands that could publish artifacts.
Action: scan targeted files (or staged diffs) for patterns you consider secrets.
Safe default: run in a container with read-only mounts; deny only on high-confidence matches; log the finding and the file path.
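
A minimal sketch of the scanning step, using two well-known high-confidence patterns (AWS access key IDs and PEM private-key headers). It assumes a Write-style payload with `tool_input.content` and `tool_input.file_path`. The rule set is deliberately tiny, and findings record the rule and location, not the matched secret.

```python
import re
from typing import List, Tuple

RULES = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, rule_name) for each high-confidence match."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                findings.append((line_no, rule))
    return findings


def scan_write_event(event: dict) -> List[dict]:
    """Scan the content a write is about to produce (assumed field names)."""
    tool_input = event.get("tool_input") or {}
    path = tool_input.get("file_path")
    return [
        {"file_path": path, "line": line_no, "rule": rule}
        for line_no, rule in scan_text(tool_input.get("content", ""))
    ]
```

A PreToolUse handler would deny when `scan_write_event(event)` is non-empty and log the returned entries. A broader rule set, or a dedicated scanner, belongs behind the same function boundary.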

Auto-format as PostToolUse

Trigger: PostToolUse after edits/writes to code files.
Action: run a formatter (for example, Prettier) deterministically.
Safe default: start in dry-run; write formatted output to a temporary location or a separate branch for review, because PostToolUse can create churn if it rewrites broadly.
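
The "temporary location" default might look like the sketch below: copy the edited file, format the copy, and report whether the formatter would change anything. `format_to_temp` is a hypothetical helper and `formatter_argv` is whatever command you use, for example `["npx", "prettier", "--write"]`.

```python
import filecmp
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Sequence, Tuple


def format_to_temp(src, formatter_argv: Sequence[str],
                   timeout: float = 60.0) -> Tuple[bool, Path]:
    """Format a copy of src; return (would_change, path_to_formatted_copy)."""
    src = Path(src)
    tmp_dir = Path(tempfile.mkdtemp(prefix="hook-format-"))
    candidate = tmp_dir / src.name
    shutil.copy2(src, candidate)
    # The formatter rewrites the copy in place; the original is untouched.
    subprocess.run([*formatter_argv, str(candidate)],
                   check=True, capture_output=True, timeout=timeout)
    changed = not filecmp.cmp(src, candidate, shallow=False)
    return changed, candidate
```

One caveat with this design: a formatter that discovers its configuration from the file's location may not find the repo's config for a copy in a temp directory, so compare results against an in-repo run before trusting the diff.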

Telemetry-and-policy via HTTP hook

Trigger: PostToolUse (safer) or PreToolUse (enforcement).
Action: POST the event JSON to a localhost verifier that logs decisions and optionally returns allow/deny.
Safe default: keep the endpoint bound to localhost; store logs locally; explicitly exclude prompt text or tool output if you don’t need them.
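
On the sending side, the same defaults can be enforced in code: forward only an allowlist of fields and refuse any endpoint that is not loopback. The field names in `ALLOWED_KEYS` are assumptions about the payload, and `post_event` is a hypothetical helper built on the standard library.

```python
import ipaddress
import json
import urllib.request
from urllib.parse import urlsplit

ALLOWED_KEYS = ("hook_event_name", "tool_name", "session_id")  # assumed fields


def minimal_payload(event: dict) -> dict:
    """Keep only allowlisted keys, dropping prompt text and tool output."""
    return {key: event[key] for key in ALLOWED_KEYS if key in event}


def is_loopback(url: str) -> bool:
    """True for localhost or a loopback IP literal such as 127.0.0.1 or ::1."""
    host = urlsplit(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


def post_event(url: str, event: dict, timeout: float = 2.0) -> int:
    """POST the reduced payload to a loopback endpoint; return the HTTP status."""
    if not is_loopback(url):
        raise ValueError(f"refusing non-loopback endpoint: {url}")
    data = json.dumps(minimal_payload(event)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, method="POST",
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

An allowlist is the safer default here because any field added to the payload later stays excluded until you opt in.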

Why It Matters Now

What’s changing isn’t that hooks exist—it’s how often builders are running agents in places where mistakes are expensive: local dev machines, CI-style initialization flows (Setup), and repetitive tool-call loops where PreToolUse/PostToolUse can run dozens or hundreds of times in a session. The 2026 wave of practical guides and example repos around Claude Code hooks reflects a broader trend: builders want deterministic enforcement (formatters, secret scans, command guards, telemetry) embedded directly into agent execution, not bolted on afterward.

That trend makes sandboxing a primary design problem. The more you rely on hooks to make agents “safe,” the more your real safety boundary becomes “how constrained are the hook handlers,” not “how well behaved is the model.”

What to Watch

Whether Claude Code evolves a clearer permission model around hook types (command vs HTTP vs verifier) and scoping/matchers, because that’s where most real-world risk concentrates.
More community-standard templates for safe hook development: containerized runners, redaction-by-default logging, and replay tooling for hook events.
Patterns that shift enforcement earlier (PreToolUse) without killing developer flow—especially approaches that start in observe-only mode and graduate to blocking once false positives are understood.

Sources: code.claude.com, paul-schick.com, claudebuddy.art, deepwiki.com, eesel.ai, penligent.ai
