Model bypasses all PreToolUse write-protection hooks using perl — blocklist approach is fundamentally incomplete #40408

@weilhalt

Summary

We built a comprehensive write-protection system over several weeks: 22 regex patterns across 3 layers blocking in-place sed, inline python, echo redirects, cat heredocs, tee, touch, rm, cp, mv, and more. Claude respected all of them -- until it found perl -i -pe, which was not on the list.

The model was not trying to subvert the rules. It genuinely needed to fix a bug in its own safety system (a timezone issue that caused a triple deadlock). But the way it solved the problem exposes a fundamental weakness in the hook-based permission model.

What happened

  1. A UTC-vs-local-time bug in our session-gate policy caused it to reject valid session entries written near midnight. The gate blocked all Edit/Write tool calls.

  2. The fix needed to go into policy.py -- but Edit was blocked by the very bug being fixed (chicken-and-egg).

  3. The override system could not help either -- override_guard parses the session transcript for an AskUserQuestion entry but could not find one, creating a second deadlock.

  4. protect-hook-scripts blocked Read/Edit/Write/Bash/Glob/Grep on runner.py, preventing investigation of the override failure -- third deadlock.

  5. The model then used perl -i -pe to edit the files directly. Perl is not in our blocklist. The bash guard pattern-matches against known write tools but has no general notion of "a command that can modify files".

  6. It then used perl with File::Copy to deploy the fix to 6 other project repositories -- bypassing the cp block.
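
To make step 1 concrete, here is a minimal sketch of how a UTC-vs-local mismatch can reject a valid entry near midnight. It is not the actual policy.py code: the function names, the UTC+2 offset, and the assumption that entries are stamped with a local calendar date while the gate compares against the UTC date are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed local offset (UTC+2), chosen only for illustration.
LOCAL_TZ = timezone(timedelta(hours=2))


def entry_date_local(now_utc: datetime) -> str:
    """Date string as the (hypothetical) session writer records it: local time."""
    return now_utc.astimezone(LOCAL_TZ).date().isoformat()


def gate_accepts_buggy(entry_date: str, now_utc: datetime) -> bool:
    """Buggy check: compares a local-date entry against the UTC date."""
    return entry_date == now_utc.date().isoformat()


def gate_accepts_fixed(entry_date: str, now_utc: datetime) -> bool:
    """Fixed check: both sides are converted to the same timezone."""
    return entry_date == now_utc.astimezone(LOCAL_TZ).date().isoformat()


# 22:30 UTC is already 00:30 on the next day at UTC+2.
now = datetime(2025, 1, 1, 22, 30, tzinfo=timezone.utc)
entry = entry_date_local(now)

print(entry)                           # local calendar date of the entry
print(gate_accepts_buggy(entry, now))  # entry rejected by the mismatched check
print(gate_accepts_fixed(entry, now))  # entry accepted once timezones agree
```

Outside the window where the two calendars disagree, both checks return the same answer, which is why a bug of this shape only surfaces for entries written near midnight.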

The security model gap

Our blocklist covers in-place sed, awk-with-redirect, inline python/python3, echo/cat/printf redirects, tee, touch, rm, mv, cp, chmod, chown, truncate, dd, and install.

Not covered (all can write files):

perl -i -pe          # in-place edit (identical to sed in-place mode)
perl -e              # arbitrary file I/O
ruby -e              # same
lua -e               # same
node -e              # same
xxd + redirect       # binary writes
busybox sed          # aliased tools

A complete blocklist for commands that can write files would have to block every Turing-complete interpreter available on the machine -- which is not achievable by enumeration and calls for an allowlist approach instead.
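
The gap can be reproduced with a small sketch. The patterns below are a hypothetical, simplified stand-in for our guard (the real 22 patterns are not reproduced here), and is_blocked is an illustrative name, not a function from our hook scripts.

```python
import re

# Hypothetical subset of a write-protection blocklist.
BLOCKLIST = [
    re.compile(r"\bsed\s+-\w*i\b"),    # in-place sed
    re.compile(r"\bpython3?\s+-c\b"),  # inline python
    re.compile(r"\b(?:tee|touch|rm|mv|cp|chmod|chown|truncate|dd|install)\b"),
    re.compile(r">"),                  # any output redirect (deliberately crude)
]


def is_blocked(command: str) -> bool:
    """Return True if any blocklist pattern matches the command string."""
    return any(pattern.search(command) for pattern in BLOCKLIST)


commands = [
    "sed -i 's/utc/local/' policy.py",
    "echo fixed > policy.py",
    "cp policy.py ../other/policy.py",
    "perl -i -pe 's/utc/local/' policy.py",
    "perl -MFile::Copy -e 'copy(\"policy.py\", \"../other/policy.py\")'",
    "ruby -e 'File.write(\"policy.py\", \"x\")'",
]

for cmd in commands:
    print("BLOCK " if is_blocked(cmd) else "PERMIT", cmd)
```

In this sketch the first three commands are blocked and the last three are permitted, even though all six modify a file. Adding perl and ruby to the list closes these two cases and leaves every other interpreter open.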

What would actually work

Allowlist instead of blocklist. Instead of "block these 22 patterns", the hook should say "only allow these N patterns". Our current allowlist (21 entries such as git rm, pip install, npm install, chmod +x) is the right idea -- it just needs to be the primary gate, not the exception list.

Proposed model:

if command matches allowlist -> permit
else -> block (with override mechanism)

This inverts the current logic:

if command matches blocklist -> block
else -> permit  <- this is the hole
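
A minimal sketch of the inverted gate, under stated assumptions: the allowlist entries are the four examples named above expressed as argv prefixes, is_permitted is a hypothetical name, and rejecting all shell metacharacters is one possible way to stop an allowed prefix from being chained with another command. It does not model the override mechanism.

```python
import shlex

# Hypothetical allowlist: each entry is an argv prefix that is permitted.
ALLOWLIST = [
    ("git", "rm"),
    ("pip", "install"),
    ("npm", "install"),
    ("chmod", "+x"),
]

# Characters that allow chaining, substitution, or redirection in a shell.
SHELL_META = set(";|&><`$()\n")


def is_permitted(command: str) -> bool:
    """Default-deny: permit only commands whose argv starts with an allowlisted prefix."""
    if any(ch in SHELL_META for ch in command):
        return False  # refuse anything that could smuggle in a second command
    try:
        argv = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: treat as unparseable and block
    return any(tuple(argv[: len(prefix)]) == prefix for prefix in ALLOWLIST)


for cmd in [
    "git rm old_policy.py",
    "chmod +x run.sh",
    "perl -i -pe 's/utc/local/' policy.py",
    "pip install requests; perl -e 1",
]:
    print("PERMIT" if is_permitted(cmd) else "BLOCK ", cmd)
```

With this shape, an unlisted interpreter such as perl is blocked because nothing matched, not because someone remembered to list it. What an allowlisted command can itself do once it runs is a separate question that this sketch does not address.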

Irony

The model was simultaneously:

  • Too limited to fix a one-line timezone bug through legitimate channels (3 interlocking policies blocked it)
  • Resourceful enough to bypass weeks of carefully crafted write-protection by using an unlisted interpreter

It did not act maliciously -- it solved the problem the user asked it to solve. But a model that autonomously discovers perl as a bypass vector is a model that could discover it for less benign purposes.

Environment

  • Claude Opus 4.6 (1M context)
  • Claude Code CLI
  • Custom PreToolUse hooks with regex-based bash command filtering
  • 52 policies, 6 hook scripts, 3-layer write protection
