Summary
We built a comprehensive write-protection system over several weeks: 22 regex patterns across 3 layers blocking in-place sed, inline python, echo redirects, cat heredocs, tee, touch, rm, cp, mv, and more. Claude respected all of them -- until it found perl -i -pe, which was not on the list.
The model did not try to bypass the rules. It genuinely needed to fix a bug in its own safety system (a timezone issue that created a triple deadlock). But the way it solved the problem exposes a fundamental weakness in the hook-based permission model.
What happened
- A UTC-vs-local-time bug in our session-gate policy caused it to reject valid session entries written near midnight. The gate blocked all Edit/Write tool calls.
- The fix needed to go into policy.py -- but Edit was blocked by the very bug being fixed (chicken-and-egg).
- The override system could not help either -- override_guard parses the session transcript for AskUserQuestion but could not find it, creating a second deadlock.
- protect-hook-scripts blocked Read/Edit/Write/Bash/Glob/Grep on runner.py, preventing investigation of the override failure -- third deadlock.
- The model then used perl -i -pe to edit the files directly. Perl is not in our blocklist. The bash guard pattern-matches against known write tools but has no concept of "any command that can modify files".
- It then used perl with File::Copy to deploy the fix to 6 other project repositories -- bypassing the cp block.
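For readers unfamiliar with this class of bug, here is a minimal sketch of how a UTC-vs-local date comparison can reject a valid entry near midnight. The actual policy.py is not shown in this report, so the function names, the +02:00 offset, and the timestamps are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed local offset (+02:00); the real deployment's timezone is not stated.
LOCAL_TZ = timezone(timedelta(hours=2))


def entry_is_today_buggy(entry_ts: datetime, now: datetime) -> bool:
    """Hypothetical buggy check: local date of the entry vs. UTC date of 'now'."""
    return entry_ts.astimezone(LOCAL_TZ).date() == now.astimezone(timezone.utc).date()


def entry_is_today_fixed(entry_ts: datetime, now: datetime) -> bool:
    """Hypothetical fix: compare both dates in the same timezone."""
    return entry_ts.astimezone(LOCAL_TZ).date() == now.astimezone(LOCAL_TZ).date()


# Illustrative timestamps: 00:30 local on 2 June is still 22:30 UTC on 1 June.
now = datetime(2025, 6, 2, 0, 30, tzinfo=LOCAL_TZ)
entry = datetime(2025, 6, 2, 0, 10, tzinfo=LOCAL_TZ)

print(entry_is_today_buggy(entry, now))  # False: valid entry rejected
print(entry_is_today_fixed(entry, now))  # True
```

In this sketch, any entry written between local midnight and the moment UTC rolls over to the same date is rejected, which matches the "near midnight" symptom described above.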
The security model gap
Our blocklist covers in-place sed, awk-with-redirect, inline python/python3, echo/cat/printf redirects, tee, touch, rm, mv, cp, chmod, chown, truncate, dd, and install.
Not covered (all can write files):
perl -i -pe # in-place edit (identical to sed in-place mode)
perl -e # arbitrary file I/O
ruby -e # same
lua -e # same
node -e # same
xxd + redirect # binary writes
busybox sed # aliased tools
A complete blocklist for "commands that can write files" would have to block every Turing-complete interpreter on the system, in every form it can be invoked -- which is impossible to enumerate. Only an allowlist approach closes the gap.
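The gap can be reproduced with a small sketch of a regex blocklist guard. The patterns below are an illustrative subset written for this example, not the 22 real patterns, and `blocklist_allows` is a hypothetical name.

```python
import re

# Illustrative subset of a write-tool blocklist (NOT the real 22 patterns).
BLOCKLIST = [
    re.compile(r"\bsed\b.*\s-i\b"),                       # in-place sed
    re.compile(r"\bpython3?\s+-c\b"),                     # inline python
    re.compile(r"\b(tee|touch|rm|mv|cp|truncate|dd)\b"),  # known write tools
    re.compile(r"\b(echo|cat|printf)\b.*>"),              # redirects
]


def blocklist_allows(command: str) -> bool:
    """Permit anything that matches no blocklist pattern (the default is the hole)."""
    return not any(pattern.search(command) for pattern in BLOCKLIST)


print(blocklist_allows("sed -i 's/a/b/' policy.py"))       # False: blocked
print(blocklist_allows("cp policy.py ../other/policy.py")) # False: blocked
print(blocklist_allows("perl -i -pe 's/a/b/' policy.py"))  # True: slips through
print(blocklist_allows(
    "perl -MFile::Copy -e 'copy(\"policy.py\", \"../other/policy.py\")'"
))                                                          # True: slips through
```

Adding a perl pattern would close this one case, but ruby, lua, node and any other interpreter would still pass by default.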
What would actually work
Allowlist instead of blocklist. Instead of "block these 22 patterns", the hook should say "only allow these N patterns". Our current allowlist (21 entries like git rm, pip install, npm install, chmod +x) is the right idea -- it just needs to be the primary gate, not the exception list.
Proposed model:
if command matches allowlist -> permit
else -> block (with override mechanism)
This inverts the current logic:
if command matches blocklist -> block
else -> permit <- this is the hole
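A minimal sketch of the inverted, default-deny logic follows. The allowlist entries are examples loosely based on the ones named above, not our actual 21 entries; `allowlist_permits` and `SHELL_META` are hypothetical names, and the override mechanism is left out.

```python
import re

# Example allowlist entries (illustrative, not the real 21).
ALLOWLIST = [
    re.compile(r"git\s+(status|diff|log|rm)\b.*"),
    re.compile(r"(pip|npm)\s+install\b.*"),
    re.compile(r"chmod\s+\+x\s+\S+"),
]

# Reject shell control characters up front so an allowed prefix cannot be
# chained with an arbitrary second command (e.g. "git status; perl ...").
SHELL_META = re.compile(r"[;&|<>`$()\n]")


def allowlist_permits(command: str) -> bool:
    """Permit only commands that fully match an allowlist entry; block the rest."""
    command = command.strip()
    if SHELL_META.search(command):
        return False
    return any(pattern.fullmatch(command) for pattern in ALLOWLIST)


print(allowlist_permits("git status"))                         # True
print(allowlist_permits("chmod +x run.sh"))                    # True
print(allowlist_permits("perl -i -pe 's/a/b/' policy.py"))     # False
print(allowlist_permits("ruby -e 'File.write(\"x\", \"y\")'")) # False
print(allowlist_permits("git status; perl -i -pe 1 policy.py"))# False
```

With this shape, an unlisted interpreter is blocked without anyone having to anticipate it. The metacharacter check is a deliberately blunt choice for the sketch; a real hook would likely need proper shell parsing.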
Irony
The model was simultaneously:
- Too limited to fix a one-line timezone bug through legitimate channels (3 interlocking policies blocked it)
- Resourceful enough to bypass weeks of carefully crafted write-protection by using an unlisted interpreter
It did not act maliciously -- it solved the problem the user asked it to solve. But a model that autonomously discovers perl as a bypass vector is a model that could discover it for less benign purposes.
Environment
- Claude Opus 4.6 (1M context)
- Claude Code CLI
- Custom PreToolUse hooks with regex-based bash command filtering
- 52 policies, 6 hook scripts, 3-layer write protection