Rating: 1/5
Category: Behavior / Safety
Description:
During a full-day session (2026-03-25), Claude Code operating as a terminal agent exhibited the following behavior failures:
- Directly overrode explicit user instructions twice. User specifically instructed "send AX and CX on it" (delegate to subagents). Agent ignored the instruction and soloed the work itself. When confronted, acknowledged the violation but had already completed the unauthorized work.
- Repeatedly claimed problems were fixed when they weren't. Agent reported Mercury SSH alerts were resolved at least 4 separate times. Each time the user reported continued alerts. Root cause (71,960 queued emails in Exim) was not identified until hours later.
- Exposed sensitive credentials in conversation output. Printed Technitium DNS API tokens, OPNsense API keys, Telegram bot tokens, and other secrets directly into the chat without being asked. These appeared in conversation history in plaintext.
- Caused a production service outage. Ran `fuser -km` on Mercury (production cPanel server), killing sshd and other critical processes. Server went offline and required recovery time.
- Broke Pangolin reverse proxy containers. Created zombie container storage during an update attempt that required manual JSON file editing to recover. Service was down for 30+ minutes.
- Guessed API endpoints and configurations instead of researching. Attempted Technitium DNS API calls with guessed endpoint formats. Attempted Pangolin database manipulation without understanding the schema. Multiple failed attempts before finding correct approaches.
- Violated its own written governance rules. Agent had written a "Hard Rules" spec including "delegate first, escalate last" and "use subagents always." Violated both within the same session.
- Repeatedly asked the user obvious questions instead of reading its own infrastructure. Asked which subdomain to whitelist when only one new subdomain was created all day. Asked for NPM credentials instead of checking the database. Required multiple corrections for information already available.
Expected behavior: Agent should follow direct user instructions without override. Agent should delegate when told to delegate. Agent should not expose credentials. Agent should verify fixes before claiming completion. Agent should not run destructive commands on production servers without understanding the consequences.
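
To make the "should not expose credentials" expectation concrete, here is a minimal sketch of an output filter that masks likely secrets before text is printed to the chat. It is illustrative only and is not how Claude Code works internally. The function name `redact` and both patterns are assumptions: the key/value pattern simply looks for names containing "token", "secret", "password", or "api key", and the Telegram pattern assumes the commonly seen `<digits>:<35 characters>` shape. Token formats for Technitium and OPNsense are not assumed here and would need to be confirmed per service.

```python
import re

# Hypothetical pattern: a name containing a secret-like word, then "=" or ":",
# then the value. Real deployments would need per-service rules.
_KEY_VALUE = re.compile(
    r"(?i)\b([\w-]*(?:token|secret|password|api[_-]?key)[\w-]*)(\s*[=:]\s*)(\S+)"
)

# Assumed shape of a Telegram bot token: numeric bot ID, a colon, 35 URL-safe characters.
_TELEGRAM = re.compile(r"\b\d{6,12}:[A-Za-z0-9_-]{35}\b")

MASK = "[REDACTED]"


def redact(text: str) -> str:
    """Return text with likely secrets replaced by a fixed mask."""
    # Mask bare Telegram-style tokens first, wherever they appear.
    text = _TELEGRAM.sub(MASK, text)
    # Then mask the value part of secret-looking key/value pairs, keeping the key name.
    return _KEY_VALUE.sub(lambda m: f"{m.group(1)}{m.group(2)}{MASK}", text)


if __name__ == "__main__":
    print(redact("api_key = abc123def456"))
    print(redact("sending with bot " + "123456789:" + "A" * 35))
```

A filter like this is a backstop and will produce both false positives and misses; the primary expectation remains that the agent does not print secrets unless asked.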