LLM inspection for suspicious Write/Edit content #25

## Problem

Write/Edit content inspection currently relies on deterministic regex patterns only. This layer is fast and catches the obvious cases (common destructive patterns, secrets, and known obfuscation), but creative or domain-specific payloads can slip through. It has blind spots for:

- Obfuscated code that doesn't match known patterns
- Context-dependent payloads (e.g., modifying package.json scripts to run something malicious)
- Novel exfiltration techniques that avoid pattern signatures
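
To make the blind spots concrete, here is a minimal sketch. The patterns and the `regex_flags` helper are hypothetical stand-ins, not the project's actual rules. It shows a plain destructive command being caught while a base64-wrapped copy of the same command and an innocuous-looking `package.json` script change both pass:

```python
import base64
import re

# Hypothetical deterministic rules; the project's real patterns are not shown in this issue.
DESTRUCTIVE_PATTERNS = [
    re.compile(r"rm\s+-rf\s+/"),
    re.compile(r"curl\s+[^|]+\|\s*(ba)?sh"),
]


def regex_flags(content: str) -> bool:
    """Return True if any deterministic pattern matches the content."""
    return any(p.search(content) for p in DESTRUCTIVE_PATTERNS)


# A plain destructive command: the pattern layer sees it directly.
plain = "rm -rf / --no-preserve-root"

# The same command, base64-encoded and decoded at runtime: no pattern matches.
encoded = base64.b64encode(plain.encode()).decode()
wrapped = f"import base64, os\nos.system(base64.b64decode('{encoded}').decode())"

# A manifest edit that only points `npm test` at another file: nothing to match on.
manifest = '{"scripts": {"test": "node ./tools/run.js"}}'

for label, content in [("plain", plain), ("wrapped", wrapped), ("manifest", manifest)]:
    print(label, regex_flags(content))
```

Whether the manifest edit is harmful depends on what `./tools/run.js` does and why the agent changed it, which is exactly the context a pattern match cannot see.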

## Proposed solution

Add an optional LLM inspection layer for Write/Edit content:

1. **When content inspection flags something as suspicious** (partial match, heuristic trigger) but not definitively malicious, route it to the LLM along with the content, the file path, and recent conversation context
2. **For high-risk file types** (shell scripts, CI configs, package manifests, credential files), optionally route every write through the LLM, regardless of deterministic results
3. **LLM prompt context**: "This content is about to be written to [path]. Given the recent conversation context, should this write be allowed?"
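
As a rough sketch of item 3, the prompt could be assembled as below. `WriteRequest`, `build_inspection_prompt`, and the `max_context_turns` cap are hypothetical names introduced for illustration. Only the opening question comes from this proposal.

```python
from dataclasses import dataclass, field


@dataclass
class WriteRequest:
    """Hypothetical container for what the inspector sees about a pending write."""
    path: str
    content: str
    recent_context: list[str] = field(default_factory=list)


def build_inspection_prompt(req: WriteRequest, max_context_turns: int = 5) -> str:
    """Assemble the LLM prompt from the path, recent context, and content."""
    # Keep only the most recent turns so the prompt stays small (the cap is an assumption).
    context = "\n".join(req.recent_context[-max_context_turns:]) or "(none)"
    return (
        f"This content is about to be written to {req.path}. "
        "Given the recent conversation context, should this write be allowed?\n\n"
        f"Recent conversation context:\n{context}\n\n"
        f"Content:\n{req.content}\n"
    )


req = WriteRequest(
    path="package.json",
    content='{"scripts": {"test": "node ./tools/run.js"}}',
    recent_context=["user: please fix the failing unit test"],
)
print(build_inspection_prompt(req))
```

How the response is parsed (allow/deny, with or without a reason) is left open here, since the issue does not specify it.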

This complements the deterministic layer: the fast path handles the bulk of cases (roughly 95%), and the LLM handles the ambiguous remainder.
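
The routing described above might look something like the following when the optional layer is enabled. This is a sketch under assumptions: the `Verdict` and `Route` enums, the high-risk file lists, and the `always_inspect_high_risk` flag are all hypothetical and would need to match whatever the deterministic layer actually returns.

```python
from enum import Enum
from pathlib import PurePosixPath


class Verdict(Enum):
    """Hypothetical result of the deterministic layer."""
    CLEAN = "clean"
    SUSPICIOUS = "suspicious"  # partial match or heuristic trigger
    MALICIOUS = "malicious"    # definitive pattern match


class Route(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    LLM = "llm"


# Illustrative lists only; the real set of high-risk files is a policy decision.
HIGH_RISK_NAMES = {"package.json", ".env", "Makefile"}
HIGH_RISK_SUFFIXES = {".sh", ".bash"}
HIGH_RISK_DIRS = {".github"}


def is_high_risk(path: str) -> bool:
    """Return True for shell scripts, CI configs, manifests, and credential files."""
    p = PurePosixPath(path)
    return (
        p.name in HIGH_RISK_NAMES
        or p.suffix in HIGH_RISK_SUFFIXES
        or any(part in HIGH_RISK_DIRS for part in p.parts)
    )


def route_write(path: str, verdict: Verdict, always_inspect_high_risk: bool = False) -> Route:
    """Decide whether a write is allowed, blocked, or sent to the LLM."""
    if verdict is Verdict.MALICIOUS:
        return Route.BLOCK  # definitive hits never need a second opinion
    if verdict is Verdict.SUSPICIOUS:
        return Route.LLM    # ambiguous remainder goes to the LLM
    if always_inspect_high_risk and is_high_risk(path):
        return Route.LLM    # opt-in: inspect high-risk files even when clean
    return Route.ALLOW      # fast path


print(route_write("src/app.py", Verdict.CLEAN))
print(route_write("package.json", Verdict.CLEAN, always_inspect_high_risk=True))
```

Keeping `always_inspect_high_risk` off by default preserves the fast path for users who have not opted in to the extra latency and cost.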

## Context

Raised in the [Show HN discussion](https://news.ycombinator.com/item?id=43364506) by several commenters:

- **gruez**: "given that you allow npm test, it's not too hard to bypass protections by first modifying package.json so npm test runs an evil command"
- **injidup**: "what about simply a base64 encoded string of text dropped into the code designed to be unpacked and evaluated later... Will any of these fast scanning heuristics work against such attacks?"
- **ibrahim_h**: "the scariest exfiltration pattern isn't a single bad command, it's a chain of totally normal ones. Agent reads .env, writes a script that includes those values, then runs it. Every step looks fine individually."

Committed to on HN: "LLM inspection for Write/Edit: for content that's suspicious but doesn't match any deterministic pattern, route it to the LLM for a second opinion"

