feat: L1+L2 verification discipline — path tracing, invariant statements, test coverage warnings (#23)

aviadr1 · claude · web-flow · commit 3dc5044e2362 · 2026-03-16T18:21:47.000+02:00
Add mandatory verification policies (kaizen #11, #15, #17): - CLAUDE.md: path-tracing checklist, invariant statement requirement, runtime artifact verification - Pre-commit hook: advisory warning when source files are staged without corresponding test changes - Global CLAUDE.md: path tracing and test invariant requirements for container agents Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/.husky/pre-commit b/.husky/pre-commit
@@ -9,3 +9,20 @@ npm run format:fix
 if [ -n "$STAGED_TS" ]; then
   echo "$STAGED_TS" | xargs git add
 fi
+
+# Advisory: warn if staged source files have no corresponding test changes (kaizen #15)
+STAGED_SRC=$(git diff --cached --name-only --diff-filter=d | \
+  grep -E '\.(ts|js|tsx|jsx)$' | \
+  grep -vE '(\.test\.|\.spec\.|__tests__|\.config\.|vitest\.|CLAUDE\.md|\.claude/|\.husky/)' || true)
+
+STAGED_TESTS=$(git diff --cached --name-only --diff-filter=d | \
+  grep -E '\.(test|spec)\.(ts|js|tsx|jsx)$' || true)
+
+if [ -n "$STAGED_SRC" ] && [ -z "$STAGED_TESTS" ]; then
+  SRC_COUNT=$(echo "$STAGED_SRC" | wc -l | tr -d ' ')
+  echo "" >&2
+  echo "⚠️  $SRC_COUNT source file(s) staged with no test changes:" >&2
+  echo "$STAGED_SRC" | sed 's/^/  - /' >&2
+  echo "  Consider adding tests before this reaches PR review." >&2
+  echo "" >&2
+fi
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -105,6 +105,45 @@ These policies were learned from past mistakes. Follow them strictly.
     - **Affects humans directly?** → Must be Level 3 (humans should never wait on agent mistakes)
     - CLAUDE.md instructions are Level 1 — necessary but not sufficient. When they fail, escalate to hooks (Level 2) or architectural enforcement (Level 3).
 
+## Verification Discipline (Kaizen #11, #15, #17)
+
+### Path Tracing — MANDATORY before any fix
+
+Before writing ANY fix, map the full execution path from trigger to user-visible outcome:
+
+```
+1. MAP the chain: input → layer 1 → layer 2 → ... → user-visible outcome
+2. For each link: how to verify it works, what artifact/log/query proves it
+3. After the fix: verify EVERY link, not just the one you changed
+4. Self-review must trace the path — "I changed layer N, what happens at N+1...?"
+```
+
+**Never fix a single layer and declare done.** The fix isn't complete until the final outcome is verified end-to-end.
+
+### Invariant Statement — MANDATORY before writing tests
+
+Before writing ANY test, state explicitly:
+
+```
+INVARIANT: [what must be true]
+SUT: [exact system/function/artifact under test]
+VERIFICATION: [how the test proves the invariant holds]
+```
+
+**Anti-patterns to avoid:**
+- Testing mocks instead of real code (you're proving your mocks work, not your code)
+- Testing the wrong artifact (e.g., `/app/dist/` when runtime uses `/tmp/dist/`)
+- "All 275 tests pass" when none cover the actual change
+- Verifying implementation details (`cpSync was called`) instead of outcomes (`agent has the tool`)
+
+### Runtime Artifact Verification
+
+Always test the **actual deployed artifact**, not just source presence:
+- If code is compiled, test the compiled output
+- If code runs in a container, verify inside the container
+- If a mount provides a file, verify the mount exists AND the consumer reads it
+- "The file exists in the repo" is not verification — "the agent receives it at runtime" is
+
 ## Kaizen Backlog
 
 Future work, process improvements, and cross-repo engineering proposals are tracked as GitHub Issues in [`Garsson-io/kaizen`](https://github.com/Garsson-io/kaizen). When a dev agent identifies an improvement that's out of scope for the current PR, file it there with the `kaizen` label. Include: what, why, when, how, reproduction steps, and verification criteria.
diff --git a/groups/global/CLAUDE.md b/groups/global/CLAUDE.md
@@ -111,6 +111,24 @@ When marking a case done, reflect on:
 - What would make this type of work faster next time?
 - Suggest dev cases for any improvements needed
 
+## Verification Discipline
+
+### Path Tracing (dev cases)
+
+Before fixing anything, map the full execution path from trigger to outcome:
+1. Write down every link in the chain (e.g., policy file → mount → agent-runner reads it → SDK receives it → agent acts)
+2. After the fix, verify EVERY link — not just the one you changed
+3. The fix isn't done until the final user-visible outcome works
+
+### Test Invariants (dev cases)
+
+Before writing ANY test, state:
+- *INVARIANT:* what must be true
+- *SUT:* the exact system/function/artifact under test
+- *VERIFICATION:* how the test proves the invariant
+
+Never test mocks instead of real code. Never test the wrong artifact. Never declare "tests pass" when none cover your change.
+
 ## Your Workspace
 
 Files you create are saved in `/workspace/group/`. Use this for notes, research, or anything that should persist across cases.