Garsson-io
diff --git a/‎.claude/kaizen/README.md‎
Lines changed: 17 additions & 0 deletions b/‎.claude/kaizen/README.md‎
Lines changed: 17 additions & 0 deletions
diff --git a/‎.claude/kaizen/policies.md‎
Lines changed: 19 additions & 0 deletions b/‎.claude/kaizen/policies.md‎
Lines changed: 19 additions & 0 deletions
diff --git a/‎.claude/kaizen/skills/trim-claude-md/SKILL.md‎
Lines changed: 106 additions & 0 deletions b/‎.claude/kaizen/skills/trim-claude-md/SKILL.md‎
Lines changed: 106 additions & 0 deletions
diff --git a/‎.claude/kaizen/verification.md‎
Lines changed: 63 additions & 0 deletions b/‎.claude/kaizen/verification.md‎
Lines changed: 63 additions & 0 deletions
diff --git a/‎.claude/kaizen/workflow.md‎
Lines changed: 40 additions & 0 deletions b/‎.claude/kaizen/workflow.md‎
Lines changed: 40 additions & 0 deletions
diff --git a/‎.claude/skills/trim-claude-md‎
Lines changed: 1 addition & 0 deletions b/‎.claude/skills/trim-claude-md‎
Lines changed: 1 addition & 0 deletions
@@ -71,6 +71,17 @@ These are registered in `.claude/settings.json` and fire on Claude Code tool-use
 | `test-*.sh` | Per-hook and integration test files (15+) |
 | `test_hooks.py` | Python-based hook tests |
 
+### System Documents (this directory)
+
+| File | Purpose |
+|------|---------|
+| `policies.md` | Kaizen enforcement policies (#11-17) — recursive kaizen, hooks, MCP, security, isolation, testing, language boundaries |
+| `verification.md` | Verification discipline — path tracing, invariant statements, runtime artifact verification, smoke tests |
+| `workflow.md` | Dev work skill chain — trigger→skill routing for `/pick-work` → `/accept-case` → `/implement-spec` → `/kaizen` |
+| `practices.md` | Engineering practices checklist — consulted before shipping (advisory) |
+| `zen.md` | The Zen of Kaizen — philosophical principles |
+| `horizon.md` | Horizon tracking dimensions |
+
 ### Documentation (`docs/`)
 
 | File | Purpose |
@@ -79,6 +90,12 @@ These are registered in `.claude/settings.json` and fire on Claude Code tool-use
 | `hook-portability-matrix.md` | Maps each hook to its best portable alternative |
 | `hook-migration-plan.md` | Phase plan for moving enforcement to strongest layers |
 
+### Skills (`skills/`)
+
+| Skill | Purpose |
+|-------|---------|
+| `trim-claude-md` | Measure CLAUDE.md, identify bloat, extract to kaizen system or reference docs. Symlinked from `.claude/skills/` for discoverability. |
+
 ### Integration Points (outside this directory)
 
 These participate in kAIzen but live where their tools require:
 
@@ -0,0 +1,19 @@
+# Kaizen Enforcement Policies
+
+These policies govern the kaizen enforcement system specifically. General dev policies live in CLAUDE.md. These rules were learned from past kaizen incidents — follow them strictly.
+
+1. **Recursive kaizen on every fix-PR.** See `.claude/skills/kaizen/SKILL.md` for the full framework. After every fix, assess:
+    - **What level is this fix?** Level 1 (instructions) → Level 2 (hooks/checks) → Level 3 (mechanistic code)
+    - **Has this type of failure happened before?** If yes, the previous level wasn't enough — escalate.
+    - **Affects humans directly?** → Must be Level 3 (humans should never wait on agent mistakes)
+    - CLAUDE.md instructions are Level 1 — necessary but not sufficient. When they fail, escalate to hooks (Level 2) or architectural enforcement (Level 3).
+2. **Hooks are the foundation of our kaizen infrastructure.** The `.claude/kaizen/hooks/` directory contains Level 2 enforcement — automated checks that catch mistakes before they reach humans. See `.claude/kaizen/README.md` for the full kAIzen Agent Control Flow system documentation. When a hook blocks you:
+    - **Do NOT override it blindly.** The hook exists because a past mistake proved instructions alone weren't enough.
+    - **If it's a false positive**, fix the hook. Improve its matching logic, add exclusions with rationale, and add a test case that covers the false-positive scenario. This is recursive kaizen — making the enforcement smarter, not weaker.
+    - **If it's a true positive**, fix the underlying issue. The hook is doing its job.
+    - **Always add a test** for any hook change in `.claude/kaizen/hooks/tests/`. Hooks without tests are Level 1 pretending to be Level 2.
+3. **MCP tools are Level 3 enforcement points, not passthroughs.** When an agent behavior problem surfaces through an MCP tool, the fix belongs in the tool's logic — validation, auto-detection, or rejection. Don't default to updating the tool's description text (Level 1) when the kaizen rules demand Level 3. The MCP boundary is where agent intent meets system action; that's where policy enforcement belongs. Level 1 description improvements are defense-in-depth on top of Level 3, not a substitute.
+4. **Authoritative security files: do NOT duplicate, do NOT bypass.** Files with `security`, `auth`, or `allowlist` in their name (`case-auth.ts`, `mount-security.ts`, `sender-allowlist.ts`) are the single source of truth for their policy domain. All authorization decisions in that domain MUST go through the authoritative file. Never inline ad-hoc authorization checks elsewhere — call the gate function instead. Changes to these files require careful review and tests.
+5. **Hooks MUST be worktree-isolated.** A hook running in worktree A must NEVER read, modify, or block based on state from worktree B. This is a hard safety invariant — violations cause cross-worktree contamination where one agent's work hijacks another agent's session. All state file iteration MUST go through `lib/state-utils.sh` (`is_state_for_current_worktree`, `list_state_files_for_current_worktree`). Never iterate `/tmp/.pr-review-state/` directly. State files without a BRANCH field are treated as unattributable and skipped.
+6. **Co-commit source and test changes.** Every source file change must have a corresponding test file change in the same PR. Test utilities use the `.test-util.ts` extension (excluded from coverage checks). If a source change genuinely doesn't need tests (e.g., trivial constant change, already covered by existing tests), declare it in the PR body using the `test-exceptions` fenced block — this is public and auditable.
+7. **Hook language boundaries: L1-L2 bash, L3-L4 TypeScript.** Simple guards and pattern matching stay in bash. Scripts that need arithmetic, data transformation, error recovery, or their own test assertions belong in TypeScript. The signal: if you're hand-rolling `try/catch` or `expect()` in bash, move to TypeScript. See [`docs/hook-language-boundaries.md`](../../docs/hook-language-boundaries.md) for the full decision framework and migration plan.
@@ -0,0 +1,106 @@
+# Trim CLAUDE.md
+
+Measure CLAUDE.md size, identify bloat, and extract verbose sections into reference docs or kaizen system files — leaving compact pointers behind.
+
+CLAUDE.md is loaded into every conversation. Every line costs context. This skill keeps it lean.
+
+## When to use
+
+- CLAUDE.md exceeds ~250 lines
+- A section has grown verbose during development (common after adding policies or procedures)
+- After adding a new feature that introduced a large CLAUDE.md section
+- Periodic maintenance (every ~10 PRs)
+
+## The Process
+
+### Step 1: Measure
+
+```bash
+# Total size
+wc -l CLAUDE.md
+
+# Per-section breakdown
+awk '/^## /{if(section) printf "%3d lines: %s\n", NR-start, section; section=$0; start=NR} END{printf "%3d lines: %s\n", NR-start+1, section}' CLAUDE.md
+```
+
+**Target:** CLAUDE.md under 250 lines. If it's under 200, probably fine — don't trim for the sake of trimming.
+
+### Step 2: Classify each section
+
+For each section over ~15 lines, ask:
+
+| Question | If yes → |
+|----------|----------|
+| Is this kaizen enforcement? (hooks, verification, workflow, policies learned from incidents) | `.claude/kaizen/{name}.md` |
+| Is this a reference lookup? (architecture, procedures, recipes) | `docs/{name}.md` |
+| Is this an interactive workflow with decision points? | `.claude/skills/{name}/SKILL.md` |
+| Is this routing data? (trigger phrases → skills, key files, quick context) | **Keep in CLAUDE.md** |
+| Is this a short rule or policy? (< 3 lines per item) | **Keep in CLAUDE.md** |
+
+### Step 3: Decide what stays
+
+**Always keep in CLAUDE.md** (agents need this in every conversation):
+- Quick Context — project orientation
+- Cases overview — core concept
+- Key Files — navigation
+- Skill trigger mappings — routing data (which phrases invoke which skills)
+- Short policies — rules that fit in 1-2 lines each
+- Database — query recipes
+- Development — build/run commands
+- Git Remotes — tiny, always needed
+
+**Extract from CLAUDE.md** (agents only need when doing specific tasks):
+- Detailed procedures (merging, deploying, IPC messaging)
+- Verbose policies with sub-bullets and examples
+- Architecture diagrams and layer tables
+- Verification checklists
+- Philosophical content (zen aphorisms)
+
+### Step 4: Extract
+
+For each section being extracted:
+
+1. **Create the target file** with the full content, properly titled
+2. **Replace in CLAUDE.md** with a 2-3 line pointer:
+   ```markdown
+   ## Section Name
+
+   **Read [`path/to/file.md`](path/to/file.md)** when [trigger condition].
+
+   Key points: [1-2 most important rules that agents should always remember].
+   ```
+3. **Verify** the pointer's path is correct and the file exists
+
+### Step 5: Verify
+
+```bash
+# Check final size
+wc -l CLAUDE.md
+
+# Check all internal links resolve
+grep -oE '\[.*?\]\((.*?)\)' CLAUDE.md | grep -oE '\(.*?\)' | tr -d '()' | while read link; do
+  [ -f "$link" ] || echo "BROKEN: $link"
+done
+```
+
+## Classification guide: kaizen vs not
+
+**Belongs in `.claude/kaizen/`** if the content:
+- Was learned from kaizen incidents (policies #11-17, verification discipline)
+- Enforces the kaizen workflow (skill chain, reflection triggers)
+- Would make sense if kaizen were extracted as a standalone system
+- Is referenced by kaizen hooks or skills
+
+**Does NOT belong in `.claude/kaizen/`** if the content:
+- Is general engineering practice (TDD, dependency management)
+- Is project infrastructure (merging, deploying, database)
+- Is domain-specific (vertical architecture, IPC messaging)
+
+When in doubt: if removing kaizen from the project would make this content irrelevant, it belongs in kaizen. If it would still be useful, it belongs elsewhere.
+
+## Anti-patterns
+
+- **Extracting too aggressively.** Short sections (< 10 lines) don't need extraction — the pointer overhead isn't worth it.
+- **Losing routing data.** Skill trigger phrases MUST stay in CLAUDE.md. If agents can't see them, they won't invoke skills correctly.
+- **Creating skills for passive content.** Reference docs are not skills. Only create a skill if the content is an interactive workflow with decision points.
+- **Putting general dev practices in kaizen.** "Declare all dependencies" is not kaizen-specific. Keep it in CLAUDE.md.
@@ -0,0 +1,63 @@
+# Verification Discipline
+
+Learned from kaizen #11, #15, #17. These are mandatory practices for all dev work.
+
+## Path Tracing — MANDATORY before any fix
+
+Before writing ANY fix, map the full execution path from trigger to user-visible outcome:
+
+```
+1. MAP the chain: input → layer 1 → layer 2 → ... → user-visible outcome
+2. For each link: how to verify it works, what artifact/log/query proves it
+3. After the fix: verify EVERY link, not just the one you changed
+4. Self-review must trace the path — "I changed layer N, what happens at N+1...?"
+```
+
+**Never fix a single layer and declare done.** The fix isn't complete until the final outcome is verified end-to-end.
+
+## Invariant Statement — MANDATORY before writing tests
+
+Before writing ANY test, state explicitly:
+
+```
+INVARIANT: [what must be true]
+SUT: [exact system/function/artifact under test]
+VERIFICATION: [how the test proves the invariant holds]
+```
+
+**Anti-patterns to avoid:**
+
+- Testing mocks instead of real code (you're proving your mocks work, not your code)
+- Testing the wrong artifact (e.g., `/app/dist/` when runtime uses `/tmp/dist/`)
+- "All 275 tests pass" when none cover the actual change
+- Verifying implementation details (`cpSync was called`) instead of outcomes (`agent has the tool`)
+- Hardcoding values that the SUT computes (e.g., `PROJECT_ROOT="$REPO_ROOT"` bypasses testing path resolution)
+
+**Meta tests — MANDATORY for infrastructure scripts:**
+Scripts that resolve paths, detect environments, or set up state used by all subsequent logic MUST have tests that verify the resolution/detection itself — not tests that hardcode the resolved value and only test downstream logic. If a test bypasses the setup that the real script performs, it can't catch bugs in that setup. Examples:
+
+- Path resolution: test that the output is absolute, points to the right directory, works from subdirectories and worktrees
+- Environment detection: test that detection works in the actual environments it will run in (main checkout, worktree, background process)
+- State initialization: test that initialization produces valid state, not just that functions work given pre-initialized state
+
+## Runtime Artifact Verification
+
+Always test the **actual deployed artifact**, not just source presence:
+
+- If code is compiled, test the compiled output
+- If code runs in a container, verify inside the container
+- If a mount provides a file, verify the mount exists AND the consumer reads it
+- "The file exists in the repo" is not verification — "the agent receives it at runtime" is
+
+## Smoke Tests — MANDATORY when review identifies them
+
+When a PR review says a smoke test is needed, **you must perform it before declaring the PR ready**. "Pending manual smoke test" is not an acceptable review outcome — it means the review is incomplete.
+
+Smoke test checklist:
+
+1. **Identify what to smoke test** — the review will name the untested path (e.g., "never hit real GitHub API", "never ran in container")
+2. **Run it** — execute the actual end-to-end path. If it requires credentials or infrastructure you don't have, ask the user to provide them or run the test together.
+3. **Record the result** — include the smoke test output (success or failure) in the PR or review comment.
+4. **If you can't smoke test** — explicitly state what's blocking and ask the user. Don't hand-wave it as "recommended before deploy."
+
+The point of review is to catch gaps. A gap identified but not closed is not a review — it's a TODO list.
@@ -0,0 +1,40 @@
+# Dev Work Skill Chain
+
+When the conversation involves **selecting, evaluating, or starting dev work**, activate the right skills in sequence. Do NOT jump straight to writing code.
+
+## Flow
+
+```
+User asks "where are the gaps", "analyze gaps", "what should we invest in"
+  → /gap-analysis  (strategic: tooling/testing gaps, horizon concentration, unnamed dimensions)
+    → produces: low-hanging fruit, feature PRD candidates, meta/horizon PRD candidates
+
+User asks "make a dent", "hero mode", "fix the category", "deep dive", "autonomous fix"
+  → /make-a-dent  (autonomous: find root cause category, fix bugs, add interaction tests, ship PR)
+
+User asks "what's next", "pick work", "pick a kaizen", "what should we work on"
+  → /pick-work  (filter claimed issues, score by momentum/diversity, present options)
+
+User discusses a specific issue, PR, case, or spec
+  → /accept-case  (collision check, evaluate, find low-hanging fruit, get admin input)
+
+User greenlights: "lets do it", "go ahead", "build it", "do it", "yes", etc.
+  → /implement-spec  (five-step algorithm, create case + worktree, then execute)
+  → MUST pass githubIssue number when creating case for a kaizen issue
+
+Work is large enough to need multiple PRs
+  → /plan-work  (break into sequenced PRs with dependency graph)
+
+Work is done
+  → /kaizen  (reflect on impediments, suggest improvements)
+```
+
+## Key Triggers to Recognize
+
+- **Strategic gap analysis:** "gap analysis", "analyze gaps", "where are problems concentrated", "tooling gaps", "testing gaps" → `/gap-analysis`
+- **Autonomous deep-dive:** "make a dent", "hero mode", "fix the category", "deep dive kaizen", "autonomous fix" → `/make-a-dent`
+- **Selecting work from backlog:** "pick a kaizen", "what's next", "what should we work on", "find work", "choose issue" → `/pick-work`
+- **Evaluating specific work:** "look at issue #N", "check PR #N", "find low hanging fruit", "evaluate this" → `/accept-case`
+- **Greenlighting work:** "lets do it", "go ahead", "build it", "start on this", "ship it", "make it happen" → `/implement-spec`
+- **All dev work MUST be in a case.** If `/implement-spec` activates, create a case with worktree before writing any code.
+- **Kaizen issue lifecycle:** When working on a kaizen issue, the `status:active`/`status:done` labels are auto-synced by `case-backend-github.ts`. Collision detection in `ipc-cases.ts` blocks duplicate case creation for the same issue.
@@ -0,0 +1 @@
+../kaizen/skills/trim-claude-md