Commit 35c6389

aviadr1 and claude committed
refactor: trim CLAUDE.md 512→193 lines — extract kaizen system + reference docs (kaizen #242)
Kaizen system (.claude/kaizen/):
- policies.md: enforcement policies #11-17 (recursive kaizen, hooks, MCP, security, isolation, testing, language)
- verification.md: path tracing, invariant statements, runtime artifact verification, smoke tests
- workflow.md: dev work skill chain trigger→skill routing
- skills/trim-claude-md: reusable skill for measuring and trimming CLAUDE.md (symlinked to .claude/skills/)

Reference docs (docs/):
- architecture-layers.md: layered architecture, file naming, layer rules, cases↔kaizen relationship
- harness-vertical-architecture.md: harness/vertical split, dependency placement, config contract
- merging-prs.md: merge procedure, CI monitoring, troubleshooting, post-merge auto-deploy
- ipc-messaging.md: sending Telegram messages from host via IPC files

CLAUDE.md retains: quick context, cases overview, key files, skill triggers, general dev policies #1-10, database, development commands, git remotes, end-of-session cleanup, plus compact pointers to all extracted content with "when to read" hints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent f1ec304 commit 35c6389

11 files changed

Lines changed: 491 additions & 365 deletions

.claude/kaizen/README.md

Lines changed: 17 additions & 0 deletions
@@ -71,6 +71,17 @@ These are registered in `.claude/settings.json` and fire on Claude Code tool-use
 | `test-*.sh` | Per-hook and integration test files (15+) |
 | `test_hooks.py` | Python-based hook tests |
 
+### System Documents (this directory)
+
+| File | Purpose |
+|------|---------|
+| `policies.md` | Kaizen enforcement policies (#11-17): recursive kaizen, hooks, MCP, security, isolation, testing, language boundaries |
+| `verification.md` | Verification discipline: path tracing, invariant statements, runtime artifact verification, smoke tests |
+| `workflow.md` | Dev work skill chain: trigger→skill routing for `/pick-work`, `/accept-case`, `/implement-spec`, `/kaizen` |
+| `practices.md` | Engineering practices checklist, consulted before shipping (advisory) |
+| `zen.md` | The Zen of Kaizen: philosophical principles |
+| `horizon.md` | Horizon tracking dimensions |
+
 ### Documentation (`docs/`)
 
 | File | Purpose |
@@ -79,6 +90,12 @@ These are registered in `.claude/settings.json` and fire on Claude Code tool-use
 | `hook-portability-matrix.md` | Maps each hook to its best portable alternative |
 | `hook-migration-plan.md` | Phase plan for moving enforcement to strongest layers |
 
+### Skills (`skills/`)
+
+| Skill | Purpose |
+|-------|---------|
+| `trim-claude-md` | Measure CLAUDE.md, identify bloat, extract to kaizen system or reference docs. Symlinked from `.claude/skills/` for discoverability. |
+
 ### Integration Points (outside this directory)
 
 These participate in kAIzen but live where their tools require:

.claude/kaizen/policies.md

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
# Kaizen Enforcement Policies
2+
3+
These policies govern the kaizen enforcement system specifically. General dev policies live in CLAUDE.md. These rules were learned from past kaizen incidents — follow them strictly.
4+
5+
1. **Recursive kaizen on every fix-PR.** See `.claude/skills/kaizen/SKILL.md` for the full framework. After every fix, assess:
6+
- **What level is this fix?** Level 1 (instructions) → Level 2 (hooks/checks) → Level 3 (mechanistic code)
7+
- **Has this type of failure happened before?** If yes, the previous level wasn't enough — escalate.
8+
- **Affects humans directly?** → Must be Level 3 (humans should never wait on agent mistakes)
9+
- CLAUDE.md instructions are Level 1 — necessary but not sufficient. When they fail, escalate to hooks (Level 2) or architectural enforcement (Level 3).
10+
2. **Hooks are the foundation of our kaizen infrastructure.** The `.claude/kaizen/hooks/` directory contains Level 2 enforcement — automated checks that catch mistakes before they reach humans. See `.claude/kaizen/README.md` for the full kAIzen Agent Control Flow system documentation. When a hook blocks you:
11+
- **Do NOT override it blindly.** The hook exists because a past mistake proved instructions alone weren't enough.
12+
- **If it's a false positive**, fix the hook. Improve its matching logic, add exclusions with rationale, and add a test case that covers the false-positive scenario. This is recursive kaizen — making the enforcement smarter, not weaker.
13+
- **If it's a true positive**, fix the underlying issue. The hook is doing its job.
14+
- **Always add a test** for any hook change in `.claude/kaizen/hooks/tests/`. Hooks without tests are Level 1 pretending to be Level 2.
15+
3. **MCP tools are Level 3 enforcement points, not passthroughs.** When an agent behavior problem surfaces through an MCP tool, the fix belongs in the tool's logic — validation, auto-detection, or rejection. Don't default to updating the tool's description text (Level 1) when the kaizen rules demand Level 3. The MCP boundary is where agent intent meets system action; that's where policy enforcement belongs. Level 1 description improvements are defense-in-depth on top of Level 3, not a substitute.
16+
4. **Authoritative security files: do NOT duplicate, do NOT bypass.** Files with `security`, `auth`, or `allowlist` in their name (`case-auth.ts`, `mount-security.ts`, `sender-allowlist.ts`) are the single source of truth for their policy domain. All authorization decisions in that domain MUST go through the authoritative file. Never inline ad-hoc authorization checks elsewhere — call the gate function instead. Changes to these files require careful review and tests.
17+
5. **Hooks MUST be worktree-isolated.** A hook running in worktree A must NEVER read, modify, or block based on state from worktree B. This is a hard safety invariant — violations cause cross-worktree contamination where one agent's work hijacks another agent's session. All state file iteration MUST go through `lib/state-utils.sh` (`is_state_for_current_worktree`, `list_state_files_for_current_worktree`). Never iterate `/tmp/.pr-review-state/` directly. State files without a BRANCH field are treated as unattributable and skipped.
18+
6. **Co-commit source and test changes.** Every source file change must have a corresponding test file change in the same PR. Test utilities use the `.test-util.ts` extension (excluded from coverage checks). If a source change genuinely doesn't need tests (e.g., trivial constant change, already covered by existing tests), declare it in the PR body using the `test-exceptions` fenced block — this is public and auditable.
19+
7. **Hook language boundaries: L1-L2 bash, L3-L4 TypeScript.** Simple guards and pattern matching stay in bash. Scripts that need arithmetic, data transformation, error recovery, or their own test assertions belong in TypeScript. The signal: if you're hand-rolling `try/catch` or `expect()` in bash, move to TypeScript. See [`docs/hook-language-boundaries.md`](../../docs/hook-language-boundaries.md) for the full decision framework and migration plan.
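
Policy 5 can be illustrated with a minimal sketch of worktree-scoped state iteration. The real helpers live in `lib/state-utils.sh` and their implementation is not part of this diff, so everything below is an assumption made for illustration: the function names `state_branch` and `list_state_files_for_branch` are hypothetical, and the `BRANCH=<name>` line format is a guess at how the BRANCH field might be stored.

```bash
# Hypothetical sketch, not the code in lib/state-utils.sh.
# Assumption: each state file records its owner as a line "BRANCH=<branch-name>".

# Print the BRANCH value recorded in a state file, or nothing if the field is absent.
state_branch() {
  sed -n 's/^BRANCH=//p' "$1" | head -n 1
}

# List only the state files in directory $1 that are attributed to branch $2.
# Files without a BRANCH field are unattributable and are skipped.
list_state_files_for_branch() {
  local dir="$1" branch="$2" file recorded
  for file in "$dir"/*; do
    [ -f "$file" ] || continue
    recorded="$(state_branch "$file")"
    [ -n "$recorded" ] || continue
    [ "$recorded" = "$branch" ] && printf '%s\n' "$file"
  done
  return 0
}
```

A hook built this way would pass its own branch (for example the output of `git rev-parse --abbrev-ref HEAD`) and never see state written by another worktree, which is the invariant the policy asks for.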

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
# Trim CLAUDE.md

Measure CLAUDE.md size, identify bloat, and extract verbose sections into reference docs or kaizen system files, leaving compact pointers behind.

CLAUDE.md is loaded into every conversation. Every line costs context. This skill keeps it lean.

## When to use

- CLAUDE.md exceeds ~250 lines
- A section has grown verbose during development (common after adding policies or procedures)
- After adding a new feature that introduced a large CLAUDE.md section
- Periodic maintenance (every ~10 PRs)

## The Process

### Step 1: Measure

```bash
# Total size
wc -l CLAUDE.md

# Per-section breakdown (prints nothing if the file has no "## " headings)
awk '/^## /{if(section) printf "%3d lines: %s\n", NR-start, section; section=$0; start=NR} END{if(section) printf "%3d lines: %s\n", NR-start+1, section}' CLAUDE.md
```

**Target:** CLAUDE.md under 250 lines. If it's under 200, it's probably fine; don't trim for the sake of trimming.
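
The thresholds above can be turned into a small check. This is a sketch only: the function name `claude_md_size_status` and the labels `ok`, `watch`, and `trim` are illustrative and not part of the skill.

```bash
# Sketch: classify a file against the 200 and 250 line thresholds.
# The function name and the output labels are hypothetical.
claude_md_size_status() {
  local file="$1" lines
  # Redirecting into wc avoids the filename in its output; tr strips padding.
  lines="$(wc -l < "$file" | tr -d ' ')"
  if [ "$lines" -lt 200 ]; then
    echo "ok"      # under 200: probably fine
  elif [ "$lines" -le 250 ]; then
    echo "watch"   # close to the target, no action required yet
  else
    echo "trim"    # exceeds ~250 lines: run the process below
  fi
}
```

Calling `claude_md_size_status CLAUDE.md` from the repository root prints one of the three labels, which makes the check easy to wire into periodic maintenance.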

### Step 2: Classify each section

For each section over ~15 lines, ask:

| Question | If yes → |
|----------|----------|
| Is this kaizen enforcement? (hooks, verification, workflow, policies learned from incidents) | `.claude/kaizen/{name}.md` |
| Is this a reference lookup? (architecture, procedures, recipes) | `docs/{name}.md` |
| Is this an interactive workflow with decision points? | `.claude/skills/{name}/SKILL.md` |
| Is this routing data? (trigger phrases → skills, key files, quick context) | **Keep in CLAUDE.md** |
| Is this a short rule or policy? (< 3 lines per item) | **Keep in CLAUDE.md** |

### Step 3: Decide what stays

**Always keep in CLAUDE.md** (agents need this in every conversation):
- Quick Context: project orientation
- Cases overview: core concept
- Key Files: navigation
- Skill trigger mappings: routing data (which phrases invoke which skills)
- Short policies: rules that fit in 1-2 lines each
- Database: query recipes
- Development: build/run commands
- Git Remotes: tiny, always needed

**Extract from CLAUDE.md** (agents only need when doing specific tasks):
- Detailed procedures (merging, deploying, IPC messaging)
- Verbose policies with sub-bullets and examples
- Architecture diagrams and layer tables
- Verification checklists
- Philosophical content (zen aphorisms)

### Step 4: Extract

For each section being extracted:

1. **Create the target file** with the full content, properly titled
2. **Replace in CLAUDE.md** with a 2-3 line pointer:
   ```markdown
   ## Section Name

   **Read [`path/to/file.md`](path/to/file.md)** when [trigger condition].

   Key points: [1-2 most important rules that agents should always remember].
   ```
3. **Verify** the pointer's path is correct and the file exists
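
The pointer format in Step 4 is regular enough to generate. The helper below is a hypothetical sketch, not part of the skill: `make_pointer` and its four positional arguments (title, path, trigger condition, key points) are names chosen for illustration.

```bash
# Sketch: print a pointer stub in the shape shown in Step 4.
# make_pointer is a hypothetical helper; arguments are title, path, trigger, key points.
make_pointer() {
  local title="$1" path="$2" trigger="$3" key_points="$4"
  printf '## %s\n\n' "$title"
  printf '**Read [`%s`](%s)** when %s.\n\n' "$path" "$path" "$trigger"
  printf 'Key points: %s\n' "$key_points"
}
```

For example, `make_pointer "Merging PRs" "docs/merging-prs.md" "merging a PR" "<key rules here>"` prints a stub that can be pasted into CLAUDE.md in place of the extracted section. The key points still have to be written by hand.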

### Step 5: Verify

```bash
# Check final size
wc -l CLAUDE.md

# Check that all relative links resolve (external URLs and pure #anchors are skipped)
grep -oE '\]\([^)]+\)' CLAUDE.md | sed -E 's/^\]\(//; s/\)$//; s/#.*$//' | while IFS= read -r link; do
  case "$link" in ''|http://*|https://*|mailto:*) continue ;; esac
  [ -e "$link" ] || echo "BROKEN: $link"
done
```

## Classification guide: kaizen vs not

**Belongs in `.claude/kaizen/`** if the content:
- Was learned from kaizen incidents (policies #11-17, verification discipline)
- Enforces the kaizen workflow (skill chain, reflection triggers)
- Would make sense if kaizen were extracted as a standalone system
- Is referenced by kaizen hooks or skills

**Does NOT belong in `.claude/kaizen/`** if the content:
- Is general engineering practice (TDD, dependency management)
- Is project infrastructure (merging, deploying, database)
- Is domain-specific (vertical architecture, IPC messaging)

When in doubt: if removing kaizen from the project would make this content irrelevant, it belongs in kaizen. If it would still be useful, it belongs elsewhere.

## Anti-patterns

- **Extracting too aggressively.** Short sections (< 10 lines) don't need extraction; the pointer overhead isn't worth it.
- **Losing routing data.** Skill trigger phrases MUST stay in CLAUDE.md. If agents can't see them, they won't invoke skills correctly.
- **Creating skills for passive content.** Reference docs are not skills. Only create a skill if the content is an interactive workflow with decision points.
- **Putting general dev practices in kaizen.** "Declare all dependencies" is not kaizen-specific. Keep it in CLAUDE.md.

.claude/kaizen/verification.md

Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,63 @@
1+
# Verification Discipline
2+
3+
Learned from kaizen #11, #15, #17. These are mandatory practices for all dev work.
4+
5+
## Path Tracing — MANDATORY before any fix
6+
7+
Before writing ANY fix, map the full execution path from trigger to user-visible outcome:
8+
9+
```
10+
1. MAP the chain: input → layer 1 → layer 2 → ... → user-visible outcome
11+
2. For each link: how to verify it works, what artifact/log/query proves it
12+
3. After the fix: verify EVERY link, not just the one you changed
13+
4. Self-review must trace the path — "I changed layer N, what happens at N+1...?"
14+
```
15+
16+
**Never fix a single layer and declare done.** The fix isn't complete until the final outcome is verified end-to-end.
17+
## Invariant Statement: MANDATORY before writing tests

Before writing ANY test, state explicitly:

```
INVARIANT: [what must be true]
SUT: [exact system/function/artifact under test]
VERIFICATION: [how the test proves the invariant holds]
```

**Anti-patterns to avoid:**

- Testing mocks instead of real code (you're proving your mocks work, not your code)
- Testing the wrong artifact (e.g., `/app/dist/` when runtime uses `/tmp/dist/`)
- "All 275 tests pass" when none cover the actual change
- Verifying implementation details (`cpSync was called`) instead of outcomes (`agent has the tool`)
- Hardcoding values that the SUT computes (e.g., `PROJECT_ROOT="$REPO_ROOT"` bypasses testing path resolution)

**Meta tests (MANDATORY for infrastructure scripts):**
Scripts that resolve paths, detect environments, or set up state used by all subsequent logic MUST have tests that verify the resolution/detection itself, not tests that hardcode the resolved value and only test downstream logic. If a test bypasses the setup that the real script performs, it can't catch bugs in that setup. Examples:

- Path resolution: test that the output is absolute, points to the right directory, works from subdirectories and worktrees
- Environment detection: test that detection works in the actual environments it will run in (main checkout, worktree, background process)
- State initialization: test that initialization produces valid state, not just that functions work given pre-initialized state
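
A meta test for path resolution might look like the sketch below. It is an illustration under stated assumptions, not code from this repository: `find_project_root` is a hypothetical resolver that walks upward until it finds a marker file (defaulting to `CLAUDE.md`), and the test exercises the resolver itself from a nested directory instead of hardcoding the result.

```bash
# INVARIANT: find_project_root prints an absolute path to the directory holding the marker.
# SUT: find_project_root (hypothetical resolver), not a hardcoded PROJECT_ROOT.
# VERIFICATION: call it from a nested subdirectory and compare with the known root.

# Walk upward from the current directory until the marker file is found.
find_project_root() {
  local marker="${1:-CLAUDE.md}" dir
  dir="$(pwd -P)"
  while [ "$dir" != "/" ]; do
    if [ -e "$dir/$marker" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
    dir="$(dirname "$dir")"
  done
  return 1
}

# Meta test: runs the real resolution from a subdirectory of a throwaway tree.
test_find_project_root_from_subdirectory() {
  local tmp root result
  tmp="$(mktemp -d)"
  root="$(cd "$tmp" && pwd -P)"   # canonical form, in case the temp dir is a symlink
  mkdir -p "$root/a/b"
  : > "$root/CLAUDE.md"
  result="$(cd "$root/a/b" && find_project_root)" || return 1
  rm -rf "$tmp"
  case "$result" in
    /*) ;;                        # output must be absolute
    *) return 1 ;;
  esac
  [ "$result" = "$root" ]
}
```

The test never assigns the expected root to the variable under test, so a bug in the upward walk would make it fail. A worktree variant would follow the same shape with a different fixture.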

## Runtime Artifact Verification

Always test the **actual deployed artifact**, not just source presence:

- If code is compiled, test the compiled output
- If code runs in a container, verify inside the container
- If a mount provides a file, verify the mount exists AND the consumer reads it
- "The file exists in the repo" is not verification; "the agent receives it at runtime" is
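
One small guard that supports the first bullet is to refuse to test a build that is missing or older than its source. The helper below is a hypothetical sketch (`artifact_is_fresh` is not a name from this repository), and it only protects against testing a stale artifact; it does not replace exercising the artifact itself.

```bash
# Sketch: succeed only when the built artifact exists and is newer than its source.
# artifact_is_fresh is a hypothetical helper; both paths are supplied by the caller.
artifact_is_fresh() {
  local source_file="$1" artifact="$2"
  [ -f "$artifact" ] || return 1
  # -nt is true when the left file was modified more recently than the right one.
  [ "$artifact" -nt "$source_file" ]
}
```

A test script could call this before running assertions against the compiled output and abort with a clear message when it fails, so a green run cannot come from a leftover build.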

## Smoke Tests: MANDATORY when review identifies them

When a PR review says a smoke test is needed, **you must perform it before declaring the PR ready**. "Pending manual smoke test" is not an acceptable review outcome; it means the review is incomplete.

Smoke test checklist:

1. **Identify what to smoke test.** The review will name the untested path (e.g., "never hit real GitHub API", "never ran in container").
2. **Run it.** Execute the actual end-to-end path. If it requires credentials or infrastructure you don't have, ask the user to provide them or run the test together.
3. **Record the result.** Include the smoke test output (success or failure) in the PR or review comment.
4. **If you can't smoke test,** explicitly state what's blocking and ask the user. Don't hand-wave it as "recommended before deploy."

The point of review is to catch gaps. A gap identified but not closed is not a review; it's a TODO list.

.claude/kaizen/workflow.md

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
# Dev Work Skill Chain
2+
3+
When the conversation involves **selecting, evaluating, or starting dev work**, activate the right skills in sequence. Do NOT jump straight to writing code.
4+
5+
## Flow
6+
7+
```
8+
User asks "where are the gaps", "analyze gaps", "what should we invest in"
9+
→ /gap-analysis (strategic: tooling/testing gaps, horizon concentration, unnamed dimensions)
10+
→ produces: low-hanging fruit, feature PRD candidates, meta/horizon PRD candidates
11+
12+
User asks "make a dent", "hero mode", "fix the category", "deep dive", "autonomous fix"
13+
→ /make-a-dent (autonomous: find root cause category, fix bugs, add interaction tests, ship PR)
14+
15+
User asks "what's next", "pick work", "pick a kaizen", "what should we work on"
16+
→ /pick-work (filter claimed issues, score by momentum/diversity, present options)
17+
18+
User discusses a specific issue, PR, case, or spec
19+
→ /accept-case (collision check, evaluate, find low-hanging fruit, get admin input)
20+
21+
User greenlights: "lets do it", "go ahead", "build it", "do it", "yes", etc.
22+
→ /implement-spec (five-step algorithm, create case + worktree, then execute)
23+
→ MUST pass githubIssue number when creating case for a kaizen issue
24+
25+
Work is large enough to need multiple PRs
26+
→ /plan-work (break into sequenced PRs with dependency graph)
27+
28+
Work is done
29+
→ /kaizen (reflect on impediments, suggest improvements)
30+
```

## Key Triggers to Recognize

- **Strategic gap analysis:** "gap analysis", "analyze gaps", "where are problems concentrated", "tooling gaps", "testing gaps" → `/gap-analysis`
- **Autonomous deep-dive:** "make a dent", "hero mode", "fix the category", "deep dive kaizen", "autonomous fix" → `/make-a-dent`
- **Selecting work from backlog:** "pick a kaizen", "what's next", "what should we work on", "find work", "choose issue" → `/pick-work`
- **Evaluating specific work:** "look at issue #N", "check PR #N", "find low hanging fruit", "evaluate this" → `/accept-case`
- **Greenlighting work:** "lets do it", "go ahead", "build it", "start on this", "ship it", "make it happen" → `/implement-spec`
- **All dev work MUST be in a case.** If `/implement-spec` activates, create a case with worktree before writing any code.
- **Kaizen issue lifecycle:** When working on a kaizen issue, the `status:active`/`status:done` labels are auto-synced by `case-backend-github.ts`. Collision detection in `ipc-cases.ts` blocks duplicate case creation for the same issue.
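
The trigger list reads as a lookup table, which the sketch below makes explicit. In the real system the agent recognizes these phrases by reading the document, so this is only an illustration: `route_skill` is a hypothetical function, case-insensitive substring matching is an assumption, and only a subset of the listed phrases is covered.

```bash
# Sketch: map a user phrase to the skill named in the trigger list.
# route_skill is hypothetical; substring matching is an assumption for illustration.
route_skill() {
  local phrase
  phrase="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$phrase" in
    *"gap analysis"*|*"analyze gaps"*|*"tooling gaps"*|*"testing gaps"*) echo "/gap-analysis" ;;
    *"make a dent"*|*"hero mode"*|*"fix the category"*|*"autonomous fix"*) echo "/make-a-dent" ;;
    *"pick a kaizen"*|*"what's next"*|*"what should we work on"*|*"find work"*) echo "/pick-work" ;;
    *"look at issue"*|*"check pr"*|*"low hanging fruit"*|*"evaluate this"*) echo "/accept-case" ;;
    *"lets do it"*|*"go ahead"*|*"build it"*|*"ship it"*|*"make it happen"*) echo "/implement-spec" ;;
    *) return 1 ;;   # no trigger recognized
  esac
}
```

Order matters in a `case` statement: the first matching branch wins, so a phrase that contains triggers for two skills resolves to whichever branch is listed first.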

.claude/skills/trim-claude-md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
../kaizen/skills/trim-claude-md
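
The one-line file above is a relative symlink target. A minimal sketch of how such a link could be created follows; `link_skill` is a hypothetical helper, and taking the repository root as the first argument is an assumption made so the example does not depend on the current directory.

```bash
# Sketch: expose a kaizen skill under .claude/skills/ through a relative symlink.
# link_skill is a hypothetical helper; $1 is the repository root, $2 the skill name.
link_skill() {
  local repo_root="$1" name="$2"
  mkdir -p "$repo_root/.claude/skills"
  # The target is resolved relative to the link's own directory (.claude/skills/).
  ln -sfn "../kaizen/skills/$name" "$repo_root/.claude/skills/$name"
}
```

Because the target is relative, the link keeps working when the repository is cloned or checked out as a worktree at a different absolute path.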
