Your AI agent is only as good as your repo. AgentLint finds what's broken — file structure, instruction quality, build setup, session continuity, security posture — and fixes it. 33 checks, every one backed by data.
We analyzed 265 versions of Anthropic's Claude Code system prompt, documented the hard limits, audited thousands of real repos, and reviewed the academic research. The result: a single command that tells you exactly what your AI agent is struggling with and why.
```
npm install -g @0xmariowu/agent-lint
```

Then start a new Claude Code session:

```
/al
```
That's it. AgentLint scans your projects, scores them, shows what's wrong, and fixes what it can.
```
$ /al

AgentLint — Score: 68/100

Findability  ██████████████░░░░░░ 7/10
Instructions ████████████████░░░░ 8/10
Workability  ████████████░░░░░░░░ 6/10
Safety       ██████████░░░░░░░░░░ 5/10
Continuity   ██████████████░░░░░░ 7/10

Fix Plan (7 items):
[guided]   Pin 8 GitHub Actions to SHA (supply chain risk)
[guided]   Add .env to .gitignore (AI exposes secrets)
[assisted] Generate HANDOFF.md
[guided]   Reduce IMPORTANT keywords (7 found, Anthropic uses 4)
```
Select items → AgentLint fixes → re-scores → saves HTML report
The HTML report shows a segmented gauge, expandable dimension breakdowns with per-check detail, and a prioritized issues list. Before/after comparison when fixes are applied.
AI coding agents read your repo structure, docs, CI config, and handoff notes. They run `git push`, trigger pipelines, and write files. A well-structured repo gets dramatically better AI output. A poorly structured one wastes tokens, ignores rules, repeats mistakes, and may expose secrets.
AgentLint is built on data most developers never see:
- 265 versions of Anthropic's Claude Code system prompt — every word added, deleted, and rewritten
- Claude Code internals — hard limits (40K char max, 256KB file read limit, pre-commit hook behavior) that silently break your setup
- Production security audits across open-source codebases — the gaps AI agents walk into
- 6 academic papers on instruction-following, context files, and documentation decay
**Findability**

| Check | What | Why |
|---|---|---|
| F1 | Entry file exists | No CLAUDE.md = AI starts blind |
| F2 | Project description in first 10 lines | AI needs context before rules |
| F3 | Conditional loading guidance | "If working on X, read Y" prevents context bloat |
| F4 | Large directories have INDEX | >10 files without index = AI reads everything |
| F5 | All references resolve | Broken links waste tokens on dead-end reads |
| F6 | Standard file naming | README.md, CLAUDE.md are auto-discovered |
| F7 | @include directives resolve | Missing targets are silently ignored — you think it's loaded, it isn't |
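
F7 is checkable mechanically: every `@include` target must exist on disk. A minimal sketch, assuming lines beginning with `@path` are include directives (the exact directive syntax AgentLint parses may differ):

```python
import re
from pathlib import Path

# Treat a line starting with "@" as an include directive (illustrative assumption)
INCLUDE_RE = re.compile(r"^@(\S+)", re.MULTILINE)

def unresolved_includes(entry="CLAUDE.md"):
    """Return @include targets in the entry file that do not exist on disk (F7)."""
    root = Path(entry).parent
    text = Path(entry).read_text(encoding="utf-8")
    return [t for t in INCLUDE_RE.findall(text) if not (root / t).exists()]
```

A broken target returned here is exactly the silent failure the table warns about: the agent never sees the file, and nothing tells you so.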
**Instructions**

| Check | What | Why |
|---|---|---|
| I1 | Emphasis keyword count | Anthropic cut IMPORTANT from 12 to 4 across 265 versions |
| I2 | Keyword density | More emphasis = less compliance. Anthropic: 7.5 → 1.4 per 1K words |
| I3 | Rule specificity | "Don't X. Instead Y. Because Z." — Anthropic's golden formula |
| I4 | Action-oriented headings | Anthropic deleted all "You are a..." identity sections |
| I5 | No identity language | "Follow conventions" removed — model already does this |
| I6 | Entry file length | 60-120 lines is the sweet spot. Longer dilutes priority |
| I7 | Under 40,000 characters | Claude Code hard limit. Above this, your file is truncated |
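
The two length checks are purely mechanical. A minimal sketch of what I6/I7 measure, assuming a CLAUDE.md in the current directory (the function name and return shape are illustrative; the 40,000-character cap is the hard limit cited in the table):

```python
from pathlib import Path

CHAR_LIMIT = 40_000          # Claude Code truncates entry files above this (I7)
LINE_SWEET_SPOT = (60, 120)  # suggested range from I6

def check_entry_length(path="CLAUDE.md"):
    """Return (chars, lines, findings) for an entry file."""
    text = Path(path).read_text(encoding="utf-8")
    chars, lines = len(text), len(text.splitlines())
    findings = []
    if chars > CHAR_LIMIT:
        findings.append(f"I7: {chars} chars > {CHAR_LIMIT} hard limit (file truncated)")
    if not (LINE_SWEET_SPOT[0] <= lines <= LINE_SWEET_SPOT[1]):
        findings.append(f"I6: {lines} lines outside {LINE_SWEET_SPOT} sweet spot")
    return chars, lines, findings
```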
**Workability**

| Check | What | Why |
|---|---|---|
| W1 | Build/test commands documented | AI can't guess your test runner |
| W2 | CI exists | Rules without enforcement are suggestions |
| W3 | Tests exist (not empty shell) | A CI that runs pytest with 0 test files always "passes" |
| W4 | Linter configured | Mechanical formatting frees AI from guessing style |
| W5 | No files over 256 KB | Claude Code cannot read them — hard error |
| W6 | Pre-commit hooks are fast | Claude Code never uses --no-verify. Slow hooks = stuck commits |
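
W5 can be spot-checked with nothing but the standard library. A sketch, assuming a plain walk of the working tree (the 256 KB figure is the read limit cited above; skipping `.git` is an illustrative simplification of real ignore handling):

```python
from pathlib import Path

READ_LIMIT = 256 * 1024  # Claude Code's per-file read limit (W5)

def oversized_files(root="."):
    """List files in the tree that Claude Code cannot read in one pass."""
    return [
        p for p in Path(root).rglob("*")
        if p.is_file()
        and ".git" not in p.parts
        and p.stat().st_size > READ_LIMIT
    ]
```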
**Continuity**

| Check | What | Why |
|---|---|---|
| C1 | Document freshness | Stale instructions are worse than no instructions |
| C2 | Handoff file exists | Without it, every session starts from zero |
| C3 | Changelog has "why" | "Updated INDEX" says nothing. "Fixed broken path" says everything |
| C4 | Plans in repo | Plans in Jira don't exist for AI |
| C5 | CLAUDE.local.md not in git | Private per-user file. Claude Code requires .gitignore |
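
C5 is a one-line check in spirit: the per-user file must be ignored. A sketch, assuming literal `.gitignore` entries (real gitignore semantics with negation and nested files are more involved, and the pattern set here is illustrative):

```python
from pathlib import Path

def local_file_ignored(repo="."):
    """True if CLAUDE.local.md is covered by a literal .gitignore entry (C5)."""
    gi = Path(repo) / ".gitignore"
    if not gi.exists():
        return False
    patterns = {line.strip() for line in gi.read_text(encoding="utf-8").splitlines()}
    return bool({"CLAUDE.local.md", "*.local.md", "CLAUDE.local.*"} & patterns)
```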
**Safety**

| Check | What | Why |
|---|---|---|
| S1 | .env in .gitignore | AI's Glob tool ignores .gitignore by default — secrets are visible |
| S2 | Actions SHA pinned | AI push triggers CI. Floating tags = supply chain attack vector |
| S3 | Secret scanning configured | AI won't self-check for accidentally written API keys |
| S4 | SECURITY.md exists | AI needs security context for sensitive code decisions |
| S5 | Workflow permissions minimized | AI-triggered workflows shouldn't have write access by default |
| S6 | No hardcoded secrets | Detects sk-, ghp_, AKIA, private key patterns in source |
| S7 | No personal paths | /Users/xxx/ in source = AI copies and spreads the leak |
| S8 | No pull_request_target | AI pushes trigger CI. Elevated permissions = attack vector |
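
S6-style scanning reduces to a handful of regexes. A minimal sketch (these patterns are illustrative starting points, not AgentLint's actual rule set):

```python
import re

# Illustrative prefixes for common credential formats (S6)
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                 # OpenAI-style keys
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                 # GitHub personal access tokens
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key IDs
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private keys
]

def find_secrets(text):
    """Return every substring that matches a known secret pattern."""
    return [m.group(0) for pat in SECRET_PATTERNS for m in pat.finditer(text)]
```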
AgentLint also spawns AI subagents to find what mechanical checks can't:
- Contradictory rules that confuse the model
- Dead-weight rules the model would follow without being told
- Vague rules without decision boundaries
And it reads your Claude Code session logs to find:
- Instructions you repeat across sessions (should be in CLAUDE.md)
- Rules AI keeps ignoring (need rewriting)
- Friction hotspots by project
Each check produces a 0-1 score, weighted by dimension, scaled to 100.
| Dimension | Weight | Why |
|---|---|---|
| Instructions | 30% | Unique value. No other tool checks CLAUDE.md quality |
| Findability | 20% | AI can't follow rules it can't find |
| Workability | 20% | Can AI actually run your code? |
| Safety | 15% | Is AI working without exposing secrets or triggering vulnerabilities? |
| Continuity | 15% | Does knowledge survive across sessions? |
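
The weighting above reproduces the sample report at the top of this README. A sketch, assuming each dimension score is already normalized to 0-1:

```python
WEIGHTS = {
    "instructions": 0.30,
    "findability": 0.20,
    "workability": 0.20,
    "safety": 0.15,
    "continuity": 0.15,
}

def overall_score(dimension_scores):
    """Weighted sum of 0-1 dimension scores, scaled to 100."""
    return round(100 * sum(WEIGHTS[d] * s for d, s in dimension_scores.items()))

# The sample report: F 7/10, I 8/10, W 6/10, S 5/10, C 7/10
sample = {"findability": 0.7, "instructions": 0.8, "workability": 0.6,
          "safety": 0.5, "continuity": 0.7}
# overall_score(sample) → 68, matching the 68/100 shown above
```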
Scores are measurements, not judgments. Reference values come from Anthropic's own data. You decide what to fix.
```
claude plugin update agent-lint@agent-lint
```

Every check cites its source. Full citations in `standards/evidence.json`.
| Source | Type |
|---|---|
| Anthropic 265 versions | Primary dataset |
| Claude Code internals | Hard limits and observed behavior |
| IFScale (NeurIPS) | Instruction compliance at scale |
| ETH Zurich | Do context files help coding agents? |
| Codified Context | Stale content as #1 failure mode |
| Agent READMEs | Concrete vs abstract effectiveness |
- Claude Code
- `jq` and Node 20+
MIT
