Commit f26f960

rayhanadev and claude committed
feat(stats): scan every Cursor store (Nightly GUI + CLI agent) and deslop the engine
Broaden which local agent history `stats` reads, and refine the engine.

Coverage:
- Cursor GUI: scan both the stable and Nightly builds' composer databases (was stable-only — a Nightly-only user got zero GUI sessions), and read a live, editor-locked database via SQLite's immutable mode instead of letting the lock crash the run.
- Cursor CLI agent: new source for the per-session content-addressed stores under ~/.cursor and ~/.cursor-nightly — decode the hex meta row, parse the binary message manifest, and map Write/ApplyPatch/StrReplace/Delete tool calls to edits, capturing Read results as reconstruction bases.
- Codex (~/.codex) was already covered; verified.

Engine deslop (behavior-preserving):
- consolidate the zip-slip path-inside guard into one audited core util (@react-doctor/core isPathInside), and share the node:sqlite read-only open and the empty-string-preserving string narrow (coerce asNullableString) instead of hand-rolling copies
- drop dead code (write-only session timestamps, an unused export, an unreachable branch), replace forbidden nested ternaries with if/else and a lookup table, collapse a redundant variable and a pass-through wrapper
- guard every SQLite close so a locked/unreadable store degrades to "skip" rather than sinking the whole stats run

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 509f229 commit f26f960

23 files changed

Lines changed: 655 additions & 155 deletions

.changeset/stats-agent-leaderboard.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -4,11 +4,11 @@

 Add a `react-doctor stats` subcommand — a per-model code-quality leaderboard built from local AI agent chat history.

-`stats` reads local agent history — Claude Code (`~/.claude`) and Codex (`~/.codex`) transcripts, plus the Cursor composer database — reconstructs the file content each model actually wrote (Claude post-edit snapshots, Cursor full post-edit file snapshots, Codex `apply_patch` envelopes), lints that content with the existing engine, and ranks models and providers by their React Doctor score and diagnostics-per-file. The job: answer "which agent/model writes the cleanest React code in my repo".
+`stats` reads local agent history — Claude Code (`~/.claude`) and Codex (`~/.codex`) transcripts, plus Cursor's GUI composer databases and CLI agent stores (`~/.cursor`, `~/.cursor-nightly`) — reconstructs the file content each model actually wrote (Claude post-edit snapshots, Cursor full post-edit file snapshots, Codex `apply_patch` envelopes), lints that content with the existing engine, and ranks models and providers by their React Doctor score and diagnostics-per-file. The job: answer "which agent/model writes the cleanest React code in my repo".

 - Only the React code each model wrote is scored. Reconstructed files are filtered to actual React (JSX/TSX, `use client`/`use server` directives, or a React-ecosystem import) before linting, so a model's plain backend/util/config files don't pad its file count or dilute its diagnostics-per-file. A scan that errors, is skipped, or whose lint phase fails is dropped rather than counted as zero-diagnostic "clean" code, so un-lintable output can't inflate a model's score.
 - Ranking is by a confidence-weighted score, not the raw score: each group's score is regressed toward the global mean by its evidence, so a model with a handful of clean files can't top the board on a tiny sample. Files are the dominant signal; sessions only lightly discount the file weight (many files from one session are one correlated sample) and never below a floor.
-- Cursor attribution reads the canonical composer database (`state.vscdb`) directly, so each session carries its real model (e.g. `claude-opus-4-8`, `gpt-5.5`, `composer-2`) and an exact post-edit snapshot of every edited file — the model-less agent-transcript JSONL files are no longer used. Attribution falls back to `unknown` only for chats left on the "Auto" model.
+- Cursor is read from every place it stores chats: the GUI composer database (`state.vscdb`) for both the stable and Nightly builds, and the CLI agent's per-session stores under `~/.cursor` and `~/.cursor-nightly`. Each session carries its real model (e.g. `claude-opus-4-8`, `gpt-5.5`, `composer-2.5`) and a faithful reconstruction of every edited file (full GUI post-edit snapshots; CLI `Write`/`ApplyPatch`/`StrReplace`/`Delete` tool calls replayed against captured reads). A database a running editor holds locked is read via SQLite's `immutable` mode rather than skipped. Attribution falls back to `unknown` only for GUI chats left on the "Auto" model.
 - Default scope is the current repository (sessions whose cwd or edits touch the repo root); `--global` ranks across every repo on the machine. `--since`, `--limit`, and `--provider` bound the work.
 - `--json` emits a structured leaderboard (`{ schemaVersion, scope, models, providers, best, worst, … }`); the terminal output shows the top models and per-tool tables with a single score bar (the confidence-weighted score) and a best/worst callout.
-- Coverage is honest about its limits: Codex shell-based edits are not faithfully reconstructable (surfaced as skipped), the Cursor composer database requires `node:sqlite` (Node 22.13+) and covers GUI agent sessions (not cursor-agent CLI runs), and the score requires network access.
+- Coverage is honest about its limits: Codex shell-based edits are not faithfully reconstructable (surfaced as skipped), reading any Cursor database requires `node:sqlite` (Node 22.13+), and the score requires network access.
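
The changeset says a database that a running editor holds locked is read through SQLite's `immutable` mode. SQLite exposes that mode as a query parameter on a URI filename, so one plausible shape is a small helper that turns a path into such a URI. The sketch below is illustrative only: `toImmutableSqliteUri` is a hypothetical name, the commit's own `openReadOnlySqlite` helper is not shown in this diff, and whether it builds the URI this way is an assumption. Windows drive-letter handling is left out.

```typescript
// Hypothetical helper (not from the commit): build a SQLite URI filename that
// requests a read-only, immutable open for an absolute POSIX path.
const toImmutableSqliteUri = (absolutePath: string): string => {
  // SQLite decodes %HH escapes and treats "?" and "#" as URI delimiters,
  // so those three characters must be escaped inside the path portion.
  const encodedPath = absolutePath
    .replace(/%/g, "%25")
    .replace(/\?/g, "%3f")
    .replace(/#/g, "%23");
  // mode=ro opens read-only; immutable=1 tells SQLite to skip locking and
  // change detection for this file.
  return `file:${encodedPath}?mode=ro&immutable=1`;
};

// Assumed example path; real Cursor paths differ per platform.
const exampleUri = toImmutableSqliteUri("/home/u/Cursor/state.vscdb");
console.log(exampleUri); // file:/home/u/Cursor/state.vscdb?mode=ro&immutable=1
```

Immutable mode skips locking and change detection, so it suits a best-effort read only: if the editor writes to the file while it is being read, queries may return inconsistent results.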

packages/core/src/index.ts

Lines changed: 1 addition & 0 deletions
@@ -88,6 +88,7 @@ export * from "./utils/define-config.js";
 export * from "./utils/group-by.js";
 export * from "./utils/has-published-fix-recipe.js";
 export * from "./utils/is-large-minified-file.js";
+export * from "./utils/is-path-inside.js";
 export * from "./utils/list-source-files.js";
 export * from "./utils/map-with-concurrency.js";
 export * from "./utils/match-glob-pattern.js";

packages/core/src/materialize-source-tree.ts

Lines changed: 2 additions & 12 deletions
@@ -3,24 +3,14 @@ import fs from "node:fs";
 import path from "node:path";
 import { STAGED_FILES_PROJECT_CONFIG_FILENAMES } from "./constants.js";
 import type { ReactDoctorError } from "./errors.js";
+import { isPathInside } from "./utils/is-path-inside.js";

 export interface MaterializedTree {
   readonly tempDirectory: string;
   readonly materializedFiles: ReadonlyArray<string>;
   readonly cleanup: () => void;
 }

-/**
- * Zip-Slip defense: relative paths come from git (`diff --name-only`), which
- * normalizes during ordinary adds, but a crafted index/pack/symlinked tree can
- * smuggle `..` segments that escape the temp root. Resolve against the temp dir
- * and reject anything that lands outside before writing.
- */
-const isPathInsideDirectory = (childAbsolutePath: string, parentAbsolutePath: string): boolean => {
-  const relative = path.relative(parentAbsolutePath, childAbsolutePath);
-  return Boolean(relative) && !relative.startsWith("..") && !path.isAbsolute(relative);
-};
-
 /**
  * Writes a set of source files (supplied by `readContent` — e.g.
  * `git show <ref>:<path>` for a baseline tree, or `git show :<path>` for the
@@ -44,7 +34,7 @@ export const materializeSourceTree = (input: {
     const content = yield* input.readContent(relativePath).pipe(Effect.orElseSucceed(() => null));
     if (content === null) continue;
     const candidateTargetPath = path.resolve(resolvedTempDirectory, relativePath);
-    if (!isPathInsideDirectory(candidateTargetPath, resolvedTempDirectory)) continue;
+    if (!isPathInside(candidateTargetPath, resolvedTempDirectory)) continue;
     yield* Effect.sync(() => {
       fs.mkdirSync(path.dirname(candidateTargetPath), { recursive: true });
       fs.writeFileSync(candidateTargetPath, content);

packages/react-doctor/src/stats/is-path-inside.ts renamed to packages/core/src/utils/is-path-inside.ts

Lines changed: 7 additions & 0 deletions
@@ -9,6 +9,13 @@ export interface IsPathInsideOptions {
  * `true` when `childPath` resolves within `parentPath`. By default the parent
  * directory itself does not count (the strict zip-slip guard); pass
  * `allowSame: true` to treat an exact match as inside (scope membership).
+ *
+ * Zip-Slip defense: relative paths can arrive from untrusted sources — a
+ * crafted git index/pack/symlinked tree, or a reconstructed agent transcript —
+ * and smuggle `..` segments that escape a temp root. Resolve against the parent
+ * and reject anything that lands outside before writing. This is the one
+ * audited copy of that guard, shared across the staged/baseline scan paths and
+ * the stats reconstruction tree so the two cannot drift.
  */
 export const isPathInside = (
   childPath: string,
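
The hunk cuts off before the body of `isPathInside`, so the shared implementation is not visible here. As a rough sketch of the guard the doc comment describes, the version below combines the removed `isPathInsideDirectory` check with an `allowSame` option. The body and the name `isPathInsideSketch` are assumptions for illustration, not the code in `packages/core`.

```typescript
import * as path from "node:path";

// Mirrors the option named in the doc comment; the real interface may differ.
interface IsPathInsideSketchOptions {
  readonly allowSame?: boolean;
}

// Assumed body: resolve both paths, then inspect the relative path between them.
const isPathInsideSketch = (
  childPath: string,
  parentPath: string,
  options: IsPathInsideSketchOptions = {},
): boolean => {
  const relative = path.relative(path.resolve(parentPath), path.resolve(childPath));
  // An empty relative path means child and parent are the same directory.
  if (relative === "") return options.allowSame === true;
  // A leading ".." or an absolute result means the child escaped the parent.
  return !relative.startsWith("..") && !path.isAbsolute(relative);
};

// Hypothetical temp root and a hostile relative path, for illustration only.
const tempRoot = path.resolve("/tmp/stats-root");
const safeTarget = path.resolve(tempRoot, "src/App.tsx");
const hostileTarget = path.resolve(tempRoot, "../../etc/passwd");

console.log(isPathInsideSketch(safeTarget, tempRoot)); // true
console.log(isPathInsideSketch(hostileTarget, tempRoot)); // false
console.log(isPathInsideSketch(tempRoot, tempRoot, { allowSame: true })); // true
```

The `startsWith("..")` test is kept from the removed helper. It is conservative: it also rejects a legitimate child whose first segment merely begins with two dots.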

packages/react-doctor/src/cli/commands/stats.ts

Lines changed: 1 addition & 3 deletions
@@ -80,7 +80,6 @@ export const statsAction = async (flags: StatsFlags): Promise<void> => {
   // ora renders to stderr; suppress it in JSON mode so the run stays quiet.
   const progress = flags.json ? null : spinner("Looking through your agent history…").start();
   let report: StatsReport;
-  let providerCount: number;
   try {
     const sessions = await discoverSessions(root, scope, (foundCount) =>
       progress?.update(`Looking through your agent history… (${foundCount} found)`),
@@ -92,7 +91,6 @@ export const statsAction = async (flags: StatsFlags): Promise<void> => {
     });
     progress?.update("Scoring…");
     const aggregated = await aggregateStats(results, userConfig);
-    providerCount = aggregated.providers.length;

     report = {
       scope: scope.global ? "global" : "repo",
@@ -122,7 +120,7 @@ export const statsAction = async (flags: StatsFlags): Promise<void> => {
   recordCount(METRIC.statsRun, 1, {
     scope: report.scope,
     sessions: report.sessionsAnalyzed,
-    providers: providerCount,
+    providers: report.providers.length,
     provider: scope.provider ?? "all",
   });

packages/react-doctor/src/cli/index.ts

Lines changed: 1 addition & 1 deletion
@@ -235,7 +235,7 @@ program
   .option("-c, --cwd <cwd>", "working directory", process.cwd())
   .option("--color", "force colored output")
   .option("--no-color", "disable colored output (also honors NO_COLOR)")
-  .action((location, options) => whyAction(location, options));
+  .action(whyAction);

 program
   .command("install")

packages/react-doctor/src/stats/aggregate-stats.ts

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ const upsert = (
  * mean; high-evidence groups keep their raw score. Returns the raw score when
  * there's no prior.
  */
-export const confidenceWeightedScore = (
+const confidenceWeightedScore = (
   rawScore: number | null,
   priorScore: number | null,
   filesScanned: number,
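
The body of `confidenceWeightedScore` is outside this hunk. To make the idea of regressing a score toward a prior by its evidence concrete, here is a minimal shrinkage sketch. The function name, the constant `PRIOR_STRENGTH_FILES`, the weighting formula, and the `null` handling for a missing raw score are all assumptions; the real function also discounts by session count, which is omitted here.

```typescript
// Hypothetical constant: the number of files at which a group's own score and
// the prior carry equal weight.
const PRIOR_STRENGTH_FILES = 20;

// Assumed shrinkage: a weighted average of the raw score and the prior, where
// the weight on the raw score grows with the number of files scanned.
const shrinkTowardPrior = (
  rawScore: number | null,
  priorScore: number | null,
  filesScanned: number,
): number | null => {
  if (rawScore === null) return null; // nothing to rank (assumed behaviour)
  if (priorScore === null) return rawScore; // no prior: keep the raw score
  const evidence = Math.max(0, filesScanned);
  const weight = evidence / (evidence + PRIOR_STRENGTH_FILES);
  return weight * rawScore + (1 - weight) * priorScore;
};

// Two clean files barely move a perfect score away from a prior of 60,
// while two hundred files keep it close to the raw value.
console.log(shrinkTowardPrior(100, 60, 2)); // about 63.6
console.log(shrinkTowardPrior(100, 60, 200)); // about 96.4
```

With this shape a model cannot top the board on a tiny sample, because its score stays near the global mean until enough files accumulate.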

packages/react-doctor/src/stats/coerce.ts

Lines changed: 4 additions & 0 deletions
@@ -6,6 +6,10 @@
 export const asString = (value: unknown): string | undefined =>
   typeof value === "string" && value.length > 0 ? value : undefined;

+/** Narrow an unknown to a string, preserving the empty string (unlike `asString`). */
+export const asNullableString = (value: unknown): string | null =>
+  typeof value === "string" ? value : null;
+
 /** Narrow an unknown to a plain object record, else undefined. */
 export const asRecord = (value: unknown): Record<string, unknown> | undefined =>
   value && typeof value === "object" && !Array.isArray(value)
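
The two string narrows differ only in how they treat the empty string. The snippet below repeats both definitions from the hunk and runs them on the same inputs. The motivating case in the comments, a file whose post-edit content is legitimately empty, is an assumption about why the distinction matters and is not stated in the diff.

```typescript
// Copied from the hunk above so the example is self-contained.
const asString = (value: unknown): string | undefined =>
  typeof value === "string" && value.length > 0 ? value : undefined;

const asNullableString = (value: unknown): string | null =>
  typeof value === "string" ? value : null;

// Assumed scenario: a snapshot field that may hold an empty file.
const emptySnapshot: unknown = "";
const missingSnapshot: unknown = undefined;

console.log(asString(emptySnapshot)); // undefined: the empty file is lost
console.log(asNullableString(emptySnapshot)); // "": the empty file survives
console.log(asNullableString(missingSnapshot)); // null: genuinely absent
```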
Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
+import { asRecord } from "./coerce.js";
+import { openReadOnlySqlite } from "./open-sqlite.js";
+
+// The Cursor CLI agent (`~/.cursor` / `~/.cursor-nightly`) stores each chat as
+// its own content-addressed SQLite store, distinct from the GUI's single
+// `state.vscdb`. The `meta` table holds one row whose `value` is hex-encoded
+// JSON (the latest root blob id + last-used model); the `blobs` table maps a
+// sha256 id to either a message (JSON: `{ role, content }`) or the binary root
+// manifest. The manifest is a protobuf-style flat list of `0x0a 0x20` followed
+// by a 32-byte blob id, giving the conversation's messages in order.
+
+export interface CursorCliMessage {
+  readonly role: string;
+  readonly content: unknown;
+}
+
+export interface CursorCliStore {
+  readonly lastUsedModel: string | null;
+  readonly messages: CursorCliMessage[];
+}
+
+const MANIFEST_RECORD_TAG = 0x0a;
+const MANIFEST_ID_LENGTH = 0x20;
+const MANIFEST_RECORD_LENGTH = 2 + MANIFEST_ID_LENGTH;
+
+/**
+ * The conversation's message blob ids, in order, read from the leading run of
+ * `[0x0a, 0x20, <32-byte id>]` records. Trailing protobuf fields after the run
+ * are ignored; a manifest that doesn't start with the run yields `[]`.
+ */
+const parseManifestBlobIds = (manifest: Buffer): string[] => {
+  const ids: string[] = [];
+  let offset = 0;
+  while (
+    offset + MANIFEST_RECORD_LENGTH <= manifest.length &&
+    manifest[offset] === MANIFEST_RECORD_TAG &&
+    manifest[offset + 1] === MANIFEST_ID_LENGTH
+  ) {
+    ids.push(manifest.subarray(offset + 2, offset + MANIFEST_RECORD_LENGTH).toString("hex"));
+    offset += MANIFEST_RECORD_LENGTH;
+  }
+  return ids;
+};
+
+/** blobs.data is a BLOB (Uint8Array); meta.value is hex-encoded TEXT. */
+const toBuffer = (value: unknown): Buffer | null => {
+  if (value instanceof Uint8Array) return Buffer.from(value);
+  if (typeof value === "string") return Buffer.from(value, "hex");
+  return null;
+};
+
+/**
+ * Read a Cursor CLI per-session `store.db`: the last-used model and every
+ * conversation message in order. Returns `null` when the store can't be opened
+ * (older Node without `node:sqlite`, or an unreadable/locked file) or has no
+ * usable `meta` row; the messages array is empty when the manifest is missing.
+ */
+export const readCursorCliStore = (storeDbPath: string): CursorCliStore | null => {
+  const database = openReadOnlySqlite(storeDbPath);
+  if (!database) return null;
+  try {
+    const metaRow = asRecord(database.prepare("SELECT value FROM meta LIMIT 1").get());
+    const metaValue = metaRow && typeof metaRow.value === "string" ? metaRow.value : null;
+    if (!metaValue) return null;
+    let meta: Record<string, unknown> | undefined;
+    try {
+      meta = asRecord(JSON.parse(Buffer.from(metaValue, "hex").toString("utf8")));
+    } catch {
+      return null;
+    }
+    if (!meta) return null;
+
+    const lastUsedModel = typeof meta.lastUsedModel === "string" ? meta.lastUsedModel : null;
+    const latestRootBlobId =
+      typeof meta.latestRootBlobId === "string" ? meta.latestRootBlobId : null;
+    if (!latestRootBlobId) return { lastUsedModel, messages: [] };
+
+    const blobStatement = database.prepare("SELECT data FROM blobs WHERE id = ?");
+    const blobBuffer = (id: string): Buffer | null => {
+      const row = asRecord(blobStatement.get(id));
+      return row ? toBuffer(row.data) : null;
+    };
+
+    const manifest = blobBuffer(latestRootBlobId);
+    if (!manifest) return { lastUsedModel, messages: [] };
+
+    const messages: CursorCliMessage[] = [];
+    for (const blobId of parseManifestBlobIds(manifest)) {
+      const raw = blobBuffer(blobId);
+      if (!raw) continue;
+      const text = raw.toString("utf8");
+      if (!text.startsWith("{")) continue;
+      let message: Record<string, unknown> | undefined;
+      try {
+        message = asRecord(JSON.parse(text));
+      } catch {
+        continue;
+      }
+      if (message && typeof message.role === "string") {
+        messages.push({ role: message.role, content: message.content });
+      }
+    }
+    return { lastUsedModel, messages };
+  } catch {
+    // A locked or unreadable store can throw mid-read; skip it rather than
+    // sinking the whole stats run.
+    return null;
+  } finally {
+    try {
+      database.close();
+    } catch {
+      // Already closed or never fully opened — nothing to release.
+    }
+  }
+};
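
To see the manifest layout in action, the sketch below builds a synthetic manifest of two `[0x0a, 0x20, <32-byte id>]` records followed by a few unrelated trailing bytes and reads the ids back. It restates the parsing loop over a plain `Uint8Array` so it runs without a database. `readManifestIds` and the fill-byte ids are made up for illustration, and real manifests come from the `blobs` table.

```typescript
const RECORD_TAG = 0x0a;
const ID_LENGTH = 0x20;
const RECORD_LENGTH = 2 + ID_LENGTH;

// Same loop as the parser in the diff, but hex-encoding by hand so the
// example does not depend on Node's Buffer.
const readManifestIds = (manifest: Uint8Array): string[] => {
  const ids: string[] = [];
  let offset = 0;
  while (
    offset + RECORD_LENGTH <= manifest.length &&
    manifest[offset] === RECORD_TAG &&
    manifest[offset + 1] === ID_LENGTH
  ) {
    const idBytes = Array.from(manifest.subarray(offset + 2, offset + RECORD_LENGTH));
    ids.push(idBytes.map((byte) => byte.toString(16).padStart(2, "0")).join(""));
    offset += RECORD_LENGTH;
  }
  return ids;
};

// Synthetic record: tag, length, then 32 copies of one fill byte as a fake id.
const fakeRecord = (fill: number): number[] => [
  RECORD_TAG,
  ID_LENGTH,
  ...new Array<number>(ID_LENGTH).fill(fill),
];

// Two records, then three trailing bytes that are not part of the run.
const manifest = Uint8Array.from([...fakeRecord(0x11), ...fakeRecord(0x22), 0x12, 0x01, 0xff]);
const ids = readManifestIds(manifest);

console.log(ids.length); // 2
console.log(ids[0] === "11".repeat(32)); // true
console.log(readManifestIds(Uint8Array.from([0x12, 0x00])).length); // 0
```

Stopping at the first record that does not match the tag and length keeps the parser tolerant of trailing fields it does not understand, at the cost of silently truncating a manifest whose layout changes.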
