feat(cli): genie recover-orphans — attach orphaned Claude JSONLs to executor rows#1699
…xecutor rows

Backfills `executors.claude_session_id` for Claude Code session JSONLs that survived an executor crash, host reboot, or pre-#1684 spawn path that forgot to write the session row. Without this, `genie agent resume` cannot find the on-disk transcript even though Claude itself would happily resume it.

Surface:

    genie recover-orphans                      # default: list orphans grouped by agent cwd
    genie recover-orphans --list               # explicit dry-run
    genie recover-orphans --dir <cwd>          # restrict to one agent cwd
    genie recover-orphans --apply --newest     # auto-attach the newest orphan per dir
    genie recover-orphans --apply --uuid <id>

Algorithm:

1. Scan <claudeConfigDir>/projects/<encoded-cwd>/*.jsonl.
2. Cross-reference each session UUID against `executors.claude_session_id` (already-attached files are reported, not re-attached; idempotent).
3. Map encoded dir → agent via `agents.repo_path` (re-encoded to match Claude Code's directory-naming scheme); prefer `dir:` master rows.
4. Apply: insert/update `executors` row with `claudeSessionId = <uuid>`, `state = 'terminated'`, `metadata.source = 'recover-orphans'`. Sets `agents.current_executor_id` only if currently null. Refuses to overwrite a live executor (heal-not-wipe).
5. Audit: emit `executor.recovered_from_orphan` per attach.

Manual smoke against ~/.claude/projects/ surfaces 436 orphans across the four main genie agent dirs (genie/felipe/scout/configure), more than the 181-baseline this CLI was built to recover.

Tests cover: filename/UUID gating, first-message preview parsing, canonical-agent picker, scan + parse round-trip, --apply --newest attaches and links, idempotency on rerun, refusal to overwrite a live executor, and --list non-mutation.

Closes task #213. Companion to #1698 (P0 hotfix that prevents future leaks).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
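Steps 1 and 2 of the algorithm can be sketched as a pure classification over file names. This is a minimal illustration, not the PR's code: `classifySessions` and `encodeCwd` are hypothetical names, and the encoding rule (every non-alphanumeric character of the cwd becomes `-`) is an assumption about Claude Code's directory naming that this PR does not spell out.

```typescript
// Sketch only. Function names are illustrative, not the PR's actual helpers.

const UUID_JSONL =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\.jsonl$/i;

/** Accept only `<uuid>.jsonl`; names such as `<uuid>.jsonl.bak` fail the anchored match. */
function isSessionJsonl(fileName: string): boolean {
  return UUID_JSONL.test(fileName);
}

/** ASSUMPTION: project dirs are the cwd with each non-alphanumeric char replaced by `-`. */
function encodeCwd(cwd: string): string {
  return cwd.replace(/[^a-zA-Z0-9]/g, '-');
}

interface ScanResult {
  orphans: string[];
  alreadyAttached: string[];
}

/** Split session files into orphans and sessions already present in `executors`. */
function classifySessions(
  fileNames: readonly string[],
  attachedIds: ReadonlySet<string>,
): ScanResult {
  const result: ScanResult = { orphans: [], alreadyAttached: [] };
  for (const name of fileNames) {
    if (!isSessionJsonl(name)) continue; // skip backups and non-UUID names
    const sessionId = name.slice(0, -'.jsonl'.length).toLowerCase();
    if (attachedIds.has(sessionId)) {
      result.alreadyAttached.push(sessionId); // reported, never re-attached
    } else {
      result.orphans.push(sessionId);
    }
  }
  return result;
}
```

Keeping the classification free of file-system and database access makes the idempotency property easy to test: a second run with the newly attached IDs in `attachedIds` yields no orphans.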
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8bf50c03ff
    claudeSessionId: candidate.sessionId,
    state: 'terminated',
    metadata,
Set ended_at when creating terminated recovered executors
This insert creates executors with state: 'terminated' but never sets ended_at, which leaves them looking live to this command’s own liveness check (isExecutorLive returns true when ended_at is null). In practice, after one recovery attach, later --apply/--uuid runs can be incorrectly skipped as “live executor” for that agent even though the recovered row is terminated; the row should be created with a non-null ended_at (or transitioned through the normal termination path).
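The interaction described here can be sketched as follows. The row shape and the `buildRecoveredExecutor` helper are hypothetical; only the rule that a null `ended_at` means "live" comes from the review comment.

```typescript
// Sketch only: ExecutorRow and buildRecoveredExecutor are illustrative names.

interface ExecutorRow {
  id: string;
  state: string;
  claudeSessionId: string;
  endedAt: Date | null;
  metadata: Record<string, unknown>;
}

/** Per the review: a row counts as live while its ended_at is null. */
function isExecutorLive(row: Pick<ExecutorRow, 'endedAt'>): boolean {
  return row.endedAt === null;
}

/** Build a recovered row that is terminated in both `state` and `endedAt`. */
function buildRecoveredExecutor(
  id: string,
  sessionId: string,
  now: Date = new Date(),
): ExecutorRow {
  return {
    id,
    state: 'terminated',
    claudeSessionId: sessionId,
    endedAt: now, // without this, the liveness check above would report the row as live
    metadata: { source: 'recover-orphans' },
  };
}
```

With `endedAt` populated at insert time, a later `--apply` run would no longer mistake the recovered row for a live executor.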
    const fd = readFileSync(jsonlPath, { encoding: 'utf-8', flag: 'r' });
    head = fd.slice(0, 16384);
Read only the JSONL head instead of loading entire files
The helper says it only inspects the first ~16 KiB, but readFileSync loads the full JSONL into memory before slicing. When scanning many large transcripts, this can cause significant memory and latency spikes during recover-orphans runs, especially in repos with long Claude histories; use a bounded read (e.g., open/read fixed-size buffer) so cost stays proportional to the intended head size.
Code Review
This pull request introduces the recover-orphans command to Genie, which scans for orphaned Claude session JSONL files and attaches them to database executor rows to enable session resumption. The implementation includes directory scanning, session mapping, and a dry-run mode. Review feedback suggests optimizing file I/O in readFirstUserMessagePreview by using openSync and readSync to avoid loading large files entirely into memory. Furthermore, it was noted that redundant database queries in the apply logic could be eliminated by reusing data already fetched during the initial scan.
     * `ended_at IS NULL`, the orphan is reported but not attached.
     */

    import { existsSync, readFileSync, readdirSync, statSync } from 'node:fs';
    const fd = readFileSync(jsonlPath, { encoding: 'utf-8', flag: 'r' });
    head = fd.slice(0, 16384);
The current implementation of readFirstUserMessagePreview reads the entire JSONL file into memory as a string before slicing the first 16KB. For large session logs, this is inefficient and can lead to high memory usage. It is better to read only the necessary amount of data using openSync and readSync. Using a hardcoded limit here is acceptable to prevent performance issues.
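    import { closeSync, openSync, readSync } from 'node:fs';

    // Read at most the first 16 KiB instead of loading the whole transcript.
    const buffer = Buffer.alloc(16384);
    const fd = openSync(jsonlPath, 'r');
    try {
      const bytesRead = readSync(fd, buffer, 0, 16384, 0);
      // Decode only the bytes actually read; short files yield a shorter head.
      head = buffer.toString('utf-8', 0, bytesRead);
    } finally {
      closeSync(fd);
    }

References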
- It is acceptable to use hardcoded numeric limits (magic numbers) in non-critical fallback logic, especially when they serve as intentional caps to prevent performance issues like excessive I/O.
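Once the head is bounded, the preview step has to cope with a final line that was cut mid-record. The sketch below shows one way to do that. The function names are hypothetical, and the JSONL entry shape (`type: "user"` with `message.content` as either a string or an array of text blocks) is an assumption about Claude Code transcripts, not something this PR documents.

```typescript
// Sketch only. ASSUMED entry shape:
//   {"type":"user","message":{"content":"text" | [{"type":"text","text":"..."}]}}

/** Pull the user text out of one parsed entry, or null if it is not a user message. */
function extractUserText(entry: unknown): string | null {
  if (typeof entry !== 'object' || entry === null) return null;
  const rec = entry as { type?: unknown; message?: { content?: unknown } | null };
  if (rec.type !== 'user') return null;
  const content = rec.message?.content;
  if (typeof content === 'string') return content;
  if (Array.isArray(content)) {
    for (const block of content) {
      if (typeof block !== 'object' || block === null) continue;
      const b = block as { type?: unknown; text?: unknown };
      if (b.type === 'text' && typeof b.text === 'string') return b.text;
    }
  }
  return null;
}

/** Return a short preview of the first user message found in the head, if any. */
function firstUserMessagePreview(head: string, maxLen = 80): string | null {
  for (const line of head.split('\n')) {
    const trimmed = line.trim();
    if (trimmed === '') continue;
    let entry: unknown;
    try {
      entry = JSON.parse(trimmed);
    } catch {
      continue; // malformed or truncated line: skip rather than fail the scan
    }
    const text = extractUserText(entry);
    if (text !== null) return text.slice(0, maxLen);
  }
  return null;
}
```

Skipping unparseable lines means a record truncated at the 16 KiB boundary degrades to "no preview" instead of an exception.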
    const rows = await sql<{ current_executor_id: string | null }[]>`
      SELECT current_executor_id FROM agents WHERE id = ${agentRow.id} LIMIT 1
    `;
    const rows = await sql<{ current_executor_id: string | null }[]>`
      SELECT current_executor_id FROM agents WHERE id = ${s.agent.id} LIMIT 1
    `;
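The review summary suggests reusing data from the initial scan instead of issuing these lookups again. One possible shape, assuming the scan already carries the agent's current executor and its `ended_at` (the `ScannedAgent` type and `decideApply` function are hypothetical):

```typescript
// Sketch only: the types and field names here are illustrative.

interface ScannedAgent {
  id: string;
  currentExecutorId: string | null;
  /** ended_at of the current executor; null means it has not ended. */
  currentExecutorEndedAt: Date | null;
}

type ApplyDecision = 'attach-and-link' | 'attach-only' | 'skip-live';

/** Decide what --apply should do for one agent using only scan-time data. */
function decideApply(agent: ScannedAgent): ApplyDecision {
  // No current executor: attach and set agents.current_executor_id.
  if (agent.currentExecutorId === null) return 'attach-and-link';
  // Current executor has no ended_at: treat as live and refuse (heal-not-wipe).
  if (agent.currentExecutorEndedAt === null) return 'skip-live';
  // Current executor has ended: attach, but leave the existing link untouched.
  return 'attach-only';
}
```

The trade-off is staleness: a snapshot taken during the scan can miss an executor that starts before the apply step runs, so re-checking inside the write transaction may still be the safer choice.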
Brings main's session-id writer hotfix (#1698) and recover-orphans CLI (#1699) onto dev so the next dev → main PR triggers Version workflow's @latest npm publish (gated on '/dev' in commit message).

Conflict resolutions:
- src/genie-commands/session.ts: kept BOTH _deps injection from #1698 AND findOrCreateAgent UUID identity from wish #175 G3. Hotfix's claudeSessionId plumbing into createAndLinkExecutor preserved.
- src/genie.ts: additive; recover-orphans subcommand registered.
- src/lib/agent-directory.ts, executor-registry.ts, protocol-router.ts: surrounding context kept consistent with both branches' direction.
- src/__tests__/agent-team-inheritance.test.ts: adapted seedTemplate helper to post-migration-061 UUID-id + name lookup schema.

Carries main's other in-flight fixes:
- migrations 054 + 055 (subagent team inheritance, auto_resume default)
- agent-team-inheritance test fixture (132 LOC)
- release.yml + 044 test refinements
Summary
- `genie recover-orphans [--dir <cwd>] [--list] [--apply --newest|--uuid <id>]` backfills `executors.claude_session_id` for Claude session JSONLs that survived an executor crash, host reboot, or pre-#1684 (fix(registry): preserve native_team_enabled + provider across ON CONFLICT in register()) spawn path that forgot to write the session row

Algorithm
1. Scan `<claudeConfigDir>/projects/<encoded-cwd>/*.jsonl`
2. Cross-reference each session UUID against `executors.claude_session_id`; already-attached files are reported, not re-attached (idempotent)
3. Map encoded dir → agent via `agents.repo_path` (re-encoded to match Claude Code's dir-naming scheme); prefer `dir:` master rows; lex-smallest UUID otherwise
4. Apply: insert/update `executors` row with `claudeSessionId = <uuid>`, `state = 'terminated'`, `metadata.source = 'recover-orphans'`; set `agents.current_executor_id` only if currently null; refuse if a live executor exists (heal-not-wipe)
5. Audit: emit `executor.recovered_from_orphan` per attach

Manual smoke against `~/.claude/projects/`
436 orphans surfaced across the four main genie agent dirs, comfortably above the 181 baseline from the original brain plan. (The "unmapped" rows mean the agents in those dirs have a `repo_path` other than the cwd Claude was launched in; the operator can update `repo_path` or use `--uuid` for those.)

Test plan
- `encodeCwdForClaudeProjects` matches Claude Code's encoder, `isSessionJsonl` rejects backups + trimmed copies + non-UUID names, `readFirstUserMessagePreview` extracts first user text + tolerates malformed JSON, `pickCanonicalAgent` prefers `dir:*` then lex-smallest
- `--apply --newest` attaches the newest orphan and links it as `current_executor_id`; rerun is a no-op (idempotent); refuses to overwrite a live executor; `--list` never mutates
- `bun run typecheck`: clean
- `bun run lint`: no new errors (only pre-existing warnings in unrelated files)
- `genie recover-orphans --list` against real `~/.claude/projects/` lists 436 orphans across genie agent dirs
- `--help` renders all flags

Out of scope
- … `agents.repo_path` themselves, or uses `--uuid`)

Closes task #213.
🤖 Generated with Claude Code
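The canonical-agent rule from the description (prefer `dir:` rows, otherwise the lexicographically smallest ID) can be sketched as below. This is an illustration under one assumption: that the `dir:` prefix appears on the agent's `id` field. The PR does not say which column carries it, and the real `pickCanonicalAgent` may differ.

```typescript
// Sketch only. ASSUMPTION: the `dir:` marker is a prefix of the agent id.

interface AgentCandidate {
  id: string;
}

/** Pick one agent deterministically when several share the same repo_path. */
function pickCanonicalAgent<T extends AgentCandidate>(candidates: readonly T[]): T | null {
  if (candidates.length === 0) return null;
  // Plain code-unit comparison keeps the order independent of locale settings.
  const byId = (a: T, b: T): number => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0);
  const dirRows = candidates.filter((c) => c.id.startsWith('dir:'));
  // Prefer `dir:` rows when present; otherwise consider every candidate.
  const pool = dirRows.length > 0 ? dirRows : [...candidates];
  return pool.sort(byId)[0] ?? null;
}
```

A deterministic tie-break matters for idempotency: rerunning the command must map the same directory to the same agent, regardless of the order rows come back from the database.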