
fix(prune): broaden TTL-archive filter to include stale_spawn_dead_pane (PR-B G3 #1) #1636

Merged
automagik-genie merged 1 commit into dev from fix/prune-stale-spawn-reason
May 4, 2026

Conversation

@automagik-genie
Contributor

Summary

cli-noise-and-hygiene-cleanup G3 deliverable #1: broaden the reason filter in archiveExhaustedZombies and listExhaustedZombies from 'dead_pane_zombie' only to IN ('dead_pane_zombie', 'stale_spawn_dead_pane').

Before this fix, genie prune --zombies ignored spawning-flavor dead-pane workers, so they accumulated in genie ls indefinitely after their auto_resume budget was exhausted. After the fix, both flavors are eligible for TTL-based archival once that budget is spent.

Behavior for dead_pane_zombie rows is unchanged. TTL + auto_resume=false guards still apply.

Files

  • src/lib/agent-registry.ts (+10 / -8) — broaden SQL filter on both queries + update doc comment
  • src/lib/__tests__/zombie-spawns.test.ts (+14 / -0) — source-grep regression test

Test plan

  • typecheck clean
  • biome clean
  • Smoke: seeded 2h-old stale_spawn_dead_pane row → listExhaustedZombies(1) now surfaces it (returned 0 pre-fix)
  • CI green on GitHub
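
The smoke check above can be modelled in memory to show why the count moves from 0 to 1. This is only a sketch: the real functions run SQL against the registry and audit_events, and `ZombieRow`, `taggedAt`, and `listExhausted` are hypothetical names introduced here for illustration.

```typescript
// In-memory model only; the real queries run SQL against agents + audit_events.
type Reason = 'dead_pane_zombie' | 'stale_spawn_dead_pane';

interface ZombieRow {
  id: string;
  reason: Reason;      // reason recorded on the state_changed audit event
  autoResume: boolean; // false once the auto_resume budget is spent
  taggedAt: Date;      // assumed field: when the reconciler tagged the row
}

const OLD_REASONS: readonly Reason[] = ['dead_pane_zombie'];
const NEW_REASONS: readonly Reason[] = ['dead_pane_zombie', 'stale_spawn_dead_pane'];

// Hypothetical stand-in for listExhaustedZombies(ttlHours).
function listExhausted(
  rows: ZombieRow[],
  ttlHours: number,
  reasons: readonly Reason[],
  now: Date,
): ZombieRow[] {
  const cutoff = now.getTime() - ttlHours * 60 * 60 * 1000;
  return rows.filter(
    (r) =>
      reasons.indexOf(r.reason) !== -1 && // reason guard (the part this PR broadens)
      !r.autoResume &&                    // auto_resume=false guard (unchanged)
      r.taggedAt.getTime() <= cutoff,     // TTL guard (unchanged)
  );
}

// Seed one 2h-old stale_spawn_dead_pane row and list with a 1h TTL.
const smokeNow = new Date('2026-05-04T12:00:00Z');
const seededRows: ZombieRow[] = [
  {
    id: 'w1',
    reason: 'stale_spawn_dead_pane',
    autoResume: false,
    taggedAt: new Date('2026-05-04T10:00:00Z'),
  },
];

const countBefore = listExhausted(seededRows, 1, OLD_REASONS, smokeNow).length; // 0 with the old filter
const countAfter = listExhausted(seededRows, 1, NEW_REASONS, smokeNow).length;  // 1 with the broadened filter
```

Only the reason guard differs between the two calls; the TTL and auto_resume guards are the same in both.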

Wish + sequencing

🤖 Generated with Claude Code

Reconciler tags two flavors of dead-pane workers:
  - dead_pane_zombie       (active state → pane died)
  - stale_spawn_dead_pane  (spawning state → pane died before ready)
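
As a rough illustration of the two tags listed above, the mapping from the worker's state at pane death to the recorded reason could be written as follows. The reconciler's actual code is not shown in this PR, so `StateAtDeath` and `deadPaneReason` are hypothetical names.

```typescript
// Sketch only: the reconciler's real implementation is not part of this PR.
type StateAtDeath = 'active' | 'spawning';
type DeadPaneReason = 'dead_pane_zombie' | 'stale_spawn_dead_pane';

// Hypothetical helper: picks the reason tag from the state the worker
// was in when its pane died.
function deadPaneReason(stateAtDeath: StateAtDeath): DeadPaneReason {
  if (stateAtDeath === 'active') {
    return 'dead_pane_zombie'; // pane died while the worker was active
  }
  return 'stale_spawn_dead_pane'; // pane died before the worker became ready
}
```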

`archiveExhaustedZombies` and `listExhaustedZombies` only matched
`dead_pane_zombie`, so the spawning-flavor zombies stayed visible in
`genie ls` forever even after auto_resume exhaustion.

Broaden the audit_events EXISTS filter in both queries to match either
reason. Behavior unchanged for `dead_pane_zombie` rows; pre-existing
TTL + auto_resume=false guards still apply.

Wish: cli-noise-and-hygiene-cleanup G3 (deliverable #1).
Subsequent PR adds `genie prune --errored` mode (G3 deliverables #2-5).

Smoke: seeded a 2h-old stale_spawn_dead_pane row → listExhaustedZombies(1)
now returns it (was 0 before).
@coderabbitai

coderabbitai Bot commented May 4, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ae0b24de-f2b2-40d5-9294-906d859f0fa4

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@automagik-genie automagik-genie merged commit 0581006 into dev May 4, 2026
16 checks passed
@automagik-genie automagik-genie deleted the fix/prune-stale-spawn-reason branch May 4, 2026 16:55
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request expands the zombie agent archival criteria to include the 'stale_spawn_dead_pane' reason, updating the documentation, archival and listing logic, and adding a consistency test. The reviewer suggests also including the 'stale_spawn' reason to more comprehensively clean up exhausted agents that failed during the initial spawn phase, noting that documentation and tests should be updated to reflect this addition.

Comment thread src/lib/agent-registry.ts
* by the scheduler's exhaustion branch. Without this TTL, such rows stayed
* visible in `genie ls` forever (#1293), holding registry slots and confusing
* users into thinking the agent is still recoverable.
* `reason IN ('dead_pane_zombie', 'stale_spawn_dead_pane')` AND whose

medium

If stale_spawn is added to the archival filter, this doc comment should be updated to include it in the list of reasons that define a zombie.

 * reason IN ('dead_pane_zombie', 'stale_spawn_dead_pane', 'stale_spawn') AND whose

Comment thread src/lib/agent-registry.ts
Comment on lines +623 to +624
* `reason IN ('dead_pane_zombie', 'stale_spawn_dead_pane')` —
* both reconciler reasons indicate a dead pane and are TTL-eligible.

medium

Update the documentation to reflect that both dead-pane and failed-spawn reasons are now TTL-eligible for archival.

 *      reason IN ('dead_pane_zombie', 'stale_spawn_dead_pane', 'stale_spawn') 
 *      these reconciler reasons indicate a dead pane or failed spawn and are TTL-eligible.

Comment thread src/lib/agent-registry.ts
AND e.entity_id = a.id
AND e.event_type = 'state_changed'
- AND e.details->>'reason' = 'dead_pane_zombie'
+ AND e.details->>'reason' IN ('dead_pane_zombie', 'stale_spawn_dead_pane')

medium

Consider including the stale_spawn reason in this filter. Agents that fail to spawn initially (Pass 1 of the reconciler) and subsequently exhaust their auto_resume budget are also effectively zombies that clutter the registry. Since Pass 1 already excludes dir: agents, adding stale_spawn here would provide more comprehensive cleanup of exhausted agents without affecting permanent directory placeholders. Additionally, note that this filter string is duplicated in listExhaustedZombies, which increases maintenance risk.

            AND e.details->>'reason' IN ('dead_pane_zombie', 'stale_spawn_dead_pane', 'stale_spawn')
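
One way to address the duplication risk raised in this comment is to build the fragment once and interpolate it into both queries. This is a sketch under stated assumptions, not what the PR does: the merged code keeps the literal inline in both places, and `ZOMBIE_REASONS`, `reasonFilterSql`, and `ZOMBIE_REASON_FILTER` are hypothetical names.

```typescript
// Hypothetical shared constant; the PR itself keeps the literal inline in both queries.
const ZOMBIE_REASONS = ['dead_pane_zombie', 'stale_spawn_dead_pane'] as const;

// Builds the SQL fragment from fixed, code-owned literals.
// Not safe for user-supplied values: nothing here escapes quotes.
function reasonFilterSql(alias: string, reasons: readonly string[]): string {
  const list = reasons.map((r) => `'${r}'`).join(', ');
  return `${alias}.details->>'reason' IN (${list})`;
}

// Single source of truth that both archiveExhaustedZombies and
// listExhaustedZombies could interpolate into their EXISTS subquery.
const ZOMBIE_REASON_FILTER = reasonFilterSql('e', ZOMBIE_REASONS);
```

A trade-off worth noting: the regression test in this PR greps the source for the exact literal, so it would need to assert on the constant instead if the filter were built this way.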

Comment thread src/lib/agent-registry.ts
AND e.entity_id = a.id
AND e.event_type = 'state_changed'
- AND e.details->>'reason' = 'dead_pane_zombie'
+ AND e.details->>'reason' IN ('dead_pane_zombie', 'stale_spawn_dead_pane')

medium

This filter should be kept in sync with the one in archiveExhaustedZombies to ensure the dry-run output remains accurate and consistent with the actual archival logic.

            AND e.details->>'reason' IN ('dead_pane_zombie', 'stale_spawn_dead_pane', 'stale_spawn')

// accumulating forever in `genie ls` even after auto_resume exhaustion.
const source = readFileSync(join(__dirname, '..', 'agent-registry.ts'), 'utf-8');

const archiveFilter = "e.details->>'reason' IN ('dead_pane_zombie', 'stale_spawn_dead_pane')";

medium

If the stale_spawn reason is added to the registry filters, this test expectation must be updated to match the new SQL string, as the test relies on exact source-code matching.

Suggested change
- const archiveFilter = "e.details->>'reason' IN ('dead_pane_zombie', 'stale_spawn_dead_pane')";
+ const archiveFilter = "e.details->>'reason' IN ('dead_pane_zombie', 'stale_spawn_dead_pane', 'stale_spawn')";
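
For context on why the expectation is so sensitive to the SQL text, a source-grep check of this kind can be sketched as below. The real test reads agent-registry.ts with readFileSync; here a small inline string stands in for the file so the example is self-contained. `countOccurrences` and `sampleSource` are hypothetical, and whether the real test asserts on both queries is an assumption.

```typescript
// Counts non-overlapping occurrences of `needle` in `source`.
function countOccurrences(source: string, needle: string): number {
  if (needle.length === 0) return 0;
  return source.split(needle).length - 1;
}

const expectedFilter =
  "e.details->>'reason' IN ('dead_pane_zombie', 'stale_spawn_dead_pane')";

// Stand-in for the contents of agent-registry.ts (assumed shape, not the real file).
const sampleSource = [
  '-- archiveExhaustedZombies',
  `AND ${expectedFilter}`,
  '-- listExhaustedZombies',
  `AND ${expectedFilter}`,
].join('\n');

// Assumption: the filter should appear once per query, so twice in total.
const filterCount = countOccurrences(sampleSource, expectedFilter);
```

Because the match is on the exact string, any change to spacing, quoting, or the reason list in either query makes the count drop, which is what flags the two filters drifting apart.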

namastex888 added a commit that referenced this pull request May 7, 2026
…coped per reviewer

Lands the wish doc that scaffolds PR-A (#1634) and PR-B (#1636/#1637/#1638/
#1640/#1642), plus the 2026-05-07 PR-C draft + reviewer FIX-FIRST corrections.

Why this is a separate docs commit:
- The wish file was authored 2026-05-04 but only ever sat in a stash; never
  committed despite shipping work referencing it. This commit lands the
  reference document for completed + pending work in one place.
- PR-C as originally drafted had three invalid premises against live
  4.260507.1 (G3 amendment already implemented at scheduler-daemon.ts:1296;
  G9 line is on stderr not stdout; G10 design assumes binary-spawn that the
  HTTP probe doesn't do). Reviewer corrections folded in.
- Only G8 (kill-path shadow+UUID dedup) survives intact — file path
  corrected to src/term-commands/agents.ts:2817 (handleWorkerKill).
- G9 reframed as stderr-noise reduction (DEBUG=pgserve gating).
- G10 deferred pending /trace into update.ts:362.

QA dogfooding-72h artifacts (AUDIT.md, QA-PLAN.md) document the 72-h fix-audit
sweep that surfaced the bugs and triggered the wish update.

Refs: #1677

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2 participants