[Bug]: TUI compression continuation creates ghost sessions with messages but incomplete metadata, polluting session_search results #20001

@lwj-9650

Bug Description

Ghost sessions accumulate in state.db under a pattern not fully covered by the fix in #18370. These sessions contain real conversation messages but have incomplete metadata (api_call_count=0, title=NULL, end_reason=NULL), polluting session_search sort order.

Related to #12029 but with a different reproduction pattern — this happens in TUI/CLI-only setups (no gateway, no cron) and the ghost sessions contain messages (not empty stubs).

Environment

  • Hermes v0.12.0, installed from source
  • Platform: WSL2 (Linux)
  • Usage pattern: daily TUI sessions with frequent context compression

Reproduction Steps

  1. Use Hermes TUI daily with context compression enabled
  2. Allow multiple compression cycles within a single session (compression continuation)
  3. Use /new to start new sessions or let TUI recover after interruptions
  4. After extended usage, observe ghost sessions accumulating in state.db

Observed Behavior

Over time, state.db accumulates a large number of ghost sessions (over 50% of total) with the following pattern:

SELECT COUNT(*) FROM sessions
WHERE api_call_count = 0 AND title IS NULL AND end_reason IS NULL;
-- Returns a significant portion of total sessions

Metric               | Ghost sessions    | Normal sessions
---------------------|-------------------|----------------
api_call_count       | 0                 | >0
title                | NULL              | Set
end_reason           | NULL              | Set
parent_session_id    | NULL (all root)   | Mixed
Messages per session | Similar to normal | Normal

Key observations:

  • All ghost sessions are root sessions (parent_session_id IS NULL)
  • They contain real messages (not empty stubs) — with similar counts to normal sessions
  • Their content is duplicated from corresponding normal sessions (confirmed by first-user-message matching)
  • They accumulate steadily over time
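The first-user-message comparison behind these observations can be sketched roughly as follows. This is a minimal illustration, not Hermes code: only the sessions columns named in this report are known to exist, while the messages columns `id`, `role`, and `content` (and ordering by `id`) are assumptions about the schema, and both function names are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Ghost signature taken from this report.
GHOST_FILTER = "api_call_count = 0 AND title IS NULL AND end_reason IS NULL"


def first_user_message(conn, session_id):
    """Return the earliest user message of a session, or None if it has none.

    Assumes messages has (id, session_id, role, content); this is a guess.
    """
    row = conn.execute(
        "SELECT content FROM messages "
        "WHERE session_id = ? AND role = 'user' "
        "ORDER BY id LIMIT 1",
        (session_id,),
    ).fetchone()
    return row[0] if row else None


def ghosts_with_duplicates(conn):
    """Map each ghost session id to the normal sessions sharing its first user message."""
    normals_by_first = defaultdict(list)
    normal_ids = conn.execute(
        "SELECT id FROM sessions WHERE api_call_count > 0 ORDER BY id"
    ).fetchall()
    for (sid,) in normal_ids:
        first = first_user_message(conn, sid)
        if first is not None:
            normals_by_first[first].append(sid)

    ghost_ids = conn.execute(
        f"SELECT id FROM sessions WHERE {GHOST_FILTER} ORDER BY id"
    ).fetchall()
    return {
        sid: list(normals_by_first.get(first_user_message(conn, sid), []))
        for (sid,) in ghost_ids
    }
```

A ghost that maps to an empty list has no normal counterpart under this heuristic, so it would be worth inspecting by hand before deleting anything.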

Root Cause

Context compression continuation creates new session rows, but the TUI session recovery path does not update api_call_count, title, or end_reason on these rows.

Evidence of duplication: the same first user message appears in multiple ghost sessions and in exactly one normal session:

Ghost session A (api=0, msgs>50) → same first message
Ghost session B (api=0, msgs>50) → identical first message
Normal session  (api>0, msgs>100) → same first message, properly tracked

The pattern: each compression + session recovery creates a new root session row without updating metadata, leaving ghost records that look identical to normal sessions except for the missing api_call_count/title/end_reason.

Why #18370 Did Not Fix This

The prune_empty_ghost_sessions() from #18370 only matches:

WHERE source = 'tui'
  AND title IS NULL
  AND ended_at IS NOT NULL      -- Our ghosts have ended_at = NULL
  AND started_at < (now - 86400)
  AND NOT EXISTS (
      SELECT 1 FROM messages    -- Our ghosts HAVE messages
      WHERE messages.session_id = sessions.id
  )

Our ghost sessions fail both critical filters:

  1. ended_at IS NULL — sessions were never marked as ended
  2. Messages exist — these are not empty stubs

The lazy session creation fix prevents new empty stubs from being created on TUI open/close, but does not address the case where compression continuation creates sessions that receive messages but never get their metadata updated.
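The mismatch can be reproduced against a throwaway in-memory database. The sketch below copies the predicate quoted above and binds `now - 86400` as a parameter. The table layout is an assumption reduced to the columns this report mentions, `started_at`/`ended_at` are assumed to be epoch seconds, and `build_demo_db` and `matched_by_prune` are hypothetical helper names.

```python
import sqlite3

# Predicate as quoted in this report from #18370, with the cutoff parameterized.
PRUNE_PREDICATE_18370 = """
    source = 'tui'
    AND title IS NULL
    AND ended_at IS NOT NULL
    AND started_at < :cutoff
    AND NOT EXISTS (
        SELECT 1 FROM messages WHERE messages.session_id = sessions.id
    )
"""


def build_demo_db(now):
    """Create an in-memory DB with one empty stub and one ghost that has messages."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sessions (
            id TEXT PRIMARY KEY, source TEXT, title TEXT,
            started_at REAL, ended_at REAL
        );
        CREATE TABLE messages (
            id INTEGER PRIMARY KEY, session_id TEXT, content TEXT
        );
    """)
    old = now - 90000  # older than the 24 h cutoff
    conn.execute("INSERT INTO sessions VALUES ('empty_stub', 'tui', NULL, ?, ?)", (old, old + 10))
    conn.execute("INSERT INTO sessions VALUES ('ghost', 'tui', NULL, ?, NULL)", (old,))
    conn.execute("INSERT INTO messages (session_id, content) VALUES ('ghost', 'hello')")
    conn.commit()
    return conn


def matched_by_prune(conn, now):
    """Return the session ids the quoted predicate would select for pruning."""
    rows = conn.execute(
        f"SELECT id FROM sessions WHERE {PRUNE_PREDICATE_18370} ORDER BY id",
        {"cutoff": now - 86400},
    ).fetchall()
    return [sid for (sid,) in rows]
```

With `now = 1_000_000`, `matched_by_prune(build_demo_db(now), now)` selects only the empty stub. The ghost row is skipped because it has `ended_at = NULL` and at least one message.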

Impact

  1. session_search returns wrong results: The effective_last_active sort in list_sessions_rich() ranks ghost sessions high because their message timestamps are recent (from compression time), pushing actual latest sessions out of the top-N results.

  2. session_search auxiliary model timeouts: With many duplicate sessions, the auxiliary model has to process far more data than necessary, contributing to repeated timeouts observed in logs.

  3. State bloat: Ghost sessions can consume a large fraction of total message storage with duplicate data.

Workaround

Manual SQL cleanup (safe because all ghost content is preserved in normal sessions):

DELETE FROM messages WHERE session_id IN (
    SELECT id FROM sessions
    WHERE api_call_count = 0 AND title IS NULL AND end_reason IS NULL
);

DELETE FROM sessions
WHERE api_call_count = 0 AND title IS NULL AND end_reason IS NULL;
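For anyone who would rather not run the DELETE statements directly, the same cleanup could be wrapped in a script that counts first, optionally takes a backup, and deletes inside a single transaction. This is a sketch under assumptions: `cleanup_ghosts` is a hypothetical helper, not part of Hermes, and it relies only on the table and column names used in the SQL above.

```python
import sqlite3

# Same ghost signature as the manual SQL above.
GHOST_FILTER = "api_call_count = 0 AND title IS NULL AND end_reason IS NULL"


def cleanup_ghosts(conn, backup_to=None, dry_run=True):
    """Count ghost sessions and their messages; delete them unless dry_run.

    backup_to, if given, is an open sqlite3.Connection that receives a full
    copy of the database before anything is deleted.
    Returns (ghost_session_count, ghost_message_count).
    """
    n_sessions = conn.execute(
        f"SELECT COUNT(*) FROM sessions WHERE {GHOST_FILTER}"
    ).fetchone()[0]
    n_messages = conn.execute(
        "SELECT COUNT(*) FROM messages WHERE session_id IN "
        f"(SELECT id FROM sessions WHERE {GHOST_FILTER})"
    ).fetchone()[0]

    if dry_run:
        return n_sessions, n_messages

    if backup_to is not None:
        conn.backup(backup_to)  # stdlib online backup API (Python 3.7+)

    # The connection context manager commits on success and rolls back on error,
    # so messages and sessions are removed together or not at all.
    with conn:
        conn.execute(
            "DELETE FROM messages WHERE session_id IN "
            f"(SELECT id FROM sessions WHERE {GHOST_FILTER})"
        )
        conn.execute(f"DELETE FROM sessions WHERE {GHOST_FILTER}")
    return n_sessions, n_messages
```

A typical run would call it once with the default `dry_run=True` to check the counts, then again with `dry_run=False` and a backup connection (for example one opened on a copy path next to state.db) while Hermes is not running.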

Suggested Fix

Either:

  1. Broaden prune_empty_ghost_sessions() to also catch sessions where api_call_count = 0 AND title IS NULL AND ended_at IS NULL, even if they have messages (since their content is duplicated in the corresponding normal session).

  2. Fix the root cause: ensure the TUI compression continuation path writes api_call_count, title, and end_reason to the session row when a session is finalized. This would mirror how #18370 (fix: lazy session creation, defer DB row until first message) added _ensure_db_session(), extended to also update metadata on session end.
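As a rough illustration of option 2, the finalization step might look something like the sketch below. Everything here is hypothetical: `finalize_session` is not an existing Hermes function, the `end_reason` value shown in the usage note is invented, and `ended_at` is assumed to be stored as epoch seconds. The point is only that one UPDATE at session end would keep continuation rows from matching the ghost signature.

```python
import sqlite3
import time


def finalize_session(conn, session_id, *, api_call_count, end_reason,
                     title=None, ended_at=None):
    """Write end-of-session metadata to an existing sessions row.

    Hypothetical helper: an existing title is preserved, and ended_at
    defaults to the current time. Returns True if a row was updated.
    """
    if ended_at is None:
        ended_at = time.time()
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            """
            UPDATE sessions
               SET api_call_count = ?,
                   end_reason     = ?,
                   ended_at       = ?,
                   title          = COALESCE(title, ?)
             WHERE id = ?
            """,
            (api_call_count, end_reason, ended_at, title, session_id),
        )
    return cur.rowcount == 1
```

The compression continuation path could call this for the outgoing session before switching to the new row, for example `finalize_session(conn, old_id, api_call_count=n, end_reason="compression")`, where the reason string is a placeholder for whatever value the codebase actually uses.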

Metadata

Labels

  • P1: High (major feature broken, no workaround)
  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • comp/tui: Terminal UI (ui-tui/ + tui_gateway/)
  • type/bug: Something isn't working
