fix(gateway): /status reads token totals from SessionDB #5989

Closed
Tranquil-Flow wants to merge 2 commits into NousResearch:main from Tranquil-Flow:review/gateway-status-sessiondb-ready

Conversation

@Tranquil-Flow
Contributor

Summary

  • /status was reading session_entry.total_tokens, which is never updated after the token-persistence refactor (commit 20441cf) — real usage lives in SessionDB (SQLite)
  • Adds get_session_token_totals() helper to hermes_state.py that reads aggregated token counts from SessionDB
  • /status now prefers SessionDB totals and falls back to session_entry.total_tokens when the DB row is missing
  • Corrects None-handling so missing rows don't raise, they fall back cleanly

Fixes #5960.
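
For readers without the diff at hand, the helper described above can be sketched roughly as follows. This is a minimal illustration and not the PR's actual code: the `sessions` table name comes from this thread, but the `id` key column, the `cache_write_tokens` and `reasoning_tokens` column names, and the explicit `conn` parameter are assumptions made so the example is self-contained.

```python
import sqlite3
from typing import Optional

# Assumed column names; only input/output/cache_read are confirmed in this thread.
TOKEN_COLUMNS = (
    "input_tokens",
    "output_tokens",
    "cache_read_tokens",
    "cache_write_tokens",
    "reasoning_tokens",
)


def get_session_token_totals(conn: sqlite3.Connection, session_id: str) -> Optional[dict]:
    """Return per-column token counts plus their sum, or None if no row exists."""
    # COALESCE turns NULL columns into 0 so the sum never fails on partial rows.
    cols = ", ".join(f"COALESCE({c}, 0)" for c in TOKEN_COLUMNS)
    row = conn.execute(
        f"SELECT {cols} FROM sessions WHERE id = ?",  # `id` is a hypothetical key column
        (session_id,),
    ).fetchone()
    if row is None:
        # Missing row: signal the caller to fall back instead of raising.
        return None
    totals = dict(zip(TOKEN_COLUMNS, row))
    totals["total_tokens"] = sum(row)
    return totals
```

Returning `None` for a missing row, and not a dict of zeros, lets the caller distinguish "no data in SessionDB" from "a session that has genuinely used zero tokens".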

Files changed

  • hermes_state.py: get_session_token_totals() helper + None-row fallback fix
  • gateway/run.py: /status wired to use the new helper
  • tests/gateway/test_status_command.py: tests for SessionDB-preferred totals and session_store fallback
  • tests/test_hermes_state.py: unit tests for token aggregation and missing-row None behavior

Test plan

  • pytest tests/gateway/test_status_command.py tests/test_hermes_state.py passes
  • /status on a live session reports correct cumulative token counts
  • /status with no SessionDB row (fresh install / DB unavailable) falls back without error
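
The last two cases in the test plan come down to one decision: use the SessionDB totals whenever a row exists, and only otherwise use the stored session entry value. A rough sketch of that decision, using a hypothetical `resolve_status_tokens` function that is not part of the PR:

```python
from typing import Optional


def resolve_status_tokens(db_totals: Optional[dict], entry_total: Optional[int]) -> int:
    """Prefer SessionDB totals; fall back to the session entry only if the row is missing."""
    if db_totals is not None:
        # A row exists: trust it, even when its total is legitimately 0.
        return int(db_totals.get("total_tokens") or 0)
    # No row (fresh install, DB unavailable): use the stored value, treating None as 0.
    return int(entry_total or 0)
```

The check is `is not None` and not a truthiness test, so an existing row with a zero total is not mistaken for a missing row.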

/status was reading session_entry.total_tokens which is never kept in
sync after the token persistence refactor (commit 20441cf).  Real
token usage lives in SessionDB (SQLite).  Read from SessionDB via the
new get_session_token_totals() helper and fall back to
session_entry.total_tokens when SessionDB is unavailable.

Fixes NousResearch#5960.
@blasai1739217-cmyk

Friendly bump on this PR in case it fell through the cracks; would love a review when someone has a minute. Thanks!

@malaiwah
Contributor

malaiwah commented Apr 9, 2026

We tested this approach on a downstream fork that had carried a parallel "fix" for the same symptom — a fork-local commit had reintroduced the gateway-side accumulator pattern that 20441cf2 deliberately removed, and then layered a _lifetime_mirror delta-tracker on top to paper over the resulting quadratic blowup. Both the original fork commit and the band-aid were doing the wrong thing in the wrong layer.

We've now reverted both and adopted this PR's diff as-is (gitea PR + merge: see commit summary below). The result on a real production session that had been showing Tokens: 0 for ~12 hours of agent traffic:

Session ID: 20260409_113641_3e20c9f2
Tokens: 50,409,258
  input_tokens:    1,407,005
  output_tokens:     100,493
  cache_read_tokens: 48,901,760

get_session_token_totals reads from SessionDB cleanly, the total_tokens aggregation matches the agent's lifetime counters, and the SessionStore-fallback path is exercised correctly when the DB row is missing (we tested by clearing the persisted entry). All 202 tests in tests/test_hermes_state.py tests/gateway/test_status_command.py tests/gateway/test_session.py pass with the diff applied.
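
As a quick sanity check on the figures above, the three components shown do add up to the reported total. The snippet uses only the numbers quoted in this comment; any cache-write or reasoning counts are not shown here and are assumed to be zero for this session.

```python
# Values copied from the /status output quoted above.
reported = {
    "input_tokens": 1_407_005,
    "output_tokens": 100_493,
    "cache_read_tokens": 48_901_760,
}

total = sum(reported.values())
print(f"{total:,}")  # 50,409,258, matching the reported Tokens line

# Share of the total that comes from cache reads (roughly 97% here).
cache_share = reported["cache_read_tokens"] / total
print(f"{cache_share:.1%}")
```

Cache reads dominate the total for this session, which is why the aggregate looks much larger than input plus output alone.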

Strong +1 from a production user. This is the architecturally correct fix — /status should always have read from SessionDB once 20441cf2 moved the source of truth there. The fork's accumulator divergence is a cautionary tale of fixing symptoms instead of root causes; this PR deserves to be the canonical resolution of #5960.

One micro-suggestion (non-blocking): the docstring on get_session_token_totals might note that total_tokens includes cache reads/writes, which is what some downstream consumers expect for "tokens used" but might surprise users coming from raw provider billing semantics where cache reads have different costs. Just a comment clarification, not a behavior change.
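
To make the suggestion concrete, the note could read something like the stub below. The wording is only a proposal and the body is deliberately omitted; the signature shown is an assumption, not the PR's code.

```python
from typing import Optional


def get_session_token_totals(session_id: str) -> Optional[dict]:
    """Return aggregated token counts for a session, or None if it has no row.

    Note: ``total_tokens`` is the sum of every tracked column, including
    cache reads and cache writes. Cache reads can be priced differently
    from fresh input, so this figure measures tokens processed and should
    not be read as a billing estimate.
    """
    # Body intentionally left out; only the docstring wording is illustrated.
    raise NotImplementedError
```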

@Loping151

I reproduced this on a live Feishu gateway session as well.

Observed behavior before the fix:

  • token usage in the TUI was increasing normally during the same conversation
  • but /status in Feishu reported Tokens: 0

So this does not look like Feishu losing usage data. It matches the root cause described here: /status was reading stale SessionStore / session_entry.total_tokens state, while the real token counts were already present in SessionDB / SQLite.

I verified the affected session locally:

  • session_entry.total_tokens stayed at 0
  • the same session_id in state.db had large non-zero values for input_tokens and output_tokens

After applying the fix, /status in Feishu started reporting the correct cumulative token count for the same session.

So from my side, this PR’s diagnosis and fix direction are correct.

malaiwah pushed a commit to malaiwah/hermes-agent that referenced this pull request Apr 11, 2026
…search#5989)

Reverts the fork-local accumulator divergence and adopts upstream
PR NousResearch#5989's architecturally correct fix: read /status token totals
from SessionDB (the source of truth where the agent persists tokens
directly), not from the gateway-side SessionStore accumulator.

## Background

The user reported `/status` on Telegram showed `Tokens: 0` while the
local CLI status bar correctly showed token usage. We initially
shipped PR NousResearch#10 with a `_lifetime_mirror` accumulator-delta-tracking
machinery to fix the symptom — until we found that:

1. Issue NousResearch#5960 had been filed 2 days earlier upstream by Louise-Qiuqiu
   with a more thorough root-cause analysis.
2. PR NousResearch#5989 by Tranquil-Flow was already open with a much better fix.
3. The fork was carrying a fork-local commit `1daa37bb`
   ("fix(gateway): wire LLM token usage into session store for
   /status") that re-introduced an accumulator pattern upstream had
   deliberately removed in commit 20441cf ("fix(insights): persist
   token usage for non-CLI sessions"). The accumulator divergence
   was the actual mistake; PR NousResearch#10 was patching its symptoms instead
   of reverting the mistake.

Architecturally correct flow (upstream + this commit):
  - Agent persists token counts directly into SessionDB
    via `_flush_messages_to_session_db` after each turn.
  - SessionDB (`hermes_state.py`) is the source of truth.
  - Gateway's `/status` command reads from SessionDB via
    `get_session_token_totals(session_id)`.
  - SessionStore's `update_session` only tracks lightweight metadata
    (`last_prompt_tokens`) for compression / context-window decisions.

## Changes

- `hermes_state.py`: add `get_session_token_totals(session_id)` that
  aggregates input/output/cache_read/cache_write/reasoning columns
  from the `sessions` table and returns the sum as `total_tokens`.
  Returns `None` if the session is not in the DB. Verbatim from
  upstream PR NousResearch#5989.
- `gateway/run.py:_handle_status_command`: query SessionDB for token
  totals; fall back to `session_entry.total_tokens` only if the DB
  row is missing (fresh install, DB unavailable, pre-SessionDB
  session). Verbatim from upstream PR NousResearch#5989.
- `gateway/run.py:_run_agent`: revert PR NousResearch#10's `_total_toks` plumbing
  in both return dicts. No longer needed — token persistence happens
  via the agent → SessionDB path.
- `gateway/run.py` (turn-end persistence call): revert PR NousResearch#10's
  `update_session(input_tokens=, output_tokens=, total_tokens=)`
  call to upstream's `update_session(session_key,
  last_prompt_tokens=...)` shape. Token totals are not gateway's
  concern.
- `gateway/session.py:update_session`: revert to upstream's
  lightweight signature (`session_key, last_prompt_tokens=None`).
  Drop the `_lifetime_mirror` accumulator infrastructure entirely.
- `tests/gateway/test_session.py`: drop the 5 lifetime_mirror
  regression tests added in PR NousResearch#10. They test machinery that no
  longer exists.
- `tests/gateway/test_status_command.py`: add 2 tests from PR NousResearch#5989
  covering SessionDB-preferred totals + the SessionStore fallback
  when the DB row is missing.
- `tests/test_hermes_state.py`: add 2 tests from PR NousResearch#5989 covering
  the new helper (column sum + missing-row None behavior).

## Result

| | Before PR NousResearch#10 | After PR NousResearch#10 | After this commit |
|---|---|---|---|
| Architecture | fork accumulator (broken) | fork accumulator + mirror band-aid | upstream SessionDB read |
| /status accuracy | Tokens: 0 | correct | correct |
| Maintenance burden | high (diverges from upstream) | high | low (matches upstream) |
| Subagent-found bugs | - | zero-call mirror corruption + reset_session leak | n/a — machinery gone |

## Test results

`pytest tests/test_hermes_state.py tests/gateway/test_status_command.py tests/gateway/test_session.py -q`
202 passed, 0 failed.
malaiwah pushed a commit to malaiwah/hermes-agent that referenced this pull request Apr 11, 2026
…sionDB (adopt NousResearch#5989)' (NousResearch#11) from fix/status-tokens-from-sessiondb into main
malaiwah pushed a commit to malaiwah/hermes-agent that referenced this pull request Apr 12, 2026
Replace the basic /status output with a comprehensive session snapshot
showing model, provider, token breakdown (input/output/cache/reasoning),
cost, context window usage with idle fallback, compression count,
queue depth, and platform categorization.

- Extract shared formatting helpers to hermes_cli/status_format.py
- Add SessionDB.get_session_token_totals() — fixes Tokens: 0 (NousResearch#5960)
- Add SessionDB.get_session_last_active() for relative timestamps
- Idle context fallback via get_model_context_length (inspired by NousResearch#4678)
- 42 unit tests covering all code paths

Closes NousResearch#7317, closes NousResearch#7714, supersedes NousResearch#4678 and NousResearch#5989.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@malaiwah
Contributor

PR #8355 includes get_session_token_totals() from this PR as part of a broader /status overhaul. We tested this approach on our downstream fork (as noted in our earlier comment) and it's been solid in production. Credit to @blasai1739217-cmyk for the original fix.

malaiwah pushed a commit to malaiwah/hermes-agent that referenced this pull request Apr 12, 2026
Replace the basic /status output with a comprehensive session snapshot
showing model, provider, token breakdown, cost, context window, and more.

- Extract shared formatting helpers to hermes_cli/status_format.py
- Add SessionDB.get_session_token_totals() — fixes Tokens: 0 (NousResearch#5960)
- Add SessionDB.get_session_last_active() for relative timestamps
- Idle context fallback via get_model_context_length (inspired by NousResearch#4678)
- Persist compression_count in SessionEntry so idle sessions show
  compression history (NousResearch#7317)
- 42 unit tests covering all code paths

Closes NousResearch#7317, closes NousResearch#7714, supersedes NousResearch#4678 and NousResearch#5989.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alt-glitch added the type/bug (Something isn't working), P3 (Low: cosmetic, nice to have), and comp/gateway (Gateway runner, session dispatch, delivery) labels Apr 30, 2026
@alt-glitch
Collaborator

Related to #13820 and #12565 (same fix for #5960). Likely superseded by #8355 which provides a richer /status implementation.

@teknium1
Contributor

teknium1 commented May 1, 2026

Fixed on main in 7abc9ce via #17158 — same Camp A approach you proposed (read from SessionDB at display time instead of mirroring into SessionStore). Your PR was submitted first and architecturally correct. Thanks for the thorough fix + state-db regression tests; sorry we didn't land yours directly.

@teknium1 teknium1 closed this May 1, 2026

Labels

comp/gateway (Gateway runner, session dispatch, delivery), P3 (Low: cosmetic, nice to have), type/bug (Something isn't working)

Development

Successfully merging this pull request may close these issues.

[Bug]: /status Tokens: 0 regressed — SessionStore totals drift from SessionDB/state.db

6 participants