fix(gateway): /status reads token totals from SessionDB#5989
fix(gateway): /status reads token totals from SessionDB#5989Tranquil-Flow wants to merge 2 commits into
Conversation
/status was reading session_entry.total_tokens which is never kept in sync after the token persistence refactor (commit 20441cf). Real token usage lives in SessionDB (SQLite). Read from SessionDB via the new get_session_token_totals() helper and fall back to session_entry.total_tokens when SessionDB is unavailable. Fixes NousResearch#5960.
|
Friendly bump on this PR in case it fell through the cracks \u2014 would love a review when someone has a minute. Thanks! |
|
We tested this approach on a downstream fork that had carried a parallel "fix" for the same symptom — a fork-local commit had reintroduced the gateway-side accumulator pattern that We've now reverted both and adopted this PR's diff as-is (gitea PR + merge: see commit summary below). The result on a real production session that had been showing
Strong +1 from a production user. This is the architecturally correct fix — One micro-suggestion (non-blocking): the docstring on |
|
I reproduced this on a live Feishu gateway session as well. Observed behavior before the fix:
So this does not look like Feishu losing usage data. It matches the root cause described here: I verified the affected session locally:
After applying the fix, So from my side, this PR’s diagnosis and fix direction are correct. |
…search#5989) Reverts the fork-local accumulator divergence and adopts upstream PR NousResearch#5989's architecturally correct fix: read /status token totals from SessionDB (the source of truth where the agent persists tokens directly), not from the gateway-side SessionStore accumulator. ## Background The user reported `/status` on Telegram showed `Tokens: 0` while the local CLI status bar correctly showed token usage. We initially shipped PR NousResearch#10 with a `_lifetime_mirror` accumulator-delta-tracking machinery to fix the symptom — until we found that: 1. Issue NousResearch#5960 had been filed 2 days earlier upstream by Louise-Qiuqiu with a more thorough root-cause analysis. 2. PR NousResearch#5989 by Tranquil-Flow was already open with a much better fix. 3. The fork was carrying a fork-local commit `1daa37bb` ("fix(gateway): wire LLM token usage into session store for /status") that re-introduced an accumulator pattern upstream had deliberately removed in commit 20441cf ("fix(insights): persist token usage for non-CLI sessions"). The accumulator divergence was the actual mistake; PR NousResearch#10 was patching its symptoms instead of reverting the mistake. Architecturally correct flow (upstream + this commit): - Agent persists token counts directly into SessionDB via `_flush_messages_to_session_db` after each turn. - SessionDB (`hermes_state.py`) is the source of truth. - Gateway's `/status` command reads from SessionDB via `get_session_token_totals(session_id)`. - SessionStore's `update_session` only tracks lightweight metadata (`last_prompt_tokens`) for compression / context-window decisions. ## Changes - `hermes_state.py`: add `get_session_token_totals(session_id)` that aggregates input/output/cache_read/cache_write/reasoning columns from the `sessions` table and returns the sum as `total_tokens`. Returns `None` if the session is not in the DB. Verbatim from upstream PR NousResearch#5989. - `gateway/run.py:_handle_status_command`: query SessionDB for token totals; fall back to `session_entry.total_tokens` only if the DB row is missing (fresh install, DB unavailable, pre-SessionDB session). Verbatim from upstream PR NousResearch#5989. - `gateway/run.py:_run_agent`: revert PR NousResearch#10's `_total_toks` plumbing in both return dicts. No longer needed — token persistence happens via the agent → SessionDB path. - `gateway/run.py` (turn-end persistence call): revert PR NousResearch#10's `update_session(input_tokens=, output_tokens=, total_tokens=)` call to upstream's `update_session(session_key, last_prompt_tokens=...)` shape. Token totals are not gateway's concern. - `gateway/session.py:update_session`: revert to upstream's lightweight signature (`session_key, last_prompt_tokens=None`). Drop the `_lifetime_mirror` accumulator infrastructure entirely. - `tests/gateway/test_session.py`: drop the 5 lifetime_mirror regression tests added in PR NousResearch#10. They test machinery that no longer exists. - `tests/gateway/test_status_command.py`: add 2 tests from PR NousResearch#5989 covering SessionDB-preferred totals + the SessionStore fallback when the DB row is missing. - `tests/test_hermes_state.py`: add 2 tests from PR NousResearch#5989 covering the new helper (column sum + missing-row None behavior). ## Result | | Before PR NousResearch#10 | After PR NousResearch#10 | After this commit | |---|---|---|---| | Architecture | fork accumulator (broken) | fork accumulator + mirror band-aid | upstream SessionDB read | | /status accuracy | Tokens: 0 | correct | correct | | Maintenance burden | high (diverges from upstream) | high | low (matches upstream) | | Subagent-found bugs | - | zero-call mirror corruption + reset_session leak | n/a — machinery gone | ## Test results `pytest tests/test_hermes_state.py tests/gateway/test_status_command.py tests/gateway/test_session.py -q` 202 passed, 0 failed.
…sionDB (adopt NousResearch#5989)' (NousResearch#11) from fix/status-tokens-from-sessiondb into main
Replace the basic /status output with a comprehensive session snapshot showing model, provider, token breakdown (input/output/cache/reasoning), cost, context window usage with idle fallback, compression count, queue depth, and platform categorization. - Extract shared formatting helpers to hermes_cli/status_format.py - Add SessionDB.get_session_token_totals() — fixes Tokens: 0 (NousResearch#5960) - Add SessionDB.get_session_last_active() for relative timestamps - Idle context fallback via get_model_context_length (inspired by NousResearch#4678) - 42 unit tests covering all code paths Closes NousResearch#7317, closes NousResearch#7714, supersedes NousResearch#4678 and NousResearch#5989. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
PR #8355 includes |
Replace the basic /status output with a comprehensive session snapshot showing model, provider, token breakdown, cost, context window, and more. - Extract shared formatting helpers to hermes_cli/status_format.py - Add SessionDB.get_session_token_totals() — fixes Tokens: 0 (NousResearch#5960) - Add SessionDB.get_session_last_active() for relative timestamps - Idle context fallback via get_model_context_length (inspired by NousResearch#4678) - Persist compression_count in SessionEntry so idle sessions show compression history (NousResearch#7317) - 42 unit tests covering all code paths Closes NousResearch#7317, closes NousResearch#7714, supersedes NousResearch#4678 and NousResearch#5989. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Fixed on main in 7abc9ce via #17158 — same Camp A approach you proposed (read from SessionDB at display time instead of mirroring into SessionStore). Your PR was submitted first and architecturally correct. Thanks for the thorough fix + state-db regression tests; sorry we didn't land yours directly. |
Summary
/statuswas readingsession_entry.total_tokens, which is never updated after the token-persistence refactor (commit 20441cf) — real usage lives in SessionDB (SQLite)get_session_token_totals()helper tohermes_state.pythat reads aggregated token counts from SessionDB/statusnow prefers SessionDB totals and falls back tosession_entry.total_tokenswhen the DB row is missingNone-handling so missing rows don't raise, they fall back cleanlyFixes #5960.
Files changed
hermes_state.py—get_session_token_totals()helper +None-row fallback fixgateway/run.py—/statuswired to use the new helpertests/gateway/test_status_command.py— tests for SessionDB-preferred totals and session_store fallbacktests/test_hermes_state.py— unit tests for token aggregation and missing-row None behaviorTest plan
pytest tests/gateway/test_status_command.py tests/test_hermes_state.pypasses/statuson a live session reports correct cumulative token counts/statuswith no SessionDB row (fresh install / DB unavailable) falls back without error