
fix(gateway): persist token counts to session store for /status display#17158

Closed
JezzaHehn wants to merge 1 commit into
NousResearch:main from
JezzaHehn:fix/gateway-status-token-count

Conversation

@JezzaHehn
Contributor

What

Fixed the /status command showing 0 tokens despite agent activity in the session.

Why

Two bugs:

  1. Primary: gateway/run.py checked agent_result.get("total_tokens"), which returns None because the agent only reports input_tokens/output_tokens, never total_tokens. The if total > 0 guard therefore never passed, so counts were never persisted.
  2. Secondary: the SessionEntry class lacked a reasoning_tokens field, causing an AttributeError when update_token_counts() tried to increment it.
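The primary bug can be sketched like this (key names follow the PR text; the dict literal and guard are illustrative, not the actual gateway code):

```python
# What the agent actually returns: component counts, no "total_tokens" key.
agent_result = {"input_tokens": 1200, "output_tokens": 350}

# Buggy read: the key is absent, so the guard below never passes and
# token counts are never persisted to the session store.
total = agent_result.get("total_tokens") or 0
print(total > 0)  # False

# Fix: compute the total from the components the agent does return.
total = agent_result.get("input_tokens", 0) + agent_result.get("output_tokens", 0)
print(total)  # 1550
```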

How to test

  1. Start a Hermes gateway session (Discord/Telegram/etc)
  2. Have a conversation (make 1+ agent calls)
  3. Run /status
  4. Verify token counts are non-zero

Files changed

  • gateway/run.py — calculate total from components, call update_token_counts()
  • gateway/session.py — add reasoning_tokens field to SessionEntry
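A minimal sketch of the two changes together, assuming a dataclass-style SessionEntry; the field and method names follow the PR text, everything else is illustrative:

```python
from dataclasses import dataclass

@dataclass
class SessionEntry:
    input_tokens: int = 0
    output_tokens: int = 0
    # Newly added field: update_token_counts() previously raised
    # AttributeError when incrementing it.
    reasoning_tokens: int = 0
    total_tokens: int = 0

    def update_token_counts(self, counts: dict) -> None:
        # Increment each component; missing keys default to 0.
        self.input_tokens += counts.get("input_tokens", 0)
        self.output_tokens += counts.get("output_tokens", 0)
        self.reasoning_tokens += counts.get("reasoning_tokens", 0)
        self.total_tokens = (
            self.input_tokens + self.output_tokens + self.reasoning_tokens
        )

entry = SessionEntry()
entry.update_token_counts({"input_tokens": 1200, "output_tokens": 350})
print(entry.total_tokens)  # 1550
```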

@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/gateway Gateway runner, session dispatch, delivery labels Apr 28, 2026
@alt-glitch
Collaborator

Related to #5960 (same symptom: /status shows 0 tokens) and competing with #9750 (also fixes token tracking). Check for overlap.

@JezzaHehn
Contributor Author

Small note: the arithmetic around line 990 of the updated session.py is the most likely place for an error if the total-token math turns out to be wrong. And big thanks to the development team 💚

@JezzaHehn
Contributor Author

Looks like #9750 changes more files and includes a testing monkeypatch. I'm not sure the extra changes are necessary, so I'll defer to the judgement of those with more oversight 👍

teknium1 added a commit that referenced this pull request May 1, 2026
/status was reading session_entry.total_tokens from the in-memory
SessionStore (gateway/session.py), which the agent never writes to —
so the token count was always 0.

The agent already persists token deltas to the SQLite SessionDB
(run_agent.py:11497) for every platform with a session_id. Route
/status through that single source of truth instead of duplicating
token writes into a second store.

Fix:
- gateway/run.py: _handle_status_command now calls
  self._session_db.get_session(session_id) and sums the five token
  component columns (input/output/cache_read/cache_write/reasoning).
  Falls back to 0 when no SessionDB is configured or no row exists.
- Two new regression tests covering the populated-row and
  missing-row paths.
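The routing described above can be sketched as follows; the column and method names are assumptions based on this commit message, and FakeSessionDB stands in for the real SQLite-backed SessionDB:

```python
TOKEN_COLUMNS = ("input", "output", "cache_read", "cache_write", "reasoning")

def status_total(session_db, session_id):
    """Sum the five token component columns from the SessionDB row.
    Falls back to 0 when no SessionDB is configured or no row exists."""
    if session_db is None:
        return 0
    row = session_db.get_session(session_id)
    if row is None:
        return 0
    return sum(row.get(f"{col}_tokens", 0) for col in TOKEN_COLUMNS)

class FakeSessionDB:
    # Stand-in for the SQLite SessionDB, keyed by session_id.
    def __init__(self, rows):
        self._rows = rows
    def get_session(self, session_id):
        return self._rows.get(session_id)

db = FakeSessionDB({"s1": {"input_tokens": 100, "output_tokens": 40,
                           "cache_read_tokens": 10, "cache_write_tokens": 5,
                           "reasoning_tokens": 20}})
print(status_total(db, "s1"))    # 175 — populated-row path
print(status_total(db, "s2"))    # 0 — missing-row path
print(status_total(None, "s1"))  # 0 — no SessionDB configured
```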

Co-authored-by: Hermes <127238744+teknium1@users.noreply.github.com>
@teknium1
Contributor

teknium1 commented May 1, 2026

Salvaged via #18206 — merged as commit 7abc9ce on main.

Your bug identification was spot-on: /status really was reading from a store nothing writes to. The salvage takes a different fix path — instead of adding a parallel write into the in-memory SessionStore, we route /status through the SQLite SessionDB that the agent already writes to (run_agent.py:11497). Single source of truth, no new fields or methods needed. Your commit was preserved as the authored commit on the salvage PR via rebase-merge, so your name lands in git log.

Thanks for the report and the fix!

teknium1 pushed a commit that referenced this pull request May 1, 2026
…before

_process_message_background snapshotted callback_generation from the
interrupt event at the TOP of the task — before the handler ran.
_hermes_run_generation is only set on the event by
GatewayRunner._bind_adapter_run_generation during
_handle_message_with_agent, which runs DURING the handler await. The
early snapshot always captured None, which then flowed into
pop_post_delivery_callback(..., generation=None) in the finally block.

In pop_post_delivery_callback, generation=None with a tuple-registered
entry (generation, callback) bypasses the ownership check — it pops and
fires the callback regardless of which run owns it. Result: a stale run
could fire a fresher run's post-delivery callback (e.g. a
background-review notification attributed to the wrong turn).

Fix: move the snapshot into the finally block, after the handler has
run and _hermes_run_generation has been bound to the current run.

Regression test added: simulates a stale handler at generation=1 and a
fresher callback registered at generation=2. Pre-fix: snapshot=None →
pop fires the generation=2 callback under generation=1's ownership
("newer" fires). Post-fix: snapshot=1 → pop skips the mismatched
entry, callback stays in the dict for the correct run to claim.

Verified: test FAILS on current main (captures "newer" in fired list),
PASSES with this fix.

Salvaged from PR #12565 (the callback-ownership portion only; the
/status totals portion was already fixed on main in 7abc9ce via #17158).

Co-authored-by: Oxidane-bot <1317078257maroon@gmail.com>
donald131 pushed a commit to donald131/hermes-agent that referenced this pull request May 2, 2026
…17158)

donald131 pushed a commit to donald131/hermes-agent that referenced this pull request May 2, 2026
…before

nickdlkk pushed a commit to nickdlkk/hermes-agent that referenced this pull request May 11, 2026
…17158)

nickdlkk pushed a commit to nickdlkk/hermes-agent that referenced this pull request May 11, 2026
…before
