feat(status): add context window display and fix usage token tracking #9750

Open
WqyJh wants to merge 2 commits into NousResearch:main from WqyJh:feat/usage-context-window

Conversation

@WqyJh WqyJh commented Apr 14, 2026

Changes

1. Fix reasoning_tokens extraction (agent/usage_pricing.py)

  • normalize_usage() was reading output_tokens_details (Anthropic format) for ALL API formats
  • OpenAI-compatible APIs (e.g. MiMo) use completion_tokens_details instead, so reasoning_tokens was always 0
  • Fix: move the extraction into each per-format branch with the correct field name
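
To make the fix in item 1 concrete, here is a minimal sketch of per-format extraction. It is not the code in agent/usage_pricing.py: the function name `normalize_usage_sketch`, the `api_format` values, and the returned dict shape are assumptions made for illustration. Only the two detail field names and their mapping to API formats are taken from the description above.

```python
from typing import Any, Dict, Mapping


def _reasoning_tokens(usage: Mapping[str, Any], details_key: str) -> int:
    """Read reasoning_tokens from the given details sub-object, defaulting to 0."""
    details = usage.get(details_key) or {}
    return int(details.get("reasoning_tokens") or 0)


def normalize_usage_sketch(usage: Mapping[str, Any], api_format: str) -> Dict[str, int]:
    """Hypothetical stand-in for normalize_usage(); api_format values are assumed."""
    if api_format == "anthropic":
        # Per the PR description, this format uses output_tokens_details.
        return {
            "input_tokens": int(usage.get("input_tokens") or 0),
            "output_tokens": int(usage.get("output_tokens") or 0),
            "reasoning_tokens": _reasoning_tokens(usage, "output_tokens_details"),
        }
    # OpenAI-compatible format: the details live under completion_tokens_details.
    return {
        "input_tokens": int(usage.get("prompt_tokens") or 0),
        "output_tokens": int(usage.get("completion_tokens") or 0),
        "reasoning_tokens": _reasoning_tokens(usage, "completion_tokens_details"),
    }
```

Reading the details field inside each branch means a response in one format can never be probed with the other format's field name, which is the failure mode the bullets above describe.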

2. Persist token stats to session (gateway/session.py)

  • SessionEntry now tracks compression_count (persisted to sessions.json)
  • update_session() accepts input_tokens, output_tokens, total_tokens, compression_count and accumulates them
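
A rough sketch of what item 2 describes, under stated assumptions: `SessionEntrySketch`, `update_session_sketch`, and `dump_sessions` are hypothetical names, the real `SessionEntry` has more fields, and the actual layout of sessions.json is not shown in this PR description. The sketch follows the wording above and adds every counter, including `compression_count`, to the stored value.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable


@dataclass
class SessionEntrySketch:
    """Hypothetical, trimmed-down stand-in for SessionEntry."""
    session_id: str
    input_tokens: int = 0
    output_tokens: int = 0
    total_tokens: int = 0
    compression_count: int = 0


def update_session_sketch(
    entry: SessionEntrySketch,
    *,
    input_tokens: int = 0,
    output_tokens: int = 0,
    total_tokens: int = 0,
    compression_count: int = 0,
) -> SessionEntrySketch:
    """Accumulate per-turn counters onto the stored session entry."""
    entry.input_tokens += input_tokens
    entry.output_tokens += output_tokens
    entry.total_tokens += total_tokens
    entry.compression_count += compression_count
    return entry


def dump_sessions(entries: Iterable[SessionEntrySketch]) -> str:
    """Serialize entries to JSON keyed by session id (assumed file layout)."""
    return json.dumps({e.session_id: asdict(e) for e in entries}, indent=2)
```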

3. Show context window in /status (gateway/run.py)

  • New line: 📚 Context: 134k/1M (13%) · 🧹 Compactions: 0
  • Looks up context window from DEFAULT_CONTEXT_LENGTHS (no network calls)
  • Uses last_prompt_tokens from the session for current usage
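
The new /status line in item 3 could be built roughly as follows. This is a sketch, not the code in gateway/run.py: `CONTEXT_LENGTHS_SKETCH` is a one-entry stand-in for `DEFAULT_CONTEXT_LENGTHS`, the model name is made up, and both the rounding rules and the behavior for an unknown model (returning `None`) are assumptions.

```python
from typing import Optional

# Stand-in for DEFAULT_CONTEXT_LENGTHS; the model name and value are illustrative.
CONTEXT_LENGTHS_SKETCH = {"example-model": 1_000_000}


def short_count(n: int) -> str:
    """Format a token count compactly, e.g. 134000 -> '134k'."""
    if n >= 1_000_000:
        value, suffix = n / 1_000_000, "M"
    elif n >= 1_000:
        value, suffix = n / 1_000, "k"
    else:
        return str(n)
    # One decimal place, with a trailing '.0' trimmed away.
    text = f"{value:.1f}".rstrip("0").rstrip(".")
    return text + suffix


def context_status_line(model: str, last_prompt_tokens: int, compactions: int) -> Optional[str]:
    """Build the context line from a local table lookup only (no network calls)."""
    window = CONTEXT_LENGTHS_SKETCH.get(model)
    if not window:
        return None  # assumption: skip the line when the window is unknown
    usage = f"{short_count(last_prompt_tokens)}/{short_count(window)}"
    percent = f"{last_prompt_tokens / window:.0%}"
    return f"📚 Context: {usage} ({percent}) · 🧹 Compactions: {compactions}"
```

With these stand-in values, `context_status_line("example-model", 134_000, 0)` reproduces the example line shown above.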

4. Wire up compression count (run_agent.py)

  • Agent result dict includes compression_count from ContextCompressor
  • Gateway passes it through to update_session()
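
The data flow in item 4 can be pictured with a small sketch. Everything here is hypothetical apart from the `compression_count` key and the idea of forwarding it to `update_session()`: `CompressorSketch`, `run_agent_sketch`, and `handle_turn_sketch` are illustrative names, and the other keys of the real result dict are not shown in this PR description.

```python
from typing import Any, Callable, Dict


class CompressorSketch:
    """Hypothetical stand-in for ContextCompressor; it only counts compressions."""

    def __init__(self) -> None:
        self.compression_count = 0

    def compress(self) -> None:
        self.compression_count += 1


def run_agent_sketch(compressor: CompressorSketch) -> Dict[str, Any]:
    """Pretend agent run that compresses once, then reports the count."""
    compressor.compress()
    return {
        "final_response": "ok",  # illustrative key, not taken from the PR
        "compression_count": compressor.compression_count,
    }


def handle_turn_sketch(result: Dict[str, Any], update_session: Callable[..., None]) -> None:
    """Gateway side: forward the count to the session update call."""
    update_session(compression_count=result.get("compression_count", 0))
```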

Test

  • All 2541 gateway tests pass, apart from 1 pre-existing failure unrelated to this change

Hermes Agent added 2 commits April 14, 2026 20:39
The GatewayStreamConsumer was not passing reply_to when sending the
first streamed message, so platform adapters that support reply/quoting
(e.g. Feishu) could not reference the user's original message.

Changes:
- Add reply_to parameter to GatewayStreamConsumer.__init__
- Pass reply_to on first message send in _send_or_edit()
- Pass event_message_id as reply_to when creating the consumer in _run_agent()

- Fix reasoning_tokens extraction: move into per-API-format branches
  (Anthropic reads output_tokens_details, OpenAI reads completion_tokens_details)
- Add compression_count to SessionEntry (persisted to sessions.json)
- Pass input/output/total tokens and compression_count through update_session
- Add context window line to /status: 📚 Context: 134k/1M (13%) · 🧹 Compactions: 0
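
The reply_to behavior described in the commit message about GatewayStreamConsumer (reply to the user's message on the first send, then edit in place) can be sketched as below. This is a synchronous toy, and the real consumer is likely asynchronous: `StreamConsumerSketch`, the `send` and `edit` callables, and their signatures are assumptions, not the project's adapter API.

```python
from typing import Callable, Optional


class StreamConsumerSketch:
    """Hypothetical stand-in for GatewayStreamConsumer."""

    def __init__(
        self,
        send: Callable[..., str],
        edit: Callable[[str, str], None],
        reply_to: Optional[str] = None,
    ) -> None:
        self._send = send
        self._edit = edit
        self._reply_to = reply_to  # id of the user's original message
        self._message_id: Optional[str] = None

    def send_or_edit(self, text: str) -> None:
        if self._message_id is None:
            # First chunk: create the message as a reply to the user's message.
            self._message_id = self._send(text, reply_to=self._reply_to)
        else:
            # Later chunks: edit the existing message; no reply_to is needed.
            self._edit(self._message_id, text)
```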
@WqyJh WqyJh changed the title from "Feat/usage context window" to "feat(status): add context window display and fix usage token tracking" Apr 14, 2026
@alt-glitch alt-glitch added labels Apr 26, 2026: type/feature (New feature or request), P3 (Low: cosmetic, nice to have), comp/agent (Core agent loop, run_agent.py, prompt builder), comp/gateway (Gateway runner, session dispatch, delivery), comp/cli (CLI entry point, hermes_cli/, setup wizard)