fix(insights): show cache tokens in /insights token breakdown by liuhao1024 · Pull Request #18632 · NousResearch/hermes-agent

liuhao1024 · 2026-05-02T02:46:39Z

Summary

The /insights report's total_tokens includes cache_read_tokens + cache_write_tokens, but the (in: / out: ) breakdown only showed input and output tokens. For Anthropic users with prompt caching enabled, cache tokens can dominate the total (e.g., 5.2M total but only 2.6K input / 88K output), making the in/out numbers appear swapped or wrong.

Root Cause

In agent/insights.py, the _compute_overview() method correctly sums all four token categories into total_tokens:

total_tokens = total_input + total_output + total_cache_read + total_cache_write

But both display methods (format_terminal and format_gateway) only showed input and output:

Tokens: 5,239,805 (in: 2,642 / out: 88,463)   # 5.15M cache tokens hidden!
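
The size of the hidden portion follows directly from the figures in the line above. A quick arithmetic check, using only the numbers quoted in this description (the variable names are illustrative, not taken from agent/insights.py):

```python
# Figures copied from the example output above.
total_tokens = 5_239_805
input_tokens = 2_642
output_tokens = 88_463

# Whatever is not input or output must be cache read + cache write.
hidden_cache_tokens = total_tokens - input_tokens - output_tokens
cache_share = hidden_cache_tokens / total_tokens

print(f"{hidden_cache_tokens:,}")  # 5,148,700
print(f"{cache_share:.1%}")        # share of the total that was not displayed
```

Over 98% of the reported total is cache traffic, which is why the in/out pair looks implausibly small next to it.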

Fix

- Terminal format: adds a Cache tokens line (with read/write breakdown) that appears only when cache tokens are non-zero
- Gateway format: appends / cache: N to the existing (in: / out: ) breakdown when cache tokens are non-zero
- Both formats stay unchanged (no cache mention) when all cache token counts are zero
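
A minimal sketch of the terminal-side behavior described above, not the actual code in format_terminal. The function name `terminal_token_lines`, the exact label text, and the read/write split used in the example are assumptions made for illustration; only the "extra line when cache is non-zero" rule comes from this PR.

```python
def terminal_token_lines(total_input: int, total_output: int,
                         total_cache_read: int, total_cache_write: int) -> list[str]:
    """Hypothetical helper: build the token lines for a terminal report."""
    cache_total = total_cache_read + total_cache_write
    total = total_input + total_output + cache_total

    lines = [f"Tokens: {total:,} (in: {total_input:,} / out: {total_output:,})"]

    # Only mention cache when there is something to report (no noise at zero).
    if cache_total > 0:
        lines.append(
            f"Cache tokens: {cache_total:,} "
            f"(read: {total_cache_read:,} / write: {total_cache_write:,})"
        )
    return lines


# The read/write split below is invented; the PR only gives the combined figure.
for line in terminal_token_lines(2_642, 88_463, 5_000_000, 148_700):
    print(line)
```

With all cache counts at zero the function returns the single Tokens line, matching the "no cache mention" requirement.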

After fix

Tokens: 5,239,805 (in: 2,642 / out: 88,463 / cache: 5,148,700)
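
The single-line form shown above could be produced along these lines. This is a sketch under assumptions: `gateway_token_line` is a hypothetical name, and the split of the 5,148,700 cache tokens into read and write is invented, since the PR reports only their sum.

```python
def gateway_token_line(total_input: int, total_output: int,
                       total_cache_read: int, total_cache_write: int) -> str:
    """Hypothetical helper: one-line token summary for the gateway format."""
    cache_total = total_cache_read + total_cache_write
    total = total_input + total_output + cache_total

    parts = [f"in: {total_input:,}", f"out: {total_output:,}"]
    # Append the cache segment only when cache tokens exist.
    if cache_total > 0:
        parts.append(f"cache: {cache_total:,}")

    return f"Tokens: {total:,} ({' / '.join(parts)})"


print(gateway_token_line(2_642, 88_463, 5_000_000, 148_700))
# Tokens: 5,239,805 (in: 2,642 / out: 88,463 / cache: 5,148,700)
```

Building the breakdown from a list of parts keeps the zero-cache output byte-for-byte identical to the old format.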

Test Plan

- 5 new regression tests for cache token display behavior
- All 61 insights tests pass (pytest tests/agent/test_insights.py)
- Existing zero-cache tests still pass (no noise when cache = 0)
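
For illustration, regression tests of this kind might take the following shape. These are not the tests added in tests/agent/test_insights.py; `render_breakdown` is a stand-in formatter defined here so the example runs on its own, and the test names are hypothetical.

```python
def render_breakdown(inp: int, out: int, cache: int) -> str:
    """Stand-in for the real formatter (assumption, not the project's code)."""
    text = f"in: {inp:,} / out: {out:,}"
    if cache > 0:
        text += f" / cache: {cache:,}"
    return f"({text})"


def test_zero_cache_is_not_mentioned():
    # Guards the "no noise when cache = 0" requirement.
    assert "cache" not in render_breakdown(2_642, 88_463, 0)


def test_nonzero_cache_is_appended():
    # Guards the new behavior: the cache figure is visible in the breakdown.
    assert render_breakdown(2_642, 88_463, 5_148_700).endswith("/ cache: 5,148,700)")
```

Both functions use plain `assert` statements, so pytest collects them without extra fixtures.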

Closes #18615

The /insights report total_tokens includes cache_read_tokens + cache_write_tokens, but the (in: / out: ) breakdown only showed input and output tokens. For Anthropic users with prompt caching, the cache tokens dominate the total, making the in/out numbers appear wrong or swapped.

- Terminal format: add a 'Cache tokens' line when cache > 0
- Gateway format: append '/ cache: N' to the token breakdown
- Both formats hide cache when all values are 0 (no noise)
- Add 5 regression tests for cache display behavior

Closes NousResearch#18615

Community PRs applied:

- NousResearch#18596: Enable secret redaction by default (SECURITY)
- NousResearch#18650: Sanitize malformed tool messages + auto-recover on API 400
- NousResearch#18607: Emergency compression before max_iterations exhaustion
- NousResearch#18603: Compression fallback to main model on 413 rate limit
- NousResearch#18638: Pass threshold_percent on model switch
- NousResearch#18663: Strip extra_content from tool_calls for strict APIs
- NousResearch#18618: Forward explicit_api_key to OpenRouter
- NousResearch#18632: Show cache tokens in /insights breakdown
- NousResearch#18614: Add idempotency guard for patch duplicate loops
- NousResearch#18600: Raise ValueError when HERMES_HOME unset in profile mode
- NousResearch#18616: Allow ZWJ emoji in context files
- NousResearch#18582: Reload .env on /restart
- NousResearch#18547: Stabilize system prompt prefix for KV cache reuse
- NousResearch#18692: Strip FTS5 operators from session search truncation terms

Fix: Add order_by_last_active=True to list_sessions_rich call (pre-existing commit 142b4bf code sync)

alt-glitch added labels on May 2, 2026: type/bug (Something isn't working), P3 (Low: cosmetic, nice to have), comp/cli (CLI entry point, hermes_cli/, setup wizard)

liuhao1024 wants to merge 1 commit into NousResearch:main from liuhao1024:fix/insights-token-cache-breakdown
