fix(skills): key skill-command cache per platform to prevent cross-platform leaks by draix · Pull Request #14594 · NousResearch/hermes-agent

draix · 2026-04-23T14:47:52Z

Problem

agent/skill_commands.py maintains a process-global _skill_commands dict. The first platform to call scan_skill_commands() seeds this cache with its own platform-specific disabled-skill view. Subsequent platforms (e.g. Discord after Telegram) hit get_skill_commands(), find the cache non-empty, and return the wrong skill set — inheriting the first platform's disabled list.

Root Cause

# Before
_skill_commands: Dict[str, Dict[str, Any]] = {}  # single global cache

def get_skill_commands():
    if not _skill_commands:   # any non-empty cache → returned as-is for ALL platforms
        scan_skill_commands()
    return _skill_commands

_get_disabled_skill_names() correctly resolves per-platform disabled lists via HERMES_SESSION_PLATFORM, but the cached result was shared across all platforms.

Fix

	Before	After
Cache structure	`Dict[str, Any]` (flat)	`Dict[str \| None, Dict[str, Any]]` (per-platform)
`scan_skill_commands()`	no platform arg	accepts `platform=None`, resolves and uses as cache key
`get_skill_commands()`	no platform arg	accepts `platform=None`, checks per-platform slot
Platform resolution	always from env	explicit arg → `HERMES_SESSION_PLATFORM` → `HERMES_PLATFORM` → `None`

All existing call-sites use the no-arg form and continue to work — they now automatically resolve platform from session context.

Tests Added (`TestPlatformAwareCache`)

test_different_platforms_get_independent_caches — Telegram and Discord see their own disabled lists
test_platform_none_uses_global_disabled_list — global disabled list respected when platform is None
test_rescan_updates_platform_cache_entry — rescan replaces the per-platform slot

…atform leaks The process-global `_skill_commands` dict was seeded by whichever platform executed `scan_skill_commands()` first. Subsequent platforms then got the wrong enabled/disabled skill set from `get_skill_commands()` without re-scanning, because the guard `if not _skill_commands` treated any non-empty cache as valid for all platforms. Changes: - Change `_skill_commands` from a flat dict to a platform-keyed dict `Dict[str | None, Dict[str, Dict[str, Any]]]`. - Add `_resolve_platform()` helper that follows the same priority order as `get_disabled_skill_names`: explicit arg → HERMES_SESSION_PLATFORM context var → HERMES_PLATFORM env var → None. - `scan_skill_commands(platform=None)` now accepts an optional platform, resolves it, and stores/returns only the per-platform slice. - `get_skill_commands(platform=None)` resolves platform and checks only the per-platform key, triggering a fresh scan on a cache miss. - All existing call-sites use the no-arg form and continue to work: they now automatically resolve from session context. Fixes NousResearch#14536 Tests added (TestPlatformAwareCache): - test_different_platforms_get_independent_caches - test_platform_none_uses_global_disabled_list - test_rescan_updates_platform_cache_entry

alt-glitch · 2026-04-23T15:07:37Z

Competing fix with #14570 for #14536 — this PR keys cache per-platform rather than invalidating on change. Maintainer should pick one.

teknium1 · 2026-05-02T08:34:19Z

Thanks for working on this. Closing in favor of #18739 which salvages #14570.

Applying this PR to main and running the issue's repro returns an empty result for both platforms — scan_skill_commands() now calls _get_disabled_skill_names(resolved_platform), but that function in tools/skills_tool.py takes no argument. The resulting TypeError gets swallowed by the outer except Exception in scan_skill_commands(), so scanning silently returns empty:

telegram: []
discord: []

The architectural approach (per-platform keyed cache, passing platform through explicitly) is cleaner than the rescan strategy in #18739, but getting there would require updating _get_disabled_skill_names() to accept a platform argument in tools/skills_tool.py AND agent/skill_utils.py. Happy to revisit if you want to open a follow-up with that full change — for now shipping the less-invasive rescan fix.

alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder tool/skills Skills system (list, view, manage) labels Apr 23, 2026

alt-glitch mentioned this pull request Apr 24, 2026

fix(agent): add per-platform skill_commands cache to prevent cross-platform leakage (#14536) #15375

Closed

teknium1 mentioned this pull request May 2, 2026

fix(skills): rescan skill_commands cache when platform scope changes (salvage #14570) #18739

Merged

teknium1 closed this May 2, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(skills): key skill-command cache per platform to prevent cross-platform leaks#14594

fix(skills): key skill-command cache per platform to prevent cross-platform leaks#14594
draix wants to merge 1 commit into
NousResearch:mainfrom
draix:fix/14536-skill-commands-platform-cache

draix commented Apr 23, 2026

Uh oh!

alt-glitch commented Apr 23, 2026

Uh oh!

teknium1 commented May 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

draix commented Apr 23, 2026

Problem

Root Cause

Fix

Tests Added (TestPlatformAwareCache)

Uh oh!

alt-glitch commented Apr 23, 2026

Uh oh!

teknium1 commented May 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Tests Added (`TestPlatformAwareCache`)