docs(open-webui): fill gaps in Quick Setup by teknium1 · Pull Request #19654 · NousResearch/hermes-agent

teknium1 · 2026-05-04T10:07:06Z

Summary

Reported by @neopabo — the Open WebUI messaging page was missing several steps users actually hit during first-time setup.

Changes

Step 1 uses hermes config set instead of hand-editing .env (matches current UX; hermes config set auto-routes secrets).
Added a restart-gateway note so users with an already-running gateway pick up API_SERVER_ENABLED.
New Step 3 verifies /health and /v1/models with curl before jumping to Docker, with guidance when those fail.
ENABLE_OLLAMA_API=false added to the docker run snippet and the docker-compose.yml snippet — otherwise Open WebUI shows an empty Ollama section that shadows the Hermes model picker.
First-launch wait note (15–30s while Open WebUI downloads embedding models).
Troubleshooting entry for the empty-Ollama-shadowing case.
/v1/models troubleshoot curl now includes the Authorization header so it works when API_SERVER_KEY is set.
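The verification step can also be scripted. The following is a minimal sketch rather than anything shipped in this PR: it assumes the gateway API is reachable at a base URL such as http://localhost:8642 (the port published in the docker example later in this history) and that API_SERVER_KEY is sent as a Bearer token, which the source does not spell out. The helper names `build_request` and `check` are hypothetical.

```python
import urllib.error
import urllib.request


def build_request(base_url, path, api_key=None):
    """Build a GET request, adding an Authorization header only when a key is given."""
    req = urllib.request.Request(base_url.rstrip("/") + path, method="GET")
    if api_key:
        # Assumption: the server expects the key as a Bearer token.
        req.add_header("Authorization", f"Bearer {api_key}")
    return req


def check(base_url, api_key=None, timeout=5.0):
    """Return {path: HTTP status or error string} for /health and /v1/models."""
    results = {}
    for path in ("/health", "/v1/models"):
        req = build_request(base_url, path, api_key)
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                results[path] = resp.status
        except urllib.error.HTTPError as exc:
            # The server answered, but with an error status (e.g. 401 without a key).
            results[path] = exc.code
        except (urllib.error.URLError, OSError) as exc:
            # Nothing is listening, or the connection failed.
            results[path] = f"unreachable: {exc}"
    return results
```

Calling `check("http://localhost:8642", os.environ.get("API_SERVER_KEY"))` before starting Docker should separate "gateway not running" from "key missing or wrong" without opening Open WebUI at all.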

Validation

cd website && node scripts/prebuild.mjs && npx docusaurus build → [SUCCESS]. The only reported issue is a preexisting, unrelated broken anchor on adding-platform-adapters.

…g, restart note

Reported by @neopabo. The Open WebUI page was missing several steps users hit in practice:

- Use hermes config set instead of hand-editing .env (matches current UX)
- Restart-gateway note after enabling API_SERVER_ENABLED
- curl /health + /v1/models verification step before jumping to Docker
- ENABLE_OLLAMA_API=false in both docker run and compose snippets to suppress the empty Ollama backend that otherwise clutters the picker
- 15-30s startup wait note for first-run embedding model download
- Troubleshooting entry for the empty-Ollama-shadowing case
- /v1/models troubleshoot command now includes the Authorization header

…g, restart note (NousResearch#19654)

Reported by @neopabo. The Open WebUI page was missing several steps users hit in practice:

- Use hermes config set instead of hand-editing .env (matches current UX)
- Restart-gateway note after enabling API_SERVER_ENABLED
- curl /health + /v1/models verification step before jumping to Docker
- ENABLE_OLLAMA_API=false in both docker run and compose snippets to suppress the empty Ollama backend that otherwise clutters the picker
- 15-30s startup wait note for first-run embedding model download
- Troubleshooting entry for the empty-Ollama-shadowing case
- /v1/models troubleshoot command now includes the Authorization header

* fix(auxiliary): propagate explicit_api_key to _try_anthropic() _try_anthropic() lacked the explicit_api_key parameter added to _try_openrouter() in #18768. When resolve_provider_client() is called with provider="anthropic" and an explicit key (e.g. from a fallback_model entry with api_key set), the key was silently ignored — _try_anthropic() always fell back to resolve_anthropic_token(), so the fallback returned None,None for users without a default Anthropic credential configured. Fix: add explicit_api_key: str = None to _try_anthropic() and use explicit_api_key or <pool/env fallback> in both the pool-present and no-pool paths. Pass explicit_api_key=explicit_api_key at the call site in resolve_provider_client(). Symmetric with the _try_openrouter() fix. No behavior change when explicit_api_key is None. * fix(cron): bump skill usage when cron jobs load skills Cron jobs that reference skills via their skills: config never bumped the usage counters in .usage.json, so the curator could auto-archive skills actively used by cron jobs based on stale timestamps. Now _build_job_prompt() calls bump_use(skill_name) for each successfully loaded skill so the curator sees them as active. * fix(tui): tolerate npm's peer-flag drop in lockfile comparison `_tui_need_npm_install()` compares the canonical `package-lock.json` against the hidden `node_modules/.package-lock.json` to decide whether `npm install` needs to re-run. npm 9 drops the `"peer": true` field from the hidden lock on dev-deps that are *also* declared as peers (the canonical lock preserves the dual annotation). That made the check flag 16 packages (`@babel/core`, `@types/node`, `@types/react`, `@typescript-eslint/*`, `react`, `vite`, `tsx`, `typescript`, …) as mismatched on every launch, triggering a runtime `npm install`. 
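One way to make such a lockfile comparison tolerant of a dropped annotation is to strip runtime-only keys from both sides before comparing. This is a minimal sketch, not the actual `_tui_need_npm_install()` code: `RUNTIME_KEYS`, `normalize`, and `need_npm_install` are hypothetical names, and skipping the root `""` entry assumes the hidden lockfile does not record the root project.

```python
# Hypothetical stand-in for the real set of ignored keys.
RUNTIME_KEYS = {"peer", "ideallyInert"}


def normalize(lock):
    """Return the lockfile's package table with runtime-only keys stripped."""
    packages = lock.get("packages", {})
    return {
        name: {k: v for k, v in meta.items() if k not in RUNTIME_KEYS}
        for name, meta in packages.items()
        # Assumption: the root project entry ("") exists only in the canonical lock.
        if name != ""
    }


def need_npm_install(canonical, hidden):
    """True when the two lockfiles differ in anything other than runtime-only keys."""
    return normalize(canonical) != normalize(hidden)
```

Because only the listed keys are ignored, a real version skew between the two files still triggers an install.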
Inside the Docker image, that runtime install then fails with EACCES because `/opt/hermes/ui-tui/node_modules/` is root-owned from build time, so `docker run … hermes-agent --tui` prints: Installing TUI dependencies… npm install failed. …and exits 1, with no preview. The empty preview is a second bug: the launcher captured only stderr, but npm 9 writes EACCES to stdout, which was DEVNULL'd. Fixes: - Add `"peer"` to `_NPM_LOCK_RUNTIME_KEYS` so the comparison ignores the non-deterministic field, alongside the existing `"ideallyInert"`. - Capture stdout as well as stderr in the install subprocess so future failures surface a useful preview instead of a bare "failed." line. Regression tests: - `test_no_install_when_only_peer_annotation_differs` — the exact scenario - `test_install_when_version_differs_even_with_peer_drop` — guards against the peer-drop tolerance masking a real version skew On-host impact: the same false-positive was firing on every `hermes --tui` invocation from a normal checkout, silently running a no-op `npm install` each time (it converged because the host's `node_modules/` is writable). Startup time on the TUI should drop noticeably. * feat(docker): launch dashboard as side-process via HERMES_DASHBOARD=1 Adds an optional dashboard side-process to the container entrypoint, toggled by `HERMES_DASHBOARD=1` (also accepts `true` / `yes`). When set, the entrypoint backgrounds `hermes dashboard` before `exec`-ing the main command so the user's chosen foreground process (gateway, chat, `sleep infinity`, …) remains PID-of-interest for the container runtime. 
docker run -d \ -v ~/.hermes:/opt/data \ -p 8642:8642 -p 9119:9119 \ -e HERMES_DASHBOARD=1 \ nousresearch/hermes-agent gateway run Defaults chosen for the container case: - Host: 0.0.0.0 (reachable through published port; can override to 127.0.0.1 via HERMES_DASHBOARD_HOST for sidecar/reverse-proxy setups) - Port: 9119 (matches `hermes dashboard`) - Auto-adds `--insecure` when binding to non-localhost, matching the dashboard's own safety gate for exposing API keys - HERMES_DASHBOARD_TUI is read by `hermes dashboard` directly — no entrypoint plumbing needed Dashboard output is prefixed with `[dashboard]` via `stdbuf`+`sed -u` so it's easy to separate from gateway logs in `docker logs`. No supervision: if the dashboard crashes it stays down until the container restarts (documented in the `:::note` panel). Other changes bundled in: - Deprecate GATEWAY_HEALTH_URL / GATEWAY_HEALTH_TIMEOUT env vars in hermes_cli/web_server.py with a DEPRECATED block comment and a `.. deprecated::` note on _probe_gateway_health. The feature still works for this release; it'll be removed alongside the move to a first-class dashboard config key. - Rewrite the "Running the dashboard" doc section around the new single-container pattern. Drops the previously-documented dashboard-as-its-own-container setup — that pattern relied on the deprecated env vars for cross-container gateway-liveness detection, and without them the dashboard would permanently report the gateway as "not running". - Collapse the two-service Compose example (gateway + dashboard container) into a single service with HERMES_DASHBOARD=1. Removes the now-unnecessary bridge network and `depends_on`. - Drop the ":::warning" caveat about "Running a dashboard container alongside the gateway is safe" — that case no longer exists. * fix(cli): local backend CLI always uses launch directory, stops .env sync of TERMINAL_CWD (#19334) The old CWD heuristic was fooled by: 1. 
TERMINAL_CWD persisted to .env by `hermes config set terminal.cwd` 2. Inherited TERMINAL_CWD from parent hermes processes 3. Only resolved when config had a placeholder value (not explicit paths) Fix: - load_cli_config() unconditionally uses os.getcwd() for local backend - TERMINAL_CWD always force-exported in CLI mode (overrides stale values) - Gateway sets _HERMES_GATEWAY=1 marker so lazy cli.py imports don't clobber - Remove terminal.cwd from config-set .env sync map (prevents re-poisoning) - Clarify setup wizard label as 'Gateway working directory' Closes #19214 * fix(skill): reference built-in video_analyze/vision_analyze tools in kanban-video-orchestrator (#19562) The tool-matrix.md had a vague 'Gemini multimodal / Claude vision' entry in the external tools table that didn't point to the actual built-in Hermes tools. Now that video_analyze exists (merged in #19301), update the skill to reference it properly: - Add 'Built-in Hermes tools for media review' section with proper toolset names, enablement instructions, and capability details - Add video + vision toolsets to cinematographer, editor, and reviewer profile configs - Update role-archetypes.md to reference tools by name - Update API key table to explain video_analyze routing * test(kanban): add failing test for list_profiles_on_disk with custom HERMES_HOME list_profiles_on_disk() hardcodes Path.home() / ".hermes" / "profiles", ignoring HERMES_HOME when set to a custom root (e.g. /opt/data). Add test_list_profiles_on_disk_custom_root to cover this case. Related to #18442, #18985. * fix(kanban): use get_default_hermes_root() in list_profiles_on_disk Path.home() / ".hermes" / "profiles" breaks custom-root deployments (e.g. HERMES_HOME=/opt/data). Switch to get_default_hermes_root() so profile discovery is consistent with kanban_db_path() and workspaces_root() fixed in #18985. Fixes #19017. Related to #18442, #18985. 
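The custom-root behaviour described in the kanban fix above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `default_hermes_root()` is a hypothetical stand-in for `get_default_hermes_root()`, and its "HERMES_HOME if set, else ~/.hermes" rule is inferred from the commit message rather than confirmed.

```python
import os
from pathlib import Path


def default_hermes_root():
    """Hypothetical stand-in: use HERMES_HOME when set, else ~/.hermes."""
    custom = os.environ.get("HERMES_HOME", "").strip()
    return Path(custom) if custom else Path.home() / ".hermes"


def list_profiles_on_disk():
    """Return sorted names of profile directories under <root>/profiles."""
    profiles_dir = default_hermes_root() / "profiles"
    if not profiles_dir.is_dir():
        return []
    # Only directories count as profiles; stray files are ignored.
    return sorted(p.name for p in profiles_dir.iterdir() if p.is_dir())
```

Routing every path lookup through one root helper is what keeps profile discovery consistent with the other locations derived from the same root.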
* fix(curator): prevent false-positive consolidation from substring matching _classify_removed_skills used naive 'in' substring matching to detect whether a removed skill's name appeared in skill_manage arguments. Short/common skill names (api, git, test, foo, etc.) matched incorrectly when they appeared as substrings of longer words in file paths (references/api-design.md) or content (latest, testing). Replace with field-aware matching: - file_path: needle must match a complete filename stem or directory name, with -/_ normalised for variant tolerance - content fields: word-boundary regex (\b) prevents embedding in longer words Also add 3 regression tests covering the false-positive scenarios. * chore(release): map daixin1204@gmail.com to @SimbaKingjoe * skills-hub: hash binary skill bundle files correctly * test(skills): add bytes-vs-str equivalence and on-disk hash parity tests Follow-up on #9925 cherry-pick adding two additional tests: - bytes content hashes identically to its str-decoded form - mixed bytes+str bundle hash equals the on-disk content_hash from skills_guard (the production invariant used to detect drift) Also map dodofun@126.com and 1615063567@qq.com in AUTHOR_MAP so the CI contributor check passes for the cherry-picked commit. Co-authored-by: LeonSGP43 <cine.dreamer.one@gmail.com> Co-authored-by: zhao0112 <1615063567@qq.com> * fix(cron): recover null next_run_at jobs and tolerate non-dict origin Fixes #18722 get_due_jobs() now recomputes next_run_at via compute_next_run() for cron/interval jobs that arrived with null next_run_at (e.g. via direct jobs.json edits) instead of silently skipping them. _resolve_origin() guards with isinstance(origin, dict), and _deliver_result() now routes through _resolve_origin() so string/non-dict origins no longer crash the ticker. 
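The two guards in the cron fix above can be sketched roughly like this. The job schema is an assumption: the `every_seconds` field and the helper names `resolve_origin` and `recover_next_run` are hypothetical, and the real code recomputes the time through `compute_next_run()`, which also handles cron expressions and is not reproduced here.

```python
from datetime import timedelta


def resolve_origin(job):
    """Return the job's origin mapping, or None when it is missing or not a dict."""
    origin = job.get("origin")
    return origin if isinstance(origin, dict) else None


def recover_next_run(job, now):
    """Fill in a missing next_run_at for an interval job instead of skipping it."""
    # Hypothetical schema: interval jobs carry their period in "every_seconds".
    if job.get("next_run_at") is None and job.get("every_seconds"):
        job["next_run_at"] = (now + timedelta(seconds=job["every_seconds"])).isoformat()
    return job
```

Returning None for a string, list, or integer origin lets callers treat a hand-edited job the same as a job with no origin at all.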
References: references #18735 (open competing fix from automated bulk PR touching 79 files); this PR is a focused single-issue contribution and adds the missing interval-recovery test variant Co-Authored-By: Claude <noreply@anthropic.com> * test(cron): cover null next_run_at recovery and non-dict origin tolerance Adds four regression tests guarding the bugfix in the previous commit: - TestGetDueJobs::test_broken_cron_without_next_run_is_recovered exercises cron schedules whose next_run_at was lost; expects compute_next_run to repopulate it within get_due_jobs() rather than silently skipping the job. - TestGetDueJobs::test_broken_interval_without_next_run_is_recovered does the same for interval schedules. - TestResolveOrigin::test_string_origin_is_tolerated and test_non_dict_origin_is_tolerated confirm _resolve_origin() returns None for legacy/hand-edited origins (string, list, int) instead of raising. Co-Authored-By: Claude <noreply@anthropic.com> * chore(release): add AUTHOR_MAP entries for upcoming salvage batch Pre-adds author-email mappings for the 21 Tier 1b salvage PRs so their cherry-picked commits land with mapped GitHub logins in the release notes. * fix(cli): avoid voice TTS restart race * test: cover max-iterations summary message sanitization * fix: treat ctrl-c as curses cancel * fix(gateway): show other profiles in `gateway status` to prevent confusion When multiple gateway profiles are running (e.g. default and wx1), `hermes gateway status` can be misleading — stopping one profile's gateway and checking status may still show the other profile's process without indicating which profile it belongs to. Add `_print_other_profiles_gateway_status()` which displays running gateways from other profiles at the bottom of the status output: Other profiles: ✓ wx1 — PID 166893 This uses the existing `find_profile_gateway_processes()` and `get_active_profile_name()` — no new dependencies. 
Closes #19113 Related: #4402, #4587 * fix(setup): add missing SLACK_HOME_CHANNEL prompt to _setup_slack() _setup_slack() was the only platform setup function that did not prompt for a home channel. All four sibling setups (_setup_telegram, _setup_discord, _setup_mattermost, _setup_bluebubbles) close with an identical home-channel block, and setup_gateway() already checks for SLACK_HOME_CHANNEL presence at the end of the wizard — but the value was never collected, leaving cron delivery and cross-platform notifications silently broken for Slack after a fresh hermes setup run. Add the standard home-channel prompt at the end of _setup_slack(), symmetric with the Discord implementation. Add two unit tests that verify the prompt is saved when provided and skipped when left blank. * fix(signal): skip reactions for unauthorized senders The on_processing_start hook fired a reaction emoji (👀) on every inbound Signal message before run.py's _is_user_authorized check. This meant contacts not in SIGNAL_ALLOWED_USERS would see the bot react to their messages even though Hermes silently dropped them — leaking the presence of the bot and causing confusing UX. Two changes to gateway/platforms/signal.py: 1. Read SIGNAL_ALLOWED_USERS into self.dm_allow_from in __init__ (mirrors the group_allow_from pattern already in place). 2. Add _reactions_enabled(event) — two-gate check: - SIGNAL_REACTIONS=false/0/no disables reactions globally - If SIGNAL_ALLOWED_USERS is set, only react to senders in the allowlist (skips unauthorized contacts) Both on_processing_start and on_processing_complete now call this guard before sending any reaction. Telegram already has an equivalent _reactions_enabled() guard (controlled by TELEGRAM_REACTIONS). This brings Signal to parity. * fix: exclude ancestor PIDs from gateway process scan (#13242) _scan_gateway_pids() uses ps-based pattern matching to find running gateways. When invoked from the CLI (e.g. 
`hermes gateway status`), the calling process itself matches gateway patterns, causing false positives — the CLI is mistakenly counted as a running gateway. Add _get_ancestor_pids() that walks the process tree from the current PID up to init (PID 1). Merge this set into exclude_pids at the top of _scan_gateway_pids() so the entire ancestor chain is filtered out. This complements the existing os.getpid() exclusion in _append_unique_pid() by also covering parent/grandparent processes (e.g. when hermes is invoked via a wrapper script or shell). Closes #13242 * fix(cli): allow custom:* provider slugs in model validation Two related fixes for custom_providers model switching: 1. validate_requested_model() now recognizes custom:<name> slugs (e.g. custom:volcengine) as custom endpoints, not generic providers. Previously only the bare 'custom' slug matched the relaxed validation branch, causing model validation to fail with 'not found in provider listing' for all named custom providers. 2. switch_model() now consults the custom_providers list when deciding whether to override a validation rejection. If the requested model matches the entry's 'model' field or any key in its 'models' dict, the switch is accepted even when the remote /v1/models endpoint does not list it. Both changes are covered by existing tests (86 passed). * fix(gateway): move quick-command dispatch before built-in handlers Quick commands of type "alias" that target built-in slash commands (e.g. /h -> /model) were processed too late in _handle_message — after the if-canonical=="model" checks. This meant alias expansion never reached the target handler and fell through to the LLM as raw text. Two fixes: 1. Move the quick_commands block before built-in dispatch so alias targets (like /model) hit the correct handler after expansion. 2. Extract bare command name from target_command via .split()[0] to feed _resolve_cmd() correctly (was using the full arg-string). 
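The alias handling described in the quick-command fix above can be illustrated with a small sketch. It assumes a hypothetical config shape (`{"h": {"type": "alias", "target": "/model"}}`) and a hypothetical helper name `resolve_alias`; the real `_handle_message` and `_resolve_cmd()` are not reproduced here.

```python
def resolve_alias(message, quick_commands):
    """Return (command, args) after alias expansion, or None if not a slash command."""
    if not message.startswith("/"):
        return None
    parts = message[1:].split(maxsplit=1)
    if not parts:
        return None
    name = parts[0]
    args = parts[1] if len(parts) > 1 else ""
    entry = quick_commands.get(name)
    if entry and entry.get("type") == "alias":
        target_parts = entry.get("target", "").lstrip("/").split()
        if target_parts:
            # Only the first token is the command name; the rest are preset arguments.
            name = target_parts[0]
            args = " ".join(target_parts[1:] + ([args] if args else []))
    return name, args
```

Running this expansion before the built-in dispatch is what lets `/h` reach the same handler as `/model`; taking only the first token mirrors the `.split()[0]` change in the commit.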
* fix(tui): call process.exit(0) after Ink exit to trigger terminal cleanup Ink's exit() calls unmount() which resets terminal modes (kitty keyboard, mouse, etc.) but does NOT call process.exit(). The Node process stays alive because stdin is still open (Ink listens on it), so the process.on('exit') handler in entry.tsx, which sends the final resetTerminalModes(), never fires. This left kitty keyboard protocol and other terminal modes enabled in the parent shell after /quit, Ctrl+C, or Ctrl+D, breaking arrow keys and other input in subsequent programs. Add explicit process.exit(0) after exit() in die() so the process actually terminates and the exit handler runs. Fixes #19194 * fix(tests): tolerate ps ancestor-walk in find_gateway_pids fallback test (#19590) Follow-up to #19586 (@cixuuz salvage): _get_ancestor_pids walks ps -o ppid= up the process tree, which the pre-existing mock in test_find_gateway_pids_falls_back_to_pid_file_when_process_scan_fails didn't expect. Return empty stdout so the ancestor loop terminates cleanly and the original fallback assertion still passes. * fix(web): add missing icons for config page category sidebar Add icon mappings for 9 categories that fell back to FileQuestion: - bedrock (Cloud), curator (Sparkles), kanban (LayoutDashboard) - model_catalog (BookOpen), openrouter (Route), sessions (History) - tool_loop_guardrails (Shield), tool_output (FileOutput), updates (RefreshCw) * fix(agent): surface preflight compression status Preflight compression can run synchronously before the first model call when a loaded session exceeds the active context threshold. Gateway users saw no visible progress while the compression LLM call was in flight, which can look like a dropped message during long compactions. Emit the existing lifecycle status through _emit_status before starting preflight compression so CLI, gateway, and WebUI status callbacks all get immediate feedback. Adds a regression assertion for the preflight path.
* Clarify session_search auxiliary model docs * fix: _chromium_installed() now checks AGENT_BROWSER_EXECUTABLE_PATH and system Chrome Before this fix, _chromium_installed() only searched Playwright-style chromium-* / chromium_headless_shell-* directories, which meant users with system Chrome or AGENT_BROWSER_EXECUTABLE_PATH configured still had all browser_* tools gated. Now checks three sources in priority order: 1. AGENT_BROWSER_EXECUTABLE_PATH env var (if set and points to a real binary) 2. System Chrome/Chromium via shutil.which() (google-chrome, chromium-browser, chrome) 3. Playwright browser cache (existing logic, kept as fallback) Closes #19294 * fix(feishu): enable MEDIA attachment delivery in send_message tool The _send_feishu() function already supports media_files (images, video, audio, documents) via the adapter's send_image_file/send_video/send_voice /send_document methods, but _send_to_platform() never routed Feishu into the early media-handling branch — media attachments were silently dropped with a "not supported" warning. Add a Feishu-specific media branch (matching the existing Yuanbao/Signal pattern) so that MEDIA:<path> tags in send_message calls are correctly delivered as native Feishu attachments. Also update the two error/warning message strings to include feishu in the supported platform list. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(security): bind Meet node server to localhost and restrict token file to owner read * fix: back up config.yaml before hermes setup modifies it Create a timestamped backup (~/.hermes/config.yaml.bak.YYYYMMDD_HHMMSS) before the setup wizard runs any configuration sections. After setup completes, show the backup path and a restore command. This protects user-customized values (compression thresholds, provider routing, PII redaction, auxiliary model configs) from being silently overwritten by setup defaults. 
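The backup step described above amounts to a timestamped copy taken before anything is written. A minimal sketch, assuming the `config.yaml.bak.YYYYMMDD_HHMMSS` naming from the commit message; the function name `backup_config` and its `now` parameter are hypothetical and exist only to make the example testable.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_config(config_path, now=None):
    """Copy the config to <name>.bak.YYYYMMDD_HHMMSS; return the backup path or None."""
    config_path = Path(config_path)
    if not config_path.is_file():
        # Nothing to protect on a fresh install.
        return None
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    backup_path = config_path.with_name(f"{config_path.name}.bak.{stamp}")
    # copy2 preserves file metadata such as the modification time.
    shutil.copy2(config_path, backup_path)
    return backup_path
```

Restoring is then a plain copy of the backup file back over config.yaml.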
Addresses #3522 * fix: inherit reasoning config in API server runs * fix(run_agent): gate iteration-limit provider routing to OpenRouter * fix(delegate): inherit parent fallback_chain in _build_child_agent _build_child_agent constructed child AIAgents without passing fallback_model, leaving _fallback_chain=[] for every subagent. When a subagent hit a rate-limit or credential exhaustion the runtime fallback check (run_agent.py:7486 / 12267) found an empty chain and failed immediately — even though the parent agent was configured with fallback_providers and would have recovered. The cron scheduler already propagates fallback_model correctly (scheduler.py:1038). Fix closes the parity gap by reading the parent's _fallback_chain (the normalised list form accepted by AIAgent's fallback_model parameter) and threading it through. Empty chains coerce to None so AIAgent initialises _fallback_chain=[] as usual rather than iterating an empty list. * fix: allow kanban tools for orchestrator profiles with kanban toolset The _check_kanban_mode() gating function only checked for HERMES_KANBAN_TASK env var, which is only set by the dispatcher when spawning workers. This prevented orchestrator profiles (like techlead) from using kanban_create, kanban_link, etc. even when they had 'kanban' explicitly in their toolsets config. Now uses load_config() from hermes_cli.config (which has mtime-based caching) to check if 'kanban' is in the profile's toolsets list. This enables orchestrators to route work via Kanban while workers continue using the dispatcher env var. Fixes #18968 * test(kanban): update worker-prompt header assertion to match #19427 PR #19427 dropped the 'You are a Kanban worker' identity line from KANBAN_GUIDANCE so SOUL.md stays authoritative for profile identity. This test assertion was stale against that change; update it to the new protocol-only header. 
* fix(tui): harden plugin slash exec errors * fix(skills): keep manual skills out of curator * chore(release): map cine.dreamer.one@gmail.com to @LeonSGP43 * chore(release): AUTHOR_MAP entries for Tier 1c salvage batch Pre-adds author-email mappings for upcoming Tier 1c salvage PRs (small Apr 24-25 fixes). * fix(compressor): reset _summary_failure_cooldown_until in on_session_reset() on_session_reset() cleared _previous_summary, _last_summary_error, and _ineffective_compression_count but left _summary_failure_cooldown_until intact. When a transient summary error sets a 60 s cooldown (or 600 s for a missing-provider RuntimeError) and the user immediately runs /reset or /new, the cooldown carries into the new session. If the new session reaches the compression threshold before the cooldown expires, _generate_summary() returns None early, middle turns are silently dropped without a summary, and the agent continues with no indication that compaction was skipped. Fix: set _summary_failure_cooldown_until = 0.0 in on_session_reset(), matching the value assigned in __init__ and symmetric with the other per-session fields already cleared there. Fixes #15547 * fix(delegation): pass target_model to resolve_runtime_provider in _resolve_delegation_credentials When delegation.model differs from model.default and the provider is opencode-go or opencode-zen, the wrong api_mode is computed because resolve_runtime_provider falls back to model_cfg.get('default') — the main model — instead of the configured delegation model. For example, with model.default=minimax-m2.7 (anthropic_messages) and delegation.model=glm-5.1 (chat_completions), subagents get anthropic_messages, which strips /v1 from the base URL and causes a 404. resolve_runtime_provider already accepts target_model for exactly this purpose; _resolve_delegation_credentials just wasn't passing it. 
Fixes #15319 Related: #13678 * fix(anthropic): cap max_tokens at 65536 for Qwen models via DashScope DashScope's Anthropic-compatible endpoint enforces max_tokens ∈ [1, 65536]. Adding "qwen3" to _ANTHROPIC_OUTPUT_LIMITS prevents 400 errors that were misclassified as context overflow, triggering premature compression. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(tui): declare nanostores dependency * fix(docker): exclude compose/profile runtime state from build context * fix(cron): drop stale env-var override of persisted provider Cron jobs were passing os.getenv("HERMES_INFERENCE_PROVIDER") as the "requested" arg to resolve_runtime_provider(), which short-circuited the resolver's own precedence (explicit arg → persisted config → env) and let stale shell/.env values outrank the user's saved provider. Long-lived cron daemons inherit env from the shell that launched them, so a since-changed provider (e.g. DeepSeek) could keep firing for jobs that don't pin provider/model. Same bug class as f0b763c74 fixed for the TUI /model switch. Pass only job.get("provider") and let resolve_requested_provider fall through to persisted config and env in the documented order. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cron): skip AI call when script produces no output When a cron job has a pre-run script that runs successfully but produces no output (e.g. email checker with no new mail), the scheduler previously injected "[Script ran successfully but produced no output.]" into the prompt and still called the AI model. This wastes tokens on every cycle. Now _build_job_prompt() returns None when script output is empty, and run_job() short-circuits with a SILENT response - zero API calls when there is nothing to report. * fix(gateway): allow free_response_channels to override DISCORD_IGNORE_NO_MENTION When DISCORD_IGNORE_NO_MENTION is true (default), the bot ignores messages without @mention. 
However, this check ran before evaluating free_response_channels, so messages in free-response channels were wrongly dropped unless they contained a mention. This change adds a carve-out: if the message lands in a channel that is configured as a free response channel (or its parent category is), the ignore-no-mention rule is skipped. Also removes the unconditional skip_thread for free response channels so that auto_thread still creates threads there unless explicitly disabled via DISCORD_NO_THREAD_CHANNELS. * fix(telegram): fallback to document when photo dimensions exceed limits Telegram's send_photo has dimension limits (sum of width+height <= 10000px). When sending large screenshots or tall images, the API returns 'Photo_invalid_dimensions' error. Fix: Catch this specific error in send_image_file() and automatically fallback to send_document() which has no dimension limits (only 50MB size). This is similar to the existing 5MB URL fallback (commit 542faf22) but handles local files with dimension issues instead of URL size issues. * fix(gemini): extract usageMetadata from streaming chunks for token tracking * fix(cli): sync use_gateway in _reconfigure_provider for tts, browser, and web _reconfigure_provider() updates cloud_provider/backend/tts.provider when switching tool providers via "hermes setup tools → Reconfigure", but did not update the matching use_gateway flag. _configure_provider() (the initial-setup path) sets use_gateway on all three tool categories. The omission in _reconfigure_provider leaves a stale value in config.yaml: switching from a Nous-managed provider (use_gateway=True) to a self-hosted one keeps use_gateway=True, continuing to route requests through the Nous gateway; switching the other way leaves use_gateway unset so the managed feature does not activate. Fix: mirror _configure_provider's use_gateway = bool(managed_feature) assignment in the tts, browser, and web blocks of _reconfigure_provider. Symmetric across all three tool categories. 
No behavior change for any provider that does not set tts_provider, browser_provider, or web_backend. Fixes #15229 * fix(tui): close AIAgent on session teardown to prevent FD leak session.close only closed the slash_worker subprocess but never called agent.close() on the AIAgent instance. In the long-lived TUI gateway process, this left httpx clients for GC to finalize. When the OS recycled a closed FD number for a new active connection, the stale finalizer would close the live socket, causing intermittent [Errno 9] Bad file descriptor on subsequent LLM API calls. Call agent.close() (which properly shuts down the httpx transport pool and TCP sockets) before closing the slash_worker. * fix(tui): prevent trailing space in picker-command completions Commands that open pickers (/model, /skin, /personality) previously received a trailing space in their completions to keep the dropdown visible in the classic CLI. However, the TUI's submit handler applies the completion when Enter is pressed and the result differs from the input — so '/model' + space became '/model ' and the command was never executed. Picker commands now omit the trailing space for exact matches, allowing Enter to submit and open the picker. Non-picker commands (/help, etc.) are unaffected. * fix(pty): default TERM for resize probes Preserve explicit caller overrides, but backfill a sensible default TERM=xterm-256color when missing or blank in the spawn env. CI often runs without TERM in the parent process, which makes terminal probes like 'tput cols' fail before winsize reads. Salvage of #15278's core code fix only — the test changes conflict with subsequent test refactors on main that now exercise TIOCGWINSZ directly instead of via 'tput'. Co-authored-by: LeonSGP43 <154585401+LeonSGP43@users.noreply.github.com> * fix(setup): skip AUXILIARY_VISION_MODEL write when input is blank Guard the save_env_value('AUXILIARY_VISION_MODEL', ...) 
call with 'if _selected_vision_model:' so blank input at the non-OpenAI vision model prompt doesn't nuke existing values in .env. save_env_value has no internal guard against empty strings — it faithfully writes whatever it receives, including empty values that shadow the previously-configured model. Salvage of #15504 (core hunk). Contributor's test was dropped because it collided with subsequent test refactors; the fix stands on its own. Co-authored-by: alt-glitch <balyan.sid@gmail.com> * fix(kanban-dashboard): widen drawer, bump body fonts, fix code-block contrast (#19638) Closes #18576. Addresses three of four complaints from the readability report; live-verified in a dashboard against a seeded task with body, comments, and run history. - Drawer default width 480px → 640px, exposed as the CSS var `--hermes-kanban-drawer-width` so deployments / user themes can override without forking the plugin. - Bump body/meta/pre/log/run-history font sizes from the 0.65-0.75rem cluster to the 0.78-0.85rem cluster. Long paths and code snippets in task bodies, run metadata, and worker logs are legible again instead of requiring a squint. - Fix the black-text-on-dark-theme regression in fenced markdown code blocks. Root cause: themes that don't define `--color-foreground` (NERV, at least) leave `color: var(--color-foreground)` resolving empty on <code>, which then falls back to the UA default (near-black) instead of inheriting from the drawer's <body>. Fix: force `color: inherit` on both inline and fenced code, and give the fenced block background via `currentColor` instead of `--color-foreground` so there's a visible card even when the theme var is absent. Out of scope for this PR (comments added to #18576): - Draggable resize handle (structural JS work; plugin ships built-only, no src/ in-tree). - Live worker-log viewer for running tasks (backend WS + component). - Sibling fix: themes like NERV should define --color-foreground. 
The current changes make the drawer robust against that gap, but the root fix belongs in the theme layer. * fix(curator): only mark agent-created for background-review sediment (#19621) Tighten the provenance semantics added in #19618: skills a user asks a foreground agent to write via skill_manage(create) now stay invisible to the curator. Only skills the background self-improvement review fork sediments through skill_manage get the created_by=agent marker. - tools/skill_provenance.py — new ContextVar module mirroring the _approval_session_key pattern: set_current_write_origin / reset / get / is_background_review. Default origin is 'foreground'; the review fork sets 'background_review'. - run_agent.py — run_conversation() binds the ContextVar from self._memory_write_origin at the top of each call. The review fork runs on its own thread (fresh context), so foreground and review contexts never cross-contaminate. - tools/skill_manager_tool.py — skill_manage(action='create') now only calls mark_agent_created() when is_background_review(). All other cases (foreground create, patch, edit, write_file, delete) continue as before. - tests: test_skill_provenance.py (6 tests covering the ContextVar surface), split test_full_create_via_dispatcher into foreground vs. review-fork variants, curator status tests now mark-first. Why: the agent routinely edits existing user skills on the user's behalf; those writes must never flip provenance. And when a user explicitly asks the foreground agent to create a skill, that skill belongs to the user. The curator should only be cleaning up after its own autonomous sediment from the review nudge loop. 
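The provenance rule in the fix(curator) entry above can be sketched with a `ContextVar`. This is a minimal illustration, not the code in `tools/skill_provenance.py`: the function signatures, the `reset_current_write_origin` name, and the `create_skill` stand-in for `skill_manage(action='create')` are assumptions; only the origin values `'foreground'` / `'background_review'` and the `created_by=agent` marker come from the commit message.

```python
import threading
from contextvars import ContextVar, Token

# Default origin is 'foreground', as the commit message states.
_write_origin: ContextVar[str] = ContextVar("write_origin", default="foreground")


def set_current_write_origin(origin: str) -> Token:
    """Bind the origin for the current context; returns a token for reset."""
    return _write_origin.set(origin)


def reset_current_write_origin(token: Token) -> None:
    """Restore the previous origin (hypothetical name for the 'reset' helper)."""
    _write_origin.reset(token)


def get_current_write_origin() -> str:
    return _write_origin.get()


def is_background_review() -> bool:
    return _write_origin.get() == "background_review"


def create_skill(name: str) -> dict:
    """Hypothetical stand-in for skill_manage(action='create')."""
    record = {"name": name}
    if is_background_review():
        # Only review-fork sediment gets the agent marker.
        record["created_by"] = "agent"
    return record


def run_in_review_fork(name: str, out: list) -> None:
    """Simulates the review fork: binds its origin inside its own thread."""
    token = set_current_write_origin("background_review")
    try:
        out.append(create_skill(name))
    finally:
        reset_current_write_origin(token)


results: list = []
worker = threading.Thread(target=run_in_review_fork, args=("auto-skill", results))
worker.start()
worker.join()

# The foreground context never saw the review fork's binding.
foreground = create_skill("user-skill")
```

Because the binding is made inside the worker thread, the main thread's context keeps the default origin, which is the isolation property the commit relies on.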
* fix(agent): disable SDK retries on per-request OpenAI clients Per-request OpenAI-wire clients (used by both non-streaming and streaming chat-completions paths in _interruptible_api_call) should not run the SDK's built-in retry loop: the agent's outer loop owns retries with credential rotation, provider fallback, and backoff that the SDK can't see. Leaving SDK retries on (default 2) compounds with our outer retries and lets a single hung provider request stretch to ~3x the per-call timeout before our stale detector reports it. Shared/primary clients and Anthropic / Bedrock paths are unaffected (they don't go through here). Salvage of #15811 core improvement — the timeout push-down in the original PR required scaffolding that has since been refactored on main, so only the max_retries=0 change is preserved. Co-authored-by: QifengKuang <k2767567815@gmail.com> * fix(cli): omit empty api_mode when probing custom models * fix(agent): detect Qwen3/Ollama inline thinking after tool calls Ollama serves Qwen3 thinking inside the content field as <think>...</think> blocks rather than in the API-level reasoning_content field. This means _has_structured was False for these responses, so an empty-looking reply after a tool call triggered the nudge instead of the prefill continuation, causing a double-response loop. Fix: detect <think>/<thinking>/<reasoning> in final_response and: 1. Skip the nudge when thinking is present (model is still reasoning) 2. Include _has_inline_thinking in _has_structured so prefill kicks in * fix(email): add required Date header to send_message_tool._send_email Adds RFC 5322 Date header to the _send_email tool path in tools/send_message_tool.py. Issue #15160 noted that both gateway/platforms/email.py and tools/send_message_tool.py construct MIMEMultipart/MIMEText messages without setting a Date header. RFC 5322 requires the Date header; mail filters reject messages that lack it. 
PR #15207 fixed the gateway/platforms/email.py path but did not cover tools/send_message_tool._send_email, which is used by the send_message tool for cross-channel messaging. This change adds msg["Date"] = formatdate(localtime=True) to _send_email, mirroring the fix applied to the gateway email adapter. Closes #15160 * fix(cli): detect quoted relative paths in _detect_file_drop Closes #15197 * docs(model-catalog): rename x-ai/grok-4.20-beta to x-ai/grok-4.20 (#19640) OpenRouter and Nous Portal dropped the -beta suffix from the Grok 4.20 slug. The OpenRouter section already used the new slug; this updates the Nous Portal section and bumps updated_at. * docs: document /kanban slash command (#19584) * docs: document /kanban slash command The kanban user guide and slash-commands reference only mentioned the /kanban slash command in passing. Add a proper section covering: - CLI and gateway both expose the full hermes kanban surface via hermes_cli.kanban.run_slash (identical argument surface) - Mid-run usage: /kanban bypasses the running-agent guard, so reads and writes land immediately while an agent is still in a turn - Auto-subscribe on /kanban create from the gateway — originating chat is subscribed to terminal events, with a worked example - Output truncation (~3800 chars) in messaging - Autocomplete hint list vs full subcommand surface Also adds /kanban rows to both slash-command tables (CLI + messaging) in reference/slash-commands.md and moves it into the 'works in both' notes bucket. * docs(kanban): frame the model's tool surface as primary, CLI as the human surface The kanban user guide and CLI reference read as if you drive the board by running `hermes kanban` commands everywhere. In practice: - **You** (human, scripts, cron, dashboard) use the `hermes kanban …` CLI, the `/kanban …` slash command, or the REST/dashboard. 
- **Workers** spawned by the dispatcher use a dedicated `kanban_*` toolset (`kanban_show`, `kanban_complete`, `kanban_block`, `kanban_heartbeat`, `kanban_comment`, `kanban_create`, `kanban_link`) and never shell out to the CLI. Changes to `user-guide/features/kanban.md`: - New 'Two surfaces' intro distinguishes the two front doors up front. - Quick-start section re-labelled so each step says who is running it (you vs. orchestrator vs. worker). - 'How workers interact with the board' rewritten: - Lead with "Workers do not shell out to `hermes kanban`." - Tool table extended with required params. - Concrete worker-turn example (`kanban_show` → `kanban_heartbeat` → `kanban_complete`) and an orchestrator fan-out example (`kanban_create` x N with `parents=[...]`). - Moved 'Why tools not CLI' from a defensive aside to a clean follow-up section. - 'Worker skill' section explicitly says the lifecycle is taught in tool calls, not CLI commands. - 'Pinning extra skills' reordered — orchestrator tool form first (the usual case), human/CLI second, dashboard third. - 'Orchestrator skill' now shows a canonical `kanban_create` / `kanban_link` / `kanban_complete` tool-call sequence instead of only describing what the skill teaches. - CLI-command-reference heading now clarifies this is the human surface, with a cross-link to the tool-surface section. - 'Runs — one row per attempt' structured-handoff example replaced: the primary example is now `kanban_complete(summary=..., metadata=...)` (what a worker actually does), with the CLI form retained as "when you, the human, need to close a task a worker can't." Changes to `reference/cli-commands.md`: - `hermes kanban` intro marks itself as the human / scripting surface and links out to the worker tool surface. - Corrected `comment <id>` description — the next worker reads it via `kanban_show()`, not by running `hermes kanban show`. 
* docs(kanban-tutorial): reframe worker actions as tool calls Honest answer to Teknium's follow-up: no, the first pass missed the tutorial. The four stories all showed `hermes kanban claim / complete / block / unblock` as if the backend-dev, pm, and reviewer personas were humans running CLI commands. In a real hermes kanban run those agents are dispatcher-spawned workers driving the board through the `kanban_*` tool surface. Changes: - Setup intro now distinguishes the three surfaces up front (dashboard / CLI for you, `kanban_*` tools for workers) and establishes the convention: `bash` blocks are commands *you* run, `# worker tool calls` blocks are what the agent emits. - Story 1 (solo dev schema): 'Claim the schema task, do the work, hand off' block replaced with the dispatcher spawning the backend-dev worker and a `kanban_show → kanban_heartbeat → kanban_complete` tool-call sequence. The 'On the CLI' `hermes kanban show / runs` block re-labelled as 'you peeking at the board' to keep it correct as a human inspection step. - Story 2 (fleet farming): note about structured handoff updated from `--summary` / `--metadata` CLI flags to `kanban_complete(summary=..., metadata=...)` tool form. - Story 3 (role pipeline): the big PM/engineer/reviewer block fully rewritten as three worker tool-call sequences — PM worker completes spec, engineer worker blocks, human/reviewer `hermes kanban unblock` (or `/kanban unblock`), engineer worker respawns and completes. The respawn-as-new-run mechanic is now explicit. - Reviewer paragraph: `build_worker_context` replaced with `kanban_show()` — that's the tool that delivers the parent handoff to the model. - Structured handoff section heading and body updated: `--summary`/`--metadata` → `summary`/`metadata` (tool params), with a note that the tool surface doesn't expose a bulk variant for the same reason the CLI refuses multi-task `complete`. 
Story 4 (circuit breaker) unchanged — its workers fail to spawn, so there are no tool calls to show; the `hermes kanban create` and `hermes kanban runs` commands in it are correctly human-driven. * fix(dashboard): defer unknown-route redirect while dashboard plugins load * fix(dashboard): render null instead of flashing spinner during plugin load * chore(release): AUTHOR_MAP entries for Tier 1d salvage batch * fix(status): show NVIDIA NIM api key status hermes status was missing NVIDIA API key from its API keys display. Now shows NVIDIA NIM ✓/✗ with key hash like other providers. Fixes #16082 * fix(cronjob): advertise 'custom:<name>' provider format in tool schema The `provider` field in CRONJOB_SCHEMA only showed examples like 'openrouter' and 'anthropic', with no mention of the canonical 'custom:<name>' form required for custom_providers entries. When the user has custom providers configured, LLMs tend to write the bare type name ('custom') because the schema does not advertise the ':<name>' suffix. The bare value then serializes into jobs.json and causes the cron job to fail silently at run time — `_resolve_model_override` treats it as a user-specified provider and skips the pin-to-current fallback, but no provider ever resolves from the bare 'custom' string. Clarifying the schema so the canonical form is discoverable addresses the root cause at the tool-definition boundary. * fix(agent): preserve dots in model names for Xiaomi MiMo provider Add 'xiaomi' to the _anthropic_preserve_dots() provider whitelist and 'xiaomimimo.com' to the URL-based fallback check. Without this, normalize_model_name() converts mimo-v2.5 to mimo-v2-5, which the Xiaomi API rejects with HTTP 400. Fixes #16156 * fix(tui): use --outdir instead of --outfile in hermes-ink build script esbuild raises 'Must use outdir when there are multiple input files' on Android/Termux ARM64 with esbuild >=0.25. 
The build script used --outfile=dist/ink-bundle.js which is only valid for a single entry point with no code splitting. Switching to --outdir=dist fixes the error and names the output file dist/entry-exports.js (matching the input file name). Update index.js to import from the new path. Fixes #16072 * fix(delegate): guard _load_config() against delegation: null in config.yaml YAML parses `delegation: null` as Python None. `dict.get(key, {})` only uses the default when the key is *missing*, not when it exists with a None value, so `cfg.get("max_concurrent_children")` crashes with `'NoneType' object has no attribute 'get'`. Same pattern as fd9b692d (fix(tui): tolerate null top-level sections). Use `dict.get(key) or {}` to handle both missing and None-valued keys. Closes: delegation null config crash (same class as #7215, #7346) * fix(doctor): skip /models health check for MiniMax CN (returns 404) MiniMax China (api.minimaxi.com) does not expose a /v1/models endpoint. The doctor command was probing it and reporting HTTP 404 as a warning, even though the API works correctly for chat completions. Set supports_health_check=False for MiniMax CN so doctor shows "(key configured)" instead of the false 404 warning. Refs #12768, #13757 * fix(wecom): set SUPPORTS_MESSAGE_EDITING=False to prevent broken streaming * fix(compressor): trigger fallback on timeout errors alongside model-not-found Previously only HTTP 404/503 and specific error strings triggered a fallback to the main model when the summary model was unavailable. Timeout errors (HTTP 408/429/502/504, or error strings containing 'timeout') entered a short cooldown instead, leaving context to grow unbounded for the rest of the session. Add _is_timeout detection alongside _is_model_not_found so that transient timeout errors on the summary model also trigger immediate fallback to the main model, preventing compression failure from cascading. 
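The `delegation: null` pitfall from the fix(delegate) entry above comes down to how `dict.get` treats a key that exists with a `None` value. A minimal sketch, assuming the parsed config is a plain dict; the function names here are hypothetical and not the actual `_load_config()` code:

```python
def load_delegation_unsafe(config: dict):
    # The default is only used when the key is missing, so an explicit
    # `delegation: null` (parsed as None) is returned as None.
    return config.get("delegation", {})


def load_delegation(config: dict) -> dict:
    # `or {}` covers both the missing key and the None-valued key.
    return config.get("delegation") or {}


# What a YAML loader yields for a file containing `delegation: null`.
parsed = {"delegation": None}

unsafe = load_delegation_unsafe(parsed)   # None, so a later .get() would crash
safe = load_delegation(parsed)            # empty dict, safe to call .get() on
limit = safe.get("max_concurrent_children")
```

One trade-off of the `or {}` form is that any falsy value (for example an empty list) is also replaced by `{}`, which is usually the desired behavior for a section that is expected to be a mapping.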
Closes #15935 * fix(cronjob): treat bare 'custom' provider as unspecified in override `_resolve_model_override` treated any non-empty `provider` string from the LLM as user-specified and skipped the pin-to-current-provider fallback. When the LLM wrote bare `'custom'` (instead of the canonical `'custom:<name>'` referring to a custom_providers entry), the value serialized into jobs.json as `"provider": "custom"` and the scheduler could never resolve a provider from it — the cron job failed silently at run time. Treat bare `'custom'` as "no provider supplied" so the current main provider gets pinned instead, matching behaviour for the omitted case. Defence-in-depth complement to a schema-description fix (#15477) that discourages the LLM from emitting bare `'custom'` in the first place. * fix(cli): remove dead 'q' check from quit command resolution The 'q' alias is defined for 'queue' command in commands.py:93. The hardcoded 'q' in cli.py:5910 was dead code - resolve_command('q') returns the queue CommandDef, so canonical would never be 'q'. Removes the misleading check without changing any behavior: - /quit and /exit still exit (defined aliases) - /q still maps to queue (as intended) * fix(cli): reject invalid argv values from -p/--profile before resolving `_apply_profile_override()` scans `sys.argv` for `-p / --profile` at module import time. When `hermes_cli.main` is imported inside pytest with `-p no:xdist` on the command line, it picks up `'no:xdist'` as a profile name candidate, then passes it to `resolve_profile_env()` which raises `ValueError` (invalid format), and the function calls `sys.exit(1)` — aborting test collection with an INTERNALERROR before any test runs. The same conflict affects any tool or wrapper that uses `-p` for its own flag and then imports `hermes_cli.main`. Fix: add a format guard immediately after step 1 (explicit flag scan). 
If `consume == 2` (the value came from `-p <value>`, not `--profile=value`) and the candidate doesn't match the canonical profile-name pattern `[a-z0-9][a-z0-9_-]{0,63}` (mirrored from `hermes_cli.profiles._PROFILE_ID_RE`), discard it and continue as if no `-p` flag was found. The `active_profile` file-based fallback (step 2) only reads a file written by hermes itself, so it always produces valid names and needs no guard. Regression guard: with the guard reverted, importing `hermes_cli.main` with `sys.argv = ['pytest', '-p', 'no:xdist', ...]` raises `SystemExit(1)`. With the guard in place, the import succeeds and `sys.argv` is left intact for pytest. Legitimate `-p coder` still flows through to `resolve_profile_env()` unchanged. Rebased onto current `origin/main` (`e5dad4ac5`) — the prior branch base (`4fade39c9`) was 824 commits behind and the PR was DIRTY / CONFLICTING. The 1.5 HERMES_HOME-set early-return block has since landed between the original insertion point and step 2; the new guard is positioned correctly before the early return so a bogus `-p` value no longer prevents the early return from kicking in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(model-picker): exclude providers with empty credential pool entries The auth check in list_authenticated_providers used mere key presence in credential_pool to conclude a provider is authenticated. An empty entry (pool_store key with no actual credentials) caused providers like ollama-cloud to appear as authenticated in the model picker even when no OLLAMA_API_KEY was set. The user's picker then offered nemotron-3-super under Ollama Cloud; selecting it routed every subsequent turn to https://ollama.com/v1, which rejected the requests with HTTP 400. Fix: drop the pool_store key-existence check from both section 2 (HERMES_OVERLAYS) and section 2b (CANONICAL_PROVIDERS). 
The following load_pool().has_credentials() call already handles the legitimate pooled- credential case; checking for an empty key just ahead of it was redundant and actively harmful. * fix(browser): allow CDP override to pass requirement checks Treat explicit CDP override mode as a valid browser backend even when agent-browser is absent, and add a regression test to prevent false-negative availability gating. * fix(doctor): check global agent-browser when local install not found When agent-browser is globally installed via 'npm install -g agent-browser' but not present in the local node_modules, doctor falsely warns that it's not installed. Add shutil.which('agent-browser') as a fallback check after the local path check. Closes #15951 * feat(cli,gateway): /new accepts optional session name argument Allow users to start a fresh session and immediately set its title by passing a name to /new (or /reset): /new Refactor auth module Changes: - hermes_cli/commands.py: add args_hint='[name]' to /new command - cli.py: parse title argument in process_command(), pass to new_session() - cli.py: new_session() accepts title=None, sets title via SessionDB - gateway/run.py: _handle_reset_command() parses title, sets on new entry - gateway/session.py: reset_session() accepts optional display_name - tests: add test_new_session_with_title, test_reset_command_with_title, test_new_command_in_help_output All 36 affected tests pass. * fix(cli,gateway): surface title errors from /new <name> The contributor's PR silently swallowed ValueError from SessionDB.set_session_title() with bare except Exception: pass. Users typing /new <title> with an already-in-use title got an untitled session and no feedback. Changes: - cli.py: catch ValueError from both sanitize_title() and set_session_title(); print the error and mark the session untitled in the banner (never echo the rejected title back). 
- gateway/run.py: append a warning note to the reset reply on title rejection; reflect the accepted title in the header. - Add regression tests for the duplicate-title path in CLI and gateway. Also map exx@example.com -> @exxmen in scripts/release.py. * fix(file-tools): cap read_file result size to prevent context window overflow Set max_result_size_chars=100_000 on the read_file registry entry (was float('inf')), closing the Layer 2 defense-in-depth gap in tool_result_storage.py. The existing Layer 1 guard inside _handle_read_file already returns a JSON error for oversized reads; this aligns the registry cap with every other tool. Update test_read_file_never_persisted → test_read_file_result_size_cap to assert 100_000, and add test_read_file_registry_cap_is_100k as an explicit regression guard against re-introducing float('inf'). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(google_oauth): close TOCTOU window when saving credentials * fix(anthropic): strip top-level oneOf/allOf/anyOf from tool input_schema Extends the existing _normalize_tool_input_schema to also drop top-level union keywords that Anthropic's tool schema validator rejects with HTTP 400. Several upstream and plugin tools ship schemas with a top-level oneOf/ allOf/anyOf (common for Pydantic discriminated unions). The existing strip_nullable_unions pass only handles anyOf-with-null patterns; a non-null top-level union keyword sails through and hits the API. Salvage of #16471 — approach folded into the existing normalize helper rather than introducing a parallel _sanitize_input_schema function, to avoid two schema-munging code paths running against the same input. Co-authored-by: Grey0202 <grey0202@users.noreply.github.com> * feat(kanban-dashboard): workspace kind + path inputs in inline create form (#19679) Closes #18718. 
Exposes the existing `workspace_kind` + `workspace_path` fields (already accepted by POST /api/plugins/kanban/tasks) in the dashboard's per-column inline-create form so users can create tasks targeting a git worktree or an explicit directory without dropping back to the CLI. - Add a workspace-kind Select (scratch / worktree / dir) to InlineCreate in plugins/kanban/dashboard/dist/index.js. - Conditionally render a workspace_path Input next to the select when kind != scratch; placeholder tells the user whether the path is required (dir) or optional (worktree — derived from assignee when blank). - Submit wires `workspace_kind` / `workspace_path` into the POST body only when they're non-default, keeping the request shape small and interoperable with older dispatcher versions. E2E verified in a dashboard pointed at the worktree: selecting dir + typing /tmp/test-18718 produces a POST body with {workspace_kind: 'dir', workspace_path: '/tmp/test-18718'} and the task lands in sqlite with those fields set. 42/42 kanban dashboard plugin tests pass. * fix: refresh systemd unit on gateway boot (not just start/restart) (#19684) The resilient restart settings from PR #18639 only took effect when the gateway was started via `hermes gateway start` or `hermes gateway restart` — both of which call refresh_systemd_unit_if_needed() which writes the new unit and runs daemon-reload. However, when the gateway self-restarts via exit-code-75 (stale-code detection after `hermes update`, or the /restart command), systemd respawns the process directly without going through any CLI function. The unit file on disk stays stale, and systemd keeps using the old cached settings (StartLimitBurst=5, RestartSec=30) until someone manually runs `hermes gateway restart`. This meant that after PR #18639 was deployed, users who never ran `hermes gateway restart` manually were still vulnerable to the permanent-death-on-network-outage bug. 
Fix: call refresh_systemd_unit_if_needed() at the top of run_gateway() (the foreground entry point that systemd's ExecStart invokes). This ensures that on every boot — whether triggered by systemd restart, exit-75 respawn, or manual foreground run — the unit definition and daemon state are current. The call is best-effort (exceptions caught) and a no-op when the unit is already current (one stat + string compare). * docs(open-webui): fill gaps in quick setup — verify curls, ollama flag, restart note (#19654) Reported by @neopabo — the Open WebUI page was missing several steps users hit in practice: - Use hermes config set instead of hand-editing .env (matches current UX) - Restart-gateway note after enabling API_SERVER_ENABLED - curl /health + /v1/models verification step before jumping to Docker - ENABLE_OLLAMA_API=false in both docker run and compose snippets to suppress the empty Ollama backend that otherwise clutters the picker - 15-30s startup wait note for first-run embedding model download - Troubleshooting entry for the empty-Ollama-shadowing case - /v1/models troubleshoot command now includes the Authorization header * chore(release): AUTHOR_MAP entries for Tier 1e salvage batch * fix(test): add skip marker for transcription tests requiring faster_whisper TestTranscribeLocalExtended patches faster_whisper.WhisperModel, which triggers an ImportError when the faster_whisper package is not installed. Added a pytest.mark.skipif marker using importlib.util.find_spec so these tests are gracefully skipped instead of failing with ModuleNotFoundError. * fix(test): skip bedrock adapter tests when botocore is not installed Six tests in test_bedrock_adapter.py import botocore.exceptions directly (ConnectionClosedError, EndpointConnectionError, ReadTimeoutError, ClientError) without guarding the import. When botocore is not installed (it's an optional dependency), these tests fail with ModuleNotFoundError instead of being gracefully skipped. 
Added pytest.importorskip('botocore') to each affected test function, following the same pattern used elsewhere in the test suite (e.g. test_voice_mode.py for numpy, test_mcp_oauth.py for mcp). Tests affected: - TestIsStaleConnectionError: 3 tests - TestCallConverseInvalidatesOnStaleError: 3 tests Before: 6 FAIL with ModuleNotFoundError After: 6 SKIP with reason message * fix(mcp): decouple AnyUrl import from mcp dependency AnyUrl was imported inside the same try block as mcp.client.auth, so when the mcp package was not installed, AnyUrl was undefined and _build_client_metadata raised NameError at runtime. Moved the AnyUrl import to its own try/except block so it's available whenever pydantic is installed (which is a core dependency), regardless of whether the mcp SDK is present. Also added pytest.importorskip('mcp') to the three test_build_client_metadata tests that exercise _build_client_metadata, since that function depends on OAuthClientMetadata from the mcp package. * feat(kanban): multi-project boards — one install, many kanbans (#19653) Adds first-class board support to kanban so users can separate unrelated streams of work (projects, repos, domains) into isolated queues. Single- project users stay on the 'default' board and see no UI change. Isolation model --------------- - Each board is a directory at `~/.hermes/kanban/boards/<slug>/` with its own `kanban.db`, `workspaces/`, and `logs/`. The 'default' board keeps its legacy path (`~/.hermes/kanban.db`) for back-compat — fresh installs and pre-boards users get zero migration. - Workers spawned by the dispatcher have `HERMES_KANBAN_BOARD` pinned in their env alongside the existing `HERMES_KANBAN_DB` / `HERMES_KANBAN_WORKSPACES_ROOT` pins, so workers physically cannot see other boards' tasks. - The gateway's single dispatcher loop now sweeps every board per tick; per-tick cost is a few extra filesystem stats. 
- CAS concurrency guarantees are preserved per-board (each board is its own SQLite DB, same WAL+IMMEDIATE machinery as before). CLI --- hermes kanban boards list|create|switch|show|rename|rm hermes kanban --board <slug> <any-subcommand> Board resolution order: `--board` flag → `HERMES_KANBAN_BOARD` env → `~/.hermes/kanban/current` file → `default`. Slug validation is strict: lowercase alphanumerics + hyphens + underscores, 1-64 chars, starts with alphanumeric. Uppercase is auto-downcased; slashes / dots / `..` / control chars are rejected so boards can't name their way out of the boards/ directory. Passive discoverability: when more than one board exists, `hermes kanban list` prints a one-line header ("Board: foo (2 other boards …)") so users who stumble across multi-project never have to hunt for the feature. Invisible for single-board installs. Dashboard --------- - New `BoardSwitcher` component at the top of the Kanban tab: dropdown with all boards + task counts, `+ New board` button, `Archive` button (non-default only). Hidden entirely when only `default` exists and is empty — single-project users never see it. - New `NewBoardDialog` modal: slug / display name / description / icon + "switch to this board after creating" checkbox. - Selected board persists to `localStorage` so browser users don't shift the CLI's active board out from under a terminal they left open. - New `?board=<slug>` query param on every existing endpoint plus a new `/boards` CRUD surface (`GET /boards`, `POST /boards`, `PATCH /boards/<slug>`, `DELETE /boards/<slug>`, `POST /boards/<slug>/switch`). - Events WebSocket is pinned to a board at connection time; switching opens a fresh WS against the new board. Also fixes a pre-existing bug in the plugin's tenant / assignee filters: the SDK's `Select` uses `onValueChange(value)`, not native `onChange(event)`, so those filters silently didn't work. New `selectChangeHandler` helper wires both signatures. 
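The board resolution order and slug rules from the CLI section above can be sketched as follows. This is an illustration under assumptions: the function names and the regex are a rendering of the rules as stated in the commit message (lowercase alphanumerics, hyphens, underscores, 1-64 chars, alphanumeric first character, uppercase auto-downcased), not the project's actual implementation.

```python
import re
from pathlib import Path
from typing import Mapping, Optional

# Assumed rendering of the stated slug rules.
_SLUG_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def normalize_slug(raw: str) -> str:
    """Downcase, then reject anything outside the allowed character set."""
    slug = raw.lower()
    if not _SLUG_RE.fullmatch(slug):
        raise ValueError(f"invalid board slug: {raw!r}")
    return slug


def resolve_board(
    flag: Optional[str],
    env: Mapping[str, str],
    current_file: Path,
) -> str:
    """--board flag, then HERMES_KANBAN_BOARD, then the current file, then 'default'."""
    if flag:
        return normalize_slug(flag)
    env_value = env.get("HERMES_KANBAN_BOARD")
    if env_value:
        return normalize_slug(env_value)
    if current_file.is_file():
        saved = current_file.read_text().strip()
        if saved:
            return normalize_slug(saved)
    return "default"


missing = Path("/nonexistent-hermes-example/kanban/current")
from_flag = resolve_board("Hermes-Agent", {"HERMES_KANBAN_BOARD": "other"}, missing)
from_env = resolve_board(None, {"HERMES_KANBAN_BOARD": "atm10-server"}, missing)
fallback = resolve_board(None, {}, missing)
```

Slashes and dots fall outside the character class, so a value such as `../etc` fails validation before it can be joined onto the boards directory.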
Tests ----- 49 new tests in `tests/hermes_cli/test_kanban_boards.py` covering: slug validation (valid / invalid / auto-downcase), path resolution (default = legacy path, named = `boards/<slug>/`, env var override), current-board resolution chain (env > file > default), board CRUD + archive / hard-delete, per-board connection isolation (tasks don't leak), worker spawn env injection (`HERMES_KANBAN_BOARD`, `HERMES_KANBAN_DB`, `HERMES_KANBAN_WORKSPACES_ROOT` all point at the right board), and end-to-end CLI surface. Regression surface: all 264 pre-existing kanban tests continue to pass. Live-tested via the dashboard: created 3 boards (default, hermes-agent, atm10-server), created tasks on each via both CLI (`--board <slug> create`) and dashboard (inline create on the Ready column), confirmed zero cross-board leakage, confirmed `BoardSwitcher` + `NewBoardDialog` work end-to-end in the browser. * fix(cli): check updates against upstream/main for fork users * fix(image-gen): preserve xAI API error status * fix(dashboard): show custom theme palette swatches * fix(security): restore .env/auth.json/state.db with 0600 perms `hermes import` was creating secret files with the process umask (typically 0644) instead of 0600. zipfile.open() does not honor the Unix mode bits stored in zip member external_attr; the restore loop used open(target, "wb") which always falls back to umask. Threat: silent privilege downgrade after a routine restore on multi-user systems (shared dev boxes, CI runners, jump hosts) — any local user could read API keys and OAuth tokens from ~/.hermes/. Fix mirrors the convention already used at file creation (hermes_cli/auth.py: stat.S_IRUSR | stat.S_IWUSR for auth.json). The quick-snapshot restore path (restore_quick_snapshot) is unaffected — it uses shutil.copy2 which preserves perms via copystat(). 
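The permission problem in the fix(security) entry above can be illustrated with a small restore helper. This is a sketch for POSIX systems, not the actual `hermes import` code: the `restore_member` helper and `SECRET_NAMES` set are hypothetical, and member names are assumed to be plain file names with no directory components.

```python
import io
import os
import stat
import tempfile
import zipfile

# File names taken from the commit title; the set itself is illustrative.
SECRET_NAMES = {".env", "auth.json", "state.db"}


def restore_member(zf: zipfile.ZipFile, member: str, dest_dir: str) -> str:
    """Write one archive member to dest_dir, forcing 0600 on secret files."""
    target = os.path.join(dest_dir, member)
    data = zf.read(member)
    if os.path.basename(member) in SECRET_NAMES:
        # Create the file owner-read/write only instead of relying on umask.
        fd = os.open(
            target,
            os.O_WRONLY | os.O_CREAT | os.O_TRUNC,
            stat.S_IRUSR | stat.S_IWUSR,
        )
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        # The mode passed to os.open is ignored when the file already exists,
        # so tighten it explicitly as well.
        os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)
    else:
        with open(target, "wb") as fh:
            fh.write(data)
    return target


# Build a tiny in-memory backup and restore it into a temp directory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr(".env", "API_KEY=example\n")
    archive.writestr("notes.txt", "hello\n")
buf.seek(0)

with tempfile.TemporaryDirectory() as tmp, zipfile.ZipFile(buf) as archive:
    env_path = restore_member(archive, ".env", tmp)
    restore_member(archive, "notes.txt", tmp)
    env_mode = stat.S_IMODE(os.stat(env_path).st_mode)
    with open(env_path, "rb") as fh:
        env_bytes = fh.read()
```

Setting the mode at creation time avoids a window in which the secret file exists with umask-derived permissions before a later `chmod` runs.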
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(profiles): normalize profile IDs for Kanban assignees and lookups - Add normalize_profile_name() for lowercase canonical IDs and Default alias - Use canonical names in create/delete/rename/export/import/set_active paths - Canonicalize Kanban assignee on create/assign, list filter, and worker spawn - Tests for mixed-case assignees and profile resolution (fixes #18498) * fix(profiles): keep validate_profile_name strict; callers normalize first Follow-up to @changchun989's cherry-pick: reverts the validate-via- normalize change so validate_profile_name remains a strict regex check on the input AS-GIVEN. Callers that accept mixed-case user input (dashboard UI, CLI args, import flows) call normalize_profile_name() first, then validate the result. This keeps validate honest about what the on-disk directory name must look like — e.g. ' jules ' (trailing whitespace) is now rejected instead of silently trimmed and accepted. - validate_profile_name: strict lowercase/regex check again, 'UPPER' back in the invalid-names parametrize - 8 call sites in profiles.py (create_profile, delete_profile, set_active_profile, export_profile, import_profile, rename_profile, resolve_profile_env, plus the clone_from branch): swap the normalize-then-validate order - scripts/release.py: add changchun989@proton.me -> changchun989 to AUTHOR_MAP so CI doesn't block on the unmapped contributor email All kanban + profile tests pass (268 across test_profiles.py + test_kanban_db.py + test_kanban_core_functionality.py, plus 73 in test_kanban_tools.py + test_kanban_dashboard_plugin.py). Closes #18498. 
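The normalize-first, validate-second split described in the fix(profiles) entries above can be sketched as below. The pattern is the one quoted earlier in this log as mirrored from `hermes_cli.profiles._PROFILE_ID_RE`; the function bodies are assumptions, including the choice to trim whitespace in the normalize step, and `canonical_profile` is a hypothetical caller.

```python
import re

# Pattern as quoted in the -p/--profile guard entry of this log.
_PROFILE_ID_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def normalize_profile_name(raw: str) -> str:
    """Lenient step for user input: trim and downcase (assumed behavior)."""
    return raw.strip().lower()


def validate_profile_name(name: str) -> None:
    """Strict step: checks the input exactly as given, no trimming."""
    if not _PROFILE_ID_RE.fullmatch(name):
        raise ValueError(f"invalid profile name: {name!r}")


def canonical_profile(raw: str) -> str:
    """Hypothetical caller: normalize first, then validate the result."""
    name = normalize_profile_name(raw)
    validate_profile_name(name)
    return name


def is_valid(name: str) -> bool:
    try:
        validate_profile_name(name)
        return True
    except ValueError:
        return False


accepted = canonical_profile(" Jules ")   # caller-side normalization
strict_padded = is_valid(" jules ")       # strict check on raw input
strict_upper = is_valid("UPPER")
```

Keeping the validator strict means it always describes what the on-disk directory name must look like, while the leniency lives only in the callers that accept user input.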
* fix(env): pass -- to cd for hyphen-prefixed workdirs * fix(test): correct _coerce_number inf/nan test assertions The test 'test_inf_stays_string_for_integer_only' incorrectly asserted that _coerce_number('inf') returns float('inf'), but the function correctly returns the original string 'inf' because infinity is not JSON-serializable. Fixed the assertion to expect the string 'inf', and added two new tests for negative infinity and NaN edge cases to improve coverage of the non-JSON-serializable number guard in _coerce_number(). * fix(kanban): reject direct status transition to 'running' via dashboard API The PATCH /tasks/:id endpoint allows setting status='running' via _set_status_direct(), bypassing the dispatcher/claim path that creates run rows, claim locks, expiry, and worker process metadata. This can leave tasks stuck in 'running' with no active worker. Fix: reject status='running' with HTTP 400, requiring all transitions to 'running' to go through the canonical claim_task() path. Closes #19535 * test(kanban): regression for status=running rejection at dashboard PATCH Reporter of #19535 explicitly asked for a regression test — covers it here so a future refactor of _set_status_direct can't silently re-enable the direct ready/todo -> running bypass. Asserts both: (a) HTTP 400 with 'running' in the detail message, and (b) the task's status is unchanged after the rejected PATCH (pre-request status preserved, no partial mutation). * docs(kanban): backfill multi-board refs in reference docs (#19704) Followup to #19653. The feature PR updated the Kanban user guide but missed four other pages that document the same surface. Caught when Teknium asked 'did you add docs to the guide and any other kanban related docs around this?'. - reference/cli-commands.md: rewrite the `hermes kanban` section to document the `--board <slug>` global flag, the `boards` subcommand group (list/create/switch/show/rename/rm), board resolution order, and worked examples. 
Also fills in the `create` / `complete` flag lists that had drifted from the current CLI (`--summary`, `--metadata`, `--triage`, `--idempotency-key`, `--max-runtime`, `--skill`). - reference/environment-variables.md: add `HERMES_KANBAN_BOARD` row, update `HERMES_KANBAN_DB` precedence note. - reference/slash-commands.md: add `/kanban boards ...` and `/kanban --board <slug> ...`…


alt-glitch added type/docs Documentation improvements P3 Low — cosmetic, nice to have labels May 4, 2026

teknium1 merged commit 110387d into main May 4, 2026
5 of 8 checks passed

teknium1 deleted the hermes/hermes-a3b30a83 branch May 4, 2026 11:36

bot-ted mentioned this pull request May 4, 2026

chore: sync with upstream main (2026-05-04) bot-ted/hermes-agent#19

Merged

2 participants
