
fix(gateway): preserve pending update prompts across restarts #20160

Merged
teknium1 merged 1 commit into main from hermes/hermes-e933e80c
May 5, 2026

@teknium1 teknium1 commented May 5, 2026

Salvage of #18477 onto current main. Cherry-picked @simbam99's commit with authorship preserved — no follow-up fixes needed, cherry-pick applied cleanly.

Summary

Stops _watch_update_progress() from unlinking .update_prompt.json the moment it forwards the prompt. The marker now stays on disk until the user actually replies (or cancels via a recognized slash command), so a gateway restart mid-prompt can recover the outstanding question.
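A minimal sketch of the forwarding behaviour described above, not the actual gateway code. The helper name forward_prompt_once, the {"prompt": ...} JSON layout, and the session key are assumptions made for illustration; only the file name .update_prompt.json and the in-memory pending flag come from this PR.

```python
import json
import tempfile
from pathlib import Path


def forward_prompt_once(prompt_path: Path, session_key: str, pending: dict, send) -> bool:
    """Forward the prompt at most once per watcher and leave the file on disk.

    Hypothetical helper: the real logic lives in _watch_update_progress().
    """
    if pending.get(session_key) or not prompt_path.exists():
        return False
    payload = json.loads(prompt_path.read_text())
    send(payload["prompt"])          # assumed JSON layout: {"prompt": "..."}
    pending[session_key] = True      # in-memory duplicate-send suppression
    # Deliberately no prompt_path.unlink() here: the marker must outlive a restart.
    return True


# Demo in a throwaway directory.
state_dir = Path(tempfile.mkdtemp())
prompt_path = state_dir / ".update_prompt.json"
prompt_path.write_text(json.dumps({"prompt": "Restore local changes? [Y/n]"}))

sent = []
pending = {}
first = forward_prompt_once(prompt_path, "telegram:42", pending, sent.append)
second = forward_prompt_once(prompt_path, "telegram:42", pending, sent.append)
```

The second call is suppressed by the in-memory flag, while the file itself remains available to a watcher started after a restart.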

Why

On current main: user runs /update on Telegram/Discord → update subprocess asks "Restore local changes? [Y/n]" → gateway forwards → gateway restarts before the user replies. The prompt file was deleted eagerly and _update_prompt_pending is in-memory only, so the user's y falls through as a normal chat message and the detached update subprocess hangs on .update_response until its 300s timeout fires.

With this fix, _schedule_update_notification_watch() at gateway startup (run.py:3314) already re-fires _watch_update_progress() whenever .update_pending.json is on disk — that watcher now finds .update_prompt.json still present and re-forwards it.
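The recovery path can be sketched roughly as follows. This is an illustration under assumptions, not the code in run.py: recover_prompt_on_startup is a hypothetical name, and the {"prompt": ...} JSON layout is assumed. It shows why the prompt is only recoverable when the marker survives forwarding.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def recover_prompt_on_startup(state_dir: Path) -> Optional[str]:
    """Return the outstanding prompt text if an update is pending, else None.

    Hypothetical stand-in for the startup watcher re-forwarding the prompt.
    """
    if not (state_dir / ".update_pending.json").exists():
        return None                      # no update in flight, nothing to watch
    prompt_path = state_dir / ".update_prompt.json"
    if not prompt_path.exists():
        return None                      # prompt already answered (or eagerly deleted)
    return json.loads(prompt_path.read_text()).get("prompt")


state_dir = Path(tempfile.mkdtemp())
(state_dir / ".update_pending.json").write_text("{}")
(state_dir / ".update_prompt.json").write_text(
    json.dumps({"prompt": "Restore local changes? [Y/n]"})
)

# With the fix: the marker is still on disk after a restart.
recovered = recover_prompt_on_startup(state_dir)

# Old behaviour: the marker was unlinked right after forwarding.
(state_dir / ".update_prompt.json").unlink()
lost = recover_prompt_on_startup(state_dir)
```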

Changes

  • gateway/run.py: remove eager prompt_path.unlink() after forwarding; add prompt_path.unlink(missing_ok=True) on the two sites that write .update_response (normal reply + recognized-slash-command cancel).
  • Duplicate-send suppression within a single watcher still works — _update_prompt_pending[session_key] = True is untouched.
  • tests/gateway/test_update_streaming.py: 5 new/updated cases covering restart-recovery, cleanup on normal reply, recognized-slash cancel, and unrecognized-slash response paths.
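As a rough sketch of the cleanup now performed at the sites that write .update_response: the helper name answer_update_prompt and the plain-text response format are assumptions; the unlink(missing_ok=True) call is the one named in the change list.

```python
import tempfile
from pathlib import Path


def answer_update_prompt(state_dir: Path, reply: str) -> None:
    """Write the user's answer, then clear the prompt marker.

    Hypothetical helper; in the PR this happens inline in gateway/run.py.
    """
    (state_dir / ".update_response").write_text(reply)   # assumed plain-text format
    # missing_ok=True (Python 3.8+) makes the cleanup safe if the marker is already gone.
    (state_dir / ".update_prompt.json").unlink(missing_ok=True)


state_dir = Path(tempfile.mkdtemp())
(state_dir / ".update_prompt.json").write_text("{}")

answer_update_prompt(state_dir, "y")
answer_update_prompt(state_dir, "y")   # second call must not raise FileNotFoundError

response_text = (state_dir / ".update_response").read_text()
prompt_still_there = (state_dir / ".update_prompt.json").exists()
```

Writing the response before unlinking means a crash between the two steps leaves a stale prompt marker rather than a lost answer.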

Verification

  • Cherry-pick applied cleanly onto current origin/main.
  • scripts/run_tests.sh tests/gateway/test_update_streaming.py -v → 21 passed including the new test_prompt_is_recovered_after_watcher_restart.
  • Confirmed update-subprocess side (_gateway_prompt in hermes_cli/main.py:5200) only polls .update_response — never re-reads .update_prompt.json — so keeping the marker around has zero effect on the writer.
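The last point can be illustrated with a simplified polling loop. This is only a sketch of the shape described above, not _gateway_prompt itself: wait_for_response, its parameters, and the plain-text response format are assumptions, while the 300s default mirrors the timeout mentioned in this PR.

```python
import tempfile
import time
from pathlib import Path
from typing import Optional


def wait_for_response(response_path: Path, timeout: float = 300.0,
                      poll_interval: float = 0.5) -> Optional[str]:
    """Poll for the response file until it appears or the timeout expires.

    Hypothetical simplification: only the response file is ever consulted,
    so a lingering .update_prompt.json has no influence on the result.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if response_path.exists():
            return response_path.read_text().strip()
        time.sleep(poll_interval)
    return None


state_dir = Path(tempfile.mkdtemp())
prompt_path = state_dir / ".update_prompt.json"
response_path = state_dir / ".update_response"
prompt_path.write_text("{}")            # marker left on disk, as after this fix

# No reply yet: the loop times out regardless of the prompt marker.
timed_out = wait_for_response(response_path, timeout=0.05, poll_interval=0.01)

# Reply arrives: the loop returns it.
response_path.write_text("y\n")
answer = wait_for_response(response_path, timeout=0.05, poll_interval=0.01)
```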

Closes #18477.

teknium1 merged commit 8ad5e98 into main on May 5, 2026
9 of 10 checks passed
teknium1 deleted the hermes/hermes-e933e80c branch on May 5, 2026 at 10:59
alt-glitch added labels on May 5, 2026: type/bug (Something isn't working), comp/gateway (Gateway runner, session dispatch, delivery), P2 (Medium: degraded but workaround exists)