fix(gateway): make reload non-disruptive #13759

Open
kiss-kedaya wants to merge 2 commits into NousResearch:main from kiss-kedaya:fix/reload-non-disruptive

@kiss-kedaya

Summary

  • Treat SIGHUP/SIGUSR1 (systemctl reload / ExecReload) as an in-process refresh only, which avoids the exit-75 restart path.
  • If the restart-notify target chat/topic no longer exists, log at info level and skip the notification.
  • The systemd_restart() fallback path uses systemctl reload instead of reload-or-restart to preserve draining/resume continuity.
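
For readers without the diff open, the three behaviors above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `GatewaySketch`, `ChatNotFound`, `notify_restart`, and `fallback_reload_command` are hypothetical names, and the real gateway's classes, exception types, and unit names may differ.

```python
import logging
import signal

logger = logging.getLogger("gateway.sketch")

# Signals treated as "reload" (POSIX only).
RELOAD_SIGNALS = (signal.SIGHUP, signal.SIGUSR1)


class GatewaySketch:
    """Hypothetical stand-in for the gateway runner (not the project's class)."""

    def __init__(self):
        self.reload_count = 0
        self.exit_requested = False

    def refresh_in_process(self):
        # Placeholder: re-read config here without stopping active sessions.
        self.reload_count += 1

    def handle_signal(self, signum, frame=None):
        if signum in RELOAD_SIGNALS:
            # Reload signals refresh state in place and never request an exit.
            self.refresh_in_process()
        else:
            # Any other handled signal (e.g. SIGTERM) asks the process to stop.
            self.exit_requested = True

    def install(self):
        # Must be called from the main thread.
        for sig in RELOAD_SIGNALS + (signal.SIGTERM,):
            signal.signal(sig, self.handle_signal)


class ChatNotFound(Exception):
    """Hypothetical error raised when the restart-notify target is gone."""


def notify_restart(send, chat_id):
    """Send the restart notice; log at info level and skip if the chat is gone."""
    try:
        send(chat_id, "Gateway restarted")
        return True
    except ChatNotFound:
        logger.info("restart-notify target %s not found; skipping", chat_id)
        return False


def fallback_reload_command(unit):
    # 'reload' rather than 'reload-or-restart': the latter falls back to a
    # restart when the unit cannot be reloaded.
    return ["systemctl", "reload", unit]
```

In this sketch a reload signal only increments a counter; in a real runner that method would re-read whatever state is safe to refresh while sessions keep draining.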

Test Plan

  • pytest -q tests/gateway/test_restart_notification.py
  • pytest -q tests/hermes_cli/test_gateway_service.py

Notes

Includes a separate lockfile drift commit (web/package-lock.json).

root added 2 commits April 22, 2026 09:11
- Treat SIGHUP/SIGUSR1 as in-process refresh (no exit 75 restart path)
- Downgrade restart-notify 'chat not found' to info+skip
- Make systemd_restart fallback use 'reload' instead of reload-or-restart; update tests
Local lockfile drift carried forward during upstream sync.
@alt-glitch added the type/bug (Something isn't working), comp/gateway (Gateway runner, session dispatch, delivery), and comp/cli (CLI entry point, hermes_cli/, setup wizard) labels on Apr 22, 2026
@trevorgordon981

I have tested this solution locally by running the full gateway restart and service management test suite. All 40 tests passed successfully, confirming that:

1. Gateway reloads are non-disruptive to active sessions and background tasks.
2. Service PIDs (launchd/systemd) are correctly excluded from kill operations during updates.
3. Exit codes are written early in gateway mode for proper state tracking before restart.
4. Legacy unit detection and restart logic function flawlessly across platforms (systemd, launchd, manual).

The fix ensures the gateway can be updated and reloaded without dropping connections or killing the service process itself. The solution is stable and ready for merge.

Tested and confirmed. ✅

Labels

  • comp/cli: CLI entry point, hermes_cli/, setup wizard
  • comp/gateway: Gateway runner, session dispatch, delivery
  • type/bug: Something isn't working

Projects

None yet

3 participants