revert(gateway): remove stale-code self-check and auto-restart #20156
Open
teknium1 wants to merge 1 commit into
Removes the _detect_stale_code / _trigger_stale_code_restart mechanism introduced in #17648 and iterated in #19740.

On every incoming message the gateway compared the boot-time git HEAD SHA to the current SHA on disk, and if they differed it would reply with

    Gateway code was updated in the background -- restarting this gateway so your next message runs on the new code. Please retry in a moment.

and then kick off a graceful restart.

This is unwanted behaviour: users who run a long-lived gateway and do their own ad-hoc git operations on the checkout end up with their chat interrupted and the current message dropped every time HEAD moves, with no way to opt out.

If an operator really needs the old protection against stale sys.modules after "hermes update", the SIGKILL-survivor sweep in hermes update (hermes_cli/main.py, also tagged #17648) already handles the supervisor-respawn case on its own.

Removed:

gateway/run.py:
- _STALE_CODE_SENTINELS, _GIT_SHA_CACHE_TTL_SECS
- _read_git_head_sha(), _compute_repo_mtime() module helpers
- class-level _boot_wall_time / _boot_repo_mtime / _boot_git_sha / _stale_code_restart_triggered defaults
- __init__ boot-snapshot block (_boot_*, _cached_current_sha*, _repo_root_for_staleness, _stale_code_notified)
- _current_git_sha_cached(), _detect_stale_code(), _trigger_stale_code_restart() methods
- stale-code check + user-facing restart notice at the top of _handle_message()

tests/gateway/test_stale_code_self_check.py (deleted, 412 lines)

No new logic added. Zero remaining references to any removed symbol. Gateway test suite passes the same 4589 tests it passed before; the 3 pre-existing unrelated failures (discord free-channel, feishu bot admission, teams typing) are unchanged by this commit.
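For readers who have not seen the removed code, the shape of a boot-SHA versus on-disk-SHA comparison can be sketched roughly as below. This is an illustrative reconstruction of the behaviour described above, not the code deleted from gateway/run.py: `read_head_sha` and `StaleCodeCheck` are hypothetical names, and the sketch assumes a plain checkout where `.git` is a directory (not a linked worktree).

```python
from pathlib import Path
from typing import Optional


def read_head_sha(repo_root: Path) -> Optional[str]:
    """Best-effort read of the commit SHA HEAD points at (hypothetical helper)."""
    git_dir = repo_root / ".git"  # assumption: .git is a directory
    try:
        head = (git_dir / "HEAD").read_text().strip()
    except OSError:
        return None
    if not head.startswith("ref:"):
        # Detached HEAD: the file holds the SHA itself.
        return head or None
    ref = head[len("ref:"):].strip()
    try:
        # Loose ref, e.g. .git/refs/heads/main
        return (git_dir / ref).read_text().strip() or None
    except OSError:
        pass
    try:
        # Fall back to packed-refs, whose entries look like "<sha> <refname>".
        for line in (git_dir / "packed-refs").read_text().splitlines():
            parts = line.split(" ", 1)
            if len(parts) == 2 and parts[1].strip() == ref:
                return parts[0]
    except OSError:
        pass
    return None


class StaleCodeCheck:
    """Snapshot HEAD at boot and compare on demand (illustrative only)."""

    def __init__(self, repo_root: Path) -> None:
        self.repo_root = repo_root
        self.boot_sha = read_head_sha(repo_root)

    def is_stale(self) -> bool:
        current = read_head_sha(self.repo_root)
        # If either SHA is unknown, do not claim the code is stale.
        if self.boot_sha is None or current is None:
            return False
        return current != self.boot_sha
```

Under these assumptions, anything that changes the commit HEAD resolves to (a pull, a rebase, a branch switch) flips `is_stale()` to true, whether or not the running process is affected. That is the failure mode this revert addresses.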
What

Removes the "stale-code self-check" feature that made the gateway auto-restart itself whenever `git HEAD` moved on the checkout after boot. When triggered, it dropped the user's current message and replied with:

> Gateway code was updated in the background -- restarting this gateway so your next message runs on the new code. Please retry in a moment.

Why

Undesirable in practice. A long-lived gateway plus any ad-hoc git operation on the checkout (branch switch, rebase, pull, even `git worktree add` in the common dir) flips HEAD, and the user's next message gets hijacked into a forced restart notice with no opt-out. There is no config flag and no way to disable it per-profile: if HEAD moved, your message died.

The original motivation (Issue #17648) was to prevent `ImportError` from stale `sys.modules` after `hermes update`. That concern is already handled on the `hermes update` side by the SIGKILL-survivor sweep in `hermes_cli/main.py` (same issue number), which forces the supervisor to respawn with fresh code. The gateway-side detection loop was a belt-and-suspenders second mechanism, and the suspenders were cutting off circulation.

What was removed

All in `gateway/run.py`:

- Module level: `_STALE_CODE_SENTINELS`, `_GIT_SHA_CACHE_TTL_SECS`, `_read_git_head_sha()`, `_compute_repo_mtime()`
- Class-level defaults: `_boot_wall_time`, `_boot_repo_mtime`, `_boot_git_sha`, `_stale_code_restart_triggered`
- The `__init__` boot-snapshot block (git HEAD read, mtime compute, cache init)
- Methods: `_current_git_sha_cached()`, `_detect_stale_code()`, `_trigger_stale_code_restart()`
- The stale-code check and user-facing restart notice at the top of `_handle_message()`

Also deleted: `tests/gateway/test_stale_code_self_check.py` (412 lines).

706 lines removed, 0 added.

Verification

- `python -c "from gateway import run"` → imports clean.
- `ripgrep '_detect_stale_code|_trigger_stale_code_restart|_read_git_head_sha|_compute_repo_mtime|_GIT_SHA_CACHE_TTL_SECS|_STALE_CODE_SENTINELS|_stale_code_restart_triggered|_boot_git_sha|_boot_repo_mtime|_cached_current_sha|_current_git_sha_cached|_repo_root_for_staleness|_stale_code_notified'` → zero hits anywhere in the repo.
- `scripts/run_tests.sh tests/gateway/` → 4589 passed. The 3 pre-existing unrelated failures (`test_discord_free_channel_skips_auto_thread`, `test_hydrate_bot_identity_populates_self_ids_from_bot_v3_info`, Teams `test_send_typing`) exist on clean `main` and are unchanged by this revert.

Reverts the behaviour introduced in #17648 / #18409 and the SHA-based follow-up in #19740.
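Reviewers without ripgrep installed could reproduce the zero-hits check with a small script along these lines. It is a sketch, not part of this PR: `find_leftover_references` is a hypothetical helper, the symbol list is copied from the search pattern above, and unlike the ripgrep run it only scans `.py` files by default.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Symbols this revert is supposed to have removed (from the PR description).
REMOVED_SYMBOLS = [
    "_detect_stale_code", "_trigger_stale_code_restart", "_read_git_head_sha",
    "_compute_repo_mtime", "_GIT_SHA_CACHE_TTL_SECS", "_STALE_CODE_SENTINELS",
    "_stale_code_restart_triggered", "_boot_git_sha", "_boot_repo_mtime",
    "_cached_current_sha", "_current_git_sha_cached",
    "_repo_root_for_staleness", "_stale_code_notified",
]
PATTERN = re.compile("|".join(re.escape(s) for s in REMOVED_SYMBOLS))


def find_leftover_references(
    root: Path, suffixes: Tuple[str, ...] = (".py",)
) -> List[Tuple[Path, int, str]]:
    """Return (path, line number, line) for every line mentioning a removed symbol."""
    hits = []
    for path in sorted(root.rglob("*")):
        # Skip git internals, directories, and files with other extensions.
        if ".git" in path.parts or not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((path, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    leftovers = find_leftover_references(Path("."))
    for path, lineno, line in leftovers:
        print(f"{path}:{lineno}: {line}")
    raise SystemExit(1 if leftovers else 0)
```

Run it from the repository root with the script itself saved outside the checkout; otherwise it will report its own symbol list as a hit.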