fix(gateway): log runtime-status write failures with rate-limiting (salvage #21158)#21285

Merged
teknium1 merged 2 commits into main from salvage/pr-21158
May 7, 2026

teknium1 (Contributor) commented May 7, 2026

Summary

Gateway platform adapters now surface runtime-status write failures to the log instead of silently swallowing them, with rate-limiting so a persistently broken status dir doesn't spam on every reconnect.

Changes

  • gateway/platforms/base.py: salvage @wabrent's original three-hunk change (replace except: pass with logger.warning in _mark_connected/_mark_disconnected/_set_fatal_error), then consolidate the three try/write/except blocks into a shared _write_runtime_status_safe() helper.
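
For readers without the diff at hand, the first step of that change (swapping a silent except: pass for a logged warning) might look roughly like the sketch below. The function bodies, the logger name, and the always-failing write_runtime_status stand-in are assumptions for illustration, not the actual code in gateway/platforms/base.py.

```python
import logging

logger = logging.getLogger("gateway.sketch.before_after")


def write_runtime_status(state):
    """Hypothetical stand-in for the real writer; it always fails here."""
    raise PermissionError("status dir not writable")


def mark_connected_before():
    # Old behaviour: the failure is swallowed and nothing reaches the log.
    try:
        write_runtime_status("connected")
    except Exception:
        pass


def mark_connected_after():
    # New behaviour: the failure is still non-fatal, but it is now visible.
    try:
        write_runtime_status("connected")
    except Exception as exc:
        logger.warning("Failed to write runtime status (connected): %s", exc)
```

Neither variant lets the exception escape, so connection handling is unaffected; the only difference is whether the operator can see that the status file was not written.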

Improvements during salvage

  • First failure per (platform, context) logs at warning level with the exception detail; subsequent failures downgrade to debug. This avoids log-spam on reconnect loops when the status dir is persistently broken (permissions, ENOSPC).
  • Uses getattr for the per-instance _status_write_logged set so test harnesses that bypass __init__ via object.__new__(Adapter) (a common pattern in tests/gateway/) still work.
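
A minimal sketch of how these two behaviours could fit together in one helper, assuming a lazily created per-instance set keyed by (platform, context). The Adapter class, the message wording, and the failing write_runtime_status stand-in are hypothetical; only the names _write_runtime_status_safe, _status_write_logged, _mark_connected, and _set_fatal_error come from this PR.

```python
import logging

logger = logging.getLogger("gateway.sketch.status")


def write_runtime_status(platform, state):
    """Hypothetical stand-in for the real writer; it always fails here."""
    raise PermissionError("status dir not writable")


class Adapter:
    """Hypothetical adapter showing only the parts relevant to the sketch."""

    platform = "telegram"

    def __init__(self):
        self._status_write_logged = set()

    def _write_runtime_status_safe(self, context, state):
        try:
            write_runtime_status(self.platform, state)
        except Exception as exc:
            # getattr tolerates instances built with object.__new__(Adapter),
            # where __init__ never ran and the attribute does not exist yet.
            logged = getattr(self, "_status_write_logged", None)
            if logged is None:
                logged = self._status_write_logged = set()
            key = (self.platform, context)
            if key in logged:
                # Repeat failure for this key: keep it out of the default log.
                logger.debug("runtime-status write failed again (%s): %s", context, exc)
            else:
                logged.add(key)
                logger.warning("runtime-status write failed (%s): %s", context, exc)

    def _mark_connected(self):
        self._write_runtime_status_safe("connected", "connected")

    def _set_fatal_error(self, message):
        self._write_runtime_status_safe("fatal", "fatal")
```

Keying on context as well as platform means a new kind of failure (for example "fatal" after a run of "connected" failures) still gets one warning of its own.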

Validation

  • scripts/run_tests.sh tests/gateway/ -k 'base or platform': 335 passed, 7 skipped.
  • E2E: simulated persistent PermissionError from write_runtime_status() called 5× on "connected" + 2× on "fatal" → exactly 2 WARNINGs, 5 DEBUGs. Happy path: zero log output. object.__new__(Adapter) harness does not crash on getattr fallback.
  • Compile check passes.
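
The E2E scenario above can be approximated with a small harness like the following. It is a sketch under stated assumptions, not the script that was actually run: the Adapter class is a simplified stand-in with a hypothetical fail flag, and RecordCounter and run_scenario are names invented for this example.

```python
import logging
from collections import Counter

logger = logging.getLogger("gateway.sketch.e2e")


class RecordCounter(logging.Handler):
    """Counts emitted records by level name."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def emit(self, record):
        self.counts[record.levelname] += 1


class Adapter:
    """Hypothetical stand-in for a platform adapter."""

    platform = "telegram"
    fail = True  # hypothetical toggle simulating a broken status dir

    def _write_runtime_status_safe(self, context):
        try:
            if self.fail:
                raise PermissionError("status dir not writable")
        except Exception as exc:
            logged = getattr(self, "_status_write_logged", None)
            if logged is None:
                logged = self._status_write_logged = set()
            first = (self.platform, context) not in logged
            logged.add((self.platform, context))
            level = logging.WARNING if first else logging.DEBUG
            logger.log(level, "runtime-status write failed (%s): %s", context, exc)


def run_scenario(fail):
    """Run 5x 'connected' then 2x 'fatal' and return record counts by level."""
    handler = RecordCounter()
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    try:
        adapter = object.__new__(Adapter)  # bypass __init__, as the tests do
        adapter.fail = fail
        for context in ["connected"] * 5 + ["fatal"] * 2:
            adapter._write_runtime_status_safe(context)
    finally:
        logger.removeHandler(handler)
    return handler.counts
```

Under these assumptions, run_scenario(True) counts two WARNING and five DEBUG records, and run_scenario(False) counts none, which mirrors the failure and happy-path results listed above.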

Closes #21158 via salvage. Co-authored by @wabrent.

wabrent and others added 2 commits May 7, 2026 06:27
…logs

Extracts the three try/write_runtime_status/except-log blocks into a
shared _write_runtime_status_safe() helper. On failure, logs the first
occurrence per (platform, context) at warning level and downgrades
subsequent failures to debug — so a persistently broken status dir
(permissions, ENOSPC) doesn't spam the log on every Telegram reconnect.

Uses getattr for the _status_write_logged set so test harnesses that
skip __init__ (object.__new__(Adapter)) don't break.

Follow-up to the salvaged #21158.
@teknium1 teknium1 merged commit 0efc547 into main May 7, 2026
9 of 11 checks passed
@teknium1 teknium1 deleted the salvage/pr-21158 branch May 7, 2026 13:30
github-actions Bot (Contributor) commented May 7, 2026

🔎 Lint report: salvage/pr-21158 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 7531 on HEAD, 7531 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 3953 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

alt-glitch added the labels type/bug (Something isn't working), P3 (Low: cosmetic, nice to have), and comp/gateway (Gateway runner, session dispatch, delivery) on May 7, 2026

Labels

  • comp/gateway: Gateway runner, session dispatch, delivery
  • P3: Low (cosmetic, nice to have)
  • type/bug: Something isn't working

3 participants