fix(whatsapp): prevent service restart hammering on auth failure #746

Closed
glifocat wants to merge 1 commit into nanocoai:main from glifocat:upstream/fix-whatsapp-reconnect-hammering

Collaborator

@glifocat glifocat commented Mar 5, 2026

Fixes #748

Type of Change

  • Skill - adds a new skill in .claude/skills/
  • Fix - bug fix or security fix to source code
  • Simplification - reduces or simplifies source code

Description

Two issues caused repeated WhatsApp connection attempts after a failed or revoked auth session, risking soft-blocks from WhatsApp:

  1. systemd's Restart=always restarts the service even after a clean exit(0). The 401/logged-out path calls process.exit(0), so systemd restarts the service every 5s indefinitely. The same issue occurs on macOS with launchd's KeepAlive=true.

  2. No reconnection backoff: a transient failure was retried immediately, then once more after a flat 5s, and then the process crashed, triggering another systemd restart cycle.

Fix: Restart=on-failure plus rate limiting for systemd; KeepAlive with SuccessfulExit=false for launchd; exponential backoff (5s→10s→20s…→5min) for inline reconnection. All changes are in skill assets (add/ and modify/); no applied files are committed.
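
For reviewers, the backoff schedule described above (start at 5s, double each attempt, cap at 5 minutes) can be sketched as below. This is a minimal illustration, not the code in this PR: `reconnectDelayMs`, the constant names, and the 0-based attempt counter are all assumptions made for the example.

```typescript
// Hypothetical helper illustrating the schedule; names are not taken from the PR.
const BASE_DELAY_MS = 5000;          // first retry waits 5s
const MAX_DELAY_MS = 5 * 60 * 1000;  // delays are capped at 5 minutes

/** Delay before reconnection attempt `attempt` (0-based): 5s, 10s, 20s, ... up to 5min. */
function reconnectDelayMs(attempt: number): number {
  if (!Number.isInteger(attempt) || attempt < 0) {
    throw new RangeError("attempt must be a non-negative integer");
  }
  // Clamp the exponent so very large attempt counts cannot overflow to Infinity.
  const exponent = Math.min(attempt, 10);
  return Math.min(BASE_DELAY_MS * 2 ** exponent, MAX_DELAY_MS);
}
```

With these constants the cap is reached on the seventh attempt (index 6), since 5s doubled six times is 320s, which exceeds the 300s limit.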

For Skills

  • I have not made any changes to source code
  • My skill contains instructions for Claude to follow (not pre-built code)
  • I tested this skill on a fresh clone

@Andy-NanoClaw-AI Andy-NanoClaw-AI added the PR: Fix (Bug fix) and Status: Needs Review (Ready for maintainer review) labels Mar 5, 2026
klapom added a commit to klapom/nanoclaw that referenced this pull request Mar 5, 2026
…res)

Bug fixes applied:
- nanocoai#636: task-scheduler recalculates next_run before enqueue
- nanocoai#655: LIMIT 200 on message queries to prevent OOM
- nanocoai#670: rateLimitResetAt field in ContainerOutput interface
- nanocoai#694: ANTHROPIC_MODEL passthrough to container env
- nanocoai#700: session rotation at 5MB JSONL threshold
- nanocoai#701: session retry on corrupted resume (clear + retry)
- nanocoai#708: update_task MCP tool in ipc-mcp-stdio
- nanocoai#719: outputChain .catch() to prevent group hang
- nanocoai#729: fix send_message description (remove incorrect scheduled-task note)
- nanocoai#735: datePrefix() injects current date/time into all agent prompts
- nanocoai#738: ANTHROPIC_MODEL from .env passed to agent container
- nanocoai#746: systemd OnFailure restart prevention logic (container hardening)
- nanocoai#751: DM-with-bot JID normalization
- nanocoai#754: setOnPipeCallback to reset idle timer on piped messages
- nanocoai#756: cursorBeforePipe rollback on container crash

Features added:
- nanocoai#723: streaming infrastructure (STREAM_TEXT markers, onStreamDelta)
- nanocoai#742: container hardening (entrypoint.sh privilege drop, env sanitize)
- nanocoai#680: add-cli skill (CLI send binary)
- nanocoai#727: add-memory skill extracted to .claude/skills/add-memory/
- nanocoai#744: add-s3-storage skill extracted to .claude/skills/add-s3-storage/

Test fixes:
- Mock fs.promises in container-runner.test.ts to prevent real I/O
- Add ANTHROPIC_MODEL to config mock
- Fix cpSync expectation: { recursive: true, force: true }
- Fix isActive() to use state.active instead of state.process
- Fix container-runtime error message: Docker → Container runtime

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@glifocat glifocat force-pushed the upstream/fix-whatsapp-reconnect-hammering branch 2 times, most recently from aa46f21 to f74f53f Compare March 7, 2026 17:31
Two separate issues caused repeated WhatsApp connections after a failed
or revoked auth session:

1. The systemd unit used Restart=always, which restarts the service even
   on a clean exit(0). Since the 401/logged-out path calls process.exit(0),
   systemd would immediately restart and retry — repeating indefinitely.
   Fix: Restart=on-failure (only restart on crashes), RestartSec=10,
   StartLimitBurst=5 per 5 minutes.
   Same fix for launchd: KeepAlive with SuccessfulExit=false.

2. Inline reconnection logic had no backoff — it retried immediately then
   after a flat 5s, then gave up and crashed, which triggered systemd restart.
   Fix: exponential backoff starting at 5s, doubling each attempt, capped
   at 5 minutes.

Changes are in skill assets (add/ and modify/) so they apply correctly
when the skill is installed on a fresh project.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
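
Why item 1 of the commit message matters can be shown with a deliberately simplified model of the two systemd restart policies. It is a sketch under stated assumptions: only exit codes are modeled (signals, timeouts, and StartLimitBurst rate limiting are ignored), and `wouldRestart` and the two exit-code constants are hypothetical names, not part of the PR.

```typescript
// Simplified model of systemd's Restart=always vs Restart=on-failure.
// Assumption: only the process exit code is considered here.
type RestartPolicy = "always" | "on-failure";

function wouldRestart(policy: RestartPolicy, exitCode: number): boolean {
  // Restart=always restarts the service no matter how it exited.
  if (policy === "always") return true;
  // Restart=on-failure leaves a clean exit (code 0) alone.
  return exitCode !== 0;
}

// Hypothetical constants for the two paths discussed in the commit message.
const EXIT_LOGGED_OUT = 0; // the 401/logged-out path calls process.exit(0)
const EXIT_CRASH = 1;      // any non-zero code counts as a failure

console.log(wouldRestart("always", EXIT_LOGGED_OUT));     // true: the hammering case
console.log(wouldRestart("on-failure", EXIT_LOGGED_OUT)); // false: service stays down
console.log(wouldRestart("on-failure", EXIT_CRASH));      // true: crashes still recover
```

Under this model the logged-out path stops triggering restarts once the policy changes, while genuine crashes are still restarted.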
@glifocat glifocat force-pushed the upstream/fix-whatsapp-reconnect-hammering branch from f74f53f to 66b0a7e Compare March 8, 2026 03:43
@Andy-NanoClaw-AI Andy-NanoClaw-AI added the Status: Blocked (Blocked by merge conflicts or dependencies) and Status: Needs Review (Ready for maintainer review) labels and removed the Status: Needs Review and Status: Blocked labels Mar 14, 2026
@gavrielc gavrielc closed this May 1, 2026
Labels

PR: Fix (Bug fix), Status: Blocked (Blocked by merge conflicts or dependencies)

Development

Successfully merging this pull request may close these issues.

bug(whatsapp): service restart hammers WhatsApp on auth failure
