[Bug]: Discord adapter creates zombie websocket connection on reconnect, causing double responses #18187

@tilarso

Description

When the gateway restarts (or the Discord adapter reconnects), DiscordAdapter.connect() creates a new commands.Bot client but never closes the old one. Discord doesn't immediately terminate the old websocket, so two connections stay live for a window of time. Both connections receive every incoming message, so two separate agent turns are spawned, each generating a different response.

Symptoms

  • Every Discord message triggers two responses with different wording (not a duplicate of the same response)
  • When auto_thread is enabled: one response appears in the auto-created thread (correct), a second response appears directly in the parent channel (incorrect)
  • Gateway log shows the same message arriving twice ~400ms apart:
    inbound message: platform=discord user=X chat=Y msg='hello'
    inbound message: platform=discord user=X chat=Y msg='hello'   ← ~400ms later
    
  • Only one gateway process is running (ps aux confirms)
  • MessageDeduplicator exists and is correctly placed, but fails due to the race condition between two concurrent websocket deliveries

Root Cause

In gateway/platforms/discord.py, connect() unconditionally creates a new commands.Bot instance:

self._client = commands.Bot(
    command_prefix="!",
    intents=intents,
    ...
)

When connect() is called a second time (e.g. during reconnect in run.py line ~2848), the old self._client is orphaned — still connected to Discord's gateway — while the new client also connects. Both are alive simultaneously and both fire on_message for every event.

The MessageDeduplicator (per-adapter instance) cannot prevent duplicates because both websockets deliver the event independently, and the two on_message coroutines may check is_duplicate before either has marked the ID as seen (race condition).
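The race described above can be reproduced in isolation. The following is a minimal sketch, not the gateway's actual code: NaiveDeduplicator, AtomicDeduplicator, mark_seen, check_and_mark, and both handler functions are hypothetical names, and the sketch assumes that an await point sits between the is_duplicate check and the point where the ID is recorded. Under that assumption, two deliveries of the same message both pass the check; a variant that checks and records in one step with no await in between lets only one through.

```python
import asyncio


class NaiveDeduplicator:
    """Hypothetical stand-in: checking and marking are separate steps."""

    def __init__(self):
        self._seen = set()

    def is_duplicate(self, message_id):
        return message_id in self._seen

    def mark_seen(self, message_id):
        self._seen.add(message_id)


class AtomicDeduplicator:
    """Hypothetical variant: check and mark happen with no await in between."""

    def __init__(self):
        self._seen = set()

    def check_and_mark(self, message_id):
        # Returns True if the ID was already recorded, otherwise records it.
        if message_id in self._seen:
            return True
        self._seen.add(message_id)
        return False


async def naive_on_message(dedup, message_id, handled):
    if dedup.is_duplicate(message_id):
        return
    # Assumed await point: control passes to the other handler here,
    # before this handler has recorded the ID.
    await asyncio.sleep(0)
    dedup.mark_seen(message_id)
    handled.append(message_id)


async def atomic_on_message(dedup, message_id, handled):
    if dedup.check_and_mark(message_id):
        return
    await asyncio.sleep(0)
    handled.append(message_id)


async def deliver_twice(handler, dedup):
    # Simulates two websocket connections delivering the same event.
    handled = []
    await asyncio.gather(
        handler(dedup, "msg-1", handled),
        handler(dedup, "msg-1", handled),
    )
    return handled


naive_result = asyncio.run(deliver_twice(naive_on_message, NaiveDeduplicator()))
atomic_result = asyncio.run(deliver_twice(atomic_on_message, AtomicDeduplicator()))
print(len(naive_result), len(atomic_result))
```

The atomic variant only illustrates why the ordering matters. It narrows the race but does not remove the second connection, which is what the fix in this issue addresses.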

Fix

Before creating the new Bot instance in connect(), close the old client (awaiting close()) if one exists:

# Add before: self._client = commands.Bot(...)
if self._client and not self._client.is_closed():
    await self._client.close()
    self._client = None
self._ready_event.clear()

This ensures only one Discord websocket connection is ever active for the adapter at any time.
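The effect of the guard can be checked without a Discord connection. This is a rough sketch under stated assumptions: FakeClient is a hypothetical stand-in for commands.Bot that only tracks open and closed state, and SketchAdapter is a hypothetical, stripped-down adapter that reuses the _client and _ready_event names from the snippet above. It is not the real DiscordAdapter.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for commands.Bot; counts open connections."""

    open_connections = 0

    def __init__(self):
        self._closed = False
        FakeClient.open_connections += 1

    def is_closed(self):
        return self._closed

    async def close(self):
        if not self._closed:
            self._closed = True
            FakeClient.open_connections -= 1


class SketchAdapter:
    """Hypothetical adapter showing only the close-before-reconnect guard."""

    def __init__(self):
        self._client = None
        self._ready_event = asyncio.Event()

    async def connect(self):
        # Close the previous client before replacing it, so it is not orphaned.
        if self._client is not None and not self._client.is_closed():
            await self._client.close()
            self._client = None
        self._ready_event.clear()
        self._client = FakeClient()
        self._ready_event.set()


async def reconnect_twice():
    adapter = SketchAdapter()
    await adapter.connect()
    await adapter.connect()  # simulated reconnect
    return FakeClient.open_connections


open_after_reconnect = asyncio.run(reconnect_twice())
print(open_after_reconnect)
```

With the guard removed from connect(), the same run would leave two FakeClient instances open, which mirrors the zombie connection described in this issue.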

Environment

  • Hermes gateway running in Docker container
  • Discord platform adapter
  • auto_thread: true in config (makes the symptom very visible: one response in the thread, one in the parent channel)
  • Triggered by any gateway restart or reconnect cycle

Workaround

Avoid gateway restarts. The zombie connection eventually times out on its own (within minutes), after which responses return to normal until the next restart.

Metadata

Labels

  • P1 (High: major feature broken, no workaround)
  • comp/gateway (Gateway runner, session dispatch, delivery)
  • platform/discord (Discord bot adapter)
  • type/bug (Something isn't working)
