Closed
9 changes: 7 additions & 2 deletions gateway/platforms/discord.py
@@ -1768,9 +1768,14 @@ async def slash_reload_mcp(interaction: discord.Interaction):
await self._run_simple_slash(interaction, "/reload-mcp")

@tree.command(name="voice", description="Toggle voice reply mode")
@discord.app_commands.describe(mode="Voice mode: on, off, tts, channel, leave, or status")
@discord.app_commands.describe(mode="Voice mode: join, channel, leave, on, tts, off, or status")
@discord.app_commands.choices(mode=[
discord.app_commands.Choice(name="channel — join your voice channel", value="channel"),
# `join` and `channel` both route to _handle_voice_channel_join in
# gateway/run.py — expose both in the slash UI so autocomplete
# matches what the docs advertise and what the runner accepts when
# the command is typed as plain text.
discord.app_commands.Choice(name="join — join your voice channel", value="join"),
discord.app_commands.Choice(name="channel — join your voice channel (alias)", value="channel"),
discord.app_commands.Choice(name="leave — leave voice channel", value="leave"),
discord.app_commands.Choice(name="on — voice reply to voice messages", value="on"),
discord.app_commands.Choice(name="tts — voice reply to all messages", value="tts"),
2 changes: 2 additions & 0 deletions website/docs/reference/environment-variables.md
@@ -196,6 +196,8 @@ For cloud sandbox backends, persistence is filesystem-oriented. `TERMINAL_LIFETI
| `DISCORD_IGNORED_CHANNELS` | Comma-separated channel IDs where the bot never responds |
| `DISCORD_NO_THREAD_CHANNELS` | Comma-separated channel IDs where bot responds without auto-threading |
| `DISCORD_REPLY_TO_MODE` | Reply-reference behavior: `off`, `first` (default), or `all` |
| `DISCORD_PROXY` | Proxy URL for all Discord traffic (gateway websocket, REST, and attachment downloads). Overrides `HTTPS_PROXY`. Supports `http://`, `https://`, `socks5://` (SOCKS requires `aiohttp_socks`). |
| `HERMES_DISCORD_VOICE_PACKET_DUMP` | Debug knob for the Discord voice-channel packet handler. `"errors"` (default) dumps on circuit-breaker trip; `"all"` traces every packet to `~/.hermes/logs/voice-packets/`; `"off"` disables dumps. |
| `SLACK_BOT_TOKEN` | Slack bot token (`xoxb-...`) |
| `SLACK_APP_TOKEN` | Slack app-level token (`xapp-...`, required for Socket Mode) |
| `SLACK_ALLOWED_USERS` | Comma-separated Slack user IDs |
12 changes: 8 additions & 4 deletions website/docs/user-guide/features/voice-mode.md
@@ -279,10 +279,10 @@ In the [Developer Portal](https://discord.com/developers/applications) → your
| Intent | Purpose |
|--------|---------|
| **Presence Intent** | Detect user online/offline status |
| **Server Members Intent** | Map voice SSRC identifiers to Discord user IDs |
| **Server Members Intent** | Resolve usernames in `DISCORD_ALLOWED_USERS` to numeric IDs (conditional) |
| **Message Content Intent** | Read text message content in channels |

All three are required for full voice channel functionality. **Server Members Intent** is especially critical — without it, the bot cannot identify who is speaking in the voice channel.
**Message Content Intent** is required. **Server Members Intent** is only needed if your `DISCORD_ALLOWED_USERS` list uses usernames — if you use numeric user IDs, you can leave it OFF. Voice-channel SSRC → user_id mapping comes from Discord's SPEAKING opcode on the voice websocket and does **not** require the Server Members Intent.

#### 3. Opus Codec

@@ -336,6 +336,8 @@ Use these in the Discord text channel where the bot is present:
/voice status Show voice mode and connected channel
```

Both `/voice join` and `/voice channel` appear in Discord's slash-command autocomplete and route to the same handler.
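
The aliasing described above can be sketched as a small lookup table. This is an illustration only: `VOICE_SUBCOMMANDS`, `resolve_voice_handler`, and the `_handle_voice_channel_leave` name are hypothetical and are not taken from `gateway/run.py`. Only `_handle_voice_channel_join` is named in this change.

```python
from typing import Optional

# Hypothetical sketch: map /voice sub-commands to handler names.
# "join" and "channel" deliberately point at the same handler.
VOICE_SUBCOMMANDS = {
    "join": "_handle_voice_channel_join",
    "channel": "_handle_voice_channel_join",  # alias of "join"
    "leave": "_handle_voice_channel_leave",   # assumed name, for illustration
}


def resolve_voice_handler(mode: str) -> Optional[str]:
    """Return the handler name for a sub-command, or None if unknown."""
    return VOICE_SUBCOMMANDS.get(mode.strip().lower())
```

Keeping the alias in one table means the slash UI and plain-text parsing cannot drift apart as long as both read from it.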

:::info
You must be in a voice channel before running `/voice join`. The bot joins the same VC you're in.
:::
@@ -358,9 +360,11 @@ When the bot is in a voice channel:
- Agent responses are sent as text in the channel AND spoken in the VC
- The text channel is the one where `/voice join` was issued

### Echo Prevention
### Echo Prevention and Barge-In

While the bot is playing a TTS reply, its listener does not fully pause — it switches to a higher energy threshold (**barge-in mode**) after a brief guard window. Quiet echo residual from the bot's own voice is ignored, but if a user speaks clearly the bot stops its playback and captures what was said. In practice this gives the conversational feel of being able to interrupt the bot mid-sentence without losing the rest of your turn.

The bot automatically pauses its audio listener while playing TTS replies, preventing it from hearing and re-processing its own output.
You can tune the guard window and the barge-in threshold under `voice.discord_vc` in `config.yaml` — `barge_in_guard` (seconds) and `barge_in_rms` (int16 RMS, higher = less sensitive).
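
As a rough illustration of how the two settings interact, the sketch below treats a frame as a barge-in only after the guard window has elapsed and only if its int16 RMS reaches the threshold. The function names, the default values (`0.5` seconds, `2000` RMS), and the `>=` comparison are assumptions made for the example, not the gateway's actual implementation.

```python
import math
from typing import Sequence


def int16_rms(samples: Sequence[int]) -> float:
    """Root-mean-square level of one frame of int16 PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_barge_in(
    samples: Sequence[int],
    playback_elapsed: float,
    barge_in_guard: float = 0.5,  # seconds; illustrative default
    barge_in_rms: int = 2000,     # int16 RMS; illustrative default
) -> bool:
    """Decide whether a frame heard during TTS playback should interrupt it."""
    # Inside the guard window, everything is treated as the bot's own echo.
    if playback_elapsed < barge_in_guard:
        return False
    # After the guard window, only sufficiently loud frames count as speech.
    return int16_rms(samples) >= barge_in_rms
```

Raising `barge_in_rms` makes the bot harder to interrupt; lengthening `barge_in_guard` ignores more of the start of each reply.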

### Access Control

8 changes: 5 additions & 3 deletions website/docs/user-guide/messaging/discord.md
@@ -116,13 +116,13 @@ On the **Bot** page, scroll down to **Privileged Gateway Intents**. You'll see t
| Intent | Purpose | Required? |
|--------|---------|-----------|
| **Presence Intent** | See user online/offline status | Optional |
| **Server Members Intent** | Access the member list, resolve usernames | **Required** |
| **Server Members Intent** | Resolve usernames to user IDs in `DISCORD_ALLOWED_USERS` | Conditional — only if your allowlist contains usernames |
| **Message Content Intent** | Read the text content of messages | **Required** |

**Enable both Server Members Intent and Message Content Intent** by toggling them **ON**.
**Enable Message Content Intent** by toggling it **ON**. Enable **Server Members Intent** as well if you plan to list people in `DISCORD_ALLOWED_USERS` by username instead of by numeric user ID.

- Without **Message Content Intent**, your bot receives message events but the message text is empty — the bot literally cannot see what you typed.
- Without **Server Members Intent**, the bot cannot resolve usernames for the allowed users list and may fail to identify who is messaging it.
- **Server Members Intent** is only needed when the allowlist contains usernames (e.g. `DISCORD_ALLOWED_USERS=alice,bob`). If you use numeric user IDs throughout (e.g. `DISCORD_ALLOWED_USERS=284102345871466496`), the gateway skips requesting this intent on startup, and you can leave it OFF. Enabling an intent you don't need is harmless; requesting an intent that isn't enabled in the Developer Portal will block the bot from coming online.
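
The rule in the bullets above can be sketched as a check on the raw allowlist string. This is a minimal sketch, assuming entries are comma-separated and that any entry that is not purely digits counts as a username; `allowlist_has_usernames` is a hypothetical helper, not the gateway's own function.

```python
def allowlist_has_usernames(raw: str) -> bool:
    """Return True if any DISCORD_ALLOWED_USERS entry is not a numeric ID."""
    # Split on commas and drop empty entries such as trailing commas.
    entries = [entry.strip() for entry in raw.split(",") if entry.strip()]
    # Assumption: a purely numeric entry is a user ID, anything else a username.
    return any(not entry.isdigit() for entry in entries)
```

With discord.py, the result could then be assigned to the `members` flag on a `discord.Intents` object, so the privileged intent is requested only when it is needed.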

:::warning[This is the #1 reason Discord bots don't work]
If your bot is online but never responds to messages, the **Message Content Intent** is almost certainly disabled. Go back to the [Developer Portal](https://discord.com/developers/applications), select your application → Bot → Privileged Gateway Intents, and make sure **Message Content Intent** is toggled ON. Click **Save Changes**.
@@ -283,6 +283,8 @@ Discord behavior is controlled through two files: **`~/.hermes/.env`** for crede
| `DISCORD_IGNORED_CHANNELS` | No | — | Comma-separated channel IDs where the bot **never** responds, even when `@mentioned`. Takes priority over all other channel settings. |
| `DISCORD_NO_THREAD_CHANNELS` | No | — | Comma-separated channel IDs where the bot responds directly in the channel instead of creating a thread. Only relevant when `DISCORD_AUTO_THREAD` is `true`. |
| `DISCORD_REPLY_TO_MODE` | No | `"first"` | Controls reply-reference behavior: `"off"` — never reply to the original message, `"first"` — reply-reference on the first message chunk only (default), `"all"` — reply-reference on every chunk. |
| `DISCORD_PROXY` | No | — | Proxy URL for all Discord traffic (gateway websocket, REST, and attachment downloads). Accepts `http://`, `https://`, and `socks5://` schemes (SOCKS requires `aiohttp_socks`). Overrides the generic `HTTPS_PROXY` / `HTTP_PROXY` / `ALL_PROXY` env vars, and on macOS overrides auto-detection from `scutil --proxy`. |
| `HERMES_DISCORD_VOICE_PACKET_DUMP` | No | `"errors"` | Debug knob for the Discord voice-channel packet handler. `"errors"` (default) writes a JSON dump only when the decode circuit breaker trips; `"all"` appends every packet to a per-SSRC JSONL trace file under `~/.hermes/logs/voice-packets/`; `"off"` disables all dumps. Unknown values fall back to `"errors"`. |
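
The fallback behavior described for `HERMES_DISCORD_VOICE_PACKET_DUMP` can be sketched as follows. The helper name `packet_dump_mode` is hypothetical, and the whitespace stripping and case-insensitive matching are assumptions; the table only states that unknown values fall back to `"errors"`.

```python
import os
from typing import Mapping, Optional

VALID_DUMP_MODES = {"errors", "all", "off"}


def packet_dump_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Resolve the packet-dump mode, falling back to "errors" when unset or unknown."""
    env = os.environ if env is None else env
    # Assumption: surrounding whitespace and letter case are not significant.
    value = env.get("HERMES_DISCORD_VOICE_PACKET_DUMP", "errors").strip().lower()
    return value if value in VALID_DUMP_MODES else "errors"
```

Passing the environment as a mapping keeps the function easy to exercise without mutating `os.environ`.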

### Config File (`config.yaml`)
