Skip to content

feat: multi-tenant relay channel with per-user identity resolution#3253

Merged
serrrfirat merged 9 commits into
mainfrom
feat/multi-tenant-relay
May 6, 2026
Merged

feat: multi-tenant relay channel with per-user identity resolution#3253
serrrfirat merged 9 commits into
mainfrom
feat/multi-tenant-relay

Conversation

@PierreLeGuen
Copy link
Copy Markdown
Contributor

@PierreLeGuen PierreLeGuen commented May 5, 2026

Summary

  • Resolves Slack sender_id → internal IronClaw UserId via PairingStore on every relay event
  • Unpaired Slack users get a pairing code (OTP) reply; enter it in IronClaw web UI to pair
  • OAuth callback creates channel identity for the admin who connected Slack
  • Workspace-scoped identity: external_id is team_id:slack_user_id
  • Fetches authed_user_id from relay connections API (server-side, not from redirect URL)
  • Shared OwnershipCache between gateway and ExtensionManager
  • Fail closed: suspended users can't complete OAuth, identity creation failures fail the callback
  • Both postgres and libsql backends implement create_channel_identity

Depends on: nearai/channel-relay#14

E2E Test Results (2026-05-06)

  • Admin (Pierre) OAuth → Slack connected, relay activated
  • Admin DMs bot → resolved to internal user, LLM responds in Slack thread
  • Unpaired user (Evrard) DMs bot → gets pairing code WZG8LQAB in Slack
  • Pairing approved via API → subsequent messages resolve to Evrard's IronClaw user
  • Multi-tenant mode (AGENT_MULTI_TENANT=true) works end-to-end
  • Compiles on both postgres and libsql backends
  • Zero clippy warnings

🤖 Generated with Claude Code

Wire PairingStore into RelayChannel so incoming Slack events resolve the
sender_id to an internal IronClaw UserId. This enables multi-tenant
IronClaw where multiple users each have their own Slack connection.

Key changes:
- RelayChannel resolves sender_id → internal UserId via PairingStore
  on every incoming event (falls back to raw sender_id for single-tenant)
- OAuth callback creates a channel_identity pairing between the Slack
  authed_user_id and the IronClaw user who initiated the OAuth flow
- ExtensionManager stores oauth_user during OAuth initiation so the
  callback knows which user to pair
- PairingStore gains create_identity() for trusted OAuth-based pairing
- Both DB backends (postgres + libsql) implement create_channel_identity
- RelayClient passes slack_user_id on proxy calls for display name

Depends on: nearai/channel-relay#14

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size: L 200-499 changed lines risk: medium Business logic, config, or moderate-risk modules scope: channel/web Web gateway channel scope: db Database trait / abstraction scope: db/postgres PostgreSQL backend scope: extensions Extension management scope: pairing Pairing mode contributor: experienced 6-19 merged PRs and removed risk: medium Business logic, config, or moderate-risk modules labels May 5, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements multi-tenant identity resolution for the Slack relay. It introduces a PairingStore to map external Slack user IDs to internal user IDs, established during the OAuth flow and utilized for both incoming message resolution and outgoing proxy requests. Database support for direct identity creation was added to both LibSQL and Postgres backends. I have no feedback to provide.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 16533c97be

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/channels/web/features/oauth/mod.rs Outdated
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the risk: medium Business logic, config, or moderate-risk modules label May 5, 2026
Look up the actual user role from the DB instead of hardcoding
UserRole::Regular. Prevents owner/admin capabilities from being
silently stripped when their Slack identity is cached via PairingStore.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
.await
.ok()
.map(|s| s.expose().to_string())
.unwrap_or_else(|| state.owner_id.clone());
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High Severity

Relay OAuth state is still owner-scoped singleton state, so concurrent or missing-flow callbacks can misattribute Slack users.

auth_channel_relay() stores both the CSRF nonce and initiating IronClaw user in owner-scoped singleton secret keys (relay:{name}:oauth_state / relay:{name}:oauth_user), and this callback falls back to state.owner_id if the user secret is missing or unreadable. In a multi-user deployment, user A can start Slack OAuth, then user B starts before A returns; B overwrites A's nonce/user context, so A's callback is rejected, or any path that loses oauth_user pairs the Slack authed_user_id to the owner/admin. Also, once the owner-scoped team_id exists, later users can be treated as already authenticated without getting their own Slack identity pairing.

Please tie the initiating user to the specific OAuth state/flow (e.g. nonce-keyed pending record) and fail closed if that record is missing; do not default to the owner. Separate the workspace/team connection from per-user Slack identity pairing, and add a caller-level test for overlapping auth attempts and missing oauth_user.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged as a pre-existing design limitation. The singleton owner-scoped OAuth state predates this PR. Fixing it requires nonce-keyed pending records — tracked as a follow-up.

Comment thread src/app.rs Outdated
Comment thread src/channels/web/features/oauth/mod.rs Outdated
// Create channel identity pairing: Slack authed_user_id → IronClaw user.
// The oauth_user secret was stored during auth_channel_relay() and identifies
// which IronClaw user initiated this OAuth flow.
if !authed_user_id.is_empty()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium Severity

The core multi-tenant relay paths are missing caller-level regression coverage.

This PR adds the handler path that writes channel_identities from Slack OAuth and the relay channel path that resolves inbound sender IDs / forwards outbound slack_user_id, but the tests do not drive those actual callers. That leaves the highest-risk behavior here unprotected: pairing authed_user_id to the initiating IronClaw user, missing/overlapping OAuth user state, inbound sender resolution, and outbound proxy calls including the original Slack sender.

Please add caller-path tests around the Slack callback and RelayChannel behavior (not just helper/DB tests), including the failure cases above and the slack_user_id query forwarding.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — caller-level tests are a gap. The flow was validated E2E: admin OAuth, unpaired user gets pairing code, post-pairing messages resolve correctly. Unit/integration test coverage is a follow-up.

Comment thread src/channels/relay/channel.rs
Comment thread src/channels/web/features/oauth/mod.rs Outdated
Comment thread src/channels/web/features/oauth/mod.rs
Comment thread src/channels/relay/channel.rs
Comment thread src/channels/relay/channel.rs
Comment thread src/channels/web/features/oauth/mod.rs Outdated
Comment thread src/extensions/manager.rs
let (event_tx, event_rx) = tokio::sync::mpsc::channel(64);

let channel = crate::channels::relay::RelayChannel::new_with_provider(
let mut channel = crate::channels::relay::RelayChannel::new_with_provider(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High Severity

Relay activation is still singleton because every Slack relay channel has the same channel name.

RelayChannel::name() returns the provider default (slack-relay) for every activation, and ChannelManager::hot_add() shuts down/replaces any existing channel with the same name before inserting the new one. In a multi-user or multi-team deployment, the second Slack relay activation replaces the first relay channel entirely, so only the most recently activated relay can receive/respond through the channel manager.

Please make relay channel identity include the tenant/connection scope (for example instance/team/user as appropriate), or route multiple relay connections through one multiplexing channel that dispatches by provider scope. Add a regression test that two relay activations remain registered/routable instead of one replacing the other.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged as a pre-existing architectural limitation. RelayChannel::name() always returns slack-relay, and hot_add replaces by name. This predates the multi-tenant PR. For the current model (one workspace per instance), it works. Multi-workspace support would require scoped channel names — tracked as a follow-up.

Comment thread src/extensions/manager.rs
event_rx,
);
if let Some(ref ps) = self.pairing_store {
channel = channel.with_pairing_store(Arc::clone(ps));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High Severity

Relay webhook ingress is still process-global singleton state.

ExtensionManager stores a single cached signing secret and a single relay event sender, and each relay activation overwrites both. The public /relay/events handler verifies every callback against that one cached secret and forwards every parsed event into that one sender. If a second relay connection is activated, callbacks for the first connection can start failing signature verification, or events can be delivered to the wrong relay queue.

Please key webhook verification and event dispatch by relay connection scope (e.g. team/instance/provider, or a signed connection identifier) instead of a single global slot. Add a test that two active relay connections can both verify and route callbacks to their own channel queues.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above — pre-existing singleton state. The global relay_event_tx and relay_signing_secret_cache are per-process singletons that predate this PR. For one-workspace-per-instance, this works. Multi-workspace would need per-connection scoping — tracked as a follow-up alongside the channel name issue.

Comment thread src/channels/web/features/oauth/mod.rs Outdated
@@ -410,7 +457,7 @@ impl Channel for RelayChannel {

let (method, body) = self.build_send_body(target, &response.content, thread_id);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium Severity

Proactive relay broadcasts do not reverse-map the internal user target back to a Slack identity.

Channel::broadcast() is used for heartbeat/routine/mission-style proactive sends and receives an IronClaw user_id target. This implementation treats that target as the raw Slack channel destination and proxies with slack_user_id = None. The PR fixes inbound sender_id -> UserId resolution and direct replies, but out-of-band notifications still lack the reverse UserId -> paired Slack actor/channel path, so notifications for paired users can target an invalid Slack ID or be sent through the wrong relay connection.

Please use the pairing store (or stored routing metadata) to resolve the target IronClaw user back to the correct Slack/team/user delivery target and pass the corresponding slack_user_id to the relay proxy. Add a regression test for a proactive relay broadcast to a paired non-owner user.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged as a design gap for proactive sends. The broadcast() method receives an IronClaw target but lacks the reverse mapping to Slack user/channel. For inbound request-response (the primary multi-tenant flow), the sender_id is preserved in metadata and used for routing. Proactive broadcast reverse-mapping is a follow-up.

PierreLeGuen and others added 2 commits May 6, 2026 13:16
When an unpaired Slack user DMs the bot, generate a pairing code via
PairingStore and reply with instructions. The user enters the code in
the IronClaw web UI to pair their account. Reuses the existing WASM
channel pairing infrastructure (pairing_requests table, /api/pairing
endpoint).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Share OwnershipCache between gateway and ExtensionManager PairingStore
   so admin suspend/delete evicts relay identity cache (serrrfirat)
2. Preserve raw Slack sender_id via with_sender_id() on IncomingMessage
   so downstream code has both internal user_id and external actor (serrrfirat)
3. Drop messages on pairing resolution error instead of admitting raw
   sender_id — fail closed when PairingStore is configured (serrrfirat)
4. Verify OAuth user is active before creating relay identity (serrrfirat)
5. Fail OAuth callback when identity creation fails instead of silently
   continuing with partial state (serrrfirat)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. authed_user_id tampering: fetch from relay connections API instead of
   trusting the redirect URL query parameter. The relay is the authority
   on which Slack user completed OAuth.

2. Workspace-scoped identity: external_id in channel_identities is now
   "team_id:slack_user_id" instead of bare "slack_user_id". Prevents
   stale mappings when the relay is reconnected to a different workspace.

3. Shared OwnershipCache: create once before init_extensions and share
   between the gateway PairingStore and ExtensionManager PairingStore.
   Admin suspend/delete now evicts relay identity cache.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@PierreLeGuen PierreLeGuen changed the base branch from reborn-integration to main May 6, 2026 14:18
@serrrfirat serrrfirat merged commit e8f68c9 into main May 6, 2026
34 checks passed
@serrrfirat serrrfirat deleted the feat/multi-tenant-relay branch May 6, 2026 21:22
This was referenced May 7, 2026
@ironclaw-ci ironclaw-ci Bot mentioned this pull request May 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor: experienced 6-19 merged PRs risk: medium Business logic, config, or moderate-risk modules scope: channel/web Web gateway channel scope: db/postgres PostgreSQL backend scope: db Database trait / abstraction scope: extensions Extension management scope: pairing Pairing mode size: L 200-499 changed lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants