Skip to content

feat: DB-backed user management with registration, invitation, and API token CRUD #1605

@ilblackdragon

Description

@ilblackdragon

Problem / Motivation

User management is currently static: users are defined via the GATEWAY_USER_TOKENS env var (a JSON map of secret tokens → UserIdentity). This means:

  • No runtime user creation — adding/removing a user requires an env var change and a process restart
  • No user profilesUserIdentity is just { user_id: String, workspace_read_scopes: Vec<String> }
  • No invitation flow — sharing access means manually distributing tokens
  • No token rotation — compromised tokens require env var update + restart
  • No audit trail — no record of when users were created, last login, token usage
  • Token storage is process-scopedMultiAuthState lives in memory, not in DB

This is the foundation that Issues #2 (Workspaces), #3 (RBAC), and #4 (Admin Panel) depend on.

Proposed Solution

New Database Tables

Both PostgreSQL and libSQL backends must be updated (see dual-backend requirement in src/db/CLAUDE.md).

CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email TEXT UNIQUE,                          -- nullable for token-only users
    display_name TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'active',       -- active | suspended | deactivated
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    last_login_at TIMESTAMPTZ,
    created_by UUID REFERENCES users(id),       -- who invited this user (nullable for bootstrap)
    metadata JSONB NOT NULL DEFAULT '{}'        -- extensible profile data
);

CREATE TABLE api_tokens (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    token_hash BYTEA NOT NULL,                  -- SHA-256 hash (never store plaintext)
    name TEXT NOT NULL,                          -- human label ("my-laptop", "ci-bot")
    scopes TEXT[] NOT NULL DEFAULT '{}',         -- future: fine-grained scopes
    expires_at TIMESTAMPTZ,                     -- nullable = never expires
    last_used_at TIMESTAMPTZ,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    revoked_at TIMESTAMPTZ                      -- soft-revoke: set this instead of deleting
);
CREATE INDEX idx_api_tokens_user ON api_tokens(user_id);
CREATE INDEX idx_api_tokens_hash ON api_tokens(token_hash);

CREATE TABLE invitations (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email TEXT,                                  -- nullable for link-based invites
    invite_token_hash BYTEA NOT NULL,           -- SHA-256 hash of the invite link token
    invited_by UUID NOT NULL REFERENCES users(id),
    status TEXT NOT NULL DEFAULT 'pending',      -- pending | accepted | expired | revoked
    expires_at TIMESTAMPTZ NOT NULL,
    accepted_at TIMESTAMPTZ,
    accepted_by UUID REFERENCES users(id),
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

New Database Traits

Add a UserStore sub-trait to the Database supertrait in src/db/mod.rs:

#[async_trait]
pub trait UserStore {
    // Users
    async fn create_user(&self, email: Option<&str>, display_name: &str, created_by: Option<Uuid>) -> Result<UserRecord, DatabaseError>;
    async fn get_user(&self, id: Uuid) -> Result<Option<UserRecord>, DatabaseError>;
    async fn get_user_by_email(&self, email: &str) -> Result<Option<UserRecord>, DatabaseError>;
    async fn list_users(&self, status: Option<&str>) -> Result<Vec<UserRecord>, DatabaseError>;
    async fn update_user_status(&self, id: Uuid, status: &str) -> Result<(), DatabaseError>;
    async fn update_user_profile(&self, id: Uuid, display_name: &str, metadata: &serde_json::Value) -> Result<(), DatabaseError>;
    async fn record_login(&self, id: Uuid) -> Result<(), DatabaseError>;

    // API Tokens
    async fn create_api_token(&self, user_id: Uuid, name: &str, token_hash: &[u8; 32], expires_at: Option<DateTime<Utc>>) -> Result<ApiTokenRecord, DatabaseError>;
    async fn list_api_tokens(&self, user_id: Uuid) -> Result<Vec<ApiTokenRecord>, DatabaseError>;
    async fn revoke_api_token(&self, token_id: Uuid, user_id: Uuid) -> Result<(), DatabaseError>;
    async fn authenticate_token(&self, token_hash: &[u8; 32]) -> Result<Option<(ApiTokenRecord, UserRecord)>, DatabaseError>;
    async fn record_token_usage(&self, token_id: Uuid) -> Result<(), DatabaseError>;

    // Invitations
    async fn create_invitation(&self, email: Option<&str>, invite_token_hash: &[u8; 32], invited_by: Uuid, expires_at: DateTime<Utc>) -> Result<InvitationRecord, DatabaseError>;
    async fn accept_invitation(&self, invite_token_hash: &[u8; 32], accepted_by: Uuid) -> Result<(), DatabaseError>;
    async fn list_invitations(&self, invited_by: Option<Uuid>) -> Result<Vec<InvitationRecord>, DatabaseError>;
}

Auth Refactor

Replace the in-memory MultiAuthState in src/channels/web/auth.rs with a DB-backed authenticator:

  • auth_middleware must call store.authenticate_token(hash) instead of iterating hashed_tokens: Vec<([u8; 32], UserIdentity)>
  • Keep constant-time hashing (hash_token() with SHA-256) — the DB lookup is by hash, not by plaintext
  • Cache recent authentications in an LRU to avoid hitting DB on every request (TTL ~60s)
  • UserIdentity gains a db_user_id: Uuid field alongside the existing string user_id
  • Backward compatibility: if GATEWAY_USER_TOKENS is set, continue using the env-var path. If a users table exists with rows, prefer DB. Log a deprecation warning when env-var tokens are used.

New API Endpoints

All under /api/admin/users and /api/tokens (protected by auth):

POST   /api/admin/users                 — create user (admin only, for now any authenticated user)
GET    /api/admin/users                 — list users
GET    /api/admin/users/{id}            — get user details
PATCH  /api/admin/users/{id}            — update profile / status
POST   /api/admin/users/{id}/suspend    — suspend user
POST   /api/admin/users/{id}/activate   — reactivate user

POST   /api/tokens                      — create API token (returns plaintext ONCE)
GET    /api/tokens                      — list my tokens (no plaintext, just metadata)
DELETE /api/tokens/{id}                 — revoke token

POST   /api/invitations                 — create invitation
GET    /api/invitations                 — list my sent invitations
POST   /api/invitations/accept          — accept invitation (creates user + first token)

Migration Path

Add a startup migration that:

  1. Creates the new tables
  2. If GATEWAY_USER_TOKENS is set, inserts corresponding users and api_tokens rows
  3. Logs which users were migrated

Architecture Notes

  • Follow the handler pattern in src/channels/web/handlers/ — one file per domain (users.rs, tokens.rs, invitations.rs)
  • Follow the AuthenticatedUser extractor pattern already used by jobs/routines/settings handlers
  • Token generation: use rand::thread_rng().gen::<[u8; 32]>() → hex-encode for the user-facing token, SHA-256 hash for storage. Same pattern as hash_token() in auth.rs
  • All new handlers go through auth_middleware — user creation is initially open to any authenticated user (RBAC in Issue Onboarding: show Telegram in channel selection and auto-install bundled channel #3 will restrict it)
  • The UserStore trait must be implemented for both PostgresBackend and LibSqlBackend

Code Pointers

  • src/channels/web/auth.rsMultiAuthState, UserIdentity, hash_token(), auth_middleware — this is the code being replaced/augmented
  • src/channels/web/server.rsGatewayState (add store reference, already present), WorkspacePool
  • src/channels/web/handlers/jobs.rs — pattern to follow for new handler files
  • src/db/mod.rsDatabase supertrait where UserStore must be added
  • src/db/postgres.rs — PostgreSQL implementation to extend
  • src/db/libsql/mod.rs — libSQL implementation to extend
  • src/db/libsql_migrations.rs — consolidated libSQL schema (add tables here)
  • migrations/ — PostgreSQL migration files (add V14__users_tokens.sql)
  • src/config/channels.rsGatewayConfig, UserTokenConfig — env-var parsing to deprecate
  • src/main.rs lines ~595-632 — multi-tenant startup wiring

Acceptance Criteria

  • users, api_tokens, and invitations tables created in both PostgreSQL and libSQL backends
  • UserStore trait added to Database supertrait with implementations for both backends
  • API endpoints for user CRUD, token management, and invitations all functional
  • Creating a token returns the plaintext exactly once; subsequent reads show only metadata
  • Token authentication falls through to DB when GATEWAY_USER_TOKENS is unset
  • Existing GATEWAY_USER_TOKENS path continues to work with deprecation warning logged
  • First startup with GATEWAY_USER_TOKENS + DB migrates env-var users into DB tables
  • Revoking a token immediately prevents authentication (cache eviction or bypass)
  • Suspending a user immediately prevents all their tokens from authenticating
  • Expired invitations cannot be accepted
  • All new handler files follow the AuthenticatedUser extractor pattern
  • Unit tests for UserStore methods (both backends)
  • Integration tests for token create → authenticate → revoke → reject cycle
  • Integration tests for invitation create → accept → user created cycle
  • cargo clippy zero warnings, cargo fmt clean

Pitfalls & Landmines

  • Dual-backend requirement is non-negotiable. Every SQL statement must work in both PostgreSQL and libSQL. libSQL doesn't support gen_random_uuid() — use application-generated UUIDs. libSQL doesn't support BYTEA — use BLOB. libSQL doesn't support TEXT[] — use JSON text. See src/db/libsql_migrations.rs for the translation patterns.
  • Token plaintext must never be logged. The hash_token() function in auth.rs is the only place tokens should be hashed. Don't add Debug derives on structs containing plaintext tokens.
  • Auth cache invalidation is subtle. If you add an LRU cache for DB-backed auth, revoking a token or suspending a user must either evict from cache or the cache TTL must be short enough to be acceptable. Document the chosen tradeoff.
  • The user_id string in existing tables (jobs, memory, settings, routines) currently holds arbitrary strings like "alice". The migration must handle mapping these to the new UUID-based users.id. Consider keeping the string user_id column as-is and adding a nullable user_uuid FK that gets populated during migration. Don't break existing queries.
  • IncomingMessage in src/channels/channel.rs carries a user_id: String. This flows through the entire agent pipeline. Changing it to UUID is a massive refactor — instead, keep user_id as the string identifier and use it as the users.id display/lookup key, or add a parallel field.
  • Don't forget to update GatewayState builder methods in src/channels/web/mod.rs — the with_* pattern must include any new state the auth layer needs.

Non-Goals


Part 1 of 4 — see also: Workspaces (#2), RBAC (#3), Admin Panel (#4)

Metadata

Metadata

Assignees

No one assigned

    Labels

    P1High priority — blocks core workflowsenhancementNew feature or requestrisk: highSafety, secrets, auth, or critical infrastructurescope: channel/webWeb gateway channelscope: dbDatabase trait / abstraction

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions