feat: Add Claude Code CLI as LLM provider option #1098

matthew-petty · 2025-12-16T22:42:27Z

Warning

Heavy PR: This PR adds ~12,000 lines including a new service (apps/claude-code-wrapper/), comprehensive tests, and lockfile updates. The wrapper service alone accounts for ~3,500 lines. We recommend reviewing in sections: wrapper service → provider integration → tests.

Caution

Anthropic Terms of Use: This integration pattern uses Claude Code CLI programmatically. Anthropic's terms restrict such usage to self-hosting under limited circumstances only. Please review Anthropic's Terms of Use before deploying. This is NOT suitable for commercial SaaS offerings.

Summary

Adds Claude Code CLI as a new LLM provider option, enabling users with Claude Max subscriptions to route AI operations through Claude Code instead of direct API calls. This provides an alternative for users who want to leverage their existing Max subscription for Inbox Zero's AI features.

Architecture

┌─────────────────────────────────────────────────────────────┐
│  Inbox Zero Web App                                         │
│  ┌───────────────────────────────────────────────────────┐ │
│  │ createGenerateText() / createGenerateObject()         │ │
│  │ - Override logic: if DEFAULT_LLM_PROVIDER=claudecode  │ │
│  │ - Routes to Claude Code instead of AI SDK providers   │ │
│  └───────────────────────────────────────────────────────┘ │
│                          │                                  │
│                          ▼                                  │
│  ┌───────────────────────────────────────────────────────┐ │
│  │ claude-code-llm.ts (HTTP Client)                      │ │
│  │ - Transforms AI SDK calls to wrapper API format       │ │
│  │ - Handles response parsing, usage tracking            │ │
│  └───────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
                           │
                           ▼ HTTP
┌─────────────────────────────────────────────────────────────┐
│  Claude Code Wrapper Service (apps/claude-code-wrapper/)    │
│  - Bun/Hono HTTP server                                     │
│  - Spawns Claude CLI subprocess per request                 │
│  - Endpoints: /generate, /stream, /health                   │
│  - Auth via CLAUDE_CODE_WRAPPER_API_KEY                     │
└─────────────────────────────────────────────────────────────┘
                           │
                           ▼ subprocess
┌─────────────────────────────────────────────────────────────┐
│  Claude Code CLI                                            │
│  - Uses Max subscription or ANTHROPIC_API_KEY               │
│  - Outputs via --output-format stream-json                  │
└─────────────────────────────────────────────────────────────┘

What Works

Feature	Status	Notes
Rule creation via AI	✅	`generateObject`
Email categorization	✅	`generateObject`
Prompt-to-rules conversion	✅	`generateObject`
Email summarization	✅	`generateText`
All other `generateText`/`generateObject`	✅	Full coverage

What Doesn't Work (Yet)

Feature	Status	Notes
Chat assistant	❌	Requires streaming with tool calls

The chat assistant uses streamText() with tools, which requires a different integration pattern. The infrastructure is scaffolded (tool proxy endpoint, Claude skill file), but full chat support would require either:

- A major refactor of the chat components to work with Claude Code's tool system
- A parallel implementation of Claude Code-compatible chat

Workaround: Set CHAT_LLM_PROVIDER=anthropic to use direct API for chat while using Claude Code for everything else.

Extensibility

Claude Skills

The wrapper includes a skill system (apps/claude-code-wrapper/.claude/skills/) that allows Claude Code to invoke Inbox Zero tools. This enables future capabilities like:

- Claude autonomously managing email rules
- Complex multi-step email workflows
- Integration with other Claude Code capabilities

Future CLI Integration

The provider-layer pattern established here can be extended to support other CLI-based AI tools, not just Claude Code.

New Environment Variables

# Required for Claude Code provider
DEFAULT_LLM_PROVIDER=claudecode
CLAUDE_CODE_BASE_URL=http://localhost:3100  # or http://claude-code-wrapper:3100 in Docker
CLAUDE_CODE_WRAPPER_API_KEY=<generate with: openssl rand -hex 32>

# Optional
CLAUDE_CODE_TIMEOUT=120000                   # Request timeout (ms)
CLAUDE_CODE_MODEL=sonnet                     # Default model (sonnet, haiku, opus)
CLAUDE_CODE_ECONOMY_MODEL=haiku              # Model for economy/bulk operations

# For Claude CLI authentication (one required)
CLAUDE_CODE_OAUTH_TOKEN=<from Max subscription>  # Preferred
# OR
ANTHROPIC_API_KEY=<api key>                      # Alternative

# For tool proxy (enables Claude skills)
LLM_TOOL_PROXY_TOKEN=<generate with: openssl rand -hex 32>
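
To illustrate how these variables relate, here is a minimal sketch of a startup check. It assumes the base URL and wrapper key are only required when the provider is `claudecode`, and it reuses the example timeout above as a default. `validateClaudeCodeEnv` and `ClaudeCodeSettings` are hypothetical names; the real validation lives in apps/web/env.ts and may differ.

```typescript
// Hypothetical helper for illustration; not the code in apps/web/env.ts.
type Env = Record<string, string | undefined>;

interface ClaudeCodeSettings {
  baseUrl: string;
  apiKey: string;
  timeoutMs: number;
  model: string;
  economyModel: string;
}

function validateClaudeCodeEnv(env: Env): ClaudeCodeSettings | null {
  // Nothing to validate unless the Claude Code provider is selected.
  if (env.DEFAULT_LLM_PROVIDER !== "claudecode") return null;

  const missing = ["CLAUDE_CODE_BASE_URL", "CLAUDE_CODE_WRAPPER_API_KEY"].filter(
    (name) => !env[name],
  );
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }

  // 120000 is the example value shown above, used here as an assumed default.
  const timeoutMs = Number(env.CLAUDE_CODE_TIMEOUT ?? "120000");
  if (!Number.isFinite(timeoutMs) || timeoutMs <= 0) {
    throw new Error("CLAUDE_CODE_TIMEOUT must be a positive number of milliseconds");
  }

  return {
    baseUrl: env.CLAUDE_CODE_BASE_URL as string,
    apiKey: env.CLAUDE_CODE_WRAPPER_API_KEY as string,
    timeoutMs,
    model: env.CLAUDE_CODE_MODEL ?? "sonnet",
    economyModel: env.CLAUDE_CODE_ECONOMY_MODEL ?? "haiku",
  };
}
```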

Provider Selection Logic

User Request
    │
    ▼
getModel(userAi, modelType)
    │
    ├─► User has custom API key? → Use their configured provider
    │
    └─► No custom key
            │
            ├─► modelType="economy" & ECONOMY_LLM_PROVIDER set? → Use economy provider
            ├─► modelType="chat" & CHAT_LLM_PROVIDER set? → Use chat provider  
            └─► DEFAULT_LLM_PROVIDER → Use default
                    │
                    ▼
            createGenerateText/Object()
                    │
                    ├─► Provider is CLAUDE_CODE? → Route to wrapper
                    │
                    └─► DEFAULT_LLM_PROVIDER=claudecode? → Override to wrapper
                            (even if modelType selected different provider)
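
The flow above can be sketched as two small functions. The type names, field names, and function names here are hypothetical; the real logic is split across model.ts and index.ts. The sketch also assumes the override applies however the provider was selected, which the diagram does not state for users with a custom API key.

```typescript
// Hypothetical types and names, for illustration only.
type ModelType = "default" | "economy" | "chat";

interface UserAi {
  aiProvider?: string;
  aiApiKey?: string;
}

interface ProviderEnv {
  DEFAULT_LLM_PROVIDER: string;
  ECONOMY_LLM_PROVIDER?: string;
  CHAT_LLM_PROVIDER?: string;
}

// Step 1 (getModel in the diagram): pick a provider from user settings or env.
function selectProvider(userAi: UserAi, modelType: ModelType, env: ProviderEnv): string {
  if (userAi.aiApiKey && userAi.aiProvider) return userAi.aiProvider;
  if (modelType === "economy" && env.ECONOMY_LLM_PROVIDER) return env.ECONOMY_LLM_PROVIDER;
  if (modelType === "chat" && env.CHAT_LLM_PROVIDER) return env.CHAT_LLM_PROVIDER;
  return env.DEFAULT_LLM_PROVIDER;
}

// Step 2 (createGenerateText/Object in the diagram): decide whether the call
// goes to the wrapper, either directly or through the override.
function routesToWrapper(selected: string, env: ProviderEnv): boolean {
  return selected === "claudecode" || env.DEFAULT_LLM_PROVIDER === "claudecode";
}
```

With DEFAULT_LLM_PROVIDER=claudecode and ECONOMY_LLM_PROVIDER=anthropic, an economy call first selects anthropic and is then overridden to the wrapper, which matches the last branch of the diagram.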

Docker Usage

# Start with Claude Code wrapper
docker compose --profile claude-code up -d

# The wrapper runs on port 3100 by default
# Configure CLAUDE_CODE_BASE_URL=http://claude-code-wrapper:3100 in apps/web/.env

Open Source Intent

We intend to open source the wrapper service as a standalone project. It was developed specifically to integrate Claude Code with Inbox Zero, so in this PR it remains coupled with the web app. Future work may extract it to a separate repository.

Files Changed

New Service

apps/claude-code-wrapper/ - Complete Bun/Hono service (Dockerfile, routes, CLI handling, tests)

Provider Integration

apps/web/utils/llms/claude-code.ts - HTTP client
apps/web/utils/llms/claude-code-llm.ts - AI SDK-compatible wrappers
apps/web/utils/llms/model.ts - Provider configuration
apps/web/utils/llms/index.ts - Override logic

Tool Proxy (for Claude skills)

apps/web/app/api/llm-tools/invoke/ - REST endpoint for tool invocation

Configuration

apps/web/env.ts - New environment variables
docker-compose.yml - Wrapper service definition
turbo.json - Build configuration

Test Plan

- Unit tests for wrapper CLI handling
- Unit tests for HTTP client
- Unit tests for provider override logic
- Integration tests for wrapper service
- E2E test for real CLI invocation (requires Claude auth)
- Manual testing of rule creation flow
- Manual testing with Docker Compose

Summary by CodeRabbit

New Features
- Claude Code added as a new LLM provider with session continuity, real-time streaming, and an HTTP wrapper plus a proxy for LLM tools (rules, patterns, about, knowledge base).
- Inbox Zero tools skill documented and exposed via a standardized invoke API.
Documentation
- Comprehensive README and usage guides covering deployment, env vars, API endpoints, and examples.
Tests
- Extensive unit, integration, streaming, and end-to-end test coverage.
Chores
- Dockerfile, docker-compose service, and project config added for the wrapper.


Introduces an HTTP wrapper service that bridges Claude Code CLI for container-to-container communication. This enables using Claude Code as an LLM provider option in the Inbox Zero app.
New package: apps/claude-code-wrapper/
- Express server with health, generate, and stream endpoints
- CLI subprocess execution with JSON output parsing
- Session tracking support for multi-turn conversations
- Server-Sent Events for streaming responses
- Docker integration with health checks
Related: elie222#8

- Fix CLI flag: use --resume instead of --continue for session IDs
- Remove unsupported --max-tokens CLI flag (not in Claude CLI)
- Add timing-safe API key comparison to prevent timing attacks
- Fix Dockerfile: use npm (package-lock.json) consistently
- Fix health check race condition with safe resolve pattern
- Add safe JSON stringify to handle circular references in logger
Addresses feedback from code review on Phase 1.
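
As an illustration of the safe JSON stringify item, here is a minimal sketch of one common approach: a replacer that tracks visited objects in a WeakSet. It is not necessarily how the wrapper's logger does it.

```typescript
// Sketch only; the wrapper's own safeStringify in logger.ts may differ.
function safeStringify(value: unknown): string {
  const seen = new WeakSet<object>();
  return JSON.stringify(value, (_key, val: unknown) => {
    if (typeof val === "object" && val !== null) {
      // Note: this also replaces repeated (non-circular) references to the
      // same object, which is usually acceptable for log output.
      if (seen.has(val)) return "[Circular]";
      seen.add(val);
    }
    return val;
  });
}
```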

- Add env_file to load ANTHROPIC_API_KEY from .env
- Document two auth approaches in docker-compose comments:
  1. API key via ANTHROPIC_API_KEY env var
  2. Max subscription via bind-mounted ~/.claude directory

Adds buildClaudeEnv() helper that removes ANTHROPIC_API_KEY when CLAUDE_CODE_OAUTH_TOKEN is present, ensuring Max subscription auth takes precedence over pay-per-token API auth.
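
The behaviour described here can be sketched in a few lines. The `Env` type and the function signature are assumptions; only the precedence rule comes from the commit message.

```typescript
// Sketch of the described behaviour; parameter and return types are assumed.
type Env = Record<string, string | undefined>;

function buildClaudeEnv(parentEnv: Env): Env {
  // Copy first so the parent process environment is never mutated.
  const env: Env = { ...parentEnv };
  // With an OAuth token present, drop the API key so the CLI cannot fall
  // back to pay-per-token auth.
  if (env.CLAUDE_CODE_OAUTH_TOKEN) {
    delete env.ANTHROPIC_API_KEY;
  }
  return env;
}
```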

Clearer naming to distinguish wrapper service auth from Claude auth.

Add Claude Code as a new LLM provider option in the web app's LLM abstraction layer. This enables routing AI inference through the claude-code-wrapper service when configured.
Changes:
- Add CLAUDE_CODE to Provider enum and providerOptions in config.ts
- Add claudecode to llmProviderEnum validation in env.ts
- Add CLAUDE_CODE_BASE_URL and CLAUDE_CODE_TIMEOUT env vars
- Add ClaudeCodeConfig interface and extend SelectModel type
- Add Provider.CLAUDE_CODE case in selectModel() switch
- Add CLAUDE_CODE to getProviderApiKey() mapping
- Register new env vars in turbo.json
Note: This commit adds the provider constants and model selection logic. The HTTP client adapter and core function integration will follow in subsequent commits.

Add CLAUDE_CODE_BASE_URL and CLAUDE_CODE_TIMEOUT to the LLM configuration section with documentation about the wrapper service and Max subscription support.

Create HTTP client adapter and LLM wrapper functions for Claude Code provider, enabling the web app to route AI inference through the claude-code-wrapper service.
Architecture designed to minimize upstream merge conflicts:
- claude-code.ts: HTTP client for wrapper service API
- claude-code-llm.ts: LLM wrapper functions (separate from index.ts)
- index.ts: Minimal 10-line branch to delegate to new module
New files:
- apps/web/utils/llms/claude-code.ts - HTTP client with claudeCodeGenerateText() and claudeCodeGenerateObject()
- apps/web/utils/llms/claude-code-llm.ts - LLM wrappers that return Vercel AI SDK compatible result shapes
Changes to upstream files (minimal):
- index.ts: Add imports and delegation branches for Claude Code
- package.json: Add zod-to-json-schema dependency
This completes Phase 2 - Claude Code is now a fully functional LLM provider option. Set DEFAULT_LLM_PROVIDER=claudecode to use.

Explains why toolCalls, steps, reasoning, and other fields are empty/undefined in Claude Code provider results. These stubs ensure callers can safely access properties without null checks.

Add dedicated test file for Claude Code provider integration with tests for configuration, error handling, and timeout settings. Also add required env var mocks to existing model tests.

- cli.ts: Add 5-minute default timeout to prevent hung processes
- cli.ts: Log JSON parse errors instead of silently ignoring
- cli.ts: Throw error when no result found instead of fabricating zeros
- stream.ts: Log parse failures when falling back to raw text
- claude-code.ts: Handle non-JSON error responses gracefully
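
The two cli.ts parsing changes (log parse errors, throw when no result is found) might look roughly like the sketch below. The event shape, with `type`, `result`, and `session_id` fields on a final "result" line, is an assumption about the CLI's JSON-lines output, and `parseCliOutput` is a hypothetical name.

```typescript
// Assumed shape of the final "result" line; field names are illustrative.
interface CliResult {
  text: string;
  sessionId?: string;
}

function parseCliOutput(
  stdout: string,
  log: (msg: string) => void = console.error,
): CliResult {
  let result: CliResult | undefined;
  for (const line of stdout.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    let event: unknown;
    try {
      event = JSON.parse(trimmed);
    } catch (err) {
      // Log instead of silently skipping, so malformed output stays visible.
      log(`Failed to parse CLI output line: ${String(err)}`);
      continue;
    }
    if (typeof event === "object" && event !== null) {
      const e = event as { type?: unknown; result?: unknown; session_id?: unknown };
      if (e.type === "result" && typeof e.result === "string") {
        result = {
          text: e.result,
          sessionId: typeof e.session_id === "string" ? e.session_id : undefined,
        };
      }
    }
  }
  // Fail loudly rather than returning empty text with zeroed usage.
  if (!result) throw new Error("No result event found in CLI output");
  return result;
}
```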

- Wrapper service now requires API_KEY env var to start (security by default)
- HTTP client sends Authorization header with auth key
- Web app validates CLAUDE_CODE_WRAPPER_AUTH_KEY when provider is selected
- Updated env.ts, turbo.json, .env.example, docker-compose.yml
- Added test for missing auth key validation

Add comprehensive test coverage for the Claude Code HTTP client:
- ClaudeCodeError class construction and inheritance
- claudeCodeGenerateText success and error handling
- claudeCodeGenerateObject with Zod schema validation
- HTTP error codes (4xx, 5xx, 401 unauthorized)
- Timeout/abort behavior
- Network failures
- Empty response edge cases
24 tests covering all critical paths identified by test reviewer.

Enable conversation continuity across related AI operations by persisting Claude CLI session IDs in Redis.
Sessions are scoped by workflow group:
- report: All email-report-* tasks share context
- rules: Rule creation/management tasks share context
- clean: Inbox cleaning operations share context
- default: All other standalone tasks
Key features:
- 30-minute TTL with refresh on each use
- Graceful degradation (session failures don't break AI operations)
- Automatic session handling (no changes to 40+ AI call sites)
Closes: Phase 6 of elie222#8
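
To make the grouping and TTL behaviour concrete, here is a sketch that uses an in-memory Map as a stand-in for Redis. The group names and the 30-minute TTL come from the commit message; the label-matching rules, the key format, and the `SessionStore` class are assumptions.

```typescript
// In-memory stand-in for the Redis store, for illustration only.
type WorkflowGroup = "report" | "rules" | "clean" | "default";

const SESSION_TTL_MS = 30 * 60 * 1000;

// Assumed matching rules; the real label-to-group mapping may differ.
function getWorkflowGroup(label: string): WorkflowGroup {
  const l = label.toLowerCase();
  if (l.startsWith("email-report")) return "report";
  if (l.includes("rule")) return "rules";
  if (l.includes("clean")) return "clean";
  return "default";
}

class SessionStore {
  private entries = new Map<string, { sessionId: string; expiresAt: number }>();

  private key(userKey: string, group: WorkflowGroup): string {
    return `claude-session:${userKey}:${group}`; // hypothetical key format
  }

  get(userKey: string, label: string, now: number = Date.now()): string | null {
    const key = this.key(userKey, getWorkflowGroup(label));
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= now) {
      this.entries.delete(key);
      return null;
    }
    // Refresh the TTL on each use.
    entry.expiresAt = now + SESSION_TTL_MS;
    return entry.sessionId;
  }

  save(userKey: string, label: string, sessionId: string, now: number = Date.now()): void {
    this.entries.set(this.key(userKey, getWorkflowGroup(label)), {
      sessionId,
      expiresAt: now + SESSION_TTL_MS,
    });
  }
}
```

Because callers pass only a label, two tasks in the same group share a session without either call site knowing about the other.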

- Add "server-only" import to prevent client-side usage
- Extract session retrieval/persistence into helper functions
- Add sessionId to save failure log messages for debugging
- Add error propagation tests for Redis failures

Verify session IDs flow through the LLM layer correctly:
- Session retrieval before HTTP calls
- Session persistence after successful calls
- Graceful degradation when Redis fails
- Workflow group routing based on label

Enable different models for default vs economy tasks:
- Add CLAUDE_CODE_MODEL for default tasks (e.g., 'sonnet')
- Add CLAUDE_CODE_ECONOMY_MODEL for high-volume tasks (e.g., 'haiku')
Changes:
- Wrapper service accepts model parameter in requests
- CLI builder passes --model flag to Claude CLI
- HTTP client forwards model in request body
- selectEconomyModel routes to economy model for Claude Code
- Add 3 tests for model selection behavior

Change economy model selection to check ECONOMY_LLM_PROVIDER explicitly instead of inferring from DEFAULT_LLM_PROVIDER. This allows flexibility to use different providers for default vs economy tasks.

Add tests for when DEFAULT_LLM_PROVIDER is claudecode but ECONOMY_LLM_PROVIDER is a different provider (e.g., anthropic). This validates the flexibility to use Claude Code for complex tasks while using API-based providers for high-volume economy tasks.

Make CLAUDE_CODE_MODEL and CLAUDE_CODE_ECONOMY_MODEL optional with sensible defaults:
- Default tasks: "sonnet" (Claude Sonnet for complex reasoning)
- Economy tasks: "haiku" (Claude Haiku for high-volume/bulk)
This eliminates unnecessary configuration while preserving flexibility to override via env vars if needed.

Document the OAuth token env var as a proper entry instead of just a comment. This token is used by the wrapper for Max subscription authentication.

Remove the hardcoded ANTHROPIC_API_KEY: "" that was overriding the env_file value. Now the wrapper correctly supports:
- CLAUDE_CODE_OAUTH_TOKEN (Max subscription)
- ANTHROPIC_API_KEY (API auth)
- OAuth takes precedence if both are set

Clarify that mounting host's ~/.claude directory exposes credentials and should only be used for local development/testing. Production deployments should use env vars (CLAUDE_CODE_OAUTH_TOKEN or ANTHROPIC_API_KEY) instead.

Most users don't need the Claude Code wrapper service. Keep it in its own 'claude-code' profile so it must be explicitly requested.

Automatically selects faster models (haiku) for simple tasks like classification and extraction, while keeping sonnet for complex tasks. This optimization happens at the provider level without modifying upstream code.
- Add LABEL_MODEL_OVERRIDES config mapping labels to models
- Add getModelForLabel() to select appropriate model per task
- Add model logging to wrapper for debugging/verification
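
A label-to-model lookup of this kind could be as small as the sketch below. The two label strings are made up for illustration, and the signature of getModelForLabel() is assumed; the real override table is not shown in this PR description.

```typescript
// Label names are hypothetical; the real LABEL_MODEL_OVERRIDES may differ.
const LABEL_MODEL_OVERRIDES = new Map<string, string>([
  ["Categorize sender", "haiku"],
  ["Extract reply info", "haiku"],
]);

// Returns the override for a task label, or the default model when none is set.
function getModelForLabel(label: string, defaultModel: string = "sonnet"): string {
  return LABEL_MODEL_OVERRIDES.get(label) ?? defaultModel;
}
```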

Implement text streaming for Claude Code provider with minimal upstream impact:
- Add claudeCodeStreamText() HTTP client for SSE streaming from wrapper
- Add createClaudeCodeStreamText() adapter with AI SDK compatible interface
- Create chat-completion-stream.ts wrapper for provider routing
- Change session key from emailAccountId to userEmail for simplicity
Upstream changes limited to import path in 2 files:
- summarise/controller.ts
- compose-autocomplete/route.ts
Streaming supports toTextStreamResponse() for summarise and autocomplete. Tool-based features (chat assistant) not yet supported.

- Add comprehensive tests for claudeCodeStreamText() HTTP client function
  - SSE event parsing (text, session, result, error, done)
  - Stream consumption and text accumulation
  - Usage statistics tracking
  - Error handling (HTTP errors, network failures, null body)
  - Timeout signal verification
- Add streaming adapter tests for createClaudeCodeStreamText()
  - Session retrieval and persistence
  - AsyncIterable textStream interface
  - toTextStreamResponse() HTTP response generation
  - Message content extraction
  - onFinish callback invocation
  - Label-based model override selection
- Fix existing tests to use userEmail instead of emailAccountId
- Update config expectations to account for getModelForLabel() model injection
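
For readers unfamiliar with what SSE event parsing involves, here is a minimal sketch of a buffer-based parser. The event names come from the list above and the `event:`/`data:` framing is standard SSE; `parseSseChunk` is a hypothetical name, and payloads are left as raw strings because their exact shape is not documented here.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Splits a buffer into complete SSE events and returns the unparsed remainder,
// which the caller prepends to the next network chunk.
function parseSseChunk(buffer: string): { events: SseEvent[]; rest: string } {
  const events: SseEvent[] = [];
  // Events are separated by a blank line; the last piece may be incomplete.
  const blocks = buffer.split(/\r?\n\r?\n/);
  const rest = blocks.pop() ?? "";
  for (const block of blocks) {
    let event = "message"; // SSE default event type
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        event = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        // The spec strips a single leading space after the colon.
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length > 0) events.push({ event, data: dataLines.join("\n") });
  }
  return { events, rest };
}
```

Returning the remainder matters because a network chunk can end in the middle of an event.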

Critical fixes:
- Fix double stream consumption bug using tee() to split stream
- Propagate errors for critical SSE events (session, result, error, done)
- Wrap session persistence in try-catch to prevent text delivery failures
High priority fixes:
- Add usageReceived flag to properly reject usage promise in flush()
- Change session failure logging from warn to error for Sentry tracking
Medium priority fixes:
- Create TextStreamResult interface for explicit AI SDK type compatibility
- Capture parse error details in safeJsonParse for better debugging
- Separate onFinish callback errors from usage tracking errors
- Add logging to toTextStreamResponse error handler
- Warn when SSE error events lack error codes

The ModelMessage type from AI SDK defines system message content as string-only. TypeScript was inferring 'never' for the array branch after type narrowing. Simplified system message handling and added proper type assertions for user message content parts.

The Claude CLI requires --verbose when using --output-format stream-json with --print. Without it, the CLI fails immediately on argument validation, causing "CLI exited with code null" errors.
Also adds:
- Model parameter passthrough to streaming endpoint
- Test endpoint for manual streaming verification
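
Putting the flag-related commits together, an argument builder might look like the sketch below. It uses only flags named in the commit messages above; the `CliOptions` shape, the argument order, and the omission of the prompt (which could be passed positionally or via stdin) are assumptions.

```typescript
// Hypothetical options shape; only the flags themselves come from the commits above.
interface CliOptions {
  model?: string;
  sessionId?: string;
}

function buildCliArgs(options: CliOptions = {}): string[] {
  // --verbose is required alongside --print and --output-format stream-json.
  const args = ["--print", "--verbose", "--output-format", "stream-json"];
  if (options.model) args.push("--model", options.model);
  // --resume (not --continue) takes an explicit session ID.
  if (options.sessionId) args.push("--resume", options.sessionId);
  return args;
}
```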

- Replace direct string comparison with crypto.timingSafeEqual to prevent timing attacks on token validation
- Add Zod schemas for getLearnedPatterns, updateAbout, and addToKnowledgeBase inputs to replace unsafe 'as' casts
- Add try-catch error handling around partialUpdateRule, updateRuleActions, and saveLearnedPatterns calls to prevent silent failures
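
The timing-safe comparison can be sketched as follows. `crypto.timingSafeEqual` throws when its inputs differ in length, so this version hashes both values first; that is one common way to handle it and not necessarily what the route does. `tokensMatch` and `isAuthorized` are hypothetical names.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hashing gives both buffers the same length, which timingSafeEqual requires.
function tokensMatch(provided: string, expected: string): boolean {
  const a = createHash("sha256").update(provided).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}

// Checks an Authorization header of the form "Bearer <token>".
function isAuthorized(authHeader: string | undefined, expectedToken: string): boolean {
  if (!authHeader?.startsWith("Bearer ")) return false;
  const token = authHeader.slice("Bearer ".length);
  // An empty token is rejected outright.
  return token.length > 0 && tokensMatch(token, expectedToken);
}
```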

Add debug-level logging when final buffer fails to parse as JSON. This is expected during user cancellations but provides visibility for debugging.

- Test for empty Bearer token returning 401
- Test for getLearnedPatterns with rule not found
- Test input validation for getLearnedPatterns, updateAbout, and addToKnowledgeBase with missing required fields

- Move updateRuleConditions, updateRuleActions, and updateLearnedPatterns input schemas from inline definitions to validation.ts
- Fix createRule error response format (was using separate error + message fields)
- Add literal union for ErrorResponse.code in wrapper types (7 error codes)
- Remove unused imports from route.ts

- Add test for EXECUTION_ERROR when tool throws unexpected error
- Add test for database error handling in addToKnowledgeBase

- Extract ErrorCode type alias for shared use between ErrorResponse and ClaudeCliError
- Update ClaudeCliError.code to use ErrorCode type for type safety
- Replace logger.debug with logger.info (debug method doesn't exist)
- Update test to use valid error code (INTERNAL_ERROR instead of TEST_CODE)

- Document core vs app-specific code separation
- Explain skills system and extensibility
- Include complete API reference
- Add development and Docker deployment instructions
- Prepare for potential open source release

Bun 1.2+ uses lockfile v1 format. Update to latest stable (1.3.3) for security patches and performance improvements. This fixes the "Unknown lockfile version" warning during builds.

- Rename API_KEY to CLAUDE_CODE_WRAPPER_API_KEY for clarity
- Extract buildClaudeCodeConfig() and isClaudeCodeAvailable() helpers
- Add override logic: when DEFAULT_LLM_PROVIDER=claudecode, use Claude Code for all generateText/generateObject calls regardless of model type
- Add comprehensive test coverage (13 tests) to protect against upstream reverts of override logic
- Fix Docker container issues: remove read_only, fix volume permissions
- Add error logging for CLI failures

vercel · 2025-12-16T22:42:31Z

@matthew-petty is attempting to deploy a commit to the Inbox Zero OSS Program Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai · 2025-12-16T22:42:39Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds a new Claude Code wrapper service (Bun + Docker) and integrates Claude Code as a first-class LLM provider across the web app: HTTP wrapper (generate/stream/health), CLI orchestration, SSE streaming, Redis session management, an LLM tools proxy endpoint, tests, docs, and deployment config.

Changes

Cohort / File(s)	Summary
Claude Code Wrapper Service `apps/claude-code-wrapper/Dockerfile`, `apps/claude-code-wrapper/package.json`, `apps/claude-code-wrapper/tsconfig.json`, `apps/claude-code-wrapper/vitest.config.ts`	New Bun-based multi-stage Dockerfile, package metadata/scripts, TypeScript config, and Vitest setup for the wrapper service.
Wrapper Documentation & Ignore `apps/claude-code-wrapper/README.md`, `apps/claude-code-wrapper/.claude/skills/inbox-zero-tools/SKILL.md`, `apps/claude-code-wrapper/.gitignore`	README and skill docs for Inbox Zero tools plus .gitignore for wrapper artifacts.
Wrapper CLI & Logger `apps/claude-code-wrapper/src/cli.ts`, `apps/claude-code-wrapper/src/logger.ts`	CLI orchestration: env builder, arg construction, subprocess management, parsing, timeout/ClaudeCliError; simple structured logger.
Wrapper Server & Types `apps/claude-code-wrapper/src/index.ts`, `apps/claude-code-wrapper/src/types.ts`	Express app with API-key auth, request logging, error handling, and zod-based request/response/types.
Wrapper Routes `apps/claude-code-wrapper/src/routes/health.ts`, `apps/claude-code-wrapper/src/routes/generate.ts`, `apps/claude-code-wrapper/src/routes/stream.ts`	Health check (claude --version), /generate-text & /generate-object endpoints, and /stream SSE endpoint with streaming lifecycle, timeout and cleanup.
Wrapper Tests & Helpers `apps/claude-code-wrapper/__tests__/setup.ts`, `apps/claude-code-wrapper/__tests__/helpers.ts`, `apps/claude-code-wrapper/__tests__/cli.test.ts`, `apps/claude-code-wrapper/__tests__/e2e/real-cli.test.ts`, `apps/claude-code-wrapper/__tests__/integration/app.test.ts`, `apps/claude-code-wrapper/__tests__/routes/*`	Extensive unit/integration/e2e tests and helpers for subprocess and SSE simulation.
Web: Claude Code HTTP Client `apps/web/utils/llms/claude-code.ts`, `apps/web/utils/llms/claude-code-http.test.ts`, `apps/web/utils/llms/claude-code.test.ts`	HTTP client for wrapper: generate text/object, streaming SSE handling, ClaudeCodeError, schema validation, SSE parser, and tests.
Web: Claude Code LLM Adapter `apps/web/utils/llms/claude-code-llm.ts`, `apps/web/utils/llms/claude-code-llm.test.ts`	LLM adapter with session management, usage tracking, generate/stream wrappers, and Vercel-compatible response shapes with tests.
Web LLM Routing & Model `apps/web/utils/llms/config.ts`, `apps/web/utils/llms/model.ts`, `apps/web/utils/llms/index.ts`, `apps/web/utils/llms/chat-completion-stream.ts`	Adds Provider.CLAUDE_CODE, build/isAvailable helpers, getModel changes, and chat-completion wrapper to route to Claude Code when configured.
Web Session Store (Redis) `apps/web/utils/redis/claude-code-session.ts`, `apps/web/utils/redis/claude-code-session.test.ts`	Redis-backed session store keyed by workflow group with TTL, plus tests for keying and lifecycle.
Web LLM Tools Proxy `apps/web/app/api/llm-tools/invoke/route.ts`, `apps/web/app/api/llm-tools/invoke/validation.ts`, `apps/web/app/api/llm-tools/invoke/route.test.ts`	New POST /api/llm-tools/invoke route with timing-safe proxy auth, zod validation, email resolution, and handlers for eight tools with tests.
Web Imports & Env `apps/web/app/api/ai/compose-autocomplete/route.ts`, `apps/web/app/api/ai/summarise/controller.ts`, `apps/web/env.ts`, `apps/web/.env.example`, `apps/web/package.json`	Updated imports to chat-completion-stream, added claude-code env vars and provider to env schema and example, and added zod-to-json-schema dep.
Deployment & Build `docker-compose.yml`, `turbo.json`	Docker Compose service for claude-code-wrapper with volume and healthcheck; build env vars added to turbo.json.
Misc `.histfile`	Added command history file.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Web as Web App
    participant Redis as Redis
    participant Wrapper as Claude Code Wrapper
    participant CLI as Claude CLI

    Client->>Web: POST /api/llm-tools/invoke (Bearer token, tool request)
    Web->>Web: Validate token & request body
    Web->>Redis: (optional) get session by label
    Redis-->>Web: sessionId or null
    Web->>Wrapper: POST /generate-text or /stream (prompt, sessionId, model)
    Wrapper->>CLI: spawn claude with args + prompt
    CLI-->>Wrapper: stdout (JSON lines / stream)
    Wrapper->>Web: { text/object, usage, sessionId }
    Web->>Redis: save sessionId (if present)
    Web-->>Client: 200 OK with result

sequenceDiagram
    participant Client as HTTP Client
    participant Wrapper as Claude Code Wrapper
    participant CLI as Claude CLI

    Client->>Wrapper: POST /stream (prompt, optional sessionId)
    Wrapper->>Wrapper: validate & spawn claude (--output-format stream-json)
    Wrapper->>Client: 200 OK + SSE headers
    loop stream
        CLI-->>Wrapper: stdout (JSON line)
        Wrapper->>Wrapper: parse event (session/text/result)
        Wrapper->>Client: SSE event (session/text/result)
    end
    CLI-->>Wrapper: close (exit 0)
    Wrapper->>Client: SSE event done

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Focus review areas:
- apps/claude-code-wrapper/src/cli.ts — subprocess lifecycle, timeout/kill semantics, parsing and error shaping
- apps/claude-code-wrapper/src/routes/stream.ts and apps/web/utils/llms/claude-code.ts — SSE parsing, buffering, and promise coordination
- apps/web/utils/llms/model.ts and routing changes — environment assumptions and provider selection
- apps/web/app/api/llm-tools/invoke/* — validation schemas and multi-handler logic

Possibly related PRs

Fallback to Haiku if service unavailable #279 — Related changes to Claude/LLM provider configuration and routing.
Improved LLM calls #660 — Refactors LLM invocation/routing; overlaps with provider routing and generate/stream delegation.
Migrate to ai sdk v5 #592 — Modifies LLM tools surface and handlers similar to the new tools proxy.

Suggested reviewers

johnlowe399-blip

Poem

🐇 I hopped through ports and spawned a friendly CLI,

Streams sing JSON while sessions hum on by,
Rules find their burrow, tests bound down the track,
Bun boots the wrapper — no carrots left to lack,
🥕 code snug in tunnels, ready for the sky.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title clearly summarizes the main feature being added: Claude Code CLI as a new LLM provider option.
Docstring Coverage	✅ Passed	Docstring coverage is 82.09% which is sufficient. The required threshold is 80.00%.


macroscopeapp · 2025-12-16T22:45:20Z

Add Claude Code CLI provider and route app LLM calls through new authenticated wrapper service and web SDK integrations

Introduce a Bun/Express wrapper exposing /health, /generate-text, /generate-object, and /stream, wire provider selection and streaming in web utils, and add Redis-backed session continuity and an authenticated LLM Tools invoke API with zod validation. Core entry points: apps/claude-code-wrapper/src/index.ts, apps/claude-code-wrapper/src/routes/generate.ts, apps/claude-code-wrapper/src/routes/stream.ts, apps/web/utils/llms/index.ts, and apps/web/utils/llms/chat-completion-stream.ts.

📍Where to Start

Start with the Express server and auth in apps/claude-code-wrapper/src/index.ts, then review request handling in apps/claude-code-wrapper/src/routes/generate.ts and apps/claude-code-wrapper/src/routes/stream.ts, followed by provider routing in apps/web/utils/llms/index.ts and stream wrapper in apps/web/utils/llms/chat-completion-stream.ts.

Macroscope summarized d0a0052.

coderabbitai

Actionable comments posted: 9

🧹 Nitpick comments (31)

apps/claude-code-wrapper/src/routes/health.ts (1)
44-46: Consider consuming stdout/stderr to prevent backpressure.

The spawned process pipes stdout and stderr but never reads from them. While unlikely to cause issues with claude --version (which produces minimal output), unconsumed streams can theoretically cause the child process to block on write if the pipe buffer fills.

If the output is not needed, consider ignoring both streams so the pipes can never fill:
 const proc = spawn("claude", ["--version"], {
-  stdio: ["ignore", "pipe", "pipe"],
+  stdio: ["ignore", "ignore", "ignore"],
 });
Or if you need the output for logging:
 const proc = spawn("claude", ["--version"], {
   stdio: ["ignore", "pipe", "pipe"],
 });
+
+proc.stdout?.on("data", () => {}); // Drain stdout
+proc.stderr?.on("data", () => {}); // Drain stderr
apps/claude-code-wrapper/src/logger.ts (1)
17-31: Consider adding a space before stringified args.

The args are concatenated directly after the message without a clear separator, which could make logs harder to read when both message and args are present.

Apply this diff to improve readability:
   info: (message: string, ...args: unknown[]) => {
     process.stderr.write(
-      `[INFO] [${new Date().toISOString()}] ${message} ${args.length ? safeStringify(args) : ""}\n`,
+      `[INFO] [${new Date().toISOString()}] ${message}${args.length ? " " + safeStringify(args) : ""}\n`,
     );
   },
   error: (message: string, ...args: unknown[]) => {
     process.stderr.write(
-      `[ERROR] [${new Date().toISOString()}] ${message} ${args.length ? safeStringify(args) : ""}\n`,
+      `[ERROR] [${new Date().toISOString()}] ${message}${args.length ? " " + safeStringify(args) : ""}\n`,
     );
   },
   warn: (message: string, ...args: unknown[]) => {
     process.stderr.write(
-      `[WARN] [${new Date().toISOString()}] ${message} ${args.length ? safeStringify(args) : ""}\n`,
+      `[WARN] [${new Date().toISOString()}] ${message}${args.length ? " " + safeStringify(args) : ""}\n`,
     );
   },
apps/claude-code-wrapper/.claude/skills/inbox-zero-tools/SKILL.md (1)
32-34: Add language specifiers to fenced code blocks.

The static analysis tool flagged these code blocks as missing language specifiers. For consistency and proper rendering, add text or http as the language.
-```
+```text
 POST ${INBOX_ZERO_API_URL}/api/llm-tools/invoke

-```
+```text
 Authorization: Bearer <LLM_TOOL_PROXY_TOKEN>
 Content-Type: application/json

Also applies to: 48-51

</blockquote></details>
apps/claude-code-wrapper/src/index.ts (1)
55-59: Request logging occurs after authentication.

The request logging middleware is placed after the auth middleware, which means failed auth attempts are not logged. Consider moving the logging middleware before auth if you want visibility into unauthorized access attempts for security monitoring.
apps/claude-code-wrapper/__tests__/cli.test.ts (2)
36-45: Environment restoration should use afterEach instead of afterAll.

The environment is reset in beforeEach, but restored only in afterAll. If a test fails mid-execution, subsequent tests in other describe blocks might run with a polluted environment. Consider using afterEach for the restoration to ensure proper isolation.
-    afterAll(() => {
+    afterEach(() => {
       process.env = originalEnv;
     });
Alternatively, keep afterAll but also add cleanup in afterEach for defensive isolation.

173-195: Ensure fake timers are cleaned up on test failure.

If an assertion fails before vi.useRealTimers() is called, subsequent tests may behave unexpectedly. Consider using a try/finally pattern or afterEach to ensure timers are always restored.
 it("times out and kills process when execution exceeds timeout", async () => {
   vi.useFakeTimers();
-  const mockProc = createMockChildProcess();
-  mockSpawn.mockReturnValue(mockProc as never);
-
-  const resultPromise = executeClaudeCli({
-    prompt: "Test",
-    timeoutMs: 1000,
-  });
-
-  // Advance time past the timeout
-  vi.advanceTimersByTime(1001);
-
-  await expect(resultPromise).rejects.toThrow(ClaudeCliError);
-  await expect(resultPromise).rejects.toMatchObject({
-    code: "TIMEOUT_ERROR",
-    message: expect.stringContaining("timed out"),
-  });
-
-  expect(mockProc.kill).toHaveBeenCalledWith("SIGTERM");
-
-  vi.useRealTimers();
+  try {
+    const mockProc = createMockChildProcess();
+    mockSpawn.mockReturnValue(mockProc as never);
+
+    const resultPromise = executeClaudeCli({
+      prompt: "Test",
+      timeoutMs: 1000,
+    });
+
+    // Advance time past the timeout
+    vi.advanceTimersByTime(1001);
+
+    await expect(resultPromise).rejects.toThrow(ClaudeCliError);
+    await expect(resultPromise).rejects.toMatchObject({
+      code: "TIMEOUT_ERROR",
+      message: expect.stringContaining("timed out"),
+    });
+
+    expect(mockProc.kill).toHaveBeenCalledWith("SIGTERM");
+  } finally {
+    vi.useRealTimers();
+  }
 });
apps/claude-code-wrapper/__tests__/e2e/real-cli.test.ts (2)
22-39: Consider making the prompt more deterministic.

The prompt asks Claude to respond with only "test", but the assertion only checks that result.text.length > 0. For more reliable E2E tests, you could assert that the response contains "test" (case-insensitive) to validate the model followed instructions.
       expect(result.text).toBeDefined();
       expect(result.text.length).toBeGreaterThan(0);
+      // Verify model attempted to follow the instruction
+      expect(result.text.toLowerCase()).toContain("test");
This makes the test more meaningful while still being resilient to minor variations.

101-115: Timeout E2E test may be flaky.

A 100ms timeout might occasionally succeed if the CLI responds very quickly (e.g., from cache or error). Consider using an even shorter timeout (e.g., 1ms) or mocking time for more reliable behavior, though that would defeat the E2E purpose.

Alternatively, document that this test may occasionally pass if the CLI fails fast for other reasons.
apps/claude-code-wrapper/__tests__/routes/generate.test.ts (1)
63-97: Consider using vi.useFakeTimers() instead of real setTimeout for more deterministic tests.

The current approach using setTimeout(..., 10) to simulate async CLI responses works but can be flaky under CPU load. Using fake timers would make tests more deterministic and faster.

Example approach:
 it("returns text response on successful CLI execution", async () => {
+  vi.useFakeTimers();
   const mockProc = createMockChildProcess();
   mockSpawn.mockReturnValue(mockProc as never);

   const app = createTestApp();
   const responsePromise = request(app)
     .post("/generate-text")
     .send({ prompt: "Hello" });

-  // Simulate successful CLI response
-  setTimeout(() => {
-    simulateCliSuccess(
-      mockProc,
-      createCliResultJson({
-        result: "Hi there!",
-        total_tokens_in: 5,
-        total_tokens_out: 10,
-        session_id: "session-123",
-      }),
-    );
-  }, 10);
+  // Simulate successful CLI response
+  await vi.advanceTimersByTimeAsync(1);
+  simulateCliSuccess(
+    mockProc,
+    createCliResultJson({
+      result: "Hi there!",
+      total_tokens_in: 5,
+      total_tokens_out: 10,
+      session_id: "session-123",
+    }),
+  );

   const res = await responsePromise;
+  vi.useRealTimers();
   // ...assertions
 });
apps/claude-code-wrapper/__tests__/integration/app.test.ts (1)
233-244: Consider verifying the error response structure more thoroughly.

The malformed JSON test checks for status 500 and generic error message, but it might be worth verifying the error code as well for consistency with other error responses.
     // Express JSON parser error is caught by error handler middleware
     expect(res.status).toBe(500);
     expect(res.body.error).toBe("Internal server error");
+    // Optionally verify error code for consistency
+    expect(res.body.code).toBeDefined();
apps/web/utils/llms/index.ts (1)
176-199: Consider extracting the Claude Code routing logic to reduce duplication.

The shouldUseClaudeCode condition and claudeCodeConfig resolution logic is duplicated between createGenerateText (lines 69-78) and createGenerateObject (lines 181-190). Consider extracting a helper function.
// Helper to determine Claude Code routing
function resolveClaudeCodeConfig(
  modelOptions: ReturnType<typeof getModel>
): { shouldUse: true; config: ClaudeCodeConfig } | { shouldUse: false } {
  const shouldUseClaudeCode =
    (modelOptions.provider === Provider.CLAUDE_CODE &&
      modelOptions.claudeCodeConfig) ||
    (env.DEFAULT_LLM_PROVIDER === Provider.CLAUDE_CODE &&
      isClaudeCodeAvailable());

  if (shouldUseClaudeCode) {
    const config = modelOptions.claudeCodeConfig || buildClaudeCodeConfig();
    return { shouldUse: true, config };
  }
  return { shouldUse: false };
}
apps/web/app/api/llm-tools/invoke/route.test.ts (1)
383-424: Add mock cleanup inside the loop to prevent test pollution.

The for loop iterates through valid tools and calls mockResolvedValueOnce, but if a previous iteration's assertions fail, subsequent iterations may use stale mock state. Consider adding explicit mock resets.
   for (const toolName of validTools) {
+    vi.clearAllMocks(); // Ensure clean state for each tool test
     vi.mocked(prisma.emailAccount.findUnique).mockResolvedValueOnce({
       id: "account-123",
       email: "[email protected]",
       account: { provider: "google" },
     } as any);
apps/web/utils/llms/chat-completion-stream.ts (2)
29-39: Consider making onStepFinish type explicit about Claude Code support limitations.

The onStepFinish callback is included in the interface but won't be called when using Claude Code path. Consider adding a JSDoc note on the property indicating this limitation.
 interface ChatCompletionStreamOptions {
   userAi: UserAIFields;
   modelType?: ModelType;
   messages: ModelMessage[];
   tools?: Record<string, Tool>;
   maxSteps?: number;
   userEmail: string;
   usageLabel: string;
   onFinish?: StreamTextOnFinishCallback<Record<string, Tool>>;
+  /** Note: Not called when using Claude Code provider */
   onStepFinish?: StreamTextOnStepFinishCallback<Record<string, Tool>>;
 }
91-92: Misleading id field usage in emailAccount object.

The id is set to userEmail with a comment saying "id not used", but this creates a confusing object structure. If the id isn't used by Claude Code, consider creating a proper type or using a placeholder that makes this explicit.
-    return createClaudeCodeStreamText({
-      emailAccount: { email: userEmail, id: userEmail }, // id not used, email used for sessions
+    return createClaudeCodeStreamText({
+      emailAccount: { email: userEmail, id: "" }, // id not needed for Claude Code sessions
apps/web/utils/llms/claude-code-override.test.ts (1)
114-124: Use vi.clearAllMocks() instead of vi.resetAllMocks() per coding guidelines.

The coding guidelines specify using vi.clearAllMocks() in beforeEach to clean up mocks between tests. resetAllMocks also resets mock implementations, which may cause unexpected behavior if mocks rely on specific implementations set at module level.
   beforeEach(() => {
-    vi.resetAllMocks();
+    vi.clearAllMocks();
     // Reset to default test configuration
     vi.mocked(env).DEFAULT_LLM_PROVIDER = "claudecode";
apps/web/app/api/llm-tools/invoke/route.ts (2)
208-214: Inconsistent error handling pattern across tool cases.

The getLearnedPatterns case returns an object with error property on validation failure (line 211), while other cases like createRule (line 217) pass unvalidated input directly to the execute function. This creates inconsistent behavior - some tools return error objects, others throw.

Consider validating all inputs consistently in the switch cases before calling execute functions:
     case "createRule": {
+      const parseResult = createRuleSchema(ctx.provider).safeParse(input);
+      if (!parseResult.success) {
+        return { error: `Invalid input: ${parseResult.error.message}` };
+      }
-      return executeCreateRule(input, ctx);
+      return executeCreateRule(parseResult.data, ctx);
+    }
411-416: Logging full error object may expose sensitive information.

The logger.error call with { error } could serialize stack traces or internal details. Use error message only.
   } catch (error) {
     const message = error instanceof Error ? error.message : String(error);
-    logger.error("Failed to create rule", { error });
+    logger.error("Failed to create rule", { error: message });
     return { error: `Failed to create rule: ${message}` };
   }
apps/web/utils/llms/claude-code-llm.test.ts (2)
30-36: Mock ClaudeCodeError missing rawText property from actual implementation.

The mock class doesn't include the rawText property that exists in the actual ClaudeCodeError class (which declares rawText?: string). This could cause tests to pass when code relies on rawText being present.
   ClaudeCodeError: class ClaudeCodeError extends Error {
     code: string;
-    constructor(message: string, code: string) {
+    rawText?: string;
+    constructor(message: string, code: string, rawText?: string) {
       super(message);
       this.code = code;
+      this.rawText = rawText;
     }
   },
420-429: Using setTimeout for async waiting can cause flaky tests.

The test uses setTimeout(r, 10) to wait for async session save. This could be flaky under load. Consider using vi.waitFor or flushing promises more reliably.
-      // Wait for async session save
-      await new Promise((r) => setTimeout(r, 10));
+      // Flush pending promises
+      await vi.waitFor(() => {
+        expect(mockSaveSession).toHaveBeenCalled();
+      });
apps/claude-code-wrapper/src/routes/stream.ts (1)
221-231: Empty catch block should use _ to indicate intentional discard.

The empty catch block at Line 221 silently discards the error variable. Use catch (_) or catch (_error) to signal intent.
-      } catch {
+      } catch (_) {
apps/claude-code-wrapper/__tests__/routes/stream.test.ts (1)

66-68: Consider using vi.advanceTimersByTimeAsync for setTimeout in tests.

The tests use real setTimeout with 50ms delays. While this works, using fake timers with vi.advanceTimersByTimeAsync would make tests faster and more deterministic. However, given the note about supertest limitations with fake timers (Lines 440-450), the current approach is reasonable.
apps/web/utils/llms/claude-code-http.test.ts (1)
637-641: Consider extracting stream consumption to a helper.

The pattern of consuming the stream reader appears multiple times (Lines 637-641, 670-673, 700-704, 735-739, 839-842, 942-947). A helper function would reduce duplication.
async function consumeStream(reader: ReadableStreamDefaultReader<string>): Promise<string[]> {
  const chunks: string[] = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
  }
  return chunks;
}
apps/web/utils/redis/claude-code-session.ts (2)
69-74: Consider sanitizing userEmail in Redis key.

While userEmail is likely validated upstream, directly interpolating it into the Redis key could be problematic if it contains special characters like :. Consider URL-encoding or validating the format.
 function getSessionKey(
   userEmail: string,
   workflowGroup: WorkflowGroup,
 ): string {
-  return `claude-session:${userEmail}:${workflowGroup}`;
+  // Encode email to handle special characters safely in Redis key
+  const safeEmail = encodeURIComponent(userEmail);
+  return `claude-session:${safeEmail}:${workflowGroup}`;
 }
36-55: Workflow label mapping uses string literals prone to typos.

The hardcoded label strings (e.g., "email-report-email-behavior") are scattered between this mapping and presumably other parts of the codebase. Consider extracting these as constants or using a more type-safe approach to prevent silent failures from typos.
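One way to make the label strings typo-resistant is to keep them as keys of a single `as const` object and derive the label type from it. The sketch below is an illustration only: the group names (`report`, `rules`, `default`) and the second label are hypothetical, and only `email-report-email-behavior` comes from the code under review.

```typescript
// Hypothetical workflow groups; the real WorkflowGroup type is defined in the PR.
type WorkflowGroup = "report" | "rules" | "default";

// Single source of truth for label -> group.
// Only the first key appears in the review; the second is illustrative.
const LABEL_TO_GROUP = {
  "email-report-email-behavior": "report",
  "example-rule-label": "rules",
} as const;

// Union of the known label strings, derived from the object keys.
type KnownLabel = keyof typeof LABEL_TO_GROUP;

// Type guard: narrows an arbitrary string to a known label.
function isKnownLabel(label: string): label is KnownLabel {
  return Object.prototype.hasOwnProperty.call(LABEL_TO_GROUP, label);
}

// Unknown labels fall back to the (hypothetical) default group.
function getWorkflowGroup(label: string): WorkflowGroup {
  return isKnownLabel(label) ? LABEL_TO_GROUP[label] : "default";
}
```

A call site typed as `KnownLabel` instead of `string` would turn a misspelled label into a compile error rather than a silent fallback.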
apps/web/utils/llms/model.ts (2)
247-259: Type assertion null as unknown as LanguageModelV2 is a code smell.

While this works, setting model to null with a type assertion breaks the type contract of SelectModel. Consider making the model field optional or using a discriminated union type based on the provider.
 export type SelectModel = {
   provider: string;
   modelName: string;
-  model: LanguageModelV2;
+  model: LanguageModelV2 | null;
   providerOptions?: Record<string, any>;
   backupModel: LanguageModelV2 | null;
   /** Configuration for Claude Code provider (only set when provider is CLAUDE_CODE) */
   claudeCodeConfig?: ClaudeCodeConfig;
 };
Alternatively, use a discriminated union:
export type SelectModel = 
  | { provider: typeof Provider.CLAUDE_CODE; modelName: string; model: null; claudeCodeConfig: ClaudeCodeConfig; backupModel: null; }
  | { provider: Exclude<string, typeof Provider.CLAUDE_CODE>; modelName: string; model: LanguageModelV2; backupModel: LanguageModelV2 | null; claudeCodeConfig?: never; };
442-446: Placeholder string "claude-code-wrapper" should be a constant.

The placeholder value is hardcoded. Consider extracting it as a named constant for clarity and to avoid magic strings.
+const CLAUDE_CODE_PLACEHOLDER_KEY = "claude-code-wrapper";
+
 function getProviderApiKey(provider: string) {
   const providerApiKeys: Record<string, string | undefined> = {
     // ... other providers
     [Provider.CLAUDE_CODE]: env.CLAUDE_CODE_BASE_URL
-      ? "claude-code-wrapper"
+      ? CLAUDE_CODE_PLACEHOLDER_KEY
       : undefined,
   };
apps/claude-code-wrapper/src/cli.ts (2)
85-105: Minor: SIGKILL fallback timeout is not cleaned up.

If the process exits gracefully after SIGTERM but before the 1-second SIGKILL fallback fires, that inner timeout remains scheduled. While functionally benign (killing an already-dead process is a no-op), consider storing the timeout reference and clearing it in the close handler for cleaner resource management.
     if (timeoutMs > 0) {
       timeoutId = setTimeout(() => {
         if (!isSettled) {
           isSettled = true;
           logger.error("Claude CLI execution timed out", {
             timeoutMs,
             prompt: options.prompt.slice(0, 100),
           });
           claude.kill("SIGTERM");
           // Give it a moment to clean up, then force kill
-          setTimeout(() => claude.kill("SIGKILL"), 1000);
+          const killTimeoutId = setTimeout(() => claude.kill("SIGKILL"), 1000);
+          claude.once("close", () => clearTimeout(killTimeoutId));
           reject(
             new ClaudeCliError(
227-235: Consider reducing log level for expected non-JSON lines.

The comment states "non-JSON lines are expected in some outputs", but logging at warn level will generate noise. Consider using logger.trace instead, which aligns with the comment's expectation.
     } catch (error) {
       parseErrorCount++;
       // Log at debug level since non-JSON lines are expected in some outputs
-      logger.warn("Failed to parse CLI output line as JSON", {
+      logger.trace("Failed to parse CLI output line as JSON", {
         line: line.slice(0, 200),
         error: error instanceof Error ? error.message : "Unknown error",
       });
     }
apps/web/utils/llms/claude-code.ts (1)
413-432: Defensive flush handling with potential double resolve.

The flush handler resolves textPromise at line 431 even if it was already resolved in the "done" event at line 372. While Promise semantics make subsequent resolve() calls no-ops, consider adding a textResolved flag for consistency with sessionIdReceived and usageReceived to make the intent clearer.
   let accumulatedText = "";
   let sessionIdReceived = false;
   let usageReceived = false;
+  let textResolved = false;
   
   // ... in "done" case:
             case "done": {
               // Stream complete, resolve the text promise
+              textResolved = true;
               resolveText(accumulatedText);
   
   // ... in flush:
       flush() {
         // ...
-        resolveText(accumulatedText);
+        if (!textResolved) {
+          resolveText(accumulatedText);
+        }
       },
apps/web/utils/llms/claude-code-llm.ts (2)
29-32: Refactor type aliases to avoid any.

The any types are overly permissive. Based on the usage patterns in the code, these functions accept and return objects with specific shapes ({ system?, prompt, schema? } for input and { text/object, usage, ... } for output). Consider defining explicit interfaces instead.

Apply this pattern:
-// biome-ignore lint/suspicious/noExplicitAny: Complex AI SDK types require flexibility
-type ClaudeCodeGenerateTextFn = (...args: any[]) => Promise<any>;
-// biome-ignore lint/suspicious/noExplicitAny: Complex AI SDK types require flexibility
-type ClaudeCodeGenerateObjectFn = (...args: any[]) => Promise<any>;
+interface GenerateTextOptions {
+  system?: string;
+  prompt: string | ModelMessage[];
+}
+
+interface GenerateTextResult {
+  text: string;
+  usage: LanguageModelUsage;
+  finishReason: string;
+  // ... other fields
+}
+
+type ClaudeCodeGenerateTextFn = (options: GenerateTextOptions) => Promise<GenerateTextResult>;
+
+interface GenerateObjectOptions<T> extends GenerateTextOptions {
+  schema: unknown; // or more specific Zod schema type
+}
+
+interface GenerateObjectResult<T> {
+  object: T;
+  usage: LanguageModelUsage;
+  // ... other fields
+}
+
+type ClaudeCodeGenerateObjectFn = <T>(options: GenerateObjectOptions<T>) => Promise<GenerateObjectResult<T>>;
146-254: LGTM with optional refactor opportunity.

The implementation correctly handles session management, usage tracking, and error logging. The extensive stub values (lines 220-242) are well-documented and match the Vercel AI SDK interface requirements.

Consider extracting the common result structure into a helper function to reduce duplication with createClaudeCodeGenerateObject (lines 220-242 vs 342-359):
function createGenerateResultBase(
  modelName: string,
  sessionId: string,
  usage: LanguageModelUsage
) {
  return {
    finishReason: "stop" as const,
    usage,
    request: {},
    response: {
      id: sessionId,
      timestamp: new Date(),
      modelId: modelName,
      headers: {},
      body: undefined,
    },
    warnings: [],
    providerMetadata: undefined,
    experimental_providerMetadata: undefined,
  };
}

apps/claude-code-wrapper/src/routes/generate.ts

apps/web/.env.example

apps/web/app/api/llm-tools/invoke/route.ts

apps/web/app/api/llm-tools/invoke/validation.ts

apps/web/package.json

apps/web/utils/llms/claude-code-llm.ts

apps/web/utils/llms/index.ts

cubic-dev-ai

5 issues found across 48 files

Prompt for AI agents (all 5 issues)


Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="apps/claude-code-wrapper/__tests__/setup.ts">

<violation number="1" location="apps/claude-code-wrapper/__tests__/setup.ts:19">
P2: Console mocks set at module level will be undone by `vi.restoreAllMocks()` after the first test. Move the console spies inside `beforeEach` so they're re-established before each test.</violation>
</file>

<file name="apps/web/utils/llms/model.ts">

<violation number="1" location="apps/web/utils/llms/model.ts:443">
P2: Inconsistent availability check - `getProviderApiKey` only checks `CLAUDE_CODE_BASE_URL` while `isClaudeCodeAvailable()` and `buildClaudeCodeConfig` require both `CLAUDE_CODE_BASE_URL` and `CLAUDE_CODE_WRAPPER_API_KEY`. Consider checking both for consistency.</violation>
</file>

<file name="apps/claude-code-wrapper/Dockerfile">

<violation number="1" location="apps/claude-code-wrapper/Dockerfile:54">
P2: Pin the `@anthropic-ai/claude-code` package version for reproducible builds. Without version pinning, builds at different times may pull different CLI versions with potentially breaking changes.</violation>
</file>

<file name="apps/web/app/api/llm-tools/invoke/validation.ts">

<violation number="1" location="apps/web/app/api/llm-tools/invoke/validation.ts:123">
P2: Missing `ActionType.MOVE_FOLDER` from the action type enum. The Prisma schema defines `MOVE_FOLDER` as a valid action type, and the `fields` object already includes `folderName` which is used by this action type.</violation>
</file>

<file name="apps/web/utils/llms/claude-code.ts">

<violation number="1" location="apps/web/utils/llms/claude-code.ts:188">
P1: Creating a new `TextDecoder()` per chunk will incorrectly decode multi-byte UTF-8 characters (emoji, non-ASCII) that span chunk boundaries. The decoder should be instantiated once and reused with `{ stream: true }` option.</violation>
</file>


apps/claude-code-wrapper/__tests__/setup.ts

apps/web/utils/llms/model.ts

apps/claude-code-wrapper/Dockerfile

apps/web/app/api/llm-tools/invoke/validation.ts

apps/web/utils/llms/claude-code.ts

Replace greedy regex patterns with non-greedy global matches that iterate through candidates. This prevents incorrect extraction when responses contain multiple JSON fragments or trailing text.
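As an illustration of why the greedy pattern misfires, here is a minimal sketch of candidate iteration with a non-greedy global regex. The helper names are hypothetical and this is not the PR's actual implementation; it only shows the general idea of trying each candidate until one parses.

```typescript
// Hypothetical helper: returns the first candidate that parses as JSON, or undefined.
function extractFirstJson(text: string): unknown {
  // Non-greedy + global: each match spans from a "{" to the nearest following "}".
  const candidatePattern = /\{[\s\S]*?\}/g;
  let match: RegExpExecArray | null;
  while ((match = candidatePattern.exec(text)) !== null) {
    try {
      return JSON.parse(match[0]);
    } catch {
      // Not valid JSON; continue with the next candidate.
    }
  }
  return undefined;
}

// Greedy variant for comparison: one match from the first "{" to the last "}".
function extractGreedy(text: string): unknown {
  const match = /\{[\s\S]*\}/.exec(text);
  if (match === null) return undefined;
  try {
    return JSON.parse(match[0]);
  } catch {
    return undefined;
  }
}

const reply = 'First {"a":1} and then {"b":2} done.';
const fromCandidates = extractFirstJson(reply); // parses the first fragment
const fromGreedy = extractGreedy(reply); // spans both fragments, so parsing fails
```

This sketch has a known limitation: a nested object such as `{"a":{"b":1}}` is cut at the first closing brace and never parses, so real code would need a brace-depth scan or a similar extension.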

- Use generic error message instead of including userEmail in 404 response
- Remove userEmail from tool invocation logs (emailAccountId suffices for debugging)

All ruleName fields now require non-empty strings with min(1) validation, matching the existing pattern in getLearnedPatternsInputSchema.

Import and use the proper WorkflowGroup type throughout the session management functions, eliminating the need for unsafe type assertions.

Add isTextContentPart type guard to validate content part objects before accessing properties, replacing unsafe type assertion.
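A type guard of this kind might look like the sketch below. The `isTextContentPart` name comes from the commit message, but the `TextContentPart` shape and the `collectText` helper are assumptions made for illustration.

```typescript
// Assumed shape of a text content part; the real type comes from the AI SDK.
interface TextContentPart {
  type: "text";
  text: string;
}

// Checks the runtime shape before narrowing, instead of asserting with `as`.
function isTextContentPart(part: unknown): part is TextContentPart {
  if (typeof part !== "object" || part === null) return false;
  const candidate = part as Record<string, unknown>;
  return candidate.type === "text" && typeof candidate.text === "string";
}

// Hypothetical consumer: concatenates only the parts that pass the guard.
function collectText(parts: unknown[]): string {
  return parts
    .filter(isTextContentPart)
    .map((part) => part.text)
    .join("");
}
```

Parts that are `null`, lack a `text` string, or carry another `type` are skipped rather than read unsafely.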

Creating a new TextDecoder per chunk incorrectly decodes multi-byte UTF-8 characters (emojis, CJK, etc.) that span chunk boundaries. Fix by instantiating decoder once and using { stream: true } option.
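The difference can be reproduced in isolation. The sketch below uses hypothetical helper names and a hand-split byte array rather than the PR's stream code; it contrasts one shared decoder using `{ stream: true }` with a fresh decoder per chunk.

```typescript
// Decodes chunks with a single decoder so partial multi-byte sequences
// are buffered until the next chunk arrives.
function decodeChunksStreaming(chunks: Uint8Array[]): string {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  for (const chunk of chunks) {
    text += decoder.decode(chunk, { stream: true });
  }
  // Final call without arguments flushes any bytes still buffered.
  text += decoder.decode();
  return text;
}

// Buggy variant: a new decoder per chunk treats each chunk as a complete input.
function decodeChunksPerChunk(chunks: Uint8Array[]): string {
  return chunks.map((chunk) => new TextDecoder("utf-8").decode(chunk)).join("");
}

// "a😀b" is 6 bytes in UTF-8; splitting after byte 3 cuts the emoji in half.
const original = "a😀b";
const bytes = new TextEncoder().encode(original);
const splitChunks = [bytes.subarray(0, 3), bytes.subarray(3)];

const streamed = decodeChunksStreaming(splitChunks); // emoji survives
const perChunk = decodeChunksPerChunk(splitChunks); // emoji becomes replacement characters
```

The per-chunk variant emits U+FFFD replacement characters for the split emoji, which is the corruption the commit describes.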

Both BASE_URL and WRAPPER_API_KEY are now required, matching the isClaudeCodeAvailable() check.
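A minimal sketch of the consistent check follows, assuming the environment is passed in as a plain object (the PR's `isClaudeCodeAvailable()` takes no arguments and reads `env` directly). The variable names and the `"claude-code-wrapper"` placeholder come from the review; `getClaudeCodeApiKey` is a hypothetical name.

```typescript
// Subset of the environment relevant to the Claude Code provider.
interface ClaudeCodeEnv {
  CLAUDE_CODE_BASE_URL?: string;
  CLAUDE_CODE_WRAPPER_API_KEY?: string;
}

// Available only when both values are set and non-empty.
function isClaudeCodeAvailable(env: ClaudeCodeEnv): boolean {
  return Boolean(env.CLAUDE_CODE_BASE_URL && env.CLAUDE_CODE_WRAPPER_API_KEY);
}

// Hypothetical stand-in for the Claude Code entry in getProviderApiKey:
// it reuses the same availability check instead of testing the URL alone.
function getClaudeCodeApiKey(env: ClaudeCodeEnv): string | undefined {
  return isClaudeCodeAvailable(env) ? "claude-code-wrapper" : undefined;
}
```

Routing both call sites through one predicate keeps the two checks from drifting apart again.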

The action type enum was missing MOVE_FOLDER, which is a valid Prisma ActionType. The fields object already includes folderName.

Console mocks at module level are undone by vi.restoreAllMocks() in afterEach. Moving them into beforeEach ensures they are re-established before each test.

Replace the unsafe `!` assertion with proper fallback chain matching buildClaudeCodeConfig() logic: config.model || env var || "sonnet"
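The fallback chain can be sketched as a small pure function. The `resolveModel` name and its parameters are hypothetical, the model names in the usage lines are only examples, and the env value is passed in because the exact variable name is not shown here.

```typescript
// Default named in the commit message.
const DEFAULT_MODEL = "sonnet";

// Resolves the model without a non-null assertion:
// explicit config first, then the env-provided value, then the default.
function resolveModel(
  configModel: string | undefined,
  envModel: string | undefined,
): string {
  return configModel || envModel || DEFAULT_MODEL;
}

const fromConfig = resolveModel("haiku", "opus"); // config wins
const fromEnv = resolveModel(undefined, "opus"); // env fallback
const fromDefault = resolveModel(undefined, undefined); // default
```

Using `||` rather than `??` also skips empty strings, which matters when an env var is exported but left blank.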

Test was expecting error message to contain the email address, but we removed that for security (commit 3e6de34).

matthew-petty · 2025-12-17T00:16:20Z

✅ All Tests Pass

Verified locally after pulling all changes - all tests pass successfully.

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (1)

apps/claude-code-wrapper/src/routes/generate.ts (1)

77-84: Consider sanitizing validation error details.

Same as the /generate-text endpoint, the validation error response exposes the full Zod error structure. Consider returning a more generic error message.

🧹 Nitpick comments (3)

apps/web/utils/llms/claude-code.ts (1)
415-435: Potential double resolution of text promise.

The resolveText(accumulatedText) is called both in the "done" event handler (line 374) and unconditionally in flush() (line 433). While JavaScript promises ignore subsequent resolutions, this indicates unclear intent. Consider tracking whether the "done" event was received to avoid the redundant call.

Apply this diff:
   let accumulatedText = "";
   let sessionIdReceived = false;
   let usageReceived = false;
+  let doneReceived = false;

   // ... in transform switch case "done":
             case "done": {
               // Stream complete, resolve the text promise
+              doneReceived = true;
               resolveText(accumulatedText);
               // ...
             }

   // ... in flush():
       flush() {
         // Handle promises that were never resolved/rejected during stream
         if (!sessionIdReceived) {
           rejectSessionId(/* ... */);
         }
         if (!usageReceived) {
           rejectUsage(/* ... */);
         }
-        resolveText(accumulatedText);
+        if (!doneReceived) {
+          resolveText(accumulatedText);
+        }
       },
apps/web/utils/llms/claude-code-llm.ts (1)
590-601: Consider calling onFinish even when usage tracking fails.

The onFinish callback is only invoked if usage is defined (line 591). If usage tracking fails, the callback is silently skipped. Consider whether onFinish should be called regardless, potentially with partial data, since callers may need to perform cleanup or logging.

Consider this alternative that always calls onFinish:
     // Call onFinish callback if provided - always attempt even if usage tracking failed
-    if (onFinish && usage) {
+    if (onFinish) {
       try {
-        await onFinish({ text, usage });
+        await onFinish({ text, usage: usage ?? { inputTokens: 0, outputTokens: 0, totalTokens: 0 } });
       } catch (onFinishError) {
apps/claude-code-wrapper/src/routes/generate.ts (1)
24-31: Consider sanitizing validation error details.

The validation error response exposes the full Zod error structure via JSON.stringify(parseResult.error.issues), which could reveal internal schema details. While this is an internal service with API key auth, consider returning a more generic error message for production use.

Consider applying this pattern:
     if (!parseResult.success) {
       res.status(400).json({
-        error: "Invalid request body",
+        error: "Invalid request body. Check required fields: prompt, model, etc.",
         code: "VALIDATION_ERROR",
-        rawText: JSON.stringify(parseResult.error.issues),
       });
       return;
     }

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dc11d13 and 6136c69.

📒 Files selected for processing (10)

.histfile (1 hunks)
apps/claude-code-wrapper/__tests__/setup.ts (1 hunks)
apps/claude-code-wrapper/src/routes/generate.ts (1 hunks)
apps/web/app/api/llm-tools/invoke/route.test.ts (1 hunks)
apps/web/app/api/llm-tools/invoke/route.ts (1 hunks)
apps/web/app/api/llm-tools/invoke/validation.ts (1 hunks)
apps/web/utils/llms/claude-code-llm.ts (1 hunks)
apps/web/utils/llms/claude-code.ts (1 hunks)
apps/web/utils/llms/index.ts (4 hunks)
apps/web/utils/llms/model.ts (4 hunks)

✅ Files skipped from review due to trivial changes (1)

.histfile

🚧 Files skipped from review as they are similar to previous changes (1)

apps/claude-code-wrapper/__tests__/setup.ts

🧰 Additional context used

📓 Path-based instructions (22)

**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/data-fetching.mdc)

**/*.{ts,tsx}: For API GET requests to server, use the swr package
Use result?.serverError with toastError from @/components/Toast for error handling in async operations

**/*.{ts,tsx}: Use wrapper functions for Gmail message operations (get, list, batch, etc.) from @/utils/gmail/message.ts instead of direct API calls
Use wrapper functions for Gmail thread operations from @/utils/gmail/thread.ts instead of direct API calls
Use wrapper functions for Gmail label operations from @/utils/gmail/label.ts instead of direct API calls

**/*.{ts,tsx}: For early access feature flags, create hooks using the naming convention use[FeatureName]Enabled that return a boolean from useFeatureFlagEnabled("flag-key")
For A/B test variant flags, create hooks using the naming convention use[FeatureName]Variant that define variant types, use useFeatureFlagVariantKey() with type casting, and provide a default "control" fallback
Use kebab-case for PostHog feature flag keys (e.g., inbox-cleaner, pricing-options-2)
Always define types for A/B test variant flags (e.g., type PricingVariant = "control" | "variant-a" | "variant-b") and provide type safety through type casting

**/*.{ts,tsx}: Don't use primitive type aliases or misleading types
Don't use empty type parameters in type aliases and interfaces
Don't use this and super in static contexts
Don't use any or unknown as type constraints
Don't use the TypeScript directive @ts-ignore
Don't use TypeScript enums
Don't export imported variables
Don't add type annotations to variables, parameters, and class properties that are initialized with literal expressions
Don't use TypeScript namespaces
Don't use non-null assertions with the ! postfix operator
Don't use parameter properties in class constructors
Don't use user-defined types
Use as const instead of literal types and type annotations
Use either T[] or Array<T> consistently
Initialize each enum member value explicitly
Use export type for types
Use `impo...

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

**/{pages,routes,components}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/gmail-api.mdc)

Never call Gmail API directly from routes or components - always use wrapper functions from the utils folder

Files:

apps/claude-code-wrapper/src/routes/generate.ts

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/prisma-enum-imports.mdc)

Always import Prisma enums from @/generated/prisma/enums instead of @/generated/prisma/client to avoid Next.js bundling errors in client components

Import Prisma using the project's centralized utility: import prisma from '@/utils/prisma'

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/security.mdc)

**/*.ts: ALL database queries MUST be scoped to the authenticated user/account by including user/account filtering in WHERE clauses to prevent unauthorized data access
Always validate that resources belong to the authenticated user before performing operations, using ownership checks in WHERE clauses or relationships
Always validate all input parameters for type, format, and length before using them in database queries
Use SafeError for error responses to prevent information disclosure. Generic error messages should not reveal internal IDs, logic, or resource ownership details
Only return necessary fields in API responses using Prisma's select option. Never expose sensitive data such as password hashes, private keys, or system flags
Prevent Insecure Direct Object References (IDOR) by validating resource ownership before operations. All findUnique/findFirst calls MUST include ownership filters
Prevent mass assignment vulnerabilities by explicitly whitelisting allowed fields in update operations instead of accepting all user-provided data
Prevent privilege escalation by never allowing users to modify system fields, ownership fields, or admin-only attributes through user input
All findMany queries MUST be scoped to the user's data by including appropriate WHERE filters to prevent returning data from other users
Use Prisma relationships for access control by leveraging nested where clauses (e.g., emailAccount: { id: emailAccountId }) to validate ownership

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts
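
The ownership-scoping and field-selection rules above can be illustrated without a database. The following is a minimal sketch, not the project's code: `SafeError` is a local stand-in for the project's class, and `RuleRecord`, `PublicRule`, `findOwnedRule`, and the in-memory array are hypothetical names standing in for a Prisma query that carries an ownership filter in its `where` clause and a `select` limiting returned fields.

```typescript
// Local stand-in for the project's SafeError (assumption: the real class lives elsewhere).
class SafeError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "SafeError";
  }
}

// Hypothetical record shape, for illustration only.
interface RuleRecord {
  id: string;
  emailAccountId: string;
  name: string;
  internalFlag: boolean; // a system field that must never reach the client
}

// Only the fields the caller needs, mirroring a Prisma `select`.
type PublicRule = Pick<RuleRecord, "id" | "name">;

function findOwnedRule(
  rules: readonly RuleRecord[],
  ruleId: string,
  emailAccountId: string,
): PublicRule {
  // Validate input type/length before it reaches the lookup.
  if (ruleId.length === 0 || ruleId.length > 64) {
    throw new SafeError("Invalid request");
  }

  // The ownership check is part of the lookup itself, not a separate step.
  const rule = rules.find(
    (r) => r.id === ruleId && r.emailAccountId === emailAccountId,
  );

  // Same generic message whether the rule is missing or owned by someone else.
  if (!rule) {
    throw new SafeError("Rule not found");
  }

  return { id: rule.id, name: rule.name };
}
```

Because the ownership condition is part of the lookup, a rule that belongs to another account is indistinguishable from a rule that does not exist, which is what the IDOR and information-disclosure rules ask for.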

**/*.{tsx,ts}

📄 CodeRabbit inference engine (.cursor/rules/ui-components.mdc)

**/*.{tsx,ts}: Use Shadcn UI and Tailwind for components and styling
Use next/image package for images
For API GET requests to server, use the swr package with hooks like useSWR to fetch data
For text inputs, use the Input component with registerProps for form integration and error handling

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

**/*.{tsx,ts,css}

📄 CodeRabbit inference engine (.cursor/rules/ui-components.mdc)

Implement responsive design with Tailwind CSS using a mobile-first approach

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

**/*.{js,jsx,ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)

**/*.{js,jsx,ts,tsx}: Don't use accessKey attribute on any HTML element
Don't set aria-hidden="true" on focusable elements
Don't add ARIA roles, states, and properties to elements that don't support them
Don't use distracting elements like <marquee> or <blink>
Only use the scope prop on <th> elements
Don't assign non-interactive ARIA roles to interactive HTML elements
Make sure label elements have text content and are associated with an input
Don't assign interactive ARIA roles to non-interactive HTML elements
Don't assign tabIndex to non-interactive HTML elements
Don't use positive integers for tabIndex property
Don't include "image", "picture", or "photo" in img alt prop
Don't use explicit role property that's the same as the implicit/default role
Make static elements with click handlers use a valid role attribute
Always include a title element for SVG elements
Give all elements requiring alt text meaningful information for screen readers
Make sure anchors have content that's accessible to screen readers
Assign tabIndex to non-interactive HTML elements with aria-activedescendant
Include all required ARIA attributes for elements with ARIA roles
Make sure ARIA properties are valid for the element's supported roles
Always include a type attribute for button elements
Make elements with interactive roles and handlers focusable
Give heading elements content that's accessible to screen readers (not hidden with aria-hidden)
Always include a lang attribute on the html element
Always include a title attribute for iframe elements
Accompany onClick with at least one of: onKeyUp, onKeyDown, or onKeyPress
Accompany onMouseOver/onMouseOut with onFocus/onBlur
Include caption tracks for audio and video elements
Use semantic elements instead of role attributes in JSX
Make sure all anchors are valid and navigable
Ensure all ARIA properties (aria-*) are valid
Use valid, non-abstract ARIA roles for elements with ARIA roles
Use valid AR...

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

!(pages/_document).{jsx,tsx}

📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)

Don't use the next/head module in pages/_document.js on Next.js projects

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

**/*.{js,ts,jsx,tsx}

📄 CodeRabbit inference engine (.cursor/rules/utilities.mdc)

**/*.{js,ts,jsx,tsx}: Use lodash utilities for common operations (arrays, objects, strings)
Import specific lodash functions to minimize bundle size (e.g., import groupBy from 'lodash/groupBy')

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

apps/web/**/*.{ts,tsx}

📄 CodeRabbit inference engine (apps/web/CLAUDE.md)

apps/web/**/*.{ts,tsx}: Use TypeScript with strict null checks
Use @/ path aliases for imports from project root
Use proper error handling with try/catch blocks
Format code with Prettier
Follow consistent naming conventions using PascalCase for components
Centralize shared types in dedicated type files

Import specific lodash functions rather than entire lodash library to minimize bundle size (e.g., import groupBy from 'lodash/groupBy')

Files:

apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

apps/web/app/**/*.{ts,tsx}

📄 CodeRabbit inference engine (apps/web/CLAUDE.md)

Follow the Next.js App Router structure with the (app) directory

Files:

apps/web/app/api/llm-tools/invoke/route.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts

apps/web/app/api/**/*.ts

📄 CodeRabbit inference engine (apps/web/CLAUDE.md)

apps/web/app/api/**/*.ts: Wrap GET API routes with withAuth or withEmailAccount middleware for authentication
Export response types from GET API routes using Awaited<ReturnType<>> pattern for type-safe client usage

Files:

apps/web/app/api/llm-tools/invoke/route.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts

apps/web/app/api/**/route.ts

📄 CodeRabbit inference engine (.cursor/rules/fullstack-workflow.mdc)

apps/web/app/api/**/route.ts: Create GET API routes using withAuth or withEmailAccount middleware in apps/web/app/api/*/route.ts, export response types as GetExampleResponse type alias for client-side type safety
Always export response types from GET routes as Get[Feature]Response using type inference from the data fetching function for type-safe client consumption
Do NOT use POST API routes for mutations - always use server actions with next-safe-action instead

Files:

apps/web/app/api/llm-tools/invoke/route.ts

**/app/**/route.ts

📄 CodeRabbit inference engine (.cursor/rules/get-api-route.mdc)

**/app/**/route.ts: Always wrap GET API route handlers with withAuth or withEmailAccount middleware for consistent error handling and authentication in Next.js App Router
Infer and export response type for GET API routes using Awaited<ReturnType<typeof functionName>> pattern in Next.js
Use Prisma for database queries in GET API routes
Return responses using NextResponse.json() in GET API routes
Do not use try/catch blocks in GET API route handlers when using withAuth or withEmailAccount middleware, as the middleware handles error handling

Files:

apps/web/app/api/llm-tools/invoke/route.ts
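
The `Awaited<ReturnType<typeof functionName>>` rule above can be sketched in isolation. This is an illustration under assumptions: `getExampleData` and `summarize` are hypothetical, and the `withAuth`/`withEmailAccount` wrapper and `NextResponse.json()` call that a real route would use are left out so the type inference is the only thing on show.

```typescript
// Hypothetical data-fetching function. In a real route this would run a
// Prisma query scoped to the authenticated account.
async function getExampleData(emailAccountId: string) {
  return {
    items: [{ id: "1", label: `example for ${emailAccountId}` }],
    total: 1,
  };
}

// The response type is inferred from the function, so it cannot drift
// from what the route actually returns.
export type GetExampleResponse = Awaited<ReturnType<typeof getExampleData>>;

// Client-side code can then type the payload it receives.
function summarize(response: GetExampleResponse): string {
  const first = response.items[0]?.label ?? "none";
  return `${response.total} item(s), first: ${first}`;
}
```

Adding or renaming a field in `getExampleData` changes `GetExampleResponse` automatically, and any client code that relied on the old shape fails to compile.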

**/{server,api,actions,utils}/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/logging.mdc)

**/{server,api,actions,utils}/**/*.ts: Use createScopedLogger from "@/utils/logger" for logging in backend code
Add the createScopedLogger instantiation at the top of the file with an appropriate scope name
Use .with() method to attach context variables only within specific functions, not on global loggers
For large functions with reused variables, use createScopedLogger().with() to attach context once and reuse the logger without passing variables repeatedly

Files:

apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts
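
The logging rules above describe a module-level scoped logger plus per-function context attached with `.with()`. The sketch below shows that shape only. The `createScopedLogger` defined here is a local stand-in written for this example (the project's real helper is imported from `@/utils/logger` and its exact signature is not shown in this review), and `logSink`, `handleInvoke`, and the scope name are hypothetical.

```typescript
type LogContext = Record<string, unknown>;

interface ScopedLogger {
  info(message: string, extra?: LogContext): void;
  with(context: LogContext): ScopedLogger;
}

// In-memory sink so the sketch has no I/O dependency.
const logSink: string[] = [];

// Stand-in for the project's createScopedLogger (assumed shape).
function createScopedLogger(scope: string, context: LogContext = {}): ScopedLogger {
  return {
    info(message, extra = {}) {
      logSink.push(JSON.stringify({ scope, message, ...context, ...extra }));
    },
    // Returns a new logger; the original logger is left unchanged.
    with(more) {
      return createScopedLogger(scope, { ...context, ...more });
    },
  };
}

// Instantiated once at the top of the file, with no request-specific context.
const logger = createScopedLogger("llm-tools/invoke");

function handleInvoke(emailAccountId: string, tool: string): void {
  // Context is attached once, inside the function, and then reused.
  const log = logger.with({ emailAccountId, tool });
  log.info("Invoking tool");
  log.info("Tool finished", { durationMs: 12 });
}
```

Keeping the global logger free of context avoids one request's identifiers leaking into another request's log lines.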

apps/web/app/**/[!.]*/route.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/project-structure.mdc)

Use kebab-case for route directories in Next.js App Router (e.g., api/hello-world/route)

Files:

apps/web/app/api/llm-tools/invoke/route.ts

apps/web/app/api/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/security-audit.mdc)

apps/web/app/api/**/*.{ts,tsx}: API routes must use withAuth, withEmailAccount, or withError middleware for authentication
All database queries must include user scoping with emailAccountId or userId filtering in WHERE clauses
Request parameters must be validated before use; avoid direct parameter usage without type checking
Use generic error messages instead of revealing internal details; throw SafeError instead of exposing user IDs, resource IDs, or system information
API routes should only return necessary fields using select in database queries to prevent unintended information disclosure
Cron endpoints must use hasCronSecret or hasPostCronSecret to validate cron requests and prevent unauthorized access
Request bodies should use Zod schemas for validation to ensure type safety and prevent injection attacks

Files:

apps/web/app/api/llm-tools/invoke/route.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts

**/app/api/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/security.mdc)

**/app/api/**/*.ts: ALL API routes that handle user data MUST use appropriate middleware: use withEmailAccount for email-scoped operations, use withAuth for user-scoped operations, or use withError with proper validation for public/custom auth endpoints
Use withEmailAccount middleware for operations scoped to a specific email account, including reading/writing emails, rules, schedules, or any operation using emailAccountId
Use withAuth middleware for user-level operations such as user settings, API keys, and referrals that use only userId
Use withError middleware only for public endpoints, custom authentication logic, or cron endpoints. For cron endpoints, MUST use hasCronSecret() or hasPostCronSecret() validation
Cron endpoints without proper authentication can be triggered by anyone. CRITICAL: All cron endpoints MUST validate cron secret using hasCronSecret(request) or hasPostCronSecret(request) and capture unauthorized attempts with captureException()
Always validate request bodies using Zod schemas to ensure type safety and prevent invalid data from reaching database operations
Maintain consistent error response format across all API routes to avoid information disclosure while providing meaningful error feedback

Files:

apps/web/app/api/llm-tools/invoke/route.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/app/api/llm-tools/invoke/validation.ts
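
The cron rule above (validate the secret, record unauthorized attempts, return a consistent error shape) can be sketched as follows. Everything here is hypothetical: `hasCronSecretSketch` is a stand-in for the project's `hasCronSecret(request)`, whose real signature and header format are not shown in this review, the `Bearer` header convention is an assumption, and the `onUnauthorized` callback stands in for `captureException()`.

```typescript
// Hypothetical request shape: only the headers matter for this sketch.
interface CronRequestLike {
  headers: Record<string, string | undefined>;
}

// Stand-in for hasCronSecret(request); header name and format are assumptions.
function hasCronSecretSketch(
  request: CronRequestLike,
  expectedSecret: string | undefined,
): boolean {
  // Fail closed when no secret is configured.
  if (!expectedSecret) return false;
  return request.headers["authorization"] === `Bearer ${expectedSecret}`;
}

type CronResult =
  | { status: 200; body: { ok: true } }
  | { status: 401; body: { error: string } };

function handleCron(
  request: CronRequestLike,
  expectedSecret: string | undefined,
  onUnauthorized: (message: string) => void, // stands in for captureException()
): CronResult {
  if (!hasCronSecretSketch(request, expectedSecret)) {
    onUnauthorized("Unauthorized cron request");
    // Generic message: nothing about why the check failed.
    return { status: 401, body: { error: "Unauthorized" } };
  }
  return { status: 200, body: { ok: true } };
}
```

A production helper may also want a constant-time comparison for the secret; the plain string comparison here is only for brevity.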

apps/web/{utils/ai,utils/llms,__tests__}/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/llm.mdc)

LLM-related code must be organized in specific directories: apps/web/utils/ai/ for main implementations, apps/web/utils/llms/ for core utilities and configurations, and apps/web/__tests__/ for LLM-specific tests

Files:

apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

apps/web/utils/llms/{index,model}.ts

📄 CodeRabbit inference engine (.cursor/rules/llm.mdc)

Core LLM functionality must be defined in utils/llms/index.ts, model definitions and configurations in utils/llms/model.ts, and usage tracking in utils/usage.ts

Files:

apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts

**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/testing.mdc)

**/*.test.{ts,tsx}: Use vitest for testing the application
Tests should be colocated next to the tested file with .test.ts or .test.tsx extension (e.g., dir/format.ts and dir/format.test.ts)
Mock server-only using vi.mock("server-only", () => ({}))
Mock Prisma using vi.mock("@/utils/prisma") and import the mock from @/utils/__mocks__/prisma
Use vi.clearAllMocks() in beforeEach to clean up mocks between tests
Each test should be independent
Use descriptive test names
Mock external dependencies in tests
Do not mock the Logger
Avoid testing implementation details
Use test helpers getEmail, getEmailAccount, and getRule from @/__tests__/helpers for mocking emails, accounts, and rules

Files:

apps/web/app/api/llm-tools/invoke/route.test.ts

**/*.{test,spec}.{js,jsx,ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)

**/*.{test,spec}.{js,jsx,ts,tsx}: Don't nest describe() blocks too deeply in test files
Don't use callbacks in asynchronous tests and hooks
Don't have duplicate hooks in describe blocks
Don't use export or module.exports in test files
Don't use focused tests
Make sure the assertion function, like expect, is placed inside an it() function call
Don't use disabled tests

Files:

apps/web/app/api/llm-tools/invoke/route.test.ts

🧠 Learnings (57)

📚 Learning: 2025-11-25T14:37:09.306Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/fullstack-workflow.mdc:0-0
Timestamp: 2025-11-25T14:37:09.306Z
Learning: Applies to apps/web/app/api/**/route.ts : Create GET API routes using `withAuth` or `withEmailAccount` middleware in `apps/web/app/api/*/route.ts`, export response types as `GetExampleResponse` type alias for client-side type safety

Applied to files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:27.909Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:27.909Z
Learning: Applies to **/app/api/**/*.ts : Maintain consistent error response format across all API routes to avoid information disclosure while providing meaningful error feedback

Applied to files:

apps/claude-code-wrapper/src/routes/generate.ts

📚 Learning: 2025-11-25T14:39:23.326Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:23.326Z
Learning: Applies to **/*.ts : Prevent Insecure Direct Object References (IDOR) by validating resource ownership in all queries - always include ownership filters (e.g., `emailAccount: { id: emailAccountId }`) when accessing user-specific resources

Applied to files:

apps/claude-code-wrapper/src/routes/generate.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : LLM feature functions must follow a standard structure: accept options with `inputData` and `emailAccount` parameters, implement input validation with early returns, define separate system and user prompts, create a Zod schema for response validation, and use `createGenerateObject` to execute the LLM call

Applied to files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

📚 Learning: 2025-11-25T14:37:09.306Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/fullstack-workflow.mdc:0-0
Timestamp: 2025-11-25T14:37:09.306Z
Learning: Applies to apps/web/app/api/**/route.ts : Do NOT use POST API routes for mutations - always use server actions with `next-safe-action` instead

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:04.892Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security-audit.mdc:0-0
Timestamp: 2025-11-25T14:39:04.892Z
Learning: Applies to apps/web/app/api/**/route.ts : Use Zod schemas for request body validation in API routes

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:38:56.992Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/project-structure.mdc:0-0
Timestamp: 2025-11-25T14:38:56.992Z
Learning: Applies to apps/web/app/**/[!.]*/route.{ts,tsx} : Use kebab-case for route directories in Next.js App Router (e.g., `api/hello-world/route`)

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:04.892Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security-audit.mdc:0-0
Timestamp: 2025-11-25T14:39:04.892Z
Learning: Applies to apps/web/app/api/**/route.ts : All API routes must use `withAuth`, `withEmailAccount`, or `withError` middleware for authentication

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:04.892Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security-audit.mdc:0-0
Timestamp: 2025-11-25T14:39:04.892Z
Learning: Applies to apps/web/app/api/**/route.ts : Request parameters must be validated before use; direct parameter usage without validation is prohibited

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:04.892Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security-audit.mdc:0-0
Timestamp: 2025-11-25T14:39:04.892Z
Learning: Applies to apps/web/app/api/**/route.ts : Cron endpoints must use `hasCronSecret` or `hasPostCronSecret` middleware to validate cron job authenticity

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : LLM feature functions must import from `zod` for schema validation, use `createScopedLogger` from `@/utils/logger`, `chatCompletionObject` and `createGenerateObject` from `@/utils/llms`, and import `EmailAccountWithAI` type from `@/utils/llms/types`

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

📚 Learning: 2025-11-25T14:39:27.909Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:27.909Z
Learning: Applies to **/app/api/**/*.ts : ALL API routes that handle user data MUST use appropriate middleware: use `withEmailAccount` for email-scoped operations, use `withAuth` for user-scoped operations, or use `withError` with proper validation for public/custom auth endpoints

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:23.326Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:23.326Z
Learning: Applies to app/api/**/*.ts : Use `SafeError` for error responses to prevent information disclosure - provide generic messages (e.g., 'Rule not found' not 'Rule {id} does not exist for user {userId}') without revealing internal IDs or ownership details

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:27.909Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:27.909Z
Learning: Applies to **/*.ts : Use SafeError for error responses to prevent information disclosure. Generic error messages should not reveal internal IDs, logic, or resource ownership details

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:08.150Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security-audit.mdc:0-0
Timestamp: 2025-11-25T14:39:08.150Z
Learning: Applies to apps/web/app/api/**/*.{ts,tsx} : Use generic error messages instead of revealing internal details; throw `SafeError` instead of exposing user IDs, resource IDs, or system information

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-07-08T13:14:07.449Z

Learnt from: elie222
Repo: elie222/inbox-zero PR: 537
File: apps/web/app/(app)/[emailAccountId]/clean/onboarding/page.tsx:30-34
Timestamp: 2025-07-08T13:14:07.449Z
Learning: The clean onboarding page in apps/web/app/(app)/[emailAccountId]/clean/onboarding/page.tsx is intentionally Gmail-specific and should show an error for non-Google email accounts rather than attempting to support multiple providers.

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:04.892Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security-audit.mdc:0-0
Timestamp: 2025-11-25T14:39:04.892Z
Learning: Applies to apps/web/app/api/**/route.ts : Error responses must use `SafeError` to prevent information disclosure; avoid revealing internal details like user IDs or database information in error messages

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:04.892Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security-audit.mdc:0-0
Timestamp: 2025-11-25T14:39:04.892Z
Learning: Applies to apps/web/app/api/**/route.ts : All database queries must include user/account filtering with `emailAccountId` or `userId` in WHERE clauses to prevent IDOR vulnerabilities

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:23.326Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:23.326Z
Learning: Applies to app/api/**/*.ts : Use `withEmailAccount` middleware for operations scoped to a specific email account (reading/writing emails, rules, schedules, etc.) - provides `emailAccountId`, `userId`, and `email` in `request.auth`

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:08.150Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security-audit.mdc:0-0
Timestamp: 2025-11-25T14:39:08.150Z
Learning: Applies to apps/web/app/api/**/*.{ts,tsx} : All database queries must include user scoping with `emailAccountId` or `userId` filtering in WHERE clauses

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:27.909Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:27.909Z
Learning: Applies to **/app/api/**/*.ts : Use `withEmailAccount` middleware for operations scoped to a specific email account, including reading/writing emails, rules, schedules, or any operation using `emailAccountId`

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:39:23.326Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:23.326Z
Learning: Applies to **/*.ts : Always validate that resources belong to the authenticated user before any operation - use ownership checks in queries (e.g., `emailAccount: { id: emailAccountId }`) and throw `SafeError` if validation fails

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Use descriptive scoped loggers for each LLM feature, log inputs and outputs with appropriate log levels, and include relevant context in log messages

Applied to files:

apps/web/app/api/llm-tools/invoke/route.ts
apps/web/utils/llms/index.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Use TypeScript types for all LLM function parameters and return values, and define clear interfaces for complex input/output structures

Applied to files:

apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/{utils/ai,utils/llms,__tests__}/**/*.ts : LLM-related code must be organized in specific directories: `apps/web/utils/ai/` for main implementations, `apps/web/utils/llms/` for core utilities and configurations, and `apps/web/__tests__/` for LLM-specific tests

Applied to files:

apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/app/api/llm-tools/invoke/route.test.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/llms/{index,model}.ts : Core LLM functionality must be defined in `utils/llms/index.ts`, model definitions and configurations in `utils/llms/model.ts`, and usage tracking in `utils/usage.ts`

Applied to files:

apps/web/utils/llms/index.ts
apps/web/utils/llms/model.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Implement early returns for invalid LLM inputs, use proper error types and logging, implement fallbacks for AI failures, and add retry logic for transient failures using `withRetry`

Applied to files:

apps/web/utils/llms/index.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Use XML-like tags to structure data in prompts, remove excessive whitespace and truncate long inputs, and format data consistently across similar LLM functions

Applied to files:

apps/web/utils/llms/index.ts
apps/web/utils/llms/claude-code.ts
apps/web/utils/llms/claude-code-llm.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Keep related AI functions in the same file or directory, extract common patterns into utility functions, and document complex AI logic with clear comments

Applied to files:

apps/web/utils/llms/index.ts
apps/web/utils/llms/claude-code-llm.ts

📚 Learning: 2025-11-25T14:37:56.430Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm-test.mdc:0-0
Timestamp: 2025-11-25T14:37:56.430Z
Learning: Applies to apps/web/__tests__/**/*.test.ts : Use `console.debug()` for outputting generated LLM content in tests, e.g., `console.debug("Generated content:\n", result.content);`

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:42:08.869Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/ultracite.mdc:0-0
Timestamp: 2025-11-25T14:42:08.869Z
Learning: Applies to **/*.{ts,tsx} : Don't misuse the non-null assertion operator (!) in TypeScript files

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:42:08.869Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/ultracite.mdc:0-0
Timestamp: 2025-11-25T14:42:08.869Z
Learning: Applies to **/*.{ts,tsx} : Don't use non-null assertions with the `!` postfix operator

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Always define a Zod schema for LLM response validation and make schemas as specific as possible to guide the LLM output

Applied to files:

apps/web/utils/llms/index.ts
apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : System prompts must define the LLM's role and task specifications

Applied to files:

apps/web/utils/llms/index.ts
apps/web/utils/llms/claude-code-llm.ts

📚 Learning: 2025-11-25T14:38:32.328Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/posthog-feature-flags.mdc:0-0
Timestamp: 2025-11-25T14:38:32.328Z
Learning: Applies to **/*.{ts,tsx} : Always define types for A/B test variant flags (e.g., `type PricingVariant = "control" | "variant-a" | "variant-b"`) and provide type safety through type casting

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:37:56.430Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm-test.mdc:0-0
Timestamp: 2025-11-25T14:37:56.430Z
Learning: Applies to apps/web/__tests__/**/*.test.ts : Use vitest imports (`describe`, `expect`, `test`, `vi`, `beforeEach`) in LLM test files

Applied to files:

apps/web/app/api/llm-tools/invoke/route.test.ts

📚 Learning: 2025-11-25T14:37:56.430Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm-test.mdc:0-0
Timestamp: 2025-11-25T14:37:56.430Z
Learning: Applies to apps/web/__tests__/**/*.test.ts : Mock 'server-only' module with empty object in LLM test files: `vi.mock("server-only", () => ({}))`

Applied to files:

apps/web/app/api/llm-tools/invoke/route.test.ts

📚 Learning: 2025-11-25T14:37:56.430Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm-test.mdc:0-0
Timestamp: 2025-11-25T14:37:56.430Z
Learning: Applies to apps/web/__tests__/**/*.test.ts : Place all LLM-related tests in `apps/web/__tests__/` directory

Applied to files:

apps/web/app/api/llm-tools/invoke/route.test.ts

📚 Learning: 2025-11-25T14:39:23.326Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:23.326Z
Learning: Applies to **/*.test.ts : Include security tests in test suites to verify: authentication is required, IDOR protection works (other users cannot access resources), parameter validation rejects invalid inputs, and error messages don't leak information

Applied to files:

apps/web/app/api/llm-tools/invoke/route.test.ts

📚 Learning: 2025-11-25T14:40:00.833Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-11-25T14:40:00.833Z
Learning: Applies to **/*.test.{ts,tsx} : Mock Prisma using `vi.mock("@/utils/prisma")` and import the mock from `@/utils/__mocks__/prisma`

Applied to files:

apps/web/app/api/llm-tools/invoke/route.test.ts

📚 Learning: 2025-11-25T14:40:00.833Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-11-25T14:40:00.833Z
Learning: Applies to **/*.test.{ts,tsx} : Use test helpers `getEmail`, `getEmailAccount`, and `getRule` from `@/__tests__/helpers` for mocking emails, accounts, and rules

Applied to files:

apps/web/app/api/llm-tools/invoke/route.test.ts

📚 Learning: 2025-11-25T14:40:00.833Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-11-25T14:40:00.833Z
Learning: Applies to **/*.test.{ts,tsx} : Use `vitest` for testing the application

Applied to files:

apps/web/app/api/llm-tools/invoke/route.test.ts

📚 Learning: 2025-11-25T14:40:00.833Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-11-25T14:40:00.833Z
Learning: Applies to **/__tests__/**/*.{ts,tsx} : AI tests must be placed in the `__tests__` directory and are not run by default (they use a real LLM)

Applied to files:

apps/web/app/api/llm-tools/invoke/route.test.ts

📚 Learning: 2025-11-25T14:37:56.430Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm-test.mdc:0-0
Timestamp: 2025-11-25T14:37:56.430Z
Learning: Applies to apps/web/__tests__/**/*.test.ts : Prefer using existing helpers from `@/__tests__/helpers.ts` (`getEmailAccount`, `getEmail`, `getRule`, `getMockMessage`, `getMockExecutedRule`) instead of creating custom test data helpers

Applied to files:

apps/web/app/api/llm-tools/invoke/route.test.ts

📚 Learning: 2025-11-25T14:40:00.833Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2025-11-25T14:40:00.833Z
Learning: Applies to **/*.test.{ts,tsx} : Mock `server-only` using `vi.mock("server-only", () => ({}))`

Applied to files:

apps/web/app/api/llm-tools/invoke/route.test.ts

📚 Learning: 2025-11-25T14:37:56.430Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm-test.mdc:0-0
Timestamp: 2025-11-25T14:37:56.430Z
Learning: Include standard test cases for LLM functionality: happy path, error handling, edge cases (empty input, null values), different user configurations, and various input formats

Applied to files:

apps/web/app/api/llm-tools/invoke/route.test.ts

📚 Learning: 2025-11-25T14:39:49.448Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/server-actions.mdc:0-0
Timestamp: 2025-11-25T14:39:49.448Z
Learning: Applies to apps/web/utils/actions/*.validation.ts : Create separate validation files for server actions using the naming convention `apps/web/utils/actions/NAME.validation.ts` containing Zod schemas and inferred types

Applied to files:

apps/web/app/api/llm-tools/invoke/validation.ts

📚 Learning: 2025-11-25T14:36:18.416Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: apps/web/CLAUDE.md:0-0
Timestamp: 2025-11-25T14:36:18.416Z
Learning: Applies to apps/web/utils/actions/**/*.validation.ts : Use Zod schemas for validation and export both schema and inferred types in validation files

Applied to files:

apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts

📚 Learning: 2025-11-25T14:39:49.448Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/server-actions.mdc:0-0
Timestamp: 2025-11-25T14:39:49.448Z
Learning: Applies to apps/web/utils/actions/*.validation.ts : Define input validation schemas using Zod in `.validation.ts` files and export both the schema and its inferred TypeScript type

Applied to files:

apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts

📚 Learning: 2025-11-25T14:37:09.306Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/fullstack-workflow.mdc:0-0
Timestamp: 2025-11-25T14:37:09.306Z
Learning: Applies to apps/web/utils/actions/*.validation.ts : Define Zod validation schemas in separate `*.validation.ts` files and export both the schema and inferred type (e.g., `CreateExampleBody`)

Applied to files:

apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts

📚 Learning: 2025-11-25T14:37:09.306Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/fullstack-workflow.mdc:0-0
Timestamp: 2025-11-25T14:37:09.306Z
Learning: Applies to apps/web/utils/actions/*.validation.ts : Export types from Zod schemas using `z.infer<>` to maintain type safety between validation and client usage

Applied to files:

apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts

📚 Learning: 2025-11-25T14:36:51.389Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/form-handling.mdc:0-0
Timestamp: 2025-11-25T14:36:51.389Z
Learning: Applies to **/*.validation.ts : Define validation schemas using Zod

Applied to files:

apps/web/app/api/llm-tools/invoke/validation.ts
apps/web/utils/llms/claude-code.ts

📚 Learning: 2025-11-25T14:39:49.448Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/server-actions.mdc:0-0
Timestamp: 2025-11-25T14:39:49.448Z
Learning: Applies to apps/web/utils/actions/*.ts : Use `.schema()` method with Zod validation schemas from corresponding `.validation.ts` files in next-safe-action configuration

Applied to files:

apps/web/app/api/llm-tools/invoke/validation.ts

📚 Learning: 2025-11-25T14:36:51.389Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/form-handling.mdc:0-0
Timestamp: 2025-11-25T14:36:51.389Z
Learning: Applies to **/*.validation.ts : Use descriptive error messages in Zod validation schemas

Applied to files:

apps/web/app/api/llm-tools/invoke/validation.ts

📚 Learning: 2025-11-25T14:37:09.306Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/fullstack-workflow.mdc:0-0
Timestamp: 2025-11-25T14:37:09.306Z
Learning: Applies to apps/web/utils/actions/*.ts : Use `next-safe-action` with Zod schemas for all server actions (create/update/delete mutations), storing validation schemas in `apps/web/utils/actions/*.validation.ts`

Applied to files:

apps/web/app/api/llm-tools/invoke/validation.ts

📚 Learning: 2025-11-25T14:36:18.416Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: apps/web/CLAUDE.md:0-0
Timestamp: 2025-11-25T14:36:18.416Z
Learning: Applies to apps/web/utils/actions/**/*.ts : Use `next-safe-action` with `actionClient` for server actions with Zod schema validation

Applied to files:

apps/web/app/api/llm-tools/invoke/validation.ts

📚 Learning: 2025-11-25T14:36:53.147Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/form-handling.mdc:0-0
Timestamp: 2025-11-25T14:36:53.147Z
Learning: Applies to **/*.validation.{ts,tsx} : Define validation schemas using Zod

Applied to files:

apps/web/app/api/llm-tools/invoke/validation.ts

🧬 Code graph analysis (3)

apps/web/utils/llms/index.ts (4)

apps/web/utils/llms/config.ts (1)

Provider (3-13)

apps/web/env.ts (1)

env (18-266)

apps/web/utils/llms/model.ts (2)

isClaudeCodeAvailable (70-72)

buildClaudeCodeConfig (42-64)

apps/web/utils/llms/claude-code-llm.ts (2)

createClaudeCodeGenerateText (147-255)

createClaudeCodeGenerateObject (261-372)

apps/web/app/api/llm-tools/invoke/route.test.ts (2)

apps/claude-code-wrapper/__tests__/helpers.ts (1)

createMockRequest (226-234)

apps/web/next.config.ts (1)

headers (187-264)

apps/web/app/api/llm-tools/invoke/validation.ts (1)

apps/web/utils/actions/rule.validation.ts (1)

delayInMinutesSchema (11-15)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: test

🔇 Additional comments (17)

apps/web/app/api/llm-tools/invoke/route.ts (1)

1-732: LGTM! Secure proxy implementation with comprehensive validation.

The LLM tool proxy endpoint is well-implemented with:

Timing-safe token comparison for authentication

Comprehensive request validation with Zod schemas

All database queries properly scoped to emailAccountId

Generic error messages that don't leak sensitive information

Structured logging without PII (previous comments addressed)

The use of a POST route without standard middleware is appropriate here since this is a special proxy endpoint designed to be called by Claude Code CLI (an external system) rather than the web app itself.
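As an illustration of the timing-safe comparison mentioned above, here is a minimal sketch using Node's `crypto.timingSafeEqual`. The helper name `tokensMatch` is hypothetical, and hashing both sides first is an assumption made here to sidestep the equal-length requirement; the PR's actual implementation may differ.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper (not taken from the PR): compares a presented token
// against the expected one without revealing, through timing, how much matched.
export function tokensMatch(presented: string, expected: string): boolean {
  // timingSafeEqual throws when the buffers differ in length, so both sides
  // are hashed to fixed-size SHA-256 digests before comparing.
  const presentedDigest = createHash("sha256").update(presented).digest();
  const expectedDigest = createHash("sha256").update(expected).digest();
  return timingSafeEqual(presentedDigest, expectedDigest);
}
```

A plain `===` comparison can return early at the first differing character, which is the behavior this pattern is meant to avoid.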

apps/web/app/api/llm-tools/invoke/route.test.ts (1)

1-635: LGTM! Comprehensive test coverage.

The test suite is well-structured with:

Proper mocking of dependencies (server-only, Prisma, env, logger)

vi.clearAllMocks() in beforeEach for test isolation

Comprehensive coverage of authorization, validation, account lookup, tool execution, and edge cases

Clear test descriptions and assertions

apps/web/utils/llms/index.ts (1)

66-88: LGTM - Claude Code routing logic is well-structured.

The routing logic correctly prioritizes explicit provider selection while allowing environment-based defaults. The fallback chain on line 85 properly handles the optional model field.

apps/web/utils/llms/model.ts (3)

247-259: Non-null assertion on claudeCodeConfig.model! is safe but could be clearer.

At line 252, buildClaudeCodeConfig() guarantees model is always set (line 56: modelOverride || env.CLAUDE_CODE_MODEL || "sonnet"), so the non-null assertion is technically safe. However, for consistency with the approach used in index.ts line 85, consider using the model value directly from buildClaudeCodeConfig() which always returns a defined model.

The model: null as unknown as LanguageModelV2 pattern at line 255 is a reasonable workaround documented with a clear comment explaining why Claude Code doesn't use the standard SDK model interface.

42-64: Well-designed configuration builder with proper validation.

The buildClaudeCodeConfig function validates required environment variables and provides clear error messages. The default model fallback chain (modelOverride || env.CLAUDE_CODE_MODEL || "sonnet") is a good pattern.

66-72: Clean availability check implementation.

isClaudeCodeAvailable() correctly checks both required env vars without throwing, enabling safe conditional routing.

apps/web/utils/llms/claude-code.ts (3)

180-238: SSE parser implementation is robust.

The createSSEParser correctly reuses a single TextDecoder with { stream: true } to handle multi-byte UTF-8 characters spanning chunks. The flush handler properly processes any remaining buffer data.
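The effect of `{ stream: true }` can be seen in a small sketch that is independent of the PR's `createSSEParser`. The function name `decodeChunks` and the sample bytes are illustrative assumptions; only the standard `TextDecoder` API is used.

```typescript
// "é" is two bytes in UTF-8 (0xC3 0xA9); here it is split across two chunks.
export const splitChunks: Uint8Array[] = [
  new Uint8Array([0x63, 0x61, 0x66, 0xc3]), // "caf" plus the first byte of "é"
  new Uint8Array([0xa9]), // the second byte of "é"
];

// Hypothetical helper: decodes all chunks with a single reused decoder.
export function decodeChunks(parts: Uint8Array[], stream: boolean): string {
  const decoder = new TextDecoder("utf-8");
  let out = "";
  for (const part of parts) {
    // With stream: true the decoder buffers an incomplete trailing sequence
    // and completes it when the next chunk arrives.
    out += decoder.decode(part, { stream });
  }
  // A final call without arguments flushes anything still buffered.
  out += decoder.decode();
  return out;
}
```

With `stream` set to `true` the two chunks decode to `café`; with `false`, each half of the split character is replaced by U+FFFD.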

446-534: Structured object generation with proper Zod validation.

The claudeCodeGenerateObject function correctly converts Zod schemas to JSON Schema for the wrapper service and validates the response against the original Zod schema. The error handling provides clear context with the raw text included for debugging.

63-76: Good defensive JSON parsing with logging.

The safeJsonParse helper gracefully handles non-JSON responses and logs a preview for debugging without exposing sensitive data.
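A defensive parse helper of this kind might look roughly like the sketch below. The signature, the injected `log` callback, and the 100-character preview limit are assumptions for illustration, not the PR's actual `safeJsonParse`.

```typescript
// Hypothetical signature: the PR's helper may take different parameters.
export function safeJsonParse(
  text: string,
  log: (message: string, preview: string) => void = () => {},
): unknown {
  try {
    return JSON.parse(text);
  } catch {
    // Log only a bounded preview rather than the full payload.
    log("Failed to parse JSON response", text.slice(0, 100));
    // undefined (rather than null) keeps a failed parse distinguishable
    // from a successfully parsed JSON `null`.
    return undefined;
  }
}
```

Callers would then check for `undefined` and fall back to treating the response as plain text.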

apps/web/utils/llms/claude-code-llm.ts (4)

47-63: Useful label-based model optimization pattern.

The LABEL_MODEL_OVERRIDES map allows routing simple tasks to faster/cheaper models without modifying upstream code. The getModelForLabel function provides a clean fallback chain.
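The label-based override pattern can be sketched as a lookup with a fallback. The labels and the `"haiku"` alias below are invented for illustration, and the two-argument signature is an assumption; the PR's `getModelForLabel` may be shaped differently.

```typescript
// Hypothetical entries: the real map's labels and models are not shown in this review.
const LABEL_MODEL_OVERRIDES = new Map<string, string>([
  ["Categorize sender", "haiku"],
  ["Summarize email", "haiku"],
]);

// Returns the override for a label if one exists, otherwise the default model.
export function getModelForLabel(label: string, defaultModel: string): string {
  return LABEL_MODEL_OVERRIDES.get(label) ?? defaultModel;
}
```

A `Map` is used here instead of an object literal so that labels such as `"constructor"` cannot accidentally resolve to inherited properties.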

69-133: Session management with appropriate graceful degradation.

The retrieveSessionId and persistSessionId functions properly handle errors without breaking the main flow. Using logger.error ensures Sentry captures these failures for monitoring while allowing the LLM call to proceed.

209-243: Well-documented stub values in generateText result.

The comments clearly explain why fields like toolCalls, reasoning, and steps are stubbed. This helps future maintainers understand the limitations of the Claude Code integration.

414-425: Good type guard for content parts validation.

The isTextContentPart function properly validates the structure of content parts at runtime before accessing properties, addressing the concern about unsafe type assertions.
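A runtime type guard of the kind described might look like this sketch. The `TextContentPart` shape is an assumption based on the common `{ type: "text", text: string }` convention; the PR's actual types are not shown in this review.

```typescript
// Assumed shape of a text content part (hypothetical, for illustration).
type TextContentPart = { type: "text"; text: string };

// Narrows an unknown value to TextContentPart by checking its structure at runtime.
export function isTextContentPart(part: unknown): part is TextContentPart {
  if (typeof part !== "object" || part === null) return false;
  const candidate = part as { type?: unknown; text?: unknown };
  return candidate.type === "text" && typeof candidate.text === "string";
}

// Usage: keep only the text parts of a mixed content array and join them.
export function joinTextParts(parts: unknown[]): string {
  return parts
    .filter(isTextContentPart)
    .map((part) => part.text)
    .join("");
}
```

Because the guard is a type predicate, `filter` narrows the array element type, so `part.text` is accessible without a cast.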

apps/claude-code-wrapper/src/routes/generate.ts (4)

1-12: LGTM!

Imports and router initialization follow proper TypeScript and Express patterns.

33-60: LGTM!

Request processing follows best practices with proper validation, structured logging, async/await error handling, and typed responses.

95-116: LGTM!

Enhanced prompt construction properly instructs Claude to output JSON matching the schema. The schema is validated input, and the template literal construction is safe.

210-210: LGTM!

Standard Express router export pattern.

apps/claude-code-wrapper/src/routes/generate.ts

apps/web/app/api/llm-tools/invoke/validation.ts

apps/web/utils/llms/index.ts

…Object Same fix as createGenerateText - use proper fallback chain instead of non-null assertion for claudeCodeConfig.model.

- Remove raw response text from JSON parse errors
- Remove rawOutput from CLI error responses

Both could potentially contain user data or secrets.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

apps/web/utils/llms/index.ts (1)

177-201: Claude Code routing logic is correct, but consider extracting the duplicated logic.

The routing logic for createGenerateObject mirrors the logic in createGenerateText (lines 66-88), resulting in ~23 lines of duplicated code. While both implementations are correct and type-safe, consolidating this logic would improve maintainability and reduce the risk of inconsistencies.

Consider extracting the common routing logic into a helper function:

function getClaudeCodeConfigIfNeeded(
  modelOptions: ReturnType<typeof getModel>
):
  | { shouldUse: false }
  | { shouldUse: true; config: ClaudeCodeConfig; modelName: string } {
  // Boolean() keeps shouldUse a strict boolean; without it the first operand
  // would evaluate to the config object itself.
  const shouldUse =
    Boolean(
      modelOptions.provider === Provider.CLAUDE_CODE &&
        modelOptions.claudeCodeConfig
    ) ||
    (env.DEFAULT_LLM_PROVIDER === Provider.CLAUDE_CODE &&
      isClaudeCodeAvailable());

  if (!shouldUse) {
    return { shouldUse: false };
  }

  const config = modelOptions.claudeCodeConfig || buildClaudeCodeConfig();
  const modelName = config.model || env.CLAUDE_CODE_MODEL || "sonnet";

  return { shouldUse: true, config, modelName };
}

Then use it in both functions:

export function createGenerateText({
  emailAccount,
  label,
  modelOptions,
}: {
  emailAccount: Pick<EmailAccountWithAI, "email" | "id">;
  label: string;
  modelOptions: ReturnType<typeof getModel>;
}): typeof generateText {
  const claudeCodeRoute = getClaudeCodeConfigIfNeeded(modelOptions);

  // Checking shouldUse narrows the union, so config and modelName are
  // defined here without non-null assertions.
  if (claudeCodeRoute.shouldUse) {
    return createClaudeCodeGenerateText({
      emailAccount,
      label,
      config: claudeCodeRoute.config,
      modelName: claudeCodeRoute.modelName,
      provider: Provider.CLAUDE_CODE,
    });
  }
  // ... rest of function
}

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6136c69 and d0a0052.

📒 Files selected for processing (2)

apps/claude-code-wrapper/src/routes/generate.ts (1 hunks)
apps/web/utils/llms/index.ts (4 hunks)

🧰 Additional context used

📓 Path-based instructions (13)

**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/data-fetching.mdc)

**/*.{ts,tsx}: For API GET requests to server, use the swr package
Use result?.serverError with toastError from @/components/Toast for error handling in async operations

**/*.{ts,tsx}: Use wrapper functions for Gmail message operations (get, list, batch, etc.) from @/utils/gmail/message.ts instead of direct API calls
Use wrapper functions for Gmail thread operations from @/utils/gmail/thread.ts instead of direct API calls
Use wrapper functions for Gmail label operations from @/utils/gmail/label.ts instead of direct API calls

**/*.{ts,tsx}: For early access feature flags, create hooks using the naming convention use[FeatureName]Enabled that return a boolean from useFeatureFlagEnabled("flag-key")
For A/B test variant flags, create hooks using the naming convention use[FeatureName]Variant that define variant types, use useFeatureFlagVariantKey() with type casting, and provide a default "control" fallback
Use kebab-case for PostHog feature flag keys (e.g., inbox-cleaner, pricing-options-2)
Always define types for A/B test variant flags (e.g., type PricingVariant = "control" | "variant-a" | "variant-b") and provide type safety through type casting

**/*.{ts,tsx}: Don't use primitive type aliases or misleading types
Don't use empty type parameters in type aliases and interfaces
Don't use this and super in static contexts
Don't use any or unknown as type constraints
Don't use the TypeScript directive @ts-ignore
Don't use TypeScript enums
Don't export imported variables
Don't add type annotations to variables, parameters, and class properties that are initialized with literal expressions
Don't use TypeScript namespaces
Don't use non-null assertions with the ! postfix operator
Don't use parameter properties in class constructors
Don't use user-defined types
Use as const instead of literal types and type annotations
Use either T[] or Array<T> consistently
Initialize each enum member value explicitly
Use export type for types
Use `impo...

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/utils/llms/index.ts

**/{pages,routes,components}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/gmail-api.mdc)

Never call Gmail API directly from routes or components - always use wrapper functions from the utils folder

Files:

apps/claude-code-wrapper/src/routes/generate.ts

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/prisma-enum-imports.mdc)

Always import Prisma enums from @/generated/prisma/enums instead of @/generated/prisma/client to avoid Next.js bundling errors in client components

Import Prisma using the project's centralized utility: import prisma from '@/utils/prisma'

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/utils/llms/index.ts

**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/security.mdc)

**/*.ts: ALL database queries MUST be scoped to the authenticated user/account by including user/account filtering in WHERE clauses to prevent unauthorized data access
Always validate that resources belong to the authenticated user before performing operations, using ownership checks in WHERE clauses or relationships
Always validate all input parameters for type, format, and length before using them in database queries
Use SafeError for error responses to prevent information disclosure. Generic error messages should not reveal internal IDs, logic, or resource ownership details
Only return necessary fields in API responses using Prisma's select option. Never expose sensitive data such as password hashes, private keys, or system flags
Prevent Insecure Direct Object References (IDOR) by validating resource ownership before operations. All findUnique/findFirst calls MUST include ownership filters
Prevent mass assignment vulnerabilities by explicitly whitelisting allowed fields in update operations instead of accepting all user-provided data
Prevent privilege escalation by never allowing users to modify system fields, ownership fields, or admin-only attributes through user input
All findMany queries MUST be scoped to the user's data by including appropriate WHERE filters to prevent returning data from other users
Use Prisma relationships for access control by leveraging nested where clauses (e.g., emailAccount: { id: emailAccountId }) to validate ownership

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/utils/llms/index.ts

**/*.{tsx,ts}

📄 CodeRabbit inference engine (.cursor/rules/ui-components.mdc)

**/*.{tsx,ts}: Use Shadcn UI and Tailwind for components and styling
Use next/image package for images
For API GET requests to server, use the swr package with hooks like useSWR to fetch data
For text inputs, use the Input component with registerProps for form integration and error handling

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/utils/llms/index.ts

**/*.{tsx,ts,css}

📄 CodeRabbit inference engine (.cursor/rules/ui-components.mdc)

Implement responsive design with Tailwind CSS using a mobile-first approach

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/utils/llms/index.ts

**/*.{js,jsx,ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)

**/*.{js,jsx,ts,tsx}: Don't use accessKey attribute on any HTML element
Don't set aria-hidden="true" on focusable elements
Don't add ARIA roles, states, and properties to elements that don't support them
Don't use distracting elements like <marquee> or <blink>
Only use the scope prop on <th> elements
Don't assign non-interactive ARIA roles to interactive HTML elements
Make sure label elements have text content and are associated with an input
Don't assign interactive ARIA roles to non-interactive HTML elements
Don't assign tabIndex to non-interactive HTML elements
Don't use positive integers for tabIndex property
Don't include "image", "picture", or "photo" in img alt prop
Don't use explicit role property that's the same as the implicit/default role
Make static elements with click handlers use a valid role attribute
Always include a title element for SVG elements
Give all elements requiring alt text meaningful information for screen readers
Make sure anchors have content that's accessible to screen readers
Assign tabIndex to non-interactive HTML elements with aria-activedescendant
Include all required ARIA attributes for elements with ARIA roles
Make sure ARIA properties are valid for the element's supported roles
Always include a type attribute for button elements
Make elements with interactive roles and handlers focusable
Give heading elements content that's accessible to screen readers (not hidden with aria-hidden)
Always include a lang attribute on the html element
Always include a title attribute for iframe elements
Accompany onClick with at least one of: onKeyUp, onKeyDown, or onKeyPress
Accompany onMouseOver/onMouseOut with onFocus/onBlur
Include caption tracks for audio and video elements
Use semantic elements instead of role attributes in JSX
Make sure all anchors are valid and navigable
Ensure all ARIA properties (aria-*) are valid
Use valid, non-abstract ARIA roles for elements with ARIA roles
Use valid AR...

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/utils/llms/index.ts

!(pages/_document).{jsx,tsx}

📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)

Don't use the next/head module in pages/_document.js on Next.js projects

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/utils/llms/index.ts

**/*.{js,ts,jsx,tsx}

📄 CodeRabbit inference engine (.cursor/rules/utilities.mdc)

**/*.{js,ts,jsx,tsx}: Use lodash utilities for common operations (arrays, objects, strings)
Import specific lodash functions to minimize bundle size (e.g., import groupBy from 'lodash/groupBy')

Files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/utils/llms/index.ts

apps/web/**/*.{ts,tsx}

📄 CodeRabbit inference engine (apps/web/CLAUDE.md)

apps/web/**/*.{ts,tsx}: Use TypeScript with strict null checks
Use @/ path aliases for imports from project root
Use proper error handling with try/catch blocks
Format code with Prettier
Follow consistent naming conventions using PascalCase for components
Centralize shared types in dedicated type files

Import specific lodash functions rather than entire lodash library to minimize bundle size (e.g., import groupBy from 'lodash/groupBy')

Files:

apps/web/utils/llms/index.ts

apps/web/{utils/ai,utils/llms,__tests__}/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/llm.mdc)

LLM-related code must be organized in specific directories: apps/web/utils/ai/ for main implementations, apps/web/utils/llms/ for core utilities and configurations, and apps/web/__tests__/ for LLM-specific tests

Files:

apps/web/utils/llms/index.ts

apps/web/utils/llms/{index,model}.ts

📄 CodeRabbit inference engine (.cursor/rules/llm.mdc)

Core LLM functionality must be defined in utils/llms/index.ts, model definitions and configurations in utils/llms/model.ts, and usage tracking in utils/usage.ts

Files:

apps/web/utils/llms/index.ts

**/{server,api,actions,utils}/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/logging.mdc)

**/{server,api,actions,utils}/**/*.ts: Use createScopedLogger from "@/utils/logger" for logging in backend code
Add the createScopedLogger instantiation at the top of the file with an appropriate scope name
Use .with() method to attach context variables only within specific functions, not on global loggers
For large functions with reused variables, use createScopedLogger().with() to attach context once and reuse the logger without passing variables repeatedly

Files:

apps/web/utils/llms/index.ts

🧠 Learnings (22)

📚 Learning: 2025-11-25T14:39:27.909Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:27.909Z
Learning: Applies to **/app/api/**/*.ts : Maintain consistent error response format across all API routes to avoid information disclosure while providing meaningful error feedback

Applied to files:

apps/claude-code-wrapper/src/routes/generate.ts

📚 Learning: 2025-11-25T14:39:23.326Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:23.326Z
Learning: Applies to **/*.ts : Prevent Insecure Direct Object References (IDOR) by validating resource ownership in all queries - always include ownership filters (e.g., `emailAccount: { id: emailAccountId }`) when accessing user-specific resources

Applied to files:

apps/claude-code-wrapper/src/routes/generate.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : LLM feature functions must follow a standard structure: accept options with `inputData` and `emailAccount` parameters, implement input validation with early returns, define separate system and user prompts, create a Zod schema for response validation, and use `createGenerateObject` to execute the LLM call

Applied to files:

apps/claude-code-wrapper/src/routes/generate.ts
apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:39:23.326Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:23.326Z
Learning: Applies to app/api/**/*.ts : Use `SafeError` for error responses to prevent information disclosure - provide generic messages (e.g., 'Rule not found' not 'Rule {id} does not exist for user {userId}') without revealing internal IDs or ownership details

Applied to files:

apps/claude-code-wrapper/src/routes/generate.ts

📚 Learning: 2025-11-25T14:39:27.909Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security.mdc:0-0
Timestamp: 2025-11-25T14:39:27.909Z
Learning: Applies to **/*.ts : Use SafeError for error responses to prevent information disclosure. Generic error messages should not reveal internal IDs, logic, or resource ownership details

Applied to files:

apps/claude-code-wrapper/src/routes/generate.ts

📚 Learning: 2025-11-25T14:39:04.892Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security-audit.mdc:0-0
Timestamp: 2025-11-25T14:39:04.892Z
Learning: Applies to apps/web/app/api/**/route.ts : Error responses must use `SafeError` to prevent information disclosure; avoid revealing internal details like user IDs or database information in error messages

Applied to files:

apps/claude-code-wrapper/src/routes/generate.ts

📚 Learning: 2025-11-25T14:39:08.150Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/security-audit.mdc:0-0
Timestamp: 2025-11-25T14:39:08.150Z
Learning: Applies to apps/web/app/api/**/*.{ts,tsx} : Use generic error messages instead of revealing internal details; throw `SafeError` instead of exposing user IDs, resource IDs, or system information

Applied to files:

apps/claude-code-wrapper/src/routes/generate.ts

📚 Learning: 2025-11-25T14:42:08.869Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/ultracite.mdc:0-0
Timestamp: 2025-11-25T14:42:08.869Z
Learning: Applies to **/*.{js,jsx,ts,tsx} : Don't hardcode sensitive data like API keys and tokens

Applied to files:

apps/claude-code-wrapper/src/routes/generate.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : LLM feature functions must import from `zod` for schema validation, use `createScopedLogger` from `@/utils/logger`, `chatCompletionObject` and `createGenerateObject` from `@/utils/llms`, and import `EmailAccountWithAI` type from `@/utils/llms/types`

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/llms/{index,model}.ts : Core LLM functionality must be defined in `utils/llms/index.ts`, model definitions and configurations in `utils/llms/model.ts`, and usage tracking in `utils/usage.ts`

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Use TypeScript types for all LLM function parameters and return values, and define clear interfaces for complex input/output structures

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/{utils/ai,utils/llms,__tests__}/**/*.ts : LLM-related code must be organized in specific directories: `apps/web/utils/ai/` for main implementations, `apps/web/utils/llms/` for core utilities and configurations, and `apps/web/__tests__/` for LLM-specific tests

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Use descriptive scoped loggers for each LLM feature, log inputs and outputs with appropriate log levels, and include relevant context in log messages

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Use XML-like tags to structure data in prompts, remove excessive whitespace and truncate long inputs, and format data consistently across similar LLM functions

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Implement early returns for invalid LLM inputs, use proper error types and logging, implement fallbacks for AI failures, and add retry logic for transient failures using `withRetry`

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:37:56.430Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm-test.mdc:0-0
Timestamp: 2025-11-25T14:37:56.430Z
Learning: Applies to apps/web/__tests__/**/*.test.ts : Use `console.debug()` for outputting generated LLM content in tests, e.g., `console.debug("Generated content:\n", result.content);`

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Keep related AI functions in the same file or directory, extract common patterns into utility functions, and document complex AI logic with clear comments

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:42:08.869Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/ultracite.mdc:0-0
Timestamp: 2025-11-25T14:42:08.869Z
Learning: Applies to **/*.{ts,tsx} : Don't misuse the non-null assertion operator (!) in TypeScript files

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:42:08.869Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/ultracite.mdc:0-0
Timestamp: 2025-11-25T14:42:08.869Z
Learning: Applies to **/*.{ts,tsx} : Don't use non-null assertions with the `!` postfix operator

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Always define a Zod schema for LLM response validation and make schemas as specific as possible to guide the LLM output

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:38:07.606Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : System prompts must define the LLM's role and task specifications

Applied to files:

apps/web/utils/llms/index.ts

📚 Learning: 2025-11-25T14:38:32.328Z

Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/posthog-feature-flags.mdc:0-0
Timestamp: 2025-11-25T14:38:32.328Z
Learning: Applies to **/*.{ts,tsx} : Always define types for A/B test variant flags (e.g., `type PricingVariant = "control" | "variant-a" | "variant-b"`) and provide type safety through type casting

Applied to files:

apps/web/utils/llms/index.ts

🧬 Code graph analysis (1)

apps/web/utils/llms/index.ts (4)

apps/web/utils/llms/config.ts (1)

Provider (3-13)

apps/web/env.ts (1)

env (18-266)

apps/web/utils/llms/model.ts (2)

isClaudeCodeAvailable (70-72)

buildClaudeCodeConfig (42-64)

apps/web/utils/llms/claude-code-llm.ts (2)

createClaudeCodeGenerateText (147-255)

createClaudeCodeGenerateObject (261-372)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: test

🔇 Additional comments (3)

apps/web/utils/llms/index.ts (2)

17-17: LGTM: Imports are well-organized and follow project conventions.

The new imports for Claude Code integration are necessary and properly structured. They follow the project's path alias conventions and import specific functions to minimize bundle size.

Also applies to: 34-45

66-88: LGTM: Claude Code routing logic is correct and type-safe.

The routing logic properly checks if Claude Code should be used based on provider configuration and availability. The fallback chain for modelName (line 85) correctly handles optional values and matches the logic in buildClaudeCodeConfig(). The comments clearly explain the wrapper layer override behavior.

apps/claude-code-wrapper/src/routes/generate.ts (1)

1-211: Well-implemented route handlers with past security issues properly addressed.

The code quality is solid:

- Proper input validation using Zod schemas
- Safe logging that only captures metadata (model, prompt length) without exposing PII
- JSON parsing with multiple fallback strategies using non-greedy regex patterns (past issue resolved)
- Error handling that returns generic messages without leaking sensitive data like rawOutput or raw response text (past issues resolved)
- Clean separation of concerns with dedicated helper functions

The implementation correctly follows the authenticated wrapper service pattern described in the PR objectives.

matthew-petty added 30 commits December 16, 2025 16:24

fix: Add env_file and document Max subscription auth options

a6e3e6e

- Add env_file to load ANTHROPIC_API_KEY from .env
- Document two auth approaches in docker-compose comments:
  1. API key via ANTHROPIC_API_KEY env var
  2. Max subscription via bind-mounted ~/.claude directory

feat: Prefer OAuth token over API key for Max subscribers

fbd0475

Adds buildClaudeEnv() helper that removes ANTHROPIC_API_KEY when CLAUDE_CODE_OAUTH_TOKEN is present, ensuring Max subscription auth takes precedence over pay-per-token API auth.
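A minimal sketch of the precedence rule this commit describes. The body below is an assumption, not the wrapper's actual code: only the name `buildClaudeEnv` and the two environment variable names come from the commit message, and the `Env` type and the choice to pass the environment as a parameter are illustrative.

```typescript
// Hypothetical sketch: the real buildClaudeEnv() in the wrapper may differ.
type Env = Record<string, string | undefined>;

function buildClaudeEnv(baseEnv: Env): Env {
  // Copy first so the caller's environment object is never mutated.
  const env: Env = { ...baseEnv };

  if (env.CLAUDE_CODE_OAUTH_TOKEN) {
    // An OAuth token is present, so drop the API key. With the key gone,
    // the spawned CLI cannot fall back to pay-per-token API auth.
    delete env.ANTHROPIC_API_KEY;
  }

  return env;
}
```

The returned object would then be passed as the environment of the spawned CLI process, so the decision is made once per invocation rather than depending on how the CLI itself resolves conflicting credentials.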

refactor: Rename CLAUDE_CODE_API_KEY to CLAUDE_WRAPPER_AUTH_KEY

bd56f89

Clearer naming to distinguish wrapper service auth from Claude auth.

docs: Add Claude Code env vars to .env.example

2b77236

Add CLAUDE_CODE_BASE_URL and CLAUDE_CODE_TIMEOUT to the LLM configuration section with documentation about the wrapper service and Max subscription support.

docs: Add explanatory comments for AI SDK result shape stubs

b9eb302

Explains why toolCalls, steps, reasoning, and other fields are empty/undefined in Claude Code provider results. These stubs ensure callers can safely access properties without null checks.

test: Add Claude Code provider tests

7e88907

Add dedicated test file for Claude Code provider integration with tests for configuration, error handling, and timeout settings. Also add required env var mocks to existing model tests.

refactor: Improve Claude Code session management

0d7273c

- Add "server-only" import to prevent client-side usage
- Extract session retrieval/persistence into helper functions
- Add sessionId to save failure log messages for debugging
- Add error propagation tests for Redis failures

test: Add integration tests for Claude Code session flow

77f2281

Verify session IDs flow through the LLM layer correctly:
- Session retrieval before HTTP calls
- Session persistence after successful calls
- Graceful degradation when Redis fails
- Workflow group routing based on label

refactor: Require explicit ECONOMY_LLM_PROVIDER for Claude Code economy

11e0869

Change economy model selection to check ECONOMY_LLM_PROVIDER explicitly instead of inferring from DEFAULT_LLM_PROVIDER. This allows flexibility to use different providers for default vs economy tasks.

docs: Add CLAUDE_CODE_OAUTH_TOKEN to .env.example

5ddfe39

Document the OAuth token env var as a proper entry instead of just a comment. This token is used by the wrapper for Max subscription authentication.

docs: Add security warning for ~/.claude bind mount

a1803ba

Clarify that mounting host's ~/.claude directory exposes credentials and should only be used for local development/testing. Production deployments should use env vars (CLAUDE_CODE_OAUTH_TOKEN or ANTHROPIC_API_KEY) instead.

fix: Exclude claude-code-wrapper from 'all' profile

89485bb

Most users don't need the Claude Code wrapper service. Keep it in its own 'claude-code' profile so it must be explicitly requested.

matthew-petty added 9 commits December 16, 2025 16:27

fix: Add logging for stream buffer parse failures

9ded276

Add debug-level logging when final buffer fails to parse as JSON. This is expected during user cancellations but provides visibility for debugging.

test: Add edge case tests for LLM tools endpoint

b8df389

- Test for empty Bearer token returning 401
- Test for getLearnedPatterns with rule not found
- Test input validation for getLearnedPatterns, updateAbout, and addToKnowledgeBase with missing required fields

test: Add error handling tests for LLM tools endpoint

ca8e12b

- Add test for EXECUTION_ERROR when tool throws unexpected error
- Add test for database error handling in addToKnowledgeBase

docs: Add comprehensive README for Claude Code Wrapper

80f6168

- Document core vs app-specific code separation
- Explain skills system and extensibility
- Include complete API reference
- Add development and Docker deployment instructions
- Prepare for potential open source release

fix: Update Dockerfile Bun version to 1.3.3

83237ef

Bun 1.2+ uses lockfile v1 format. Update to latest stable (1.3.3) for security patches and performance improvements. This fixes the "Unknown lockfile version" warning during builds.

coderabbitai bot reviewed Dec 16, 2025

cubic-dev-ai bot reviewed Dec 16, 2025

matthew-petty added 11 commits December 16, 2025 17:02

fix: Use non-greedy regex with iteration for JSON extraction

8ba98f2

Replace greedy regex patterns with non-greedy global matches that iterate through candidates. This prevents incorrect extraction when responses contain multiple JSON fragments or trailing text.
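The idea can be sketched as follows. This is not the wrapper's actual extraction code: the function name `extractJson`, the focus on fenced code blocks, and the fallback order are all assumptions made for illustration. The point is that a non-greedy capture with the global flag yields one candidate per fenced block, and each candidate is tried in turn until one parses.

```typescript
// Hypothetical sketch of non-greedy, iterate-over-candidates extraction.
// Three backticks, built at runtime to keep this example easy to embed.
const FENCE = "`".repeat(3);

function extractJson(text: string): unknown {
  // [\s\S]*? is non-greedy, so each match stops at the nearest closing fence
  // instead of swallowing everything up to the last fence in the response.
  const pattern = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "g");

  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    try {
      return JSON.parse(match[1]);
    } catch {
      // This candidate is not valid JSON; move on to the next one.
    }
  }

  // No fenced candidate parsed: try the whole response as a last resort.
  try {
    return JSON.parse(text.trim());
  } catch {
    return undefined;
  }
}
```

A greedy capture over the same input would span from the first opening fence to the last closing fence, which is exactly the failure mode the commit describes for responses with multiple fragments or trailing text.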

fix: Remove PII (email) from error responses and logs

3e6de34

- Use generic error message instead of including userEmail in 404 response
- Remove userEmail from tool invocation logs (emailAccountId suffices for debugging)

fix: Add consistent ruleName validation across all tool schemas

4c26442

All ruleName fields now require non-empty strings with min(1) validation, matching the existing pattern in getLearnedPatternsInputSchema.

fix: Use WorkflowGroup type instead of string with type assertion

1bb420e

Import and use the proper WorkflowGroup type throughout the session management functions, eliminating the need for unsafe type assertions.

fix: Add runtime validation for content parts structure

a4db4e7

Add isTextContentPart type guard to validate content part objects before accessing properties, replacing unsafe type assertion.
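A type guard of this kind might look like the sketch below. Only the name `isTextContentPart` comes from the commit; the `TextContentPart` shape (`type: "text"` plus a string `text` field) is an assumption about what the PR validates.

```typescript
// Assumed shape of a text content part; the PR's actual type may differ.
interface TextContentPart {
  type: "text";
  text: string;
}

// Runtime check that narrows an unknown value to TextContentPart.
function isTextContentPart(part: unknown): part is TextContentPart {
  if (typeof part !== "object" || part === null) return false;
  const candidate = part as Record<string, unknown>;
  return candidate.type === "text" && typeof candidate.text === "string";
}
```

Unlike a bare `as TextContentPart` assertion, the guard rejects malformed parts at runtime, so a caller can filter an array of mixed content parts and read `.text` only on entries that passed the check.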

fix: Reuse TextDecoder with stream mode for multi-byte UTF-8

8db4c65

Creating a new TextDecoder per chunk incorrectly decodes multi-byte UTF-8 characters (emojis, CJK, etc.) that span chunk boundaries. Fix by instantiating decoder once and using { stream: true } option.
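To illustrate the fix, here is a small sketch using the standard `TextDecoder` API. The helper name `decodeChunks` and the array-of-chunks input are assumptions for the example; the wrapper presumably decodes chunks as they arrive from a stream.

```typescript
// Hypothetical helper: decode a sequence of byte chunks with one shared decoder.
function decodeChunks(chunks: Uint8Array[]): string {
  // One decoder for the whole stream, so it can buffer incomplete byte sequences.
  const decoder = new TextDecoder("utf-8");
  let output = "";

  for (const chunk of chunks) {
    // stream: true tells the decoder more bytes may follow, so a multi-byte
    // character cut off at the end of this chunk is held back, not replaced.
    output += decoder.decode(chunk, { stream: true });
  }

  // Final call without arguments flushes anything still buffered.
  output += decoder.decode();
  return output;
}
```

With a fresh decoder per chunk, the two halves of a split emoji would each decode to replacement characters; with the shared decoder the bytes are joined before decoding.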

fix: Consistent Claude Code availability check in getProviderApiKey

1e0aeed

Both BASE_URL and WRAPPER_API_KEY are now required, matching the isClaudeCodeAvailable() check.

fix: Add missing ActionType.MOVE_FOLDER to validation schema

42eb1b4

The action type enum was missing MOVE_FOLDER, which is a valid Prisma ActionType. The fields object already includes folderName.

fix: Move console spies into beforeEach for proper test isolation

7057906

Console mocks at module level are undone by vi.restoreAllMocks() in afterEach. Moving them into beforeEach ensures they are re-established before each test.

fix: Remove non-null assertion on claudeCodeConfig.model

a3b3939

Replace the unsafe `!` assertion with proper fallback chain matching buildClaudeCodeConfig() logic: config.model || env var || "sonnet"
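The fallback chain could be expressed roughly as below. The function name `resolveModelName`, the `ClaudeCodeConfig` shape, and passing the environment value as a parameter are assumptions; the commit does not name the environment variable, so it is left abstract here.

```typescript
// Assumed minimal config shape; the PR's real type likely has more fields.
interface ClaudeCodeConfig {
  model?: string;
}

// Resolve the model without a non-null assertion:
// explicit config first, then the environment value, then the "sonnet" default.
function resolveModelName(
  config: ClaudeCodeConfig,
  envModel: string | undefined,
): string {
  // || (rather than ??) also skips empty strings, matching the chain in the commit.
  return config.model || envModel || "sonnet";
}
```

Because the return type is `string`, callers no longer need `!` to satisfy the compiler, and a missing model degrades to a default instead of passing `undefined` downstream.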

test: Update test to match generic error message

6136c69

Test was expecting error message to contain the email address, but we removed that for security (commit 3e6de34).

coderabbitai bot reviewed Dec 17, 2025

apps/claude-code-wrapper/src/routes/generate.ts (resolved)

apps/claude-code-wrapper/src/routes/generate.ts (resolved)

apps/web/app/api/llm-tools/invoke/validation.ts (resolved)

apps/web/utils/llms/index.ts (resolved)

matthew-petty added 2 commits December 16, 2025 19:51

fix: Replace non-null assertion with fallback chain in createGenerateObject

3ba1868

Same fix as createGenerateText - use proper fallback chain instead of non-null assertion for claudeCodeConfig.model.

security: Remove sensitive data from error responses

d0a0052

- Remove raw response text from JSON parse errors
- Remove rawOutput from CLI error responses

Both could potentially contain user data or secrets.

coderabbitai bot reviewed Dec 17, 2025


macroscopeapp bot commented Dec 16, 2025

Add Claude Code CLI provider and route app LLM calls through new authenticated wrapper service and web SDK integrations


matthew-petty commented Dec 17, 2025

✅ All Tests Pass
