feat(opencode): add Vercel sandbox substrate [WIP] #22961
avemeva wants to merge 10 commits into anomalyco:dev
Introduces the opt-in knob for swapping tool execution into a remote substrate. The flag and the 'workspace' config schema are the ONLY upper-layer surface consumers see; everything downstream dispatches on them internally. The config schema is a discriminated union over backend = 'local' | 'vercel'. The vercel variant carries optional credentials / snapshotId / timeout / worktree; omitted values fall back to VERCEL_TOKEN / VERCEL_TEAM_ID / VERCEL_PROJECT_ID / VERCEL_SANDBOX_IMAGE_ID env vars. The flag's getter is dynamic so tests can mutate the env at runtime to force a backend.
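The config shape and env fallback described above could look roughly like the sketch below. Field names follow the commit message, but the exact types, the `resolveVercel` and `makeFlag` helpers, and the pairing of `snapshotId` with `VERCEL_SANDBOX_IMAGE_ID` are assumptions for illustration, not the PR's actual code.

```typescript
// Hypothetical shapes: field names come from the commit message, types are assumed.
type VercelCredentials = { token?: string; teamId?: string; projectId?: string };

type WorkspaceConfig =
  | { backend: "local" }
  | {
      backend: "vercel";
      credentials?: VercelCredentials;
      snapshotId?: string;
      timeout?: number;
      worktree?: string;
    };

type Env = Record<string, string | undefined>;

// Fill omitted vercel values from the env vars named in the commit message.
// Assumption: VERCEL_SANDBOX_IMAGE_ID is the fallback for snapshotId.
function resolveVercel(config: Extract<WorkspaceConfig, { backend: "vercel" }>, env: Env) {
  return {
    token: config.credentials?.token ?? env["VERCEL_TOKEN"],
    teamId: config.credentials?.teamId ?? env["VERCEL_TEAM_ID"],
    projectId: config.credentials?.projectId ?? env["VERCEL_PROJECT_ID"],
    snapshotId: config.snapshotId ?? env["VERCEL_SANDBOX_IMAGE_ID"],
  };
}

// A getter re-reads the env on every access, so a test can mutate it at runtime.
function makeFlag(env: Env) {
  return {
    get backend(): string | undefined {
      return env["OPENCODE_WORKSPACE_BACKEND"];
    },
  };
}
```

Passing the environment in as a plain record (rather than reading a global) keeps the sketch self-contained; the real flag presumably reads the process environment directly.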
The substrate-agnostic contract every implementation (local, vercel, test fixtures) satisfies: fs primitives (stat, readFile, writeFile, readDir, mkDir, remove, rename, exists), a generic exec + Scope-aware execStream, a watch stream with optional ignore globs, and an opaque close() finalizer. WorkspaceBackendError is the L2 error every backend speaks; WorkspaceError is the higher-level shape surfaced to L3 consumers (separated into its own file to break a module-eval cycle with the router).
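As a rough picture of that contract, here is a simplified sketch. The method names are the ones listed in the commit message; every parameter type, return type, and helper interface below is assumed, and plain Promises and AsyncIterables stand in for the Effect types (including the Scope-aware `execStream`) that the PR describes.

```typescript
// Hypothetical signatures: only the method names are taken from the commit message.
interface FileStat { type: "file" | "directory"; size: number }
interface ExecResult { exitCode: number; stdout: string; stderr: string }
interface WatchEvent { path: string; kind: "add" | "change" | "unlink" }

// Stand-in for a Scope-managed streaming process: explicit kill() replaces the Scope.
interface ExecHandle {
  write(chunk: Uint8Array): void;
  stdout: AsyncIterable<Uint8Array>;
  stderr: AsyncIterable<Uint8Array>;
  kill(): Promise<void>;
}

interface Backend {
  // fs primitives
  stat(path: string): Promise<FileStat>;
  readFile(path: string): Promise<Uint8Array>;
  writeFile(path: string, data: Uint8Array): Promise<void>;
  readDir(path: string): Promise<string[]>;
  mkDir(path: string): Promise<void>;
  remove(path: string): Promise<void>;
  rename(from: string, to: string): Promise<void>;
  exists(path: string): Promise<boolean>;
  // one-shot and streaming execution
  exec(command: string, args: string[]): Promise<ExecResult>;
  execStream(command: string, args: string[]): Promise<ExecHandle>;
  // file events with optional ignore globs
  watch(path: string, options?: { ignore?: string[] }): AsyncIterable<WatchEvent>;
  // opaque finalizer
  close(): Promise<void>;
}

// Assumed shape of the backend-level error: which primitive failed, on what path.
class WorkspaceBackendError extends Error {
  constructor(
    readonly operation: keyof Backend,
    readonly path: string,
    message: string,
  ) {
    super(`${operation} ${path}: ${message}`);
    this.name = "WorkspaceBackendError";
  }
}
```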
LocalBackend implements the Backend contract using node:fs/promises, effect/unstable/process ChildProcessSpawner, and @parcel/watcher — the existing host primitives. ThrowingBackend is a test fixture that fails every primitive; the conformance suite runs against it to prove no consumer secretly bypasses the Backend seam. Shared pure helpers (is-binary detection, line slicing, exec mailbox, ripgrep argv + JSON parser) live under helpers/ so all backends and consumers share one implementation.
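The idea behind the throwing fixture can be illustrated with a small sketch. Everything here is hypothetical: the `MARKER` string, the `makeThrowingBackend` and `propagatesMarker` helpers, and the synchronous `throw` (the real fixture presumably fails through Effect's error channel instead).

```typescript
// Primitive names as listed in the Backend contract.
const PRIMITIVES = [
  "stat", "readFile", "writeFile", "readDir", "mkDir", "remove",
  "rename", "exists", "exec", "execStream", "watch", "close",
] as const;
type Primitive = (typeof PRIMITIVES)[number];

// Hypothetical marker that a conformance check can look for in the error.
const MARKER = "THROWING_BACKEND";

type ThrowingBackend = Record<Primitive, (...args: unknown[]) => unknown>;

// Build a backend whose every primitive fails with the marker.
function makeThrowingBackend(): ThrowingBackend {
  const backend = {} as ThrowingBackend;
  for (const name of PRIMITIVES) {
    backend[name] = () => {
      throw new Error(`${MARKER}: ${name} was called`);
    };
  }
  return backend;
}

// True only if calling each primitive surfaces the marker,
// i.e. nothing silently fell back to the host.
function propagatesMarker(backend: ThrowingBackend): boolean {
  return PRIMITIVES.every((name) => {
    try {
      backend[name]();
      return false; // the call succeeded, so something bypassed the fixture
    } catch (e) {
      return e instanceof Error && e.message.startsWith(MARKER);
    }
  });
}
```

A consumer that secretly calls `node:fs` would succeed where the fixture should have failed, which is exactly what a check like `propagatesMarker` is meant to expose.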
WorkspaceRouter picks a Backend per Instance based on the OPENCODE_WORKSPACE_BACKEND flag and config.workspace.backend, caches the result per-Instance via InstanceState, and fails with a WorkspaceError when vercel credentials are missing. Workspace.Primitives is the L3 interface consumers use via the substrate adapters landing in later commits; it forwards to the Backend while adding derived helpers (readFileString, writeFileWithDirs, readFileLines, isBinary, isDir, search, files, resolve, containsPath) and mapping BackendError -> WorkspaceError. Workspace.Service.Tag is the L4 tool-facing wrapper that layers post-write orchestration (format -> bus events -> LSP touch + diagnostics). WorkspaceRuntime is the app-level assembly point that wires Primitives + real Format / Bus / LSP layers; it lives in its own file to avoid a module-eval cycle with the L4 services.
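A minimal sketch of the router's selection logic, assuming the flag > config > default precedence stated in the test commit. The `pickBackend` and `makeRouter` names are hypothetical, the cache is keyed by directory as a stand-in for `InstanceState`, and credentials are checked only via `VERCEL_TOKEN` for brevity.

```typescript
type BackendKind = "local" | "vercel";
type Env = Record<string, string | undefined>;
interface RouterConfig { workspace?: { backend?: BackendKind } }

// Assumed minimal shape of the error surfaced to consumers.
class WorkspaceError extends Error {
  constructor(readonly reason: string) {
    super(reason);
  }
}

// Precedence: env flag, then config.workspace.backend, then "local".
function pickBackend(env: Env, config: RouterConfig): BackendKind {
  const flag = env["OPENCODE_WORKSPACE_BACKEND"];
  if (flag === "local" || flag === "vercel") return flag;
  return config.workspace?.backend ?? "local";
}

// Returns a lookup function that creates each tenant's backend once and caches it.
function makeRouter<B>(create: (kind: BackendKind, directory: string) => B) {
  const cache = new Map<string, B>();
  return (directory: string, env: Env, config: RouterConfig): B => {
    const cached = cache.get(directory);
    if (cached !== undefined) return cached;
    const kind = pickBackend(env, config);
    // Simplification: only the token is checked here.
    if (kind === "vercel" && !env["VERCEL_TOKEN"]) {
      throw new WorkspaceError("vercel backend selected but credentials are missing");
    }
    const backend = create(kind, directory);
    cache.set(directory, backend);
    return backend;
  };
}
```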
VercelBackend implements the Backend contract against @vercel/sandbox 2.x. Sandbox identity is deterministic per tenant (nameFor(directory) = oc-<sha1[:20]>), created lazily on first call, kept alive with throttled extendTimeout, and persistent across opencode restarts. FS primitives the SDK doesn't expose (stat, readDir, exists, remove, rename) are implemented via bash -lc running coreutils. 'exec' uses sb.runCommand directly; 'execStream' opens a WebSocket to an in-sandbox gateway daemon that forwards bytes to a real child process's stdio — the only path that gives LSP the live bidirectional I/O it needs. The gateway ships baked into the sandbox image (see script/sandbox-image/) so cold-start bootstrap is a single launch + health poll. script/verify-sandbox-image.ts smoke-tests that the image has every binary the L4 services need.
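The naming rule and the throttled keep-alive can be sketched as follows. This assumes the hash is the hex-encoded SHA-1 of the directory string (the commit message only gives `oc-<sha1[:20]>`), and `makeKeepAlive` is a hypothetical helper whose `extend` callback stands in for the SDK's `extendTimeout` call.

```typescript
import { createHash } from "node:crypto";

// Assumption: hex-encoded SHA-1 of the directory path, truncated to 20 characters.
function nameFor(directory: string): string {
  const digest = createHash("sha1").update(directory).digest("hex");
  return `oc-${digest.slice(0, 20)}`;
}

// Calls `extend` at most once per `minIntervalMs`, however often it is triggered.
// Returns true when the extension actually ran.
function makeKeepAlive(
  extend: () => void,
  minIntervalMs: number,
  now: () => number = Date.now,
): () => boolean {
  let last = -Infinity;
  return () => {
    const t = now();
    if (t - last < minIntervalMs) return false; // too soon, skip this one
    last = t;
    extend();
    return true;
  };
}
```

Because the name is a pure function of the directory, a restarted opencode process derives the same sandbox name and can reattach rather than create a new sandbox.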
Four adapters that make VercelBackend satisfy the contracts opencode consumers already use, so tools, L4 services, and LSP stay untouched:
- vercel-spawner.ts: Effect ChildProcessSpawner impl routing through Workspace.Primitives.execStream (used by bash tool, Git, Format, Snapshot, Ripgrep).
- vercel-process.ts: util/process.ts::Process.spawn impl returning a sync Node ChildProcess-like handle; stdin writes and stdout/stderr reads queue until the async WebSocket connect resolves (used by LSP language-server spawns).
- vercel-filewatcher.ts: no-op FileWatcher.Service since the agent is the only writer inside a tenant sandbox.
- node-stream-adapters.ts: Effect Stream/Sink ↔ Node Readable/Writable bridges shared by the two spawn adapters so vscode-jsonrpc/node consumers (LSPClient) see normal Node streams.
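The queue-until-connected behaviour of the sync process handle could be sketched like this. `QueuedStdin` and its `attach` method are hypothetical names; the real adapter works with Node streams and a WebSocket, which are replaced here by a plain callback.

```typescript
type Sink = (chunk: string) => void;

// Accepts writes immediately; holds them until a transport is attached.
class QueuedStdin {
  private pending: string[] = [];
  private sink: Sink | undefined;

  // Synchronous from the caller's point of view, even before the connection exists.
  write(chunk: string): void {
    if (this.sink) this.sink(chunk);
    else this.pending.push(chunk);
  }

  // Called once the async connect resolves: flush in order, then pass through.
  attach(sink: Sink): void {
    this.sink = sink;
    for (const chunk of this.pending) sink(chunk);
    this.pending = [];
  }
}
```

This is what lets a caller receive a handle synchronously and start writing to it right away, as a language-server client does, while the connection is still being established.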
…n backend flag
The four substrate entry points in opencode's existing code (two shared services and one module-global function plus its companion watcher) now pick a Vercel-backed implementation when OPENCODE_WORKSPACE_BACKEND=vercel, falling back to the existing host implementation otherwise:
- cross-spawn-spawner.ts: defaultLayer dispatches between the existing cross-spawn layer and VercelChildProcessSpawner.
- util/process.ts: Process.spawn() dispatches between launch(...) and spawnViaVercel(...).
- file/watcher.ts: defaultLayer dispatches between the @parcel/watcher layer and the no-op vercel layer.
- session/llm.ts: break a module-eval cycle introduced by any of the above via a lazy AppRuntime import inside the single function that uses it.
Dispatch bodies use dynamic import() to defer the workspace module graph until first use: workspace/* transitively pulls in @/global's top-level await, which must not load during the low-level modules' own eval phase. Every tool, L4 service, and LSP consumer keeps yielding the SAME service tags as before. The substrate swap is invisible above this seam.
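The deferred-loading pattern might be sketched as below. The `lazy` and `makeDispatch` helpers and the `Spawner` shape are hypothetical; in real code each loader thunk would wrap a dynamic `import()` of the relevant module, which is left out here so the example runs on its own.

```typescript
type Env = Record<string, string | undefined>;
interface Spawner { name: string }

// Memoize an async loader: nothing runs until the first call, and it runs only once.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => (cached ??= load());
}

// Choose the implementation at call time; only the chosen branch is ever loaded.
function makeDispatch(
  env: Env,
  loadHost: () => Promise<Spawner>,
  loadVercel: () => Promise<Spawner>,
): () => Promise<Spawner> {
  const host = lazy(loadHost);
  const vercel = lazy(loadVercel);
  return () => (env["OPENCODE_WORKSPACE_BACKEND"] === "vercel" ? vercel() : host());
}
```

Constructing the dispatcher triggers no loading at all, which is the property that keeps a heavy module graph out of the low-level modules' evaluation phase.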
One conformance suite that runs against all three Backend implementations (local / throwing / vercel), picked via OPENCODE_CONFORMANCE_BACKEND. Same assertions, same fixtures — proving behavior parity across substrates and catching any silent host fallback (throwing backend MUST propagate its marker through every primitive). Also adds pure-helper unit tests (is-binary, lines, mailbox, rg) and focused tests for the router (flag > config > default precedence, per-Instance cache, creds-missing error) and the Primitives BackendError -> WorkspaceError mapping.
Diffs HEAD vs the branching point, partitions files into lower (new substrate code — expected to grow) and upper (consumers opencode shouldn't have to modify to add a substrate). Reports totals + ratio. Used to verify the Vercel substrate migration stays a drop-in with respect to opencode's existing service interfaces.
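The partition step can be pictured with a short sketch. The script in the PR is a shell script; this TypeScript version is only an illustration, and the `footprint` helper, the `isLower` predicate, and the definition of the ratio (upper file count divided by lower file count) are all assumptions.

```typescript
interface Footprint { lower: string[]; upper: string[]; ratio: number }

// Split changed paths into new substrate code (lower) and existing consumers (upper).
function footprint(changed: string[], isLower: (path: string) => boolean): Footprint {
  const lower = changed.filter(isLower);
  const upper = changed.filter((p) => !isLower(p));
  // Assumed metric: upper-to-lower file count; 0 when there is no lower-layer change.
  const ratio = lower.length === 0 ? 0 : upper.length / lower.length;
  return { lower, upper, ratio };
}
```

A small ratio would indicate that most of the diff is new substrate code and that few existing consumers were touched.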
Black-box test that spawns opencode serve as a subprocess and drives it via the SDK client against a real Vercel sandbox. Exercises: fresh session has 0 messages, bash tool returns Linux + /vercel/sandbox (proving the command ran in the sandbox not the host), filesystem writes persist across prompts in the same tenant. Skips the whole file when VERCEL_* is missing so it's safe to leave enabled in default 'bun test'. Run manually with vercel creds via 'OPENCODE_WORKSPACE_BACKEND=vercel bun test script/vercel-proof.test.ts'.
Hey! Your PR title Please update it to start with one of:
Where See CONTRIBUTING.md for details.
Thanks for updating your PR! It now meets our contributing guidelines. 👍
This PR doesn't fully meet our contributing guidelines and PR template. What needs to be fixed:
Please edit this PR description to address the above within 2 hours, or it will be automatically closed. If you believe this was flagged incorrectly, please let a maintainer know.
This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window. Feel free to open a new pull request that follows our guidelines.
Opt-in substrate that runs every tool's fs + subprocess calls inside a per-tenant Vercel sandbox instead of on the host. Use case: a multi-tenant agent server where each user gets their own isolated fs + exec environment (one user = one sandbox, one opencode process handles N tenants).
Issue for this PR
No existing issue — opening this as a discussion PR. Happy to file one if that's preferred.
Type of change
What does this PR do?
Context. opencode is a CLI AI coding agent. The LLM calls tools (`bash`, `read`, `write`, `edit`, `grep`, `glob`, `lsp`, ...) and opencode executes them against a workspace. Today that workspace is the host: `read` does `fs.readFile`, `bash` shells out via `cross-spawn`, `grep` forks `rg`, `lsp` launches `typescript-language-server`, fs events come from `@parcel/watcher`. Fine for one user = one machine. Breaks for a multi-tenant server where one opencode process serves N users and each user's tool calls must be isolated.
What this adds. Set `OPENCODE_WORKSPACE_BACKEND=vercel` and every tool call runs inside that tenant's per-user Vercel sandbox. One tenant = one persistent sandbox, keyed by the opencode `Instance`. Default (`local`) is unchanged.
Interface. Everything a tool needs from a substrate, in one interface:
Two implementations ship: `LocalBackend` (wraps `node:fs` + cross-spawn + `@parcel/watcher`) and `VercelBackend` (wraps `@vercel/sandbox`).
Shape. Consumers upstream don't change; they still yield the same service tags. Two independent mechanisms route calls down to a `Backend`: funnel dispatch picks the implementation at Layer build, and the router picks the tenant's `Backend` instance at Service resolution.
Two spawn funnels exist because opencode has two spawn shapes: Effect-style returning `Effect<Handle, _, Scope>`, and sync Node-style returning `ChildProcess`. LSP uses `vscode-jsonrpc/node` and needs the latter. One-shot `exec` goes direct to `sb.runCommand`, because sending the ~13 git commands per agent step through the gateway would cost ~6s of WS handshake overhead. `execStream` routes through the gateway because LSP needs the live stdio that only the gateway provides.
How did you verify your code works?
`bun test` in the opencode package is green: 2002 pass, 0 fail, 22 skip, 1 todo across 179 files. A conformance suite under `test/workspace/conformance/` runs the same assertions against all three backends via `OPENCODE_CONFORMANCE_BACKEND={local,throwing,vercel}`; the throwing backend fails every primitive and is the guard against any consumer silently bypassing the Backend seam. The e2e proof described below passes 8/8 against a real Vercel sandbox.
Running the e2e proof yourself. Build from this branch:
Put the built binary on PATH, set vercel creds, run the proof:
Takes ~40–90 seconds end-to-end. Skips cleanly when `VERCEL_*` is missing. The harness spawns `opencode serve` as a subprocess, drives it through `@opencode-ai/sdk`, and asserts: a fresh session has zero messages, the bash tool returns `Linux` + `/vercel/sandbox` (proving the command ran in the sandbox, not the host), and filesystem writes persist across prompts in the same tenant.
Screenshots / recordings
N/A — no UI changes.
Checklist
Known limitations
First `execStream` call per sandbox is 1–3s (cold start + gateway bootstrap); subsequent calls reuse the persistent sandbox. Snapshot service fires about 13 git commands per agent step; on vercel each is an SDK round-trip, so snapshotting is ~3s slower than local. A sandbox-local gitdir would fix it but isn't in this PR. FileWatcher is a no-op under vercel by design.
Review scaffolding (not intended to land)
`script/vercel-proof.test.ts` and `scripts/count-upper-layer.sh` are reviewer-facing verification tooling, not part of the substrate itself. The proof script is here so anyone can reproduce the e2e result end-to-end; the footprint script is how I kept the consumer-side blast radius small while iterating. Both would be dropped (or moved to a separate test/dev-tools PR) before this lands.
Rebase
Left it at `43b37346b` to keep the diff readable; dev has moved ~80 commits since. If there's any interest or this gets any traction, happy to rebase onto current dev.