Skip to content

Latest commit

 

History

History
305 lines (210 loc) · 23.1 KB

File metadata and controls

305 lines (210 loc) · 23.1 KB

⚠️ STOP — READ THIS FIRST IF YOU ARE CLAUDE ⚠️

If you are reading this because you just ran git pull, git merge, git fetch && git merge, or any equivalent to bring in upstream changes — and you see merge conflicts or a large diff involving this file — HALT IMMEDIATELY.

This is NanoClaw v2, a ground-up rewrite with breaking changes throughout. It cannot be merged into an existing v1 install. Attempting to resolve the conflicts by hand, run builds, or "fix" anything will corrupt the user's install and burn tokens for no result.

Do this instead:

  1. Abort the merge: git merge --abort (or git reset --hard ORIG_HEAD if the merge already completed).
  2. Tell the user, verbatim: "This is the v2 rewrite — it can't be merged into your existing install. Exit Claude Code (or open a separate terminal) and run bash migrate-v2.sh from the shell."
  3. Wait for the user to confirm before doing anything else. Do not run the migration script yourself — it requires an interactive terminal and cannot be run from within Claude Code.

If you are a fresh install (you ran git clone, not git pull) and there are no conflicts, ignore this banner and continue below.


NanoClaw

Personal Claude assistant. See README.md for philosophy and setup. Architecture lives in docs/.

Workflow Rules

0. Plan Before Executing

For any non-trivial task (multi-step features, multi-phase work, anything beyond a single localized edit): write the plan to plans/<feature>.md before starting implementation. The plan must enumerate all phases / steps with enough detail that a future session (or a different Claude) can resume mid-stream without guessing. Update the file as the plan evolves and tick phases off as they land. Conversation-only plans are not acceptable — they vanish when the session ends.

1. Don't assume. Don't hide confusion. Surface tradeoffs.

If you don't understand the request, the codebase, or which of several reasonable approaches the user wants — say so before you act. "I'm going to guess X" is fine; silently picking X and writing 200 lines around it is not. When there are tradeoffs, name them out loud rather than picking and hoping.

2. Minimum code that solves the problem. Nothing speculative.

No abstractions, helpers, configuration knobs, or feature flags built "in case we need them later." If a future task needs them, the future task can add them. Three similar lines beats a premature abstraction.

3. Touch only what you must. Clean up only your own mess.

Stay inside the scope the user asked for. Don't refactor adjacent code, don't reformat unrelated files, don't fix bugs you happen to notice unless the user told you to. If you make a mess (dead code, unused imports, half-written helpers) in the course of your work, clean that up. Other people's pre-existing mess is not your job.

4. Define success criteria. Loop until verified.

Before starting, state what "done" looks like in concrete, checkable terms (tests passing, build clean, specific behavior reproduced). After finishing, run those checks. If they fail, fix and re-run — don't declare done on the strength of "it looks right."

5. Respect the small-trunk-with-skills philosophy.

Before adding any new module, dependency, or capability to trunk: ask whether it belongs on a long-lived branch installed by a skill (the pattern used for channel adapters in the channels branch and non-default providers in the providers branch). Anything channel-specific, provider-specific, integration-specific, or category-specific (Google Workspace, classroom features, third-party services) defaults to a branch + install skill, not trunk. Trunk should be infrastructure that EVERY install needs, not features any subset uses. When in doubt, surface the choice up front before committing — discovering after-the-fact that trunk has accumulated GWS / classroom / integration code triggers expensive refactors.

Quick Context

The host is a single Node process that orchestrates per-session agent containers. Platform messages land via channel adapters, route through an entity model (users → messaging groups → agent groups → sessions), get written into the session's inbound DB, and wake a container. The agent-runner inside the container polls the DB, calls Claude, and writes back to the outbound DB. The host polls the outbound DB and delivers through the same adapter.

Everything is a message. There is no IPC, no file watcher, no stdin piping between host and container. The two session DBs are the sole IO surface.

Entity Model

users (id "<channel>:<handle>", kind, display_name)
user_roles (user_id, role, agent_group_id)       — owner | admin (global or scoped)
agent_group_members (user_id, agent_group_id)    — unprivileged access gate
user_dms (user_id, channel_type, messaging_group_id) — cold-DM cache

agent_groups (workspace, memory, CLAUDE.md, personality, container config)
    ↕ many-to-many via messaging_group_agents (session_mode, trigger_rules, priority)
messaging_groups (one chat/channel on one platform; unknown_sender_policy)

sessions (agent_group_id + messaging_group_id + thread_id → per-session container)

Privilege is user-level (owner/admin), not agent-group-level. See docs/isolation-model.md for the three isolation levels (agent-shared, shared, separate agents).

Two-DB Session Split

Each session has two SQLite files under data/v2-sessions/<session_id>/:

  • inbound.db — host writes, container reads. messages_in, routing, destinations, pending_questions, processing_ack.
  • outbound.db — container writes, host reads. messages_out, session_state.

Exactly one writer per file — no cross-mount lock contention. Heartbeat is a file touch at /workspace/.heartbeat, not a DB update. Host uses even seq numbers, container uses odd.

Central DB

data/v2.db holds everything that isn't per-session: users, user_roles, agent_groups, messaging_groups, wiring, pending_approvals, user_dms, chat_sdk_* (for the Chat SDK bridge), schema_version. Migrations live at src/db/migrations/.

For ad-hoc queries from skills or scripts, use the in-tree wrapper rather than the sqlite3 CLI: pnpm exec tsx scripts/q.ts <db> "<sql>". The host setup intentionally avoids depending on the sqlite3 binary (setup/verify.ts:5); the wrapper goes through the better-sqlite3 dep that setup already installs and verifies. Default-output format matches sqlite3 -list (pipe-separated, no header) so existing skill text reads identically.

Key Files

File Purpose
src/index.ts Entry point: init DB, migrations, channel adapters, delivery polls, sweep, shutdown
src/router.ts Inbound routing: messaging group → agent group → session → inbound.db → wake
src/delivery.ts Polls outbound.db, delivers via adapter, handles system actions (schedule, approvals, etc.)
src/host-sweep.ts 60s sweep: processing_ack sync, stale detection, due-message wake, recurrence
src/session-manager.ts Resolves sessions; opens inbound.db / outbound.db; manages heartbeat path
src/container-runner.ts Spawns per-agent-group Docker containers with session DB + outbox mounts, injects credential-proxy env vars
src/credential-proxy.ts Local HTTP proxy on port 3001 — containers send placeholder credentials, proxy substitutes real keys/OAuth tokens, forwards to api.anthropic.com (default path) or api.openai.com (/openai/* prefix)
src/container-runtime.ts Runtime selection (Docker vs Apple containers), orphan cleanup
src/modules/permissions/access.ts canAccessAgentGroup — owner / global admin / scoped admin / member resolution against user_roles + agent_group_members
src/modules/approvals/primitive.ts pickApprover, pickApprovalDelivery, requestApproval, approval-handler registry
src/command-gate.ts Router-side admin command gate — queries user_roles directly (no env var, no container-side check)
src/user-dm.ts Cold-DM resolution + user_dms cache
src/group-init.ts Per-agent-group filesystem scaffold (CLAUDE.md, skills, agent-runner-src overlay)
src/db/ DB layer — agent_groups, messaging_groups, sessions, user_roles, user_dms, pending_*, migrations
src/channels/ Channel adapter infra (registry, Chat SDK bridge); specific channel adapters are skill-installed from the channels branch
src/providers/ Host-side provider container-config (claude baked in; opencode etc. installed from the providers branch)
container/agent-runner/src/ Agent-runner: poll loop, formatter, provider abstraction, MCP tools, destinations
container/skills/ Container skills mounted into every agent session (onecli-gateway, welcome, self-customize, agent-browser, slack-formatting)
groups/<folder>/ Per-agent-group filesystem (CLAUDE.md, skills, per-group agent-runner-src/ overlay)
scripts/init-first-agent.ts Bootstrap the first DM-wired agent (used by /init-first-agent skill)
migrate-v2.sh + setup/migrate-v2/ v1→v2 migration. Standalone script: bash migrate-v2.sh. Seeds DB, copies groups/sessions, installs channels, builds container, offers service switchover, then hands off to /migrate-from-v1 skill for owner setup and CLAUDE.md cleanup. See docs/migration-dev.md.

Admin CLI (ncl)

ncl queries and modifies the central DB — agent groups, messaging groups, wirings, users, roles, and more. On the host it connects via Unix socket (src/cli/socket-server.ts); inside containers it uses the session DB transport (container/agent-runner/src/cli/ncl.ts).

ncl <resource> <verb> [<id>] [--flags]
ncl <resource> help
ncl help
Resource Verbs What it is
groups list, get, create, update, delete Agent groups (workspace, personality, container config)
messaging-groups list, get, create, update, delete A single chat/channel on one platform
wirings list, get, create, update, delete Links a messaging group to an agent group (session mode, triggers)
users list, get, create, update Platform identities (<channel>:<handle>)
roles list, grant, revoke Owner / admin privileges (global or scoped to an agent group)
members list, add, remove Unprivileged access gate for an agent group
destinations list, add, remove Where an agent group can send messages
sessions list, get Active sessions (read-only)
user-dms list Cold-DM cache (read-only)
dropped-messages list Messages from unregistered senders (read-only)
approvals list, get Pending approval requests (read-only)

Key files: src/cli/dispatch.ts (dispatcher + approval handler), src/cli/crud.ts (generic CRUD registration), src/cli/resources/ (per-resource definitions).

Channels and Providers (skill-installed)

Trunk does not ship any specific channel adapter or non-default agent provider. The codebase is the registry/infra; the actual adapters and providers live on long-lived sibling branches and get copied in by skills:

  • channels branch — Discord, Slack, Telegram, WhatsApp, Teams, Linear, GitHub, iMessage, Webex, Resend, Matrix, Google Chat, WhatsApp Cloud (+ helpers, tests, channel-specific setup steps). Installed via /add-<channel> skills.
  • providers branch — OpenCode (and any future non-default agent providers). Installed via /add-opencode.

Each /add-<name> skill is idempotent: git fetch origin <branch> → copy module(s) into the standard paths → append a self-registration import to the relevant barrel → pnpm install <pkg>@<pinned-version> → build.

Self-Modification

One tier of agent self-modification today:

  1. install_packages / add_mcp_server — changes to the per-agent-group container config only (apt/npm deps, wire an existing MCP server). Single admin approval per request; on approve, the handler in src/modules/self-mod/apply.ts rebuilds the image when needed (install_packages only) and restarts the container. container/agent-runner/src/mcp-tools/self-mod.ts.

A second tier (direct source-level self-edits via a draft/activate flow) is planned but not yet implemented.

Secrets / Credentials

Credentials live on the host in .env. Containers receive placeholder values only; the local credential proxy at 127.0.0.1:3001 (bound to docker0 IP on Linux so containers can reach it) substitutes real keys/tokens at request time. None of these secrets ever sit in container env vars or chat context.

Wiring: src/credential-proxy.ts (the proxy), src/container-runner.ts buildContainerArgs (sets container env). On startup the proxy reads from .env: ANTHROPIC_API_KEY, CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_AUTH_TOKEN, OPENAI_API_KEY, plus optional ANTHROPIC_BASE_URL / OPENAI_BASE_URL overrides.

Container env at spawn time:

  • ANTHROPIC_BASE_URL=http://host.docker.internal:3001 — Anthropic SDK target
  • OPENAI_BASE_URL=http://host.docker.internal:3001/openai/v1 — OpenAI SDK target (the /openai prefix is how the proxy multiplexes providers; /v1 is OpenAI's own API version)
  • ANTHROPIC_API_KEY=placeholder or CLAUDE_CODE_OAUTH_TOKEN=placeholder (depending on host's auth mode)
  • OPENAI_API_KEY=placeholder (always — SDKs refuse to init without it)

OAuth handling: when the host runs in OAuth mode, the proxy exchanges the placeholder for the real Claude OAuth token (auto-refreshed from ~/.claude/.credentials.json). For Anthropic API key mode, the proxy simply rewrites the x-api-key header. For OpenAI, the proxy injects Authorization: Bearer <real key>.

Rotation: edit .env, restart the host (systemctl --user restart nanoclaw). No container rebuild — every future spawn picks up the new key on its next request.

Adding a new provider: add the key to readEnvFile() in credential-proxy.ts, add a routing branch (path prefix or host match), set the corresponding *_BASE_URL=http://host:3001/<prefix>/... in container-runner.ts's buildContainerArgs. Mirror the OpenAI pattern.

If approvals are configured server-side but the host callback isn't running (or throws), every credentialed call hangs until the gateway times out. Conversely, if the gateway has no rule asking for approval, the host callback never fires regardless of how it's wired.

Skills

Four types of skills. See CONTRIBUTING.md for the full taxonomy.

  • Channel/provider install skills — copy the relevant module(s) in from the channels or providers branch, wire imports, install pinned deps (e.g. /add-discord, /add-slack, /add-whatsapp, /add-opencode).
  • Utility skills — ship code files alongside SKILL.md (e.g. /claw).
  • Operational skills — instruction-only workflows (/setup, /debug, /customize, /init-first-agent, /manage-channels, /update-nanoclaw).
  • Container skills — loaded inside agent containers at runtime (container/skills/: welcome, self-customize, agent-browser, slack-formatting).
Skill When to Use
/setup First-time install, auth, service config
/init-first-agent Bootstrap the first DM-wired agent (channel pick → identity → wire → welcome DM)
/manage-channels Wire channels to agent groups with isolation level decisions
/customize Adding channels, integrations, behavior changes
/debug Container issues, logs, troubleshooting
/update-nanoclaw Bring upstream updates into a customized install

Contributing

Before creating a PR, adding a skill, or preparing any contribution, you MUST read CONTRIBUTING.md. It covers accepted change types, the four skill types and their guidelines, SKILL.md format rules, and the pre-submission checklist.

PR Hygiene

Before creating a PR, run these checks:

git diff upstream/main --stat HEAD
git log upstream/main..HEAD --oneline

Show the output and wait for approval. Installation-specific files (group files, .claude/settings.json, local configs) should not be included.

Development

Run commands directly — don't tell the user to run them.

# Host (Node + pnpm)
pnpm run dev          # Host with hot reload
pnpm run build        # Compile host TypeScript (src/)
./container/build.sh  # Rebuild agent container image (nanoclaw-agent:latest)
pnpm test             # Host tests (vitest)

# Agent-runner (Bun — separate package tree under container/agent-runner/)
cd container/agent-runner && bun install   # After editing agent-runner deps
cd container/agent-runner && bun test      # Container tests (bun:test)

Container typecheck is a separate tsconfig — if you edit container/agent-runner/src/, run pnpm exec tsc -p container/agent-runner/tsconfig.json --noEmit from root (or bun run typecheck from container/agent-runner/).

Service management:

# macOS (launchd)
launchctl load   ~/Library/LaunchAgents/com.nanoclaw.plist
launchctl unload ~/Library/LaunchAgents/com.nanoclaw.plist
launchctl kickstart -k gui/$(id -u)/com.nanoclaw  # restart

# Linux (systemd)
systemctl --user start|stop|restart nanoclaw

Troubleshooting

Check these first when something goes wrong:

What Where
Host logs logs/nanoclaw.error.log first (delivery failures, crash-loop backoff, warnings), then logs/nanoclaw.log for the full routing chain
Setup logs logs/setup.log (overall), logs/setup-steps/*.log (per-step: bootstrap, environment, container, onecli, mounts, service, etc.)
Session DBs data/v2-sessions/<agent-group>/<session>/inbound.db (messages_in: did the message reach the container?), outbound.db (messages_out: did the agent produce a response?)

Note: container logs are lost after the container exits (--rm flag). If the agent silently failed inside the container, there's no persistent log to inspect.

Supply Chain Security (pnpm)

This project uses pnpm with minimumReleaseAge: 4320 (3 days) in pnpm-workspace.yaml. New package versions must exist on the npm registry for 3 days before pnpm will resolve them.

Rules — do not bypass without explicit human approval:

  • minimumReleaseAgeExclude: Never add entries without human sign-off. If a package must bypass the release age gate, the human must approve and the entry must pin the exact version being excluded (e.g. package@1.2.3), never a range.
  • onlyBuiltDependencies: Never add packages to this list without human approval — build scripts execute arbitrary code during install.
  • pnpm install --frozen-lockfile should be used in CI, automation, and container builds. Never run bare pnpm install in those contexts.

Docs Index

Doc Purpose
docs/architecture.md Full architecture writeup
docs/api-details.md Host API + DB schema details
docs/db.md DB architecture overview: three-DB model, cross-mount rules, readers/writers map
docs/db-central.md Central DB (data/v2.db) — every table + migration system
docs/db-session.md Per-session inbound.db + outbound.db schemas + seq parity
docs/agent-runner-details.md Agent-runner internals + MCP tool interface
docs/isolation-model.md Three-level channel isolation model
docs/setup-wiring.md What's wired, what's open in the setup flow
docs/architecture-diagram.md Diagram version of the architecture
docs/build-and-runtime.md Runtime split (Node host + Bun container), lockfiles, image build surface, CI, key invariants
docs/v1-to-v2-changes.md v1→v2 architecture diff — vocabulary for where v1 things moved
docs/migration-dev.md Migration development guide — testing, debugging, dev loop

Container Build Cache

The container buildkit caches the build context aggressively. --no-cache alone does NOT invalidate COPY steps — the builder's volume retains stale files. To force a truly clean rebuild, prune the builder then re-run ./container/build.sh.

Container Runtime (Bun)

The agent container runs on Bun; the host runs on Node (pnpm). They communicate only via session DBs — no shared modules. Details and rationale: docs/build-and-runtime.md.

Gotchas — trigger + action:

  • Adding or bumping a runtime dep in container/agent-runner/ → edit package.json, then cd container/agent-runner && bun install and commit the updated bun.lock. Do not run pnpm install there — agent-runner is not a pnpm workspace.
  • Bumping @anthropic-ai/claude-agent-sdk, @modelcontextprotocol/sdk, or any agent-runner runtime dep → no minimumReleaseAge policy applies to this tree. Check the release date on npm, pin deliberately, never bun update blindly.
  • Writing a new named-param SQL insert/update in the container → use $name in both SQL and JS keys: .run({ $id: msg.id }). bun:sqlite does not auto-strip the prefix the way better-sqlite3 does on the host. Positional ? params work normally.
  • Adding a test in container/agent-runner/src/ → import from bun:test, not vitest. Vitest runs on Node and can't load bun:sqlite. vitest.config.ts excludes this tree.
  • Adding a Node CLI the agent invokes at runtime (like agent-browser, claude-code, vercel) → put it in the Dockerfile's pnpm global-install block, pinned to an exact version via a new ARG. Don't use bun install -g — that bypasses the pnpm supply-chain policy.
  • Changing the Dockerfile entrypoint or the dynamic-spawn command (src/container-runner.ts line ~301) → keep exec bun ... so signals forward cleanly. The image has no /app/dist; don't reintroduce a tsc build step.
  • Changing session-DB pragmas (container/agent-runner/src/db/connection.ts) → journal_mode=DELETE is load-bearing for cross-mount visibility. Read the comment block at the top of the file first.

CJK font support

Agent containers ship without CJK fonts by default (~200MB saved). If you notice signals the user works with Chinese/Japanese/Korean content — conversing in CJK, CJK timezone (e.g., Asia/Tokyo, Asia/Shanghai, Asia/Seoul, Asia/Taipei, Asia/Hong_Kong), system locale hint, or mentions of needing to render CJK in screenshots/PDFs/scraped pages — offer to enable it:

# Ensure .env has INSTALL_CJK_FONTS=true (overwrite or append)
grep -q '^INSTALL_CJK_FONTS=' .env && sed -i.bak 's/^INSTALL_CJK_FONTS=.*/INSTALL_CJK_FONTS=true/' .env && rm -f .env.bak || echo 'INSTALL_CJK_FONTS=true' >> .env

# Rebuild and restart so new sessions pick up the new image
./container/build.sh
launchctl kickstart -k gui/$(id -u)/com.nanoclaw   # macOS
# systemctl --user restart nanoclaw                # Linux

container/build.sh reads INSTALL_CJK_FONTS from .env and passes it through as a Docker build-arg. Without CJK fonts, Chromium-rendered screenshots and PDFs containing CJK text show tofu (empty rectangles) instead of characters.