fix(credential-proxy): proactively refresh expiring Anthropic OAuth tokens (v2 port of #1102) by chiptoe-svg · Pull Request #2363 · nanocoai/nanoclaw

chiptoe-svg · 2026-05-09T13:49:37Z

Scope

This fix is for users on the native credential proxy only. OneCLI users (/init-onecli) have a separate vault gateway that handles credential refresh in its own daemon — src/credential-proxy.ts is not in their request path. The functions added here are no-ops for that audience.

The audience that benefits:

Native credential proxy (/use-native-credential-proxy, or any fork that didn't install OneCLI)
Anthropic OAuth mode (~/.claude/.credentials.json via claude login — not claude setup-token, which already issues long-lived tokens)
Headless deployments (NanoClaw running as systemd --user / launchd, where Claude Code CLI is not running and therefore not refreshing the file)

Summary

Adapts PR #1102 to the current v2 credential-proxy. Same problem, same approach, different file shape.

OAuth tokens from ~/.claude/.credentials.json (or the macOS keychain) issued by claude login expire ~1 hour after issuance. Today the proxy only re-reads the file on cache expiry and trusts whatever's there. On a host that doesn't have Claude CLI actively keeping the file fresh — the typical NanoClaw-as-systemd-service deployment on a Linux server — the file goes stale, the proxy returns an expired access token, and containers start getting 401s with no recovery path.

(claude setup-token issues year-long tokens specifically for unattended use, so single-instructor installs that authenticated that way don't see the bug — but multi-user forks where each user does claude login will hit it within an hour.)

Changes

All in src/credential-proxy.ts (+210/-22):

readFullOAuthCredentials() — reads ~/.claude/.credentials.json first; on macOS, falls back to the Claude Code-credentials keychain entry. Keychain branch is platform-gated (process.platform === 'darwin') so Linux installs are a clean no-op.
saveOAuthCredentials() — atomic write back (tmp + rename, 0600), so process restarts pick up the latest token. Creates ~/.claude with 0700 if missing.
refreshAnthropicOAuthToken() — POST to platform.claude.com/v1/oauth/token with grant_type=refresh_token. Single-flight guarded via a module-level refreshInFlight promise so concurrent callers share one refresh.
getOAuthToken() is now async and triggers a refresh when:
- token is past expiresAt - REFRESH_BUFFER_MS (5 min), OR
- expiresAt is undefined (macOS keychain path doesn't store it — refresh immediately so we learn the real expiry).
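The refresh trigger above can be sketched as a small predicate. This is an illustration, not the PR's code: `needsRefresh` and the `OAuthCredentials` shape are hypothetical, and it assumes `expiresAt` is stored as epoch milliseconds. Only `REFRESH_BUFFER_MS` and `expiresAt` are names taken from the description.

```typescript
// Sketch only: needsRefresh and OAuthCredentials are hypothetical names.
// Assumption: expiresAt is an epoch timestamp in milliseconds.
const REFRESH_BUFFER_MS = 5 * 60 * 1000; // 5-minute buffer, as stated in the PR

interface OAuthCredentials {
  accessToken: string;
  refreshToken?: string;
  expiresAt?: number; // may be missing on the macOS keychain path
}

function needsRefresh(creds: OAuthCredentials, now: number = Date.now()): boolean {
  // No recorded expiry: refresh right away so the real expiry becomes known.
  if (creds.expiresAt === undefined) return true;
  // Otherwise refresh once the clock is inside the buffer window before expiry.
  return now >= creds.expiresAt - REFRESH_BUFFER_MS;
}
```

Passing `now` explicitly keeps the check easy to unit-test without mocking the clock.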

What's preserved

Static tokens from .env (CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_AUTH_TOKEN) still win and are never refreshed.
5-minute refresh buffer (REFRESH_BUFFER_MS) unchanged.
GWS Google OAuth path entirely untouched.
API-key mode entirely untouched.
Codex / OpenAI ChatGPT subscription path entirely untouched (the credential-proxy is not in that path; codex app-server inside the container handles its own refresh).

Differences from #1102

v2 already had a getOAuthToken() and a cachedOAuthToken/cachedExpiresAt pair; reused those instead of inventing new shape (tokenCache interface). Result: smaller surface area.
v2 uses log.warn(...), not logger.warn({...}, '...'). Adapted accordingly.
Single-flight pattern moved to a top-level refreshInFlight (module-level) instead of embedded in startCredentialProxy. Two reasons: cleaner type inference, and getOAuthToken is callable without going through startCredentialProxy (e.g. unit tests).
Skipped the 401 reactive retry from #1102. Proactive refresh prevents 401s in normal operation; the retry is a useful safety net, but it adds non-trivial complexity to the request handler (the upstream response has to be buffered before piping so the request can be retried). Happy to add it in a follow-up if the maintainers want it.
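For readers unfamiliar with the single-flight pattern mentioned above, here is a minimal sketch of a module-level guard. Only the name `refreshInFlight` comes from the PR; `refreshOnce` and the `doRefresh` parameter are hypothetical stand-ins for the real token-endpoint call. The sketch uses `Promise.prototype.finally` to release the guard, which has the same effect on the guard as a try/finally inside an async function.

```typescript
// Module-level guard: at most one refresh runs at a time.
let refreshInFlight: Promise<string> | null = null;

// doRefresh is a hypothetical stand-in for the call to the token endpoint.
function refreshOnce(doRefresh: () => Promise<string>): Promise<string> {
  // A refresh is already running: hand back the same promise to this caller.
  if (refreshInFlight) return refreshInFlight;
  refreshInFlight = Promise.resolve()
    .then(doRefresh)
    .finally(() => {
      // Release the guard whether the refresh succeeded or failed.
      refreshInFlight = null;
    });
  return refreshInFlight;
}
```

Because the guard lives at module scope, any caller (including a unit test) can go through `refreshOnce` without first starting the proxy.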

Test plan

pnpm run build — clean.
pnpm test — 418/418 pass (existing credential-proxy tests still pass; they don't exercise the OAuth refresh path because that requires mocking platform.claude.com, but the static + cached + read-from-file paths are covered).
End-to-end: long-running install holds container alive past 1 hour without 401s. Validating in a downstream fork; will report back.

Risk

Atomic write-back: tmp file is created in the same directory as the credentials file with mode 0600, then renamed. If the process is killed mid-write the partial tmp file is left behind but the live file is untouched.
Single-flight guard: if a rejected refresh promise left refreshInFlight set, the guard would never release. The implementation uses try/finally to clear refreshInFlight regardless of outcome.
macOS keychain execSync: synchronous, so it could block the event loop briefly on the keychain query. In practice this only runs on darwin and only when the file is absent; the typical case is the file path, which is async-friendly.
No new dependencies: only stdlib (child_process, fs, os, path, https).
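The atomic write-back described in the first risk item follows a common tmp-plus-rename recipe. The sketch below shows that recipe under stated assumptions: `writeFileAtomic` is a hypothetical helper, the temp-file naming is invented for illustration, and the PR's `saveOAuthCredentials()` may differ in detail. The demo writes into a throwaway temp directory, not into `~/.claude`.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper illustrating tmp + rename; not the PR's actual code.
function writeFileAtomic(targetPath: string, contents: string): void {
  const dir = path.dirname(targetPath);
  // Create the parent directory privately (0700) if it is missing.
  fs.mkdirSync(dir, { recursive: true, mode: 0o700 });
  // Keep the temp file in the same directory so rename stays on one filesystem.
  const tmpPath = path.join(dir, `.${path.basename(targetPath)}.${process.pid}.tmp`);
  fs.writeFileSync(tmpPath, contents, { mode: 0o600 });
  // rename swaps the file in one step: readers see the old or the new content.
  fs.renameSync(tmpPath, targetPath);
}

// Demo against a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "creds-demo-"));
const demoFile = path.join(demoDir, ".credentials.json");
writeFileAtomic(demoFile, JSON.stringify({ accessToken: "example" }));
```

If the process dies between the write and the rename, only the temp file is left behind, which matches the failure mode the risk note describes.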

🤖 Generated with Claude Code

Revert OneCLI integration and add built-in credential proxy that reads API key or OAuth token from .env, injecting credentials into container API requests without exposing secrets. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Pino was replaced with a built-in logger on main. For branches with baileys (WhatsApp), pino resolves as a transitive dependency of @whiskeysockets/baileys. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Picks up main's changes while preserving native credential proxy:

- Built-in logger replacing pino/pino-pretty
- Removed unused deps (yaml, zod, @vitest/coverage-v8)
- CLAUDE.md template copy fix (nanocoai#1391)
- MAX_MESSAGES_PER_PROMPT config
- Kept credential proxy (not OneCLI) for credential injection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… into HEAD # Conflicts: # src/config.ts # src/container-runner.test.ts # src/container-runner.ts # src/index.ts

…ucket C)

- src/auth-switch.ts: ported from fork; toggles api-key/oauth by commenting/uncommenting ANTHROPIC_API_KEY in .env; adapted logger import to v2's log/log.js convention
- src/credential-proxy.ts: integrated fork's OAuth token refresh logic (getOAuthToken with 5-min buffer, ~/.claude/.credentials.json fallback, in-memory cache) and OpenAI routing (/openai/* prefix); fixed logger → log import to match v2 convention
- src/credential-proxy.test.ts: updated vi.mock from logger.js to log.js

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Port fork's photo/voice/PDF/auth features onto the v2 Chat SDK bridge adapter pattern via an onInbound interceptor chain. Also copies image.ts from the fork (logger import updated to v2 log module). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Recovered from the prior session that ran /add-codex and got SIGTERM'd mid-build. The /add-codex skill had already: - Copied codex provider source files into container/agent-runner/ and src/providers/ - Wired self-registration imports into both barrels - Added codex CLI install to container/Dockerfile Then SIGTERM hit during ./container/build.sh, leaving these in the working tree. Carrying them as their own commit so the history shows the codex install separately from the v2-startup auto-migration that got bundled with them in the original safety pin. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

These deletes happened automatically at the first v2 host startup — src/claude-md-compose.ts:migrateGroupsToClaudeLocal() runs idempotently and renames each group's CLAUDE.md to CLAUDE.local.md (per-group memory the v2 spawn re-composes around). groups/global/ is removed entirely since shared global content moved into container/CLAUDE.md. The renamed CLAUDE.local.md files aren't tracked (they're gitignored under groups/<folder>/), so this commit just records the deletion of the old tracked files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Was untracked at conversation start; bundled into the original safety pin commit by accident. Splitting into its own commit for clarity. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@felix

Brings in the migration tooling that was supposed to seed v2.db from a v1 install but had never been run on this machine. Used in-place with NANOCLAW_MIGRATE_SKIP=preflight,owner,guide,safety,copy,rebuild,verify to seed the (empty) central DB from store/messages.db + .env. Includes: migrate-v2.sh v1→v2 entry point (sibling-clone or in-place) setup/migrate.ts sequencer setup/migrate/*.ts detect/extract/seed/jid/owner/guide modules .nanoclaw-migrations/ audit trail of what was extracted Three seeder bugs were patched in the resulting data after running: - messaging_groups.platform_id stayed in v1 'tg:' format instead of being normalized to v2 'telegram:' format - users.id was 'telegram:tg:<id>' (double-prefixed) — owner-propose bypasses userIdFromJid for is_main fallback path - engage_mode='pattern',pattern='@felix' for v1 requires_trigger=0 case (which means "trigger optional"); should be pattern='.' Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…backup

The v2 rewrite reintroduced OneCLI gateway calls in container-runner and the approvals module, which fail-open with 401s on this install (which runs the native credential-proxy skill, not OneCLI). Without OneCLI auth, every container spawn threw and the agent stopped responding.

Native credential proxy already existed in v2 (src/credential-proxy.ts, PROXY_BIND_HOST in container-runtime.ts) but wasn't wired through to container env injection or to the proxy listen address.

Changes:

- container-runner.ts: drop onecli.ensureAgent / applyContainerConfig; inject ANTHROPIC_BASE_URL and OPENAI_BASE_URL pointing at host.docker.internal:CREDENTIAL_PROXY_PORT so containers route through the proxy with placeholder credentials.
- index.ts: pass PROXY_BIND_HOST to startCredentialProxy so on Linux the proxy binds where containers can actually reach it (docker0 IP or 0.0.0.0 fallback), not just 127.0.0.1.
- modules/approvals/index.ts: stop starting the OneCLI long-poll approval handler; it 401s on app.onecli.sh and the credential approval flow isn't used here.

Plus periodic central-DB backup (the original ask):

- db/backup.ts: SQLite online .backup() to data/backups/, ring of 60 timestamped files (~1 hour at sweep cadence). Failures logged, never thrown; must not break the sweep.
- host-sweep.ts: call backupCentralDb() at the start of each tick.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
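The "ring of 60 timestamped files" in the backup note can be illustrated with a small pruning helper. This is a sketch, not the code in db/backup.ts: `filesToPrune` and `MAX_BACKUPS` are hypothetical names, and it assumes backup filenames embed a timestamp that sorts chronologically as plain text.

```typescript
// Hypothetical sketch of ring pruning; names are not taken from db/backup.ts.
// Assumption: filenames carry timestamps that sort chronologically as strings.
const MAX_BACKUPS = 60;

function filesToPrune(names: string[], keep: number = MAX_BACKUPS): string[] {
  // Oldest first after a plain lexicographic sort.
  const sorted = names.slice().sort();
  // Everything beyond the newest `keep` entries is eligible for deletion.
  return sorted.slice(0, Math.max(0, sorted.length - keep));
}
```

A caller would delete the returned files after each successful backup, wrapping the deletes in error handling so a failure never breaks the sweep.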

Pre-commit format:fix hook auto-reformatted these during a separate commit; carrying the diff into git as its own change so future diffs on these files don't carry unrelated noise. No semantic changes — purely line-collapse and import reflow. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two issues kept OpenAI tools (image gen, etc.) failing in containers even after the native-proxy port:

1. OPENAI_BASE_URL was set to .../v1, but the proxy multiplexes providers via path prefix /openai/* (credential-proxy.ts:111). With no /openai prefix, the proxy treated requests as Anthropic and forwarded /v1/chat/completions to api.anthropic.com. Fix: set OPENAI_BASE_URL to .../openai/v1 so the proxy strips /openai and forwards /v1/<endpoint> to api.openai.com.
2. OPENAI_API_KEY was never set in container env. OpenAI SDKs refuse to initialize without it even when OPENAI_BASE_URL is overridden (the SDK's own env-presence check, not server-side). Set a placeholder so the SDK is happy; the proxy substitutes the real key in the Authorization header before forwarding upstream.

Verified end-to-end: container makes POST to host:3001/openai/v1/... with Authorization: Bearer placeholder, proxy returns a valid chatcmpl-* response from gpt-4o-mini.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
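The prefix multiplexing described in this commit can be sketched as a pure routing function. The hostnames and the `/openai` prefix come from the commit message; `routeRequest` and the `Upstream` type are hypothetical, and the real logic in credential-proxy.ts may be structured differently.

```typescript
// Hypothetical sketch of path-prefix routing; not the code in credential-proxy.ts.
interface Upstream {
  host: string;
  path: string;
}

function routeRequest(urlPath: string): Upstream {
  const prefix = "/openai";
  // Match the prefix only at a path-segment boundary.
  if (urlPath === prefix || urlPath.startsWith(prefix + "/")) {
    // Strip the prefix so /openai/v1/... reaches the upstream as /v1/...
    return { host: "api.openai.com", path: urlPath.slice(prefix.length) || "/" };
  }
  // No prefix: treated as an Anthropic request, path forwarded unchanged.
  return { host: "api.anthropic.com", path: urlPath };
}
```

This also shows why the original misconfiguration failed: a base URL ending in `/v1` with no `/openai` prefix falls through to the Anthropic branch.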

This install runs the native credential proxy (src/credential-proxy.ts), not the OneCLI gateway. Earlier commits in this branch (72422af, 58edbc5) removed OneCLI from the runtime path; this commit removes the rest. Removed: - src/modules/approvals/onecli-approvals.ts (handler module — was no longer started; deleted) - @onecli-sh/sdk dependency from package.json (lockfile regenerated; -1 package, no transitives needed elsewhere) - ONECLI_URL / ONECLI_API_KEY exports from src/config.ts - resolveOneCLIApproval / ONECLI_ACTION import + branch in src/modules/approvals/response-handler.ts (always returned false once the handler stopped registering; removing simplifies the handler down to its DB-backed-approvals path) CLAUDE.md updates: - Dropped the v1→v2 migration "STOP — READ THIS FIRST" banner — migration is complete on this install - Replaced the "Secrets / Credentials / OneCLI" section with a native-proxy explanation matching what the code actually does (proxy bind, container env vars, OAuth handling, rotation, how to add a new provider) - Dropped /init-onecli skill from the operational-skills list - Updated container-runner.ts row in the file table; added a row for src/credential-proxy.ts; dropped the dead src/onecli-approvals.ts row (file never existed at that path on this branch anyway) Verified host still boots clean with no "OneCLI approval handler started" line, TypeScript build passes, agent round-trip still works. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds a per-agent-group "draft" workspace, a web UI for editing it, and glue plumbing (per-group model column, /model and /playground Telegram commands). DB: - migration 014 adds `agent_groups.model` (per-group model override). - createAgentGroup() now actually inserts the model column (was being silently dropped previously, masked by the column not existing). Core library (src/agent-builder/core.ts): - Pure DB+filesystem API for draft lifecycle: createDraft, applyDraft, discardDraft, listDrafts, listAgentGroups, diffDraftAgainstTarget, getDraftStatus. - Channel helpers: ensureDraftMessagingGroup, ensureDraftWiring — auto-create the messaging_group + wiring per draft so test sessions flow through the standard router. - 18 vitest cases + a CLI smoke script (scripts/agent-builder-smoke.ts). Channel adapter (src/channels/playground.ts + public/): - Registers as channel_type='playground'. Each draft gets its own auto-created messaging_group. Test chat reuses the standard router/container/delivery path; adapter.deliver() pushes outbound messages over Server-Sent Events to the connected browser. - Lazy-start: HTTP server NOT bound at host boot. /playground on Telegram calls startPlaygroundServer() which binds the port and issues a magic-link URL. /playground stop closes it. - Magic-link auth: per-restart random token, single-use, sets a 7-day HttpOnly cookie. /playground stop or 30-min idle scrubs the cookie. - 0.0.0.0 by default with magic-link auth; PLAYGROUND_BIND_HOST= 127.0.0.1 forces SSH-tunnel-only access. - Public host autodetected from os.networkInterfaces(), preferring public over private IPv4. PLAYGROUND_PUBLIC_HOST overrides. UI (5 panes via topbar tabs): - Picker: list drafts + non-draft agent groups, create/discard/open. - Chat: SSE-streamed conversation with the draft agent. - Persona: CLAUDE.local.md editor + reload + save. - Skills: enable/disable per-draft, anthropic/skills library browser with compatibility badges (compatible/partial/incompatible). 
Library cached at data/playground/library-cache/. - Files: file tree + textarea editor. Path-traversal guarded. - Diff: side-by-side draft vs target. - Topbar provider toggle: claude/codex; switching kills the running container and bumps sessions.agent_provider so the next message uses the new provider. - Status badge: ● unsaved / ✓ in sync / ⚠ target deleted. /model Telegram command (src/channels/telegram.ts + src/model-switch.ts): - /model — show current provider + model + suggested-models hint list. - /model <name> — persist to agent_groups.model, kill running container so next message uses it. Trust-first: any string accepted, server validates. Provider sync fix (src/container-runner.ts): - ensureRuntimeFields now also writes the resolved provider + model into container.json, so the in-container runner picks the right runtime. Without this, host-side resolveProviderName picked codex correctly but the container's loadConfig fell through to 'claude' because container.json didn't have a provider field. Codex provider: - Default model bumped from gpt-5.4-mini to gpt-5.5. - container/agent-runner/src/index.ts forwards container.json's `model` into CODEX_MODEL/ANTHROPIC_MODEL env so providers honor it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

SKILL.md updated for the Phase 12 multi-tier role system on the classroom branch: - description + summary advertise admin/instructor/TA/student tiers - prerequisites note the Phase 12.1 main-side dependency (gate signature change, commit 0441eaf) — needed for role-aware playground gating - copy list adds class-pair-instructor.ts and class-pair-ta.ts - imports list grows to five lines (greeting + instructor + ta + playground-gate + container-env) - provision example shows --instructors and --tas flags - "What members experience after pairing" section split by role (student / TA / instructor get different greeting text) - customization section explains where each role's persona lives + that the class-shared.md is symlinked from data/ REMOVE.md and VERIFY.md don't need changes — they already describe the file set as a list rather than enumerating individual files, and the verify script just checks tsc/tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two changes: 1. plans/gws-mcp.md (new) — Phase 13 design. A thin Node MCP host-side using per-API @googleapis/* packages (small, Google-published, no monolith bloat), fronted by a per-agent-scoping relay. V1 surface is exactly two tools — drive_doc_read_as_markdown and drive_doc_write_from_markdown — closing the gap rclone leaves (rclone gives binary .gdoc pointers; this gives editable text). Reuses ~/.config/gws/credentials.json (already minted). Architecture rejects three alternatives explicitly: - @googleworkspace/cli backend → subprocess overhead, no benefit over googleapis directly. - googleapis monolith → 250+ auto-generated clients, dragged the VPS to a halt at install time. Per-API packages instead. - Community Python MCP (taylorwilsdon's) → adds Python runtime and we don't control the surface; usable but not preferred. Per-agent role scoping is the security boundary (uses existing canAccessAgentGroup primitive from Phase 12). V2 expansions (Sheet/Calendar/Gmail) gated on actual use cases. 2. Remove .claude/skills/add-gmail-tool/ and .claude/skills/add-gcal-tool/ (the OneCLI-only Google MCP wrappers). Both required the OneCLI gateway to inject OAuth tokens; this install uses the native credential proxy and never installed OneCLI. The skills couldn't run here. Phase 13's /add-gws-tool will replace them with a working skill that uses the credential proxy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

V1 of the Google Workspace MCP layer. Two tools, all of the auth plumbing lives in the existing credential proxy. src/credential-proxy.ts: - New `/googleapis/*` route. Strip prefix, forward to googleapis.com. - Reads ~/.config/gws/credentials.json (authorized_user OAuth format with refresh_token). Caches access_token in memory until 5 min before expiry. - On miss, POSTs to oauth2.googleapis.com/token with grant_type= refresh_token to mint a new access token. Standard Google OAuth — no library needed; raw https.request keeps the proxy single- file and dependency-light. - Substitutes Authorization: Bearer placeholder → real token on every forwarded request. - 502 with actionable message if no creds present. src/container-runner.ts: - Inject GWS_BASE_URL=http://<gateway>:3001/googleapis at spawn, alongside the existing ANTHROPIC_BASE_URL / OPENAI_BASE_URL. container/agent-runner/src/mcp-tools/gws.ts: - drive_doc_read_as_markdown({ fileId }): GETs Drive's export endpoint with mimeType=text/markdown, returns the markdown. - drive_doc_write_from_markdown({ markdown, title?, fileId? }): multipart upload to Drive's resumable-upload endpoint with metadata { mimeType: application/vnd.google-apps.document }. POST creates a new Doc; PATCH (when fileId given) replaces. Returns { fileId, webViewLink, name }. - Uses fetch() against GWS_BASE_URL with Authorization: Bearer placeholder. The proxy substitutes the real token. container/agent-runner/src/mcp-tools/index.ts: appends `import './gws.js';`. container/skills/google-workspace/SKILL.md: rewritten end-to-end. Was previously documenting a `gws` CLI that wasn't actually installed in the Dockerfile (Felix has been reading misleading instructions). Now describes the two MCP tools above + workflow examples + explicit list of what's NOT in V1 (Sheets / Calendar / Gmail / Slides come later, gated on real use cases). Phase 13.3 (per-agent role gating at the proxy URL layer) is deferred — V1 is full-access for the instructor. 
When class roles need it, we add URL-pattern matching to the proxy that consults canAccessAgentGroup. 345/345 host tests green, host + container tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Codex CLI (and similar tools) look for AGENTS.md at the project root. This file imports CLAUDE.md so all the project-level instructions (architecture, file map, conventions, supply-chain rules, gotchas) apply regardless of which agent is editing — none of NanoClaw's structure depends on whether the developer is using Claude Code or Codex. Adds a small Codex-specific notes section covering: - apply_patch for edits (vs. Claude Code's Edit tool) - bash + ripgrep for search (vs. Grep tool) - cat/head/sed for reading (vs. Read tool) - update_plan is the in-session widget; plans/<feature>.md is the durable on-disk plan that survives sessions - Pre-commit prettier hook leaves uncommitted reformat output — every agent hits this; commit a follow-up "chore: apply prettier formatting" when it bites - Push proactively at phase boundaries Everything that doesn't change between Claude Code and Codex (architecture, file paths, no-stash rule, supply-chain policy, container/host runtime split, branch model) is just listed explicitly so a Codex-driven session doesn't second-guess the existing rules. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

src/gws-auth.ts — reusable OAuth helpers: - loadOAuthClient: read existing client_id/secret from ~/.config/gws/credentials.json - buildAuthorizationUrl: Google OAuth consent URL with prompt= consent + access_type=offline (required to get a refresh_token back; without these Google omits it on re-auth) - exchangeCodeForTokens: code → access_token + refresh_token via POST to oauth2.googleapis.com/token - writeCredentialsJson: atomic write at 0600 with merged client_id/secret + new tokens. Defensive against missing refresh_token (preserves old if Google declines to issue new). No HTTP server, no CLI logic — pure helpers, reusable across the one-off CLI today and Phase 14's magic-link server. scripts/gws-authorize.ts — one-off CLI that wraps the helpers: - Spins up a localhost HTTP server (default :8765) - Prints the consent URL for the user to open - Receives Google's redirect, exchanges code, writes credentials - Documents the SSH-port-forward workflow for VPS setups Solves "Google OAuth not configured" / 502 errors when the cached refresh token has expired or been revoked (typical after ~6 months of disuse for unverified clients, or when the user revokes access in Google Account settings). plans/gws-mcp.md — adds Phase 14 section: - Per-student Google OAuth, mirroring Phase 9's Codex auth pattern - student_google-auth.ts storage (analog to student-auth.ts) - Magic-link flow added to existing student-auth-server (port 3003) - Per-student bearer lookup in credential-proxy keyed on agent group's student_user_id metadata - /gauth Telegram command (analog to /login) - GCP Console one-time: add NANOCLAW_PUBLIC_URL/google-auth/callback as authorized redirect URI Why Phase 14 matters: Phase 13's V1 routes every agent's GWS calls through the instructor's bearer. Single-instructor case it's fine. For class deploy it's a real boundary problem — student agents could read instructor's Docs by guessing fileIds. 
Per-student OAuth makes Google enforce the boundary instead of relying on URL parsing in our proxy. Today's gws-authorize.ts is the foundation: when Phase 14 lands, the magic-link flow imports the same exchangeCodeForTokens + writeCredentialsJson helpers; only the storage path and redirect URI differ. 345/345 tests green, tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
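The consent URL that buildAuthorizationUrl is described as producing can be sketched as follows. The `prompt=consent` and `access_type=offline` parameters come from the commit message and the endpoint is Google's standard OAuth 2.0 authorization endpoint; the function name `buildConsentUrl`, its signature, and the example redirect path are assumptions made for illustration.

```typescript
// Hypothetical sketch; the real buildAuthorizationUrl in src/gws-auth.ts may differ.
function buildConsentUrl(clientId: string, redirectUri: string, scopes: string[]): string {
  const params = new URLSearchParams({
    client_id: clientId,
    redirect_uri: redirectUri,
    response_type: "code",
    scope: scopes.join(" "),
    // Both are needed for Google to return a refresh_token on re-authorization.
    access_type: "offline",
    prompt: "consent",
  });
  return `https://accounts.google.com/o/oauth2/v2/auth?${params.toString()}`;
}

// Example with placeholder values (the /callback path is an assumption).
const consentUrl = buildConsentUrl(
  "example-client-id",
  "http://localhost:8765/callback",
  ["https://www.googleapis.com/auth/drive"],
);
```

The user opens `consentUrl` in a browser; Google then redirects to the local server with a `code` query parameter for the token exchange.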

Replaces the always-on "Web hosting" block in container/CLAUDE.md with a discoverable skill. The old block forbade `cloudflared`/`ngrok`/ `localtunnel`, but agents routed around it via `npx cloudflared` and their own `node server.js`. Piling on more prohibitions wasn't helping. The new skill names the loophole tools explicitly (npx, npm exec, trycloudflare.com, pages.dev, etc.) and pairs the publish recipe with positive design guidance — typography, color, motion, layout — adapted from Anthropic's frontend-design skill (Apache 2.0, attributed). The goal is to make the right path the obvious path, not just the permitted one. Shared prompt drops from ~35 lines on this topic to one pointer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Bug: when an outbound message had an odd count of `*` or `_`, the legacy- Markdown sanitizer dropped EVERY occurrence of those chars to keep Telegram's parser happy. That silently mangled URLs whose path contained an underscore — e.g. `http://host/telegram_main/the-view/` became `http://host/telegrammain/the-view/` after sanitize, and the user got a 404 from a link they couldn't have typoed (they clicked it). Fix: backslash-escape stray `\\*`/`\\_` instead of dropping them. Telegram's legacy Markdown renders `\\_` as a literal underscore, so URLs survive verbatim. Same logic for `\\*`. Even-balanced messages still pass through untouched, so legitimate `_italic_` and `*bold*` rendering is preserved. This unblocks every group folder slug containing an underscore, including the classroom convention (`student_01`, `ta_01`, `instructor_01`). Regression test added for the original `telegram_main` failure case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
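A minimal sketch of the escaping behaviour described above. It assumes one reading of "stray": when a marker character appears an odd number of times, every occurrence of that character is backslash-escaped, and balanced markers are left alone. The function name `escapeUnbalancedMarkdown` is hypothetical and the actual sanitizer may be more selective.

```typescript
// Hypothetical sketch; assumes all occurrences are escaped when the count is odd.
function escapeUnbalancedMarkdown(text: string): string {
  let out = text;
  for (const ch of ["*", "_"]) {
    const count = out.split(ch).length - 1;
    if (count % 2 === 1) {
      // Escape with a backslash so the character survives as a literal.
      out = out.split(ch).join("\\" + ch);
    }
  }
  return out;
}
```

With this approach a URL such as `http://host/telegram_main/the-view/` keeps its underscore (escaped), while `_italic_` and `*bold*` pass through unchanged.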

Two follow-ups from a real-world miss where the agent (a) didn't invoke the skill and used a tunnel anyway, and (b) sent the URL before writing the files, giving the user a blank page. - container/CLAUDE.md: shared prompt now says "first action is `Skill: make-website`" instead of "invoke the skill" — imperative, hard to read as optional. Names trycloudflare/ngrok explicitly so an ambitious agent can't loophole into them. - skill: publish recipe is now a 4-step ordered list with an explicit curl verification before sending the URL. The prior wording let the agent post a URL optimistically while assets were still being written. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Container skill that turns an agent into the librarian for a persistent, interlinked markdown wiki under /workspace/agent/wiki/, with raw inputs under /workspace/agent/sources/. Implements the three operations from the pattern: - Ingest: per-source, sequential, 5–15 pages touched per source (summary + entities + concepts + cross-refs + index + log). - Query: read index.md first, synthesize with citations, file noteworthy answers back as new pages. - Lint: contradictions, orphans, stale claims, missing cross-refs, data gaps. Append findings to log.md. Spelled out so the agent doesn't fall back to RAG-style "summarize each file in isolation" behavior — the whole point is per-source integration into a compounding artifact, not parallel skim. Installed via /add-karpathy-llm-wiki. Per-group activation requires scaffolding wiki/ + sources/ trees and a CLAUDE.local.md section (both gitignored under groups/*); this commit only ships the skill. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…rk cross-provider Codex doesn't have Claude Code's discoverable Skill tool. With only CLAUDE.md/CLAUDE.local.md inlined into baseInstructions, agents running on Codex couldn't act on phrases like "your first action is Skill: make-website" — the tool didn't exist, the skill bodies weren't loaded, and the per-group prompt's references just dangled. This adds composeAvailableSkills(): scans the per-group skill symlinks at /home/node/.claude/skills/ (the same set Claude Code sees, scoped by container.json's skill selection), parses each SKILL.md's frontmatter for name + description, and emits a markdown discovery list as part of baseInstructions. The list directs Codex agents to Read /app/skills/<name>/SKILL.md when a description matches the user's request — mirroring the "lazy-load full body" approach Claude Code uses internally rather than inlining tens of KB up front. Net effect: persona, CLAUDE.local.md, and the skill catalog all work the same on Claude or Codex (or any future non-Claude provider that uses the same agent-runner shim). Switching providers is now a config change, not a content rewrite. Tests cover frontmatter parsing edge cases (missing description, missing name field, no frontmatter at all), determinism (alphabetical sort), and the empty-dir/no-eligible-skills paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
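The discovery-list idea can be sketched with two pure functions: one that pulls `name` and `description` out of a SKILL.md frontmatter block, and one that renders a sorted list. Both function names and the exact output wording are hypothetical, and the parser handles only simple `key: value` lines; the real composeAvailableSkills() also scans the skills directory, which is omitted here.

```typescript
// Hypothetical sketch; not the actual composeAvailableSkills() implementation.
interface SkillMeta {
  name: string;
  description: string;
}

function parseSkillFrontmatter(markdown: string): SkillMeta | null {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!match) return null; // no frontmatter at all
  const fields: Record<string, string> = {};
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx > 0) fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  const name = fields["name"];
  const description = fields["description"];
  // Skills missing either field are skipped in this sketch.
  if (!name || !description) return null;
  return { name, description };
}

function composeSkillList(skills: SkillMeta[]): string {
  return skills
    .slice()
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0)) // deterministic order
    .map((s) => `- ${s.name}: ${s.description} (read /app/skills/${s.name}/SKILL.md)`)
    .join("\n");
}
```

Listing only names and descriptions keeps the prompt small; the agent reads the full SKILL.md only when a description matches the request.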

Switching providers required editing container.json AND updating sessions.agent_provider in v2.db AND stopping the running container. Three places, three different commands; forgetting any one leaves the system in a half-switched state (running container still on the old provider, or session row disagreeing with config file). `pnpm exec tsx scripts/switch-provider.ts <group> <provider>` does all three in order, prints what changed, and is idempotent (no-op when already on the requested provider). Resolves the group folder to its agent_groups row, updates sessions.agent_provider for every session in that group, and stops every running container whose name matches the group prefix. Provider name is intentionally not whitelisted — registered providers are an open set determined at runtime by which provider modules the barrel imports, and this script shouldn't gate that. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds `/provider` to the Telegram slash-command registry, mirroring `/model` and `/auth`: /provider — show current provider + hint list /provider codex — switch group to Codex /provider claude — switch back to Claude Behind it, factor the switch logic out of scripts/switch-provider.ts into src/provider-switch.ts (setProvider/getCurrentProvider/listProviderHints) so the CLI and the Telegram handler share one implementation. Trust-first: any string is accepted; an unregistered provider surfaces server-side at next spawn rather than being whitelisted at command time. Idempotent — already-on-provider returns ok:false reason:no-change so the chat reply can say "no change" honestly instead of misleading "switched". Update path is atomic across all three places provider state lives: container.json, sessions.agent_provider, and any running container. Tests cover container.json read/write, no-change path, no-container-json path, group-not-found path, sessions.agent_provider update, and that unrelated container.json fields (skills, packages, mcpServers) survive the switch. Uses TEST_GROUPS_DIR env to point at a tmpdir without mocking the config module. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pre-existing whitespace cleanup from the GWS work. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Conflicts: .claude/skills/add-gmail-tool/SKILL.md, CLAUDE.md, container/Dockerfile, migrate-v2.sh, package.json, pnpm-lock.yaml, setup/verify.ts, src/index.ts

Upstream's AgentGroup interface requires model: string|null, but a few callers (channel-approval.ts:272, agent-route.test.ts cross-agent-group guard, host-core.test.ts ~10 spots) construct AgentGroup objects without that field. Build broke after merging upstream/main; this fixes the call sites with model: null. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…okens

OAuth tokens stored in `~/.claude/.credentials.json` (or the macOS keychain) and issued by `claude login` expire ~1 hour after issuance. Today the proxy only re-reads the file when the in-memory cache hits expiry, then trusts whatever's in the file. On a host that doesn't have Claude CLI actively keeping the file fresh (the typical NanoClaw-as-systemd-service deployment on a Linux server), the file goes stale, the proxy returns an expired access token, and containers start getting 401s with no recovery path.

Adds a self-sufficient refresh flow that the proxy owns:

- `readFullOAuthCredentials()`: reads `~/.claude/.credentials.json` first; on macOS, falls back to the `Claude Code-credentials` keychain entry. Keychain branch is platform-gated (`process.platform === 'darwin'`) so Linux installs are a clean no-op.
- `saveOAuthCredentials()`: atomic write back to the credentials file (tmp + rename, 0600), so process restarts pick up the latest token.
- `refreshAnthropicOAuthToken()`: POSTs to platform.claude.com's `/v1/oauth/token` with `grant_type=refresh_token`. Single-flight guarded so concurrent in-flight requests share one refresh.
- `getOAuthToken()` is now async and triggers a refresh when:
  * the token is past `expiresAt - REFRESH_BUFFER_MS` (5 min), or
  * `expiresAt` is undefined (the macOS keychain path doesn't store it, so refresh now to learn the real expiry).

Static tokens from `.env` (`CLAUDE_CODE_OAUTH_TOKEN` / `ANTHROPIC_AUTH_TOKEN`) still win and are never refreshed. The Google OAuth path is unchanged.

Adapted from PR nanocoai#1102, which was authored against v1; ported to v2's `credential-proxy.ts` shape and naming.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
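The two refresh triggers and the single-flight guard described in this commit can be sketched as follows. This is a minimal illustration, not the code in `src/credential-proxy.ts`: `needsRefresh`, `singleFlight`, and the `OAuthCredentials` shape are hypothetical names, and `expiresAt` is assumed to be epoch milliseconds. Only the 5-minute `REFRESH_BUFFER_MS` and the two trigger conditions come from the commit message.

```typescript
// From the commit message: refresh 5 minutes before expiry.
const REFRESH_BUFFER_MS = 5 * 60 * 1000;

// Hypothetical credential shape; expiresAt assumed to be epoch milliseconds.
interface OAuthCredentials {
  accessToken: string;
  refreshToken: string;
  expiresAt?: number; // absent on the macOS keychain path
}

// True when expiry is unknown or the token is inside the refresh buffer.
function needsRefresh(creds: OAuthCredentials, now: number = Date.now()): boolean {
  if (creds.expiresAt === undefined) return true;
  return now >= creds.expiresAt - REFRESH_BUFFER_MS;
}

// Single-flight wrapper: while one call is pending, every other caller
// receives the same promise instead of starting a second refresh.
function singleFlight<T>(fn: () => Promise<T>): () => Promise<T> {
  let inFlight: Promise<T> | null = null;
  return () => {
    if (inFlight !== null) return inFlight;
    const p = fn();
    inFlight = p;
    // Clear the slot on success or failure so a later call can retry.
    const clear = () => {
      inFlight = null;
    };
    p.then(clear, clear);
    return p;
  };
}
```

Wrapping the refresh call this way, for example `const refresh = singleFlight(doRefresh)` with a hypothetical `doRefresh`, means that several concurrent proxy requests seeing a near-expiry token trigger one POST rather than one each.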

gavrielc and others added 30 commits March 24, 2026 17:52

Merge branch 'main' into skill/native-credential-proxy

8baee05

Merge branch 'main' into skill/native-credential-proxy

0a144bc

Merge branch 'main' into skill/native-credential-proxy

aa55f05

chore: remove direct pino/pino-pretty dependency

a674f5f

Pino was replaced with a built-in logger on main. For branches with baileys (WhatsApp), pino resolves as a transitive dependency of @whiskeysockets/baileys. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge remote-tracking branch 'upstream/skill/native-credential-proxy' into HEAD

9d90462

Conflicts: src/config.ts, src/container-runner.test.ts, src/container-runner.ts, src/index.ts

skill: apply add-telegram from channels branch

9e93b10

chore: apply gitleaks security (Bucket H)

aae8c3f

feat: add personas library (Bucket B)

ed2f202

feat: add image-gen, pdf-reader, google-workspace container skills (Bucket C)

a68a22a

feat: port voice transcription (Bucket F)

fe87000

docs: add student setup and playground guides (Bucket J)

2e28263

ci: disable upstream-only workflow triggers (Bucket I)

ac4dd2e

feat: port web hosting and remote-control (Bucket G)

86e34c9

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat: wire /auth command to auth-switch module

16c1848

chore: carry migration guide into v2 tree

6f845a9

fix: resolve build errors (logger→log, proxyServer scope, sharp dep)

c03b487

chore: upgrade to upstream 941a75f

4c39911

chore: track raccoon-unicycle.png test asset

82a79be

Was untracked at conversation start; bundled into the original safety pin commit by accident. Splitting into its own commit for clarity. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chiptoe-svg and others added 26 commits May 6, 2026 00:40

Merge branch 'gws-tool'

1d74ad2

Update repository URL in Quick Start section

0f6ad74

style(gws-auth): collapse path.join onto one line

6d3e8ce

Pre-existing whitespace cleanup from the GWS work. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Merge remote-tracking branch 'upstream/main'

e10b930

Conflicts: .claude/skills/add-gmail-tool/SKILL.md, CLAUDE.md, container/Dockerfile, migrate-v2.sh, package.json, pnpm-lock.yaml, setup/verify.ts, src/index.ts

Update README.md

d802991

Add files via upload

e469b0a

Update README.md

3439416

Update repository URL in Quick Start section

db9d459

docs(readme): drop v1→v2 migration block (this fork is already on v2)

4e7174b

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Merge branch 'main' of https://github.com/chiptoe-svg/nanoclaw_gccourse

ea37ad6

fix codex provider contracts

b09abb4

chiptoe-svg requested review from gabi-simons and gavrielc as code owners May 9, 2026 13:49

This was referenced May 10, 2026

🦞 OpenClaw Ecosystem Daily 2026-05-10 gsscsd/big_model_radar#321

Open

🦞 OpenClaw Ecosystem Daily 2026-05-10 ivanweng2077/big_model_radar#21

Open

chiptoe-svg wants to merge 86 commits into nanocoai:main from chiptoe-svg:fix/oauth-active-refresh


2 participants
