02 runtime architecture
github-actions[bot] edited this page Feb 20, 2026 · 1 revision
At runtime, a typical interactive request flows like this:
- Client sends a user message to LocalBuddy (`POST /message`).
- LocalBuddy may answer directly or enqueue a request into Server (`/requests/enqueue`).
- RemoteBuddy claims the request (`/requests/claim`), plans work, and emits status/messages.
- RemoteBuddy may enqueue a job (`/jobs/enqueue`).
- WorkerPals claims and executes it (`/jobs/claim` -> run -> complete/fail).
- WorkerPals enqueues completion metadata (`/completions/enqueue`).
- SourceControlManager claims the completion and integrates it.
- Server emits session events over SSE/WS so the UI can render the full lifecycle.
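The enqueue/claim handoff that repeats through this flow can be sketched as follows. The endpoint names in the comments come from the list above; the in-memory `Queue` class is a hypothetical stand-in for the Server's persisted queues, not the actual implementation.

```python
from collections import deque

class Queue:
    """Minimal enqueue/claim queue (illustrative only)."""

    def __init__(self):
        self._items = deque()
        self._claimed = {}  # item_id -> claiming component

    def enqueue(self, item_id, payload):
        # Mirrors e.g. POST /requests/enqueue or /jobs/enqueue.
        self._items.append((item_id, payload))

    def claim(self, claimant):
        # Mirrors e.g. POST /requests/claim: oldest item first,
        # recording who claimed it.
        if not self._items:
            return None
        item_id, payload = self._items.popleft()
        self._claimed[item_id] = claimant
        return item_id, payload

# LocalBuddy enqueues a request; RemoteBuddy claims it.
requests = Queue()
requests.enqueue("req-1", {"message": "hello"})
claimed = requests.claim("remotebuddy-1")
```

The same enqueue/claim shape repeats for jobs (WorkerPals) and completions (SourceControlManager), which is what keeps the components decoupled.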
Three boundaries matter most during design and debugging:
- Planning boundary: LocalBuddy/RemoteBuddy decide what should be done.
- Execution boundary: WorkerPals decides how planned work is executed.
- Integration boundary: SourceControlManager decides whether and how execution output lands on the integration branch.
- Control plane: `apps/server` (queue state, event history, session transport, autonomy APIs).
- Data plane:
  - Worker execution in isolated worktrees/containers (`apps/workerpals`).
  - Git integration work in SourceControlManager.
This split limits blast radius: service crashes should not directly corrupt execution worktrees.
Server uses SQLite (`outputs/data/pushpals.db`) for:
- sessions,
- append-only events (cursor replay),
- request queue,
- job queue and logs,
- completion queue,
- autonomy state/snapshots/locks.
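The storage shapes above can be sketched as DDL. The table and column names below are illustrative assumptions; the real `pushpals.db` schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sessions    (id TEXT PRIMARY KEY, created_at INTEGER);
-- Append-only event log; seq doubles as the replay cursor.
CREATE TABLE events      (seq INTEGER PRIMARY KEY AUTOINCREMENT,
                          session_id TEXT, kind TEXT, payload TEXT);
CREATE TABLE requests    (id TEXT PRIMARY KEY, status TEXT,
                          priority TEXT, enqueued_at INTEGER);
CREATE TABLE jobs        (id TEXT PRIMARY KEY, status TEXT,
                          priority TEXT, enqueued_at INTEGER);
CREATE TABLE completions (id TEXT PRIMARY KEY, status TEXT,
                          enqueued_at INTEGER);
""")
```

The important property is that all queue state and the event log live in one durable store, so replay and queue stats come from the same source of truth.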
Important design detail: events are persisted first and broadcast second. This guarantees replay correctness after crashes or reconnects.
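A minimal sketch of persist-first, broadcast-second, assuming a SQLite event table and in-process subscriber callbacks standing in for SSE/WS clients (both are assumptions, not the actual Server code):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (seq INTEGER PRIMARY KEY AUTOINCREMENT,"
             " session_id TEXT, payload TEXT)")

subscribers = []  # hypothetical stand-ins for connected SSE/WS clients

def emit(session_id, payload):
    # 1. Persist first: the event is durable before anyone sees it.
    cur = conn.execute(
        "INSERT INTO events (session_id, payload) VALUES (?, ?)",
        (session_id, payload))
    conn.commit()
    seq = cur.lastrowid
    # 2. Broadcast second: a crash here loses only the push, never the
    #    event itself; clients recover it via cursor replay.
    for push in subscribers:
        push(seq, payload)
    return seq
```

Reversing the order would let a client observe an event that is lost on crash, which breaks replay correctness.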
- If `apps/client` fails: request/job pipelines still run; only user visibility is reduced.
- If `apps/remotebuddy` fails: requests accumulate; workers continue their currently claimed jobs.
- If `apps/workerpals` fails: jobs remain pending/claimed until recovery sweeps run and workers return.
- If `apps/source_control_manager` fails: completions accumulate pending integration.
- If `apps/server` fails: the control plane is unavailable; all components degrade until restart.
Two transport options are supported:
- SSE (`/sessions/:id/events`) with cursor replay (`after` query param).
- WebSocket (`/sessions/:id/ws`), also cursor-aware.
Client libraries choose transport by environment and fall back with reconnect policies.
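Cursor replay on reconnect can be sketched as a query against the event log. The function below models the contract behind the `after` query param on `/sessions/:id/events`; the table layout is an assumption carried over from the persistence notes above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (seq INTEGER PRIMARY KEY AUTOINCREMENT,"
             " session_id TEXT, payload TEXT)")
for p in ("e1", "e2", "e3"):
    conn.execute("INSERT INTO events (session_id, payload)"
                 " VALUES ('s1', ?)", (p,))

def replay(session_id, after=0):
    """Return events with seq > after, oldest first.

    A reconnecting client passes the last seq it saw as `after`
    and receives exactly the events it missed.
    """
    rows = conn.execute(
        "SELECT seq, payload FROM events"
        " WHERE session_id = ? AND seq > ? ORDER BY seq",
        (session_id, after))
    return list(rows)
```

Because events are persisted before broadcast, a client that reconnects with `after=1` is guaranteed to see every later event exactly once, regardless of which transport it was using.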
Both requests and jobs support priority tiers:
- `interactive`
- `normal`
- `background`
Ordering is priority first, then age. Queue stats and SLO summaries are computed from persisted timestamps.
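The "priority first, then age" ordering reduces to a two-part sort key. A sketch (the tier ranking and field names are assumptions consistent with the tiers listed above):

```python
PRIORITY_RANK = {"interactive": 0, "normal": 1, "background": 2}

def claim_order(items):
    # Priority tier first; within a tier, the older enqueued_at wins.
    return sorted(items,
                  key=lambda i: (PRIORITY_RANK[i["priority"]],
                                 i["enqueued_at"]))

queue = [
    {"id": "j1", "priority": "background",  "enqueued_at": 10},
    {"id": "j2", "priority": "interactive", "enqueued_at": 30},
    {"id": "j3", "priority": "interactive", "enqueued_at": 20},
]
ordered = claim_order(queue)
```

Here `j3` is claimed before `j2` (same tier, older) and both before `j1`, even though `j1` is the oldest item overall.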
To trace one unit of work end-to-end, follow:
- `requestId` (request lifecycle),
- `jobId` (execution lifecycle),
- `completionId` (integration lifecycle),
- `sessionId` and event cursor (user-visible timeline).
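One practical way to use these IDs together is to stamp all of them on every log line for a unit of work, so any single line can be joined back to the full lifecycle. The record shape and helper below are hypothetical, not part of the codebase:

```python
# Hypothetical correlation record: one unit of work carries all four
# IDs plus the event cursor.
trace = {
    "requestId":    "req-42",
    "jobId":        "job-42",
    "completionId": "cmp-42",
    "sessionId":    "sess-7",
    "cursor":       128,
}

def grep_key(trace):
    # One string to prepend to every log line, so a single grep for
    # any of the IDs surfaces the whole end-to-end trace.
    return " ".join(f"{k}={v}" for k, v in trace.items())
```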
- Idempotency store in RemoteBuddy to avoid duplicate processing on reconnect.
- Stale-claim recovery sweeps for jobs in Server.
- Lock lease lifecycle for autonomy dispatch (`acquire`, `renew`, `release`).
- Retry policies and bounded attempt counts in WorkerPals and SourceControlManager.
- Worktree isolation per execution job.
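The stale-claim recovery sweep mentioned above can be sketched as a periodic pass over claimed jobs. The lease duration and job field names are assumptions; the real sweep lives in Server:

```python
import time

def sweep_stale_claims(jobs, lease_seconds=300, now=None):
    """Requeue jobs whose claim lease has expired (sketch).

    A worker that crashed mid-job never completes or renews its
    claim, so the sweep returns the job to `pending` for another
    worker to claim.
    """
    now = time.time() if now is None else now
    recovered = []
    for job in jobs:
        if (job["status"] == "claimed"
                and now - job["claimed_at"] > lease_seconds):
            job["status"] = "pending"
            job["claimed_by"] = None
            recovered.append(job["id"])
    return recovered
```

This is what makes the "jobs remain pending/claimed until recovery sweeps run" failure mode self-healing rather than permanent.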
Pros:
- replayable lifecycle for debugging and audits,
- strong failure containment,
- policy-first autonomous execution model.
Cons:
- operational complexity for local development,
- more infrastructure code compared to direct single-agent execution,
- requires disciplined schema/version management across components.
When modifying runtime flow:
- Confirm queue status transitions still form a valid state machine.
- Confirm session events remain replay-safe.
- Confirm idempotency behavior on reconnect/restart.
- Update the corresponding component wiki pages.
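Confirming that status transitions form a valid state machine can itself be automated. The transition table below is a hypothetical example for jobs (the authoritative set lives in `apps/server`); the check rejects any history containing an undeclared transition:

```python
# Hypothetical job status state machine; "claimed" -> "pending"
# covers the stale-claim recovery sweep.
VALID = {
    "pending":   {"claimed"},
    "claimed":   {"running", "pending"},
    "running":   {"completed", "failed"},
    "completed": set(),
    "failed":    set(),
}

def check_history(history):
    """True iff every consecutive pair in the history is a declared
    transition."""
    return all(dst in VALID[src]
               for src, dst in zip(history, history[1:]))
```

A check like this in the test suite turns "confirm the state machine is still valid" from a manual review step into a regression test.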
- Add OpenTelemetry-style trace propagation through request/job/completion IDs.
- Add dead-letter queues for repeatedly failing requests/jobs/completions.
- Add adaptive queue fairness (aging + priority balancing) for long-running background workloads.