ralph-rs — Deterministic Dependencies & Validation, Dynamic Scheduling, Optional Nondeterministic Review
A Rust CLI that orchestrates coding agent harnesses (Claude Code, Codex, OpenCode, Copilot, Goose, Pi) through dependency-DAG plans with test validation, git integration, retry loops, and an optional built-in step-by-step review pipeline.
Determinism framing (post-DAG-redesign, §11). ralph is no longer a flat
"Deterministic Execution Planner." The reproducibility promise is now
per-step: same inputs → same step behavior, and the scheduler's choice
among runnable steps is a deterministic (topological depth, sort_key, short_id) tie-break. A linear plan therefore runs in the authored order, and any plan runs in the same order given identical human
answers, so existing scripts do not regress. Added on top:
dynamic scheduling (when a branch blocks on a human, the order the
remaining branches run depends on human-answer timing) and first-class
nondeterministic review (a separate harness audits steps). The wall-clock
interleave of concurrently-running reviews is explicitly not part of the
reproducibility guarantee (§14.4).
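The tie-break described above can be pictured as a plain lexicographic comparator. The following is a minimal sketch, not the shipped `step_schedule_cmp`: the `RunnableStep` struct, its field types, and the `schedule_cmp` / `pick_next` names are hypothetical, and `sort_key` is compared as a plain byte string here, which may differ from the real fractional-index ordering.

```rust
use std::cmp::Ordering;

/// Hypothetical, simplified view of a runnable step (the real model lives in plan.rs).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RunnableStep {
    pub depth: u32,       // topological depth (0 for roots)
    pub sort_key: String, // fractional-index key (compared bytewise in this sketch)
    pub short_id: String, // plan-unique handle, the final tie-break
}

/// Total order over runnable steps: depth, then sort_key, then short_id.
pub fn schedule_cmp(a: &RunnableStep, b: &RunnableStep) -> Ordering {
    a.depth
        .cmp(&b.depth)
        .then_with(|| a.sort_key.cmp(&b.sort_key))
        .then_with(|| a.short_id.cmp(&b.short_id))
}

/// Picks the next step to run. Because short_id is unique within a plan,
/// the result does not depend on the order of the input slice.
pub fn pick_next(runnable: &[RunnableStep]) -> Option<&RunnableStep> {
    runnable.iter().min_by(|a, b| schedule_cmp(a, b))
}
```

Calling `pick_next` on the same runnable set in any permutation returns the same step, which is the property the per-step reproducibility promise relies on.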
The DAG-redesign design document is docs/dag-redesign.md. It is the
authoritative spec for the dependency-DAG model, the interruption system,
the built-in review pipeline, the scheduler, and the TUI outline/inbox. The
"Important deliberate deviations" below record where the shipped code
intentionally differs from that draft. Two material deviations from §3.2 /
§5 / §3.4 that landed post-redesign: (1) test-then-commit + at-most-one
commit per step (replaces the draft's "commit per iteration, before the
test" — with no per-iteration commits there was no RetryStrategy::Keep
vs. Rollback distinction at runtime and nothing for
--squash-on-complete to collapse, so both were removed in migration
V37); (2) a
retry-exhaustion auto-blocker (a kind=Blocker interruption with
ranked Retry / Mark Failed options instead of the draft's terminal
StepStatus::Failed transition on TestFailed/CommitFailed). Both are
detailed under "DAG redesign — shipped shape" below.
The pre-DAG TUI design spec is TUI-plan.md at the project root. Note:
that document was written before implementation. Its prompt-layer model
(§8/§11), questions storage (§15), and build-phase list still describe the
pre-overhaul shape (per-plan context_prepend, global/project
prefix-suffix pairs, questions_enabled DEFAULT 0); the prompt-overhaul
branch superseded those, and the DAG redesign superseded the flat
step-list/question model on top of that — see "Prompt model", "Key Design
Decisions", and "DAG redesign — shipped shape" below for the current state.
The narrative sections that the overhauls touched have been reconciled in
TUI-plan.md, but the older keybinding tables and ASCII mocks were left as
historical design notes. This file is the authoritative reference for the
project's current state.
- Language: Rust (edition 2024)
- CLI: clap v4 with derive macros + clap_complete for shell completions
- Database: rusqlite with bundled feature (zero system deps)
- Async: tokio (subprocess management, signal handling, TUI)
- TUI: ratatui + crossterm (vim keybindings)
- Serialization: serde + serde_json, chrono (timestamps)
- Platform dirs: dirs crate (XDG-compliant)
- Error handling: anyhow
- IDs: uuid v4, fractional indexing for step ordering
src/
main.rs — Entry point, clap CLI dispatch, resolve_plan helper
cli.rs — Clap command/arg definitions (ValueEnum for Lifecycle, PlanStatus)
config.rs — JSON config loading (~/.config/ralph-rs/config.json), harness definitions
db.rs — SQLite connection, migrations (V1–V37)
plan.rs — Plan/Step/ExecutionLog models, enums (StepStatus incl. derived Blocked overlay; PlanStatus incl. derived Interrupted; ReviewStatus; Interruption domain model)
frac_index.rs — Base-62 fractional indexing for O(1) step reordering
storage.rs — High-level CRUD (plans, steps, step_dependencies + cycle check, short_id mint, interruptions CRUD, corrective-step request bridge, hooks, locks, project prompt)
harness.rs — Harness resolution, subprocess spawning, output parsing
prompt.rs — Step prompt construction (four-layer `Prompts`, bounded "Resolved interruptions" section, retry context, plan context, hooks); DEFAULT_CONTEXT_PREPEND global-prompt seed
review.rs — Built-in nondeterministic review pipeline: separate O(1) read-only reviewer prompt, spawnable detached review subprocess, orchestrator-only `finalize_review`, corrective-step request drain + re-parent, recursion-cap escalation
executor.rs — Single-step execution (spawn harness → test → commit-on-pass; failed attempts preserve the dirty tree and feed `previous_test_output` into the retry; pre-commit-hook failure is treated as a test failure for retry purposes; retry-budget exhaustion on test-fail or commit-hook-fail raises an auto-`Blocker` interruption instead of going terminal; skip parks WIP)
runner.rs — Plan-level orchestrator: the topological **scheduler** (runnable-set + `(depth, sort_key, short_id)` tie-break), impl semaphore=1, detached-review JoinSet drained at scheduler ticks, sole DB writer, status transitions, --all
run_lock.rs — Per-project run lock to prevent concurrent runs
signal.rs — Two-stage Ctrl+C handling (graceful then forceful)
test_runner.rs — Deterministic test execution (shell commands)
git.rs — Git CLI wrappers (branch, commit, diff, rollback)
hook_library.rs — Hook library management (read/write hook markdown files)
hooks.rs — Hook execution engine (lifecycle hooks at pre/post-step, pre/post-test)
plan_harness.rs — AI harness invocation for plan generation (interactive)
export.rs — Plan export to portable JSON
import.rs — Plan import from JSON with override options
preflight.rs — Pre-run environment validation (harness auth, git dirty state, etc.)
output.rs — Output formatting (JSON, plain, color detection, NDJSON events)
commands/
mod.rs — Re-exports, shared helpers (resolve_project/step, init, doctor, confirm)
plan.rs — Plan CRUD, dependency, plan-level hook, plan harness set/show, review-toggle commands
step.rs — Step CRUD, move, edit (agent/harness/criteria/max-retries/review), step-level hooks
run.rs — Status, log (incl. WIP-skip + per-step commits w/ git-note verdict), skip (`--changes`) commands; TUI dispatchers (`run_inbox_tui`, …)
prompt.rs — `ralph prompt set/clear/show` (global/project scope; `.ralph/prompt.md`-aware)
question.rs — `ralph question ask --priority` / `ralph block` (harness raises an interruption)
interruption.rs — `ralph interruption list/show/resolve` (human-side resolve of questions + blockers)
config_cmd.rs — `ralph config show/set-timezone`, `ralph config review set` (global review block)
agents.rs — Agent file CRUD commands
hooks.rs — Hook library CRUD, export/import commands
harness.rs — Read-only harness inspection (`ralph harness list/show`)
tui/
mod.rs — TUI module entry
view.rs — `View` enum (PlanList, ArchivedList, PlanDetail, StepDetail, **Inbox**)
outline.rs — Pure DAG-outline projection (topological depth indent, join `deps:` by short_id, `↳ corrects` marker); shares the runner's `step_schedule_cmp` so outline order == execution order; `z`/`Z` focus (downstream-dependents cone) is a pure view transform
chrome.rs — Persistent top breadcrumb (incl. focus path) + bottom hint/cwd/version bar
theme.rs — Color tokens (truecolor `Color::Rgb` constants)
toast.rs — Transient bottom-row message bar with TTL
dialog.rs — Confirm-dialog primitive (yes/no over a background view)
choice.rs — Generic single-select dialog primitive (vertical j/k/↑/↓ list, Enter/Esc)
editor.rs — `$EDITOR` handoff (round-trip text through a tempfile)
events.rs — NDJSON `RunEvent` subscription wiring (TUI → runner subprocess)
help.rs — `?` help overlay (per-view binding model + render)
palette.rs — `/` / `:` slash-command parser + tab completion
palette_dispatch.rs — Maps parsed palette commands to per-view actions
read_only.rs — Read-only attach state when an external runner holds the lock
run_dialog.rs — `/run` branch-choice dialog (consumes `choice.rs`) + naming phase
skip_dialog.rs — `s` skip change-handling dialog (Stash/Commit/Discard via `choice.rs`; Esc = cancel-restart, no retry budget)
selection.rs — Multi-selection state (with `[N]` badge ordering)
views/
plan_list.rs — Landing screen: tile per plan, sort by recency
archived_list.rs — Same layout as plan_list but for archived plans
plan_detail.rs — Plan-detail view state (drives the DAG outline; `z`/`Z` focus, `I` to inbox)
plan_detail_input.rs — Pure key handler returning `InputAction`s
plan_detail_ui.rs — Plan-detail rendering (right pane + the DAG outline via `outline_view.rs`)
outline_view.rs — DAG outline render that **replaces the flat step list** in plan_detail (depth indent, `deps:`, `↳ corrects <short_id>`, derived `Blocked` overlay, review badges); pure state machine + render split; mouse path resolves through `outline.visible_rows()`
inbox.rs — `View::Inbox` cross-branch interruptions inbox state (open questions + blockers; run-through auto-advance; resolved items kept dimmed)
inbox_ui.rs — Inbox rendering
step_detail.rs — Step-detail pane stack (four layers: Global/Project/Plan/Step prompts, etc.)
step_detail_picker.rs — Bottom-row pickers (harness/model/agent/change_policy)
rendered_prompt.rs — Read-only fully-assembled-prompt preview (`l`/`→` from StepPrompt pane; per-attempt nav)
create_plan.rs — Inline create-plan modal (slug → description → tests)
answer_modal.rs — `InterruptionModal` (ranked proposed answers with the agent's #1 pre-selected + freeform escape hatch + optional comment; blocker variant = resolve / resolve-with-comment, no options; deliberately no "let the agent decide" shortcut) — used by **both** the Inbox and step-detail's inline open-question answer flow (built from a `storage::OpenQuestion` via `InterruptionModal::from_open_question`) — plus the post-answer `ResumeModal` (the separate "resume the run?" prompt). The legacy `AnswerModal` has been removed; both surfaces now render one shared `InterruptionModal` via `inbox_ui::render_interruption_modal`
plan_dependencies.rs — Plan-dependency sub-view (List + Picker modes)
plan_hooks.rs — Plan-hook attachment sub-view
step_hooks.rs — Step-hook attachment sub-view
step_tags.rs — Step tag editor sub-view
The TUI is multi-view (plan list / archived list / plan detail /
step detail / inbox) with sub-views pushed on top for plan
dependencies, plan hooks, step hooks, step tags, and the rendered-prompt
preview. Each view is a self-contained App struct with pure
state-machine methods, plus a separate render function and a per-view
input handler — splitting these three lets us unit-test state
transitions without spinning up a real terminal.
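A minimal sketch of that three-way split, assuming a trivial list view. `ListKey`, `ListAction`, `ListApp`, `handle_list_key`, and `render_list` are hypothetical names for illustration; the real views take crossterm key events and draw ratatui widgets rather than returning strings.

```rust
/// Hypothetical key abstraction; the real handlers receive crossterm key events.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ListKey {
    Down,
    Up,
    Enter,
}

/// What the input handler asks the dispatcher to do.
#[derive(Debug, PartialEq, Eq)]
pub enum ListAction {
    None,
    Open(usize),
}

/// View state only: no terminal handle, no I/O.
pub struct ListApp {
    pub titles: Vec<String>,
    pub cursor: usize,
}

impl ListApp {
    /// Pure state transition: move the cursor down, clamped to the last row.
    pub fn move_down(&mut self) {
        if self.cursor + 1 < self.titles.len() {
            self.cursor += 1;
        }
    }

    /// Pure state transition: move the cursor up, clamped to the first row.
    pub fn move_up(&mut self) {
        self.cursor = self.cursor.saturating_sub(1);
    }
}

/// Input handler: maps a key to a state transition and/or an action.
pub fn handle_list_key(app: &mut ListApp, key: ListKey) -> ListAction {
    match key {
        ListKey::Down => {
            app.move_down();
            ListAction::None
        }
        ListKey::Up => {
            app.move_up();
            ListAction::None
        }
        ListKey::Enter if !app.titles.is_empty() => ListAction::Open(app.cursor),
        ListKey::Enter => ListAction::None,
    }
}

/// Render function: returns plain lines here instead of drawing widgets.
pub fn render_list(app: &ListApp) -> Vec<String> {
    app.titles
        .iter()
        .enumerate()
        .map(|(i, t)| format!("{} {}", if i == app.cursor { ">" } else { " " }, t))
        .collect()
}
```

A unit test can drive `handle_list_key` with a sequence of keys and assert on `cursor` and on the rendered lines without any terminal in the loop.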
DAG outline (replaces the flat step list, §12.1). Plan-detail no
longer renders a flat self.steps list; it renders the DAG outline
(src/tui/outline.rs projection + views/outline_view.rs render):
topologically ordered, indented by depth, each join step listing its
dependencies inline (deps: …) by short_id, reviewer-inserted steps
marked ↳ corrects <short_id>. The outline shares the runner's
step_schedule_cmp, so the drawn order is exactly the execution order.
Blocked is a derived overlay (an open interruption), rendered like
the derived Interrupted plan status — never persisted. Both the
keyboard and the mouse path resolve through outline.visible_rows()
(click selects / second-click enters / scroll moves the outline cursor)
so they share one index space on a non-linear or focused DAG.
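The outline projection can be sketched as a depth computation followed by an indented render. This sketch makes several assumptions: depth is taken as the longest path from a root (the document does not define it), the map is keyed by `short_id`, ordering uses only `(depth, short_id)` and omits `sort_key`, and `topo_depths` / `outline_lines` are hypothetical names. The input is assumed acyclic, as the storage-layer cycle check would guarantee.

```rust
use std::collections::HashMap;

/// Memoized depth of one step: 0 for roots, else 1 + the deepest dependency.
fn depth_of(id: &str, deps: &HashMap<String, Vec<String>>, memo: &mut HashMap<String, u32>) -> u32 {
    if let Some(&d) = memo.get(id) {
        return d;
    }
    let mut depth = 0;
    if let Some(parents) = deps.get(id) {
        for p in parents {
            depth = depth.max(depth_of(p, deps, memo) + 1);
        }
    }
    memo.insert(id.to_string(), depth);
    depth
}

/// Topological depth for every step. `deps` maps a step to its dependencies.
pub fn topo_depths(deps: &HashMap<String, Vec<String>>) -> HashMap<String, u32> {
    let mut memo = HashMap::new();
    for id in deps.keys() {
        depth_of(id, deps, &mut memo);
    }
    memo
}

/// Renders one line per step: indented by depth, with join steps
/// (more than one dependency) listing their dependencies inline.
pub fn outline_lines(deps: &HashMap<String, Vec<String>>) -> Vec<String> {
    let depths = topo_depths(deps);
    let mut ids: Vec<&String> = deps.keys().collect();
    ids.sort_by(|a, b| depths[*a].cmp(&depths[*b]).then_with(|| a.cmp(b)));
    ids.into_iter()
        .map(|id| {
            let indent = "  ".repeat(depths[id] as usize);
            let parents = &deps[id];
            if parents.len() > 1 {
                format!("{indent}{id}  deps: {}", parents.join(", "))
            } else {
                format!("{indent}{id}")
            }
        })
        .collect()
}
```

For a diamond (`a` root, `b` and `c` depending on `a`, `d` depending on both), the join `d` lands at depth 2 and is the only row that carries a `deps:` suffix.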
Focus / re-root (§12.2). z on a step re-roots the outline at that
step's downstream dependents cone (only it and what flows out of it;
upstream context lives in the breadcrumb chrome); Z/Esc pops back
toward the true root(s). Focus nests and is a pure view transform —
no DB writes, no scheduler effect; scheduling still spans the whole DAG.
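The downstream-dependents cone is a reachability query over the reversed dependency edges. A minimal sketch, assuming the same hypothetical `step -> dependencies` map keyed by `short_id`; `downstream_cone` is an illustrative name and the function only reads its input, in keeping with focus being a pure view transform.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Returns the focused step plus everything that transitively depends on it.
/// `deps` maps a step to the steps it depends on.
pub fn downstream_cone(deps: &HashMap<String, Vec<String>>, focus: &str) -> BTreeSet<String> {
    // Invert the edges: parent -> direct dependents.
    let mut children: HashMap<&str, Vec<&str>> = HashMap::new();
    for (step, parents) in deps {
        for parent in parents {
            children.entry(parent.as_str()).or_default().push(step.as_str());
        }
    }

    // Breadth-first walk from the focused step along dependent edges.
    let mut cone = BTreeSet::new();
    let mut queue: VecDeque<&str> = VecDeque::new();
    queue.push_back(focus);
    while let Some(id) = queue.pop_front() {
        if cone.insert(id.to_string()) {
            if let Some(kids) = children.get(id) {
                queue.extend(kids.iter().copied());
            }
        }
    }
    cone
}
```

Nested focus then amounts to calling the same function again on a step inside the current cone; popping back simply discards the innermost result.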
Interruptions inbox (View::Inbox, §12.3). A cross-branch list of
every open question/blocker, decoupled from DAG navigation, reachable
from anywhere via I (Shift-i; lowercase i is a pre-existing
"insert/create" binding so the inbox deliberately uses I) with an
open-count badge. Submitting an answer auto-advances to the next open
interruption (run-through; Esc exits); resolved items stay dimmed for
context. The ranked-answer UI is the InterruptionModal in
answer_modal.rs; the legacy AnswerModal has been removed and
step-detail's inline open-question answer flow now drives the same
InterruptionModal (built from a storage::OpenQuestion via
InterruptionModal::from_open_question, rendered through the shared
inbox_ui::render_interruption_modal). The post-answer ResumeModal
(the "resume the run?" prompt) is unaffected. Palette adds /inbox and
/focus.
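The run-through behavior of the inbox can be sketched as a small state machine. This is an illustration under assumptions: `InboxItem`, `InboxState`, and `resolve_current` are hypothetical names, and the forward search with wrap-around is a guess at how "next open interruption" is chosen.

```rust
/// Hypothetical inbox row: an interruption and whether it has been resolved.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InboxItem {
    pub id: u32,
    pub resolved: bool,
}

pub struct InboxState {
    pub items: Vec<InboxItem>,
    pub cursor: usize,
}

impl InboxState {
    /// Count shown in the open-count badge.
    pub fn open_count(&self) -> usize {
        self.items.iter().filter(|i| !i.resolved).count()
    }

    /// Marks the item under the cursor resolved (it stays in the list, dimmed)
    /// and advances to the next open item. Returns false when none remain.
    pub fn resolve_current(&mut self) -> bool {
        if let Some(item) = self.items.get_mut(self.cursor) {
            item.resolved = true;
        }
        let n = self.items.len();
        // Search forward from the cursor, wrapping around once.
        for offset in 1..=n {
            let idx = (self.cursor + offset) % n;
            if !self.items[idx].resolved {
                self.cursor = idx;
                return true;
            }
        }
        false
    }
}
```

Resolved rows are never removed, so indices stay stable while the user works through the list.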
The step-detail screen exposes the four user-facing prompt layers as
panes (GlobalPrompt / ProjectPrompt / PlanPrompt / StepPrompt) —
the pre-overhaul PlanContextPrepend / PlanPrefix / PlanSuffix panes
are gone. From the StepPrompt pane, l/→ pushes the
RenderedPromptView sub-view (src/tui/views/rendered_prompt.rs): a
read-only preview of the fully-assembled prompt exactly as
prompt::build_step_prompt produces it, with j/k navigating between
per-attempt renders (each attempt re-assembled with the retry context the
executor would have built for it).
Mouse is supported in the list views: in plan_list, archived_list, and plan_detail's DAG outline, a click selects the row, a second click on the already-selected row enters it, and the scroll wheel moves the cursor. The TUI still enables mouse capture (Shift-click bypasses it for native text selection).
The dispatchers live in src/commands/run.rs (run_plan_list_tui,
run_archived_list_tui, run_plan_detail_tui, run_step_detail_tui,
run_plan_dependencies_tui, run_rendered_prompt_tui, run_inbox_tui).
They own the alternate-screen / raw-mode session, the crossterm event
loop, and any DB/storage write-throughs.
Sub-view state machines expose a pure handle_key(KeyEvent) -> Outcome
method; the dispatcher executes the side effect and loops on Pending.
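The sub-view contract can be sketched as follows. Only the `handle_key(...) -> Outcome` shape and the `Pending` variant come from the text above; the other `Outcome` variants, the `SubKey` enum, the `NameInput` view, and the `dispatch` loop are hypothetical stand-ins (the real method takes a crossterm `KeyEvent`).

```rust
/// Result of feeding one key to a sub-view. Only `Pending` is named in the
/// text; `Confirmed` and `Cancelled` are assumed for illustration.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Pending,
    Confirmed(String),
    Cancelled,
}

/// Hypothetical key abstraction standing in for a crossterm KeyEvent.
#[derive(Debug, Clone, Copy)]
pub enum SubKey {
    Char(char),
    Backspace,
    Enter,
    Esc,
}

/// A tiny text-entry sub-view with no I/O of its own.
#[derive(Default)]
pub struct NameInput {
    buf: String,
}

impl NameInput {
    /// Pure: mutates only the view state and reports what happened.
    pub fn handle_key(&mut self, key: SubKey) -> Outcome {
        match key {
            SubKey::Char(c) => {
                self.buf.push(c);
                Outcome::Pending
            }
            SubKey::Backspace => {
                self.buf.pop();
                Outcome::Pending
            }
            SubKey::Enter if !self.buf.is_empty() => Outcome::Confirmed(self.buf.clone()),
            SubKey::Enter => Outcome::Pending,
            SubKey::Esc => Outcome::Cancelled,
        }
    }
}

/// Dispatcher side: keeps looping while the view reports Pending and returns
/// the first terminal outcome, at which point the caller performs the side effect.
pub fn dispatch(view: &mut NameInput, keys: impl IntoIterator<Item = SubKey>) -> Outcome {
    for key in keys {
        match view.handle_key(key) {
            Outcome::Pending => continue,
            done => return done,
        }
    }
    Outcome::Cancelled
}
```

Keeping the write-through in the dispatcher means the view can be tested by replaying keys and asserting on the returned `Outcome`.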
Routing into the TUI is conditional: ralph (no subcommand) and
ralph run with no non-default flags drop into the TUI. Any
non-default flag (--one, --all, --harness, --json, …) keeps
today's non-interactive behavior so scripts don't regress. The
--non-interactive flag and a non-TTY stdout both force the
non-interactive path.
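The routing rule reduces to a single predicate. A minimal sketch, assuming the CLI layer has already summarized the invocation into a few booleans; `Subcommand`, `Invocation`, `Mode`, and `choose_mode` are hypothetical names, not the actual dispatch code in main.rs.

```rust
/// Which entry point was invoked (hypothetical summary of the parsed CLI).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Subcommand {
    None,  // bare `ralph`
    Run,   // `ralph run`
    Other, // any other subcommand
}

pub struct Invocation {
    pub subcommand: Subcommand,
    pub has_non_default_flags: bool, // e.g. --one, --all, --harness, --json
    pub non_interactive_flag: bool,  // --non-interactive
    pub stdout_is_tty: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Mode {
    Tui,
    NonInteractive,
}

/// The TUI is entered only from `ralph` or `ralph run`, with no non-default
/// flags, without --non-interactive, and with stdout attached to a TTY.
pub fn choose_mode(inv: &Invocation) -> Mode {
    let tui_entry = matches!(inv.subcommand, Subcommand::None | Subcommand::Run);
    if tui_entry && !inv.has_non_default_flags && !inv.non_interactive_flag && inv.stdout_is_tty {
        Mode::Tui
    } else {
        Mode::NonInteractive
    }
}
```

Every condition defaults toward the non-interactive path, which is what keeps piped or scripted invocations from ever landing in the alternate screen.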
Runtime communication between the TUI and a TUI-spawned runner is
NDJSON over the runner's stdout (same stream as --json / --jsonl).
See docs/ndjson-events.md for the schema.
The help overlay (?) toggles a centered modal listing the bindings of
the current view, grouped by category. Per-view binding models live in
src/tui/help.rs; each view's App carries a HelpState field whose
intercept_key is consulted before the view's normal input handler so
view bindings don't fire under the overlay.
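The interception order can be sketched like this. `HelpState` and `intercept_key` are named in the text, but the field, the signature, the `HelpKey` enum, the `route_key` helper, and closing the overlay with Esc are all assumptions made for illustration.

```rust
/// Hypothetical key abstraction standing in for a crossterm KeyEvent.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HelpKey {
    Char(char),
    Esc,
}

/// Overlay state carried by each view's App (the `open` field is assumed).
#[derive(Default)]
pub struct HelpState {
    pub open: bool,
}

impl HelpState {
    /// Returns true when the overlay consumed the key.
    pub fn intercept_key(&mut self, key: HelpKey) -> bool {
        match (self.open, key) {
            // `?` opens the overlay from the normal view.
            (false, HelpKey::Char('?')) => {
                self.open = true;
                true
            }
            // `?` toggles it closed; Esc closing it is an assumption.
            (true, HelpKey::Char('?')) | (true, HelpKey::Esc) => {
                self.open = false;
                true
            }
            // While open, every other key is swallowed.
            (true, _) => true,
            // While closed, every other key belongs to the view.
            (false, _) => false,
        }
    }
}

/// Consults the overlay first; the view handler only sees unconsumed keys.
pub fn route_key(help: &mut HelpState, key: HelpKey, view_handler: &mut dyn FnMut(HelpKey)) {
    if !help.intercept_key(key) {
        view_handler(key);
    }
}
```

With this ordering, a `j` pressed while the overlay is open never reaches the view's cursor logic.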
- Deterministic-only: No built-in LLM; plans created manually or via harness delegation
- Multi-harness: Pluggable harness support with different integration patterns (native agent file, env var, prompt injection)
- Git-integrated: All steps are git commits; branches per plan
- Retry strategy: the old RetryStrategy {Keep, Rollback} enum has been removed (migration V37 drops plans.retry_strategy / steps.retry_strategy). Failed attempts always preserve the dirty tree and there is at most one commit per step (commit-on-test-pass), so there was nothing to keep or roll back across attempts. The enum, the per-plan/per-step columns, the --retry-strategy / --clear-retry-strategy CLI flags, and the export/import fields are all gone
- SQLite storage at platform-appropriate data dir (~/.local/share/ralph-rs/ralph.db on Linux)
- JSON config at ~/.config/ralph-rs/config.json (XDG semantics on all platforms)
- Signal-aware: Two-stage Ctrl+C (graceful then forceful) via tokio watch channels
- Fractional indexing: O(1) step insertion without full reindex
- Run locks: SQLite-based per-project lock prevents concurrent ralph run invocations; --force to recover stale locks
- Hook system: Reusable hooks in ~/.config/ralph-rs/hooks/*.md with scope, export/import, and lifecycle attachment
- NDJSON output: --json flag streams structured events during runs; --quiet suppresses progress; --no-color and NO_COLOR respected. The DAG redesign adds review_started, review_finished, corrective_step_requested, corrective_step_inserted, review_loop_escalated, paused_by_user, and summary (alongside the existing attempt_cancelled); see docs/ndjson-events.md. Phase E adds interruption_raised and interruption_resolved; both events fire on every insert / every resolve regardless of who triggered it (harness, executor, TUI, CLI), reversing the pre-Phase-E "no NDJSON for interruptions" stance. auto_raised: bool on interruption_raised discriminates the executor's retry-exhausted auto-blocker (true) from every other path (false); the derived Blocked overlay and the run_locks cross-process bridge remain the durable source of truth for full interruption state (events are advisory notifications, not payloads)
- Skip overhaul: ralph skip --changes <stash|commit|discard> (default stash) and a TUI Choice skip dialog (Stash/Commit/Discard; Esc-cancel restarts the attempt consuming no retry budget) decide what happens to the killed harness's in-flight work. commit writes a [ralph wip] commit carrying a Ralph-Skipped-Step: <id> git trailer; ralph log surfaces those commits and ralph step reset reverts them (confirm / --force). A cross-process skip bridge (plans.skip_requested_step_id / plans.skip_changes, migration V23) lets the TUI/CLI skip a step running inside a separate spawned-runner process
- Shell completions: ralph completions <shell> generates bash/zsh/fish/elvish/powershell
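As one illustration from the list above, the `Ralph-Skipped-Step: <id>` trailer on a `[ralph wip]` commit can be recovered from a commit message with a few lines of string handling. This is a simplified sketch, not how `ralph log` is implemented: `skipped_step_id` is a hypothetical helper, it treats the last paragraph of the message as the trailer block, and it does not reproduce every rule of git's own trailer parsing.

```rust
/// Extracts the value of a `Ralph-Skipped-Step:` trailer from a commit message.
/// Only the final paragraph is scanned, mirroring where git places trailers.
pub fn skipped_step_id(commit_message: &str) -> Option<&str> {
    commit_message
        .trim_end()
        .lines()
        .rev()
        // Walk upward from the last line until the first blank line.
        .take_while(|line| !line.trim().is_empty())
        .find_map(|line| line.strip_prefix("Ralph-Skipped-Step:"))
        .map(str::trim)
}
```

Scanning only the last paragraph avoids false positives when the same text happens to appear in the commit body.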
- Plan = dependency DAG of steps. Every step has a stable plan-unique 8-char short_id (the user-facing handle; the internal UUID is unchanged) and step_dependencies edges (a structural clone of plan_dependencies, with would_create_step_cycle). Roots = steps with no deps. The V25 backfill turns every existing linear plan into a degenerate chain DAG that executes identically. Import mirrors the same backfill for legacy bundles (classification is by short_id presence).
- Topological scheduler + deterministic tie-break. A single dynamic
scheduler (in runner.rs) replaces the linear iterator. It computes the runnable set (every dep Complete and its review returned; not Blocked; not terminal) and picks by (topological depth, sort_key, short_id) (step_schedule_cmp). With no edges every depth is 0 and this is byte-identical to the old "earliest actionable by sort_key", so linear plans don't regress.
- Unified interruptions +
run_locks cross-process bridge. Questions and blockers are one entity/table/state-machine (interruptions, V26, supersedes step_questions). A harness raises one via ralph question ask --priority / ralph block; it binds to the live run via the run_locks table (get_live_run reads run_locks.step_id); there is no RALPH_STEP_ID env var for this (the RALPH_STEP_ID env var exists only for lifecycle hooks, unrelated). An open interruption consumes no retry budget; the scheduler moves to another branch. PlanStatus::Question became the broader derived Interrupted; StepStatus::Blocked is a derived overlay (never persisted, clears on resolution).
- Built-in review pipeline. Off by default; effective =
step.review_enabled ?? plan.review_enabled ?? config.review.enabled ?? false (V27 nullable columns; precedence step > plan > global). The reviewer prompt is separately assembled (not build_step_prompt), O(1): plan/step context + a single git show <sha> diff (Decision 5: no dependency diffs). Concurrency model: reviews run as a detached task (a tokio JoinSet); the orchestrator's single scheduler loop is the sole DB writer and drains finished reviews at scheduler ticks; an implementation semaphore of 1 serializes implement+test+commit; the reviewer runs read-only by contract in a throwaway git worktree pinned at the reviewed SHA (RAII git::ReviewWorktree, Drop cleanup, Git env redirection scrubbed). This is defense-in-depth, not filesystem sandboxing; reviewer harnesses are trusted processes.
- No
StepStatus::AwaitingReview variant. A review-gated step stays InProgress; gating is structural (the re-parented edge to the corrective step / deps-not-satisfied), per §3.3/§10, not status-based. A separate per-step review_status (Pending | InFlight | Passed | Failed | Skipped | Disabled) tracks the verdict.
- Test-then-commit + at-most-one commit per step (deviates from the
design draft's "commit per iteration, before the test"). A commit
happens only on the first attempt whose deterministic tests pass;
failed attempts leave the dirty tree on disk and feed
previous_test_output (and pre-commit hook stderr, treated as a test failure) into the next prompt: no commit, no rollback. Subject ralph <short_id>.<n> - <title> + trailers Ralph-Plan / Ralph-Step / Ralph-Iteration: <n> (n is the attempt number that finally passed; with at most one commit per step, this identifies which attempt succeeded rather than counting commits) / Ralph-Review: pending. The review verdict is still recorded as a git note on refs/notes/ralph-review, not by amending the commit (git::annotate_review_verdict), which is history/tree-safe under concurrency. ralph log and ralph step reset continue to work via the trailers (reset reverts the single commit). Tooling that read Ralph-Iteration as an attempt identifier is still correct; tooling that expected multiple iteration commits per step needs updating.
- Retry-exhaustion auto-blocker (deviates from the design draft's
terminal-Failed transition on
TestFailed/CommitFailed). When a step exhausts its retry budget on test-fail or commit-hook-fail (commit-hook stderr is treated as a test failure), the executor automatically raises a kind=Blocker interruption instead of going terminal. The blocker carries two ranked options (priority 1 = "Retry step with parked changes", priority 2 = "Mark step Failed") and a body of "Step failed after N attempts." plus the last attempt's test output (and hook stderr when applicable). The step's stored status stays Pending with attempts == max_attempts; the derived Blocked overlay shadows it while the blocker is open. The scheduler moves to another runnable branch (consumes no further retry budget). Resolution (TUI inbox or ralph interruption resolve): RETRY_EXHAUSTED_OPTION_RETRY → reset attempts = 0 and status Pending while preserving the parked dirty WIP tree (the failed attempts' on-disk changes are kept, restored from the parked stash on the next pick; this is retry-with-parked-changes, not a fresh start), scheduler re-picks; RETRY_EXHAUSTED_OPTION_FAIL → status Failed terminal (any surviving parked worktree state is discarded); a freeform answer matching neither is treated as retry-with-hint (attempts reset, parked changes preserved; the hint flows into the next prompt via the bounded "Resolved interruptions" section). Other failure modes (HarnessFailed, Timeout, NoChanges) remain terminal Failed. Recognition contract lives in commands::interruption::apply_retry_exhausted_resolution; the option constants RETRY_EXHAUSTED_OPTION_RETRY / RETRY_EXHAUSTED_OPTION_FAIL are pub const in src/executor.rs so the executor (writer), Phase C resolution handler, and TUI all share one source of truth. TerminationReason::PausedForQuestion and StepOutcome::PausedForQuestion are reused (no new variants); the insert + status-park happen in a single unchecked_transaction so the scheduler can't observe "Pending without open interruption" mid-write.
- Corrective re-parenting + recursion cap (§10). A failed review
requests (never performs) a corrective step via an NDJSON event +
the V29
corrective_step_requests bridge row. The orchestrator (sole writer) drains it: inserts A′ (corrects_step_id = A, A′ depends_on A), re-parents every former dependent of A onto A′, then A → Complete with review_status = Failed. The review→correction→review chain is bounded by a per-plan max_review_corrections (V30, default DEFAULT_MAX_REVIEW_CORRECTIONS = 3); exceeding it raises a kind=blocker interruption ("review loop — needs human") instead of spawning forever.
- §9 concurrency invariants (hard): (1) one implementation slot
(semaphore=1); (2) reviews are read-only by contract and run from a
throwaway worktree at a fixed SHA; (3) single DAG writer (only the
orchestrator mutates the DAG; reviewers only request); (4)
cross-process interruption bridge via
run_locks (reviews never take the run-lock).
- §14.1 resolved (flipped post-test-then-commit): with at most one
commit per step, there are no per-iteration commits to keep or squash
— the
execution_logs rows are the audit trail (each row carries the attempt's prompt / harness stdout+stderr / test output / diff for every attempt including failed ones), and the single committed SHA represents only the attempt that passed. The --squash-on-complete flag and the per-plan squash_on_complete column have been removed (migration V37 drops the column; there was nothing to squash post test-then-commit). §14.4: scheduler reproducibility is timing-independent given identical human inputs; the wall-clock interleave of concurrent reviews is not part of the guarantee.
- Export/import carry the DAG:
ExportedStep gains short_id (always emitted) and depends_on: Vec<short_id>; plan+step review_enabled and max_review_corrections round-trip via the skip_serializing_if / default pattern. Runtime state (interruptions, review_status, attempts, iteration commits, corrects_step_id provenance) is not exported. Import validates the imported edge set (no dangling edges, unique short_ids, acyclic, ≥1 root) before any write; --strict rejects a review-on bundle when the target machine has no review harness.
- Explicit step placement (post-redesign follow-up).
ralph step add no longer has a positional --after <N> (list position, no edge: the ambiguity that silently produced edge-less DAGs). On a non-empty plan exactly one placement is required: --after <S> (new step depends on S), --before <S> (new step takes over S's incoming edges; S then depends only on it; root-S ⇒ new step is the new root), --depends-on <S>... (the multi-parent join primitive), or --root (explicit independent root). --after + --before together splice between them. The first step of an empty plan is the implied root. --import-json now carries the DAG (per-object batch-local id + depends_on, validated unique/acyclic/no-dangling, whole batch atomic) instead of being edge-free; it also wires review_enabled (previously silently dropped). The hand-authored id is a batch-local wiring label only (never persisted); the persisted short_id is minted (auto, the common path) or, if explicitly supplied, validated is_short_id_shaped (a readable/numeric short_id was the bug: created but unselectable / shadowing a step position). The same is_short_id_shaped guard is enforced in validate_dag_aware_steps for full ralph import bundles (real exports always pass; it's a tamper/hand-edit guard). Engine stays a general DAG (joins via --depends-on); --after / --before are tree-shaped authoring sugar. ralph step list now shows each step's short_id + deps:, and ralph plan harness generate emits a non-fatal warning when it produced an edge-less multi-step plan.
- Plan-local step dependencies (post-redesign follow-up, V31). A
step_dependencies edge is only meaningful inside one plan: the scheduler, import/export, outline, and corrective re-parenting all operate on a single plan's step set. V25's two independent foreign keys blocked dangling step IDs but not a cross-plan edge. V31 enforces the invariant at the DB boundary: it drops any pre-existing cross-plan rows, then installs BEFORE INSERT / BEFORE UPDATE triggers (step_dependencies_same_plan_{insert,update}) that RAISE(ABORT, …) on a plan mismatch. storage::add_step_dependency additionally re-checks in-process to surface precise errors (Step not found vs. cross-plan) on the common path; this is deliberate defense-in-depth that must stay in sync with the triggers.
- Schema/version: migrations run through V37 (V32 drops the old
UNIQUE(step_id, attempt) execution-log constraint, V33 adds per-step cycle indices for retry-cycle audit grouping (the parked-changes retry resets attempts to start a new cycle), V34 adds durable step_parked_worktrees stash state, V35 adds the human_approved one-more-cycle grant for review-loop escalation, V36 drops the per-plan questions_enabled opt-out (interruptions are always enabled), and V37 drops the vestigial plans.retry_strategy / steps.retry_strategy / plans.squash_on_complete columns); Cargo.toml is 0.1.20.
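Two of the rules in the list above are small enough to sketch directly: the review-enabled precedence and the corrective re-parenting with its recursion cap. This is an illustration only. The in-memory `Deps` map, the function names, and the `OnReviewFailed` enum are hypothetical, and treating "corrections so far >= max" as the escalation boundary is an assumption; the real logic runs against SQLite inside the orchestrator.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Effective review flag: step overrides plan, plan overrides the global
/// config, and an unset chain falls back to false.
pub fn effective_review(step: Option<bool>, plan: Option<bool>, global: Option<bool>) -> bool {
    step.or(plan).or(global).unwrap_or(false)
}

/// Hypothetical in-memory edge set: step -> the steps it depends on.
pub type Deps = BTreeMap<String, BTreeSet<String>>;

/// Inserts a corrective step for `reviewed`: every former dependent of
/// `reviewed` is re-parented onto `corrective`, which itself depends on `reviewed`.
pub fn insert_corrective(deps: &mut Deps, reviewed: &str, corrective: &str) {
    for parents in deps.values_mut() {
        if parents.remove(reviewed) {
            parents.insert(corrective.to_string());
        }
    }
    deps.insert(corrective.to_string(), BTreeSet::from([reviewed.to_string()]));
}

#[derive(Debug, PartialEq, Eq)]
pub enum OnReviewFailed {
    InsertCorrective,
    EscalateToHuman, // raised as a blocker interruption in the real system
}

/// Bounds the review -> correction -> review chain.
/// ASSUMPTION: the cap triggers once `corrections_so_far` reaches the maximum.
pub fn on_review_failed(corrections_so_far: u32, max_review_corrections: u32) -> OnReviewFailed {
    if corrections_so_far >= max_review_corrections {
        OnReviewFailed::EscalateToHuman
    } else {
        OnReviewFailed::InsertCorrective
    }
}
```

Because the re-parenting runs before the corrective step's own edge is inserted, the new step never ends up depending on itself.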
Four layers, assembled outermost → innermost by prompt::build_step_prompt
(Prompts struct in src/prompt.rs):
- Global: config.prompt in ~/.config/ralph-rs/config.json. Seeded with DEFAULT_CONTEXT_PREPEND (the ralph-CLI introspection hints) at ralph init; ralph init --restore-prompts re-seeds it unconditionally (overwriting customization); uncustomized legacy configs are reseeded on migration. build_step_prompt no longer auto-injects the prepend; the Global layer carries it, so editing the global prompt fully customizes it.
- Project: <project>/.ralph/prompt.md (a file, if present) wins over the project_settings.prompt DB column. ralph prompt set/clear/show --scope project is file-vs-DB aware.
- Plan: the plan's description, rendered once into the # Plan: {slug} context block. There is no per-plan prefix/suffix and no per-plan context_prepend (legacy per-plan columns dropped in migration V21; the project-scope prefix/suffix pair was collapsed into project_settings.prompt in V22).
- Step: the step body (title / description / acceptance criteria).
There is no suffix concept; layers stack as prefix sections only.
--scope universal is a clap alias for --scope global. ralph doctor
emits a non-fatal warning when the global prompt lacks the ralph-CLI
hints, pointing the user at ralph init --restore-prompts; it also warns
non-fatally when review_enabled is set but no/invalid review harness is
configured.
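The layering can be sketched as a simple outermost-to-innermost concatenation. This is not `prompt::build_step_prompt`: the `PromptLayers` struct, its fields, the `assemble_prompt` name, and the blank-line separator are hypothetical, and retry context, hooks, and the "Resolved interruptions" section are left out.

```rust
/// Hypothetical inputs for the four layers (the real struct is `Prompts`).
pub struct PromptLayers<'a> {
    pub global: Option<&'a str>,
    pub project_file: Option<&'a str>, // contents of <project>/.ralph/prompt.md
    pub project_db: Option<&'a str>,   // project_settings.prompt column
    pub plan_slug: &'a str,
    pub plan_description: &'a str,
    pub step_body: &'a str,
}

/// Stacks the layers as prefix sections: Global, Project, Plan, Step.
pub fn assemble_prompt(layers: &PromptLayers<'_>) -> String {
    // The project file, when present, wins over the DB column.
    let project = layers.project_file.or(layers.project_db);

    let mut sections: Vec<String> = Vec::new();
    if let Some(global) = layers.global {
        sections.push(global.to_string());
    }
    if let Some(project) = project {
        sections.push(project.to_string());
    }
    // The plan description is rendered once under its context header.
    sections.push(format!("# Plan: {}\n{}", layers.plan_slug, layers.plan_description));
    sections.push(layers.step_body.to_string());
    sections.join("\n\n")
}
```

Since there is no suffix layer, the step body is always the last section of the assembled text in this sketch.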
DAG-redesign prompt deltas:
- The old unbounded "Previously answered questions" section became
"Resolved interruptions" and is now bounded (closes the §4
context-growth leak): the last N resolved interruptions for the step,
each body/resolution/comment run through
truncate_text. The chosen answer/resolution and the human comment both flow into this bounded injection.
- The reviewer prompt is a separate, independently-assembled prompt (review::build_review_prompt, not build_step_prompt): plan/step context + the single git show <sha> diff, O(1) in plan size. No dependency diffs are ever injected (Decision 5 preserved: the DAG is a scheduling/eligibility construct, not a prompt-context-growth construct).
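The bounding of the "Resolved interruptions" section can be sketched as "keep the last N, truncate every field". The code below is an assumption-laden illustration: `truncate_chars` stands in for the real `truncate_text` (whose signature and marker are not shown here), and the `ResolvedInterruption` struct, the `resolved_section` name, and the output layout are hypothetical.

```rust
/// Stand-in for truncate_text: keeps at most `max_chars` characters and
/// appends a marker when something was cut (the marker is an assumption).
pub fn truncate_chars(text: &str, max_chars: usize) -> String {
    if text.chars().count() <= max_chars {
        return text.to_string();
    }
    let mut out: String = text.chars().take(max_chars).collect();
    out.push_str("...");
    out
}

/// Hypothetical resolved-interruption record for one step.
pub struct ResolvedInterruption {
    pub body: String,
    pub resolution: String,
    pub comment: Option<String>,
}

/// Renders only the last `last_n` items, each field truncated, so the section
/// size is bounded by last_n * (field count) * max_chars plus fixed overhead.
pub fn resolved_section(items: &[ResolvedInterruption], last_n: usize, max_chars: usize) -> String {
    let start = items.len().saturating_sub(last_n);
    let mut out = String::from("## Resolved interruptions\n");
    for item in &items[start..] {
        out.push_str(&format!(
            "- Q: {}\n  A: {}\n",
            truncate_chars(&item.body, max_chars),
            truncate_chars(&item.resolution, max_chars)
        ));
        if let Some(comment) = &item.comment {
            out.push_str(&format!("  Comment: {}\n", truncate_chars(comment, max_chars)));
        }
    }
    out
}
```

Counting characters rather than bytes keeps the truncation from splitting a multi-byte UTF-8 sequence.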
ralph init [--non-interactive] [--default-harness <name>] [--force] [--restore-prompts]
ralph plan create <slug> [-d <desc>] [--test <cmd>]... [--harness <h>] [--agent <name>] [--branch <name>] [--depends-on <slug>]... [--max-review-corrections <n>]
ralph plan list [--all] [--status <status>] [--archived]
ralph plan show <slug>
ralph plan approve <slug>
ralph plan delete <slug> [--force/-y]
ralph plan archive <slug>
ralph plan unarchive <slug>
ralph plan set-hook <slug> --lifecycle <lifecycle> --hook <name>
ralph plan unset-hook <slug> --lifecycle <lifecycle> --hook <name>
ralph plan hooks <slug>
ralph plan dependency add <slug> --depends-on <slug>...
ralph plan dependency remove <slug> --depends-on <slug>...
ralph plan dependency list <slug>
ralph plan review <on|off> <slug> # per-plan review toggle (precedence step > plan > config > false)
ralph plan harness set <harness> [<slug>]
ralph plan harness show [<slug>]
ralph plan harness generate [<description>] [<slug>] [--use-harness <h>]
# Every <num> step selector ALSO accepts an 8-char short_id (DAG handle).
ralph step list [<slug>]
ralph step add <title> [<slug>] [-d <desc>] [--after <short_id|num>] [--before <short_id|num>] [--root] [--depends-on <short_id|num>]... [--agent <name>] [--harness <h>] [--criteria <c>]... [--max-retries <n>] [--import-json <FILE|->] # non-empty plan requires exactly one placement: --after | --before | --depends-on | --root (empty plan: implied root)
ralph step remove <num|short_id>|--step-id <uuid> [<slug>] [--force/-y]
ralph step edit <num|short_id>|--step-id <uuid> [<slug>] [--title <t>] [--description <d>] [--agent <name>] [--harness <h>] [--criteria <c>]... [--clear-criteria] [--max-retries <n>] [--clear-max-retries] [--review <on|off|inherit>]
ralph step reset <num|short_id>|--step-id <uuid> [<slug>] [--force/-y]
ralph step move <num|short_id>|--step-id <uuid> --to <n> [<slug>]
ralph step set-hook <num|short_id>|--step-id <uuid> [<slug>] --lifecycle <lifecycle> --hook <name>
ralph step unset-hook <num|short_id>|--step-id <uuid> [<slug>] --lifecycle <lifecycle> --hook <name>
ralph step dependency add <num|short_id> --depends-on <short_id|num>...
ralph step dependency remove <num|short_id> --depends-on <short_id|num>...
ralph step dependency list <num|short_id>
ralph run [<slug>] [--one/--single] [--all] [--from <n>] [--to <m>] [--dry-run] [--skip-preflight] [--current-branch] [--auto-stash] [--harness <h>] [--force]
ralph resume [<slug>]
ralph skip [<slug>] [--step <n>] [--reason <reason>] [--changes <stash|commit|discard>] [--force]
# Harness raises an interruption mid-step (binds to the live run via the
# run_locks table; consumes NO retry budget); human resolves it CLI-side.
ralph question ask [<text>] [--suggest/-s <answer>]... [--priority <n>]...
ralph block [<text>]
ralph interruption list [<slug>]
ralph interruption show <id|index>
ralph interruption resolve <id|index> [--option <k>] [--answer <text>] [--comment <text>]
ralph export <slug> [-o <file>]
ralph import <file> [--slug <name>] [--branch <name>] [--strict]
ralph status [<slug>] [--verbose/-v]
ralph log [<slug>] [--step <n>] [--limit <n>] [--full|--lines <n>]
ralph prompt show [--scope <global|project|universal>] [--resolved]
ralph prompt set --scope <global|project|universal> <content>
ralph prompt clear --scope <global|project|universal>
ralph config show
ralph config set-timezone <tz>
ralph config review set [--harness <h>] [--model <m>] [--enabled <bool>]
ralph agents list|show|create|delete
ralph hooks list|show|add|remove|export|import
ralph harness list [--json]
ralph harness show <name> [--json]
ralph doctor
ralph completions <shell>
Global flags: --project <path> (-C), --harness <name>, --json, --quiet, --no-color
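The `ralph plan review` entry above gives the review-toggle precedence as step > plan > config > false. A minimal sketch of that resolution, assuming each level is modeled as an `Option<bool>` where `None` means "inherit"; the function name and signature are illustrative, not ralph's actual code:

```rust
/// Resolve the effective review setting with precedence
/// step > plan > config > false. `None` at a level means "inherit".
/// (Hypothetical helper; the real types and names are assumptions.)
fn effective_review(step: Option<bool>, plan: Option<bool>, config: Option<bool>) -> bool {
    step.or(plan).or(config).unwrap_or(false)
}

fn main() {
    // A step-level "off" wins even when the plan and config enable review.
    assert!(!effective_review(Some(false), Some(true), Some(true)));
    // With the step set to inherit, the plan-level setting applies.
    assert!(effective_review(None, Some(true), Some(false)));
    // Nothing set anywhere falls through to the default of false.
    assert!(!effective_review(None, None, None));
}
```

Under this model, `--review inherit` on `ralph step edit` would correspond to clearing the step-level value back to `None`.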
There are two documents that teach an AI agent how to author a ralph plan, and they must stay in lockstep:
- .claude/skills/create-ralph/SKILL.md: the slash-command skill, used when a user runs /create-ralph inside Claude Code.
- HARNESS_PLAN_AGENT_BASE in src/plan_harness.rs: the system prompt sent to a coding harness spawned by ralph plan harness generate.
Both teach the same workflow, anti-patterns, and CLI surface. If you change one, change the other in the same PR. Drift means the same user gets materially worse plans depending on which entry point they use.
The harness prompt should not reference Claude-Code-specific things
($ARGUMENTS, allowed-tools, frontmatter); the skill should not duplicate
the runtime hook-library injection that render_plan_agent does. Everything
else — preflight, recommended shape, authoring (--import-json warning),
review steps, anti-patterns, CLI flags — should match in substance.
cargo build
cargo test
cargo clippy -- -D warnings
Test footgun: ETXTBSY on freshly written scripts. Tests that write a shell script to a tempdir and then run it with Command::new(script).status() can intermittently fail in CI with Text file busy (os error 26). Cause: cargo runs tests in parallel; another thread's spawned child can inherit a writable fd to the script across its fork→exec window, and Linux refuses execve() while any process holds the file open for write. Fix: invoke via /bin/sh <path> instead of exec'ing the script directly; sh opens it as a regular file and sidesteps the kernel's writer check. See sh_editor() in src/tui/editor.rs for the pattern.
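The fix above might look like the following in a test helper. This is a sketch, not the code in sh_editor(): the helper name, the scratch directory, and the script contents are all hypothetical, and it assumes a Unix host with /bin/sh.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::{Command, ExitStatus};

/// Run a freshly written script through /bin/sh instead of exec'ing it
/// directly. sh reads the script as an ordinary file, so the kernel's
/// "open for write" check on execve() never applies to the script itself.
/// (Hypothetical helper name.)
fn run_script_via_sh(script: &Path) -> io::Result<ExitStatus> {
    Command::new("/bin/sh").arg(script).status()
}

fn main() -> io::Result<()> {
    // Illustrative scratch location; a real test would likely use a
    // per-test tempdir helper.
    let dir = std::env::temp_dir().join(format!("ralph-sh-sketch-{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let script = dir.join("hello.sh");
    fs::write(&script, "#!/bin/sh\nexit 0\n")?;

    // No chmod +x is needed: sh is the executable, the script is its input.
    let status = run_script_via_sh(&script)?;
    assert!(status.success());

    fs::remove_dir_all(&dir)?;
    Ok(())
}
```

A side effect of this approach is that the script does not need its executable bit set, which removes one more step from the test setup.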
- kctx-local (sibling at ../kctx-local/): Local-first Q&A CLI for codebases. Uses the same Rust patterns.
- mcp2cli-rs (at ../../mcp2cli/mcp2cli-rs/): Universal CLI adapter for MCP, OpenAPI, GraphQL.