Skip to content

arch(reborn): consolidate composition root, narrow public surface, ship live binary#3695

Merged
serrrfirat merged 15 commits into
reborn-integrationfrom
reborn/composition-runtime-and-narrow-surface
May 17, 2026
Merged

arch(reborn): consolidate composition root, narrow public surface, ship live binary#3695
serrrfirat merged 15 commits into
reborn-integrationfrom
reborn/composition-runtime-and-narrow-surface

Conversation

@serrrfirat
Copy link
Copy Markdown
Collaborator

Summary

Promotes ironclaw_reborn_composition to the Reborn composition root the crate's own docstring promised, ships a live ironclaw-reborn binary that boots a real agent through it, and narrows ironclaw_reborn's public surface to a directory of modules — all locked in with boundary tests.

Three things land together because doing them apart is harder than doing them in one PR:

  1. Architectural restructureironclaw_reborn_composition absorbs the runtime assembly slice (drivers + worker + LLM bridge), per the existing crate-level docstring that already promised "Product/AppBuilder integration belongs in later slices."
  2. CLI is now a real agentironclaw-reborn run -m "..." and ironclaw-reborn run (REPL on stdin) boot the composed runtime, submit a turn, wait for completion, print the assistant reply.
  3. Narrow public surfaceironclaw_reborn's ~25-item wall of pub use re-exports is gone. lib.rs is now a directory of modules. Boundary tests pin the new shape.

Why one PR

The cleanup is the point. Shipping "the binary runs" first with the existing noisy surface (plus the new composition entry points layered on top) bakes in the speculative public API problem this PR is trying to solve. Trace audit before starting found:

  • 0 external workspace crates ever named anything from the noisy re-export wall (CapabilityResolveError, LoopCapabilityInputResolver, ModelRoute, DurableLoopHostMilestoneSink, etc.).
  • 2 internal test files (in ironclaw_reborn/tests/) needed import paths updated.
  • 4 CLI commands needed minor updates (already getting rewritten anyway for the live-agent commit).

That's small enough to do confidently in one change.

Commits

Commit Crate(s) Net
feat(llm): make create_registry_provider public… ironclaw_llm +pub on one fn
arch(reborn-composition): extend composition root with assembled runtime ironclaw_reborn_composition +build_reborn_runtime, +RebornRuntime
arch(reborn): narrow ironclaw_reborn public surface to a directory of modules ironclaw_reborn strips ~25 pub use lines; promotes 5 modules to pub mod; updates 5 internal tests; fixes 1 baseline cargo test --no-run duplicate-impl bug
feat(reborn-cli): boot a live agent through the composition root ironclaw_reborn_cli tokio runtime + REPL/-m, drops direct ironclaw_reborn dep
arch(boundaries): pin narrowed Reborn surface and composition-routed CLI ironclaw_architecture +new test, +5 entries to CLI forbidden list, +composition root-llm-provider discipline check, updates one existing assertion

What the runtime looks like to callers

The CLI imports literally:

use ironclaw_reborn_composition::{
    build_reborn_runtime, RebornCompositionProfile, RebornRuntimeInput, TurnRunnerSettings,
};
#[cfg(feature = "root-llm-provider")]
use ironclaw_reborn_composition::RebornLlmConfig;
use ironclaw_reborn_config::{RebornBootConfig, RebornProfile};

That is the entire surface a downstream consumer touches. No TurnCoordinator, SessionThreadService, HostManagedModelGateway, LoopExitApplier, RebornLoopDriverHostFactory, ThreadScope, SubmitTurnRequest, AcceptedMessageRef, IdempotencyKey, etc. — all of those stay private to the composition root.

RebornRuntime exposes:

pub async fn new_conversation(&self) -> Result<ConversationId, _>;
pub async fn send_user_message(&self, conv: &ConversationId, text: &str) -> Result<AssistantReply, _>;
pub async fn shutdown(self) -> Result<(), _>;
pub fn services(&self) -> &RebornServices;  // diagnostic readiness snapshot only

Verified end-to-end

$ IRONCLAW_REBORN_HOME=$TMP ironclaw-reborn run -m "ping"
ironclaw-reborn: runtime started
  profile     : local-dev
  reborn_home : /tmp/.../home

WARN ironclaw_reborn::text_loop_driver: loop host port returned sanitized error
     stage="model" kind=Unavailable safe_summary=model service is unavailable
WARN ironclaw_reborn::turn_runner: driver invocation failed, recording recovery required
     run_id=9d984ca0-... error=driver error: agent loop driver is unavailable: model: unavailable
(no assistant text; status=RecoveryRequired, run_id=9d984ca0-...)

real 0m0.225s

Without OPENAI_API_KEY / ANTHROPIC_API_KEY / OLLAMA_BASE_URL exported the run surfaces RecoveryRequired (the stub gateway's Unavailable propagated through driver → worker → applier). With a real API key exported the same command prints the assistant reply.

Profile coverage

Only RebornCompositionProfile::LocalDev is wired end-to-end. Production and migration-dry-run continue to return their substrate-only RebornServices via the unchanged build_reborn_services(); calling build_reborn_runtime with them yields a clear "not yet wired" error rather than partially starting an agent. Production wiring is the next slice — durable thread store, durable checkpoint stores, real evidence ports, and a real turn-run wake notifier all already exist in their own crates, but the composition there needs separate review.

Tests

Crate Result
ironclaw_llm 733 / 733
ironclaw_reborn (with root-llm-provider) 225 / 225
ironclaw_reborn_composition (both feature variants) 5 / 5
ironclaw_reborn_cli 35 / 35 (2 existing diagnostic tests now invoke run --dry-run)
ironclaw_architecture 17 / 17 (1 updated, 1 new)

Workspace-wide cargo check --workspace --all-targets is clean on both --no-default-features and default.

Drive-by

crates/ironclaw_reborn/src/planned_driver.rs had a duplicate impl LoopCancellationPort for ResumePayloadHost inside mod tests that broke cargo test --no-run -p ironclaw_reborn on baseline. The later (no-op None-returning) impl was unreachable behind the conflict; removed it and kept the delegating one. Called out explicitly in the commit message.

Boundary tests pinning the new shape

  • reborn_cli_binary_crate_stays_separate_from_v1_root: CLI's exact normal deps are {ironclaw_reborn_composition, ironclaw_reborn_config} and nothing else.
  • ironclaw_reborn_cli BoundaryRule.forbidden: adds ironclaw_reborn, ironclaw_llm, ironclaw_loop_support, ironclaw_threads, ironclaw_turns.
  • reborn_loop_support_llm_wiring_stays_out_of_root_src: extended to also assert ironclaw_reborn_composition gates ironclaw_llm behind the same root-llm-provider feature with optional = true, default-features = false.
  • reborn_internal_crate_keeps_directory_of_modules_lib_rs (new): forbids pub use ironclaw_loop_support::, pub use loop_driver_host::, pub use milestone_events::, pub use model_gateway::, pub use model_routes::, pub use planned_driver::, pub use planned_driver_factory::, pub use text_loop_driver::, pub use app_loop_family:: in crates/ironclaw_reborn/src/lib.rs. Also asserts the runtime assembly lives in ironclaw_reborn_composition and reaches ironclaw_reborn only via module paths.

Follow-ups (intentionally not in this PR)

  • Wire production / migration-dry-run profiles end-to-end (durable substrate is built; this is composition glue only).
  • Consider moving the remaining integration tests in crates/ironclaw_reborn/tests/ to #[cfg(test)] mod tests blocks colocated with the modules they exercise. That would let several currently-pub mod modules become pub(crate) mod and tighten the surface further. Out of scope here because the bound is already enforced by the boundary tests and the migration is mechanical.

The Reborn composition root needs to build an LlmProvider from a
RegistryProviderConfig without dragging in v1 SessionManager / OAuth
plumbing (which only the NEAR AI provider chain needs).

create_llm_provider() routes via SessionManager; for registry-based
providers (OpenAI, Anthropic, Ollama, DeepSeek, Gemini, OpenRouter,
GitHub Copilot) it ultimately calls create_registry_provider() with
no SessionManager use. Exposing that internal entry point lets
ironclaw_reborn_composition::build_reborn_runtime construct an
LlmProvider directly from a composition-owned RebornLlmConfig.

No behavior change for existing callers.
The composition crate's existing docstring promised "Product/AppBuilder
integration belongs in later slices." This is that slice.

Adds:
  - RebornRuntimeInput { services, llm, runner }
  - RebornRuntime task-level handle (new_conversation,
    send_user_message, shutdown) -- callers never see TurnCoordinator,
    SessionThreadService, HostManagedModelGateway, or any other
    substrate trait
  - build_reborn_runtime(input) -- substrate + driver registry +
    LLM model gateway bridge (behind root-llm-provider feature) +
    text-only loop driver + ThreadCheckpointLoopExitEvidencePort +
    TurnRunnerWorker, all wired and spawned as one unit
  - RebornLlmConfig composition-owned LLM config DTO with
    openai_compat() convenience constructor

Feature flag:
  ironclaw_reborn_composition gains a root-llm-provider feature that
  pulls ironclaw_llm (optional, default-features = false) and turns on
  ironclaw_reborn/root-llm-provider. Mirrors the discipline already
  pinned by reborn_loop_support_llm_wiring_stays_out_of_root_src for
  ironclaw_reborn.

Profile coverage:
  Only RebornCompositionProfile::LocalDev is wired end-to-end here.
  Production and migration-dry-run still return their substrate-only
  RebornServices; calling build_reborn_runtime with them yields a
  clear "not yet wired" error rather than partially starting an agent.

Without an LLM provided, a stub gateway is wired so the worker boots
cleanly and the failure surfaces at first send_user_message
(status=RecoveryRequired) rather than at runtime construction. That
keeps `ironclaw-reborn run --dry-run` and substrate-only smoke paths
green without API credentials.
… modules

ironclaw_reborn previously exposed ~25 types as a wall of `pub use`
re-exports at the crate root: capability resolvers and surface profile
filters from ironclaw_loop_support, milestone scope/sink, model route
policies, planned-driver factory helpers, the loop-driver-host factory,
text-only driver, model gateway adapters, and more.

Trace audit: zero external workspace crates ever named any of those
items. The wall was pure speculative public API (Illia's concern on
PR review: "this leaves opportunity to use internals"). Composition
imports through submodule paths (ironclaw_reborn::driver_registry::*,
::loop_driver_host::*, ::model_gateway::*) and doesn't need the
flattened surface at all.

This commit:
  - lib.rs becomes a directory of `pub mod` declarations and a
    docstring. No more `pub use` flattening.
  - app_loop_family, milestone_events, model_routes, planned_driver,
    planned_driver_factory promoted to `pub mod` so existing internal
    integration tests in crates/ironclaw_reborn/tests/ can still reach
    them via paths instead of flattened re-exports. The submodules
    were previously private and lived only behind the noisy re-export
    wall.
  - Internal tests updated to use module-path imports.
  - One `use crate::build_loop_family_registry` (in src/) updated to
    use the new module path.
  - Drive-by fix: removed a pre-existing duplicate
    `impl LoopCancellationPort for ResumePayloadHost` in
    planned_driver.rs tests that broke `cargo test --no-run -p
    ironclaw_reborn` on baseline.

Behavior change for downstream consumers: zero (the boundary tests
already forbid any non-composition crate from importing
ironclaw_reborn, so nothing outside the crate could have observed
the old surface).

The next commit (CLI rewire) drops the CLI's direct
`ironclaw_reborn` dep entirely; the commit after that pins the
narrowed surface in boundary tests.
The `ironclaw-reborn run` command previously initialized a side-effect-
free "runtime shell" snapshot and exited. With the composition root
now exposing a full RebornRuntime, `run` becomes a real agent entry
point:

  - tokio multi-thread runtime spun up in fn run()
  - reads LLM provider config from env (OPENAI_API_KEY,
    ANTHROPIC_API_KEY, OLLAMA_BASE_URL) into RebornLlmConfig
  - calls build_reborn_runtime() to spawn the worker
  - `run -m "hello"` -- single-shot mode: submit, wait, print
    assistant reply, shutdown
  - `run` without -m -- stdin REPL until EOF or Ctrl-C
  - `run --dry-run` preserves the legacy diagnostic-only banner so
    existing smoke tests (and any tooling that scrapes the snapshot)
    keep working without modification beyond appending the flag

CLI dependency graph -- only Reborn-prefixed crates now allowed:
  - ironclaw_reborn_composition (new; replaces ironclaw_reborn)
  - ironclaw_reborn_config (unchanged)
  - secrecy, tokio, tracing-subscriber (added)

Dropped: direct `ironclaw_reborn` import. The two CLI files that
reached `ironclaw_reborn::driver_registry::DriverRegistry::new()` and
`ironclaw_reborn::ModelSlot` were rewritten: doctor.rs no longer
constructs a DriverRegistry (it was a no-op diagnostic anyway), and
models.rs ships a CLI-local mirror of the ModelSlot label taxonomy so
the diagnostic `models list` / `models status` commands keep their
existing output without crossing the narrowed boundary.

Behavior:
  IRONCLAW_REBORN_HOME=\ ironclaw-reborn run -m hello   # ~2s round
  trip without LLM env vars surfaces:
    "(no assistant text; status=RecoveryRequired, run_id=...)"
  which is the documented stub-gateway path. With a real
  OPENAI_API_KEY exported, the same command would print the assistant
  reply.

Smoke tests:
  Two existing diagnostic-shape tests now invoke `run --dry-run` to
  exercise the same snapshot they previously asserted on. All 35 CLI
  smoke tests pass.
Locks the new dependency shape in place so a future contributor can't
accidentally re-open the speculative public API surface or shortcut
through ironclaw_reborn directly from the CLI.

Changes:

  reborn_loop_support_llm_wiring_stays_out_of_root_src:
    Extended to also assert that ironclaw_reborn_composition gates
    ironclaw_llm behind the same root-llm-provider feature with
    `optional = true, default-features = false`. The composition root
    is the new sanctioned LLM-gateway construction site, so it
    inherits the same discipline ironclaw_reborn already had to
    follow.

  reborn_internal_crate_keeps_directory_of_modules_lib_rs (new):
    Forbids re-introducing the wall of `pub use` re-exports in
    crates/ironclaw_reborn/src/lib.rs (covers all 9 internal modules
    whose items previously leaked: loop_support, loop_driver_host,
    milestone_events, model_gateway, model_routes, planned_driver,
    planned_driver_factory, text_loop_driver, app_loop_family).
    Also asserts that the runtime assembly (build_reborn_runtime,
    RebornRuntime) actually lives in ironclaw_reborn_composition
    and reaches the underlying types via module paths, so the assembly
    slice cannot drift to another crate.

  BoundaryRule for ironclaw_reborn_cli:
    Now forbids ironclaw_reborn, ironclaw_llm, ironclaw_loop_support,
    ironclaw_threads, ironclaw_turns (on top of the existing
    ironclaw / ironclaw_engine / ironclaw_gateway / ironclaw_skills /
    ironclaw_tui forbidden set). The CLI must enter Reborn only
    through ironclaw_reborn_composition.

  reborn_cli_binary_crate_stays_separate_from_v1_root:
    Updated the exact-deps assertion to expect
    {ironclaw_reborn_composition, ironclaw_reborn_config} instead of
    the old {ironclaw_reborn, ironclaw_reborn_config}.
@github-actions github-actions Bot added scope: dependencies Dependency updates size: XL 500+ changed lines risk: low Changes to docs, tests, or low-risk modules contributor: core 20+ merged PRs labels May 15, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the Reborn architecture by narrowing the public API of ironclaw_reborn and introducing ironclaw_reborn_composition as the central assembly point for the runtime. The ironclaw_reborn_cli is updated to leverage this new composition layer, enabling a live REPL and single-shot message execution, supported by expanded architectural tests to enforce dependency boundaries. Review feedback highlights critical improvements for the runtime assembly, including handling JoinError during worker shutdown to prevent swallowing panics, ensuring the wake channel remains open to avoid CPU spinning, and utilizing exhaustive matching for protocol mapping to improve maintainability.

/// returning.
pub async fn shutdown(self) -> Result<(), RebornRuntimeError> {
self.worker_cancel.cancel();
let _ = self.worker_handle.await;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The result of self.worker_handle.await is ignored, which swallows potential panics or join errors from the background worker task during shutdown. Per repository rules, when handling JoinError from tokio tasks, we should log the error and distinguish between panics and cancellations to aid debugging.

        match self.worker_handle.await {
            Ok(_) => {}
            Err(e) => {
                if e.is_panic() {
                    tracing::error!("reborn worker task panicked: {e}");
                } else {
                    tracing::warn!("reborn worker task cancelled: {e}");
                }
            }
        }
References
  1. When handling errors from tokio tasks, log the JoinError and distinguish between panics and cancellations in the error message to capture debugging information.

evidence_port,
));

let (_wake_sender, wake_receiver) = TurnRunnerWakeReceiver::new();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Dropping the _wake_sender immediately closes the wake channel. If the TurnRunnerWorker implementation uses a select! loop that doesn't explicitly handle a closed channel (e.g., by checking for None from recv()), this can cause a CPU spin as the receiver branch will continuously trigger. Additionally, this disables the event-driven wake mechanism, forcing the worker to rely solely on the poll_interval for processing runs.

Comment on lines +567 to +580
let protocol = match cfg.protocol.as_str() {
"openai_completions" | "openai" => ProviderProtocol::OpenAiCompletions,
"anthropic" => ProviderProtocol::Anthropic,
"ollama" => ProviderProtocol::Ollama,
"deepseek" => ProviderProtocol::DeepSeek,
"gemini" => ProviderProtocol::Gemini,
"openrouter" => ProviderProtocol::OpenRouter,
"github_copilot" => ProviderProtocol::GithubCopilot,
other => {
return Err(RebornRuntimeError::LlmProvider(format!(
"unsupported llm protocol: {other}"
)));
}
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This match statement maps strings to ProviderProtocol variants using a wildcard other arm. Per the general rules, mapping between types should ideally be exhaustive to ensure that new variants added to the target enum are explicitly handled. Using a string-based match here bypasses compile-time checks for new protocols added to ironclaw_llm::ProviderProtocol.

References
  1. Mapping between enums or state-representing types should use exhaustive match statements instead of wildcard arms to ensure all cases are explicitly handled and to catch new variants at compile time.

Copy link
Copy Markdown
Collaborator

@zmanian zmanian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Overview

Three coordinated changes land together:

  1. Composition root absorbs the runtime assembly slice. ironclaw_reborn_composition gains build_reborn_runtime + RebornRuntime — substrate facades plus driver registry, host factory, model gateway adapter, and a spawned TurnRunnerWorker. New runtime.rs (~630 lines) and runtime_input.rs (~120 lines).
  2. ironclaw_reborn's public surface is narrowed from a ~25-item pub use wall to a directory of pub mod declarations. Five internal tests + two internal call sites updated to module-path imports. One drive-by fix removes a duplicate impl LoopCancellationPort that broke cargo test --no-run on baseline.
  3. CLI boots a live agent. ironclaw-reborn run -m "..." and ironclaw-reborn run (REPL) compose through the new entry point. The existing diagnostic-only behavior is preserved behind a new --dry-run flag so smoke tests keep working.

Boundary tests (ironclaw_architecture) pin all of the above: new forbidden deps on ironclaw_reborn_cli, a root-llm-provider-gating assertion against ironclaw_reborn_composition/Cargo.toml, and a new reborn_internal_crate_keeps_directory_of_modules_lib_rs test that bans the old pub use wall in crates/ironclaw_reborn/src/lib.rs.

What's strong

  • Boundaries-as-tests: the cleanup is locked in by ironclaw_architecture assertions, not just convention. A future contributor re-adding the noisy pub use wall fails CI.
  • Docstrings carry the architectural intent. crates/ironclaw_reborn/src/lib.rs and the runtime.rs header both clearly state who the sanctioned consumer is and why.
  • --dry-run preservation keeps the existing smoke-test signal alive without forking the binary's behavior.
  • RebornRuntime exposes only task-level methods (new_conversation / send_user_message / shutdown). Internal types (TurnCoordinator, SessionThreadService, LoopExitApplier, etc.) are not leaked through the runtime handle — only the diagnostic RebornServices snapshot is exposed and labeled as such.
  • Stub gateway pattern is right (build_stub_gateway). Returning a HostManagedModelError::safe(Unavailable, ...) instead of failing at startup means an operator can verify the binary boots before wiring credentials — and the PR description's manual run output confirms this.
  • The duplicate impl LoopCancellationPort fix is correctly called out separately in the commit message and reduced to a clarifying comment.

Concerns

1. Wake channel is dropped on construction (medium — agree with the bot review)

In runtime.rs:

let (_wake_sender, wake_receiver) = TurnRunnerWakeReceiver::new();

The leading underscore drops the sender immediately. This:

  • Disables the event-driven wake mechanism — the worker only ever fires on poll_interval.
  • If TurnRunnerWorker's receive branch in a select! doesn't guard None, this could spin.

Either store the sender on RebornRuntime (and expose wake() for send_user_message to call after submit_turn returns Accepted), or add a comment explicitly stating "intentionally drop — worker tolerates closed channel and we rely on poll-only for the CLI slice." The former is the correct fix; the latter at minimum needs to be verified against the worker's actual select! behavior. As-is this is silently degrading event-driven throughput.

2. JoinError is swallowed on shutdown (medium — agree with the bot review)

pub async fn shutdown(self) -> Result<(), RebornRuntimeError> {
    self.worker_cancel.cancel();
    let _ = self.worker_handle.await;
    Ok(())
}

If the worker panics, the panic is silently dropped. Project rules emphasize root-cause diagnosis. At minimum log the JoinError with panic-vs-cancellation discrimination:

if let Err(e) = self.worker_handle.await {
    if e.is_panic() {
        tracing::error!("reborn worker task panicked: {e}");
    } else {
        tracing::warn!("reborn worker task cancelled: {e}");
    }
}

3. CLI-specific identifiers baked into composition root (medium — coupling)

runtime.rs hardcodes "reborn-cli" as the tenant id, agent id suffix, and binding ids. The PR's own framing says the composition crate is the assembly point for "the CLI and any future Reborn ingress," but a web channel or e2e harness would inherit tenant_id = "reborn-cli" from the substrate, which is wrong.

Better: take a RuntimeIdentity { tenant, agent, source_binding } (or just three strings) on RebornRuntimeInput with a "reborn-cli" default that the CLI sets explicitly. This keeps the composition crate channel-agnostic, which the docstring already promises.

4. ModelSlot duplicated in the CLI without enforcement (low)

commands/models.rs mirrors ironclaw_reborn::ModelSlot as a CLI-local enum to satisfy the boundary rule, with a comment explaining why. The boundary tests forbid the import but don't catch drift. If a third slot lands in ironclaw_reborn::ModelSlot, the CLI silently keeps reporting two.

Two reasonable options:

  • Re-export ModelSlot (or just its &'static [&'static str] slot-name list) from ironclaw_reborn_composition, which is on the CLI's import-allowed list. Composition can keep the enum private and surface only the names.
  • Add an architecture test that asserts the slot-name list in commands/models.rs matches the one in ironclaw_reborn::model_routes::ModelSlot.

5. Dead local in build_runtime_input (nit)

commands/run.rs:

let composition_profile = match config.profile() {
    RebornProfile::LocalDev => RebornCompositionProfile::LocalDev,
    other => anyhow::bail!(...),
};
let _ = composition_profile; // currently always LocalDev when reached.

The CLI then calls RebornBuildInput::local_dev(...) and never passes composition_profile through. The same gate is also enforced inside build_reborn_runtime, so the CLI check is duplicative defense-in-depth. Either:

  • Drop the CLI-side check entirely and rely on the inner one (anyhow::bail! becomes a propagated RebornRuntimeError::InvalidArgument), or
  • Pass the variant through if the substrate input ever takes one (it doesn't today — local_dev is the only profile-specific factory exposed).

The let _ = composition_profile; is a strong code smell pointing at this.

6. protocol: String in RebornLlmConfig (low — disagree with the bot review)

The bot flagged the wildcard arm in the string→ProviderProtocol match. I'd push back: keeping protocol as String in RebornLlmConfig is the right call here — it prevents ironclaw_llm::ProviderProtocol from leaking into the composition crate's public surface, which is the whole point of this PR.

However, the validation could be more rigorous: today an unknown protocol fails at runtime on first send_user_message. Consider either:

  • Validating the string in RebornLlmConfig::new so misconfiguration fails at startup, or
  • Documenting on the field that valid values are enumerated and adding a const slice the docstring can reference.

7. Minor cosmetic / style

  • tests/loop_driver_host.rs has two non-contiguous use ironclaw_loop_support blocks. Merge them.
  • init_tracing() in commands/run.rs returns Result<(), ()> and the caller drops it with let _ = init_tracing();. Idiomatically this should be fn init_tracing() returning () (since the error is already discarded), or the caller should at least log "tracing already initialized" to stderr.
  • runtime.rs exposes services() for "diagnostic readiness snapshot only" — consider renaming to readiness_snapshot() to discourage callers from reading it as a typed handle to substrate facades.
  • external_event_id: Some(format!("reborn-cli:{}", Uuid::new_v4())) — fresh UUID per send defeats the idempotency-key purpose of external_event_id. Fine for the CLI slice but worth a comment noting the trade-off.

Tests / coverage

  • 5/5 in ironclaw_reborn_composition, 225/225 in ironclaw_reborn, 35/35 in CLI, 17/17 in ironclaw_architecture (per the PR body).
  • Gap: no test actually drives build_reborn_runtime end-to-end with the stub gateway and asserts the resulting AssistantReply { status: RecoveryRequired, text: None }. The manual run output in the PR description is the only evidence the assembly works. A #[tokio::test] in ironclaw_reborn_composition/tests/ that asserts the stub gateway path produces RecoveryRequired would lock in the contract RebornRuntime::send_user_message advertises in its docstring.
  • The boundary tests are excellent and pin exactly the right invariants. mergeable_state: dirty on the PR will need rebasing against reborn-integration before merge.

Security / safety

  • Nothing here introduces a new I/O surface beyond stdin/stderr/stdout and the existing LLM provider HTTP flow.
  • API keys are wrapped in secrecy::SecretString end-to-end, including through RebornLlmConfig. Good.
  • #[forbid(unsafe_code)] is preserved on ironclaw_reborn_composition.

Performance

  • RebornRuntime::wait_for_terminal polls at 100ms with a 180s ceiling (PollSettings::default). Reasonable for CLI; not exposed to callers. If the wake-channel fix in concern 1 lands, this poll becomes a fallback rather than the primary signal.
  • read_latest_assistant_text re-reads full thread history per send; fine for an interactive CLI, would not scale to long conversations or a server channel.

Bottom line

Solid architectural cleanup. The boundary-tests-as-enforcement story is exemplary. Three concerns are worth resolving before merge:

  1. Wake channel (concern 1) — verify whether dropping the sender silently degrades throughput or risks a spin; either keep the sender or document the deliberate choice.
  2. JoinError handling on shutdown (concern 2) — log panic vs. cancellation rather than swallow.
  3. CLI identifiers in composition root (concern 3) — parameterize tenant_id / agent_id / source binding so the composition crate stays channel-agnostic per its own docstring.

Concerns 4–7 are quality-of-life and can be follow-ups. The duplicate impl LoopCancellationPort removal is a worthwhile drive-by and correctly disclosed.


Generated by Claude Code

@serrrfirat
Copy link
Copy Markdown
Collaborator Author

Addressed the review feedback in 9b372aae:

  • Kept the TurnRunnerWakeSender alive on RebornRuntime and wake the worker after accepted turn submissions, so runtime sends no longer rely only on fallback polling.
  • Added shutdown JoinError handling with panic vs cancellation logging.
  • Parameterized runtime identity via RebornRuntimeIdentity / RebornRuntimeInput so tenant, agent, source binding, and reply binding are caller-owned instead of hardcoded to reborn-cli inside composition.
  • Removed the CLI-local ModelSlot mirror by exposing slot names through the composition boundary.
  • Cleaned the dead local-dev profile variable and simplified tracing init.
  • Merged the non-contiguous ironclaw_loop_support imports in loop_driver_host.rs.
  • Added a caller-level runtime test for the stub-gateway path that proves send_user_message reaches RecoveryRequired via wake even with a long poll interval.

Validation run locally:

  • cargo test -p ironclaw_reborn_composition
  • cargo test -p ironclaw_reborn_cli
  • cargo test -p ironclaw_architecture reborn
  • cargo clippy -p ironclaw_reborn_composition -p ironclaw_reborn_cli --all-targets -- -D warnings

@serrrfirat
Copy link
Copy Markdown
Collaborator Author

Addressed the follow-up review findings from the codex-review pass:

  • ironclaw-reborn run --message and non-TTY stdin now exit non-zero when the runtime does not produce assistant text.
  • Piped stdin now preserves prompt whitespace and propagates send errors.
  • The standalone runtime now cancels RecoveryRequired and timed-out runs so conversation locks are not abandoned.
  • Runtime sends are serialized before accept+submit to avoid orphaned model-visible messages under concurrent sends.
  • Runtime thread scope now matches the loop-exit evidence path so live model replies can complete.
  • Cleaned up the narrowed Reborn public-surface import in llm_gateway.

Verified with:

  • cargo test -p ironclaw_reborn_composition
  • cargo test -p ironclaw_reborn_cli
  • cargo test -p ironclaw_architecture reborn
  • cargo clippy -p ironclaw_reborn_composition -p ironclaw_reborn_cli --all-targets -- -D warnings
  • cargo clippy -p ironclaw_reborn --test llm_gateway --features root-llm-provider -- -D warnings
  • cargo test -p ironclaw_reborn_composition --all-features --tests --no-run
  • cargo test -p ironclaw_reborn --all-features --tests --no-run
  • codex review --base origin/reborn-integration clean

@github-actions github-actions Bot added the scope: docs Documentation label May 16, 2026
@ilblackdragon
Copy link
Copy Markdown
Member

Code Review

Overview

Three coupled changes that ship together:

  1. ironclaw_reborn_composition absorbs the runtime-assembly slice — new runtime.rs (+753) + runtime_input.rs (+159) compose substrate + drivers + LLM gateway + turn-runner worker into a runnable RebornRuntime whose surface is three task-level methods.
  2. ironclaw-reborn run becomes a real agent-m single-shot or stdin REPL; spawns a tokio multi-thread runtime; surfaces RecoveryRequired as a cancel-then-error rather than a hang.
  3. ironclaw_reborn's ~25 pub use re-export wall is gonelib.rs is now a directory of pub mod declarations; internal tests rewritten to module-path imports.

+1683 / −205, 25 files. Locked in with new boundary tests.

Strengths

  • Surface narrowing is enforced, not aspirational. Two boundary tests do the work: reborn_internal_crate_keeps_directory_of_modules_lib_rs greps lib.rs for the nine forbidden pub use prefixes, and ironclaw_reborn_cli's BoundaryRule.forbidden now includes ironclaw_reborn, ironclaw_llm, ironclaw_loop_support, ironclaw_threads, ironclaw_turns. The CLI's exact-deps assertion is {ironclaw_reborn_composition, ironclaw_reborn_config} and nothing else — a regression would fail CI loudly.
  • RebornRuntime is genuinely task-level. new_conversation, send_user_message, shutdown — no TurnCoordinator, no SessionThreadService, no LoopExitApplier, no AcceptedMessageRef/IdempotencyKey plumbing leaks. The CLI imports literally four composition types (build_reborn_runtime, RebornRuntimeInput, RebornRuntimeIdentity, TurnRunnerSettings).
  • RecoveryRequired is converted to a clean cancel. The standalone runtime has no recovery worker, so the runtime explicitly issues cancel_run(SanitizedCancelReason::OperatorRequested) when it sees RecoveryRequired and returns the cancel response. Validated by the new stub_gateway_send_cancels_recovery_required_and_releases_conversation integration test, which also verifies the second send works — proving the conversation isn't permanently locked.
  • Profile gating is fail-closed. Non-LocalDev profiles return InvalidArgument("profile=… is not yet wired end-to-end by build_reborn_runtime; only local-dev is supported in this slice") rather than partially starting an agent.
  • send_lock: Mutex<()> serializes sends per runtime. Prevents two concurrent send_user_message calls from racing on the same conversation. Worth noting that this also serializes across distinct conversations — see Issues below.
  • Worker liveness check. Both at the start of send_user_message and inside wait_for_terminal, worker_handle.is_finished() is checked and surfaces WorkerStopped. No silent hang if the worker has died.
  • Shutdown is graceful and observable. worker_cancel.cancel() plus worker_handle.await with is_panic() branching — panic emits error!, cancellation emits warn!. Both paths return Ok(()).
  • REPL distinguishes TTY vs piped stdin. TTY mode prompts to stderr (so stdout stays clean for piping), prints errors but continues; piped mode propagates errors as nonzero exit. Smoke tests cover both modes (run_message_exits_nonzero_when_runtime_does_not_produce_reply, run_piped_stdin_exits_nonzero_when_runtime_does_not_produce_reply).
  • --dry-run preserves legacy diagnostic shape. Two existing smoke tests append --dry-run to the same run command they used to invoke — output fields are unchanged, so any external tooling that scrapes the snapshot keeps working without modification.
  • Drive-by fix is documented. A pre-existing duplicate impl LoopCancellationPort for ResumePayloadHost in planned_driver.rs tests was breaking baseline cargo test --no-run. Removed with an inline comment explaining the conflict; called out in the commit message. Exemplary handling.
  • parse_provider_protocol accepts aliases. "openai", "openai_completions", "deepseek", "openrouter" all map to canonical ProviderProtocol variants. Tested. Avoids a surprise wire-format breakage.
  • RebornRuntimeIdentity is parametrized. Tenant/agent/binding identifiers are no longer hardcoded "reborn-cli" strings buried in build_reborn_runtime. Future channel adapters supply their own identity instead of inheriting CLI-specific labels.
  • forbid(unsafe_code) on the composition crate. Good baseline.

Issues / Suggestions

1. PollSettings is hardcoded; no caller knob for the run-completion poll loop (medium)

PollSettings::default() gives interval = 100ms and max_total = 180s. The 180-second cap is the only escape hatch from wait_for_terminal. For an LLM that takes ~3 minutes (long-context Anthropic, retries, etc.), this caps at runtime and returns RunTimeout. There's no with_poll_settings builder on RebornRuntimeInput. Three minutes is fine as the default, but production callers will want to tune both the poll interval (latency vs. coordinator load) and the timeout. Worth adding pub fn with_poll_settings(mut self, settings: PollSettings) -> Self — even if PollSettings stays private behind a public-newtype facade.

2. send_lock: Mutex<()> is per-runtime, not per-conversation (medium)

The serialization is correct but coarser than it needs to be: two distinct conversations cannot have concurrent send_user_message calls in flight. For a single-user CLI this is fine, but as soon as a channel adapter or multi-tab UI hangs off RebornRuntime, throughput drops to one-at-a-time across all conversations. Consider keying the lock by ConversationId (e.g. DashMap<ConversationId, Mutex<()>> or a tokio::sync::RwLock over a per-conversation lock map).

3. No cooperative cancellation during wait_for_terminal (medium)

The CLI's tokio::select! on ctrl_c only triggers between REPL reads. While a send_user_message is mid-wait (up to 180s), Ctrl-C just kills the process. The wait loop has no cancellation token plumbing. Easy fix: thread a CancellationToken through send_user_message (or take one in RebornRuntimeInput), and select! on it inside the poll loop. The runtime already issues cancel_run on timeout — same path applies on user cancel.

4. wait_for_terminal doesn't apply backoff (low)

The 100ms fixed-interval poll is fine for local-dev but constant load on the coordinator. Production wiring will likely want exponential backoff up to a cap. Defer until production substrate lands, but worth flagging.

5. parse_provider_protocol uses serde_json::from_value(Value::String(...)) (nit)

This converts a string → JSON → enum via serde. A direct match (or implementing FromStr on ProviderProtocol) would be more idiomatic and avoid the serde_json round-trip. Boot-time only, so performance is not a concern — purely a clarity point.

6. protocol doc string still lists legacy spellings (nit)

RebornLlmConfig.protocol doc says "openai_completions", "anthropic", ..., "deepseek", "gemini", "openrouter". The parser also accepts the canonical serde snake_case names (open_ai_completions, deep_seek, open_router) — those should be listed first or instead, with the legacy names noted as accepted aliases. (Resolved differently in #3704 — worth syncing.)

7. CLI prints assistant reply via println! but errors via eprintln! (nit)

print_reply uses println! for the reply text and the fallback "(no assistant text; status=..., run_id=...)". The fallback is a diagnostic, not a reply, and arguably belongs on stderr — a script piping output for downstream processing currently gets that diagnostic mixed into its data stream when the run lands in RecoveryRequired/Failed.

8. validate_runtime_identity validates then stores as String (nit)

The function validates tenant_id, agent_id, source_binding_id, reply_target_binding_id with their typed constructors (TenantId::new, AgentId::new, etc.), then discards the typed values and stores Strings on RebornRuntime. Inside send_user_message, SourceBindingRef::new(self.source_binding_id.clone()) is called again. Storing the typed refs upfront would avoid the second validation pass (and the unreachable InvalidArgument it implies — which itself reads like a code-smell).

9. ironclaw_reborn::model_routes::ModelSlot is reached through reborn_model_slot_names() (nit)

The CLI's models list/status commands import a pub fn reborn_model_slot_names() -> Vec<&'static str> from composition. That function reaches into ironclaw_reborn::model_routes::ModelSlot::all(). This works but it's a thin re-projection; if the diagnostic surface stays this small, fine. If more slot metadata is needed later, define a public DTO on composition instead of re-extracting strings.

10. Behavior change: TurnRunnerSettings::default() changed silently (low — also flagged in #3704)

Duration::from_secs(10)/from_secs(2) → composition's existing default. PR #3704 then changes it again to from_secs(5)/from_millis(200). Independent of this PR, but the chain across the two reviewed PRs is worth a single CHANGELOG note for downstream callers that use TurnRunnerSettings::default() directly.

Risk summary

Low for the architecture moves (surface narrowing, composition centralization, boundary tests). Medium for the run-loop ergonomics — items 1–3 (configurable poll, per-conversation locking, in-flight cancellation) are the real follow-ups before this can be a foundation for production ingress adapters. Nothing here blocks landing the slice, but the next slice should pick up at least the cancellation token.

— Posted by Claude Code review

@serrrfirat
Copy link
Copy Markdown
Collaborator Author

Addressed the review feedback from #3695 (comment) in 43c56ce7a:

  • Added public PollSettings and RebornRuntimeInput::with_poll_settings so callers can tune completion polling timeout/interval.
  • Changed runtime send serialization from one per-runtime mutex to per-conversation locks.
  • Added cooperative cancellation via send_user_message_with_cancellation; CLI Ctrl-C now cancels in-flight runs instead of waiting for timeout.
  • Kept CLI stdout reply-only by moving no-reply diagnostics to stderr.
  • Stored validated typed identity refs on RebornRuntime instead of re-validating binding strings per send.
  • Replaced serde JSON protocol parsing with direct matching and updated protocol docs to list canonical names first plus aliases.
  • Added CHANGELOG note for Reborn runtime poll/turn-runner cadence behavior.

Validation:

  • cargo fmt --check
  • cargo check -p ironclaw_reborn_composition -p ironclaw_reborn_cli --all-targets
  • cargo test -p ironclaw_reborn_composition
  • cargo test -p ironclaw_reborn_cli
  • cargo clippy -p ironclaw_reborn_composition -p ironclaw_reborn_cli --all-targets -- -D warnings
  • cargo test -p ironclaw_reborn_composition --features root-llm-provider --lib

Note: scripts/pre-commit-safety.sh still reports existing PR-wide findings outside this review-fix diff (UTF8/TMPDIR/PANIC/PROJECTION); no new flagged patterns came from these touched files.

Copy link
Copy Markdown
Collaborator

@zmanian zmanian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall direction is right — collapsing the speculative pub use wall, pinning the new shape with boundary tests, and shipping a binary that actually exercises the assembled runtime are all things I have been asking for. A few things I want addressed or at least acknowledged before this lands.

1. Substrate/binary boundary — composition crate has grown a personality

The architectural premise I care about is: Reborn substrate (ironclaw_agent_loop, ironclaw_reborn, ironclaw_loop_support, ironclaw_threads, ironclaw_turns) must not depend on the current binary, and the substrate must remain reusable independent of any specific app.

This PR is fine on the binary side — ironclaw_reborn_cli's deps are pinned to {ironclaw_reborn_composition, ironclaw_reborn_config} and the boundary test forbids the substrate names. Good.

What I want flagged is what happened to ironclaw_reborn_composition. Before this PR, its docstring promised "substrate services only — Product/AppBuilder integration belongs in later slices." After this PR, the same crate hard-deps on ironclaw_reborn, ironclaw_threads, ironclaw_loop_support, and (gated) ironclaw_llm, plus tokio/tokio-util/uuid, and ships an InMemorySessionThreadService-bound RebornRuntime with a turn-runner worker spawned inside it. That is not a composition root, that is the live agent runtime. The crate is doing two jobs now:

  • substrate-only facades for callers that wire their own loop (build_reborn_services)
  • assembled-runtime ingress with in-memory thread store + worker (build_reborn_runtime / RebornRuntime)

These should be two crates, or at minimum the assembled-runtime piece should be feature-gated so a downstream consumer can take ironclaw_reborn_composition for facades without pulling ironclaw_reborn and the worker machinery. Otherwise "composition root" becomes the new wall of speculative API. I would accept the merge with a tracking issue, but please open it before this lands.

2. Public surface — what's gone, what should also have gone

Stripping the ~25 pub use re-exports from ironclaw_reborn/src/lib.rs is the right call and the boundary test pinning it is exactly the discipline I want. Two follow-up concerns:

  • Promoting modules to pub mod (loop_driver_host, model_routes, planned_driver_factory, runtime, app_loop_family, text_loop_driver, milestone_events, model_gateway) is functionally equivalent to keeping the re-exports if the items inside stay pub. The follow-up in the PR body (move integration tests to colocated #[cfg(test)] so these can become pub(crate) mod) needs to be tracked as a real ticket, not a parenthetical.
  • ironclaw_llm::create_registry_provider flipped from fn to pub fn with a one-line doc. Composition's needs are real, but pub on the registry-provider factory leaks the rig-core dispatch shape to anyone who depends on ironclaw_llm. Either gate behind a feature, move into a pub(crate)-with-helper-module, or commit to the API stability.

3. Live binary smoke — what is actually verified

run --message "ping" returning nonzero RecoveryRequired without an API key is good negative coverage, and the two new smoke tests pin that. What is not pinned anywhere I can find:

  • a happy-path turn that produces an assistant reply through the assembled runtime (the PR body shows it works with a real OPENAI_API_KEY but there is no recorded fixture or stub-gateway-with-success test)
  • RebornRuntime::send_user_message against a Completed run — the planned-loop harness in ironclaw_product_workflow/tests/ covers the worker, but not the public RebornRuntime API the CLI imports
  • the auto-cancel-on-RecoveryRequired branch in wait_for_terminal (a real product runtime will have a recovery worker; baking "cancel because no recovery worker" into the public runtime API is a footgun)

Add at least one composition-level test that drives build_reborn_runtimesend_user_message → assistant-text-returned with a recording gateway. The harness in support/planned_agent_loop.rs already has the pieces.

4. Profile coverage

build_reborn_runtime only wires LocalDev. Production and migration-dry-run return a clear error, which is correct, but the CLI's RebornProfile::LocalDev check is duplicated in run.rs — once the composition errors clearly, the CLI doesn't need its own match. Minor.

Verdict

Approve once (1) the composition-crate identity question gets a tracking issue, (2) create_registry_provider's public exposure is justified or scoped, and (3) one happy-path test against RebornRuntime::send_user_message lands. Everything else is follow-up.

— Zaki

@serrrfirat
Copy link
Copy Markdown
Collaborator Author

Addressed Zaki's review feedback from #3695 (review) in d80c5e18f:

  • Opened tracking issue Track Reborn composition crate split and public surface narrowing #3726 for the Reborn composition crate split / feature-gating question and follow-up narrowing of ironclaw_reborn public modules.
  • Scoped ironclaw_llm::create_registry_provider behind a new opt-in registry-provider-factory feature; internal provider-chain code now uses a private helper, and Reborn composition explicitly opts in via root-llm-provider.
  • Added a composition-level happy-path test that drives build_reborn_runtimeRebornRuntime::send_user_message through a recording model gateway and asserts TurnStatus::Completed plus assistant text.

Validation:

  • cargo fmt
  • cargo test -p ironclaw_reborn_composition
  • cargo test -p ironclaw_reborn_cli
  • cargo test -p ironclaw_reborn_composition --features root-llm-provider --lib
  • cargo check -p ironclaw_llm
  • cargo clippy -p ironclaw_reborn_composition -p ironclaw_reborn_cli -p ironclaw_llm --all-targets -- -D warnings

Note: scripts/pre-commit-safety.sh still reports existing PR-wide UTF8/TMPDIR/PANIC/PROJECTION findings outside this focused fix.

Copy link
Copy Markdown
Collaborator

@zmanian zmanian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review after d80c5e18f (and the prior runtime hardening passes).

Prior findings — status

1. Substrate/binary boundary (composition crate character change) — Resolved-as-tracked. #3726 is open and captures both the composition crate split / feature-gate decision and the follow-up narrowing of ironclaw_reborn public modules. I'm OK deferring the actual split given the live binary is downstream of ironclaw_reborn_composition and the assembled-runtime path is what ironclaw-reborn consumes today; what mattered was that we don't pretend this crate is still "substrate-only." The tracking issue makes the future cost explicit.

2. ironclaw_llm::create_registry_provider leaking rig-core — Resolved. Now gated behind opt-in registry-provider-factory feature; internal call sites use a private create_registry_provider_inner; ironclaw_reborn_composition's root-llm-provider feature transitively opts in. Default builds of ironclaw_llm no longer expose the registry factory. Good outcome.

3. Live-binary happy-path coverage — Resolved. send_user_message_returns_completed_assistant_text_with_recording_gateway drives build_reborn_runtimenew_conversationsend_user_message through a RecordingGateway and asserts TurnStatus::Completed plus the assistant text. The test-only with_model_gateway_override plumbing is #[cfg(test)]-gated and doesn't leak into the public input surface. Combined with the negative-path test that was already there, the runtime ingress is now pinned on both sides.

4. wait_for_terminal auto-cancelling RecoveryRequired — Resolved. 74c9636 makes the cancellation explicit with a comment that the standalone runtime has no recovery worker and must release the conversation lock; the test was updated to assert TurnStatus::Cancelled rather than RecoveryRequired. This is the right behavior for the standalone composition — once a real recovery worker exists, this branch is what will need to change, and the comment flags exactly that.

New observations

  • runtime.rs build_reborn_runtime now has nested #[cfg(test)] / #[cfg(not(test))] inside #[cfg(feature = "root-llm-provider")] arms for the gateway override. It works but it's getting hard to read; if a third axis shows up, extract a small select_model_gateway helper.
  • The pre-commit safety findings the author called out (UTF8/TMPDIR/PANIC/PROJECTION) are pre-existing on this branch and not introduced by this PR; not blocking here, but worth folding into a follow-up sweep.

Verdict

Approve. All three prior findings landed cleanly, the deferred split has an issue, and the runtime now has both happy-path and recovery-path coverage. Thanks for the thorough turnaround.

— Zaki

@serrrfirat serrrfirat merged commit 1e5c87c into reborn-integration May 17, 2026
15 checks passed
@serrrfirat serrrfirat deleted the reborn/composition-runtime-and-narrow-surface branch May 17, 2026 18:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor: core 20+ merged PRs risk: low Changes to docs, tests, or low-risk modules scope: dependencies Dependency updates scope: docs Documentation size: XL 500+ changed lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants