feat(reborn): apply universal FS dispatch across consumer crates#3679
feat(reborn): apply universal FS dispatch across consumer crates#3679ilblackdragon wants to merge 95 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request migrates persistence logic across several crates to a unified RootFilesystem interface, effectively replacing the ironclaw_storage crate. The changes introduce append and tail operations for durable logging and implement Full-Text Search and Vector similarity search for the in-memory, libSQL, and PostgreSQL backends. New filesystem-backed repositories are added for memory documents, outbound state, processes, and secrets. Feedback identifies critical performance and concurrency issues, including O(N) bottlenecks in event log head probing, TOCTOU race conditions in secret and session consumption, and inefficient N+1 query patterns during directory listings that require optimization through indexing or targeted queries.
| let head = self | ||
| .fs | ||
| .tail(&path, SeqNo::from_backend(0)) | ||
| .await | ||
| .map_err(map_filesystem_tail_error)? | ||
| .last() | ||
| .map(|record| record.seq.get()) | ||
| .unwrap_or(0); |
There was a problem hiding this comment.
Using tail to probe the stream head is a significant performance risk as it fetches the entire event log into memory. To avoid O(N) performance bottlenecks in append operations for durable logs, maintain a cursor cache or perform bounded reverse scans to read only the necessary tail records.
References
- Avoid O(N) performance bottlenecks in append operations for durable logs by maintaining a cursor cache or performing bounded reverse scans to read only the necessary tail records.
| let material = SecretMaterial::from(decrypted.expose().to_string()); | ||
|
|
||
| lease.status = SecretLeaseStatus::Consumed; | ||
| self.write_lease(&lease).await?; |
There was a problem hiding this comment.
The consume operation lacks atomicity across multiple processes, creating a time-of-check-to-time-of-use (TOCTOU) race condition. To prevent this, re-verify the condition (lease status) after acquiring the write lock or use the record version for optimistic concurrency control (CAS) to ensure only one update succeeds.
References
- To prevent a time-of-check-to-time-of-use (TOCTOU) race condition when checking a condition under a read lock before performing a write, re-verify the condition after acquiring the write lock but before executing the write.
| let entry = Entry::bytes(body).with_content_type(ContentType::json()); | ||
| self.filesystem | ||
| .put(&path, entry, CasExpectation::Any) | ||
| .await |
There was a problem hiding this comment.
consume_session_use suffers from a TOCTOU race condition in multi-process deployments. Concurrent updates will result in lost increments. This operation must use the record version for optimistic concurrency control or re-verify the state after acquiring a write lock to ensure atomicity.
References
- To prevent a time-of-check-to-time-of-use (TOCTOU) race condition when checking a condition under a read lock before performing a write, re-verify the condition after acquiring the write lock but before executing the write.
| for entry in entries { | ||
| if !entry.name.ends_with(".json") { | ||
| continue; | ||
| } | ||
| let Some(attempt) = self | ||
| .get_json::<OutboundDeliveryAttempt>(&entry.path) | ||
| .await? | ||
| else { | ||
| continue; | ||
| }; | ||
| if scope_matches(&attempt.scope, &scope) { | ||
| deliveries.push(attempt); | ||
| } | ||
| } |
There was a problem hiding this comment.
This loop performs an N+1 query pattern by calling get for every entry. To optimize list operations that involve scanning multiple directories and files, consider maintaining an index file that directly lists the paths to the relevant items, or use targeted database queries to fetch specific records instead of loading all records and filtering in the application.
References
- To optimize list operations that involve scanning multiple directories and files, maintain an index file that directly lists the paths to the relevant items.
- Use targeted database queries to fetch specific records instead of loading all records and filtering in the application.
| for entry in entries { | ||
| if entry.name.ends_with(".json") { | ||
| let bytes = self.filesystem.as_ref().read_file(&entry.path).await?; | ||
| let record = deserialize::<ProcessRecord>(&bytes)?; | ||
| // Reviewer (PR #3666) flagged: a `get` returning `None` after | ||
| // `list_dir` enumerated the path indicates a race or backend | ||
| // inconsistency. Returning a partial process list silently | ||
| // hides this; surface it as `NotFound` so callers see the | ||
| // same failure shape they got with the legacy `read_file` | ||
| // path. | ||
| let Some(versioned) = self.filesystem.as_ref().get(&entry.path).await? else { | ||
| return Err(ProcessError::Filesystem(format!( | ||
| "process record listed but missing at {}", | ||
| entry.path | ||
| ))); | ||
| }; | ||
| let record = deserialize::<ProcessRecord>(&versioned.entry.body)?; | ||
| if same_scope_owner(&record.scope, scope) { | ||
| records.push(record); | ||
| } |
There was a problem hiding this comment.
The list method implements an N+1 pattern. This should be optimized by maintaining an index file that directly lists the paths to the relevant items, avoiding the overhead of multiple get calls during directory traversal.
References
- To optimize list operations that involve scanning multiple directories and files, maintain an index file that directly lists the paths to the relevant items.
| let mut leases = Vec::new(); | ||
| for entry in entries { | ||
| if !entry.name.ends_with(".json") { | ||
| continue; | ||
| } | ||
| let Some(versioned) = self | ||
| .filesystem | ||
| .get(&entry.path) | ||
| .await | ||
| .map_err(fs_to_secret_store_error)? | ||
| else { | ||
| continue; | ||
| }; | ||
| let mut stored: StoredLease = deserialize_secret(&versioned.entry.body)?; | ||
| if !same_scope_for_lease(&stored.scope, scope) { | ||
| continue; | ||
| } | ||
| stored.status = Self::effective_status(&stored, now); | ||
| leases.push(Self::lease_to_public(&stored)); | ||
| } |
There was a problem hiding this comment.
leases_for_scope performs an N+1 query pattern. To optimize this, use targeted database queries to fetch specific records in a single operation, or maintain an index file to avoid costly directory traversal.
References
- Use targeted database queries to fetch specific records instead of loading all records and filtering in the application.
- To optimize list operations that involve scanning multiple directories and files, maintain an index file that directly lists the paths to the relevant items.
Three changes that close out the remaining HIGH-severity feedback from the self-review (#1 #2 #3 #4 already addressed in 990c4f7 + e2530ad): **#2 — `parse_state` silent fallback to Pending removed.** `src/db/filesystem_jobs.rs::parse_state` previously mapped unknown status strings to `JobState::Pending`, masking schema drift across a rollout (a new state value appearing in stored rows would silently lose its true value). Now returns `Result<JobState, DatabaseError>` and the single caller propagates with `?`. Matches the wire-stable enums rule in `types.md`. **#6 — `is_engine_unsupported` no longer substring-matches.** `crates/ironclaw_engine/src/store/filesystem.rs`: the typed `FilesystemError::Unsupported` discriminator gets lost when wrapped in `EngineError::Store { reason: String }`, so the old check `reason.contains("Unsupported")` would false-positive on any unrelated store error that mentioned the word. Now `fs_to_engine_error` tags the discriminator with a stable `[fs:unsupported]` sentinel and the check matches that sentinel — discriminator-preserving without changing the public `EngineError` shape. **#7 — `FilesystemChannelPairingStore` no longer drops corrupted records.** `src/db/filesystem_pairing.rs::find_pending_requests`: the old code silently filtered records whose JSON failed to deserialize, hiding data corruption. Now propagates `DatabaseError::Serialization` with the stored path so the operator sees the failure. Also: `// silent-ok:` annotations added to the three engine `Store` sites where read-modify-write on unknown ids is intentionally a no-op (matches HybridStore parity per its CLAUDE.md). Each annotation names the legacy contract being preserved. Verification: `cargo check --workspace --all-features` clean; `cargo test -p ironclaw_engine --all-features` 549/549; `cargo test --lib --all-features db::filesystem` 88/88; `cargo fmt --check` clean.
|
Self-review round done. Spawned a code-reviewer agent against the integrated branch; it returned 7 HIGH / 8 MEDIUM / 7 LOW findings. The HIGH-severity fixes are now in:
Also (peripheral but related): vector "nearest" in libsql + postgres no longer loads every row's body to rank — first stage selects MEDIUM/LOW findings (N-scan operations missing indexes, file-size budget violations, etc.) are tracked as follow-ups; none affect correctness. Test counts after the fix round (local,
CI now running on the integrated branch (run 25911695237). |
|
Codex review found several actionable correctness/security issues in the filesystem-backed stores:
|
`list_messages_internal`, `list_conversations_summary`, and `run_query` each called `filesystem.query(.., Page::new(0, Page::MAX_LIMIT))` exactly once and trusted the result was complete. Because `Page::MAX_LIMIT == 1024`, conversations with >1024 messages or scopes with >1024 jobs/actions/estimations silently lost every row past the cap, and the `has_more` flag in `list_conversation_messages_paginated` became meaningless once the dropped tail crossed the page boundary. Codex PR #3679 P2 review flagged the pattern. Extract a shared `query_all_pages` helper in `filesystem_conversations` that loops `query(..., Page::new(offset, MAX_LIMIT))` until a short page comes back, then reuse it from `filesystem_jobs::run_query` and from the inline scan in `update_estimation_actuals`. The helper preserves the existing `NotFound -> Vec::new()` short-circuit and the `fs_err_to_database` error mapping so call sites are otherwise unchanged. Regression tests: - `list_messages_drains_pages_beyond_max_limit` writes `MAX_LIMIT + 5` messages and asserts the full count round-trips through `list_conversation_messages` and that `list_conversation_messages_paginated` reports `has_more` honestly for both partial and exhaustive windows. - `get_job_actions_drains_pages_beyond_max_limit` writes `MAX_LIMIT + 3` actions on one job and asserts the full count comes back in sequence order. - `list_agent_jobs_drains_pages_beyond_max_limit` writes `MAX_LIMIT + 7` jobs and asserts both `list_agent_jobs` and `agent_job_summary` count every row.
Two HIGH-severity findings on PR #3679. Both sites read a versioned entry, validated a one-shot/use-limit condition, then wrote back with `CasExpectation::Any`. The process-local mutex only serializes writers inside one process; multi-process callers sharing the same backend root could both pass the check and overwrite each other. - `FilesystemSecretStore::consume` — two consumers could both observe an Active one-shot lease, both decrypt, and both overwrite the consumed marker. - `FilesystemCredentialBroker::consume_session_use` — two consumers could both pass the max-uses check at `uses=N-1` and overwrite each other's increment, losing a use. Both now use the canonical retry-on-`FilesystemError::VersionMismatch` pattern from `ironclaw_engine::store::filesystem::update_thread_state` (post-`e2530adff`): re-read, re-evaluate the consume/use-limit condition, write with `CasExpectation::Version(versioned.version)`. A shared `CAS_RETRY_ATTEMPTS = 3` constant bounds the loop; exhausting it surfaces a transient backend error rather than papering over pathological hot-spots. Also annotated `leases_for_scope` with a `TODO(perf)` covering the N+1 list+get fan-out — bounded today by the owner-prefix path layout and short lease TTLs; replacing it with `Filter::Eq` over `query` requires the secrets store to declare its first index, which is a follow-up. Regression coverage: two new tests wrap `InMemoryBackend` with a `VersionRacingBackend` that bumps the watched path's version out-of-band on the first versioned `put`, forcing a `VersionMismatch` and exercising the retry loop. They also assert that the retried CAS write actually persisted (the next consume hits LeaseConsumed; the next three increments exhaust the max-uses budget).
Four P2 correctness fixes from the codex/gemini review. **Settings keys use percent-encoding** (`src/db/filesystem_settings.rs`): `encode_segment` previously mapped `/`, space, control chars, and others all to `_`. Keys like `a/b` and `a_b` collided onto the same path and silently overwrote each other. Now percent-encodes every byte outside the unreserved set so distinct inputs map to distinct outputs. **SQL index names get a blake3 suffix on overflow** (`crates/ironclaw_filesystem/src/db.rs`): `sql_index_name` truncated identifiers exceeding 62 chars without disambiguating, so two distinct long `(prefix, name)` specs could collapse onto the same DDL object — `CREATE ... IF NOT EXISTS` would silently reuse the wrong index/trigger. Now appends an 8-char blake3 hash suffix before truncating. Added `blake3 = "1"` to the crate's deps (small + already used by other workspace crates). **InMemoryBackend rejects writes over implicit directories** (`crates/ironclaw_filesystem/src/in_memory.rs`): the SQL backends refuse `put(/a)` when `/a/b` exists (treating `/a` as a directory). The in-memory reference impl silently accepted those writes, letting tests pass against production-impossible state. Mirror the SQL contract. **Event-store head-probe is bounded** (`crates/ironclaw_reborn_event_store/src/filesystem_store.rs`): The replay-gap detection previously called `tail(path, 0)` to read the whole log just to look at its last seq — O(N) on every cold-path call. Now probes `tail(path, after - 1)`: a non-empty result means head == after (consumer is caught up); empty means head < after (foreign-future cursor). Returns at most one record instead of the entire log. Verification: cargo check --workspace --all-features clean; cargo test -p ironclaw_filesystem -p ironclaw_secrets -p ironclaw_reborn_event_store --all-features all pass.
|
Review-response summary — all HIGH-severity and P2-correctness findings from gemini-code-assist + codex review are now addressed.
Verification after the fix round:
|
First consumer migration onto the new RootFilesystem surface. Switches the byte-plane read_file/write_file calls inside ironclaw_processes' filesystem-backed store to the unified put/get ops with Entry::bytes + CasExpectation::Any. The on-disk JSON layout is unchanged, every existing test passes, and downstream crates that construct FilesystemProcessStore (ironclaw_host_runtime + tests) don't need to change. Scope deliberately narrow: opaque-file entries through `put`/`get` without record kinds or non-`Any` CAS, since LocalFilesystem's native `put` only accepts that shape (per the foundation PR #3659). Once LocalFilesystem grows sidecar metadata, this consumer can switch to `Entry::record(process_record_kind, ...)` + `CasExpectation::Absent` without changing the on-disk layout. Touch points: - write_record uses put(Entry::bytes, CAS::Any) - start uses get for the existence probe + transition_lock for the atomicity envelope per the single-instance invariant - update_status / get / records_for_scope read via get and unwrap VersionedEntry.body - records_for_scope returns ProcessError::Filesystem (not silent skip) when get returns None for a path that list_dir just yielded — matches the pre-migration NotFound propagation invariant Test scaffold update: BackendErrorFilesystem now overrides `get` too, so the fault-propagation regression test continues to exercise its intended path. (Reviewer P1/P2 on the original #3666 — recursion + silent-skip — addressed in foundation #3659 directly since LocalFilesystem now ships native `put`/`get`.)
Stacked on the consolidated foundation PR #3659. Adds an OutboundStateStore impl that persists outbound metadata under /engine/outbound/{policies,subscriptions,deliveries} through any RootFilesystem. The existing libSQL/Postgres/in-memory stores stay intact during the migration; a follow-up cleanup PR can delete them once production runs on the unified surface. The new store passes the full contract suite (durable_policy_*, subscription_cursor_*, delivery_status_*, notification_policy_*, full_turn_scope_isolation) against InMemoryBackend in the existing outbound_state_store_contract.rs test file.
…fied put/get Stacked on PR #3670 (outbound). Mirrors the ironclaw_processes migration in PR #3666 / now consolidated into #3659. Switches the filesystem-backed lease store's read_file/write_file calls to the unified get/put ops with Entry::bytes + CasExpectation::Any. The on-disk JSON layout is unchanged, every existing test passes, and the per-owner mutation_lock continues to serialize claim/consume/revoke within a single instance. Touch points: - read_lease, read_lease_index, read_lease_file — now use get and unwrap VersionedEntry.body. - write_lease, write_lease_index — now use put(Entry::bytes, Any). - Imports updated. - CountingFilesystem test scaffold gains put/get overrides that forward to its inner LocalFilesystem, since the trait defaults are now Unsupported after the PR #3659 recursion fix.
Stacked on PR #3671 (authorization). Mirrors processes (#3666) and authorization (#3671) migrations. Switches all read_file/write_file calls in FilesystemRunStateStore and FilesystemApprovalRequestStore to the unified get/put ops with Entry::bytes + CasExpectation::Any. On-disk JSON layout unchanged. Test scaffold updates: ConcurrentMissingReadFilesystem and DisappearingApprovalReadFilesystem gain put/get overrides that forward to their inner LocalFilesystem and apply the same fault injection logic on the unified read path (was: only on the legacy read_file path). Required after the trait defaults moved to Unsupported in PR #3659.
The ironclaw_storage crate predates the unified RootFilesystem surface introduced by PR #3659 (universal FS dispatch). Its `BlobStore`/`RecordStore` traits, `StorageKey`/`StorageVersion`/`PutCondition` types, and `StoredBlob`/`StoredRecord` shapes parallel the new unified put/get /CasExpectation/RecordVersion machinery on `RootFilesystem` — a textbook duplicate-dispatch smell flagged by .claude/rules/architecture.md. Only `ironclaw_outbound` consumed any of the crate, and only 5 small helpers (`encode_json`, `decode_json`, `redacted_backend_error`, `StorageError::Backend`, `ABSENT_SCOPE_COMPONENT`). All other types and the entire `BlobStore`/`RecordStore` surface (660 LOC) were unused — their intended consumers already moved to `RootFilesystem` directly. Inlined the 5 helpers into `crates/ironclaw_outbound/src/db.rs`: - `encode_json`/`decode_json` → direct `serde_json::to_string`/`from_str` - `redacted_backend_error` → local log+collapse to `OutboundError::Backend` (preserves the redaction boundary required by ironclaw_outbound/CLAUDE.md) - `ABSENT_SCOPE_COMPONENT` → local const "" Removed the crate's workspace membership, the outbound dep, the forbidden-edges BoundaryRule, and the crate directory. Also updated the ironclaw_outbound BoundaryRule to permit a normal dependency on `ironclaw_filesystem` — `FilesystemOutboundStateStore` landed in the prior cascade PR and the boundary rule was stale.
…egacy Two changes that close out the demoable parts of the universal-FS-dispatch rework (tasks #18 and the demonstrable portion of #19 from the plan). **HsmBackend placeholder** (`crates/ironclaw_filesystem/src/hsm.rs`). Demonstrates that a new backend is a single-file change: implements the one `RootFilesystem` trait, declares a restricted capability surface (`Read` + `Write` + `Stat` + `Delete` + `TxnCapability::Cas` — no records, no query, no index, no events, no multi-key transactions), and routes `put`/`get`/`delete`/`stat`/`list_dir` through an in-process placeholder. Five tests prove the seam works end-to-end: - `hsm_supports_encrypted_bytes_round_trip` — bytes put/get works. - `hsm_rejects_structured_records` — `put` with `RecordKind::Some` or non-empty `indexed` returns `Unsupported`, so a consumer cannot accidentally route records through encryption-only storage. - `hsm_rejects_query_and_index_ops` — `query`/`ensure_index` return `Unsupported` consistent with the declared capabilities. - `composite_rejects_overclaimed_hsm_descriptor` — mount-time validation (`validate_mount_capabilities`) refuses a descriptor that claims `Query`/`IndexExact` over a backend that doesn't deliver, failing with `FilesystemError::DescriptorOverclaims { missing, .. }`. - `composite_routes_to_hsm_under_secrets_mount` — the acceptance gate: mounting HsmBackend at `/secrets` and routing put/get through the composite works with no consumer-visible changes. Indexed projection is still rejected because the declared capabilities advertise no index/query support. A real HSM implementation replaces the in-memory placeholder with an HSM session handle; the trait surface, capability declarations, and mount-time validation are reusable as-is. The placeholder is *not* a security boundary — it is a seam demonstration. **database.md scoped to legacy directories**. The dual-backend rule file (`.claude/rules/database.md`) is `paths`-scoped to `src/db/**`, `src/history/**`, and `migrations/**` — exactly the legacy surface that predates the universal FS dispatch. Added a "Status & Direction" preamble pointing new persistence work at `ScopedFilesystem` and the `2026-05-14-universal-fs-dispatch.md` plan, with the existing per-crate dual-backend guidance kept (and tagged "legacy") for code still inside those directories.
…ection Decorate every `FilesystemCapabilityLeaseStore::write_lease_raw` and `write_lease_index` `Entry` with a `tenant_id` indexed projection so an admin-tier query can filter explicitly by tenant and a path-rewriting bug surfaces as a query-time mismatch rather than silent cross-tenant leakage. The projection is additive — `ScopedFilesystem` path-prefix scoping remains the primary isolation boundary. Adds: - `index_key_tenant_id()` / `index_name_authorization_tenant()` helpers and `ensure_tenant_id_index` declared on the per-owner lease prefix - Tenant projection on lease and lease-index entries - Byte-only-backend fallback on both writes (strip the projection and downgrade CAS to `Any` when LocalFilesystem returns `Unsupported`), matching the sibling crates' fallback shape - Regression test that issues a lease and asserts a positive `Filter::Eq` query (matching tenant) returns rows and a negative query (different tenant) returns zero rows Refs docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md.
Clarify in the tenant-isolation plan's "What this gives us" section that the `tenant_id` defense-in-depth projection covers processes/secrets/ outbound/authorization but deliberately excludes the engine `Store`. The engine trait is single-tenant-by-construction per Open Question 2 — it never sees `tenant_id` internally and has no scope source to drive the projection at the write site, so it relies on path-prefix scoping alone.
…fs-rework-unified # Conflicts: # crates/ironclaw_reborn/src/planned_driver.rs
…ement contract The architecture boundary test `reborn_virtual_roots_match_storage_placement_contract` (`crates/ironclaw_architecture/tests/reborn_dependency_boundaries.rs`) asserts that `ironclaw_host_api::VIRTUAL_ROOTS` matches the canonical roots in `docs/reborn/contracts/storage-placement.md` and `docs/reborn/contracts/filesystem.md`. The previous commit (`b30c31eae`) added five consumer-store roots to `VIRTUAL_ROOTS` without updating the contract docs and collapsed three `/system/*` subroots into a single `/system` entry, which broke the test in CI. - Restore `/system/settings`, `/system/extensions`, `/system/skills` in `VIRTUAL_ROOTS` (they are still load-bearing — used by `ironclaw_dispatcher` for extension discovery and mount roots). - Keep the five new consumer-store roots (`/processes`, `/authorization`, `/outbound`, `/tenant-shared`, `/tenants`) and add matching rows to both contract docs explaining what each one carries. - Split the unified `/system` mount alias in `default_singleton_mount_view` / `invocation_mount_view` into one alias per canonical subroot, since each subroot is the smallest VirtualPath unit reserved by the contract. Tests updated to match. After this commit, `cargo test -p ironclaw_architecture --test reborn_dependency_boundaries` passes and the composition mount-view tests still pass.
These per-backend OutboundStateStore impls predate the universal-FS dispatch design. Now that FilesystemOutboundStateStore routes through RootFilesystem, the backend choice happens at the RootFilesystem layer (LibSqlRootFilesystem vs PostgresRootFilesystem vs InMemoryBackend) — no need for outbound-specific persistence impls. - Delete src/libsql_store.rs (LibSqlOutboundStateStore) - Delete src/postgres_store.rs (PostgresOutboundStateStore) - Delete src/db.rs (helpers used only by the above) - Delete the four contract tests that exercised them (libsql_persists*, libsql_rejects*, postgres_persists*, postgres_rejects*); durability across reopen is now a property of the RootFilesystem backend, not of outbound-specific persistence - Inline-gate the OutboundPushKind/OutboundDeliveryStatus as_str helpers as #[allow(dead_code)] — they were only consumed by the deleted SQL bodies No production callers; no composition rewiring needed (composition already uses FilesystemOutboundStateStore).
… follow-ups After deleting LibSql/PostgresOutboundStateStore (`d4bf7b3c2`), the remaining cleanup splits cleanly into three independent follow-up PRs: - secrets: needs the master-key decryptability check reproduced on FilesystemSecretStore before composition can switch off LibSqlSecretsStore / PostgresSecretsStore. - authorization: FilesystemCapabilityLeaseStore needs to move from borrowed `&'a ScopedFilesystem<F>` to `Arc<ScopedFilesystem<F>>` so it fits `Arc<dyn CapabilityLeaseStore>` for composition wiring. - run_state: not yet migrated to ScopedFilesystem; the same shape as the other migrated consumers is the prerequisite before LibSqlRunStateStore / PostgresRunStateStore can be deleted. Each is independent (200-400 lines + composition rewiring + tests) and shouldn't pile onto PR #3679. The "trait collapse" further step (rename Filesystem*Store → *Store, delete the trait) widens HostRuntimeServices type signatures and is a deeper structural change deserving its own PR after the SQL impls are gone.
Adds `FilesystemSecretStore::verify_can_decrypt_existing_secrets`, porting the fail-loud-on-master-key-mismatch contract from `LibSqlSecretsStore` / `PostgresSecretsStore` so composition can drop the SQL backends. The sentinel lives at `/secrets/_master_key_check.json` under the caller's MountView; constant-time compare against the fixed plaintext mirrors the libSQL / Postgres implementations the SQL backends carry today. Driven by the "Legacy per-backend store cleanup — ironclaw_secrets" entry in docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md.
These per-backend CapabilityLeaseStore impls predate the universal-FS
dispatch design. Now that FilesystemCapabilityLeaseStore routes through
RootFilesystem (via ScopedFilesystem), the backend choice happens at
the RootFilesystem layer (LibSqlRootFilesystem vs PostgresRootFilesystem
vs InMemoryBackend) — no need for an authorization-specific persistence
impl per backend.
Cites the cleanup entry in
docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md
("Legacy per-backend store cleanup" → ironclaw_authorization row).
- Migrate FilesystemCapabilityLeaseStore from `<'a, F>` + `&'a ScopedFilesystem<F>`
to `<F>` + `Arc<ScopedFilesystem<F>>` so composition can hold it as
`Arc<dyn CapabilityLeaseStore>` without erasing a lifetime parameter.
Same shape as FilesystemSecretStore, FilesystemOutboundStateStore,
FilesystemProcessStore.
- Preserve the AUTH-CRITICAL H4 CAS-Version retry + Unsupported→Any
fallback in update_lease_cas / write_lease_raw — only the borrow→Arc
shape changed; serialize/deserialize, mutation_lock, tenant_id
projection, and revoke/claim/consume semantics are untouched.
- Delete crates/ironclaw_authorization/src/db.rs (LibSqlCapabilityLeaseStore
+ PostgresCapabilityLeaseStore impls + helpers used only by them).
- Delete tests/db_capability_lease_store_contract.rs (the legacy
contract test exercised the SQL stores directly; durability across
reopen is now a property of the RootFilesystem backend, not of an
authorization-specific persistence impl, and the filesystem store
reload paths are exercised by capability_lease_contract.rs).
- Drop the `libsql` and `postgres` cargo features on
ironclaw_authorization (and their `dep:libsql` / `dep:deadpool-postgres` /
`dep:tokio-postgres` / `tracing` dependencies — only the deleted SQL
code used them).
- Composition (lib.rs + factory.rs): replace
Arc::new(LibSql/PostgresCapabilityLeaseStore::new(db|pool)) +
`.run_migrations().await?` with
Arc::new(FilesystemCapabilityLeaseStore::new(Arc::clone(&scoped_filesystem))).
Migrations are now the RootFilesystem's responsibility, already run
on `filesystem.run_migrations()` earlier in the same function.
- Update ironclaw_reborn_composition's `libsql`/`postgres` features to
drop their `ironclaw_authorization/libsql` and `/postgres` flags.
- Update the host_runtime durable-restart integration test from the
leaked `&'static ScopedFilesystem<_>` shape to a plain
`Arc<ScopedFilesystem<LocalFilesystem>>`; the Box::leak is gone.
Verified:
- cargo clippy -p ironclaw_authorization -p ironclaw_capabilities
-p ironclaw_reborn_composition --all-features --tests -- -D warnings clean
- cargo test -p ironclaw_authorization --all-features: 27 passed
- cargo check --workspace --all-features clean
…sition
The universal-FS dispatch design says backend choice happens at the
RootFilesystem layer, not at the consumer-store layer.
FilesystemSecretStore + FilesystemCredentialBroker already route
through ScopedFilesystem over any RootFilesystem (LibSqlRootFilesystem
/ PostgresRootFilesystem / InMemoryBackend / LocalFilesystem), so the
LibSqlSecretsStore / PostgresSecretsStore / LibSqlCredentialStore /
PostgresCredentialStore siblings are vestigial duplication.
Composition (`crates/ironclaw_reborn_composition/src/{lib,factory}.rs`)
now constructs FilesystemSecretStore over the existing
scoped_filesystem (wrap_scoped). The master-key decryptability check
that used to live on LibSqlSecretsStore was ported to
FilesystemSecretStore::verify_can_decrypt_existing_secrets in
commit 36d230e; composition calls it on every production startup
so a master-key mismatch still fails loud.
Deleted:
- `crates/ironclaw_secrets/src/db.rs` (LibSql + Postgres impls + helpers)
- `crates/ironclaw_secrets/tests/db_secret_store_contract.rs`
- `crates/ironclaw_secrets/tests/db_credential_store_contract.rs`
- `crates/ironclaw_secrets/tests/security_findings_poc.rs` (only exercised the SQL stores)
- `pub use db::{LibSqlSecretsStore, …}` re-exports in `lib.rs`
- `libsql` / `postgres` features on `ironclaw_secrets` (no consumer left)
- `deadpool-postgres` / `libsql` / `tokio-postgres` deps from `ironclaw_secrets/Cargo.toml`
- Feature-union references in `ironclaw_reborn` / `ironclaw_reborn_composition`
Public `SecretStore` and `CredentialBroker` traits are unchanged; the
new `FilesystemSecretStore` already implements them. The
`ScopedSecretsStoreAdapter` shim that previously wrapped the legacy
SQL stores is also unused now and can be removed in a follow-up
(left in place this commit to keep the diff focused on the legacy
deletion).
Per the design-doc entry "Legacy per-backend store cleanup —
ironclaw_secrets" in
docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md.
… Filesystem{RunState,ApprovalRequest}Store
Mirrors the engine migration in commit ac8e677 and the parallel
processes / secrets / authorization / outbound refactors:
`FilesystemRunStateStore` and `FilesystemApprovalRequestStore`
previously held a `&'a F: RootFilesystem` borrow and manually encoded
tenant/user identity in every path via `tenant_user_root` (formatted
`/engine/tenants/<tenant>/users/<user>/agents/.../projects/.../runs|approvals`).
Any composition layer that forgot to wrap the backend in a tenant
scope would leak across tenants with the type system saying nothing —
the HIGH-severity class of finding tracked in PR #3679 for
`ironclaw_engine`.
Migrated the run-state / approval stores so tenant isolation is
structural rather than something this crate has to re-derive from
`ResourceScope.tenant_id` / `user_id`:
- `FilesystemRunStateStore::new` and `FilesystemApprovalRequestStore::new`
now take `Arc<ScopedFilesystem<F>>`. The `'a` lifetime parameter on
the struct is gone — composition can hold these as
`Arc<dyn RunStateStore>` / `Arc<dyn ApprovalRequestStore>` without
pinning a borrow.
- Path helpers (`run_record_path`, `run_records_root`,
`approval_record_path`, `approval_records_root`) return `ScopedPath`
instead of `VirtualPath`. The on-disk layout under the `/run-state`
and `/approvals` mount aliases is:
/run-state[/agents/<agent>][/projects/<project>][/missions/<mission>][/threads/<thread>]/runs/<invocation_id>.json
/approvals[/agents/<agent>][/projects/<project>][/missions/<mission>][/threads/<thread>]/<request_id>.json
The leading `/engine/tenants/<tenant>/users/<user>` prefix is gone;
the caller's `MountView` resolves the alias to a tenant/user-scoped
target at every op. Sub-scope axes (agent/project/mission/thread)
stay in the alias-relative path because they are within-tenant
scoping not covered by the per-tenant `MountAlias`.
- `records_for_scope` on both stores reconstructs each child
`ScopedPath` from the `list_dir`-returned `VirtualPath` leaf via a
new `join_scoped` helper so the per-op ACL still runs on the
follow-up `get` (mirrors the engine / processes / secrets / outbound
store's `join_scoped` shape).
- `put_with_cas` now takes `&ScopedFilesystem<F>` / `&ScopedPath`. The
`Unsupported`→`Any` fallback semantics are unchanged, so byte-only
backends (`LocalFilesystem`) keep working through the per-path
`FILESYSTEM_RECORD_LOCKS` map.
- `filesystem_record_lock` now keys on `ScopedPath` instead of
`VirtualPath` so the lock is alias-relative (one process-local lock
per logical record, not per resolved target).
- `tenant_user_root` is removed; its agent/project/mission/thread
segments move into the new alias-relative `scope_owner_alias_string`
helper.
Two sibling mount aliases (`/run-state` + `/approvals`) on one
`Arc<ScopedFilesystem<F>>`: composition wires both aliases on the same
shared `MountView`, so this crate can drive run-state and approvals
through one filesystem handle while keeping the on-disk subtrees
distinct.
Tests:
- `tests/run_state_contract.rs` now constructs a `ScopedFilesystem`
over `LocalFilesystem` with `MountPermissions::read_write_list_delete()`
grants on both `/run-state` and `/approvals` aliases pointing at
tenant1/user1 targets. The existing cross-tenant assertions still
pass because the post-read `same_scope_owner` check filters out
records whose in-body scope differs from the request scope.
- `tests/approval_resolution_contract.rs` follows the same shape with
a local `scoped_run_state_fs` helper.
- New regression test
`filesystem_run_state_store_isolates_two_tenants_with_same_user_project_ids`
wires two `FilesystemRunStateStore`s + `FilesystemApprovalRequestStore`s
over one `LocalFilesystem` with different `MountView` targets but
identical `user_id` / `project_id` / `invocation_id` request scopes.
Writing on tenant A must not be visible from tenant B's stores —
`get`, `records_for_scope`, `complete`, and `approve` all fail closed.
Fails closed if the ScopedFilesystem wrapping ever regresses to raw
`&F: RootFilesystem`.
Composition + contract changes (so the per-invocation `MountView` and
boundary tests stay aligned):
- `crates/ironclaw_host_api/src/path.rs::VIRTUAL_ROOTS` adds
`/run-state` and `/approvals` (with the other migrated consumer-store
roots).
- `crates/ironclaw_reborn_composition/src/lib.rs::PER_USER_ALIASES`
adds `/run-state` and `/approvals` so both
`default_singleton_mount_view` and `invocation_mount_view` expose
them with full per-user-owner permissions.
- `docs/reborn/contracts/storage-placement.md` and
`docs/reborn/contracts/filesystem.md` add matching `/run-state` and
`/approvals` rows so the
`reborn_virtual_roots_match_storage_placement_contract` boundary
test stays green.
- `crates/ironclaw_host_runtime/tests/reborn_durable_restart_integration.rs`
is rewired to construct an `Arc<ScopedFilesystem<LocalFilesystem>>`
(with the new aliases) and pass it to both filesystem run-state
stores. The capability-lease store keeps the legacy
`&'static ScopedFilesystem<F>` shape until its own migration lands;
`leaked_scoped_engine_filesystem` is preserved as the borrow source
for now.
Trait surfaces (`RunStateStore`, `ApprovalRequestStore`,
`RunStateApprovalStore`) are unchanged. The legacy per-backend
LibSql / Postgres run-state + approval stores stay in this PR; their
deletion is the second pass tracked in
`docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md`.
Plan: docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md
…position Completes the third leg of the legacy-store cleanup. Now that FilesystemRunStateStore + FilesystemApprovalRequestStore route through ScopedFilesystem (commit a96d1f6), the per-backend siblings (LibSqlRunStateStore, PostgresRunStateStore, LibSqlApprovalRequestStore, PostgresApprovalRequestStore, LibSqlRunStateApprovalStore, PostgresRunStateApprovalStore) are vestigial duplication of what RootFilesystem already abstracts. Deleted: - `crates/ironclaw_run_state/src/db.rs` (LibSql + Postgres + shared helpers; ~one large module). - `crates/ironclaw_run_state/tests/db_run_state_store_contract.rs` (was the only suite exercising the SQL impls directly; the ScopedFilesystem-backed contract suite over InMemoryBackend + RootFilesystem covers the same surface). - `pub use db::{LibSql*, Postgres*}` re-exports in `lib.rs`. - `with_libsql_run_state_approval_store` / `with_postgres_run_state_approval_store` builder methods on `HostRuntimeServices` (composition now wires FilesystemRunStateStore + FilesystemApprovalRequestStore directly off the shared `Arc<ScopedFilesystem<F>>` it already constructs). Composition (`crates/ironclaw_reborn_composition/src/{lib,factory}.rs`) and the durable-restart integration test (`crates/ironclaw_host_runtime/tests/reborn_durable_restart_integration.rs`) now use the Arc-based construction shape for all three filesystem-backed stores in the trio (run-state, approval-requests, capability-leases) — same `Arc<ScopedFilesystem<LocalFilesystem>>` shared between them. Per the design-doc entry "Legacy per-backend store cleanup — ironclaw_run_state" in `docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md`.
The test was deleted on origin/reborn-integration in commit 34886c3 (`refactor(crates): group workspace crates by domain`) but my recent merge (7efef49) kept the file: independently-modified-on-one-side + deleted-on-the-other resolves to "keep" by default. The kept file references `RebornLibSqlMemoryDocumentRepository` which no longer exists (renamed to `LibSqlMemoryDocumentRepository`), breaking compile. Delete the file to match upstream's intent — the contract is covered elsewhere by `ironclaw_memory::tests::db_memory_repository_contract` and the audit-projection tests inside `ironclaw_event_projections` that don't depend on the renamed type.
The reborn-native libSQL/Postgres memory document repositories (`RebornLibSqlMemoryDocumentRepository`, `RebornPostgresMemoryDocumentRepository`) are vestigial duplication of what `RootFilesystem` already abstracts — per the same universal-FS dispatch design that drove the secrets/auth/run_state cleanup. `FilesystemMemoryDocumentRepository` over `LibSqlRootFilesystem` / `PostgresRootFilesystem` covers both surfaces with one impl. Implements `MemoryDocumentIndexRepository` (chunk replace + lifecycle hooks) on `FilesystemMemoryDocumentRepository` so the existing `ChunkingMemoryDocumentIndexer` can drive memory indexing through the unified repo — previously only the reborn-native libSQL impl participated. Deleted: - `crates/ironclaw_memory/src/repo/native_libsql.rs` - `crates/ironclaw_memory/src/repo/native_postgres.rs` - All five `reborn_native_*` contract tests in `crates/ironclaw_memory/tests/` — covered by the `FilesystemMemoryDocumentRepository` contract suite over `InMemoryBackend` plus the `RootFilesystem` backend contract tests in `ironclaw_filesystem` itself. `ChunkingMemoryDocumentIndexer` (`indexer.rs`) is unchanged; it delegates to anything that implements `MemoryDocumentIndexRepository` and now has a wider universe of impls to choose from. The dangling `RebornLibSqlMemoryDocumentRepository` reference in `crates/ironclaw_event_projections/tests/memory_significant_events_projection_contract.rs` is updated to use `FilesystemMemoryDocumentRepository` over `InMemoryBackend` — same contract, no DB dependency. CLAUDE.md updated to drop the "legacy `LibSqlMemoryDocumentRepository` / `PostgresMemoryDocumentRepository` still own their own per-backend behavioral coverage" sentence, since those tests are now deleted. Per the design-doc entry "Legacy per-backend store cleanup" in `docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md`.
The universal-FS dispatch design says backend choice happens at the RootFilesystem layer, not the consumer-store layer. The reborn-native duplicates were already deleted in 27485c5; this commit removes the remaining *legacy* per-backend memory document repositories. They duplicated what FilesystemMemoryDocumentRepository<F> already covers over RootFilesystem (LibSqlRootFilesystem / PostgresRootFilesystem / InMemoryBackend / LocalFilesystem). No production callers — composition has been on FilesystemMemoryDocumentRepository for a while, and no other crate imported LibSqlMemoryDocumentRepository / PostgresMemoryDocumentRepository. Deleted: - `crates/ironclaw_memory/src/repo/libsql.rs` - `crates/ironclaw_memory/src/repo/postgres.rs` - `pub use repo::{LibSql,Postgres}MemoryDocumentRepository` re-exports - `mod libsql` / `mod postgres` declarations in `repo/mod.rs` - Feature-gated helpers `scoped_memory_agent_id`, `db_path_for_memory_document`, `memory_document_from_db_path` in `repo/mod.rs` — only consumed by the deleted impls - `#[cfg(feature = "libsql")]` `encode_embedding_blob` / `decode_embedding_blob` / `cosine_similarity` in `src/embedding.rs` and `escape_fts5_query` in `src/search.rs` — same, only consumed by the deleted libsql repo (the filesystem repo has its own local `encode_embedding_blob`) - `crates/ironclaw_memory/tests/db_memory_repository_contract.rs` - `crates/ironclaw_memory/tests/e2e_hybrid_search.rs` - `crates/ironclaw_memory/tests/e2e_persistence_versioning.rs` - `crates/ironclaw_memory/tests/e2e_scope_isolation_safety.rs` - `crates/ironclaw_memory/tests/memory_filesystem_vertical_integration.rs` - `libsql` / `postgres` features on `ironclaw_memory` (no consumer in the workspace enabled them) and the corresponding optional deadpool-postgres / libsql / pgvector / tokio-postgres deps Memory substrate semantics (versioning, chunk replace, metadata cascade, hybrid search fusion, protected-path safety) retain coverage through `FilesystemMemoryDocumentRepository` over `InMemoryBackend` in `memory_backend_contract.rs` + `memory_filesystem_contract.rs` + the inline `repo/filesystem.rs` `#[cfg(test)]` suite. Backend-specific durability across reopen is now a `RootFilesystem` property and lives in `ironclaw_filesystem`'s own backend contract tests. CLAUDE.md drops the dangling "legacy `LibSqlMemoryDocumentRepository` / `PostgresMemoryDocumentRepository` still own their own per-backend behavioral coverage until their migration is scheduled" tail — the first half of that sentence was already removed in 27485c5; this commit removes the second half, since the tests no longer exist. The `tests/e2e_trace_memory_isolation.rs` ignore-message reference to the deleted `e2e_scope_isolation_safety.rs` test is repointed at the InMemoryBackend contract tests that own the substrate-level SOUL.md rejection coverage now. The CI `Run ironclaw_memory tests` step drops `--features libsql` (no such feature exists any more). Per the design-doc entry "Legacy per-backend store cleanup — ironclaw_memory" in `docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md`.
…oute through filesystem
The universal-FS dispatch design says backend choice happens at the
RootFilesystem layer, not at the durable-log impl layer.
FilesystemDurableEventLog / FilesystemDurableAuditLog already route
through ScopedFilesystem over any RootFilesystem (LibSqlRootFilesystem
/ PostgresRootFilesystem / InMemoryBackend), so the LibSql*/Postgres*
DurableEventLog / DurableAuditLog siblings that spoke SQL directly are
vestigial duplication.
The `RebornEventStoreConfig::Libsql { path_or_url, auth_token }` and
`::Postgres { url }` variants stay (production composition still
matches on the same shape) — but the dispatch now opens a backend-
specific RootFilesystem from the same input and wraps it in the
unified filesystem-backed durable log. Production policy gates
(production InMemory rejection, libSQL cleartext / ambiguous-bare URL
rejection, Postgres sslmode=disable rejection, TLS connector
construction) move into the event-store crate's `libsql_backed` /
`postgres_backed` modules.
Deleted:
- `crates/ironclaw_reborn_event_store/src/libsql_store.rs`
(LibSqlDurableEventLog, LibSqlDurableAuditLog, LibSqlStore,
build_libsql_event_stores)
- `crates/ironclaw_reborn_event_store/src/postgres_store.rs`
(PostgresDurableEventLog, PostgresDurableAuditLog, PostgresStore,
build_postgres_event_stores)
- `crates/ironclaw_reborn_event_store/src/sql_common.rs` (helpers used
only by the deleted SQL impls)
- `crates/ironclaw_reborn_event_store/migrations/{libsql,postgres}/*`
(replaced by RootFilesystem-layer migrations)
- SQL-table corruption fixture tests in `durable_event_store_contract`
that punched holes in `reborn_event_entries` — the table no longer
exists for events; the contiguity contract is enforced by the
RootFilesystem's `append`/`tail` invariants and covered by the
`FilesystemDurableEventLog` contract suite (backend-agnostic) plus
the surviving public-surface rebuild tests
- `dep:url` and `dep:uuid` (only used by the deleted Postgres SQL
body); kept `tokio-postgres-rustls` / `rustls` / etc. since the
Postgres path still parses the URL and builds the TLS connector
before handing the pool to `PostgresRootFilesystem`
No composition changes: `with_reborn_event_store_config` still takes
`RebornEventStoreConfig` and matches on the same variants.
Per the design-doc entry "Legacy per-backend store cleanup —
ironclaw_reborn_event_store" in
docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md.
…ilesystemSessionThreadService Mirrors the run-state / processes / secrets / authorization migrations (commits a96d1f6, 81664dd, 4ae5676, 5e7688d) plus the analogous engine migration in ac8e677. The new `FilesystemSessionThreadService<F: RootFilesystem>` accepts an `Arc<ScopedFilesystem<F>>` and routes every read/write through the `/threads` mount alias. The composition layer's per-invocation MountView resolves `/threads` to a tenant/user-scoped VirtualPath subtree, so tenant isolation is structural rather than something this crate has to re-derive from `ThreadScope.tenant_id`. Path layout under the alias keeps sub-scope (`agent_id`, `project_id`, `owner_user_id`, `mission_id`) plus the thread id in the alias-relative path so a single tenant/user can own multiple cells: /threads[/agents/<agent>][/projects/<project>][/owners/<owner_user>][/missions/<mission>]/threads/<thread_id>/thread.json /threads.../threads/<thread_id>/messages/<message_id>.json /threads.../threads/<thread_id>/summaries/<summary_id>.json /threads/idempotency/<sha256>.json The thread record holds the per-thread monotonic `next_sequence` counter; `accept_inbound_message` and `append_assistant_draft` do an optimistic CAS on the thread record to reserve a sequence, then write the new per-message record with `CasExpectation::Absent`. Every multi-step transition retries on `FilesystemError::VersionMismatch` and falls back to `CasExpectation::Any` on `Unsupported` so byte-only `LocalFilesystem` backends keep working (matches the existing run-state / authorization store fallback shape). Idempotency records flatten under one `/threads/idempotency/` directory keyed by SHA-256 of the (scope, source_binding_id, external_event_id) tuple — two scopes hash to different keys, so the flat layout is safe. `replay_accepted_inbound_message`, which has no scope input, scans that directory and matches binding/event id against the persisted record body. Wiring: - `/threads` added to `PER_USER_ALIASES` in `ironclaw_reborn_composition` so both `default_singleton_mount_view` and `invocation_mount_view` expose the alias with full r/w/l/d permissions. - `/threads` added to `VIRTUAL_ROOTS` in `ironclaw_host_api::path` so the alias resolves to a recognized top-level subtree. - `docs/reborn/contracts/{filesystem,storage-placement}.md` updated to list `/threads` alongside the other consumer-store mount aliases. Cross-tenant isolation regression test (`filesystem_session_thread_service_isolates_two_tenants_with_same_user_project_ids`) asserts that two `FilesystemSessionThreadService`s sharing one `InMemoryBackend` cannot see each other's threads, messages, or idempotency records when their `MountView`s resolve `/threads` to different tenant subtrees — identical `(agent_id, project_id, owner_user_id, thread_id)` tuples on both sides. Per the design-doc entry "ironclaw_threads" in `docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md`.
…sition
Completes the fourth leg of the legacy per-backend store cleanup
(after `ironclaw_outbound`, `ironclaw_secrets`, `ironclaw_authorization`,
and `ironclaw_run_state`). Now that `FilesystemSessionThreadService`
routes through `ScopedFilesystem` (previous commit), the per-backend
siblings (`LibSqlSessionThreadService`, `PostgresSessionThreadService`)
are vestigial duplication of what `RootFilesystem` already abstracts.
Deleted:
- `crates/ironclaw_threads/src/db.rs` (LibSql + Postgres impls plus the
shared `DurableState` snapshot/replace helpers — one ~1.8k line module).
- `crates/ironclaw_threads/tests/db_session_thread_contract.rs` (was the
only suite exercising the SQL impls directly; the
`FilesystemSessionThreadService` contract suite over `InMemoryBackend`
+ the in-memory `session_thread_contract.rs` semantic suite cover the
same surface, and durability across reopen is now a `RootFilesystem`
property tested at the `ironclaw_filesystem` layer).
- `pub use db::{LibSql*,Postgres*}` re-exports in `lib.rs`.
- `libsql` + `postgres` Cargo features on `ironclaw_threads` (no remaining
feature-gated code paths). `deadpool-postgres`, `tokio-postgres`,
`libsql`, `tracing`, and `tempfile` (dev) dependencies removed
alongside.
Composition / call-site wiring:
- `crates/ironclaw_reborn/tests/loop_driver_host.rs`: the
`turn_runner_worker_completes_after_libsql_turn_and_thread_services_reopen`
restart regression now builds a `LibSqlRootFilesystem` + a
`ScopedFilesystem` wrapping `/threads → /threads` and constructs the
thread service via `FilesystemSessionThreadService::new`. The
`libsql-restart-tests` feature now pulls
`ironclaw_filesystem/libsql` instead of `ironclaw_threads/libsql`
(which no longer exists). `ironclaw_turns/libsql` is still wired
because `LibSqlTurnStateStore` has not migrated yet.
Per the design-doc entry "Legacy per-backend store cleanup —
ironclaw_threads" in
`docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md`.
|
Correctness/security review on current head
Focused checks I ran on this head:
These passing checks don't cover the failing scenarios above. |
…m in FilesystemConversationStateStore
Mirrors the engine / processes / secrets / outbound / authorization /
run-state migrations: the legacy `RebornLibSqlConversationStateStore` /
`RebornPostgresConversationStateStore` held the substrate handle
directly (`Arc<libsql::Database>` / `deadpool_postgres::Pool`) and the
caller had to construct per-tenant substrates to keep two tenants'
conversation state apart. The new `FilesystemConversationStateStore<F>`
takes `Arc<ScopedFilesystem<F>>`; the `MountView` resolves the
`/conversations` alias to a tenant/user-scoped `VirtualPath` before any
backend dispatch, so tenant isolation is structural rather than a
convention the caller must remember.
Pass 1 of the two-pass migration described in
`docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md`. The
legacy `RebornLibSql*` / `RebornPostgres*` stores stay in this PR;
their deletion is the second pass (mirrors the `run-state` paired
pattern in commits `a96d1f63b` + `a238050a2`).
What changes:
- New `FilesystemConversationStateStore<F: RootFilesystem>` implementing
the existing `ConversationStateRepository` trait. Persists at
`/conversations/state.json`. Uses `CasExpectation::Version` + bounded
retry with `Unsupported` -> `Any` fallback (same `put_with_byte_fallback`
shape as the other migrated stores). Writes carry a `tenant_ids`
indexed projection — defense-in-depth scoping that mirrors the other
migrated consumer crates; the conversation state blob is multi-tenant
in shape, so the projection lists every tenant present in the
snapshot (under per-invocation MountView, the rewritten path slices
to one tenant per file).
- New `RebornFilesystemConversationServices` wrapper that wires an
`InMemoryConversationServices` over the filesystem store — same
shape as `RebornLibSqlConversationServices` / `RebornPostgresConversationServices`.
- `state_store` module is now unconditional (was feature-gated to
`libsql`/`postgres`). The `state_repository` plumbing on
`InMemoryConversationServices` and `InMemoryState::persistence_revision`
are unconditional too so the filesystem store can plug in without a
new feature flag.
- JSON wire envelope `StoredConversationState` flattens HashMaps keyed
by struct values (`ActorKey`, `BindingKey`, `ThreadKey`,
`ExternalEventRouteKey`, `MessageIdempotencyKey`,
`AcceptedMessageReplayKey`, `AcceptedMessageRef`) to `Vec<(K, V)>`
because JSON requires HashMap keys to be strings. Legacy libSQL/
Postgres adapters did the same shape via individual rows; this
envelope keeps the equivalent contract within a single document.
Composition + contracts:
- `PER_USER_ALIASES` in `ironclaw_reborn_composition` gains
`/conversations`; both `default_singleton_mount_view` and
`invocation_mount_view` rewrite it.
- `VIRTUAL_ROOTS` in `ironclaw_host_api::path` gains `/conversations`.
- `docs/reborn/contracts/storage-placement.md` and
`docs/reborn/contracts/filesystem.md` add matching `/conversations`
rows so the `reborn_virtual_roots_match_storage_placement_contract`
boundary test stays green.
Tests:
- `tests/filesystem_store_contract.rs` adds two regressions:
- `filesystem_conversation_services_round_trip_persisted_state_on_reopen`:
writes pair + binding under one services instance, drops, reopens a
fresh services instance over the same backend, replays the same
resolve and asserts the binding rehydrated from durable storage.
- `filesystem_conversation_state_store_isolates_two_tenants_with_same_user_project_ids`:
two `RebornFilesystemConversationServices` instances share one
`InMemoryBackend` with different `MountView` targets but identical
`(user_id, project_id)` request scopes; writing on tenant A must
not be visible from tenant B, even though the alias-relative path
is identical. Fails closed if the `ScopedFilesystem` wrapping ever
regresses to raw `Arc<F: RootFilesystem>`.
- All 44 pre-existing `inbound_contract` tests keep passing
unchanged.
`cargo clippy --workspace --all-features --tests -- -D warnings` and
`cargo test -p ironclaw_conversations --all-features` both clean (44
inbound + 2 filesystem-store contract tests).
Plan: docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md
Completes the second leg of the conversations migration. Now that
FilesystemConversationStateStore routes through ScopedFilesystem
(prev commit a46edd88c), the per-backend siblings
(RebornLibSqlConversationStateStore, RebornPostgresConversationStateStore,
RebornLibSqlConversationServices, RebornPostgresConversationServices)
are vestigial duplication of what RootFilesystem already abstracts.
Deleted:
- `crates/ironclaw_conversations/src/libsql.rs` (787 lines).
- `crates/ironclaw_conversations/src/postgres.rs` (785 lines).
- `pub use libsql::{...}` / `pub use postgres::{...}` re-exports in
`lib.rs` and the matching `mod` declarations.
- The pub(crate) `ExternalConversationIdentity::conversation_fingerprint`
helper (gated to libsql/postgres; only used by the deleted SQL store
body). The public `ExternalConversationRef::conversation_fingerprint`
on the ref type stays — adapter crates still depend on it for
composite key digesting.
- `libsql`, `postgres`, `dep:libsql`, `dep:deadpool-postgres`,
`dep:tokio-postgres` features + manifest deps from Cargo.toml.
- The transitively-only-used `sha2` crate dependency (the digest
helper that backed `conversation_fingerprint` for the SQL composite
key indexes is also gone).
- Three legacy-store tests in `tests/inbound_contract.rs`:
`libsql_conversation_services_survive_restart_for_retry_replay`,
`libsql_stale_service_write_fails_without_overwriting_newer_rows`,
`postgres_conversation_services_round_trip_restart_replay_when_available`,
and the `table_count` helper they shared. Durability across reopen
is now a `RootFilesystem` property covered by
`filesystem_conversation_services_round_trip_persisted_state_on_reopen`
in `tests/filesystem_store_contract.rs`.
Composition is untouched because no production composition wired
conversations through these legacy stores yet — the migration plan
listed conversations under "future work" prior to this PR, and the
top-level Reborn composition has no
`with_libsql_conversation_state_store` / equivalent builder. New
composition wires `RebornFilesystemConversationServices` directly off
the shared `Arc<ScopedFilesystem<F>>`.
Per the "Legacy per-backend store cleanup — ironclaw_conversations"
entry in
`docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md`.
`cargo clippy --workspace --all-features --tests -- -D warnings` and
`cargo test -p ironclaw_conversations --all-features` both clean
(41 inbound + 2 filesystem-store contract tests).
…esystemTurnStateStore Mirrors the engine migration in commit ac8e677 and the parallel processes / secrets / authorization / outbound / run-state refactors: adds `FilesystemTurnStateStore<F>` that holds `Arc<ScopedFilesystem<F>>` and routes every read/write through a fixed alias-relative path (`/turns/state.json`). Any composition layer that hands the store a `ScopedFilesystem` constructed with a tenant/user-scoped `MountView` gets cross-tenant isolation structurally — the type system enforces what previously had to be re-derived from `TurnScope.tenant_id` at every call site. Internally the store reuses the existing `InMemoryTurnStateStore::from_persistence_snapshot_with_admission_limit_provider` load + `persistence_snapshot` write pair, so the in-memory state machine (idempotency, admission reservations, active locks, checkpoint sequencing, lifecycle event projection) remains the canonical implementation. The filesystem store wraps every mutating trait method in a CAS-aware read-modify-write loop with bounded retry; reads project through the in-memory store without writing back. Trait surface implemented: - `TurnStateStore` (submit / resume / request_cancel / get_run_state) - `TurnRunTransitionPort` (claim/heartbeat/recover/record_model_route/ block/complete/cancel/fail/record_recovery_required/ apply_validated_loop_exit) - `TurnEventProjectionSource` (read_turn_events_after) - `LoopCheckpointStore` (put/get loop checkpoints) The on-disk layout under the `/turns` mount alias is fixed: /turns/state.json Tenant + user identity moves into the caller's `MountView` per the per-tenant `MountAlias` rewriting, so neither prefix is encoded in the path itself. Within-tenant axes (agent/project/thread) stay in the persisted `TurnScope` on each record because they are not covered by the per-tenant `MountAlias`. Implementation details mirroring the run-state store: - CAS-Version + retry on `VersionMismatch`, `Unsupported -> Any` fallback so byte-only `LocalFilesystem` keeps working through the per-process `FILESYSTEM_RECORD_LOCKS` map. - `submit_turn` runs the run-profile resolver once *outside* the CAS retry loop and threads a pre-resolved resolver into the apply closure — keeps the per-path async lock from spanning the resolver future. - `TurnError::Unavailable { reason }` for filesystem operation failures (matches the existing `db_error` mapping). Composition + contract changes: - `crates/ironclaw_host_api/src/path.rs::VIRTUAL_ROOTS` adds `/turns`. - `crates/ironclaw_reborn_composition/src/lib.rs::PER_USER_ALIASES` adds `/turns` so both `default_singleton_mount_view` and `invocation_mount_view` expose it with per-user-owner permissions. - `docs/reborn/contracts/{filesystem,storage-placement}.md` add the matching `/turns` rows so the `reborn_virtual_roots_match_storage_placement_contract` boundary test stays green. Tests: - New regression test `tests/filesystem_turn_state_contract.rs`: - `filesystem_turn_state_store_persists_submit_and_reopens` — asserts a submitted turn rehydrates after the store is re-constructed from the same scoped filesystem (durability contract). - `filesystem_turn_state_store_hides_records_from_other_tenants_via_mount_view` wires two `FilesystemTurnStateStore`s over one `LocalFilesystem` with different `MountView` targets but identical `(tenant_id, agent_id, project_id, thread_id, idempotency_key)`. Writing on tenant A must not be visible from tenant B's store — `get_run_state` reports `ScopeNotFound`, and tenant B mints its own distinct `run_id` on re-submit. Fails closed if the ScopedFilesystem wrapping ever regresses to a raw `&F: RootFilesystem`. Trait surface (`TurnStateStore`, `TurnRunTransitionPort`, `TurnEventProjectionSource`, `LoopCheckpointStore`) is unchanged. The legacy per-backend `LibSqlTurnStateStore` / `PostgresTurnStateStore` stay in this PR; their deletion is the second pass tracked in `docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md`. Plan: docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md
…tion
Completes the fourth leg of the legacy-store cleanup. Now that
FilesystemTurnStateStore routes through ScopedFilesystem (commit
ff88e75ca), the per-backend siblings (LibSqlTurnStateStore,
PostgresTurnStateStore) are vestigial duplication of what
RootFilesystem already abstracts. Their internal load-snapshot /
replace-snapshot machinery and their SQL-table schema/migrations are
no longer the source of truth — the snapshot blob format is, and the
backend choice happens at the RootFilesystem layer.
Deleted:
- `crates/ironclaw_turns/src/db.rs` (LibSql + Postgres + shared
schema/migration/snapshot-load/replace helpers).
- `crates/ironclaw_turns/tests/db_turn_state_store_contract.rs` (was
the only suite exercising the SQL impls directly; the
ScopedFilesystem-backed contract suite added in the previous
commit covers the same surface).
- `pub use db::{LibSqlTurnStateStore, PostgresTurnStateStore}` and
the `mod db` declaration in `lib.rs`.
- `with_libsql_turn_state_store` / `with_postgres_turn_state_store`
builder methods on `HostRuntimeServices`. Composition now wires a
single `with_filesystem_turn_state_store` off the shared
`Arc<ScopedFilesystem<F>>` it already constructs.
- The `libsql` and `postgres` features on `ironclaw_turns`
(along with `libsql`, `tokio-postgres`, `deadpool-postgres` deps).
Wiring updates:
- `crates/ironclaw_host_runtime/Cargo.toml`: drop
`ironclaw_turns/{libsql,postgres}` feature wires.
- `crates/ironclaw_host_runtime/src/services.rs`: replace the two
per-backend builders with a single
`with_filesystem_turn_state_store<F>(Arc<ScopedFilesystem<F>>)`.
- `crates/ironclaw_reborn_composition/{lib,factory}.rs`: switch both
libSQL and PostgreSQL production wirings to
`.with_filesystem_turn_state_store(Arc::clone(&scoped_filesystem))`
off the same shared scoped filesystem already used for run-state.
- `crates/ironclaw_reborn_composition/Cargo.toml`: drop the
`ironclaw_turns/{libsql,postgres}` feature wires.
- `crates/ironclaw_reborn/Cargo.toml`: replace
`ironclaw_turns/libsql` in `libsql-restart-tests` with
`ironclaw_filesystem/libsql` (the libSQL backend lives at the
filesystem layer now).
Test migrations:
- `crates/ironclaw_host_runtime/tests/host_runtime_services_contract.rs`:
the three `production_turn_*` tests are rewired around a new
`libsql_scoped_turns_fs` helper that constructs an
`Arc<ScopedFilesystem<LibSqlRootFilesystem>>` with the canonical
`/turns` alias. Production readiness assertions, coordinator
submit+reopen, and missing-resolver fail-closed coverage are all
preserved against the filesystem-backed store.
- `crates/ironclaw_reborn/tests/loop_driver_host.rs`: the
`turn_runner_worker_completes_after_libsql_turn_and_thread_services_reopen`
restart test is rewired around a new `libsql_filesystem_turn_store`
helper. Both the pre-restart and post-restart sides now build a
`FilesystemTurnStateStore<LibSqlRootFilesystem>` instead of
`LibSqlTurnStateStore`. The thread-side `LibSqlSessionThreadService`
is untouched (its own migration is independent). Restart durability
is exercised against the libSQL-backed filesystem snapshot.
- `crates/ironclaw_turns/tests/loop_checkpoint_store_contract.rs`:
drops the libsql/postgres-specific
`turn_loop_checkpoints`-table assertions (which were validating
SQL-schema separation between `turn_checkpoints` and
`turn_loop_checkpoints`) and replaces them with a
filesystem-backed roundtrip + snapshot-reopen assertion that
exercises the same `loop_checkpoints` vs `checkpoints` projection
invariant against `FilesystemTurnStateStore`. InMemory coverage
is unchanged.
Per the design-doc entry "Legacy per-backend store cleanup —
ironclaw_turns" in
`docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md`
(marked Done in this commit).
…elete legacy LibSql/Postgres stores
Mirrors the run-state, processes, secrets, authorization, and outbound
migrations to the universal-FS-dispatch pattern, completing the legacy
per-backend cleanup in `ironclaw_resources`.
Pass 1 — add `FilesystemResourceGovernorStore<F: RootFilesystem>`:
- New `crates/ironclaw_resources/src/filesystem_store.rs` defines
`FilesystemResourceGovernorStore<F>`, an `Arc<ScopedFilesystem<F>>`-
based `ResourceGovernorStore` impl that serializes the whole governor
snapshot (limits + reservations + usage tallies) as a single record
at `/resources/snapshot.json` under the new consumer-store mount
alias. Tenant/user identity moves into the caller's `MountView`, so
the path itself is alias-relative — same shape as the other consumer
stores migrated in this rework.
- `ResourceGovernorStore::update` is a sync trait (in-memory + legacy
SQL impls were sync); the new filesystem impl crosses to async via
a lazily-spawned current-thread tokio worker thread (same shape as
the deleted libSQL/Postgres `AsyncStorageWorker`).
- Concurrency: a process-local per-path async `Mutex` map serializes
in-process read-modify-write cycles; the inner CAS-Version
precondition surfaces cross-process races as `ResourceError::Storage`
errors (the trait's `FnOnce` closure shape forbids retry, mirroring
the libSQL/Postgres `BEGIN IMMEDIATE` / `LOCK TABLE` semantics).
- Byte-only backend fallback: `LocalFilesystem` rejects
`CasExpectation::Version`/`Absent`, so the store falls back to
`CasExpectation::Any` under the same per-path async lock with an
emulated `Absent` precheck — matches the other consumer-store
migrations.
- Adds `/resources` to `VIRTUAL_ROOTS` in
`crates/ironclaw_host_api/src/path.rs`, `PER_USER_ALIASES` in
`crates/ironclaw_reborn_composition/src/lib.rs`, and the
`/resources` row in
`docs/reborn/contracts/{filesystem,storage-placement}.md` so the
per-invocation `MountView` and boundary tests stay aligned.
- Cross-tenant isolation regression test in
`filesystem_store.rs::tests::isolates_two_tenants_with_same_user_project_ids`
wires two `FilesystemResourceGovernorStore`s over one
`InMemoryBackend` with different `MountView` targets but identical
`user_id`/`project_id`. Writing on tenant A must not be visible from
tenant B's governor — fails closed if the wrapping ever regresses to
raw `Arc<F: RootFilesystem>`.
Pass 2 — delete legacy stores and switch composition:
- Deleted `LibSqlResourceGovernorStore`, `PostgresResourceGovernorStore`,
the `AsyncStorageWorker` plumbing, the SQL schema constant, and the
worker-cell head-of-line-block regression test from `lib.rs`.
- Replaced `with_libsql_resource_governor` /
`with_postgres_resource_governor` on `HostRuntimeServices` with a
single `with_filesystem_resource_governor<FsBackend>(Arc<ScopedFilesystem>)`
builder method. Backend choice is now a `RootFilesystem` property.
- `with_resource_governor` is no longer feature-gated — filesystem
resource governor wiring is universal.
- `crates/ironclaw_reborn_composition/src/{lib,factory}.rs` now
construct `FilesystemResourceGovernorStore::new(Arc::clone(&scoped_filesystem))`
over the shared `Arc<ScopedFilesystem<F>>` composition already
builds via `wrap_scoped` — same wiring shape as the other migrated
consumer stores.
- `LibSqlProductionHostRuntimeServices` and
`PostgresProductionHostRuntimeServices` type aliases now
parameterize on `FilesystemResourceGovernorStore<{LibSql,Postgres}RootFilesystem>`.
- `crates/ironclaw_resources/Cargo.toml`: `tokio` becomes a
non-optional dep with `sync` feature added (always needed for the
per-path async lock); `ironclaw_filesystem` added; `libsql` /
`postgres` features drop their `tokio` dep (already non-optional).
- Tests in `crates/ironclaw_resources/tests/resource_governor_contract.rs`
and `crates/ironclaw_host_runtime/tests/host_runtime_services_contract.rs`
replace the SQL-specific reload + builder tests with
filesystem-backed equivalents over `InMemoryBackend +
ScopedFilesystem`; the on-disk snapshot round-trip via a fresh
store handle covers the same durability contract the SQL versions
did.
Tests: `cargo test -p ironclaw_resources --all-features` passes
(4 unit + 38 contract); `cargo test -p ironclaw_host_runtime
--all-features --test host_runtime_services_contract` passes (81);
`cargo clippy --workspace --all-features --tests -- -D warnings`
clean; `cargo check --workspace --all-features` clean.
Plan: docs/plans/2026-05-16-scoped-filesystem-tenant-isolation.md
…ixture `reborn_durable_restart_integration::approval_resume_*` and `process_result_and_output_*` were panicking with "no backend mount found for virtual path /processes/..." because `mounted_engine_filesystem` only mounted `/engine` on the underlying `LocalFilesystem`, while the `ScopedFilesystem` exposed `/processes`, `/authorization`, `/run-state`, and `/approvals` aliases to the filesystem-backed stores. Add a backend mount for each of those virtual roots, pointing at a sibling subdirectory under `engine_root` so the same on-disk tree is reopened across the restart fixture's service graphs.
…esources The boundary suite still forbade `ironclaw_filesystem` from `ironclaw_threads`, `ironclaw_turns`, and `ironclaw_resources`, but each of those crates now routes durable storage through `ScopedFilesystem` under the universal-fs-dispatch rework (matches the same exception already in place for `ironclaw_secrets`). Mirror the secrets-rule comment shape so the intentional edge is documented alongside the rule rather than just removed.
|
Deep architecture/crate hygiene review:
In Impact: runtime services remain shared across tenants/users despite the PR's tenant-isolation framing. Fix by making store construction request-scoped, or by carrying
Impact: approval persistence can split across two records, so crash/concurrency windows can leave pending approval saved but run not blocked or related partial states. The production readiness guard also does not model combined atomic run-state/approval store as a required component. Fix with a filesystem-backed
Impact: this is how contract fixes like scoped mounts, egress wiring, secret validation, or atomic stores get applied to one entrypoint but not the other. Collapse to one production builder and make |
Single PR stacking ALL the consumer-crate work that applies #3659 (universal
RootFilesystemdispatch fabric) across the codebase. Supersedes the prior stacked PRs (#3666, #3670, #3671, #3672, #3678 — all closed).What's in here
13 commits, +15,214 / -929 LOC across 61 files. Every commit is independently reviewable.
feat(processes)FilesystemProcessStoreon the unified put/get surfacefeat(outbound)FilesystemOutboundStateStoreon the unified surfacefeat(authorization)FilesystemCapabilityLeaseStorethrough unified put/getfeat(run-state)FilesystemRunStateStore+FilesystemApprovalRequestStorerefactor(workspace)ironclaw_storage(660 LOC, parallel BlobStore/RecordStore traits removed)feat(filesystem)HsmBackendplaceholder + scopedatabase.mdto legacyfeat(reborn)append/tailon libsql + postgres (V30 migration);FilesystemDurableEventLogfeat(secrets)FilesystemSecretStore+FilesystemCredentialBrokerfeat(filesystem,memory)IndexKind::Fts+IndexKind::Vectoron libsql + postgres;FilesystemMemoryDocumentRepositoryfeat(db) conversations+jobsFilesystemConversationStore+FilesystemJobStore(~1700 LOC + 23 tests)feat(db) sandbox+routine+toolfailFilesystemSandboxStore+FilesystemRoutineStore+FilesystemToolFailureStore(~2000 LOC + 20 tests)feat(engine)FilesystemStore— full 33-method engineStoreimpl on the unified surface (~700 LOC + 24 tests; HybridStore untouched)feat(db) settings+workspace+restFilesystemSettingsStore+FilesystemUsersStore+FilesystemChannelPairingStore+FilesystemIdentityStore+FilesystemWorkspaceStore(split across 5 sub-modules; ~4400 LOC + 47 tests)Coverage on the universal-FS surface
put(Entry::bytes(...))/getEntry::record(...)with indexed projectionFilter::Eq/PrefixOn/And/Or/Fts/VectorNearestensure_indexdeclares Exact/Prefix at startup; FTS5 + tsvector + cosine vector advertised by libsql + postgresappend/tail; FilesystemDurableEventLog + sandbox job-events use itCasExpectation::Versionwith retry onVersionMismatchBackendCapabilitiesbitmask; descriptor over-claims surfaceFilesystemError::DescriptorOverclaimsWhat's still pending (intentional follow-ups)
Databasetrait dissolution: each of the 10 sub-traits now has aFilesystem*Storefacade insrc/db/filesystem_*.rs, but the composite trait itself insrc/db/mod.rsis unchanged — both legacy SQL impls and new FS-backed impls coexist. Final removal of the composite trait is intentionally deferred behind the wiring switch (out of scope here).src/app.rsstill constructsHybridStoreagainst the legacyDatabaseimpl. Switching toCompositeRootFilesystem-backed wiring is a single-PR follow-up.EncryptedBackenddecorator: encryption stays embedded inSecretsCryptofor the migration window; documented TODO to move to a backend decorator.HsmBackendis a placeholder demonstrating the seam.Verification
cargo fmt --checkcleancargo check --workspace --all-featurescleancargo clippy --workspace --all-features --tests -- -D clippy::correctnesscleancargo testper migrated crate — all green; engine: 549 tests; secrets: 60; memory: 219 (2 pre-existinge2e_hybrid_searchfailures unrelated); event_store: 57; architecture: 14; workspace libdb::filesystem86 testsMigration pattern (canonical, used by every commit)
Path layout:
/<root>/<scope>/<id>for entities,/<root>/<scope>/<id>/<child>/<child_id>for sub-entities. Indexed projections +Filter::Eq + Andfor queries. CAS retry onVersionMismatch.Test plan