feat: add host-controlled trust-class policy engine#3043

Merged
serrrfirat merged 9 commits into reborn-integration from feat/policy-engine
Apr 29, 2026

Conversation

@nickpismenkov
Collaborator

@nickpismenkov nickpismenkov commented Apr 28, 2026

Summary

  • Implements the PR1b host-controlled trust-class policy engine from #3012 (Reborn PR1b: add host-controlled trust-class policy engine).
  • Splits RequestedTrustClass (manifest input, freely deserializable) from EffectiveTrustClass (policy output) at the type level — privileged variants are crate-private constructors.
  • New ironclaw_trust crate: TrustPolicy trait + HostTrustPolicy engine with layered PolicySources (BundledRegistry + AdminConfig functional, SignedRegistry + LocalDevOverride interface seams), plus InvalidationBus for synchronous trust-change fan-out.
  • 14 acceptance-criteria contract tests + 4 new host_api wire tests + 2 lib smoke tests cover all 5 issue-suggested tests + all 10 acceptance criteria. Compile-time guarantee enforced via static_assertions::assert_not_impl_any!(EffectiveTrustClass: DeserializeOwned).
  • Architecture boundary test prevents ironclaw_trust from depending on dispatcher / capability host / runtimes / approvals / etc.

Change Type

  • Bug fix
  • New feature
  • Refactor
  • Documentation
  • CI/Infrastructure
  • Security
  • Dependencies

Linked Issue

Closes #3012. Originally meant to land before PR3 #2999 (auth control), but #2999 already merged into reborn-integration; PR1b still applies because ironclaw_authorization does not yet consume EffectiveTrustClass — a small follow-up PR will wire them together.

Validation

  • cargo fmt --all -- --check
  • cargo clippy --all --benches --tests --examples --all-features -- -D warnings
  • cargo build
  • Relevant tests pass: cargo test -p ironclaw_host_api -p ironclaw_trust -p ironclaw_architecture --features ironclaw_trust/test-fixtures → 46/46 (1 boundary + 29 host_api + 2 lib smoke + 14 contract). The --features ironclaw_trust/test-fixtures flag is required for the contract test target; see "Test fixtures gated behind feature flag" below.
  • Workspace --lib tests: all pass except 4 pre-existing failures in ironclaw/ironclaw_engine (verified to fail on base reborn-integration HEAD too — unrelated to PR1b)
  • cargo test --features integration — N/A (no DB or integration code)
  • Manual testing: contract tests T1-T13 cover the manual smoke checks from the implementation plan

Acceptance-criteria coverage matrix

| AC from #3012 | Covering test |
| --- | --- |
| 1. No deserialize → effective privileged trust | Compile-time: assert_not_impl_any!(EffectiveTrustClass: DeserializeOwned) at top of policy_contract.rs. Runtime: T9, T10, H1. |
| 2. Manifest privileged request denied/downgraded | T1, T2 + default_decision (LocalManifest → Sandbox, others → UserTrusted) |
| 3. Host-approved → effective trust only via policy engine | T3 |
| 4. Trust class alone grants no capability | T4, T5 |
| 5. Expanded authority requires renewed approval | T6 (covers add/remove/reorder/identical) |
| 6. Trust downgrade invalidates active grants before side effects | T7, T8 + InvalidationBus synchronous fan-out + mutator-publish docs on BundledRegistry/AdminConfig/SignedRegistry |
| 7. Grant retention requires stable identity | T11, T12 + identity_changed/grant_retention_eligible helpers |
| 8. Tests prove user-installed cannot self-promote | T1 |
| 9. Tests prove FirstParty/System without grant denies invocation | T5 (via FakeAuthorizer) |
| 10. Boundary tests prevent upward deps | architecture rule |

Suggested-tests coverage

| Issue test | Plan tests |
| --- | --- |
| 1. Self-promotion denied | T1 (effective ≠ priv) + T2 (privileged grant fails via FakeAuthorizer) |
| 2. Host assignment succeeds | T3 + T4 (no caps without grant) |
| 3. Trust alone grants nothing | T5 |
| 4. Downgrade invalidates active authority | T7, T8 (via FakeGrantStore listener) |
| 5. Requested/effective type split | T9 (compile-time + runtime), T10 (runtime), H1 |

Security Impact

Security-critical change. Enforces the type-level guarantee that user-installed manifests cannot self-promote to privileged effective trust:

  • EffectiveTrustClass::FirstParty / System constructors are pub(crate) — only HostTrustPolicy::evaluate can produce them.
  • EffectiveTrustClass has no Deserialize impl; statically asserted at compile time via static_assertions.
  • host_api::TrustClass privileged variants reject serde input (#[serde(skip_deserializing)]).
  • RequestedTrustClass (untrusted input) and EffectiveTrustClass (policy output) are distinct types with no From/TryFrom between them.
  • Trust downgrade/revocation publishes a TrustChange synchronously on InvalidationBus before any subsequent evaluate() returns the lower decision — fail-closed.
  • Mutator methods on BundledRegistry / AdminConfig / SignedRegistry carry explicit doc-level mutation→publish contracts; module docs explain the orchestration pattern.
  • Test fixtures are gated behind a test-fixtures Cargo feature — production builds cannot import the privileged-trust constructors.
  • New crate has no upward dependencies (only ironclaw_host_api); enforced by reborn_crate_dependency_boundaries_hold test.
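
A minimal std-only sketch of the requested/effective split listed above. The type names follow the PR description, but the bodies, the `host_approved` flag, and the module layout are assumptions made for illustration, not the crate's actual code. The privileged constructor is module-private here, standing in for the `pub(crate)` constructors the PR describes, and there is deliberately no conversion from the requested type to the effective one.

```rust
// Illustrative only: names mirror the PR description, bodies are assumed.
pub mod trust {
    /// Untrusted manifest input; any variant may be requested.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum RequestedTrustClass {
        Sandbox,
        UserTrusted,
        FirstParty,
        System,
    }

    /// Stand-in for host_api::TrustClass.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum TrustClass {
        Sandbox,
        UserTrusted,
        FirstParty,
        System,
    }

    /// Policy output. The inner field is private, so code outside this
    /// module cannot write `EffectiveTrustClass(TrustClass::System)`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct EffectiveTrustClass(TrustClass);

    impl EffectiveTrustClass {
        pub fn sandbox() -> Self {
            Self(TrustClass::Sandbox)
        }
        pub fn user_trusted() -> Self {
            Self(TrustClass::UserTrusted)
        }
        // Module-private here; the PR describes these as pub(crate).
        fn first_party() -> Self {
            Self(TrustClass::FirstParty)
        }
        pub fn class(self) -> TrustClass {
            self.0
        }
    }

    /// Hypothetical policy: the request is recorded but never decides the
    /// outcome; only a host-side approval flag can yield FirstParty.
    pub fn evaluate(requested: RequestedTrustClass, host_approved: bool) -> EffectiveTrustClass {
        let _audit_only = requested;
        if host_approved {
            EffectiveTrustClass::first_party()
        } else {
            EffectiveTrustClass::sandbox()
        }
    }
}
```

Calling `trust::evaluate(trust::RequestedTrustClass::System, false)` yields a Sandbox-class value in this sketch: the request alone cannot promote the package.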

Database Impact

None.

Blast Radius

  • New crate ironclaw_trust (foundation level, depends only on ironclaw_host_api).
  • Additive changes to ironclaw_host_api:
    • new trust module: RequestedTrustClass, PackageIdentity, PackageSource
    • new PackageId newtype in ids.rs (one line, uses existing string_id! macro)
    • doc-only update on existing TrustClass to clarify it is the effective ceiling
  • Boundary rule added to ironclaw_architecture/tests/reborn_dependency_boundaries.rs.
  • No existing functionality changed; nothing in agent loop, gateway, sandbox, or db touched.

Rollback Plan

Revert this PR cleanly. The ironclaw_trust crate has no consumers yet — PR3's ironclaw_authorization crate (#2999) does not yet integrate the policy engine. Rollback is git revert <merge> plus removing crates/ironclaw_trust from the workspace members array; no migration or data impact.

Review Follow-Through

  • Naming: EffectiveTrustClass wraps host_api::TrustClass (variants Sandbox/UserTrusted/FirstParty/System) rather than introducing a new enum with the issue's suggested Untrusted/ThirdParty/FirstParty/System. The issue explicitly permits this ("Implementation does not need to use these exact names"). Semantically, Sandbox ≈ Untrusted and UserTrusted ≈ ThirdParty.
  • Test fixtures gated behind feature flag: crates/ironclaw_trust/src/fixtures.rs is #[cfg(any(test, feature = "test-fixtures"))] #[doc(hidden)]. The contract test target sets required-features = ["test-fixtures"]. Production builds cannot import the privileged-trust constructors. To run the full contract suite locally: cargo test -p ironclaw_trust --features test-fixtures (or --features ironclaw_trust/test-fixtures from workspace root).
  • All 4 sources from #3012 (Reborn PR1b: add host-controlled trust-class policy engine) are present as PolicySource impls: Bundled (functional), AdminConfig (functional), SignedRegistry (interface seam with trusted_signers map + SignerEntry), LocalDevOverride (interface seam, inert in PR1b).
  • Mutation/invalidation orchestration is left to the caller by design — the registry mutators (BundledRegistry::upsert/remove, AdminConfig::upsert/remove, SignedRegistry::upsert/remove) carry explicit doc-level contracts requiring the caller to publish a TrustChange on the relevant InvalidationBus. PR3's grant-store wiring will own this orchestration; T7 / T8 already exercise the contract.
  • Follow-up wiring PR will integrate EffectiveTrustClass + InvalidationBus into ironclaw_authorization (PR3's auth-control crate). PR1b deliberately stops at the contract boundary so the wiring can be reviewed independently.

Review track: C (security/runtime/DB/CI) — adds a security-critical type-system boundary to the foundation layer.

@github-actions github-actions Bot added size: XL 500+ changed lines risk: medium Business logic, config, or moderate-risk modules scope: docs Documentation scope: dependencies Dependency updates contributor: core 20+ merged PRs and removed risk: medium Business logic, config, or moderate-risk modules labels Apr 28, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces the ironclaw_trust crate, which implements a host-controlled trust policy engine for IronClaw Reborn. It defines the EffectiveTrustClass to manage privileged trust ceilings, establishes an InvalidationBus for handling trust-change events, and provides a layered policy evaluation mechanism. My feedback suggests optimizing the authority_changed function to avoid unnecessary heap allocations and improving the testability of HostTrustPolicy by abstracting the time provider instead of using Utc::now() directly.

Comment thread crates/ironclaw_trust/src/invalidation.rs Outdated
Comment thread crates/ironclaw_trust/src/policy.rs Outdated
- Fix default_decision docstring/code mismatch — LocalManifest now drops to
  Sandbox (matching the docstring intent); other origins keep UserTrusted.
- Gate fixtures module behind a `test-fixtures` Cargo feature; integration
  test target opts in via required-features. Production builds cannot
  import the privileged-trust constructors.
- Add AdminConfig::remove, plus module-level docs spelling out the
  mutation→InvalidationBus.publish contract on every mutator.
- Beef up SignedRegistry with a trusted_signers map + SignerEntry struct
  (still inert in PR1b — the seam is now defensible).
- Add LocalDevOverride placeholder source — fourth source named in #3012.
- Thread max_resource_ceiling through SourceMatch / BundledEntry /
  AdminEntry / SignerEntry — no longer dead state.
- Remove unused TrustError::SourceRejected variant; document
  InvariantViolation as the future-extension surface.
- T9: add static_assertions::assert_not_impl_any!(EffectiveTrustClass:
  DeserializeOwned) for the compile-time AC #1 guarantee.
- T6: rename misleading curr_removed → curr_unchanged, add real removal
  case + reorder case, document over-firing as deliberate.
- Add lib.rs unit tests so bare `cargo test -p ironclaw_trust` runs
  smoke checks instead of silently passing 0.

Verification: cargo fmt, cargo clippy --all --all-features, and
cargo test -p ironclaw_host_api -p ironclaw_trust -p ironclaw_architecture
--features ironclaw_trust/test-fixtures all pass (46/46).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the risk: medium Business logic, config, or moderate-risk modules label Apr 28, 2026
…+ injectable clock

- authority_changed: replace sort+collect with bidirectional iter().all(contains)
  + length guard. Allocation-free, faster on small authority lists, better
  for WASM. Set semantics preserved; reorder of equal sets remains
  retainable; the length guard catches multiset cases like [a,a,b] vs [a,b].
  (gemini-code-assist comment on invalidation.rs)

- HostTrustPolicy::evaluate: inject a Clock instead of calling Utc::now()
  directly. New `clock` module with `Clock` trait, `SystemClock` (default
  production wiring), and `FixedClock` test fixture. Existing
  `HostTrustPolicy::new` is non-breaking — it constructs a SystemClock
  internally. Tests can use `HostTrustPolicy::with_clock` for deterministic
  audit-replay / golden-file scenarios. New contract test
  `evaluate_uses_injected_clock_for_evaluated_at` locks in the guarantee.
  (gemini-code-assist comment on policy.rs)
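
A rough sketch of the clock injection described in the second item, kept std-only. It uses `std::time::SystemTime` in place of the `Utc::now()` the commit mentions, and the `evaluated_at` method and `fixed_policy_timestamp` helper are hypothetical names added so the example stays small; the real `evaluate` returns a full decision.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Source of "now" for policy evaluation (assumed shape).
pub trait Clock {
    fn now(&self) -> SystemTime;
}

/// Production wiring: reads the wall clock.
pub struct SystemClock;
impl Clock for SystemClock {
    fn now(&self) -> SystemTime {
        SystemTime::now()
    }
}

/// Test fixture: always reports the same instant.
pub struct FixedClock(pub SystemTime);
impl Clock for FixedClock {
    fn now(&self) -> SystemTime {
        self.0
    }
}

pub struct HostTrustPolicy {
    clock: Box<dyn Clock>,
}

impl HostTrustPolicy {
    /// Non-breaking default: constructs a SystemClock internally.
    pub fn new() -> Self {
        Self::with_clock(Box::new(SystemClock))
    }

    pub fn with_clock(clock: Box<dyn Clock>) -> Self {
        Self { clock }
    }

    /// Hypothetical helper: the timestamp a decision would be stamped with.
    pub fn evaluated_at(&self) -> SystemTime {
        self.clock.now()
    }
}

/// Builds a policy on a fixed clock and returns its timestamp.
pub fn fixed_policy_timestamp() -> SystemTime {
    let instant = UNIX_EPOCH + Duration::from_secs(1_777_000_000);
    HostTrustPolicy::with_clock(Box::new(FixedClock(instant))).evaluated_at()
}
```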

Verification: cargo fmt clean, cargo clippy --all --all-features -D
warnings clean, cargo test -p ironclaw_host_api -p ironclaw_trust -p
ironclaw_architecture --features ironclaw_trust/test-fixtures → 47/47.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@nickpismenkov nickpismenkov added the skip-regression-check Bypass regression test CI gate (tests exist but not in tests/ dir) label Apr 28, 2026
The fix(trust): commits in this branch trip the regression-check
workflow's IS_FIX heuristic, but the workflow's tests-detection step
fails to see the 24 added test markers in the diff on CI even though
they are visible locally — likely an actions/checkout vs pull-request
merge-commit interaction.

This empty commit's body carries the explicit
[skip-regression-check]
directive that the workflow honors so the false-positive failure
unblocks merge. The PR also has the matching label applied.

Tests added in this PR (visible locally via
`git diff origin/reborn-integration...HEAD -U0 -- '*.rs' | \
   grep -E '^\+.*(#\[test\]|#\[cfg\(test\)\]|mod tests)'`):
24 added test markers across host_api_contract.rs (+4),
policy_contract.rs (+15), and ironclaw_trust/src/lib.rs (+2).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@nickpismenkov nickpismenkov linked an issue Apr 28, 2026 that may be closed by this pull request
10 tasks
Collaborator

@zmanian zmanian left a comment

Strong PR overall. This is exactly the host-controlled trust-class policy engine I argued for in #issuecomment-4328881303, and the type-level requested/effective split is the right answer to the manifest self-assertion gap I flagged on PR1. Worth pulling out what works before raising concerns:

What's working well:

  • RequestedTrustClass (freely deserializable) vs EffectiveTrustClass (newtype with pub(crate) privileged constructors) is precisely the boundary the freeze packet was missing. The compile-time assert_not_impl_any!(EffectiveTrustClass: serde::de::DeserializeOwned) pins the invariant against future refactor drift.
  • test-fixtures Cargo feature gating is exactly the right level of paranoia for the privileged constructors. Production builds physically cannot enable it without an explicit Cargo.toml change.
  • TrustProvenance recorded on every decision gives audit a real story for why a decision came back the way it did.
  • Ok(None) (source did not recognize) vs Err (real evaluation failure) is a clean trait contract.
  • Digest pinning in BundledRegistry::evaluate correctly implements AC #7: a drift returns Ok(None), falls through to default downgrade.
  • SignedRegistry returning Ok(None) until verification lands is the safe default — won't accidentally trust on the basis of a self-declared signer field.
  • Clock injection (HostTrustPolicy::with_clock) makes a security-critical path deterministic.
  • The doc-level orchestration contract on BundledRegistry/AdminConfig mutators ("publish before next dispatch") is loud and explicit.
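
To make the Ok(None) vs Err contract and the digest-pinning fall-through concrete, here is a small sketch. The trait signature, the `Bundled` and `Unverified` structs, and `evaluate_layers` are simplified assumptions (string ids and digests, a two-variant trust enum), not the crate's real types.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trust {
    Sandbox,
    FirstParty,
}

#[derive(Debug, PartialEq, Eq)]
pub struct TrustError(pub String);

pub trait PolicySource {
    /// Ok(None): this source does not recognize the package, try the next.
    /// Err: evaluation itself failed and must not be papered over.
    fn evaluate(&self, package_id: &str, digest: &str) -> Result<Option<Trust>, TrustError>;
}

/// Bundled-style source with a pinned digest.
pub struct Bundled {
    pub id: &'static str,
    pub pinned_digest: &'static str,
}

impl PolicySource for Bundled {
    fn evaluate(&self, package_id: &str, digest: &str) -> Result<Option<Trust>, TrustError> {
        if package_id == self.id && digest == self.pinned_digest {
            Ok(Some(Trust::FirstParty))
        } else {
            // Unknown id or digest drift: decline rather than trust.
            Ok(None)
        }
    }
}

/// Stands in for a source whose verification has not landed yet.
pub struct Unverified;

impl PolicySource for Unverified {
    fn evaluate(&self, _id: &str, _digest: &str) -> Result<Option<Trust>, TrustError> {
        Ok(None)
    }
}

/// First match wins; no match falls through to the default downgrade.
pub fn evaluate_layers(
    sources: &[&dyn PolicySource],
    id: &str,
    digest: &str,
) -> Result<Trust, TrustError> {
    for source in sources {
        if let Some(trust) = source.evaluate(id, digest)? {
            return Ok(trust);
        }
    }
    Ok(Trust::Sandbox)
}
```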

The two gemini comments are misreads worth dismissing: authority_changed does not sort (it's iter().all(contains) bidirectional, as the doc explains), and evaluate does use the injected Clock trait, not Utc::now() directly.
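
For reference, the bidirectional `iter().all(contains)` comparison with a length guard can be sketched as follows. This is an assumed reconstruction over `&str` ids rather than the crate's `CapabilityId`, so treat the signature as hypothetical.

```rust
/// Returns true when the two authority lists differ in length or differ
/// as sets. No sorting and no heap allocation; O(n*m), which is fine for
/// the short lists assumed here.
pub fn authority_changed(previous: &[&str], current: &[&str]) -> bool {
    // Length guard: catches duplicates such as [a, a, b] vs [a, b].
    if previous.len() != current.len() {
        return true;
    }
    // Bidirectional containment: catches [a, a] vs [a, b].
    let forward = previous.iter().all(|cap| current.contains(cap));
    let backward = current.iter().all(|cap| previous.contains(cap));
    !(forward && backward)
}
```

Under these assumptions a reorder such as `["a", "b"]` vs `["b", "a"]` reports no change, while both duplicate cases named in the comments report a change.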


Concerns

CRITICAL — AdminConfig matches by package_id alone across every PackageSource

// AdminConfig matches any source — operators may elevate a package
// installed from any origin.
let Some(entry) = entries.get(&input.identity.package_id) else {
    return Ok(None);
};

AdminEntry carries package_id + effective_trust + allowed_effects + max_resource_ceiling — no digest, no signer, no PackageSource constraint. So if an operator blesses package_id = "operator_blessed" at FirstParty, then any package with that name from any source — including a LocalManifest controlled by an unprivileged user — gets FirstParty effective trust. T13 demonstrates exactly this path: local_manifest_identity("operator_blessed") resolves to TrustClass::FirstParty via AdminConfig.

PackageId::new only validates the character set; it's not authenticated. So a user-installed LocalManifest shadowing an admin-blessed identifier escalates to FirstParty.

BundledRegistry doesn't have this problem because (a) it gates on PackageSource::Bundled and (b) it has digest pinning. AdminConfig has neither.

Suggested fix (any one):

  • AdminEntry requires digest and/or signer, and AdminConfig::evaluate enforces them; or
  • AdminEntry carries an explicit PackageSource discriminant the entry must match; or
  • AdminConfig::evaluate rejects PackageSource::LocalManifest packages by default, with operator opt-in for "I really do mean to elevate this local-manifest package."

The current doc comment ("operators may elevate a package installed from any origin") names the design but understates the consequence. As long as operators are forced to use globally unique identifiers, this is "fine in production" — but the structure leaves the foot gun in the same crate that just type-locked the rest of the surface.
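
A hedged sketch of the shadowing path and of the first suggested fix. Everything here is simplified and hypothetical (string ids and digests, the `bless`, `evaluate_by_id`, and `evaluate_pinned` names); it only illustrates why an id-only lookup cannot distinguish the blessed package from a local manifest reusing its name.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PackageSource {
    Bundled,
    LocalManifest,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trust {
    Sandbox,
    FirstParty,
}

pub struct AdminEntry {
    pub source: PackageSource,
    pub digest: String,
    pub trust: Trust,
}

pub struct AdminConfig {
    entries: HashMap<String, AdminEntry>,
}

impl AdminConfig {
    pub fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    pub fn bless(&mut self, id: &str, entry: AdminEntry) {
        self.entries.insert(id.to_string(), entry);
    }

    /// Id-only match: the shape flagged above. Any origin with this id wins.
    pub fn evaluate_by_id(&self, id: &str) -> Option<Trust> {
        self.entries.get(id).map(|entry| entry.trust)
    }

    /// One possible fix: the entry must also match source and digest.
    pub fn evaluate_pinned(&self, id: &str, source: &PackageSource, digest: &str) -> Option<Trust> {
        self.entries
            .get(id)
            .filter(|entry| &entry.source == source && entry.digest == digest)
            .map(|entry| entry.trust)
    }
}

/// An operator blesses a Bundled package at FirstParty.
pub fn blessed_config() -> AdminConfig {
    let mut cfg = AdminConfig::new();
    cfg.bless(
        "operator_blessed",
        AdminEntry {
            source: PackageSource::Bundled,
            digest: "sha256:good".to_string(),
            trust: Trust::FirstParty,
        },
    );
    cfg
}
```

With the pinned variant, a `LocalManifest` package named `operator_blessed` gets `None` from this source and falls through to whatever the default decision is.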

HIGH — InvalidationBus orchestration is enforced only by caller discipline

The contract is correct and the docs are loud, but there's no type-level coupling between mutating a BundledRegistry/AdminConfig/SignedRegistry and publishing a TrustChange. A future caller can call registry.remove(pid) and forget the bus.publish(...) step. AC #6 silently breaks. The compiler won't catch it.

PR3's ironclaw_authorization wiring is where this matters most. Worth either:

  • adding a RegistryMutator<'a, 'b> { registry: &'a Registry, bus: &'b InvalidationBus, prev: TrustDecision } lifetime-bound wrapper that publishes atomically on drop or commit; or
  • collapsing upsert/remove into a single API that returns the previous entry so the caller cannot forget the previous-state half of the contract.

For PR1b this is acceptable because no caller exists yet, but the PR3 review should specifically check this orchestration. Today's docs describe the rule; nothing enforces it.

HIGH — default_decision for unmatched Bundled / Registry is UserTrusted, not Sandbox

PackageSource::Bundled | PackageSource::Registry { .. } | PackageSource::Admin => {
    EffectiveTrustClass::user_trusted()
}

Two specific concerns:

  1. Unmatched Bundled shouldn't exist. "Bundled" means compiled into the host binary. If a Bundled package is missing from BundledRegistry, something is misconfigured. Falling soft to UserTrusted covers the bug; falling closed to Sandbox (or returning a hard error, as the doc anticipates for PR3) makes it audible.
  2. Unmatched Registry is more dangerous. Registry { url } is a remote source. SignedRegistry::evaluate returns Ok(None) until verification lands. So an unsigned remote package today gets UserTrusted authority via no verification at all. That's fail-open in a security-critical surface. Suggest Sandbox for unmatched Registry until signature verification ships.
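
The difference between the current fallback and a fail-closed one can be sketched like this. The enums are simplified stand-ins and both function names are hypothetical; the first function mirrors the match arm quoted above, the second is the suggested alternative.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PackageSource {
    Bundled,
    Registry { url: String },
    Admin,
    LocalManifest,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trust {
    Sandbox,
    UserTrusted,
}

/// Fallback as quoted in the review: only LocalManifest drops to Sandbox.
pub fn default_decision_soft(source: &PackageSource) -> Trust {
    match source {
        PackageSource::LocalManifest => Trust::Sandbox,
        PackageSource::Bundled | PackageSource::Registry { .. } | PackageSource::Admin => {
            Trust::UserTrusted
        }
    }
}

/// Fail-closed alternative: every unmatched origin gets Sandbox. The match
/// has no wildcard arm, so adding a new PackageSource variant forces an
/// explicit decision at compile time.
pub fn default_decision_closed(source: &PackageSource) -> Trust {
    match source {
        PackageSource::Bundled
        | PackageSource::Registry { .. }
        | PackageSource::Admin
        | PackageSource::LocalManifest => Trust::Sandbox,
    }
}
```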

MEDIUM — TrustChange with previous == current (no-op or upgrade) is publishable

InvalidationBus::publish runs every listener regardless of whether the change is a downgrade, no-op, or upgrade. The doc emphasizes the downgrade case, but listeners that read "TrustChange ⇒ invalidate active grants" will incorrectly invalidate on benign upgrades or no-ops.

Suggest a TrustChange::is_downgrade(&self) -> bool and either a guard in publish or stronger listener-side documentation. Or define the type to require previous != current at construction.
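
One way the last two options could combine, as a sketch only: a constructor that refuses no-ops plus direction helpers. The numeric `authority_level` ordering is an assumption, and the struct is reduced to the two fields needed here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trust {
    Sandbox,
    UserTrusted,
    FirstParty,
    System,
}

impl Trust {
    /// Assumed total order of authority, lowest first.
    pub fn authority_level(self) -> u8 {
        match self {
            Trust::Sandbox => 0,
            Trust::UserTrusted => 1,
            Trust::FirstParty => 2,
            Trust::System => 3,
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub struct TrustChange {
    previous: Trust,
    current: Trust,
}

impl TrustChange {
    /// Returns None for a no-op, so previous != current holds for every
    /// value that can reach a listener.
    pub fn new(previous: Trust, current: Trust) -> Option<Self> {
        if previous == current {
            None
        } else {
            Some(Self { previous, current })
        }
    }

    pub fn is_downgrade(&self) -> bool {
        self.current.authority_level() < self.previous.authority_level()
    }

    pub fn is_upgrade(&self) -> bool {
        self.current.authority_level() > self.previous.authority_level()
    }
}
```

A listener that only cares about revocation can then check `is_downgrade()` and ignore upgrades instead of invalidating on every event.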

MEDIUM — authority_changed over-fires on duplicate entries

Capability authority is conceptually a set, but the function compares &[CapabilityId] slices. The doc explains why the length guard is needed ([a, a, b] vs [a, b]), but treating those two as different forces grant reissue when the content is identical. Two callers with the same effective authority but different list-canonicalization will trigger unnecessary reissues.

Suggest typing previous_authority as HashSet<CapabilityId> (or BTreeSet for determinism), or canonicalizing before comparison.
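
A short sketch of the set-typed variant, using `String` ids in place of `CapabilityId`; the `canonicalize` helper is hypothetical. With `BTreeSet`, duplicates and ordering disappear at the type boundary, so the comparison reduces to equality.

```rust
use std::collections::BTreeSet;

/// Collapses duplicates and ordering differences into one canonical set.
pub fn canonicalize(ids: &[&str]) -> BTreeSet<String> {
    ids.iter().map(|id| id.to_string()).collect()
}

/// With set-typed inputs, "changed" is simply inequality.
pub fn authority_changed(previous: &BTreeSet<String>, current: &BTreeSet<String>) -> bool {
    previous != current
}
```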

LOW — LocalDevOverride has unreachable interior

enabled: bool is private with no setter; overrides: RwLock<HashMap<...>> is #[allow(dead_code)]. PR1b ships a structural seam that can't be exercised even by tests. Consider:

  • gating the whole struct behind a future dev-override feature, or
  • exposing a pub(crate) test-fixtures setter so the seam is at least exercisable in the test suite.

The current shape works as a forward-compat placeholder, but it's dead code today.

LOW — Missing privileged round-trip serialization test

EffectiveTrustClass does implement Serialize (for audit envelopes). There's a smoke test serializing user_trusted(), but nothing pins the wire shape for first_party() / system() (which can be constructed via the test-fixtures feature). One assertion that serde_json::to_value(effective_first_party_for_test()) produces json!("first_party") would lock the audit shape against accidental rename.


Verdict

Approve with one CRITICAL follow-up (the AdminConfig cross-source name-collision attack surface) and the orchestration concerns flagged for PR3 wiring. The substrate is correct, the type discipline is the strongest in the Reborn stack so far, and the test coverage maps cleanly to the AC matrix. The AdminConfig issue is the only one that genuinely changes the security posture; the rest are quality-of-implementation items that can land as follow-ups.

Happy to file individual issues for the critical and HIGH items if useful.

@henrypark133
Collaborator

I think the requested/effective trust split is directionally right. The PR gives us a useful type-level boundary between untrusted manifest input and host-approved effective trust.

I still think a few contract details should be clarified before downstream Reborn crates build against this.

What existing Reborn docs already establish

The broader model is already documented pretty well:

  • TrustClass is an authority ceiling, not a grant or kernel bypass.
    • docs/reborn/contracts/host-api.md
    • docs/reborn/contracts/kernel-boundary.md
  • User-installed packages cannot self-declare FirstParty / System; those ceilings come from host policy, signed/bundled metadata, or admin config.
    • docs/reborn/contracts/host-api.md
    • docs/reborn/contracts/extensions.md
  • Shipped first-party code and bundled loops still need explicit grants, mounts, leases, resources, and obligations.
    • docs/reborn/contracts/host-api.md
    • docs/reborn/contracts/capability-access.md
  • Registered capabilities are only possibilities, not authority; dispatch still requires grants/leases.
    • docs/reborn/contracts/capability-access.md
    • docs/reborn/contracts/capabilities.md
  • Credential accounts are metadata only; raw secrets still require scoped lease_once + consume.
    • docs/reborn/contracts/secrets.md
    • docs/reborn/contracts/host-runtime.md

So the concern is not with the high-level direction. The concern is that this PR introduces the concrete trust-policy substrate, and a few source-of-truth relationships are still implicit.

1. Please make the exact evaluation contract explicit

The main missing piece for me is a small policy matrix for:

PackageIdentity
+ RequestedTrustClass
+ requested authority
+ policy source entries
-> EffectiveTrustClass
+ AuthorityCeiling
+ provenance

The code changes that make this important are:

  • crates/ironclaw_host_api/src/trust.rs
    • adds RequestedTrustClass, PackageIdentity, PackageSource
  • crates/ironclaw_trust/src/policy.rs
    • adds TrustPolicyInput, HostTrustPolicy, SourceMatch, default fallback behavior
  • crates/ironclaw_trust/src/sources.rs
    • adds BundledRegistry, AdminConfig, SignedRegistry, LocalDevOverride
  • crates/ironclaw_trust/src/decision.rs
    • adds EffectiveTrustClass, AuthorityCeiling, TrustDecision

Could the PR docs or crate docs define, per policy source:

  • which fields are match keys: package_id, source, digest, signer, requested trust, requested authority;
  • whether RequestedTrustClass constrains the output or is only an input claim/audit signal;
  • what happens when requested trust is lower than the matched policy entry;
  • what happens when requested trust is higher than the matched policy entry;
  • whether requested authority constrains AuthorityCeiling or only drives grant reissue/invalidation;
  • whether mismatches deny, downgrade, or fall through to the next source;
  • whether first-match-wins is intended as stable product policy.

Without this table, downstream crates could reasonably disagree about whether requested trust is a cap, a claim, or just audit metadata.

2. Please define PackageIdentity against existing extension terminology

Existing contracts mostly talk in terms of:

  • ExtensionId
  • ExtensionPackage
  • extension manifests
  • loop packages
  • package/capability names
  • CapabilityDescriptor.provider: ExtensionId

This PR adds PackageId / PackageIdentity as trust-policy vocabulary. That may be the right abstraction, but the relationship should be explicit.

Could you clarify whether PackageIdentity means:

  • only extension packages;
  • extensions plus loop packages;
  • skills/MCP adapters too;
  • any installable userland unit;
  • or also host-bundled/built-in capabilities?

This matters because trust, grant retention, manifest validation, registry lookup, filesystem layout, dispatcher checks, and credential/account scoping all need to interpret the same identity consistently.

3. Please clarify built-in tool / host capability mapping

The Reborn contracts seem to want one authority path:

CapabilityHost -> authorization -> dispatcher/runtime

Relevant docs:

  • docs/reborn/contracts/capabilities.md
  • docs/reborn/contracts/capability-access.md
  • docs/reborn/contracts/dispatcher.md
  • docs/reborn/contracts/runtime-selection.md
  • docs/reborn/contracts/lightweight-agent-loop.md

If this trust policy applies only to custom/user-installed packages while existing built-in tools continue through a separate ToolRegistry / approval path, we may end up with divergent authorization and credential-guard semantics.

Could you clarify whether built-ins should eventually become host-bundled PackageIdentity + CapabilityDescriptor entries?

Examples:

  • built-in shell/script tool -> bundled Script/System capability?
  • built-in HTTP/web-fetch tool -> bundled capability with network obligations?
  • built-in message tool -> bundled host-service capability?
  • built-in Gmail/GitHub credential-backed tools -> bundled capabilities with secret-injection obligations?

If built-ins are intentionally outside this policy engine, what is the equivalent trust ceiling, grant/lease, approval, secret-injection, network, mount, invalidation, and audit path for them?

This does not need to migrate every built-in tool in this PR, but the contract should define the intended relationship so Reborn does not split custom packages and built-ins into two security models.

4. Please align manifest/descriptor trust_ceiling with requested/effective trust

One concrete doc/type alignment issue: existing docs still show:

CapabilityDescriptor {
    trust_ceiling: TrustClass,
}

and docs/reborn/contracts/extensions.md says descriptor trust_ceiling is inherited from manifest trust.

This PR introduces:

  • manifest-side/request-side RequestedTrustClass;
  • policy-output EffectiveTrustClass;
  • policy-output AuthorityCeiling.

Could you define the relationship between those concepts?

For example, should manifests/descriptors carry requested trust only, while effective trust is attached later by policy? Or should CapabilityDescriptor.trust_ceiling remain but be explicitly treated as requested/declarative metadata until the trust policy evaluates it?

The current wording could be read as the manifest directly providing an effective trust ceiling, which conflicts with the new requested/effective split.

5. Please centralize mutation + invalidation in the follow-up wiring

crates/ironclaw_trust/src/invalidation.rs introduces the right idea: trust downgrade/revocation should invalidate affected grants before later side effects.

The part that still feels easy to misuse is that source mutation and invalidation publication are separated:

  • policy sources expose mutators such as upsert / remove;
  • docs say the caller must publish TrustChange on the InvalidationBus.

That may be the right split because the caller knows the previous decision and active authority. But the follow-up wiring should centralize:

capture previous decision
-> mutate source
-> publish TrustChange
-> then allow future dispatch

so individual callers do not have to remember the sequence manually.
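
The sequence above could be centralized roughly as follows. This is a sketch under heavy simplification: a single map stands in for the policy sources, a `Vec` stands in for the InvalidationBus listeners, and `set_trust` is a hypothetical name rather than anything in the PR.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trust {
    Sandbox,
    FirstParty,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TrustChange {
    pub package_id: String,
    pub previous: Trust,
    pub current: Trust,
}

pub struct Policy {
    entries: HashMap<String, Trust>,
    // Stand-in for the bus: records what listeners would have received.
    published: Vec<TrustChange>,
}

impl Policy {
    pub fn new() -> Self {
        Self { entries: HashMap::new(), published: Vec::new() }
    }

    pub fn evaluate(&self, id: &str) -> Trust {
        self.entries.get(id).copied().unwrap_or(Trust::Sandbox)
    }

    /// Single entry point for mutation. `None` removes the entry.
    pub fn set_trust(&mut self, id: &str, trust: Option<Trust>) {
        // 1. Capture the previous decision.
        let previous = self.evaluate(id);
        // 2. Mutate the source.
        match trust {
            Some(t) => {
                self.entries.insert(id.to_string(), t);
            }
            None => {
                self.entries.remove(id);
            }
        }
        // 3. Publish the change before returning.
        let current = self.evaluate(id);
        if previous != current {
            self.published.push(TrustChange { package_id: id.to_string(), previous, current });
        }
        // 4. The &mut borrow ends here; only now can callers evaluate or dispatch.
    }

    pub fn published(&self) -> &[TrustChange] {
        &self.published
    }
}
```

Because the whole sequence runs under one `&mut self` borrow, no caller can observe the mutated state before the change has been published.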

Summary

Overall, I like the security direction. The type split is useful and the trust-ceiling-not-grant model matches the Reborn contracts.

The clarification I am asking for is mostly about making this a precise cross-crate contract:

  • exact evaluation matrix;
  • PackageIdentity terminology and scope;
  • built-in tool / host capability mapping;
  • manifest requested trust vs policy effective trust alignment;
  • centralized mutation + invalidation wiring.

Those clarifications would make it much easier to review the follow-up authorization integration and prevent trust, grants, built-ins, credentials, and package identity from drifting across crates.

Address the review items spanning security correctness, API
discipline, and contract documentation from the cross-crate review
of PR3043 (Henry's review).

Security fixes:
- AdminConfig: bind elevation to (package_id, source, digest) so a
  LocalManifest cannot shadow an admin-blessed Bundled id. The
  highest-risk path lives behind AdminEntry::for_local_manifest so
  every elevation of a user-writable origin is greppable. Closes
  the cross-source shadowing footgun documented in T13b/T13c/T13d.
- default_decision: fail-closed across every PackageSource. An
  unmatched Registry url no longer picks up UserTrusted by self-
  declaration; unmatched Bundled / Admin also drop to Sandbox. T15
  pins the contract.

API tightening:
- HostTrustPolicy::mutate_with is the only public runtime-mutation
  path; per-source upsert/remove are pub(crate) and reachable only
  through SourceMutators inside a mutate_with closure. AC #6 (trust
  changes invalidate before next dispatch) is now a compile-time
  guarantee rather than caller discipline. T14 family.
- TrustChange::new(...) -> Option<Self> filters no-ops at construction;
  is_downgrade / is_upgrade / is_kind_change helpers backed by
  EffectiveTrustClass::authority_level let listeners be selective
  rather than over-revoking on benign upgrades. InvalidationBus::publish
  drops no-ops as defense-in-depth (debug_assert in dev). T16 family.
- TrustChange.previous_authority / TrustPolicyInput.requested_authority
  / authority_changed / grant_retention_eligible typed as
  BTreeSet<CapabilityId>, not Vec / slice. Closes the [a, a, b] vs
  [a, b] over-fire at the type boundary. Required PartialOrd/Ord
  on string-id newtypes in host_api (additive change).
- LocalDevOverride exposes is_enabled / override_count accessors and
  a test-fixtures-only enabled_for_test constructor; inert contract
  is now testable via T17.
- EffectiveTrustClass canonical wire shapes pinned for all four
  variants (T18) so audit envelopes survive serde renames.

Documentation:
- New crates/ironclaw_trust/CONTRACT.md (421 lines) — cross-crate
  contract co-located with the code. Covers evaluation matrix
  (per-source match keys; requested_trust audit-only; mismatch =
  fall through; first-match-wins), PackageIdentity scope, requested
  vs effective trust split, mutation orchestration, set-typed
  authority, and built-in tool migration intent with V1 vs
  post-migration axis mapping.
- crates/ironclaw_host_api/src/trust.rs: type-level docs on
  PackageIdentity / RequestedTrustClass / module level answer
  PackageIdentity scope and CapabilityDescriptor.trust_ceiling
  reconciliation at the source. Manifest `trust = "..."` mapping
  table inline.
- CLAUDE.md and lib.rs point at CONTRACT.md.

Verification:
  cargo fmt clean
  cargo clippy --all-features --tests -D warnings clean
  30 trust contract tests + 2 lib smoke tests + host_api tests pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@nickpismenkov nickpismenkov requested a review from zmanian April 29, 2026 05:04
}
}

pub fn with_entries<I: IntoIterator<Item = AdminEntry>>(entries: I) -> Self {

High Severity: AdminConfig is keyed only by PackageId, despite the contract binding trust to (package_id, source).

PackageIdentity explicitly documents that same-package_id packages from different PackageSources are distinct trust subjects, and the evaluator checks entry.source != input.identity.source. But the backing map stores only one AdminEntry per PackageId:

HashMap<PackageId, AdminEntry>

That means an admin config containing both for_bundled("foo", ...) and for_registry("foo", ...) silently drops whichever entry was inserted first, and admin_remove(&PackageId) removes the whole id regardless of source. This breaks the source-pin invariant and can also cause untracked trust changes for the overwritten/removed source.

Please key admin entries by the full trust subject (at least (PackageId, PackageSource)) and make admin_upsert / admin_remove source-aware. Add a regression test with two same-id entries for different sources and verify both evaluate independently.
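
One way the source-aware keying requested above could look, as a sketch only. Every type here is a simplified stand-in: the real `PackageSource` likely carries more data (for example a registry URL), and `upsert` / `remove` / `get` are illustrative names, not the crate's `admin_upsert` / `admin_remove` signatures.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the host_api types.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PackageId(pub String);

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum PackageSource {
    Bundled,
    Registry,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AdminEntry {
    pub trust: &'static str,
}

/// Admin entries keyed by the full trust subject, not by id alone.
#[derive(Default)]
pub struct AdminConfig {
    entries: HashMap<(PackageId, PackageSource), AdminEntry>,
}

impl AdminConfig {
    /// Returns the entry this call replaced, if any, so the caller can
    /// track the change instead of losing it silently.
    pub fn upsert(
        &mut self,
        id: PackageId,
        source: PackageSource,
        entry: AdminEntry,
    ) -> Option<AdminEntry> {
        self.entries.insert((id, source), entry)
    }

    /// Removes only the entry for this exact (id, source) pair.
    pub fn remove(&mut self, id: &PackageId, source: PackageSource) -> Option<AdminEntry> {
        self.entries.remove(&(id.clone(), source))
    }

    pub fn get(&self, id: &PackageId, source: PackageSource) -> Option<&AdminEntry> {
        self.entries.get(&(id.clone(), source))
    }
}
```

With the composite key, a bundled "foo" and a registry "foo" coexist, and removing one leaves the other in place, which is the behavior the requested regression test would pin.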

let mutators = SourceMutators {
sources: &self.sources,
};
let result = f(&mutators)?;

High Severity: mutate_with can change policy state and then skip invalidation entirely when the closure returns an error.

The closure receives live in-place mutators. If it performs a downgrade/removal and then returns Err, this ? exits before the post-evaluation and before bus.publish runs. The policy state remains changed, but active grants/leases are never told about it. The new test t14e_mutate_with_short_circuits_on_closure_error codifies exactly this unsafe behavior.

A concrete scenario: admin_remove() succeeds, a later operation in the same closure fails, mutate_with returns Err, and the package has now fallen from FirstParty to default Sandbox without a TrustChange. Stale privileged grants can survive under the old ceiling.

Please make mutation transactional/queued, roll back on closure errors, or at minimum detect that a mutation occurred and still post-evaluate/publish any authority reduction before returning the closure error.
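
A sketch of the "at minimum" option: hold the closure result, post-evaluate and publish, and only then return the error. The `Policy` type, its single `Trust` value, and the `published` log are hypothetical simplifications of `HostTrustPolicy`, `SourceMutators`, and `InvalidationBus`; a transactional or rollback design would look different.

```rust
use std::sync::Mutex;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trust {
    Sandbox,
    FirstParty,
}

/// Hypothetical policy with one trust value and an in-memory publish log.
pub struct Policy {
    state: Mutex<Trust>,
    published: Mutex<Vec<(Trust, Trust)>>,
}

impl Policy {
    pub fn new(initial: Trust) -> Self {
        Policy {
            state: Mutex::new(initial),
            published: Mutex::new(Vec::new()),
        }
    }

    pub fn evaluate(&self) -> Trust {
        *self.state.lock().unwrap()
    }

    /// Stand-in for an in-place source mutator.
    pub fn set(&self, next: Trust) {
        *self.state.lock().unwrap() = next;
    }

    pub fn published(&self) -> Vec<(Trust, Trust)> {
        self.published.lock().unwrap().clone()
    }

    pub fn mutate_with<T, E>(&self, f: impl FnOnce(&Policy) -> Result<T, E>) -> Result<T, E> {
        let prev = self.evaluate();
        // No `?` here: the result is held until invalidation has run.
        let result = f(self);
        let curr = self.evaluate();
        if prev != curr {
            self.published.lock().unwrap().push((prev, curr));
        }
        result
    }
}
```

In the scenario above, a closure that removes an entry and then fails still produces a (FirstParty, Sandbox) record before the caller sees the `Err`.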

Comment thread crates/ironclaw_trust/src/policy.rs Outdated
};
let result = f(&mutators)?;

let curr = self.evaluate(&probe)?;

High Severity — there is no synchronization that enforces “publish before any subsequent evaluate returns the new decision.”

mutate_with pre-evaluates, runs in-place source mutations, post-evaluates, then publishes. The source mutators only take per-source write locks and release them before mutate_with post-evaluates and publishes. A concurrent thread can call evaluate() after the source mutation is visible but before bus.publish(change) runs, observe the new lower trust decision, and proceed before grants have been invalidated.

That violates the core fail-closed contract in this PR: trust downgrade/revocation must invalidate active grants before any further side effect can run under stale authority.

Please add a policy-level synchronization/epoch mechanism that spans pre-evaluate → mutation → post-evaluate → publish, or otherwise make evaluate/dispatch wait until the corresponding invalidation has completed.
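
A sketch of one possible policy-level lock, assuming a single `RwLock` that `evaluate` takes for reading and `mutate_with` holds for writing across the whole pre-evaluate, mutate, post-evaluate, publish sequence. All names are hypothetical, and the real engine has several sources rather than one value; an epoch counter would be an alternative design.

```rust
use std::sync::{Mutex, RwLock};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trust {
    Sandbox,
    FirstParty,
}

pub struct Policy {
    // One policy-level lock: readers evaluate, the writer mutates and publishes.
    state: RwLock<Trust>,
    invalidations: Mutex<Vec<(Trust, Trust)>>,
}

impl Policy {
    pub fn new(initial: Trust) -> Self {
        Policy {
            state: RwLock::new(initial),
            invalidations: Mutex::new(Vec::new()),
        }
    }

    /// Blocks while a mutation is in flight, so it can never return a
    /// decision whose invalidation has not been published yet.
    pub fn evaluate(&self) -> Trust {
        *self.state.read().unwrap()
    }

    pub fn invalidations(&self) -> Vec<(Trust, Trust)> {
        self.invalidations.lock().unwrap().clone()
    }

    pub fn mutate_with(&self, f: impl FnOnce(&mut Trust)) {
        let mut guard = self.state.write().unwrap();
        let prev = *guard;
        f(&mut *guard);
        let curr = *guard;
        if prev != curr {
            // Published while the write lock is still held. Listeners must
            // not call `evaluate` from here or they would deadlock.
            self.invalidations.lock().unwrap().push((prev, curr));
        }
        // The write guard drops here; only now can readers observe `curr`.
    }
}
```

The trade-off is that listeners run under the write lock, so they have to be short and must not re-enter the policy.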

Comment thread crates/ironclaw_trust/src/policy.rs Outdated
// is the canonical filter, so a closure that mutates without
// changing this identity's effective trust class produces no
// publish.
if let Some(change) = TrustChange::new(

High Severity — invalidation only considers effective_trust, not reductions in the authority ceiling.

TrustChange::new is called with only prev.effective_trust and curr.effective_trust. If an entry keeps the same trust class but removes allowed_effects or lowers max_resource_ceiling, no event is published. Existing grants issued under the old ceiling can therefore survive after the policy has reduced what may be granted.

This is not just metadata: AuthorityCeiling is the cap downstream authorization is expected to intersect with each CapabilityGrant. A same-class reduction from [ReadFilesystem, WriteFilesystem] to [ReadFilesystem], or a lower resource ceiling, needs the same fail-closed invalidation behavior as a trust downgrade.

Please compare the full authority ceiling (or emit a separate authority-ceiling change event) and publish invalidation whenever allowed effects/resource ceilings shrink, even if the trust class is unchanged. Add a test that mutates an entry from the same EffectiveTrustClass with fewer effects and verifies the bus fires.
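
A sketch of the shrink check this asks for. The field names follow the comment above, but the types are assumptions: `max_resource_ceiling` is modeled as a single `u64`, `Effect` has only the two variants named here, and `shrinks_to` is a hypothetical helper.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Effect {
    ReadFilesystem,
    WriteFilesystem,
}

/// Hypothetical, simplified ceiling: a set of effects plus one numeric limit.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AuthorityCeiling {
    pub allowed_effects: BTreeSet<Effect>,
    pub max_resource_ceiling: u64,
}

impl AuthorityCeiling {
    /// True when `next` permits less than `self` on any axis, even if the
    /// trust class attached to both ceilings is identical.
    pub fn shrinks_to(&self, next: &AuthorityCeiling) -> bool {
        // An effect was lost if the old set is not contained in the new one.
        let lost_effect = !self.allowed_effects.is_subset(&next.allowed_effects);
        let lower_limit = next.max_resource_ceiling < self.max_resource_ceiling;
        lost_effect || lower_limit
    }
}
```

A `mutate_with` that publishes whenever `prev_ceiling.shrinks_to(&curr_ceiling)` holds would cover the same-class reduction from [ReadFilesystem, WriteFilesystem] to [ReadFilesystem].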

Comment thread crates/ironclaw_trust/Cargo.toml Outdated
# Off by default — production builds must NEVER enable this. Tests opt in
# via `cargo test --features test-fixtures` (or `--features
# ironclaw_trust/test-fixtures` from the workspace root).
test-fixtures = []

High Severity: test-fixtures exposes privileged effective-trust constructors as a normal Cargo feature.

The PR states production builds cannot import privileged constructors, but test-fixtures is a regular additive feature that makes pub mod fixtures available. Any workspace crate or downstream build that enables ironclaw_trust/test-fixtures can call effective_system_for_test() outside tests and produce privileged EffectiveTrustClass values without policy evaluation.

Because the central invariant is “privileged effective trust only comes from the policy engine,” this should not be a public feature surface. Prefer keeping privileged tests inside crate-internal #[cfg(test)] modules, or add a hard compile-time guard that prevents building non-test targets with test-fixtures enabled.
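
A sketch of one partial compile-time guard, under stated assumptions. It rejects optimized builds (where `debug_assertions` is off) that enable the feature. It does not cover debug builds, and it would also reject `cargo test --release` with the feature on, so it is weaker than keeping the constructors behind `#[cfg(test)]`. The body of `effective_system_for_test` and the `FIXTURES_COMPILED_IN` constant are placeholders.

```rust
// Assumption: refuse to compile release-profile builds with fixtures enabled.
#[cfg(all(feature = "test-fixtures", not(debug_assertions)))]
compile_error!("`test-fixtures` must not be enabled in release builds");

/// Privileged constructors live here and exist only when the feature is on.
#[cfg(feature = "test-fixtures")]
pub mod fixtures {
    /// Placeholder body; the real constructor returns an EffectiveTrustClass.
    pub fn effective_system_for_test() -> &'static str {
        "system"
    }
}

/// Reports whether this build has the fixtures compiled in.
pub const FIXTURES_COMPILED_IN: bool = cfg!(feature = "test-fixtures");
```

Because `cfg(test)` is set only for a crate's own unit-test build, a guard written as `not(test)` would also fire for integration tests that link the library, which is why this sketch keys on the build profile instead.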

@serrrfirat

I think the mechanics here are right, but I’m worried the naming may blur two different Reborn concepts.

crates/ironclaw_trust/src/decision.rs defines the output of trust-policy evaluation — package identity + requested trust -> effective trust ceiling/provenance. That seems distinct from the existing host-api Decision concept, which is an invocation authorization result (Allow/Deny + obligations).

Would it be clearer to name this TrustEvaluation / TrustPolicyEvaluation and perhaps move decision.rs to evaluation.rs? Similarly, AuthorityCeiling might be clearer as GrantCeiling, since this value caps what grants may be issued but does not itself authorize an invocation.

The important distinction I want to preserve is:

  • trust evaluation: “what trust ceiling does host policy assign this package?”
  • authorization decision: “can this specific invocation run?”
  • runtime profile resolution: “what execution backend/policy is active?”

I don’t think this is a functional blocker, but the naming may make future wiring into ironclaw_authorization easier to reason about.

@serrrfirat
Copy link
Copy Markdown
Collaborator

I like the intent of §9 — built-ins should eventually move from implicit ToolRegistry trust to Bundled PackageIdentity + CapabilityDescriptor + EffectiveTrustClass. But I think the current table is too coarse to be treated as the contract for current tools.

A few concerns:

  • It maps individual tools directly to trust, whereas trust policy should evaluate provider/package identity (ironclaw.filesystem, ironclaw.shell, ironclaw.memory, etc.) and capabilities should carry the per-action effects.
  • message (channel-bound) -> System seems too broad. User-facing channel send/reply feels more like FirstParty; only internal host notifications should be System.
  • credential-backed (Gmail / GitHub / Slack) -> Bundled depends on provenance. If shipped by IronClaw, yes; if installed via MCP/WASM/registry, it should be Registry/LocalManifest and policy-evaluated.
  • memory / workspace -> ReadFilesystem/WriteFilesystem is probably only an approximation; memory has its own service semantics.
  • shell -> FirstParty needs the explicit caveat that this does not imply LocalHost or host-full access. Runtime profiles / EffectiveRuntimePolicy select Docker/SRT/LocalHost and mount/network authority.
  • Many current built-ins are missing: jobs/processes, routines, extension lifecycle, skills, secrets, image tools, restart, system introspection, tool_permission_set, etc.

Could we either mark this table as illustrative/non-exhaustive, or reshape it as a provider-package mapping rather than a tool mapping? For example:

ironclaw.filesystem       -> FirstParty -> read/write/list/apply_patch capabilities
ironclaw.shell            -> FirstParty -> ExecuteCode; backend selected by RuntimeProfile
ironclaw.system.lifecycle -> System     -> restart
ironclaw.policy           -> System/admin facade -> tool_permission_set

That would better preserve the axes we’ve been separating:

trust evaluation = provider ceiling
authorization    = invocation grant/approval
runtime profile  = backend/mount/network policy

@serrrfirat serrrfirat added the reborn (IronClaw Reborn architecture and landing work) label Apr 29, 2026
@serrrfirat serrrfirat merged commit fcacf8b into reborn-integration Apr 29, 2026
18 checks passed
@serrrfirat serrrfirat deleted the feat/policy-engine branch April 29, 2026 10:39
This was referenced May 7, 2026
@ironclaw-ci ironclaw-ci Bot mentioned this pull request May 8, 2026

Labels

  • contributor: core (20+ merged PRs)
  • reborn (IronClaw Reborn architecture and landing work)
  • risk: medium (Business logic, config, or moderate-risk modules)
  • scope: dependencies (Dependency updates)
  • scope: docs (Documentation)
  • size: XL (500+ changed lines)
  • skip-regression-check (Bypass regression test CI gate; tests exist but not in tests/ dir)

Development

Successfully merging this pull request may close these issues.

Reborn PR1b: add host-controlled trust-class policy engine

4 participants