
feat(ASI): add behavioral trust evidence type specification #819

Open
0xbrainkid wants to merge 3 commits into OWASP:main from 0xbrainkid:feat/behavioral-trust-evidence-type


@0xbrainkid

Summary

Adds a formal behavioral trust evidence type specification to the ASI (Agentic Top 10) documentation. It addresses the gap identified in issue #802: behavioral trust is not yet formalized as a control class for runtime enforcement.

Per @desiorac's invitation in issue #802.

What it adds

A standardized evidence type for expressing agent behavioral trustworthiness as an input to admissibility predicates at mutation boundaries.

  • Evidence structure: trust_score, derivation, drift_status, verification
  • Trust scoring formula: success_rate × confidence(volume), scoped per task class (deterministic, auditable, no ML required)
  • Three enforceability tiers:
    • Strong (on-chain atomic, ~400ms)
    • Bounded (version-anchored with configurable TTL)
    • Detectable-only (self-declared or signed without anchor)
  • Integration points: OWASP admissibility predicates, mutation boundary gate (from the discussion in issue #802, "Runtime Enforcement Mapping for OWASP Agentic Top 10 (ASI01–ASI10)"), W3C TrustProvider interface, LangGraph multi-provider trust
  • Security considerations: Sybil resistance via confidence function, temporal decay, cold-start delegation chain
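
The scoring formula listed above can be illustrated with a minimal sketch. The description does not define confidence(volume), so the saturating curve volume / (volume + k), the constant HALF_CONFIDENCE_VOLUME, and the TaskClassStats type are assumptions made for illustration only, not the specification's definitions.

```python
from dataclasses import dataclass

# Assumption: confidence(volume) is not defined in the PR description.
# A saturating curve volume / (volume + k) is used purely for illustration.
HALF_CONFIDENCE_VOLUME = 50  # hypothetical: volume at which confidence reaches 0.5


@dataclass(frozen=True)
class TaskClassStats:
    """Outcome counts for one task class (hypothetical container type)."""
    task_class: str
    successes: int
    total: int


def confidence(volume: int, k: int = HALF_CONFIDENCE_VOLUME) -> float:
    """Return 0 at zero volume and approach 1 as volume grows."""
    return volume / (volume + k)


def trust_score(stats: TaskClassStats) -> float:
    """Compute success_rate * confidence(volume) for a single task class."""
    if stats.total == 0:
        return 0.0  # no history means no trust
    success_rate = stats.successes / stats.total
    return success_rate * confidence(stats.total)
```

Under these assumptions, an agent with 3 successes out of 3 attempts scores 3/53 (about 0.057), while one with 50 out of 50 scores 0.5. A low-volume history therefore cannot reach a high score on success rate alone, which is one plausible reading of the Sybil resistance the confidence function is meant to provide.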

Connections to existing work

Adds a formal evidence type definition for behavioral trust scoring
as a control class input to ASI01-ASI03 runtime enforcement.

Includes:
- Evidence structure (trust_score, drift_status, verification)
- Trust scoring formula: success_rate × confidence(volume), per-task-class
- Enforceability classification: strong / bounded / detectable-only
- Integration points: OWASP admissibility predicates, mutation boundary (MITRE), TrustProvider interface (W3C/LangGraph)
- Security considerations: Sybil resistance, temporal decay, cold-start
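
One way to read the three-way enforceability classification is as a function of how the evidence is anchored. The sketch below is an assumption-laden illustration: the parameter names (on_chain_anchor, version_anchor_age_s, ttl_s) are hypothetical, and demoting evidence with an expired TTL to detectable-only is a guess rather than something the PR states.

```python
from enum import Enum
from typing import Optional


class Enforceability(Enum):
    STRONG = "strong"                    # on-chain atomic anchor
    BOUNDED = "bounded"                  # version-anchored with configurable TTL
    DETECTABLE_ONLY = "detectable-only"  # self-declared or signed without anchor


def classify(
    on_chain_anchor: bool,
    version_anchor_age_s: Optional[float],
    ttl_s: float,
) -> Enforceability:
    """Map anchoring properties to a tier (hypothetical parameters)."""
    if on_chain_anchor:
        return Enforceability.STRONG
    # Assumption: a version anchor only counts while it is within its TTL.
    if version_anchor_age_s is not None and version_anchor_age_s <= ttl_s:
        return Enforceability.BOUNDED
    return Enforceability.DETECTABLE_ONLY
```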

Referenced from discussion OWASP#802 (Runtime Enforcement Mapping).
brainGROWTH added 2 commits April 7, 2026 18:54
…table fingerprint

Per @desiorac's review in issue OWASP#802: baseline_version (opaque string)
was insufficient because it doesn't provide the monotonic reference
property needed for replay-verifiable proofs.

Replaced with:
- baseline_snapshot_hash: SHA-256 of canonicalized baseline (JCS)
- baseline_snapshot_ts: when baseline was computed
- Added MUST requirement: verifiers reject enforcement-mode evidence without this field
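
As a rough illustration of the replacement fields, the sketch below hashes a canonicalized baseline and rejects enforcement-mode evidence that lacks the hash. It approximates JCS (RFC 8785) with sorted keys and compact separators from the standard library, which is not fully compliant for floats or some Unicode keys; a dedicated JCS implementation would be needed in practice. The dictionary layout and the "mode" key are hypothetical.

```python
import hashlib
import json


def baseline_snapshot_hash(baseline: dict) -> str:
    """SHA-256 over an approximately canonical JSON encoding of the baseline."""
    # Approximation of JCS: sorted keys, no insignificant whitespace, UTF-8.
    canonical = json.dumps(
        baseline, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def accept_evidence(evidence: dict) -> bool:
    """Reject enforcement-mode evidence that has no baseline_snapshot_hash."""
    # Assumption: evidence carries a "mode" key; the PR does not name this field.
    if evidence.get("mode") != "enforcement":
        return True  # the MUST requirement applies only to enforcement mode
    return bool(evidence.get("baseline_snapshot_hash"))
```

Because the encoding sorts keys, two baselines with the same content but different key order produce the same hash, which is the property a replay-verifiable reference needs.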

Per @desiorac's review: high-volume read-only calls inflate confidence
for the whole agent, allowing write/payment operations to reach 'strong'
enforceability on borrowed trust.

Fix: gates MUST evaluate task_class-scoped evidence. cross_class_score
is optional for display but MUST NOT be used for enforcement decisions.
Enforceability tier is now task-class-specific.
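
The task-class scoping rule can be sketched as a gate that looks up only the score for the requested task class and never falls back to cross_class_score. The BehavioralTrustEvidence shape, the scores_by_task_class field, and the threshold parameter are hypothetical; failing closed when a class has no score is an assumption consistent with the "no borrowed trust" intent.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BehavioralTrustEvidence:
    """Hypothetical evidence shape for illustration."""
    scores_by_task_class: Dict[str, float]
    cross_class_score: Optional[float] = None  # display only, never enforced


def gate_allows(
    evidence: BehavioralTrustEvidence, task_class: str, threshold: float
) -> bool:
    """Decide using only the score scoped to the requested task class."""
    score = evidence.scores_by_task_class.get(task_class)
    if score is None:
        # Assumption: fail closed instead of falling back to cross_class_score.
        return False
    return score >= threshold
```

With this shape, an agent holding a 0.95 score for reads and a 0.9 cross-class score is still denied a payment operation at threshold 0.8, because it has no payment-scoped evidence.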
