[RFC] Behavioral reputation for OpenClaw skills — the missing layer beyond identity verification #55342

@viftode4

The malicious-skill problem is documented: 341 malicious skills on ClawHub (Koi Security, Jan 2026), 13.4% of scanned skills with critical issues (Snyk), and VirusTotal explicitly unable to detect prompt injection, agent impersonation, or slow-burn trust accumulation.

Identity verification (#49971) addresses part of this — cryptographic proof of who published a skill. But there's a structural gap: a flagged actor creates a new identity with a clean record. The history doesn't travel. Sybil clusters (multiple fake identities controlled by one actor) are invisible to any identity-only system, regardless of how good the DID resolver is.

The missing layer is behavioral reputation — trust earned from real interaction history that can't be manufactured by creating a new identity.

How it works

Every agent-skill interaction creates a bilateral cryptographic record signed by both parties. Trust scores emerge from the interaction graph, not ratings or credentials:

  • A skill that delivers consistently builds a record that follows it everywhere
  • A skill that fails or misbehaves carries that record permanently. Abandoning the identity and re-registering does not launder it: the new identity starts from zero, with no history and therefore no earned trust
  • Sybil clusters get zero trust by design: trust is computed as max-flow from seed nodes, so a cluster of fake identities with no legitimate transaction graph scores zero no matter how many new identities it creates
  • Slow-burn attacks become visible: every past interaction is co-signed by a real counterparty, so history cannot be rewritten retroactively
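
To make the Sybil point concrete, here is a minimal sketch of max-flow trust from a seed node. It is not the TrustChain implementation: the `Graph` type, `addEdge`, `maxFlow`, the node names, and the use of edge capacity as a stand-in for co-signed interaction volume are all assumptions made for illustration. The algorithm used is plain Edmonds-Karp.

```typescript
// Hypothetical sketch: trust as max-flow from a seed node.
// All names are illustrative; this is not the TrustChain implementation.
type Graph = Record<string, Record<string, number>>;

// Add a directed edge; capacity stands in for the volume of co-signed interactions.
function addEdge(g: Graph, from: string, to: string, capacity: number): void {
  if (!g[from]) g[from] = {};
  if (!g[to]) g[to] = {};
  g[from][to] = (g[from][to] || 0) + capacity;
}

// Edmonds-Karp max-flow: augment along shortest paths found by BFS.
function maxFlow(g: Graph, source: string, sink: string): number {
  if (source === sink || !g[source] || !g[sink]) return 0;
  // Build a residual graph with zero-capacity reverse edges so `g` is not mutated.
  const r: Graph = {};
  for (const u of Object.keys(g)) {
    for (const v of Object.keys(g[u])) {
      addEdge(r, u, v, g[u][v]);
      addEdge(r, v, u, 0);
    }
  }
  let flow = 0;
  for (;;) {
    const parent: Record<string, string> = {};
    const seen: Record<string, boolean> = {};
    seen[source] = true;
    const queue: string[] = [source];
    while (queue.length > 0 && !seen[sink]) {
      const u = queue.shift() as string;
      for (const v of Object.keys(r[u] || {})) {
        if (r[u][v] > 0 && !seen[v]) {
          seen[v] = true;
          parent[v] = u;
          queue.push(v);
        }
      }
    }
    if (!seen[sink]) return flow; // no augmenting path left
    // Find the bottleneck on the path, then push that much flow along it.
    let bottleneck = Infinity;
    for (let v = sink; v !== source; v = parent[v]) {
      bottleneck = Math.min(bottleneck, r[parent[v]][v]);
    }
    for (let v = sink; v !== source; v = parent[v]) {
      r[parent[v]][v] -= bottleneck;
      r[v][parent[v]] += bottleneck;
    }
    flow += bottleneck;
  }
}

// Honest agents are reachable from the seed; the Sybil cluster only vouches for itself.
const g: Graph = {};
addEdge(g, "seed", "alice", 5);
addEdge(g, "seed", "bob", 2);
addEdge(g, "alice", "skillA", 3);
addEdge(g, "bob", "skillA", 4);
addEdge(g, "sybil1", "sybil2", 100);
addEdge(g, "sybil2", "sybil3", 100);
addEdge(g, "sybil3", "sybil1", 100);

const honestTrust = maxFlow(g, "seed", "skillA"); // 5
const sybilTrust = maxFlow(g, "seed", "sybil3");  // 0: no path from the seed
```

However large the internal capacities of the Sybil cluster are, they never count, because no flow from the seed can enter the cluster. If the cluster did earn one real edge from an honest node, its score would be capped by that edge's capacity.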

What this looks like in practice

Before installing a skill:

trustchain_check_trust(pubkey) → { score: 0.91, interactions: 847, status: "established" }

Before executing a payment to an agent:

trustchain_check_trust(pubkey) → { score: 0.03, interactions: 2, status: "bootstrap" }
// payment blocked — insufficient history
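
A caller might turn these responses into a decision along the following lines. This is a sketch under stated assumptions: `TrustResult` mirrors the fields shown in the example responses, while `TrustPolicy`, `decide`, and the threshold values are hypothetical and not part of the plugin. It follows the policy proposed later in this issue: warn on install, hard-block on payment.

```typescript
// Hypothetical policy sketch; only the TrustResult fields come from the examples above.
interface TrustResult {
  score: number;
  interactions: number;
  status: string; // e.g. "established" or "bootstrap"
}

type TrustAction = "install" | "payment";
type TrustDecision = "allow" | "warn" | "block";

// Assumed configurable thresholds; the values used below are placeholders.
interface TrustPolicy {
  minScore: number;
  minInteractions: number;
}

function decide(action: TrustAction, trust: TrustResult, policy: TrustPolicy): TrustDecision {
  const sufficient =
    trust.score >= policy.minScore && trust.interactions >= policy.minInteractions;
  if (sufficient) return "allow";
  // Below threshold: installs only warn, payments are hard-blocked.
  return action === "payment" ? "block" : "warn";
}

const policy: TrustPolicy = { minScore: 0.7, minInteractions: 10 };
const established: TrustResult = { score: 0.91, interactions: 847, status: "established" };
const bootstrap: TrustResult = { score: 0.03, interactions: 2, status: "bootstrap" };

const installEstablished = decide("install", established, policy); // "allow"
const payBootstrap = decide("payment", bootstrap, policy);         // "block"
const installBootstrap = decide("install", bootstrap, policy);     // "warn"
```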

Discover trusted agents for a capability:

trustchain_discover_peers(capability: "code-review", min_trust: 0.7)
→ ranked list of agents with verified interaction history
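
As a rough illustration of what the discovery call returns, the sketch below filters and ranks a local list of peers. The `PeerRecord` shape, the `discoverPeers` function, and the tie-break on interaction count are assumptions; the real tool may rank differently.

```typescript
// Hypothetical sketch of filtering and ranking; not the plugin's actual logic.
interface PeerRecord {
  pubkey: string;
  capabilities: string[];
  score: number;
  interactions: number;
}

function discoverPeers(peers: PeerRecord[], capability: string, minTrust: number): PeerRecord[] {
  return peers
    // Keep peers that offer the capability and meet the trust floor.
    .filter(p => p.capabilities.indexOf(capability) !== -1 && p.score >= minTrust)
    // Highest score first; break ties by interaction count (an assumed rule).
    .sort((a, b) => b.score - a.score || b.interactions - a.interactions);
}

const peers: PeerRecord[] = [
  { pubkey: "pk-a", capabilities: ["code-review"], score: 0.75, interactions: 120 },
  { pubkey: "pk-b", capabilities: ["code-review", "testing"], score: 0.91, interactions: 847 },
  { pubkey: "pk-c", capabilities: ["code-review"], score: 0.03, interactions: 2 },
  { pubkey: "pk-d", capabilities: ["translation"], score: 0.95, interactions: 400 },
];

const ranked = discoverPeers(peers, "code-review", 0.7).map(p => p.pubkey); // ["pk-b", "pk-a"]
```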

Working implementation

OpenClaw plugin: trustchain-js/packages/openclaw

5 MCP tools: trustchain_check_trust, trustchain_discover_peers, trustchain_record_interaction, trustchain_verify_chain, trustchain_get_identity

Protocol spec: IETF draft-viftode-trustchain-trust-00 — no blockchain, offline-capable, open source, Ed25519 identity.

Full implementation: github.com/viftode4/trustchain — Rust core (304 tests), Python SDK (311 tests), TypeScript SDK (126 tests), adapters for 12 agent frameworks (205 tests).

Relationship to identity verification

This is complementary to DID-based approaches (#49971), not competing. Identity verification answers "is this the same agent that published the skill I trust?" Behavioral reputation answers "has this agent built a track record worth trusting?" Both questions matter. Neither alone is sufficient.

The combination: DID confirms continuity of identity; behavioral reputation confirms continuity of good behavior.

Addressing open questions from #49971

  1. Blocking vs non-blocking: the behavioral trust check should be policy-driven with a configurable threshold: warn (non-blocking) on install, hard-block on payment
  2. Default trust provider: the protocol should be standardized (IETF), not the provider, so that multiple providers can use the same wire format
  3. Delegation chain propagation: solved. Trust attenuates with each hop, scope narrows, TTL is enforced, and max depth is bounded
  4. Skill publisher DID requirement: strongest when combined with behavioral history, i.e. DID plus a clean interaction record, not DID alone
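
The delegation rules in point 3 can be sketched as follows. This illustrates the four constraints (attenuation, scope narrowing, TTL, bounded depth) and is not the algorithm from the draft: `DelegationHop`, `EffectiveTrust`, `evaluateChain`, the multiplicative `decay` of 0.5, and the default `maxDepth` of 3 are all assumed for the example.

```typescript
// Hypothetical sketch of delegation-chain evaluation; names and defaults are assumptions.
interface DelegationHop {
  from: string;
  to: string;
  scope: string[];   // capabilities the delegator passes on
  expiresAt: number; // Unix time in ms
}

interface EffectiveTrust {
  holder: string;
  score: number;
  scope: string[];
}

function evaluateChain(
  root: EffectiveTrust,
  chain: DelegationHop[],
  now: number,
  decay: number = 0.5,
  maxDepth: number = 3
): EffectiveTrust | null {
  if (chain.length > maxDepth) return null; // max depth bounded
  let current: EffectiveTrust = {
    holder: root.holder,
    score: root.score,
    scope: root.scope.slice(),
  };
  for (const hop of chain) {
    if (hop.from !== current.holder) return null; // chain must be contiguous
    if (hop.expiresAt <= now) return null;        // TTL enforced
    current = {
      holder: hop.to,
      score: current.score * decay, // trust attenuates with each hop
      // Scope can only narrow: keep what both sides allow.
      scope: current.scope.filter(s => hop.scope.indexOf(s) !== -1),
    };
  }
  return current;
}

const root: EffectiveTrust = { holder: "a", score: 0.8, scope: ["code-review", "payments"] };
const chain: DelegationHop[] = [
  { from: "a", to: "b", scope: ["code-review"], expiresAt: 2000 },
  { from: "b", to: "c", scope: ["code-review", "payments"], expiresAt: 3000 },
];

const delegated = evaluateChain(root, chain, 1000); // holder "c", scope ["code-review"]
const expired = evaluateChain(root, chain, 2500);   // null: first hop has expired
```

Note that the second hop cannot restore `payments`: once a capability is dropped from the scope, later hops cannot add it back.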

Open to discussing how this fits into the OpenClaw plugin lifecycle, particularly whether onAgentVerify should surface both identity and behavioral signals or whether they warrant separate hooks.
