Agent Behavioral Drift Detector

Detects cumulative behavioral drift in agent optimization loops by comparing rolling behavioral fingerprints against a baseline.

Built for integration with protect-mcp (ScopeBlind) receipt streams and HyperAgents safety policies.

The Problem

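Static per-action safety policies catch individual violations but miss trajectory-level shifts. A meta-agent that stays within policy constraints on every iteration can still drift significantly from its starting behavior over N iterations, because each step is judged only against the policy and never against where the agent began. This is the "boiling frog" problem.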

How It Works

Receipt Stream → Rolling Window → Behavioral Fingerprint → Drift Score
                                         ↕
                                    Baseline (iteration 0)
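
The pipeline above can be sketched roughly as follows. This is an illustrative sketch only, not the code in drift_detector.py: the SketchReceipt fields, the fingerprint() helper, and the RollingFingerprint class are hypothetical names, and the real fingerprint also covers tier distribution and call velocity.

```python
from collections import Counter, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchReceipt:
    # Hypothetical receipt shape; the real Receipt fields may differ.
    tool: str
    allowed: bool


def fingerprint(window):
    """Summarize a window as a tool-frequency distribution plus an allow rate."""
    n = len(window)
    counts = Counter(r.tool for r in window)
    return {
        "tool_dist": {tool: c / n for tool, c in counts.items()},
        "allow_rate": sum(r.allowed for r in window) / n,
    }


class RollingFingerprint:
    """Keeps the last window_size receipts and freezes the first full window as baseline."""

    def __init__(self, window_size):
        self.window = deque(maxlen=window_size)
        self.baseline = None

    def ingest(self, receipt):
        self.window.append(receipt)
        if len(self.window) < self.window.maxlen:
            return None  # not enough receipts for a fingerprint yet
        current = fingerprint(self.window)
        if self.baseline is None:
            self.baseline = current  # assumed stand-in for "iteration 0"
        return current
```

Every later fingerprint is compared against the frozen baseline rather than against the previous window, which is what lets slow, cumulative shifts show up.
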

The detector consumes signed receipts from any policy engine and computes four behavioral signals:

Signal               Metric                       Weight
Tool distribution    Jensen-Shannon divergence    35%
Allow rate           Absolute delta               25%
Tier distribution    Jensen-Shannon divergence    25%
Call velocity        Normalized delta             15%
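
One plausible way to combine the four signals into a single score is a weighted sum, sketched below. The weights come from the table; everything else is an assumption made for illustration: the fingerprint dictionary keys, the base-2 logarithm (which bounds the divergence to [0, 1]), and the velocity normalization (relative change, capped at 1). The actual drift_detector.py may compute these differently.

```python
import math

# Weights taken from the table above.
WEIGHTS = {"tool": 0.35, "allow": 0.25, "tier": 0.25, "velocity": 0.15}


def js_divergence(p, q):
    """Jensen-Shannon divergence between two dict distributions, using base-2 logs."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL(a || m); terms with a[k] == 0 contribute nothing.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def drift_score(base, cur):
    """Weighted sum of the four signals; each signal is scaled to [0, 1]."""
    tool = js_divergence(base["tool_dist"], cur["tool_dist"])
    tier = js_divergence(base["tier_dist"], cur["tier_dist"])
    allow = abs(cur["allow_rate"] - base["allow_rate"])
    # Assumed normalization: relative change in call velocity, capped at 1.
    rel = abs(cur["velocity"] - base["velocity"]) / max(base["velocity"], 1e-9)
    velocity = min(1.0, rel)
    return (
        WEIGHTS["tool"] * tool
        + WEIGHTS["allow"] * allow
        + WEIGHTS["tier"] * tier
        + WEIGHTS["velocity"] * velocity
    )
```

Because every signal is scaled to [0, 1] and the weights sum to 1, the score in this sketch also stays in [0, 1], so a threshold such as 0.3 has a consistent meaning across signals.
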

Quick Start

from drift_detector import DriftDetector, Receipt

detector = DriftDetector(
    window_size=50,          # receipts per fingerprint window
    drift_threshold=0.3,     # flag when drift exceeds this
    on_drift=lambda r: print(f"⚠️ Drift detected: {r.drift_score:.3f}")
)

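# Ingest receipts from protect-mcp stderr.
# receipt_stream (an iterable of JSON lines) and escalate() are
# placeholders that you supply; they are not part of drift_detector.
for line in receipt_stream:
    result = detector.ingest_json(line)
    if result and result.drifted:
        # Trigger approval gate or SATP attestation
        escalate(result)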

Integration with protect-mcp

# Tail DecisionLog events in shadow mode
protect-mcp --mode shadow 2>&1 | python3 -c "
import sys
from drift_detector import DriftDetector
detector = DriftDetector(window_size=50, drift_threshold=0.3)
for line in sys.stdin:
    result = detector.ingest_json(line)
    if result:
        print(f'iteration={result.iteration} drift={result.drift_score:.3f} {result.message}')
"

Trust Level → Enforcement Mapping

Trust Level          Mode        Behavior
≥ 4 (Established)    Shadow      Log drift, don't halt
3 (Moderate)         Simulate    Flag drift, estimate impact
2 (New)              Enforce     Halt on threshold breach
1 (Untrusted)        Sign        Cryptographic attestation per iteration
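
The mapping above could be expressed in code along these lines. This is a hedged sketch: Mode, mode_for_trust, and should_halt are hypothetical names and not part of drift_detector. The table does not say how levels below 1 are handled or whether Sign mode also halts, so this sketch rejects levels below 1 and halts only in Enforce mode.

```python
from enum import Enum


class Mode(Enum):
    SHADOW = "shadow"      # log drift, don't halt
    SIMULATE = "simulate"  # flag drift, estimate impact
    ENFORCE = "enforce"    # halt on threshold breach
    SIGN = "sign"          # cryptographic attestation per iteration


def mode_for_trust(level: int) -> Mode:
    """Map a trust level to an enforcement mode, following the table above."""
    if level < 1:
        # Assumption: the table starts at level 1, so anything lower is rejected.
        raise ValueError(f"unsupported trust level: {level}")
    if level >= 4:
        return Mode.SHADOW
    if level == 3:
        return Mode.SIMULATE
    if level == 2:
        return Mode.ENFORCE
    return Mode.SIGN


def should_halt(level: int, drift_score: float, threshold: float = 0.3) -> bool:
    """Assumption: only Enforce mode halts, and only when drift exceeds the threshold."""
    return mode_for_trust(level) is Mode.ENFORCE and drift_score > threshold
```

A caller might pass such a check into the on_drift callback so that the same drift result leads to logging for established agents and a halt for new ones.
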

Tests

python3 test_drift_detector.py
# All 8 tests passed ✅

Context

facebookresearch/HyperAgents #17 — Safety policy discussion
W3C AI Agent Protocol #30 — Four-layer compliance stack
OWASP AISVS C10 — Agent identity verification requirements

License

MIT

Built by brainAI as part of the SATP (Soulbound Agent Trust Protocol) ecosystem.
