creatornader
diff --git a/.changeset/recall-durable-content-index.md b/.changeset/recall-durable-content-index.md
Lines changed: 5 additions & 0 deletions
diff --git a/DECISIONS.md b/DECISIONS.md
Lines changed: 51 additions & 1 deletion
diff --git a/services/atrib-recall/README.md b/services/atrib-recall/README.md
Lines changed: 14 additions & 9 deletions
@@ -0,0 +1,5 @@
+---
+'@atrib/recall': patch
+---
+
+Add a durable content-token index for complete content recall and expose runtime/index coverage metadata.
@@ -6664,7 +6664,7 @@ This is a coverage contract, not an indexing claim. The current complete path is
 - `max_records` remains a caller-owned partial-corpus request. In complete mode, setting it below `total_records` is still an evidence failure.
 - Response consumers should inspect `coverage.strategy`, not just `truncated_corpus`, when they need to defend a recall claim.
 - Large complete recalls may still be slow. That is now a performance gap, not a correctness guard disguised as a corpus limit.
-- The next structural step is a durable recall index with high-water-mark verification. It must preserve the [D125](#d125-complete-content-recall-is-coverage-first-not-cap-first) coverage contract.
+- Durable content-token indexing is now covered by [D126](#d126-content-recall-uses-a-durable-index-behind-complete-evidence-coverage). Future index strategies must preserve the [D125](#d125-complete-content-recall-is-coverage-first-not-cap-first) coverage contract.
 
 **Cross-references.**
 
@@ -6673,6 +6673,56 @@ This is a coverage contract, not an indexing claim. The current complete path is
 - [`services/atrib-recall/README.md`](services/atrib-recall/README.md), operator-facing recall contract.
 - [`skills/atrib/SKILL.md`](skills/atrib/SKILL.md), agent-facing critical-path recall guidance.
 
+## D126: Content recall uses a durable index behind complete-evidence coverage
+
+**Date:** 2026-06-21
+
+**Status:** Accepted
+
+**Extends:** [D062](#d062-local-mirror-sidecar-two-tier-private-local--public-canonical-persistence), [D084](#d084-read-primitive-instrumentation-for-empirical-loop-closure-measurement), [D086](#d086-bm25-corpus-extended-from-annotations-to-per-event_type-record-content), [D123](#d123-critical-path-content-recall-requires-complete-evidence-or-explicit-fallback), and [D125](#d125-complete-content-recall-is-coverage-first-not-cap-first).
+
+**Context.** [D125](#d125-complete-content-recall-is-coverage-first-not-cap-first) fixed the correctness bug: `require_complete` no longer turns an arbitrary record-count guardrail into an evidence boundary. That still left two operational gaps:
+
+- Complete recall rebuilt the content-search corpus inside each fresh MCP process, so critical-path recall stayed slower than it needed to be.
+- Operators could verify source, npm, and tags while a running MCP process still served an older implementation. The response shape needed a runtime contract that made stale process binding obvious.
+
+**Decision.** Add a durable content-token index for `recall_by_content`, behind the [D125](#d125-complete-content-recall-is-coverage-first-not-cap-first) coverage contract:
+
+- The sidecar schema is `content-index-v1`.
+- The sidecar stores the BM25 token corpus plus display metadata needed by `recall_by_content`; it does not store log-inclusion proofs and does not change the local-signature trust boundary.
+- A sidecar is accepted only when its stored mirror signature and mirror high-water mark match the current local mirror fingerprint.
+- `require_complete` writes or rewrites the sidecar after a complete loaded-mirror build when the sidecar is missing or stale.
+- Bounded recall may use a valid durable sidecar when present, but it does not force a full mirror load just to create one.
+- If the sidecar is disabled, missing, stale, invalid, or unwritable, recall falls back to the loaded-mirror path and reports that state.
+
+Extend `recall_by_content` responses:
+
+- `runtime` names the loaded `@atrib/recall` package version, `coverage-v1`, and `content-index-v1`.
+- `coverage.index` reports the sidecar status: `hit`, `rebuilt`, `memory_only`, `disabled`, or `write_failed`.
+- `coverage.corpus` remains `local_mirror`; the index is an acceleration and process-restart cache, not a new source of truth.
+
+**Alternatives considered.**
+
+- _SQLite FTS in the first durable-index patch._ Deferred. A JSON token sidecar avoids native dependencies in the MCP startup path and lets the coverage contract settle first.
+- _Persist full inverted BM25 postings._ Deferred. Persisting tokens is enough to avoid reparsing and re-tokenizing every local mirror line across MCP restarts. The in-memory postings can still be rebuilt cheaply from the sidecar.
+- _Create the durable index on every bounded search._ Rejected. That would make a casual bounded query unexpectedly load the full mirror. Complete-mode recall is the right index-build trigger because it already claims full corpus coverage.
+- _Trust package version or git state to prove runtime freshness._ Rejected. A running MCP process can stay old after source and npm are correct. The result itself must expose the runtime contract.
+
+**Consequences.**
+
+- Complete content recall can reuse a mirror-keyed sidecar across MCP process restarts.
+- Stale or partial sidecars cannot produce a complete coverage claim because the mirror signature must match.
+- Agents can detect stale MCP processes by checking for `runtime.content_index_version` and `coverage.index` in `recall_by_content` responses.
+- The sidecar is local cache material. It inherits the privacy posture of the local mirror and should not be committed.
+- Embedding retrieval remains future work. It can add semantic relevance over the same mirror-keyed coverage boundary, but it cannot weaken the [D125](#d125-complete-content-recall-is-coverage-first-not-cap-first) complete-evidence semantics.
+
+**Cross-references.**
+
+- [`services/atrib-recall/src/index.ts`](services/atrib-recall/src/index.ts), content-index implementation and runtime metadata.
+- [`services/atrib-recall/test/mcp-protocol.test.ts`](services/atrib-recall/test/mcp-protocol.test.ts), JSON-RPC coverage for rebuild, hit, stale-index invalidation, and disabled mode.
+- [`services/atrib-recall/README.md`](services/atrib-recall/README.md), operator-facing content-index contract and env vars.
+- [`skills/atrib/SKILL.md`](skills/atrib/SKILL.md), agent-facing stale-runtime and coverage guidance.
+
 # Pending decisions
 
 These will get full ADRs when we act on them. Recorded here so they remain findable and don't silently drop. Per the global Deferred Decision Logging convention, this section uses the forward-looking pattern (forward-looking decisions that will become numbered ADRs when codified).
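
Reviewer sketch, not code from this PR: the D126 acceptance rule and the `coverage.index` statuses could be modeled roughly as below. The type and function names (`MirrorFingerprint`, `SidecarHeader`, `sidecarIsFresh`, `resolveIndexStatus`) and the field names are hypothetical. Only the schema string `content-index-v1` and the five status values come from the ADR text. The mapping of "bounded recall without a valid sidecar" to `memory_only` is an assumption, since the ADR does not spell out when that status is reported.

```typescript
// Hypothetical shapes; the real definitions live in services/atrib-recall/src/index.ts.
type IndexStatus = "hit" | "rebuilt" | "memory_only" | "disabled" | "write_failed";

interface MirrorFingerprint {
  signature: string; // assumed: digest derived from the local mirror's file stats
  highWaterMark: number; // assumed: newest record position seen in the mirror
}

interface SidecarHeader extends MirrorFingerprint {
  schema: string;
}

// A sidecar is usable only when schema, signature, and high-water mark all match.
function sidecarIsFresh(sidecar: SidecarHeader | null, mirror: MirrorFingerprint): boolean {
  return (
    sidecar !== null &&
    sidecar.schema === "content-index-v1" &&
    sidecar.signature === mirror.signature &&
    sidecar.highWaterMark === mirror.highWaterMark
  );
}

interface ResolveInput {
  enabled: boolean; // false models ATRIB_RECALL_CONTENT_INDEX=0
  requireComplete: boolean; // evidence_mode: "require_complete"
  sidecar: SidecarHeader | null;
  mirror: MirrorFingerprint;
  writeSucceeds: boolean; // stands in for the outcome of the sidecar write
}

function resolveIndexStatus(input: ResolveInput): IndexStatus {
  if (!input.enabled) return "disabled";
  if (sidecarIsFresh(input.sidecar, input.mirror)) return "hit";
  // Bounded recall never forces a full mirror load just to create a sidecar
  // (assumed here to surface as "memory_only").
  if (!input.requireComplete) return "memory_only";
  // Complete mode rebuilds from the loaded mirror and then tries to persist.
  return input.writeSucceeds ? "rebuilt" : "write_failed";
}
```

In this sketch every non-`hit` branch still answers from the loaded mirror, which is why a stale or partial sidecar can never back a complete coverage claim.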
 
@@ -81,10 +81,12 @@ Every call to this tool (and every sibling tool below) writes a per-call jsonl e
 
 - `mcp__atrib-recall__recall_revisions({ record_hash })` - returns the forward revision chain for the target record. Each chain entry carries `record_hash`, `timestamp`, and the [D086](../../DECISIONS.md#d086-bm25-corpus-extended-from-annotations-to-per-event_type-record-content)-normative content fields (`new_position`, `reason`, `importance`) when present, so the agent can read the chain inline without follow-up `recall` calls per revision. The chain follows the first-by-timestamp revision at each step; when more than one revision targets the same record (sibling fan-out, common in multi-agent flows), the other branch heads are listed on that step's entry as `sibling_hashes`, so the agent can recursively call `recall_revisions` on a sibling to traverse a parallel branch instead of having to manually enumerate revisions via `recall_my_attribution_history`.
 
-- `mcp__atrib-recall__recall_by_content({ query, k?, max_records?, evidence_mode? })` - BM25 free-form retrieval over the newest `max_records` records' indexable text + annotation summary + topic_tags when present, then reranked by Park et al. weighted-sum scoring (recency + importance + relevance). Default k=10, max 50. Default `max_records` is `ATRIB_RECALL_CONTENT_MAX_RECORDS` or 5000. The default `evidence_mode: "bounded"` keeps casual searches fast by tail-loading that newest-first window instead of loading the whole mirror. The response includes `evidence_mode`, `evidence_status`, `fallback_required`, `total_records`, `searched_records`, `candidate_records`, `truncated_corpus`, and `coverage`; `total_records` is `null` when recall served a partial tail-loaded snapshot instead of a full mirror snapshot. `coverage` carries a version, strategy, local-mirror high-water mark, mirror file count, and searched record count so callers can tell whether a result came from a bounded newest-first window or a complete loaded-mirror scan. Per [D086](../../DECISIONS.md#d086-bm25-corpus-extended-from-annotations-to-per-event_type-record-content) and [D118](../../DECISIONS.md#d118-primary-trace-path-is-a-presentation-rule-over-trace-and-chain), "indexable text" is per-event_type record content from the [D062](../../DECISIONS.md#d062-local-mirror-sidecar-two-tier-private-local--public-canonical-persistence) sidecar (observation: `what + why_noted + intent + rationale + topics`; tool_call: `tool_name + args excerpt + result excerpt`; annotation: `summary + topics`; revision: `prior_position + new_position + reason + topics`; transaction: counterparty + memo + protocol fields; directory_anchor: `tree_root + epoch_id`). Extension URIs fall back to a generic recursive string-walk (depth <= 4, field cap 2KB). OpenInference local sidecars add recall tokens for span kind/name, tool/agent/model, prompt identifiers and templates, inputs/outputs, usage, cost, score, and metadata when those fields are mirrored locally per [D108](../../DECISIONS.md#d108-observability-span-trees-are-intake-local-sidecars-are-cognitive-payload). BM25 contribution is clamped to [0, 1] in the parkScore site so the documented Park-component bound is honored. Layer 2 (sqlite-vec sidecar, separate ship) extends with embedding similarity over the same indexed text.
+- `mcp__atrib-recall__recall_by_content({ query, k?, max_records?, evidence_mode? })` - BM25 free-form retrieval over the newest `max_records` records' indexable text + annotation summary + topic_tags when present, then reranked by Park et al. weighted-sum scoring (recency + importance + relevance). Default k=10, max 50. Default `max_records` is `ATRIB_RECALL_CONTENT_MAX_RECORDS` or 5000. The default `evidence_mode: "bounded"` keeps casual searches fast by tail-loading that newest-first window instead of loading the whole mirror. The response includes `runtime`, `evidence_mode`, `evidence_status`, `fallback_required`, `total_records`, `searched_records`, `candidate_records`, `truncated_corpus`, and `coverage`; `total_records` is `null` when recall served a partial tail-loaded snapshot instead of a full mirror snapshot. `runtime` names the loaded `@atrib/recall` package version plus the coverage and content-index contract versions, so a stale MCP process is detectable from the result. `coverage` carries a version, strategy, local-mirror high-water mark, mirror file count, searched record count, and `coverage.index` status so callers can tell whether a result came from a bounded newest-first window, a complete scan that rebuilt the durable sidecar, a durable-index hit, or an explicit disabled/write-failed fallback. Per [D086](../../DECISIONS.md#d086-bm25-corpus-extended-from-annotations-to-per-event_type-record-content) and [D118](../../DECISIONS.md#d118-primary-trace-path-is-a-presentation-rule-over-trace-and-chain), "indexable text" is per-event_type record content from the [D062](../../DECISIONS.md#d062-local-mirror-sidecar-two-tier-private-local--public-canonical-persistence) sidecar (observation: `what + why_noted + intent + rationale + topics`; tool_call: `tool_name + args excerpt + result excerpt`; annotation: `summary + topics`; revision: `prior_position + new_position + reason + topics`; transaction: counterparty + memo + protocol fields; directory_anchor: `tree_root + epoch_id`). Extension URIs fall back to a generic recursive string-walk (depth <= 4, field cap 2KB). OpenInference local sidecars add recall tokens for span kind/name, tool/agent/model, prompt identifiers and templates, inputs/outputs, usage, cost, score, and metadata when those fields are mirrored locally per [D108](../../DECISIONS.md#d108-observability-span-trees-are-intake-local-sidecars-are-cognitive-payload). BM25 contribution is clamped to [0, 1] in the parkScore site so the documented Park-component bound is honored. A future embedding sidecar can add semantic similarity over the same indexed text.
 
 For critical-path audits, use `evidence_mode: "require_complete"`. That mode loads the full mirror and searches every loaded record. If a caller also sets `max_records` below `total_records`, the tool returns no results with `evidence_status: "incomplete"`, `fallback_required: true`, `truncated_corpus: true`, and the `search_cap` / `total_records` mismatch. Do not treat that as an empty search result. The deterministic fallback is to rerun without `max_records` for full loaded-mirror coverage, or to run a caller-owned partition plan and treat each partition as its own explicit coverage claim.
 
+Complete-mode recall uses a durable content-token sidecar when it can. The sidecar is keyed to the current local mirror fingerprint and stores the BM25 token corpus plus display metadata for `recall_by_content`. A sidecar is accepted only when its stored mirror signature and high-water mark match the current mirror stats. If the sidecar is absent or stale, `require_complete` rebuilds it from the full local mirror and still returns complete evidence. If the sidecar is disabled with `ATRIB_RECALL_CONTENT_INDEX=0`, or if writing the sidecar fails, recall falls back to the loaded-mirror path and reports that status in `coverage.index`.
+
 In wrapped MCP hosts, the recall tool call and its JSON response are signed as a `tool_call` record. That means an incomplete critical-path recall is not a quiet warning in transcript prose; the signed result carries `fallback_required: true`. Agents should emit an observation naming the incomplete recall status and the fallback they chose before continuing.
 
 - `mcp__atrib-recall__recall_session_chain({ context_id?, limit?, include_content? })` - returns all records in a context_id, ordered chronologically (oldest-first). The natural traversal of the CHAIN_PRECEDES topology for a single session/trace. Each entry carries `record_hash`, `event_type`, `timestamp`, `display_summary`, `display_producer`, `age`, plus signed causal/tool fields when present (`informed_by`, `tool_name`, `args_hash`, `result_hash`). When `include_content` is true, each entry also includes the [D062](../../DECISIONS.md#d062-local-mirror-sidecar--two-tier-private-local--public-canonical-persistence) local mirror body as `local_content` and the local producer label as `local_producer`. Defaults false to keep the session chain cheap. When `context_id` is omitted, falls back to `resolveEnvContextId` (the same precedence as the other tools: `ATRIB_CONTEXT_ID` env, then a [D083](../../DECISIONS.md#d083-harness-session-id-discovery-extends-d078-for-cognitive-primitive-mcp-servers)-registered harness env like `CLAUDE_CODE_SESSION_ID`).
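
Reviewer sketch, not code from this PR: a caller of `recall_by_content` could combine the stale-runtime and incomplete-evidence checks described in this hunk roughly as follows. The interface below is a hypothetical subset of the response; `assessRecall` and the `Verdict` labels are illustrative names. The field names `runtime.content_index_version`, `coverage.index`, `fallback_required`, and `evidence_status` are the ones quoted in the ADR and README text.

```typescript
// Hypothetical subset of a recall_by_content response; every field is optional
// because an older MCP process may not emit the newer ones at all.
interface RecallByContentResponse {
  runtime?: { content_index_version?: string };
  evidence_status?: string;
  fallback_required?: boolean;
  coverage?: { strategy?: string; index?: string };
}

type Verdict = "stale_runtime" | "fallback_required" | "ok";

function assessRecall(res: RecallByContentResponse): Verdict {
  // A process that predates this change emits neither field, so absence signals staleness.
  if (res.runtime?.content_index_version === undefined || res.coverage?.index === undefined) {
    return "stale_runtime";
  }
  // An incomplete critical-path recall is not an empty result; the caller must fall back.
  if (res.fallback_required === true || res.evidence_status === "incomplete") {
    return "fallback_required";
  }
  return "ok";
}
```

Checking the runtime fields first is a deliberate ordering in this sketch: a stale process cannot be trusted to report `fallback_required` under the current contract.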
@@ -97,14 +99,17 @@ In wrapped MCP hosts, the recall tool call and its JSON response are signed as a
 
 The Park et al. ranking weights and recency time constant are environment-tunable for per-axis sensitivity studies:
 
-| Env var                            | Default | Role                                                              |
-| ---------------------------------- | ------- | ----------------------------------------------------------------- |
-| `ATRIB_RECALL_ALPHA`               | 0.3     | Recency component weight                                          |
-| `ATRIB_RECALL_BETA`                | 0.3     | Importance component weight                                       |
-| `ATRIB_RECALL_GAMMA`               | 0.4     | Relevance (BM25) component weight                                 |
-| `ATRIB_RECALL_TAU_DAYS`            | 7       | Exponential-decay time constant for recency                       |
-| `ATRIB_RECALL_NOISE_FLOOR`         | 0.6     | Anti-noise threshold for `rank_by=relevance` (see below)          |
-| `ATRIB_RECALL_CONTENT_MAX_RECORDS` | 5000    | Newest-first corpus size for bounded `recall_by_content` searches |
+| Env var                            | Default          | Role                                                              |
+| ---------------------------------- | ---------------- | ----------------------------------------------------------------- |
+| `ATRIB_RECALL_ALPHA`               | 0.3              | Recency component weight                                          |
+| `ATRIB_RECALL_BETA`                | 0.3              | Importance component weight                                       |
+| `ATRIB_RECALL_GAMMA`               | 0.4              | Relevance (BM25) component weight                                 |
+| `ATRIB_RECALL_TAU_DAYS`            | 7                | Exponential-decay time constant for recency                       |
+| `ATRIB_RECALL_NOISE_FLOOR`         | 0.6              | Anti-noise threshold for `rank_by=relevance` (see below)          |
+| `ATRIB_RECALL_CONTENT_MAX_RECORDS` | 5000             | Newest-first corpus size for bounded `recall_by_content` searches |
+| `ATRIB_RECALL_CONTENT_INDEX`       | enabled          | Set to `0` to disable the durable content-token sidecar           |
+| `ATRIB_RECALL_CONTENT_INDEX_DIR`   | `~/.atrib/cache` | Directory for mirror-keyed content index files                    |
+| `ATRIB_RECALL_CONTENT_INDEX_FILE`  | unset            | Exact content index file path, mainly for tests                   |
 
 The implementation does not enforce that alpha + beta + gamma sum to 1.0; the operator-facing defaults do. See [D085](../../DECISIONS.md#d085-recall-calibration-defaults-survey-grounded-rationale) for the survey-grounded rationale: `ALPHA=0.3` matches CrewAI's `recency_weight=0.3` (the only normalized-weights peer in a 2026-05-23 OSS survey); `TAU_DAYS=7` produces a ~4.85-day half-life inside the field range and close to Park et al.'s ~5.75-day empirical anchor.
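
Reviewer sketch, not code from this PR: the weighted-sum ranking and the half-life figure quoted above can be checked with a few lines. This assumes the recency term has the form `exp(-age / tau)`, which is the form consistent with `TAU_DAYS=7` giving a half-life of about 4.85 days, and that importance is already normalized to [0, 1]. The names `ParkWeights`, `recency`, `halfLifeDays`, and `parkScoreSketch` are hypothetical and are not the identifiers used in `src/index.ts`.

```typescript
interface ParkWeights {
  alpha: number; // ATRIB_RECALL_ALPHA, recency weight
  beta: number; // ATRIB_RECALL_BETA, importance weight
  gamma: number; // ATRIB_RECALL_GAMMA, relevance (BM25) weight
  tauDays: number; // ATRIB_RECALL_TAU_DAYS
}

const DEFAULT_WEIGHTS: ParkWeights = { alpha: 0.3, beta: 0.3, gamma: 0.4, tauDays: 7 };

// Assumed recency form: exponential decay with time constant tau.
function recency(ageDays: number, tauDays: number): number {
  return Math.exp(-ageDays / tauDays);
}

// For exp(-t / tau), the value halves at t = tau * ln 2.
function halfLifeDays(tauDays: number): number {
  return tauDays * Math.LN2;
}

function parkScoreSketch(
  ageDays: number,
  importance: number,
  bm25: number,
  w: ParkWeights = DEFAULT_WEIGHTS,
): number {
  // Clamp the BM25 contribution to [0, 1] so each component stays bounded.
  const relevance = Math.min(1, Math.max(0, bm25));
  return w.alpha * recency(ageDays, w.tauDays) + w.beta * importance + w.gamma * relevance;
}
```

With the default weights summing to 1.0, a brand-new record with maximal importance and a saturated BM25 score tops out at 1.0 in this sketch; weights that do not sum to 1.0 simply rescale that ceiling.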