feat(hooks): persistent predicate counter backend (successor to #3573) by zmanian · Pull Request #3635 · nearai/ironclaw

zmanian · 2026-05-14T13:16:25Z

Successor PR from #3573. Draft — scope doc only, no implementation yet.

Scope

Add a `PredicateStateBackend` trait + Postgres / libSQL impls so the sliding-window counter survives process restarts and is consistent across processes.

Design doc

`crates/ironclaw_hooks/docs/successors/03-persistent-counter.md`

Threat-model

D5 (eviction at MAX_HISTORY_KEYS = 8192) still applies in-memory; the durable backend is the source of truth.

Status

Draft for design review. Performance question (read/write batching) is the load-bearing design discussion.

gemini-code-assist

Code Review

This pull request introduces a design document for a persistent predicate counter, outlining the PredicateStateBackend trait and its implementations for Postgres and libSQL to ensure rate-limit state survives process restarts. The review feedback highlights several areas for refinement: resolving the discrepancy regarding run_id storage in the schema, adopting chrono::DateTime<Utc> for consistency with the existing codebase, addressing libSQL-specific data type requirements for decimal precision, and clarifying the consistency guarantees when balancing cross-process synchronization with batched write performance.

gemini-code-assist · 2026-05-14T13:18:29Z

+   must NOT double-count. The backend stores `(timestamp, run_id,
+   event_id)` so duplicate-event detection works at replay time.


The design doc mentions that the backend stores run_id, but this field is missing from the PredicateStateBackend trait methods and the proposed SQL schema. To maintain clarity between the current implementation state and future or planned states, either add run_id to the trait signatures and database tables if it is intended for deduplication/auditing, or remove the reference from the documentation.

References

In documentation, clearly distinguish between the current implementation state and future or planned states.

Addressed in c9c5fb0 / b4d8a35 / 5b8b8b… across the doc revisions. run_id is intentionally NOT carried by the trait or the schema — see lines 72-78 of the doc:

The sliding-window state is per-tenant + per-hook-id; replay refusal is driven by event_id (which is RuntimeEventId from ironclaw_events, itself already keyed to the current run's emission). An earlier draft mentioned storing run_id alongside; that's redundant given the event id's uniqueness contract and was removed in this revision.

To state the current design plainly: run_id is not stored; the design has shifted to a per-tenant + per-hook key with event_id as the dedup axis.

gemini-code-assist · 2026-05-14T13:18:29Z

+pub trait PredicateStateBackend: Send + Sync {
+    async fn record_invocation(
+        &self,
+        key: &InvocationKey,           // (tenant, hook_id, capability)
+        timestamp: SystemTime,
+        event_id: RuntimeEventId,
+    ) -> Result<(), PredicateBackendError>;
+
+    async fn count_in_window(
+        &self,
+        key: &InvocationKey,
+        window: Duration,
+    ) -> Result<u32, PredicateBackendError>;
+
+    async fn record_value(
+        &self,
+        key: &ValueKey,                // (tenant, hook_id, capability, field)
+        timestamp: SystemTime,
+        value: Decimal,
+        event_id: RuntimeEventId,
+    ) -> Result<(), PredicateBackendError>;
+
+    async fn sum_in_window(
+        &self,
+        key: &ValueKey,
+        window: Duration,
+    ) -> Result<Decimal, PredicateBackendError>;
+
+    async fn evict_older_than(&self, cutoff: SystemTime) -> Result<u64, PredicateBackendError>;
+}


The trait uses std::time::SystemTime, but the existing database layer in this repository (e.g., src/db/mod.rs) consistently uses chrono::DateTime<Utc>. Using DateTime<Utc> would be more idiomatic for this project and simplify integration with the existing persistence and migration logic.

pub trait PredicateStateBackend: Send + Sync {
    async fn record_invocation(
        &self,
        key: &InvocationKey,           // (tenant, hook_id, capability)
        timestamp: DateTime<Utc>,
        event_id: RuntimeEventId,
    ) -> Result<(), PredicateBackendError>;

    async fn count_in_window(
        &self,
        key: &InvocationKey,
        window: Duration,
    ) -> Result<u32, PredicateBackendError>;

    async fn record_value(
        &self,
        key: &ValueKey,                // (tenant, hook_id, capability, field)
        timestamp: DateTime<Utc>,
        value: Decimal,
        event_id: RuntimeEventId,
    ) -> Result<(), PredicateBackendError>;

    async fn sum_in_window(
        &self,
        key: &ValueKey,
        window: Duration,
    ) -> Result<Decimal, PredicateBackendError>;

    async fn evict_older_than(&self, cutoff: DateTime<Utc>) -> Result<u64, PredicateBackendError>;
}

Addressed in doc revision. The current trait sketch (lines 41-49) uses DateTime<Utc> for the durable backend, and the rationale is called out explicitly on lines 63-70:

the in-memory backend that already shipped in PR #3635 uses Instant because it's process-local and monotonic. Durable backends serialize across processes, so they must use chrono::DateTime<Utc> to match the rest of the project (src/db/mod.rs, ironclaw_events). The trait will accept DateTime<Utc> and the existing in-memory backend will gain a thin shim mapping Instant-driven callers to a fixed reference point.

gemini-code-assist · 2026-05-14T13:18:29Z

+      capability    text NOT NULL,
+      field         text NOT NULL,
+      occurred_at   timestamptz NOT NULL,
+      value         numeric NOT NULL,


For the libSQL implementation, the value column should be TEXT to preserve rust_decimal precision, as noted in src/db/CLAUDE.md. The current Postgres schema uses numeric, which is correct for PG, but the design should acknowledge the dialect difference for libSQL to ensure parity across both backends.

Addressed in doc revision (lines 120-128). The libSQL impl is now explicitly called out as having two material differences from the Postgres impl mandated by src/db/CLAUDE.md:

value column is TEXT NOT NULL (not numeric), because libSQL's integer/real types can't preserve rust_decimal precision; the backend serializes via Decimal::to_string() / from_str() at the row boundary.

occurred_at is stored as ISO-8601 TEXT (libSQL convention).

gemini-code-assist · 2026-05-14T13:18:29Z

+- Performance: every predicate evaluation becomes a DB round-trip.
+  Plan: batched writes via the dispatcher's tick boundary, with
+  reads cached for the current dispatch.


There is a potential conflict between "Cross-process consistency" (Requirement 1) and the performance plan of "batched writes". If writes are buffered until a dispatcher tick, a count_in_window call from another process might return stale data until the batch is committed. The design should clarify if eventual consistency is acceptable for these counters, or if synchronous writes are required for strict rate-limiting across multiple host processes.

Addressed in doc revision (lines 149-163). You correctly flagged that batched writes conflict with the cross-process consistency requirement; the doc now resolves it explicitly:

Performance: every predicate evaluation becomes a DB round-trip. An earlier draft suggested batching writes at the dispatcher's tick boundary. That conflicts with requirement #1 (cross-process consistency): a deferred write from host A wouldn't be visible to host B's read until the next tick, so two concurrent hosts could each see 'under cap' simultaneously and both proceed past max (gemini's review on the prior draft). Resolution: the v1 production backend keeps writes synchronous (read-your-own-writes within the call); the in-process cache stays per-dispatch-only. A future optimization could batch reads — collapse N predicate evaluations in one dispatch into a single batch SELECT — but never writes.

Successor PR from #3573. Current sliding-window state is in-memory and resets on restart. Adds a PredicateStateBackend trait + Postgres/libSQL impls for cross-process and restart-survival semantics.

…ory impl

Addresses codex review's three Critical findings on PR #3635:

1. Backend wiring: the trait is now registered (lib.rs:25-26) and PredicateEvaluator delegates to Arc<dyn PredicateStateBackend> via with_backend(...). Default constructor preserves the in-memory behavior so all 154 existing tests pass unchanged.
2. Atomic record-and-read: each record_invocation / record_value call performs the write AND returns the resulting in-window count/sum under a single mutex (in-memory) / transaction (durable backends). Splitting into separate record + read would let two hosts each see 'under cap' and both proceed, drifting past max.
3. Replay refusal: each record call carries a PredicateEventId. Re-emitting the same event_id is a no-op against the count. In-memory backend implements via a per-key bounded set (RECENT_EVENT_ID_CAP = 256); durable backends will use INSERT … ON CONFLICT DO NOTHING.

Trait surface (predicate_state.rs):

- PredicateEventId(String): opaque dedup key
- PredicateBackendError: thiserror enum for fallible durable backends; in-memory backend never returns Err
- PredicateStateBackend trait with Result return types
- InMemoryPredicateStateBackend default impl
- MAX_HISTORY_KEYS const re-exported via evaluator for back-compat

Evaluator changes (evaluator.rs):

- holds Arc<dyn PredicateStateBackend> (no more inline maps)
- evictions_observed() reads through to backend
- synth_event_id() generates per-call-unique ids via a process-local atomic counter so tests with identical (hook, ctx, now) still produce distinct ids
- LRU helpers + HistoryKey/ValueHistoryKey types moved into predicate_state.rs (as InvocationKey/ValueKey)

Tests:

- 6 new predicate_state tests:
  - in_memory_invocation_counts_within_window
  - in_memory_invocation_trims_outside_window
  - in_memory_value_sums_within_window
  - in_memory_tenant_isolation (regression on threat-model C2)
  - in_memory_duplicate_event_id_is_a_noop_for_invocations
  - in_memory_duplicate_event_id_is_a_noop_for_values
- 160 unit tests pass total. Reborn hooks_integration unchanged at 19 scenarios. Clippy/fmt/no-panics clean.

Sync trait + Instant timestamps documented as a v1 choice; durable backends (Postgres, libSQL) will need an async companion trait using SystemTime, tracked in the scope doc as the next slice.

Scope doc: crates/ironclaw_hooks/docs/successors/03-persistent-counter.md

zmanian · 2026-05-14T13:55:45Z

Codex review addressed (commit `d5f68de`)

All three Critical findings resolved:

Backend not wired: module is now registered in lib.rs and PredicateEvaluator delegates to Arc<dyn PredicateStateBackend> via with_backend(...). Default constructor keeps in-memory behavior; existing 154 evaluator tests pass unchanged.
Trait can't support durable contract: trait now returns Result<_, PredicateBackendError>, accepts a PredicateEventId for replay dedup, and explicitly documents the atomicity contract (single lock / transaction across record + read).
Race-prone separate calls: each record_* method performs the write AND returns the resulting in-window count/sum in a single atomic operation. The in-memory backend holds the mutex across both halves; durable backends must use a single transaction.
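To illustrate the atomic record-and-read contract described above, here is a minimal sketch, not the PR's actual code. `SketchBackend`, its string key, and the `record_invocation` signature are simplified stand-ins invented for illustration; the point is only that the write and the count happen under one lock acquisition, so no other writer can slip in between them.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the in-memory backend; not the PR's type.
#[derive(Default)]
pub struct SketchBackend {
    history: Mutex<HashMap<String, VecDeque<Instant>>>,
}

impl SketchBackend {
    /// Records one invocation and returns the resulting in-window count.
    /// Both halves run while the same guard is held.
    pub fn record_invocation(&self, key: &str, now: Instant, window: Duration) -> u32 {
        let mut history = self.history.lock().unwrap_or_else(|e| e.into_inner());
        let entries = history.entry(key.to_string()).or_default();
        // Trim entries that are strictly older than the window.
        while let Some(&front) = entries.front() {
            if now.duration_since(front) > window {
                entries.pop_front();
            } else {
                break;
            }
        }
        entries.push_back(now);
        entries.len() as u32
    }
}
```

With separate `record` and `count` calls, two writers could both record and then both read the same total; with the combined call, N concurrent writers each observe a distinct count from 1 to N.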

Schema recommendation (event_id PK too broad): noted in the scope doc — the implementation slice for the Postgres backend will use (tenant_id, hook_id, capability[, field], event_id) as a composite key.

Sync trait + Instant timestamps: kept for v1 to minimize call-site churn; durable backends will use a SystemTime-based async companion trait, tracked in the scope doc.

Tests: 160 unit total (+6 new in predicate_state):

4 backend-shape tests (counts within window, trim, value sum, tenant isolation)
2 event-id dedup tests (invocations + values)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d5f68de302

chatgpt-codex-connector · 2026-05-14T13:55:52Z

+            while bucket.recent_ids.len() > RECENT_EVENT_ID_CAP {
+                bucket.recent_ids.pop_front();
+            }


Keep replay dedup IDs as long as window entries can survive

Capping recent_ids at 256 causes false non-idempotence under normal high-throughput keys: after >256 distinct events in the same window, replaying an older event_id (whose original timestamp is still in entries) is treated as new and increments again. That violates the backend’s replay-refusal contract and can inflate counts/sums enough to trigger incorrect deny/pause decisions. The dedup retention needs to track active-window entries (or equivalent durable seen-set semantics), not a fixed unrelated cap.

Fixed in 9f9513c — dedup memory is now intrinsic to in-window entries. Each entry stores (timestamp, event_id), dedup is any(in-window entry has this id). The fixed-size 256 ring is gone, so dedup memory equals the in-window entry set — no silent loss possible under any throughput. New regression test dedup_memory_covers_full_window_under_high_throughput pushes 512 distinct events then replays event-0; pre-fix this would have re-counted, post-fix it's a no-op.
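The idea in this fix can be sketched as follows. This is an illustrative reconstruction, not the code in 9f9513c: `Bucket` and `record` are hypothetical names, and event ids are plain strings. Each entry carries its own event id, so the dedup set is by construction the same as the in-window entry set.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical bucket: each entry pairs a timestamp with its event id.
#[derive(Default)]
pub struct Bucket {
    entries: VecDeque<(Instant, String)>,
}

impl Bucket {
    /// Records `event_id` at `now` unless an in-window entry already carries
    /// that id, then returns the in-window count.
    pub fn record(&mut self, event_id: &str, now: Instant, window: Duration) -> usize {
        // Drop entries that have aged out of the window.
        while self
            .entries
            .front()
            .map_or(false, |entry| now.duration_since(entry.0) > window)
        {
            self.entries.pop_front();
        }
        // Dedup check: does any surviving entry have this id?
        let seen = self.entries.iter().any(|entry| entry.1 == event_id);
        if !seen {
            self.entries.push_back((now, event_id.to_string()));
        }
        self.entries.len()
    }
}
```

The scan is linear in the number of in-window entries, which keeps the sketch short but is a cost worth keeping in mind for busy keys.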

chatgpt-codex-connector · 2026-05-14T13:55:52Z

+        .filter_map(|(k, v)| v.entries.front().map(|ts| (k.clone(), *ts)))
+        .min_by_key(|(_, ts)| *ts)
+        .map(|(k, _)| k);


Evict empty buckets when enforcing MAX_HISTORY_KEYS

The LRU victim search ignores buckets whose entries deque is empty, so once duplicate replays trim entries and skip re-insert, those keys become non-evictable. If enough such keys accumulate, record_* can hit len() >= MAX_HISTORY_KEYS, fail to find a victim, and still insert a new key, letting the map grow past the intended cap and breaking the D5 memory bound. Include empty buckets in eviction (or remove empty buckets immediately after trim/no-op).

Fixed in 9f9513c. Two-part fix: (1) record_* now drops empty buckets eagerly via history.remove(key) (mostly defense-in-depth — under the new dedup design the record path can't actually leave a bucket empty). (2) evict_lru_* preferentially evicts empty buckets first, only falling back to the oldest-timestamp scan if no empty bucket exists. Regression test lru_evicts_empty_buckets_first crafts an empty bucket alongside a live one and asserts the empty one is the eviction victim.
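A rough sketch of the victim selection described here, under simplifying assumptions: `pick_victim` is a hypothetical helper, keys are strings, and timestamps are plain `u64` ticks rather than the PR's types. Empty buckets are chosen first; only when none exists does the search fall back to the bucket with the oldest front entry.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical helper: picks an eviction victim, preferring empty buckets.
pub fn pick_victim(history: &HashMap<String, VecDeque<u64>>) -> Option<String> {
    // 1. Any empty bucket is evicted first.
    if let Some((key, _)) = history.iter().find(|(_, entries)| entries.is_empty()) {
        return Some(key.clone());
    }
    // 2. Otherwise evict the bucket whose oldest entry is the oldest overall.
    history
        .iter()
        .filter_map(|(key, entries)| entries.front().map(|ts| (key, *ts)))
        .min_by_key(|(_, ts)| *ts)
        .map(|(key, _)| key.clone())
}
```

Because step 1 never filters empty buckets out, an empty bucket cannot become permanently non-evictable.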

Addresses codex P1 review on PR #3635:

P1 #1: replay dedup loss under high-throughput keys

The prior design used a fixed-size (256) recent_ids ring per bucket decoupled from entries. Under any workload with >256 distinct events in the same window, the first event's id aged out of the ring while its timestamp entry was still live, so a replay silently re-counted.

Fix: dedup memory is now intrinsic to entries. Each entry stores (timestamp, event_id), and the dedup check is 'does any in-window entry have this id?'. Dedup memory is therefore exactly the in-window entry set: no fixed cap, no silent loss.

P1 #2: zombie buckets clogging LRU

Two-part fix:

1. record_* drops empty buckets eagerly via history.remove(key). This is mostly defense-in-depth: under the new dedup design, the record path can't actually leave a bucket empty (proved in the test rationale comment).
2. evict_lru_* now preferentially targets empty buckets first (find any v.entries.is_empty()), only falling back to the oldest-timestamp scan if no empty bucket exists. Filter-out behavior is gone, so any empty bucket that somehow survives becomes the next eviction victim instead of a permanent zombie.

Test changes (+2 new, -0 removed):

- dedup_memory_covers_full_window_under_high_throughput: pushes 512 distinct events into one bucket, then replays event-0. Pre-fix this would have counted again (silent dedup loss); post-fix the replay is a no-op.
- lru_evicts_empty_buckets_first: crafts an empty bucket alongside a live one, runs LRU eviction, asserts the empty one is evicted and the live one retained.

Tests: 162 unit total (+2 new). Clippy/fmt/no-panics clean.

Four medium-priority doc nits from gemini-code-assist on the crates/ironclaw_hooks/docs/successors/03-persistent-counter.md scope:

1. run_id in the trait: removed. The trait dedupes on event_id (RuntimeEventId is already run-scoped), not run_id. Replaces the earlier 'backend stores (timestamp, run_id, event_id)' claim.
2. SystemTime vs chrono::DateTime<Utc>: switched to DateTime<Utc> to match project convention (src/db/mod.rs, ironclaw_events). The in-memory backend keeps Instant for monotonic process-local semantics; durable backends require DateTime<Utc> for cross-process serialization. Documented as a clock note.
3. libSQL TEXT column for rust_decimal: per src/db/CLAUDE.md, libSQL can't preserve Decimal precision with numeric/real types. LibSqlPredicateStateBackend serializes value as TEXT via Decimal::to_string() / from_str(). Postgres impl keeps numeric (correct for PG). Documented as the two LibSql-specific schema differences.
4. Batched-writes vs cross-process consistency tension: gemini was right that deferring writes to the tick boundary breaks requirement #1 (two hosts would each see 'under cap' simultaneously). v1 production backend keeps writes synchronous; future optimization batches reads (not writes).

serrrfirat · 2026-05-14T20:08:56Z

Summary

Reviewed PR #3635 only. Base 5793e4d90e1316adb93ec9c7edf6511d85f8873e → head df505aea23e1c8ece6fa97a9a7b505cc7efe359f.

PR adds persistent predicate counter backend plumbing. In-memory backend looks concurrency-safe in process, but caller/public contracts do not yet support replay-safe durable semantics. Merge stance: blocking correctness finding plus public API concern.

Findings

| # | Sev | Category | File:Line | Issue | Fix suggestion |
| --- | --- | --- | --- | --- | --- |
| 1 | High | Correctness / Idempotency | `crates/ironclaw_hooks/src/evaluator.rs:145`, `crates/ironclaw_hooks/src/evaluator.rs:236` | Evaluator synthesizes a fresh `event_id` for every predicate evaluation. The backend contract dedupes duplicate `event_id`, but same logical invocation retried/replayed through the evaluator gets a new ID and counts again. That defeats durable replay/restart/idempotency semantics this PR is preparing. | Plumb stable runtime event identity through evaluator/caller path before durable backend use (for example `evaluate_with_event_id` or equivalent), and use that same ID for invocation/value counter writes. Add caller-boundary replay test: same stable runtime event retried must not increment count/sum twice. |
| 2 | Medium | Architecture / Persistence semantics | `crates/ironclaw_hooks/src/predicate_state.rs:33`, `crates/ironclaw_hooks/src/predicate_state.rs:109`, `crates/ironclaw_hooks/src/lib.rs:26` | Public `predicate_state` module exposes `PredicateStateBackend`, but trait methods take `Instant`. Docs note `Instant` is process-local and not serializable, so external durable backends cannot faithfully implement cross-process/restart semantics on this public API without a future breaking contract change. | Keep predicate-state backend trait internal until durable API is stable, or switch public contract now to durable/wall-clock time type (or separate sealed in-memory trait vs durable trait) before exposing it. Add contract coverage for durable-time semantics if public API remains. |

Security/data-flow notes

Tenant/key partitioning exists in predicate keys; no direct auth/scope bypass found in changed evaluator path.
Backend errors still fail closed.

Correctness/invariant notes

predicate_state.rs:100-108: backend replay-refusal contract depends on stable duplicate event_id.
evaluator.rs:145-148: caller path passes synthetic ID to record_invocation.
evaluator.rs:236-250: synthetic ID includes process-local counter, guaranteeing uniqueness across repeated identical calls rather than idempotency.
predicate_state.rs:33-38: docs explicitly call out Instant as process-local/non-serializable while module is public.

Missing tests

Caller-boundary replay/idempotency test through evaluator with a stable runtime event ID.
Duplicate event ID no-op through evaluator-facing API, not backend-only tests.
Public durable backend contract test if PredicateStateBackend remains exported.

henrypark133

What looks good:

The backend trait cleanly separates predicate evaluation from state storage.
The in-memory backend keeps record-and-read atomic within the process.
The prior fixed-size dedup and empty-bucket LRU issues look addressed in the backend itself.

Findings:

High - crates/ironclaw_hooks/src/evaluator.rs:145: PredicateEvaluator synthesizes a fresh PredicateEventId on every evaluation by mixing in a process-local counter, so the production installed-hook path never forwards a stable event identity into PredicateStateBackend. Why it matters: replay dedup is presented as load-bearing, but real retries/replays through PredicateBackedBeforeCapabilityHook -> PredicateEvaluator always get a new ID and are counted again. Expected fix direction: thread a stable runtime/event identity through the hook context or add an evaluator entrypoint that accepts one, with caller-level tests proving duplicate event IDs are no-ops.

Summary:

Recommended verdict: Request changes
Prior feedback status: backend-internal Codex P1s appear fixed, but this end-to-end replay-dedup gap remains.
Residual risk: durable Postgres/libSQL persistence is deferred, so cross-process and restart guarantees remain future work.

…y dedup)

henrypark133 HIGH on PR #3635 + serrrfirat HIGH #1: the `PredicateBackedBeforeCapabilityHook -> PredicateEvaluator` path always synthesized a fresh `event_id` per evaluation by mixing in a process-local atomic counter, so the same logical invocation retried/replayed always got a different id. The backend's UNIQUE constraint on `event_id` (the load-bearing dedup contract) never engaged on the real production path. Replay dedup was effectively "documented but unused."

Plumb a stable per-invocation identity through the public hook surface:

- `BeforeCapabilityHookContext` gains a `caller_event_id: Option<PredicateEventId>` field. Middleware that threads through from the calling layer's runtime event identity populates `Some(...)`; older / in-memory-only callers pass `None` and degrade to the current synth path (no behavior change).
- New builder method `with_caller_event_id(...)`.
- `PredicateEvaluator` resolves the id through a new `resolve_event_id` helper: prefer `ctx.caller_event_id`, fall back to `synth_event_id`. Both `record_invocation` and `record_value` paths use it.
- Backend dedup behavior is unchanged; it was already correct on `event_id`. The bug was the caller path never supplying a stable id.

Tests (caller-boundary, henrypark133's required regression):

- `duplicate_caller_event_id_is_deduped_in_invocation_count`: two evaluations with the same `caller_event_id` count as one invocation; a third with a different id counts as two; a fourth crosses the cap. Sanity branch confirms the no-id synth path still exhibits "every call counts" semantics.

This is the API contract slice. Wiring the middleware to actually supply a stable id (e.g. derived from the originating `RuntimeEventId` once that runs through the BeforeCapability path) is the follow-up that lights up the durable backend's end-to-end replay-safety promise.
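The resolution order in this commit message (prefer the caller's id, fall back to a synthesized one) might look roughly like the sketch below. `EventId`, `HookContext`, and the `synth:{hook}:{n}` format are simplified assumptions made for illustration; the PR's real types are `PredicateEventId` and `BeforeCapabilityHookContext`, and its synthesized ids are described elsewhere in the thread as 64-character hex.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical, simplified stand-in for the PR's event id type.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct EventId(String);

/// Hypothetical, simplified stand-in for the hook context.
pub struct HookContext {
    pub caller_event_id: Option<EventId>,
}

static SYNTH_COUNTER: AtomicU64 = AtomicU64::new(0);

/// Fallback: a process-local counter makes every synthesized id unique,
/// so repeated identical calls never dedupe against each other.
fn synth_event_id(hook: &str) -> EventId {
    let n = SYNTH_COUNTER.fetch_add(1, Ordering::Relaxed);
    EventId(format!("synth:{hook}:{n}"))
}

/// Prefer the caller-supplied stable id; otherwise synthesize one.
pub fn resolve_event_id(ctx: &HookContext, hook: &str) -> EventId {
    match &ctx.caller_event_id {
        Some(id) => id.clone(),
        None => synth_event_id(hook),
    }
}
```

A retried invocation that carries the same caller id resolves to the same `EventId` and can therefore be deduplicated; a context without one keeps the "every call counts" behavior.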

…at MED on PR #3635)

serrrfirat MED: the `predicate_state` module exposed `PredicateStateBackend` as `pub`, but the trait's `now: Instant` parameter is process-local and not serializable. Any external durable backend impl built against the current trait would have to be rewritten when the durable contract lands with `chrono::DateTime<Utc>` (see successor doc 03-persistent-counter.md). Hold the public surface back until that contract is stable so we don't ship a public API we know we'll break.

Demoted to `pub(crate)`:

- `PredicateStateBackend` (trait)
- `InvocationKey`, `ValueKey` (key types, backend ABI only)
- `PredicateBackendError` (error type, with `#[allow(dead_code)]` on the `Unavailable` variant since the in-memory backend is infallible and no durable backend exists yet)
- `InMemoryPredicateStateBackend` (the only impl)
- `PredicateEvaluator::with_backend` (with `#[allow(dead_code)]`, reserved for future internal injection paths)

Kept `pub`:

- `PredicateEventId`: it appears on the public hook surface via `BeforeCapabilityHookContext::caller_event_id` (from the #3635 HIGH fix). Hook authors who want stable replay-dedup ids construct one.

No behavior change. All 163 hooks lib tests + 19 reborn integration tests still pass.

henrypark133 · 2026-05-15T01:01:29Z

Code Review — PR #3635 (persistent predicate counter backend)

Verdict: REQUEST CHANGES — above-average quality with a credible extensibility story, but several correctness and safety issues need addressing before merge.

Overview

Extracts sliding-window counter bookkeeping from PredicateEvaluator into a new synchronous PredicateStateBackend trait + InMemoryPredicateStateBackend. Adds atomic record-and-read, event-id replay dedup (scoped to the in-window entry set, fixing the old fixed-ring P1 bug), and LRU eviction at 8192 keys. 1050 lines added, 167 removed.

Issues

Must fix before merge:

Dedup scan is O(n) in in-window entry count — linear scan of bucket.entries per call. At high throughput (1-hour window, thousands of entries), this serializes all evaluations under the mutex. Fix: companion HashSet<PredicateEventId> inside each bucket; remove from set when paired (ts, id) is popped from the deque. O(1) dedup, invariant preserved.
Mutex poison handling calls .expect("predicate history mutex poisoned") — a panic in any thread cascades to all subsequent callers panicking. For fail-closed semantics, use .unwrap_or_else(|e| e.into_inner()) (recover poisoned lock) or add a PoisonError variant to PredicateBackendError.
caller_event_id (inner String) has no format validation — empty string or null bytes reach the backend unchecked. Validate non-empty in with_caller_event_id and document the expected format. Durable backends constraining to UUID will fail cryptically otherwise.
with_backend is #[allow(dead_code)] instead of #[cfg(test)] — exists in release builds, inviting future callers to inject unvalidated backends outside tests.
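For the mutex-poisoning item above, a small standalone sketch of the recovery pattern using only the standard library. `lock_recovering` and `demo` are hypothetical names, and the counter is a stand-in for the predicate history map.

```rust
use std::sync::{Arc, Mutex, MutexGuard};
use std::thread;

/// Locks `m`, recovering the guard if an earlier holder panicked.
pub fn lock_recovering<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    match m.lock() {
        Ok(guard) => guard,
        Err(poisoned) => poisoned.into_inner(),
    }
}

/// Poisons a counter by panicking while the lock is held, then shows that a
/// later caller can still read and update it.
pub fn demo() -> u32 {
    let counter = Arc::new(Mutex::new(0u32));
    let c = Arc::clone(&counter);
    // The spawned thread panics while holding the guard, poisoning the mutex.
    let _ = thread::spawn(move || {
        let mut g = c.lock().unwrap();
        *g += 1;
        panic!("simulated failure while holding the lock");
    })
    .join();
    // A plain `.lock().expect(...)` would panic here; recovery does not.
    let mut g = lock_recovering(&counter);
    *g += 1;
    *g
}
```

One caveat with this approach: the recovered data may reflect a partially applied update from the panicking thread, so it suits state whose invariants hold after every individual mutation.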

Important before durable backend PRs:

Trait has no evict_older_than stub — scope doc promises this for durable backends. Adding a default-impl no-op now locks the signature before the first durable backend PR, avoiding a breaking trait-object change later.
Trait is synchronous but durable backends need async — document explicitly that the sync surface is intentional and that an async_trait migration is required for durable impls, so implementors don't assume the sync contract is permanent.
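The default-impl suggestion above can be illustrated with a trimmed-down trait. Everything here is a hypothetical sketch: `StateBackend`, `InMemory`, and `FakeDurable` are invented names, the error type is a plain `String`, and `SystemTime` stands in for whichever timestamp type the durable contract settles on.

```rust
use std::time::SystemTime;

/// Hypothetical, trimmed-down trait; the real one has more methods.
pub trait StateBackend: Send + Sync {
    fn name(&self) -> &'static str;

    /// Default no-op: backends with nothing durable to evict inherit this
    /// and report zero rows removed.
    fn evict_older_than(&self, _cutoff: SystemTime) -> Result<u64, String> {
        Ok(0)
    }
}

pub struct InMemory;

impl StateBackend for InMemory {
    fn name(&self) -> &'static str {
        "in-memory"
    }
}

/// Stand-in for a durable backend that overrides the default.
pub struct FakeDurable {
    pub stale_rows: u64,
}

impl StateBackend for FakeDurable {
    fn name(&self) -> &'static str {
        "fake-durable"
    }

    fn evict_older_than(&self, _cutoff: SystemTime) -> Result<u64, String> {
        Ok(self.stale_rows)
    }
}
```

Callers holding a `Box<dyn StateBackend>` can invoke `evict_older_than` on either implementation, so adding an override later does not change the trait-object surface.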

Non-blocking nits:

synth_event_id lives in evaluator.rs but its types/consumers live in predicate_state.rs — move it there for co-location and testability.
// safety: comment on non-unsafe blocks — use // RATIONALE: to avoid confusion with the unsafe convention.
PredicateEventId(pub String) inner field is pub, allowing in-place mutation — expose via as_str() and drop the pub.
Test fixture PredicateEventId("11112222...") is 62 chars, not 64 — inconsistent with synth_event_id's stated 64-char output.

Missing test coverage

Concurrent write correctness — no multi-threaded test proving the core atomicity claim (N threads, verify count is exactly N)
LRU eviction driven through the public API to MAX_HISTORY_KEYS + 1
Value sum after clock advance past window
Window boundary exact-at-cutoff (front_ts == cutoff, trim condition is < cutoff, so boundary entries are retained — worth an explicit test)
synth_event_id output format pin (64-char hex)
Cross-type event-id isolation (same id used in both record_invocation and record_value)
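The exact-at-cutoff case from the list above could be pinned with something like the following sketch. `trim` is a hypothetical helper and timestamps are `u64` ticks for brevity; the only behavior taken from the review is that the trim condition is `< cutoff`.

```rust
use std::collections::VecDeque;

/// Drops entries strictly older than `cutoff`; an entry exactly at `cutoff`
/// is retained. Entries are assumed to be ordered oldest first.
pub fn trim(entries: &mut VecDeque<u64>, cutoff: u64) {
    while entries.front().map_or(false, |ts| *ts < cutoff) {
        entries.pop_front();
    }
}
```

With entries at ticks 9, 10, and 11 and a cutoff of 10, only the entry at 9 is removed.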

Security notes

caller_event_id injection — if a tenant can influence the event id, they can replay a prior id to suppress one unit of rate-limit consumption. The id must be host-assigned, not tenant-supplied. Clarify this in the trait doc and enforce it in durable backends.
LRU eviction as D5 attack — high-cardinality key injection evicts legitimate tenant counters, silently resetting rate-limit state. The current threat model frames D5 as a memory concern, not a correctness-under-attack concern. Add the explicit variant.
Cross-process replay — in-memory dedup does not protect against replay to a different instance. Acknowledge this in PredicateStateBackend security notes (not just in D5).

Five items from the 5-15 review:

**#1 (must-fix) O(n) dedup scan**

The previous `bucket.entries.iter().any(...)` linear scan held the outer history mutex while walking thousands of in-window entries at high throughput. Add a companion `HashSet<PredicateEventId>` per bucket (`InvocationBucket.dedup_ids` / `ValueBucket.dedup_ids`), maintained alongside the deque via `pop_front`/`push_back` helpers. O(1) dedup, same correctness, same memory bound (one set entry per in-window entry, no fixed ring).

**#2 (must-fix) Mutex poison cascade**

`.expect("predicate history mutex poisoned")` propagated a panic to every subsequent caller. Replace with `match self.invocation_history.lock() { Ok(g) => g, Err(p) => p.into_inner() }` so a poisoning thread doesn't take down all subsequent evaluations.

**#3 (must-fix) `caller_event_id` format validation**

`with_caller_event_id` now rejects empty strings and ids containing NUL bytes. Failed validation logs a `tracing::warn!` and leaves `caller_event_id == None` so the synth path takes over: operator sees the warning, predicate dedup still works. Also: `PredicateEventId(pub String)` → `PredicateEventId(String)` with `new()` / `as_str()` (henrypark133 nit #9). Inner field is no longer in-place mutable from outside the crate.

**#4 (must-fix) `with_backend` is `#[cfg(test)]`**

Previously `#[allow(dead_code)]`, which left it reachable from release builds and invited future callers to inject backends through an unstable seam. Gated to `cfg(test)`.

**#5 (important) `evict_older_than` trait stub**

Default-impl no-op added to `PredicateStateBackend` so the trait signature is locked before the first durable-backend PR. Trait-object callers won't break when durable impls override it.

**Bonus** (henrypark133 missing-coverage #1): `in_memory_record_invocation_is_atomic_under_concurrent_writers`: 32 threads each record a distinct event id; final count must equal 32, proving the atomic record-and-read contract holds under contention.

**Bonus** (henrypark133 nit #10): The third stable id in `duplicate_caller_event_id_is_deduped_in_invocation_count` was 62 chars; bumped to 64 to match the synth output format.
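A sketch of the companion-set design from item #1, under stated assumptions: `IndexedBucket` is a hypothetical name, ids are strings, and timestamps and windows are `u64` ticks. The invariant is that `dedup_ids` holds exactly the ids present in `entries`, which the two private helpers maintain.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical bucket with an O(1) dedup index alongside the deque.
#[derive(Default)]
pub struct IndexedBucket {
    entries: VecDeque<(u64, String)>, // (timestamp, event_id), oldest first
    dedup_ids: HashSet<String>,       // exactly the ids present in `entries`
}

impl IndexedBucket {
    /// Removes the oldest entry and its id from the index together.
    fn pop_front(&mut self) {
        if let Some((_, id)) = self.entries.pop_front() {
            self.dedup_ids.remove(&id);
        }
    }

    /// Appends an entry and adds its id to the index together.
    fn push_back(&mut self, ts: u64, id: &str) {
        self.entries.push_back((ts, id.to_string()));
        self.dedup_ids.insert(id.to_string());
    }

    /// Trims, records `id` unless it is already in the window, and returns
    /// the in-window count.
    pub fn record(&mut self, id: &str, now: u64, window: u64) -> usize {
        let cutoff = now.saturating_sub(window);
        while self.entries.front().map_or(false, |e| e.0 < cutoff) {
            self.pop_front();
        }
        if !self.dedup_ids.contains(id) {
            self.push_back(now, id);
        }
        self.entries.len()
    }
}
```

Since an id leaves the set only when its entry leaves the deque, an id that ages out of the window can be recorded again later, matching the window-scoped dedup semantics described in the thread.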

…3635 CI)

serrrfirat

Findings:

High - crates/ironclaw_hooks/docs/successors/03-persistent-counter.md:91, :103: the proposed durable schema uses event_id uuid PRIMARY KEY, but the backend contract dedupes only within the same counter key. caller_event_id is per capability invocation, not per predicate hook. If two predicate-backed hooks observe the same invocation, the second hook's insert conflicts on the global event_id and does not count for its own (tenant, hook_id, capability[, field]) bucket. That undercounts independent hooks and can let a later hook's cap fail open. Use composite uniqueness scoped to the counter key, e.g. (tenant_id, hook_id, capability, event_id) and (tenant_id, hook_id, capability, field, event_id), and align the predicate_state.rs replay-refusal docs with that scope.
Medium - crates/ironclaw_hooks/src/points/capability.rs:54, :105-114: caller_event_id is a public field, so the validation in with_caller_event_id() is bypassable. Callers can construct a context and then assign Some(PredicateEventId::new("")) or a NUL-containing value directly; PredicateEventId::new() is intentionally permissive, and the evaluator trusts the field. Durable backends can still receive invalid IDs or accidentally dedupe unrelated events under an empty ID. Make the field private and require the validated setter, or make PredicateEventId construction validated with an internal unchecked constructor for synthesized IDs.

I did not run tests; this was a review-only pass against head 7c83c3d14db6403fe3e549a32cf987a8c9b96aa3.

serrrfirat · 2026-05-15T10:10:17Z

+      hook_id       bytea NOT NULL,
+      capability    text NOT NULL,
+      occurred_at   timestamptz NOT NULL,
+      event_id      uuid PRIMARY KEY -- for replay dedup


The schema dedupes event_id globally, but the backend contract dedupes only within a counter key. Because caller_event_id is per capability invocation, two predicate-backed hooks observing the same invocation would collide here and the second hook would not count its own bucket. Please scope uniqueness to (tenant_id, hook_id, capability, event_id) for invocations and (tenant_id, hook_id, capability, field, event_id) for values, then align the replay-refusal docs with that scope.

Addressed in b4d8a355f.

Schema PKs changed to composite uniqueness scoped to the counter key:

hook_invocation_events: PRIMARY KEY (tenant_id, hook_id, capability, event_id)

hook_value_events: PRIMARY KEY (tenant_id, hook_id, capability, field, event_id)

The trait's replay-refusal docs in predicate_state.rs were rewritten to spell out the per-key scope explicitly and to specify the corresponding INSERT … ON CONFLICT (tenant, hook, capability[, field], event_id) DO NOTHING shape that durable backends must use. Two predicate hooks observing the same invocation now record into distinct buckets — no first-writer-wins undercounting.
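
To make the per-key scope concrete, here is a minimal in-memory sketch of the composite uniqueness rule. `CounterKey` and `InvocationEvents` are hypothetical illustrations (the real tables are SQL, and `hook_id` is `bytea` there, not a string); the set insert stands in for `INSERT … ON CONFLICT … DO NOTHING`.

```rust
use std::collections::HashSet;

/// Hypothetical counter key mirroring the composite key columns.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct CounterKey {
    pub tenant_id: String,
    pub hook_id: String,
    pub capability: String,
}

/// Toy table whose uniqueness is scoped to (counter key, event_id).
#[derive(Default)]
pub struct InvocationEvents {
    rows: HashSet<(CounterKey, String)>,
}

impl InvocationEvents {
    /// Returns true when a row was inserted and false when the
    /// (key, event_id) pair already existed, i.e. a replay.
    pub fn insert(&mut self, key: &CounterKey, event_id: &str) -> bool {
        self.rows.insert((key.clone(), event_id.to_owned()))
    }

    /// Number of recorded events for one counter key.
    pub fn count(&self, key: &CounterKey) -> usize {
        self.rows.iter().filter(|(k, _)| k == key).count()
    }
}
```

Two keys that differ only in `hook_id` can both insert the same `event_id` and each count it once, while a second insert under the same key is refused. With a global `event_id` key, the second hook's insert would have been the one refused.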

serrrfirat · 2026-05-15T10:10:17Z

+    /// dedup degrades to "every evaluation counts" semantics — appropriate
+    /// for the in-memory backend without replay, but **not** for durable
+    /// backends. Durable callers MUST supply a value.
+    pub caller_event_id: Option<crate::predicate_state::PredicateEventId>,


This field is public, so callers can bypass the validation in with_caller_event_id() by assigning Some(PredicateEventId::new("")) or a NUL-containing value after construction. Since the evaluator trusts this field and PredicateEventId::new() is permissive, durable backends can still receive invalid IDs. Make the field private and force the validated setter, or validate construction at the PredicateEventId boundary with an internal unchecked constructor for synthesized IDs.

Addressed in b4d8a355f.

Moved validation INTO the type boundary: PredicateEventId::new(...) -> Result<Self, PredicateEventIdError> validates non-empty + NUL-free at construction. The public field on BeforeCapabilityHookContext can no longer carry an invalid id because constructing any PredicateEventId value goes through the validating path.

For the evaluator's internal synth path and tests that mint ids from known-good shapes (e.g. blake3 hex digests), PredicateEventId::new_unchecked(...) is the explicit escape hatch — its name makes the bypass visible to reviewers.

with_caller_event_id dropped its now-redundant runtime check (the type already enforces it). Existing call sites switched to .expect("fixture passes validation") for the test fixtures and new_unchecked for the synth path.

Tests:

predicate_event_id_rejects_empty

predicate_event_id_rejects_nul_bytes

predicate_event_id_accepts_typical_hex_digest
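
A minimal sketch of what validation at the type boundary could look like. The names `PredicateEventId`, `PredicateEventIdError`, `new`, `new_unchecked`, and `as_str` come from the discussion above; the error variants and everything else are assumptions for illustration and may not match the crate.

```rust
use std::fmt;

/// Assumed error shape; the variant names are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PredicateEventIdError {
    Empty,
    ContainsNul,
}

impl fmt::Display for PredicateEventIdError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Empty => write!(f, "event id must not be empty"),
            Self::ContainsNul => write!(f, "event id must not contain NUL bytes"),
        }
    }
}

impl std::error::Error for PredicateEventIdError {}

/// The inner field is private, so values can only be built through the constructors.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PredicateEventId(String);

impl PredicateEventId {
    /// Validating constructor: rejects empty ids and ids containing NUL.
    pub fn new(raw: impl Into<String>) -> Result<Self, PredicateEventIdError> {
        let raw = raw.into();
        if raw.is_empty() {
            return Err(PredicateEventIdError::Empty);
        }
        if raw.contains('\0') {
            return Err(PredicateEventIdError::ContainsNul);
        }
        Ok(Self(raw))
    }

    /// Escape hatch for ids minted from known-good shapes; skips validation.
    pub fn new_unchecked(raw: impl Into<String>) -> Self {
        Self(raw.into())
    }

    /// Read-only accessor for the inner string.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Keeping the field private is what makes the check hard to bypass by accident: outside the defining module, the only unvalidated route is the explicitly named `new_unchecked`.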

**MEDIUM: `caller_event_id` validation bypass**

`with_caller_event_id` validated for empty/NUL, but the field on `BeforeCapabilityHookContext` is `pub`, so callers could direct-assign `Some(PredicateEventId::new("..."))` with `new()` permissive and bypass the setter entirely. Move validation INTO the type boundary:

- `PredicateEventId::new(...) -> Result<Self, PredicateEventIdError>` validates non-empty + NUL-free at construction. Any value that reaches a downstream backend now satisfies the format invariant by construction.
- `PredicateEventId::new_unchecked(...)` for internal synth paths and tests that mint ids from known-good shapes (hex digests).
- `with_caller_event_id` drops its now-redundant runtime check; the type already enforces it.
- Internal synth in `evaluator.rs` switches to `new_unchecked` (64-char hex output is always valid by construction).

Tests:

- `predicate_event_id_rejects_empty`
- `predicate_event_id_rejects_nul_bytes`
- `predicate_event_id_accepts_typical_hex_digest`

**HIGH: durable schema dedup scope mismatch**

The successor doc's Postgres schema declared `event_id uuid PRIMARY KEY` (globally unique), but the trait's replay-refusal contract dedupes within the counter `key`. `caller_event_id` is per capability invocation, so two predicate-backed hooks observing the same invocation share an id. A global PK lets the first hook's INSERT win and silently undercounts the second hook's bucket.

- `docs/successors/03-persistent-counter.md`: PK changes to composite `(tenant_id, hook_id, capability, event_id)` for invocations and `(tenant_id, hook_id, capability, field, event_id)` for values, matching the trait's per-key dedup scope.
- `predicate_state.rs` trait doc: replay-refusal section rewritten to spell out the per-key scope and the corresponding `INSERT … ON CONFLICT (tenant, hook, capability[, field], event_id) DO NOTHING` shape durable backends should use.

henrypark133 / serrrfirat blocker B4 on PR #3635: the `caller_event_id` threading through `BeforeCapabilityHookContext` partially shipped earlier (commit b4d8a35), but the trust-boundary documentation explaining the host-assigned invariant was still missing.

Add rustdoc to `PredicateEventId` and the `PredicateStateBackend` trait clarifying that:

- the id MUST be minted by trusted host code from authoritative sources (dispatcher RuntimeEventId, host-side hash, arguments digest)
- it MUST NOT pass through unchanged from any tenant-controlled surface (capability arguments, manifest fields, WASM memory, HTTP bodies)
- the format invariants in `PredicateEventId::new` (non-empty, NUL-free) are a durability contract for SQL backends, NOT a trust check
- a tenant-supplied id can either undercount itself into infinity by replaying a fixed id, or poison adjacent buckets if scoping is ever weakened

Doc-only; no behavior change.

henrypark133 HIGH blocker B1 on PR #3635: replay dedup must engage at the caller boundary. `PredicateBackedBeforeCapabilityHook::evaluate` is the production path the dispatcher invokes for installed predicate hooks. A unit test on `PredicateEvaluator::evaluate_at` alone is insufficient regression coverage (repo CLAUDE.md rule "Test through the caller, not just the helper"): the wrapper hook reads `BeforeCapabilityHookContext::caller_event_id` and threads it down to the backend, so the regression test must drive the wrapper itself.

The threading work already shipped in commit e6df47d (`caller_event_id` field on the public hook context + evaluator preferring it over the synth path). This commit adds the missing end-to-end test:

1. Two `PredicateBackedBeforeCapabilityHook::evaluate` calls with the same `caller_event_id` and a `RateOrValueCap { max: 1 }` predicate: the second call must stay under cap (dedup engages at the wrapper boundary, so the replay is not re-counted into a deny).
2. A third call with a DISTINCT `caller_event_id` crosses the cap, proving dedup is replay-scoped (same id → no-op), not blanket-suppress (any id → no-op).

If the wrapper were synthesizing a fresh id per call (the bug Henry flagged before threading landed), this test would fail at step 1, with the second evaluation being denied.

zmanian · 2026-05-15T14:41:24Z

@henrypark133 thanks for the careful review — addressing your HIGH finding plus the joint blockers from Firat's pass.

B1 (HIGH) — Replay-dedup gap through `PredicateBackedBeforeCapabilityHook` — `evaluator.rs:145, :236`

Fixed across two commits.

Threading shipped earlier in e6df47dd (caller_event_id field on BeforeCapabilityHookContext; the evaluator now prefers it via resolve_event_id and falls back to the synth path only when absent). The piece your review specifically called out, "caller-level tests proving duplicate event IDs are no-ops" through the installed-hook path, is now in f632d225a:

New caller_event_id_replay_is_deduped_through_wrapper_hook test in installed_hook.rs drives PredicateBackedBeforeCapabilityHook::evaluate directly (not just PredicateEvaluator::evaluate_at), per the repo CLAUDE.md "Test through the caller, not just the helper" rule. The test:
1. Two evaluations with the same stable caller_event_id against a RateOrValueCap { max: 1 } predicate — second call stays under cap (dedup engages at the wrapper boundary).
2. A third evaluation with a distinct caller_event_id crosses the cap — proving dedup is replay-scoped (same id → no-op), not blanket-suppress (any id → no-op).

If the wrapper were synthesizing a fresh id per call (the bug), the test would fail at step 1, with the second evaluation being denied.
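
The shape of that test can be sketched against a toy cap. `ToyRateCap` and `Decision` are hypothetical, and the sketch assumes "under cap" means the deduplicated count is at most `max`; the real predicate and hook types may behave differently.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny,
}

/// Hypothetical cap: denies once more than `max` distinct event ids were seen.
pub struct ToyRateCap {
    max: usize,
    seen: HashSet<String>,
}

impl ToyRateCap {
    pub fn new(max: usize) -> Self {
        Self { max, seen: HashSet::new() }
    }

    pub fn evaluate(&mut self, caller_event_id: &str) -> Decision {
        // Inserting an id that is already present changes nothing,
        // which is the replay dedup being tested.
        self.seen.insert(caller_event_id.to_owned());
        if self.seen.len() > self.max {
            Decision::Deny
        } else {
            Decision::Allow
        }
    }
}
```

With `max` set to 1, evaluating the same id twice yields `Allow` both times, and a third evaluation with a different id yields `Deny`.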

B2 (HIGH) — O(n) dedup scan under mutex — `predicate_state.rs`

Fixed in 0ea35d8a4 (must-fix #1). Added companion HashSet<PredicateEventId> per bucket (InvocationBucket::dedup_ids / ValueBucket::dedup_ids). Membership check is now O(1); the VecDeque<entries> is kept for ordered iteration and FIFO trim. Invariants documented at the bucket struct: dedup_ids is exactly the set of event_id values currently in entries; every push/pop updates both via push_back / pop_front helpers that keep them in sync.
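
A minimal sketch of the companion-set idea, assuming string ids and integer timestamps. The field and helper names follow the comment above, but the struct layout, the `record` method, and the `max_entries` trim (a stand-in for the real window trim) are hypothetical.

```rust
use std::collections::{HashSet, VecDeque};

/// Invariant: `dedup_ids` is exactly the set of event ids currently in `entries`.
#[derive(Default)]
pub struct InvocationBucket {
    entries: VecDeque<(u64, String)>, // (timestamp, event_id), oldest first
    dedup_ids: HashSet<String>,
}

impl InvocationBucket {
    // Every mutation goes through these two helpers so both collections stay in sync.
    fn push_back(&mut self, timestamp: u64, event_id: String) {
        self.dedup_ids.insert(event_id.clone());
        self.entries.push_back((timestamp, event_id));
    }

    fn pop_front(&mut self) -> Option<(u64, String)> {
        let entry = self.entries.pop_front()?;
        self.dedup_ids.remove(&entry.1);
        Some(entry)
    }

    /// Returns false for a replayed id (O(1) set lookup), true when the event was counted.
    pub fn record(&mut self, timestamp: u64, event_id: &str, max_entries: usize) -> bool {
        if self.dedup_ids.contains(event_id) {
            return false;
        }
        self.push_back(timestamp, event_id.to_owned());
        // FIFO trim of the oldest entries.
        while self.entries.len() > max_entries {
            self.pop_front();
        }
        true
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Checks the documented invariant; useful in tests.
    pub fn in_sync(&self) -> bool {
        self.entries.len() == self.dedup_ids.len()
            && self.entries.iter().all(|(_, id)| self.dedup_ids.contains(id))
    }
}
```

Once an id is trimmed from the deque it also leaves the set, so the same id counts again afterwards.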

B3 (HIGH) — Mutex poison cascade — `predicate_state.rs`

Fixed in 0ea35d8a4 (must-fix #2). Replaced .expect() on .lock() with match { Ok(g) => g, Err(poisoned) => poisoned.into_inner() }. Choice documented inline (record_invocation:319-322):

Recover from a poisoned mutex by reading the inner value rather than cascading the panic to every subsequent caller. A poisoning thread has already aborted; refusing service indefinitely is worse than proceeding with possibly-incomplete state.

Backend has no other failure modes today (Unavailable is reserved for the future durable backends), so propagation would force every caller to handle an error the in-memory impl never returns. Documented as "best-effort recovery" semantics.
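
The recovery pattern is plain standard-library Rust and can be sketched in isolation. `lock_recovering` and `poison` are hypothetical helper names used only for this illustration.

```rust
use std::sync::{Arc, Mutex, MutexGuard};
use std::thread;

/// Locks `m`; if a previous holder panicked, takes the inner guard anyway
/// instead of propagating the poison error.
pub fn lock_recovering<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    match m.lock() {
        Ok(guard) => guard,
        Err(poisoned) => poisoned.into_inner(),
    }
}

/// Test helper: poisons `m` by panicking on another thread while the lock is held.
pub fn poison(m: &Arc<Mutex<u64>>) {
    let m2 = Arc::clone(m);
    let _ = thread::spawn(move || {
        let _guard = m2.lock();
        panic!("simulated failure while holding the lock");
    })
    .join();
}
```

After `poison`, `Mutex::is_poisoned` reports true, yet `lock_recovering` still hands out a usable guard. The trade-off is the one quoted above: the state may be incomplete, but later callers keep being served.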

B4 — `caller_event_id` trust-boundary doc — `predicate_state.rs`

Fixed in c9c5fb014. Added explicit rustdoc on both PredicateEventId and the PredicateStateBackend trait clarifying the host-assigned invariant:

The id MUST be minted by trusted host code from authoritative sources (dispatcher RuntimeEventId, host-side hash, arguments digest).
It MUST NOT pass through unchanged from tenant-controlled surfaces (capability args, manifest fields, WASM memory, HTTP bodies).
The format invariants in PredicateEventId::new (non-empty, NUL-free) are a durability contract for SQL backends, NOT a trust check.
A tenant-supplied id can undercount itself into infinity by replaying a fixed id, or poison adjacent buckets if scoping is ever weakened.

Quality gate green on ironclaw_hooks: 168 tests pass, cargo clippy --all --benches --tests --examples --all-features -- -D warnings is clean. Re-requesting review.

henrypark133 should-fix S8 + S9 on PR #3635.

S8, threat-model expansion:

- Add D5a as the correctness-under-attack variant of D5: an attacker flooding high-cardinality keys can LRU-evict legitimate tenants' counters and reset their rate-limit state. Distinct from the memory-only framing of D5; tied back to per-extension caps (D3/D4) and the durable-backend successor (doc 03).
- Document the cross-process replay limit on the in-memory backend inside the PredicateStateBackend trait docs, not just in D5. The process-local dedup is a property callers need at the trait surface, with a pointer to the durable backend as the cross-host story.

S9, three new tests on the in-memory backend public API:

- lru_eviction_via_public_api_holds_max_history_keys_cap: drives MAX_HISTORY_KEYS + 1 distinct keys through record_invocation and asserts the map size cap holds + evictions_observed() advances. The previous coverage manually crafted buckets and called the LRU helper directly; this exercises the production path.
- in_memory_invocation_retains_entry_at_exact_window_cutoff: pins the `< cutoff` trim semantics so a refactor to `<=` would fail loud.
- event_id_dedup_is_isolated_across_invocation_and_value_maps: the same event_id used in both record_invocation and record_value must not cross-suppress, because the two maps key on disjoint types.

The fourth S9 item (concurrent N-thread atomicity) and the caller-boundary replay test on the wrapper hook already landed in earlier commits (f632d22, predicate_state.rs line 840). S2 (evict_older_than stub), S3 (sync-trait docs), and S7 (consistency vs batched-writes) were also already in HEAD; this commit ships the remaining items.

Quality gate: cargo fmt clean, cargo clippy -p ironclaw_hooks --all-features --tests -D warnings clean, full hooks test suite green (15 predicate_state unit tests + lib + integration).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
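
The `< cutoff` trim rule pinned by the second S9 test can be shown on a bare deque of timestamps. `trim_older_than` is a hypothetical helper, and integer timestamps are an assumption made to keep the sketch std-only.

```rust
use std::collections::VecDeque;

/// Drops entries strictly older than `cutoff` and returns how many were dropped.
/// An entry stamped exactly at `cutoff` is retained.
pub fn trim_older_than(entries: &mut VecDeque<u64>, cutoff: u64) -> usize {
    let mut dropped = 0;
    while let Some(&front) = entries.front() {
        if front < cutoff {
            entries.pop_front();
            dropped += 1;
        } else {
            break;
        }
    }
    dropped
}
```

For timestamps 9, 10, 11 and a cutoff of 10, only 9 is dropped. Changing the comparison to `<=` would also drop 10, which is the regression the test is meant to catch.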

…t; pin synth format

henrypark133 nits N1, N2, N5 on PR #3635.

N1: Move `synth_event_id` from `evaluator.rs` to `predicate_state.rs` as `PredicateEventId::synth(...)`. The id format (64-char lowercase hex, no NUL, never empty) is part of the backend's durable contract, so co-locating with `PredicateStateBackend` keeps the format change-surface adjacent to the consumer. To avoid inverting the module dependency (`predicate_state` is a leaf below `points`), the synth helper takes raw bytes / &str rather than a `&BeforeCapabilityHookContext`. The evaluator's `resolve_event_id` fallback unpacks the context and delegates.

N2: `// safety:` comment on a non-`unsafe` block (the `write!(s, "{byte:02x}")` infallibility note) renamed to `// RATIONALE:`. By convention `// SAFETY:` pairs with `unsafe` blocks; using `// safety:` elsewhere conflates the two.

N5: Add `synth_event_id_is_64_char_lowercase_hex` to pin the synth output shape. A refactor that silently changes length or case would break the durable backend's `uuid`-shaped UNIQUE constraint without a test failure today; the new test fails loud.

Quality gate: cargo fmt clean, cargo clippy --all --benches --tests --examples --all-features -D warnings clean, full hooks lib test suite green (172 passing including the new pin).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

zmanian · 2026-05-15T14:52:23Z

Should-fix + nits follow-up — henrypark133 review (5-15)

Status on the remaining items from your should-fix + nits list. All addressed; SHAs below.

Should-fix

| # | Item | Status |
|---|---|---|
| S2 | `evict_older_than` trait stub with default impl | Already landed in HEAD: `predicate_state.rs:252` defines the default-impl no-op on `PredicateStateBackend` (commit `b4d8a355f`). Trait-object callers won't break when the durable impl lands. |
| S3 | Document sync surface as intentional | Already landed in HEAD: module-level docs `predicate_state.rs:46-53` explicitly mark the trait as synchronous in this PR with the durable-backend async migration noted (commit `b4d8a355f`). |
| S7 | Cross-process consistency vs batched-writes tradeoff | Already landed in HEAD: `docs/successors/03-persistent-counter.md` lines 149-163 spell out the conflict between cross-process consistency and write-batching, and state the v1 resolution (synchronous writes; reads may batch in future) explicitly (commit `df505aea2`). |
| S8 | LRU eviction as D5 correctness-under-attack variant + cross-process replay in `PredicateStateBackend` notes | `2b0383282`. Threat-model: added D5a as the correctness-under-attack variant of D5 (attacker-driven high-cardinality key flood evicts legitimate tenants' counters, resetting rate-limit state). PredicateStateBackend: added explicit `# Cross-process replay limits (in-memory backend)` section documenting that the in-memory dedup is process-local; durable backend is the only defense across hosts. |
| S9 | Missing tests (4) | Partially landed before, completed in `2b0383282`. Concurrent N-thread atomicity already at `predicate_state.rs:840` (`in_memory_record_invocation_is_atomic_under_concurrent_writers`). New in this commit: `lru_eviction_via_public_api_holds_max_history_keys_cap` (drives `MAX_HISTORY_KEYS + 1` through the public API, asserts cap holds + evictions counter advances), `in_memory_invocation_retains_entry_at_exact_window_cutoff` (pins `< cutoff` trim semantics), `event_id_dedup_is_isolated_across_invocation_and_value_maps` (cross-type event-id isolation). Caller-boundary replay-dedup test through the wrapper hook is at `installed_hook.rs:154` (`f632d225a`). |
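
The S2 item (a default-impl no-op on the trait) can be sketched as follows. The trait name and `evict_older_than` come from the thread; the method signatures, `&mut self` receivers, integer timestamps, and both backend structs are hypothetical and chosen only to keep the example self-contained.

```rust
/// Toy version of the backend trait; the real signatures may differ.
pub trait PredicateStateBackend {
    /// Records one invocation and returns the current count.
    fn record_invocation(&mut self, timestamp: u64) -> usize;

    /// Default no-op so existing impls and trait-object callers keep compiling.
    /// Returns the number of evicted entries.
    fn evict_older_than(&mut self, _cutoff: u64) -> usize {
        0
    }
}

/// Backend that relies on the default: nothing is ever evicted.
#[derive(Default)]
pub struct NoEvictBackend {
    timestamps: Vec<u64>,
}

impl PredicateStateBackend for NoEvictBackend {
    fn record_invocation(&mut self, timestamp: u64) -> usize {
        self.timestamps.push(timestamp);
        self.timestamps.len()
    }
}

/// Backend that overrides the stub, as a durable impl might.
#[derive(Default)]
pub struct EvictingBackend {
    timestamps: Vec<u64>,
}

impl PredicateStateBackend for EvictingBackend {
    fn record_invocation(&mut self, timestamp: u64) -> usize {
        self.timestamps.push(timestamp);
        self.timestamps.len()
    }

    fn evict_older_than(&mut self, cutoff: u64) -> usize {
        let before = self.timestamps.len();
        self.timestamps.retain(|&t| t >= cutoff);
        before - self.timestamps.len()
    }
}
```

Both structs can sit behind `Box<dyn PredicateStateBackend>`; calling `evict_older_than` on the first returns 0, while the second actually removes old entries.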

Nits

| # | Item | Status |
|---|---|---|
| N1 | Move `synth_event_id` from `evaluator.rs` → `predicate_state.rs` | `33521f7e2`. Moved to `PredicateEventId::synth(...)` next to the backend that consumes the id format. Takes raw bytes / `&str` (not `&BeforeCapabilityHookContext`) so `predicate_state` stays a leaf module below `points`; the evaluator's `resolve_event_id` unpacks the context and delegates. |
| N2 | Replace `// safety:` on non-`unsafe` blocks with `// RATIONALE:` | `33521f7e2`. The `write!(s, "{byte:02x}")` infallibility note was the only `// safety:` outside an `unsafe` block; renamed in the synth move. |
| N3 | `PredicateEventId(pub String)` → drop inner `pub`, expose via `as_str()` | Already landed in HEAD: `predicate_state.rs:115` is `pub struct PredicateEventId(String)` (no inner `pub`), and `as_str()` is the read accessor (commit `b4d8a355f`). |
| N4 | Test fixture id is 62 chars, not 64; align with `synth_event_id` | Already landed in HEAD: current fixtures (e.g. `evaluator.rs:486`, `:506`, `:516`, `installed_hook.rs:171`, `:209`) all use 64-char hex strings matching the synth output (commit `b4d8a355f`). |
| N5 | Pin `synth_event_id` 64-char hex output in a test | `33521f7e2`. `synth_event_id_is_64_char_lowercase_hex` asserts the synth output is exactly 64 chars and lowercase-hex. Failure mode: a refactor that silently changes the length or case would break the durable backend's `uuid` UNIQUE constraint; the new test fails loud. |

Quality gate

Each commit: cargo fmt && cargo clippy --all --benches --tests --examples --all-features -- -D warnings && cargo test -p ironclaw_hooks — all green. Final HEAD 33521f7e2 runs 172 hooks lib tests passing including all new pins.

Branch: hooks-fu-persistent-counter @ 33521f7e2.

…fu-persistent-counter

The "No panics in production code" CI check (scripts/check_no_panics.py) only recognizes `// safety:` suppression markers, not `RATIONALE:`. Since std::fmt::Write for String is infallible, just discard the Result with `let _ =` instead of `.expect()` — no panic call, no marker needed. Also merges in latest origin/hooks-foundation-01 (now includes the reborn-integration merge and PR #3636).
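
A small sketch of the `let _ =` pattern described above. `to_lower_hex` is a hypothetical helper name; the real synth function hashes its inputs first, which is omitted here to stay std-only.

```rust
use std::fmt::Write;

/// Encodes `bytes` as lowercase hex, two characters per byte.
pub fn to_lower_hex(bytes: &[u8]) -> String {
    let mut s = String::with_capacity(bytes.len() * 2);
    for byte in bytes {
        // Writing into a String cannot fail, so the Result is discarded
        // rather than unwrapped; no panic path, no suppression marker.
        let _ = write!(s, "{byte:02x}");
    }
    s
}
```

A 32-byte digest encodes to 64 lowercase hex characters, which matches the id shape the thread pins in `synth_event_id_is_64_char_lowercase_hex`.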

github-actions Bot added the labels scope: docs, size: XS, risk: low, and contributor: core on May 14, 2026

gemini-code-assist Bot reviewed May 14, 2026

docs(hooks): scope persistent predicate counter backend (successor #3)

d4dd47a

Successor PR from #3573. Current sliding-window state is in-memory and resets on restart. Adds a PredicateStateBackend trait + Postgres/libSQL impls for cross-process and restart-survival semantics.

zmanian force-pushed the hooks-fu-persistent-counter branch from 6a6e0e0 to d4dd47a Compare May 14, 2026 13:18

zmanian changed the base branch from reborn-integration to hooks-foundation-01 May 14, 2026 13:19

zmanian marked this pull request as ready for review May 14, 2026 13:52

github-actions Bot added the size: XL label and removed the size: XS label on May 14, 2026

chatgpt-codex-connector Bot reviewed May 14, 2026

zmanian added 2 commits May 14, 2026 08:01

zmanian mentioned this pull request May 14, 2026

feat(hooks): event-triggered hooks Phase 5 (successor to #3573) #3640

Open

henrypark133 requested changes May 14, 2026

zmanian added 2 commits May 14, 2026 16:35

zmanian added 2 commits May 14, 2026 20:00

fix(hooks): clippy doc-list-indentation + remove unused with_backend (#…

7c83c3d

…3635 CI)

serrrfirat requested changes May 15, 2026

zmanian added 3 commits May 15, 2026 05:35

zmanian requested a review from henrypark133 May 15, 2026 14:41

zmanian requested a review from serrrfirat May 15, 2026 14:41

zmanian and others added 2 commits May 15, 2026 07:48

zmanian added 2 commits May 17, 2026 05:43

Merge remote-tracking branch 'origin/hooks-foundation-01' into hooks-…

8660d65

…fu-persistent-counter


gemini-code-assist Bot left a comment

Code Review

gemini-code-assist Bot May 14, 2026

zmanian commented May 14, 2026

Codex review addressed (commit d5f68de)

chatgpt-codex-connector Bot left a comment

💡 Codex Review

chatgpt-codex-connector Bot May 14, 2026

serrrfirat commented May 14, 2026

Summary

Findings

Security/data-flow notes

Correctness/invariant notes

Missing tests

henrypark133 left a comment

henrypark133 commented May 15, 2026

Code Review — PR #3635 (persistent predicate counter backend)

Overview

Issues

Missing test coverage

Security notes

serrrfirat left a comment

zmanian commented May 15, 2026