Skip to content
Open
Show file tree
Hide file tree
Changes from 8 commits
Commits
Show all changes
15 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
157 changes: 157 additions & 0 deletions crates/ironclaw_hooks/docs/successors/03-persistent-counter.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,157 @@
# Successor PR: persistent predicate counter

> Successor work from PR #3573. Current sliding-window counter state is
> in-memory only and resets on every restart. This PR adds a durable
> backend so rate-limit / value-cap predicates survive process
> lifecycles.

## Scope

Add a `PredicateStateBackend` trait + Postgres / libSQL impls so
`PredicateEvaluator` can persist its counter state. Threat-model D5
(LRU eviction at 8192 keys per map) still applies in-memory; the
durable backend is the source of truth.

## Required behavior

1. **Cross-process consistency**: two host processes against the same
tenant share counter state. Run-1 increments → run-2 sees the
updated count.
2. **Restart survival**: counter state survives process restart.
3. **Replay refusal**: re-emitting a recorded invocation must NOT
double-count. The backend dedupes on `event_id` (a sanitized
`RuntimeEventId` hex) so duplicate detection works across restart
/ replay.
4. **Backend-agnostic**: the predicate evaluator depends on the trait,
not a specific store.

## Likely surface

```rust
#[async_trait]
pub trait PredicateStateBackend: Send + Sync {
/// Atomic record-and-read: the implementation MUST perform the
/// insert AND the in-window count read under a single
/// lock/transaction. Splitting them is a race that lets the cap
/// drift past `max` (codex Critical from #3635).
async fn record_invocation(
&self,
key: &InvocationKey, // (tenant, hook_id, capability)
timestamp: DateTime<Utc>, // project-wide convention; see src/db/mod.rs
event_id: &PredicateEventId, // opaque hex; durable backends UNIQUE-constraint on it
window: Duration,
) -> Result<u32, PredicateBackendError>;

async fn record_value(
&self,
key: &ValueKey, // (tenant, hook_id, capability, field)
timestamp: DateTime<Utc>,
event_id: &PredicateEventId,
value: Decimal,
window: Duration,
) -> Result<Decimal, PredicateBackendError>;

/// Garbage-collect rows older than `cutoff`. Operator runs this as
/// a reaper task, typically at the slowest configured window.
async fn evict_older_than(
&self,
cutoff: DateTime<Utc>,
) -> Result<u64, PredicateBackendError>;
}
Comment on lines +32 to +60
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The trait uses std::time::SystemTime, but the existing database layer in this repository (e.g., src/db/mod.rs) consistently uses chrono::DateTime<Utc>. Using DateTime<Utc> would be more idiomatic for this project and simplify integration with the existing persistence and migration logic.

pub trait PredicateStateBackend: Send + Sync {
    async fn record_invocation(
        &self,
        key: &InvocationKey,           // (tenant, hook_id, capability)
        timestamp: DateTime<Utc>,
        event_id: RuntimeEventId,
    ) -> Result<(), PredicateBackendError>;

    async fn count_in_window(
        &self,
        key: &InvocationKey,
        window: Duration,
    ) -> Result<u32, PredicateBackendError>;

    async fn record_value(
        &self,
        key: &ValueKey,                // (tenant, hook_id, capability, field)
        timestamp: DateTime<Utc>,
        value: Decimal,
        event_id: RuntimeEventId,
    ) -> Result<(), PredicateBackendError>;

    async fn sum_in_window(
        &self,
        key: &ValueKey,
        window: Duration,
    ) -> Result<Decimal, PredicateBackendError>;

    async fn evict_older_than(&self, cutoff: DateTime<Utc>) -> Result<u64, PredicateBackendError>;
}

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in doc revision. The current trait sketch (line 41-49) uses DateTime<Utc> for the durable backend, and the rationale is called out explicitly on lines 63-70:

the in-memory backend that already shipped in PR #3635 uses Instant because it's process-local and monotonic. Durable backends serialize across processes, so they must use chrono::DateTime<Utc> to match the rest of the project (src/db/mod.rs, ironclaw_events). The trait will accept DateTime<Utc> and the existing in-memory backend will gain a thin shim mapping Instant-driven callers to a fixed reference point.

```

**Clock note (responding to gemini's review on the prior draft):** the
in-memory backend that already shipped in PR #3635 uses `Instant`
because it's process-local and monotonic. Durable backends serialize
across processes, so they must use `chrono::DateTime<Utc>` to match
the rest of the project (`src/db/mod.rs`, `ironclaw_events`). The
trait will accept `DateTime<Utc>` and the existing in-memory backend
will gain a thin shim mapping `Instant`-driven callers to a fixed
reference point.

**Run scope:** the trait does NOT carry `run_id` directly. The
sliding-window state is per-tenant + per-hook-id; replay refusal is
driven by `event_id` (which is `RuntimeEventId` from `ironclaw_events`,
itself already keyed to the current run's emission). An earlier draft
mentioned storing `run_id` alongside; that's redundant given the event
id's uniqueness contract and was removed in this revision.

## Backends

- **`InMemoryPredicateStateBackend`**: keeps current behavior; used
for tests + the standalone `ironclaw_hooks` integration tests that
don't want to spin up a database.
- **`PostgresPredicateStateBackend`**: production. Schema:
```sql
CREATE TABLE hook_invocation_events (
tenant_id text NOT NULL,
hook_id bytea NOT NULL,
capability text NOT NULL,
occurred_at timestamptz NOT NULL,
event_id uuid PRIMARY KEY -- for replay dedup
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The schema dedupes event_id globally, but the backend contract dedupes only within a counter key. Because caller_event_id is per capability invocation, two predicate-backed hooks observing the same invocation would collide here and the second hook would not count its own bucket. Please scope uniqueness to (tenant_id, hook_id, capability, event_id) for invocations and (tenant_id, hook_id, capability, field, event_id) for values, then align the replay-refusal docs with that scope.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in b4d8a355f.

Schema PKs changed to composite uniqueness scoped to the counter key:

  • hook_invocation_events: PRIMARY KEY (tenant_id, hook_id, capability, event_id)
  • hook_value_events: PRIMARY KEY (tenant_id, hook_id, capability, field, event_id)

The trait's replay-refusal docs in predicate_state.rs were rewritten to spell out the per-key scope explicitly and to specify the corresponding INSERT … ON CONFLICT (tenant, hook, capability[, field], event_id) DO NOTHING shape that durable backends must use. Two predicate hooks observing the same invocation now record into distinct buckets — no first-writer-wins undercounting.

);
CREATE INDEX ix_hook_invocation_events_window ON hook_invocation_events
(tenant_id, hook_id, capability, occurred_at DESC);

CREATE TABLE hook_value_events (
tenant_id text NOT NULL,
hook_id bytea NOT NULL,
capability text NOT NULL,
field text NOT NULL,
occurred_at timestamptz NOT NULL,
value numeric NOT NULL,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For the libSQL implementation, the value column should be TEXT to preserve rust_decimal precision, as noted in src/db/CLAUDE.md. The current Postgres schema uses numeric, which is correct for PG, but the design should acknowledge the dialect difference for libSQL to ensure parity across both backends.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in doc revision (lines 120-128). The libSQL impl is now explicitly called out as having two material differences from the Postgres impl mandated by src/db/CLAUDE.md:

  • value column is TEXT NOT NULL (not numeric), because libSQL's integer/real types can't preserve rust_decimal precision; the backend serializes via Decimal::to_string() / from_str() at the row boundary.
  • occurred_at is stored as ISO-8601 TEXT (libSQL convention).

event_id uuid PRIMARY KEY
);
CREATE INDEX ix_hook_value_events_window ON hook_value_events
(tenant_id, hook_id, capability, field, occurred_at DESC);
```
- **`LibSqlPredicateStateBackend`**: parity with the rest of the
dual-backend story (per `src/db/CLAUDE.md`). Same shape as the
Postgres impl, with two material differences mandated by
`src/db/CLAUDE.md`:
- `value` column is `TEXT NOT NULL` (not numeric), because libSQL's
integer/real types can't preserve `rust_decimal` precision; the
backend serializes via `Decimal::to_string()` / `from_str()` at
the row boundary.
- `occurred_at` is stored as ISO-8601 `TEXT` (libSQL convention).

## Migration / coexistence

- `PredicateEvaluator::new()` keeps the in-memory default.
- `PredicateEvaluator::with_backend(backend)` opt-in for production.
- The registrar / Reborn factory wires the backend per host.

## Required tests

1. **Replay refusal**: record the same `event_id` twice → second call
no-ops, count unchanged.
2. **Cross-process**: two backends pointing at the same database,
one increments, the other reads the updated count.
3. **Restart survival**: backend roundtrip after a connection cycle.
4. **Eviction policy**: `evict_older_than` clears expired rows; called
periodically by a reaper task.
5. **Tenant isolation**: tenant A's writes don't show up in tenant B's
reads (already pinned for in-memory by `HistoryKey` — confirm
backend-side).

## Risk

- Touches `src/db/` migrations machinery. See `.claude/rules/database.md`
for the dual-backend conventions.
- Performance: every predicate evaluation becomes a DB round-trip.
An earlier draft suggested batching writes at the dispatcher's tick
boundary. **That conflicts with requirement #1 (cross-process
consistency)**: a deferred write from host A wouldn't be visible to
host B's read until the next tick, so two concurrent hosts could
each see "under cap" simultaneously and both proceed past `max`
(gemini's review on the prior draft). Resolution: the v1 production
backend keeps writes synchronous (read-your-own-writes within the
call); the in-process cache stays per-dispatch-only. A future
optimization could batch *reads* — collapse N predicate evaluations
in one dispatch into a single batch SELECT — but never writes.

## Effort

Medium-Large. Schema migration + backend impls are mechanical; the
performance-conscious read/write batching is the design-discussion
piece.
Loading
Loading