joshuatownsend · May 16, 2026 · chatgpt-codex-connector · May 16, 2026
diff --git a/docs/architecture/adr/index.md b/docs/architecture/adr/index.md
@@ -0,0 +1,71 @@
+---
+title: Architecture Decision Records
+description: Index of every accepted ADR, with status and one-line summary.
+---
+
+# Architecture Decision Records
+
+Architecture Decision Records (ADRs) document **why** a particular architectural choice was made: the context, the alternatives considered, and the trade-offs accepted. Once accepted, an ADR's content is not edited; if a decision changes, a new ADR supersedes the old one, and only the old ADR's status is updated.
+
+For background on the ADR practice, see Michael Nygard's [original post](https://www.cognitect.com/blog/2011/11/15/documenting-architecture-decisions). PAL-X follows the standard lightweight pattern: one Markdown file per decision, numbered chronologically, with a status field.
+
+## Status legend
+
+| Status | Meaning |
+|---|---|
+| `Proposed` | Under discussion. Not yet implemented. |
+| `Accepted` | Decided, ratified, implemented. |
+| `Deprecated` | Decision still applies historically but a new ADR supersedes the recommendation. |
+| `Superseded by ADR-####` | Replaced. The new ADR documents the change. |
+
+No ADRs are currently `Deprecated` or `Superseded`.
+
+## Accepted ADRs
+
+| # | Title | Date | Status | One-line summary |
+|---:|---|---|---|---|
+| 0001 | [Ratified Deviations from Seed Documentation](0001-deviations-from-seed-docs.md) | 2026-04-23 | Accepted | 12 deviations from the ChatGPT-generated seed docs, ratified at project kickoff: tri-state status (no 0–100 score), declarative comparators (no DSL), content-hash IDs, snake_case fields, `host_context` in v1 schema, Spectre.Console.Cli over System.CommandLine, ScottPlot for charts, and others. |
+| 0002 | [Declarative Rule Schema Instead of Custom DSL](0002-declarative-rule-schema.md) | 2026-04-23 | Accepted | Rule conditions are declarative — `metric` + `aggregation` + `operator` + `threshold` + `duration_percent` + optional `window`. No expression parser. Trades expressivity for stability, auditability, and zero parser maintenance. |
+| 0003 | [Pack Signing Format and Trust Model](0003-pack-signing-format.md) | 2026-04-27 | Accepted | RSA-PSS-SHA256 with 3072-bit keys, signing raw `pack.yaml` bytes, sidecar file at `pack.yaml.sig`. BCL-only (no NuGet dep). Trust model is consumer-rooted via embedded project key + CLI `--trust-key`. |
+| 0004 | [Schema v1.1: Rolling-Window Aggregations (In-Place Enum Bump)](0004-schema-v1.1-rolling-windows.md) | 2026-04-27 | Accepted | Pack schema gains rolling-window aggregations via an additive `window:` field on `Condition`. Schema discriminator is in-place (`schema_version: "pal.pack/v1.1"`); no new JSON Schema file. Validator gates `window:` on the v1.1 version. |
+
+## Reading an ADR
+
+Each ADR follows the same structure:
+
+- **Context** — the problem being solved and the constraints.
+- **Decisions** — what was chosen, often broken into sub-decisions.
+- **Consequences** — what changed, what we gave up, what's now harder or easier.
+- **Alternatives considered** — what we didn't pick and why.
+
+The most important read for new contributors is **[ADR 0001](0001-deviations-from-seed-docs.md)** — it documents every design choice that diverges from the seeded ChatGPT spec, and the diverging choice is the load-bearing one in nearly every case.
+
+## Authoring a new ADR
+
+When a non-trivial architectural decision is made:
+
+1. Number the new ADR sequentially (e.g., `0005-…`).
+2. Use the same heading structure as the existing ones.
+3. Set status to `Accepted` once the decision is final; don't ship `Proposed` ADRs in production branches.
+4. Link to the ADR from any code or doc that implements it, and link back from the ADR; bidirectional references catch drift.
+5. Add an entry to this index.
+
+If an ADR supersedes an earlier one, update the earlier ADR's status to `Superseded by ADR-####` and link forward.
+
+ADRs are best written when the decision is made, not reconstructed afterwards. If a decision was made informally and you're documenting it after the fact, that's fine, but say so in the Context section. Date the ADR with when it was written; date the decision (in Context) with when it was made.
+
+## Where ADRs are NOT the answer
+
+The format is lightweight, but every ADR is a permanent record that future contributors have to read. Don't write one for:
+
+- **Bug fixes** — those live in commit messages and PR descriptions.
+- **Refactors that preserve external behaviour** — same.
+- **Tactical implementation choices** — e.g., "use a `HashSet` here" doesn't need an ADR.
+- **Configuration defaults** — those belong in `appsettings.json` and **[Reference — Configuration](../../reference/configuration.md)**.
+
+ADRs are for **decisions that constrain future work** — choices a future contributor needs to know about to avoid relitigating, breaking, or reversing without strong cause.
+
+## Related
+
+- **[Architecture index](../index.md)** — the broader architecture context.
+- **[Data flow](../data-flow.md)** / **[Persistence](../persistence.md)** / **[Schema evolution](../schema-evolution.md)** — implementations of these decisions.
diff --git a/docs/architecture/adr/toc.yml b/docs/architecture/adr/toc.yml
@@ -0,0 +1,10 @@
+- name: Index
+  href: index.md
+- name: "0001 — Deviations from seed docs"
+  href: 0001-deviations-from-seed-docs.md
+- name: "0002 — Declarative rule schema"
+  href: 0002-declarative-rule-schema.md
+- name: "0003 — Pack signing format"
+  href: 0003-pack-signing-format.md
+- name: "0004 — Schema v1.1 rolling windows"
+  href: 0004-schema-v1.1-rolling-windows.md
diff --git a/docs/architecture/data-flow.md b/docs/architecture/data-flow.md
@@ -0,0 +1,215 @@
+---
+title: Data flow
+description: End-to-end — from a counter file on disk to a finding in a report — with the types and components at each hop.
+---
+
+# Data flow
+
+This is the runtime story: how a `.csv` or `.blg` becomes a finding in a report. Six hops, two modes (CLI synchronous vs API asynchronous), one engine.
+
+For the per-component reference, see the **[Project map](index.md#project-map)** on the architecture index.
+
+## The engine pipeline (same in all modes)
+
+```text
+                          ┌─────────────────────────────────────┐
+                          │             RAW INPUT               │
+                          │   capture.csv  or  capture.blg      │
+                          └─────────────┬───────────────────────┘
+                                        │
+            (1) Collector dispatch by file extension
+                                        │
+                  ┌──────────────────────┴──────────────────────┐
+                  ▼                                             ▼
+        ┌──────────────────┐                       ┌─────────────────────┐
+        │   CsvCollector   │                       │    BlgCollector     │
+        │  (any platform)  │                       │   (Windows / PDH)   │
+        └─────────┬────────┘                       └──────────┬──────────┘
+                  │                                            │
+                  └────────────────────┬───────────────────────┘
+                                       │
+                              raw counter paths
+                                       │
+            (2) MetricAliasRegistry normalises paths to canonical IDs
+                                       │
+                                       ▼
+                       ┌─────────────────────────────┐
+                       │           Dataset           │
+                       │  series[], samples, gaps,   │
+                       │       host_context          │
+                       └───────────────┬─────────────┘
+                                       │
+            (3) PackLoader reads YAML; PackValidator gates malformed packs
+                                       │
+                                       ▼
+                       ┌─────────────────────────────┐
+                       │       Pack[] in memory      │
+                       │     applicability filter    │
+                       └───────────────┬─────────────┘
+                                       │
+            (4) RuleEngine evaluates conditions against series
+                                       │
+                                       ▼
+                       ┌─────────────────────────────┐
+                       │         Finding[]           │
+                       │   evidence + statistics     │
+                       │  sorted: sev/cat/rule/id    │
+                       └───────────────┬─────────────┘
+                                       │
+            (5) Report writers serialise
+                                       │
+                  ┌────────────────────┼────────────────────┐
+                  ▼                    ▼                    ▼
+         ┌──────────────┐    ┌──────────────┐     ┌──────────────────┐
+         │  JSON report │    │  HTML report │     │ Markdown report  │
+         │  (canonical) │    │ (browser UX) │     │   (optional)     │
+         └──────────────┘    └──────────────┘     └──────────────────┘
+                                       │
+            (6) ScottPlot writes SVG charts (optional)
+                                       │
+                                       ▼
+                       ┌─────────────────────────────┐
+                       │     charts/*.svg            │
+                       └─────────────────────────────┘
+```
+
+## Hop 1 — Collector dispatch
+
+`CollectorFactory.For(path)` looks at the file extension:
+
+- `.csv` → `CsvCollector` (any platform).
+- `.blg` → `BlgCollector` (Windows-only, throws `PlatformNotSupportedException` elsewhere with a `relog -f CSV` fallback message).
+
+Both collectors emit the same `Dataset` shape — downstream code can't tell them apart.
+
+The CSV path is text — read line by line, parse perfmon's CSV header for counter paths, parse samples by column. The BLG path is binary — open via PDH (`Pdh.dll`), enumerate counters, fetch samples through `PdhCollectQueryData`.
+
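The dispatch rule above can be sketched in a few lines. This is a language-neutral illustration in Python, not the .NET `CollectorFactory`; the function name, the case-insensitive extension match, and the exact wording of the fallback message are assumptions.

```python
import sys
from pathlib import PurePath


def collector_for(path: str, platform: str = sys.platform) -> str:
    """Pick a collector name from the file extension.

    Case-insensitive matching is an assumption; the source only says
    dispatch is by extension.
    """
    ext = PurePath(path).suffix.lower()
    if ext == ".csv":
        return "CsvCollector"  # works on any platform
    if ext == ".blg":
        if not platform.startswith("win"):
            # Mirrors the documented behaviour: fail with a hint to convert.
            raise OSError(
                "BLG needs Windows (PDH); convert first, e.g. "
                "relog capture.blg -f CSV -o capture.csv"
            )
        return "BlgCollector"
    raise ValueError(f"unsupported capture format: {ext!r}")
```

Passing the platform as a parameter keeps the Windows-only branch testable on any machine.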
+## Hop 2 — Canonical metric IDs
+
+Raw counter paths look like `\\WEB-01\Processor(_Total)\% Processor Time`. Rules don't reference paths; they reference canonical IDs like `processor.percent_processor_time`. `MetricAliasRegistry.Resolve(path)` runs the path against compiled regex patterns and returns the canonical ID, or `null` if nothing matches. An unmatched path is then given the ID `unknown.<sanitised>`.
+
+The registry's default entries are built into `Pal.Engine.Normalization.MetricAliasRegistry.BuildDefault()` — see **[Reference — Canonical metric IDs](../reference/metric-ids.md)** for the table. Pack-level `metric_aliases:` extends this registry per analysis.
+
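A minimal sketch of the resolve-then-fallback behaviour, in Python rather than the engine's own language. The single pattern and the sanitisation rule (lower-case, non-alphanumerics collapsed to `_`) are assumptions for illustration; the real pattern table is the one in the reference docs.

```python
import re

# Hypothetical default entry: any host, the _Total processor instance.
_PATTERNS = [
    (
        re.compile(r"^\\\\[^\\]+\\Processor\(_Total\)\\% Processor Time$", re.IGNORECASE),
        "processor.percent_processor_time",
    ),
]


def resolve(path):
    """Return the canonical ID for a raw counter path, or None if unmatched."""
    for pattern, canonical_id in _PATTERNS:
        if pattern.match(path):
            return canonical_id
    return None


def canonical_or_unknown(path):
    """Apply the unknown.<sanitised> fallback (sanitisation rule is assumed)."""
    resolved = resolve(path)
    if resolved is not None:
        return resolved
    sanitised = re.sub(r"[^a-z0-9]+", "_", path.lower()).strip("_")
    return f"unknown.{sanitised}"
```

Keeping `resolve` and the fallback separate matches the description above: the registry reports "no match", and the substitution happens afterwards.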
+## Hop 3 — Pack loading
+
+`PackLoader.Load(yamlPath, signatureRequirement, trustedKeys)`:
+
+1. Reads the YAML file.
+2. Parses into the `Pack` model (DTOs in `Pal.Engine.Model`).
+3. Optionally verifies the `pack.yaml.sig` sidecar.
+4. Hands the parsed pack to `PackValidator.Validate(pack)`.
+
+`PackValidator` is the source of truth for what constitutes a valid pack — every schema constraint (severity enum, aggregation enum, operator enum, window invariants) is enforced here, not at YAML parse time. Validation errors and warnings are returned to the caller; failures surface as exit code `4` from the CLI or `400/422` from the API.
+
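To illustrate validation that returns problems instead of throwing at parse time, here is a small Python sketch over an already-parsed pack. The dictionary layout (`rules`, `conditions`, `id`), the aggregation set, and the `5m`-style window value are assumptions; only the `pal.pack/v1.1` gate on `window` comes from ADR 0004.

```python
# Hypothetical enum members; only avg, p95 and trend are named in this document.
AGGREGATIONS = {"avg", "p95", "trend"}
WINDOWED_VERSION = "pal.pack/v1.1"


def validate_pack(pack: dict) -> list[str]:
    """Collect every problem found rather than stopping at the first one."""
    errors = []
    version = pack.get("schema_version")
    for rule in pack.get("rules", []):
        for index, cond in enumerate(rule.get("conditions", [])):
            where = f"{rule.get('id', '?')}.conditions[{index}]"
            if cond.get("aggregation") not in AGGREGATIONS:
                errors.append(f"{where}: unknown aggregation {cond.get('aggregation')!r}")
            # window: is only legal once the pack declares the v1.1 schema.
            if "window" in cond and version != WINDOWED_VERSION:
                errors.append(f"{where}: 'window' requires schema_version {WINDOWED_VERSION}")
    return errors
```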
+`PackRegistrySyncService` (API only) drives the loader at startup: it walks `Packs:Directory`, loads each `pack.yaml`, and persists the result into Postgres so the API has a database-backed pack registry alongside the disk source.
+
+## Hop 4 — Rule evaluation
+
+The heart of the engine. `RuleEngine.Evaluate(dataset, packs)`:
+
+```text
+for each pack:
+  if pack.applicability matches dataset:
+    for each rule:
+      if rule.applies_when matches:
+        for each condition:
+          select series (canonical_metric + optional instance filter)
+          compute aggregation (avg, p95, ..., trend, or window-bounded)
+          compare to threshold (number or host_context-resolved)
+          check duration_percent
+        if all conditions satisfied:
+          emit Finding with evidence
+sort findings: severity desc, category asc, rule_id asc, finding_id asc
+```
+
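The innermost step of the loop above, evaluating one condition, might look like the following Python sketch. Two things here are assumptions and not taken from the source: the operator spellings, and the reading of `duration_percent` as the minimum share of samples that individually satisfy the comparison.

```python
import operator
from dataclasses import dataclass
from statistics import fmean

# Assumed operator spellings and a reduced aggregation set, for illustration only.
OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}
AGGREGATIONS = {"avg": fmean, "min": min, "max": max}


@dataclass(frozen=True)
class Condition:
    metric: str
    aggregation: str
    operator: str
    threshold: float
    duration_percent: float = 0.0


def condition_met(cond: Condition, samples: list) -> bool:
    """True when the aggregate breaches the threshold for long enough."""
    if not samples:
        return False
    compare = OPERATORS[cond.operator]
    if not compare(AGGREGATIONS[cond.aggregation](samples), cond.threshold):
        return False
    # Assumed semantics: share of individual samples that breach the threshold.
    breaching = sum(1 for value in samples if compare(value, cond.threshold))
    return 100.0 * breaching / len(samples) >= cond.duration_percent
```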
+A few important properties:
+
+- **Determinism.** Two runs against the same dataset with the same packs produce identical findings (modulo `generated_at_utc`, overridable with `--now`). The sort order is total, with `finding_id` (a content hash) as the final tiebreaker.
+- **`host_context` is informational-fallback.** If a rule references `host_context.total_physical_memory_mb` and the value is unknown, the rule is skipped and an informational warning is emitted. The run still succeeds.
+- **Pack-level `applicability` is a fast skip.** If `requires_any` doesn't match the dataset's metric set, the pack's rules are never evaluated. Rule-level `applies_when` is a per-rule equivalent.
+
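The determinism property rests on a total sort order with a content-derived tiebreaker. A Python sketch of that idea follows; the severity names, the fields hashed into `finding_id`, and the 16-character truncation are all hypothetical, since the source states only that the ID is a content hash.

```python
import hashlib
import json
from dataclasses import dataclass

# Hypothetical severity names; higher rank sorts first.
SEVERITY_RANK = {"critical": 3, "warning": 2, "info": 1}


@dataclass(frozen=True)
class Finding:
    severity: str
    category: str
    rule_id: str
    evidence: tuple

    @property
    def finding_id(self) -> str:
        # Hash a canonical serialisation so equal content gives an equal ID.
        payload = json.dumps(
            [self.rule_id, self.category, self.severity, list(self.evidence)],
            separators=(",", ":"),
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def sort_findings(findings):
    """Severity desc, then category, rule_id and finding_id ascending."""
    return sorted(
        findings,
        key=lambda f: (-SEVERITY_RANK[f.severity], f.category, f.rule_id, f.finding_id),
    )
```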
+`Finding` carries everything needed to render the result: rule metadata, category, severity, the resolved evidence (series + statistics + trigger expression), and inlined recommendations from the pack's `recommendations:` map.
+
+## Hop 5 — Report writing
+
+Three writers, one shared shape:
+
+- `JsonReportWriter` — emits `pal.report/v1` JSON. Canonical; downstream consumers read this.
+- `HtmlReportWriter` — emits a self-contained HTML page. Derived view; renders the same data with a human-friendly layout.
+- `MarkdownReportWriter` — emits GFM tables. Derived view; only invoked when explicitly requested.
+
+All three call `JsonReportWriter.WriteInput(...)` internally to compose the report model, then serialise to their target format. This is why golden-fixture tests work — the writers are deterministic transforms of a fixed-input model.
+
+UTF-8 without BOM is enforced via `new UTF8Encoding(false)` on every write. This is non-negotiable: golden tests compare output byte for byte, and a BOM would break them.
+
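To show why the BOM matters for byte comparison, here is the same idea in Python, as an analogue and not the engine's code. Python's `utf-8` codec never writes a BOM, while `utf-8-sig` prepends three bytes; pinning `newline="\n"` is an extra assumption added here so line endings are also stable.

```python
import codecs


def write_report(path: str, text: str) -> None:
    """Write UTF-8 without a BOM and with fixed LF line endings."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(text)


def has_bom(path: str) -> bool:
    """True if the file starts with the three-byte UTF-8 BOM (EF BB BF)."""
    with open(path, "rb") as handle:
        return handle.read(3) == codecs.BOM_UTF8
```

A golden test that compares raw bytes would see those three extra bytes as a mismatch even though the decoded text is identical.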
+## Hop 6 — Chart SVGs (optional)
+
+If `--include-charts` is set (CLI) or charts are otherwise requested, the engine attaches `ChartRef` entries to findings and writes SVGs via `ScottPlot.Plot.Save`. One SVG per (finding × metric) pair, capped by `--chart-limit` (default 20).
+
+Charts are written to `out/charts/<report-name>-<chart-id>.svg`. The HTML report embeds them inline. The JSON report references them by relative path in each finding's `evidence.charts[]`.
+
+ScottPlot's SVG output is canonicalised by `SvgCanonicalizer` before write — IDs are normalised so two runs produce byte-identical SVGs. Without this step, ScottPlot's gradient IDs include process-local counters that would defeat determinism.
+
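One plausible way to canonicalise IDs is to rename them in order of first appearance and rewrite every reference. The Python sketch below illustrates that approach only; the actual rules inside `SvgCanonicalizer` are not shown in this document, and the `idN` naming scheme is an assumption.

```python
import re

_ID_ATTR = re.compile(r'\bid="([^"]+)"')


def canonicalise_ids(svg: str) -> str:
    """Rename ids to id0, id1, ... by first appearance and fix references."""
    mapping = {}
    for original in _ID_ATTR.findall(svg):
        mapping.setdefault(original, f"id{len(mapping)}")
    if not mapping:
        return svg
    # Longest names first so "grad1" never shadows "grad17".
    names = "|".join(re.escape(n) for n in sorted(mapping, key=len, reverse=True))
    refs = re.compile(rf'(\bid="|url\(#|href="#)({names})(?=["\)])')
    return refs.sub(lambda m: m.group(1) + mapping[m.group(2)], svg)
```

Two renders that differ only in a process-local counter then collapse to the same text, which is what byte-identical output requires.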
+## Two runtime modes share the pipeline
+
+### CLI — synchronous
+
+```text
+                       ┌─────────────┐
+   user typed args ───►│  pal CLI    │
+                       │(synchronous)│
+                       └──────┬──────┘
+                              │
+                              ▼
+                  the 6 hops above, in process
+                              │
+                              ▼
+                       writes to ./out/
+                              │
+                              ▼
+                     exits with status code
+```
+
+`AnalyzeCommand.ExecuteAsync` orchestrates the collectors, the engine, and the writers. Failures map to `ExitCodes.*` per **[Reference — Exit codes](../reference/exit-codes.md)**.
+
+### API — asynchronous
+
+```text
+                       ┌───────────────┐                          ┌────────────────┐
+   POST /analysis ────►│   HTTP        │──► writes job row ──────►│   Postgres     │
+                       │  handler      │                          └────────────────┘
+                       │ enqueues Guid │
+                       └──────┬────────┘
+                              │
+                              ▼
+                       Channel<Guid> (in-process, single-reader)
+                              │
+                              ▼
+                       ┌──────────────┐
+                       │AnalysisWorker│ (BackgroundService)
+                       └──────┬───────┘
+                              │
+                              ▼
+                  the 6 hops above, same code
+                              │
+                              ▼
+                       writes JSON/HTML to disk + result row to Postgres
+                              │
+                              ▼
+                       (auto-compare if selectedBaselineId set)
+                              │
+                              ▼
+                       (policy evaluation → alerts → webhook delivery)
+```
+
+The engine pipeline is identical. What's different is the orchestration: HTTP enqueues, the worker dequeues, repositories persist, and additional services (`PolicyEvaluator`, `IAutoCompareService`, `NotificationService`) extend the post-analysis flow with alerting and comparisons.
+
+The in-process `Channel<Guid>` keeps the API simple: no external message broker, no Postgres `LISTEN/NOTIFY`. The trade-off is durability. The channel lives in memory, so if the API process crashes, queued-but-not-started jobs are lost. Jobs that had started but not finished are detected on restart and marked `failed`. Closing this gap is documented as a Phase 5 improvement candidate.
+
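The queue-and-worker shape and its crash behaviour can be sketched with Python's `asyncio.Queue` standing in for `Channel<Guid>`. The status names other than `failed`, and the dictionary standing in for the Postgres job table, are assumptions for illustration.

```python
import asyncio
import uuid


async def worker(queue: asyncio.Queue, jobs: dict) -> None:
    """Single reader: take one job ID at a time and run the pipeline."""
    while True:
        job_id = await queue.get()
        jobs[job_id] = "running"
        await asyncio.sleep(0)  # placeholder for the six pipeline hops
        jobs[job_id] = "succeeded"
        queue.task_done()


def recover_on_restart(jobs: dict) -> None:
    """Jobs left mid-flight by a crash are marked failed."""
    for job_id, status in jobs.items():
        if status == "running":
            jobs[job_id] = "failed"


async def demo() -> str:
    jobs = {}  # stands in for the job table
    queue = asyncio.Queue()  # in memory only, like the channel
    task = asyncio.create_task(worker(queue, jobs))
    job_id = uuid.uuid4()
    jobs[job_id] = "queued"  # the handler writes the row, then enqueues
    queue.put_nowait(job_id)
    await queue.join()
    task.cancel()
    return jobs[job_id]
```

Note what `recover_on_restart` cannot do: a row still marked as queued has no counterpart in the new process's empty queue, which is the lost-job case described above.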
+## Related
+
+- **[Persistence](persistence.md)** — what gets stored after the pipeline completes.
+- **[Schema evolution](schema-evolution.md)** — how the input contract evolves.
+- **[Reference — Report schema](../reference/report-schema.md)** — output shape.
+- **[Reference — Canonical metric IDs](../reference/metric-ids.md)** — the rewrite table for Hop 2.
+- **[ADR 0002 — Declarative Rule Schema](adr/0002-declarative-rule-schema.md)** — why Hop 4 doesn't have an expression parser.