docs(site): Architecture + ADR index — 5 pages #62
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| --- | ||
| title: Architecture Decision Records | ||
| description: Index of every accepted ADR, with status and one-line summary. | ||
| --- | ||
|
|
||
| # Architecture Decision Records | ||
|
|
||
| Architecture Decision Records (ADRs) document **why** a particular architectural choice was made — the context, the alternatives considered, the trade-offs accepted. They're written once and not edited; if a decision changes, a new ADR supersedes the old. | ||
|
|
||
| For background on the ADR practice, see Michael Nygard's [original post](https://www.cognitect.com/blog/2011/11/15/documenting-architecture-decisions). PAL-X follows the standard lightweight pattern: one Markdown file per decision, numbered chronologically, with a status field. | ||
|
|
||
| ## Status legend | ||
|
|
||
| | Status | Meaning | | ||
| |---|---| | ||
| | `Proposed` | Under discussion. Not yet implemented. | | ||
| | `Accepted` | Decided, ratified, implemented. | | ||
| | `Deprecated` | Decision still applies historically but a new ADR supersedes the recommendation. | | ||
| | `Superseded by ADR-####` | Replaced. The new ADR documents the change. | | ||
|
|
||
| No ADRs are currently `Deprecated` or `Superseded`. | ||
|
|
||
| ## Accepted ADRs | ||
|
|
||
| | # | Title | Date | Status | One-line summary | | ||
| |---:|---|---|---|---| | ||
| | 0001 | [Ratified Deviations from Seed Documentation](0001-deviations-from-seed-docs.md) | 2026-04-23 | Accepted | 12 deviations from the ChatGPT-generated seed docs, ratified at project kickoff: tri-state status (no 0–100 score), declarative comparators (no DSL), content-hash IDs, snake_case fields, `host_context` in v1 schema, Spectre.Console.Cli over System.CommandLine, ScottPlot for charts, and others. | | ||
| | 0002 | [Declarative Rule Schema Instead of Custom DSL](0002-declarative-rule-schema.md) | 2026-04-23 | Accepted | Rule conditions are declarative — `metric` + `aggregation` + `operator` + `threshold` + `duration_percent` + optional `window`. No expression parser. Trades expressivity for stability, auditability, and zero parser maintenance. | | ||
| | 0003 | [Pack Signing Format and Trust Model](0003-pack-signing-format.md) | 2026-04-27 | Accepted | RSA-PSS-SHA256 with 3072-bit keys, signing raw `pack.yaml` bytes, sidecar file at `pack.yaml.sig`. BCL-only (no NuGet dep). Trust model is consumer-rooted via embedded project key + CLI `--trust-key`. | | ||
| | 0004 | [Schema v1.1: Rolling-Window Aggregations (In-Place Enum Bump)](0004-schema-v1.1-rolling-windows.md) | 2026-04-27 | Accepted | Pack schema gains rolling-window aggregations via an additive `window:` field on `Condition`. Schema discriminator is in-place (`schema_version: "pal.pack/v1.1"`); no new JSON Schema file. Validator gates `window:` on the v1.1 version. | | ||
|
|
||
| ## Reading an ADR | ||
|
|
||
| Each ADR follows the same structure: | ||
|
|
||
| - **Context** — the problem being solved and the constraints. | ||
| - **Decisions** — what was chosen, often broken into sub-decisions. | ||
| - **Consequences** — what changed, what we gave up, what's now harder or easier. | ||
| - **Alternatives considered** — what we didn't pick and why. | ||
|
|
||
| The most important read for new contributors is **[ADR 0001](0001-deviations-from-seed-docs.md)** — it documents every design choice that diverges from the seeded ChatGPT spec, and the diverging choice is the load-bearing one in nearly every case. | ||
|
|
||
| ## Authoring a new ADR | ||
|
|
||
| When a non-trivial architectural decision is made: | ||
|
|
||
| 1. Number the new ADR sequentially (e.g., `0005-…`). | ||
| 2. Use the same heading structure as the existing ones. | ||
| 3. Set status to `Accepted` once the decision is final; don't ship `Proposed` ADRs in production branches. | ||
| 4. Link to the ADR from any code or doc that implements it — bidirectional references catch drift. | ||
| 5. Add an entry to this index. | ||
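The numbering convention in step 1 can be checked mechanically. The following is a minimal sketch, assuming ADR files follow the `NNNN-slug.md` pattern seen in the index; the helper `next_adr_number` and its regular expression are hypothetical and not part of the project tooling.

```python
import re

# Assumed naming pattern: four digits, a hyphen, a lowercase slug, ".md".
ADR_FILENAME = re.compile(r"^(\d{4})-[a-z0-9.\-]+\.md$")

def next_adr_number(filenames):
    """Return the next zero-padded ADR number for a directory listing."""
    numbers = [
        int(match.group(1))
        for name in filenames
        if (match := ADR_FILENAME.match(name))
    ]
    return f"{max(numbers, default=0) + 1:04d}"

existing = [
    "index.md",  # ignored: no numeric prefix
    "0001-deviations-from-seed-docs.md",
    "0002-declarative-rule-schema.md",
    "0003-pack-signing-format.md",
    "0004-schema-v1.1-rolling-windows.md",
]
print(next_adr_number(existing))  # 0005
```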
|
|
||
| If an ADR supersedes an earlier one, update the earlier ADR's status to `Superseded by ADR-####` and link forward. | ||
|
|
||
| ADRs are not meant to be retrospective documentation. If a decision was made informally and you're documenting it after the fact, that's acceptable, but say so explicitly in the Context section. Date the ADR with the day it was written, and record in Context the date the decision was actually made. | ||
|
|
||
| ## Where ADRs are NOT the answer | ||
|
|
||
| ADRs are heavyweight. Don't write one for: | ||
|
|
||
| - **Bug fixes** — those live in commit messages and PR descriptions. | ||
| - **Refactors that preserve external behaviour** — same. | ||
| - **Tactical implementation choices** — e.g., "use a `HashSet` here" doesn't need an ADR. | ||
| - **Configuration defaults** — those belong in `appsettings.json` and **[Reference — Configuration](../../reference/configuration.md)**. | ||
|
|
||
| ADRs are for **decisions that constrain future work** — choices a future contributor needs to know about to avoid relitigating, breaking, or reversing without strong cause. | ||
|
|
||
| ## Related | ||
|
|
||
| - **[Architecture index](../index.md)** — the broader architecture context. | ||
| - **[Data flow](../data-flow.md)** / **[Persistence](../persistence.md)** / **[Schema evolution](../schema-evolution.md)** — implementations of these decisions. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| - name: Index | ||
| href: index.md | ||
| - name: "0001 — Deviations from seed docs" | ||
| href: 0001-deviations-from-seed-docs.md | ||
| - name: "0002 — Declarative rule schema" | ||
| href: 0002-declarative-rule-schema.md | ||
| - name: "0003 — Pack signing format" | ||
| href: 0003-pack-signing-format.md | ||
| - name: "0004 — Schema v1.1 rolling windows" | ||
| href: 0004-schema-v1.1-rolling-windows.md |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,215 @@ | ||
| --- | ||
| title: Data flow | ||
| description: End-to-end — from a counter file on disk to a finding in a report — with the types and components at each hop. | ||
| --- | ||
|
|
||
| # Data flow | ||
|
|
||
| This is the runtime story: how a `.csv` or `.blg` becomes a finding in a report. Six hops, two modes (CLI synchronous vs API asynchronous), one engine. | ||
|
|
||
| For the per-component reference, see the **[Project map](index.md#project-map)** on the architecture index. | ||
|
|
||
| ## The engine pipeline (same in all modes) | ||
|
|
||
| ```text | ||
| ┌─────────────────────────────────────┐ | ||
| │ RAW INPUT │ | ||
| │ capture.csv or capture.blg │ | ||
| └─────────────┬───────────────────────┘ | ||
| │ | ||
| (1) Collector dispatch by file extension | ||
| │ | ||
| ┌──────────────────────┴──────────────────────┐ | ||
| ▼ ▼ | ||
| ┌──────────────────┐ ┌─────────────────────┐ | ||
| │ CsvCollector │ │ BlgCollector │ | ||
| │ (any platform) │ │ (Windows / PDH) │ | ||
| └─────────┬────────┘ └──────────┬──────────┘ | ||
| │ │ | ||
| └────────────────────┬───────────────────────┘ | ||
| │ | ||
| raw counter paths | ||
| │ | ||
| (2) MetricAliasRegistry normalises paths to canonical IDs | ||
| │ | ||
| ▼ | ||
| ┌─────────────────────────────┐ | ||
| │ Dataset │ | ||
| │ series[], samples, gaps, │ | ||
| │ host_context │ | ||
| └───────────────┬─────────────┘ | ||
| │ | ||
| (3) PackLoader reads YAML; PackValidator gates malformed packs | ||
| │ | ||
| ▼ | ||
| ┌─────────────────────────────┐ | ||
| │ Pack[] in memory │ | ||
| │ applicability filter │ | ||
| └───────────────┬─────────────┘ | ||
| │ | ||
| (4) RuleEngine evaluates conditions against series | ||
| │ | ||
| ▼ | ||
| ┌─────────────────────────────┐ | ||
| │ Finding[] │ | ||
| │ evidence + statistics │ | ||
| │ sorted: sev/cat/rule/id │ | ||
| └───────────────┬─────────────┘ | ||
| │ | ||
| (5) Report writers serialise | ||
| │ | ||
| ┌────────────────────┼────────────────────┐ | ||
| ▼ ▼ ▼ | ||
| ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ | ||
| │ JSON report │ │ HTML report │ │ Markdown report │ | ||
| │ (canonical) │ │ (browser UX) │ │ (optional) │ | ||
| └──────────────┘ └──────────────┘ └──────────────────┘ | ||
| │ | ||
| (6) ScottPlot writes SVG charts (optional) | ||
| │ | ||
| ▼ | ||
| ┌─────────────────────────────┐ | ||
| │ charts/*.svg │ | ||
| └─────────────────────────────┘ | ||
| ``` | ||
|
|
||
| ## Hop 1 — Collector dispatch | ||
|
|
||
| `CollectorFactory.For(path)` looks at the file extension: | ||
|
|
||
| - `.csv` → `CsvCollector` (any platform). | ||
| - `.blg` → `BlgCollector` (Windows-only, throws `PlatformNotSupportedException` elsewhere with a `relog -f CSV` fallback message). | ||
|
|
||
| Both collectors emit the same `Dataset` shape — downstream code can't tell them apart. | ||
|
|
||
| The CSV path is text — read line by line, parse perfmon's CSV header for counter paths, parse samples by column. The BLG path is binary — open via PDH (`Pdh.dll`), enumerate counters, fetch samples through `PdhCollectQueryData`. | ||
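The dispatch rule can be illustrated with a short Python sketch. This is not the .NET `CollectorFactory`; the function `collector_for`, the `PlatformNotSupportedError` stand-in class, the case-insensitive extension match, and the error for other extensions are all assumptions made for illustration.

```python
import sys
from pathlib import Path

class PlatformNotSupportedError(RuntimeError):
    """Stand-in for .NET's PlatformNotSupportedException (illustrative)."""

def collector_for(path, platform=sys.platform):
    """Pick a collector name from the capture file's extension."""
    extension = Path(path).suffix.lower()  # case-insensitivity is assumed
    if extension == ".csv":
        return "CsvCollector"  # available on any platform
    if extension == ".blg":
        if not platform.startswith("win"):
            # BLG parsing relies on PDH, which only exists on Windows.
            raise PlatformNotSupportedError(
                "BLG files need Windows; convert with `relog -f CSV` first"
            )
        return "BlgCollector"
    raise ValueError(f"unsupported capture extension: {extension!r}")

print(collector_for("capture.csv", platform="linux"))  # CsvCollector
print(collector_for("capture.blg", platform="win32"))  # BlgCollector
```

Passing the platform as a parameter keeps the dispatch testable on any machine.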
|
|
||
| ## Hop 2 — Canonical metric IDs | ||
|
|
||
| Raw counter paths look like `\\WEB-01\Processor(_Total)\% Processor Time`. Rules don't reference paths; they reference canonical IDs like `processor.percent_processor_time`. `MetricAliasRegistry.Resolve(path)` matches the path against compiled regex patterns and returns the canonical ID, or `null` if nothing matches. An unmatched path is then assigned the ID `unknown.<sanitised>`. | ||
|
|
||
| The registry's default entries are built into `Pal.Engine.Normalization.MetricAliasRegistry.BuildDefault()` — see **[Reference — Canonical metric IDs](../reference/metric-ids.md)** for the table. Pack-level `metric_aliases:` extends this registry per analysis. | ||
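As a rough illustration of the resolve-then-fallback behaviour, here is a minimal Python sketch. The single alias pattern, the helper `canonical_or_unknown`, and the sanitisation rule (lowercase, non-alphanumeric runs collapsed to `_`) are assumptions; the real registry entries live in the reference table linked above.

```python
import re

# Hypothetical alias table: (compiled pattern, canonical ID).
ALIASES = [
    (
        re.compile(r"^\\\\[^\\]+\\Processor\(_Total\)\\% Processor Time$", re.IGNORECASE),
        "processor.percent_processor_time",
    ),
]

def resolve(path):
    """Return the canonical ID for a raw counter path, or None if unmatched."""
    for pattern, canonical_id in ALIASES:
        if pattern.match(path):
            return canonical_id
    return None

def canonical_or_unknown(path):
    """Apply the unknown.<sanitised> fallback (sanitisation rule is assumed)."""
    resolved = resolve(path)
    if resolved is not None:
        return resolved
    sanitised = re.sub(r"[^a-z0-9]+", "_", path.lower()).strip("_")
    return f"unknown.{sanitised}"

print(canonical_or_unknown(r"\\WEB-01\Processor(_Total)\% Processor Time"))
# processor.percent_processor_time
print(canonical_or_unknown(r"\\WEB-01\Foo\Bar Baz"))
# unknown.web_01_foo_bar_baz
```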
|
|
||
| ## Hop 3 — Pack loading | ||
|
|
||
| `PackLoader.Load(yamlPath, signatureRequirement, trustedKeys)`: | ||
|
|
||
| 1. Reads the YAML file. | ||
| 2. Parses into the `Pack` model (DTOs in `Pal.Engine.Model`). | ||
| 3. Optionally verifies the `pack.yaml.sig` sidecar. | ||
| 4. Hands the parsed pack to `PackValidator.Validate(pack)`. | ||
|
|
||
| `PackValidator` is the source of truth for what constitutes a valid pack — every schema constraint (severity enum, aggregation enum, operator enum, window invariants) is enforced here, not at YAML parse time. Validation errors and warnings are returned to the caller; failures surface as exit code `4` from the CLI or `400/422` from the API. | ||
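To make the idea of validating after parsing concrete, the sketch below checks an already-parsed pack (a plain dictionary) against enum sets and the version gate on `window:`. It is an illustration only: the enum members, the `pal.pack/v1` version string, the dictionary layout, and the function `validate_pack` are assumptions, not the actual `PackValidator` rules.

```python
# All enum members below are illustrative assumptions.
SEVERITIES = {"info", "warning", "critical"}
AGGREGATIONS = {"avg", "p95", "trend"}
OPERATORS = {">", ">=", "<", "<="}
WINDOW_SCHEMA_VERSION = "pal.pack/v1.1"

def validate_pack(pack):
    """Return a list of error strings; an empty list means the pack is valid."""
    errors = []
    version = pack.get("schema_version")
    for rule in pack.get("rules", []):
        rule_id = rule.get("id", "<missing id>")
        if rule.get("severity") not in SEVERITIES:
            errors.append(f"{rule_id}: invalid severity {rule.get('severity')!r}")
        for index, condition in enumerate(rule.get("conditions", [])):
            where = f"{rule_id}.conditions[{index}]"
            if condition.get("aggregation") not in AGGREGATIONS:
                errors.append(f"{where}: invalid aggregation")
            if condition.get("operator") not in OPERATORS:
                errors.append(f"{where}: invalid operator")
            # `window:` is only legal once the pack declares the v1.1 schema.
            if "window" in condition and version != WINDOW_SCHEMA_VERSION:
                errors.append(f"{where}: 'window' requires schema_version {WINDOW_SCHEMA_VERSION}")
    return errors

pack = {
    "schema_version": "pal.pack/v1",
    "rules": [{
        "id": "cpu.high",
        "severity": "warning",
        "conditions": [{"aggregation": "avg", "operator": ">", "threshold": 80, "window": "5m"}],
    }],
}
print(validate_pack(pack))
```

Collecting every error instead of stopping at the first one mirrors the statement that errors and warnings are returned to the caller.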
|
|
||
| `PackRegistrySyncService` (API only) drives the loader at startup: it walks `Packs:Directory`, loads each `pack.yaml`, and persists the result into Postgres so the API has a database-backed pack registry alongside the disk source. | ||
|
|
||
| ## Hop 4 — Rule evaluation | ||
|
|
||
| The heart of the engine. `RuleEngine.Evaluate(dataset, packs)`: | ||
|
|
||
| ```text | ||
| for each pack: | ||
|   if pack.applicability matches dataset: | ||
|     for each rule: | ||
|       if rule.applies_when matches: | ||
|         for each condition: | ||
|           select series (canonical_metric + optional instance filter) | ||
|           compute aggregation (avg, p95, ..., trend, or window-bounded) | ||
|           compare to threshold (number or host_context-resolved) | ||
|           check duration_percent | ||
|         if all conditions satisfied: | ||
|           emit Finding with evidence | ||
| sort findings: severity desc, category asc, rule_id asc, finding_id asc | ||
| ``` | ||
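The innermost step of the loop above (aggregate, compare, check duration) might look like the following Python sketch. It rests on stated assumptions: `duration_percent` is read here as the minimum share of samples that individually satisfy the comparison, `p95` uses the nearest-rank method, and the function names are hypothetical. The engine's actual semantics may differ.

```python
import math
import operator
import statistics

OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

def p95(samples):
    """Nearest-rank 95th percentile (one of several common definitions)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return ordered[rank - 1]

AGGREGATIONS = {"avg": statistics.fmean, "p95": p95}

def condition_met(samples, aggregation, op, threshold, duration_percent=0.0):
    """True when the aggregate breaches the threshold for long enough."""
    if not samples:
        return False  # nothing to evaluate
    compare = OPERATORS[op]
    aggregate_ok = compare(AGGREGATIONS[aggregation](samples), threshold)
    # Assumed meaning: percentage of samples that individually breach.
    breaching = sum(1 for sample in samples if compare(sample, threshold))
    duration_ok = 100.0 * breaching / len(samples) >= duration_percent
    return aggregate_ok and duration_ok

cpu = [50, 92, 95, 97, 99]  # average 86.6, four of five samples above 80
print(condition_met(cpu, "avg", ">", 80, duration_percent=75))  # True
print(condition_met(cpu, "avg", ">", 80, duration_percent=90))  # False
```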
|
|
||
| A few important properties: | ||
|
|
||
| - **Determinism.** Two runs against the same dataset with the same packs produce identical findings (modulo `generated_at_utc`, overridable with `--now`). The sort order is total, with `finding_id` (a content hash) as the final tiebreaker. | ||
| - **Missing `host_context` degrades gracefully.** If a rule references `host_context.total_physical_memory_mb` and the value is unknown, the rule is skipped and an informational warning is emitted. The run still succeeds. | ||
| - **Pack-level `applicability` is a fast skip.** If `requires_any` doesn't match the dataset's metric set, the pack's rules are never evaluated. Rule-level `applies_when` is a per-rule equivalent. | ||
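The determinism property depends on a total sort order with a content-derived tiebreaker. The sketch below shows one way to get both in Python. The severity names and ranks, the fields fed into the hash, the 16-character truncation, and the helper names are assumptions for illustration, not the engine's actual `finding_id` recipe.

```python
import hashlib
import json

SEVERITY_RANK = {"critical": 3, "warning": 2, "info": 1}  # assumed names

def finding_id(rule_id, metric, instance):
    """Content hash: the same inputs always yield the same ID."""
    payload = json.dumps(
        {"rule_id": rule_id, "metric": metric, "instance": instance},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]

def sort_findings(findings):
    """Severity descending, then category, rule_id, finding_id ascending."""
    return sorted(
        findings,
        key=lambda f: (-SEVERITY_RANK[f["severity"]], f["category"], f["rule_id"], f["finding_id"]),
    )

def make(severity, category, rule_id):
    return {
        "severity": severity,
        "category": category,
        "rule_id": rule_id,
        "finding_id": finding_id(rule_id, "example.metric", "_Total"),
    }

findings = [make("warning", "cpu", "cpu.high"), make("critical", "memory", "mem.low"), make("warning", "cpu", "cpu.queue")]
print([f["rule_id"] for f in sort_findings(findings)])
# ['mem.low', 'cpu.high', 'cpu.queue']
```

Because the key covers every distinguishing field, the result does not depend on the order in which findings were produced.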
|
|
||
| `Finding` carries everything needed to render the result: rule metadata, category, severity, the resolved evidence (series + statistics + trigger expression), and inlined recommendations from the pack's `recommendations:` map. | ||
|
|
||
| ## Hop 5 — Report writing | ||
|
|
||
| Three writers, one shared shape: | ||
|
|
||
| - `JsonReportWriter` — emits `pal.report/v1` JSON. Canonical; downstream consumers read this. | ||
| - `HtmlReportWriter` — emits a self-contained HTML page. Derived view; renders the same data with a human-friendly layout. | ||
| - `MarkdownReportWriter` — emits GFM tables. Derived view; only invoked when explicitly requested. | ||
|
|
||
| All three call `JsonReportWriter.WriteInput(...)` internally to compose the report model, then serialise to their target format. This is why golden-fixture tests work — the writers are deterministic transforms of a fixed-input model. | ||
|
|
||
| UTF-8 without BOM is enforced via `new UTF8Encoding(false)` on every write. This is non-negotiable: golden tests are byte-comparison, and a BOM would break them. | ||
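The effect of a BOM on byte comparison is easy to demonstrate. The Python sketch below is an analogue of `new UTF8Encoding(false)`, not the project's writer code; the helper names are hypothetical, and forcing `\n` line endings is an extra assumption that the original text does not state.

```python
import codecs
import os
import tempfile

def write_report(path, text):
    """Write text as UTF-8 with no byte-order mark."""
    # Python's "utf-8" codec never emits a BOM; "utf-8-sig" would prepend one.
    # newline="\n" is an added assumption: it keeps line endings identical
    # across platforms, which byte-level comparison also needs.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(text)

def starts_with_bom(path):
    """Report whether the file begins with the three UTF-8 BOM bytes."""
    with open(path, "rb") as handle:
        return handle.read(3) == codecs.BOM_UTF8

with tempfile.TemporaryDirectory() as tmp:
    report = os.path.join(tmp, "report.json")
    write_report(report, '{"findings": []}\n')
    print(starts_with_bom(report))  # False
```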
|
|
||
| ## Hop 6 — Chart SVGs (optional) | ||
|
|
||
| If `--include-charts` is set (CLI) or charts are otherwise requested, the engine attaches `ChartRef` entries to findings and writes SVGs via `ScottPlot.Plot.Save`. One SVG per (finding × metric) pair, capped by `--chart-limit` (default 20). | ||
|
|
||
| Charts are written to `out/charts/<report-name>-<chart-id>.svg`. The HTML report embeds them inline. The JSON report references them by relative path in each finding's `evidence.charts[]`. | ||
|
|
||
| ScottPlot's SVG output is canonicalised by `SvgCanonicalizer` before write — IDs are normalised so two runs produce byte-identical SVGs. Without this step, ScottPlot's gradient IDs include process-local counters that would defeat determinism. | ||
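One way such canonicalisation could work is to rename IDs by order of first appearance. The Python sketch below assumes IDs appear as `id="..."` attributes and are referenced through `url(#...)` or `href="#..."`; it is an illustration of the idea, not the behaviour of `SvgCanonicalizer`.

```python
import re

ID_ATTR = re.compile(r'\bid="([^"]+)"')
ID_REF = re.compile(r'(url\(#|href="#)([^)"]+)')

def canonicalise_ids(svg):
    """Rename every id to id0, id1, ... in order of first appearance."""
    mapping = {}
    for original in ID_ATTR.findall(svg):
        if original not in mapping:
            mapping[original] = f"id{len(mapping)}"
    # Rewrite the definitions, then the references that point at them.
    svg = ID_ATTR.sub(lambda m: f'id="{mapping[m.group(1)]}"', svg)
    svg = ID_REF.sub(lambda m: m.group(1) + mapping.get(m.group(2), m.group(2)), svg)
    return svg

run_a = '<svg><linearGradient id="grad17"/><rect fill="url(#grad17)"/></svg>'
run_b = '<svg><linearGradient id="grad42"/><rect fill="url(#grad42)"/></svg>'
print(canonicalise_ids(run_a) == canonicalise_ids(run_b))  # True
```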
|
|
||
| ## Two runtime modes share the pipeline | ||
|
|
||
| ### CLI — synchronous | ||
|
|
||
| ```text | ||
| ┌─────────────┐ | ||
| user typed args ───►│ pal CLI │ | ||
| │ (synchronous)│ | ||
| └──────┬──────┘ | ||
| │ | ||
| ▼ | ||
| the 6 hops above, in process | ||
| │ | ||
| ▼ | ||
| writes to ./out/ | ||
| │ | ||
| ▼ | ||
| exits with status code | ||
| ``` | ||
|
|
||
| `AnalyzeCommand.ExecuteAsync` orchestrates the collectors, the engine, and the writers. Failures map to `ExitCodes.*` per **[Reference — Exit codes](../reference/exit-codes.md)**. | ||
|
|
||
| ### API — asynchronous | ||
|
|
||
| ```text | ||
| ┌──────────────┐ ┌────────────────┐ | ||
| POST /analysis ────►│ HTTP │──► writes job row ──────►│ Postgres │ | ||
| │ handler │ └────────────────┘ | ||
| │ enqueues Guid │ | ||
| └──────┬───────┘ | ||
| │ | ||
| ▼ | ||
| Channel<Guid> (in-process, single-reader) | ||
| │ | ||
| ▼ | ||
| ┌──────────────┐ | ||
| │AnalysisWorker│ (BackgroundService) | ||
| └──────┬───────┘ | ||
| │ | ||
| ▼ | ||
| the 6 hops above, same code | ||
| │ | ||
| ▼ | ||
| writes JSON/HTML to disk + result row to Postgres | ||
| │ | ||
| ▼ | ||
| (auto-compare if selectedBaselineId set) | ||
| │ | ||
| ▼ | ||
| (policy evaluation → alerts → webhook delivery) | ||
| ``` | ||
|
|
||
| The engine pipeline is identical. What's different is the orchestration: HTTP enqueues, the worker dequeues, repositories persist, and additional services (`PolicyEvaluator`, `IAutoCompareService`, `NotificationService`) extend the post-analysis flow with alerting and comparisons. | ||
|
|
||
| The in-process `Channel<Guid>` keeps the API simple — no external message broker, no Postgres `LISTEN/NOTIFY`. The trade-off: if the API process crashes, queued-but-not-started jobs are lost (the worker channel is in-memory). Jobs that have started but not finished are detected on restart and marked `failed`. This is documented as a Phase 5 improvement candidate. | ||
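The enqueue-by-ID pattern can be sketched with an in-memory queue and a single consumer. The Python below uses `asyncio.Queue` as a loose analogue of `Channel<Guid>`; the status names, the dictionary standing in for the job table, and the placeholder for the pipeline are assumptions, and restart handling is deliberately left out.

```python
import asyncio
import uuid

async def worker(queue, jobs):
    """Single reader: takes one job ID at a time and runs the pipeline."""
    while True:
        job_id = await queue.get()
        try:
            jobs[job_id] = "running"
            await asyncio.sleep(0)  # placeholder for the six-hop pipeline
            jobs[job_id] = "succeeded"
        finally:
            queue.task_done()

async def main():
    queue = asyncio.Queue()  # in-memory only: contents vanish if the process dies
    jobs = {}                # stands in for the job rows in the database
    consumer = asyncio.create_task(worker(queue, jobs))
    for _ in range(3):
        job_id = uuid.uuid4()
        jobs[job_id] = "queued"   # the handler records the job first...
        queue.put_nowait(job_id)  # ...then only the ID travels through the queue
    await queue.join()            # wait until every queued ID is processed
    consumer.cancel()
    await asyncio.gather(consumer, return_exceptions=True)
    return jobs

statuses = asyncio.run(main())
print(sorted(set(statuses.values())))  # ['succeeded']
```

Only identifiers pass through the queue, so the job record stays the single source of state for each analysis.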
|
This sentence describes restart behavior opposite to the implementation: jobs are persisted in
||
|
|
||
| ## Related | ||
|
|
||
| - **[Persistence](persistence.md)** — what gets stored after the pipeline completes. | ||
| - **[Schema evolution](schema-evolution.md)** — how the input contract evolves. | ||
| - **[Reference — Report schema](../reference/report-schema.md)** — output shape. | ||
| - **[Reference — Canonical metric IDs](../reference/metric-ids.md)** — the rewrite table for Hop 2. | ||
| - **[ADR 0002 — Declarative Rule Schema](adr/0002-declarative-rule-schema.md)** — why Hop 4 doesn't have an expression parser. | ||
This section documents a chart pipeline (`--include-charts`, `evidence.charts[]`, `out/charts/...`) that the current code path does not implement: findings do not carry chart references, and neither the CLI nor the API analysis flow invokes chart rendering or writes chart files. Users and automation following this architecture contract will wait for artifacts that are never produced.