
Commit bb5a740

fix(C9): move peer delegation registry control to C9.6 as 9.6.7 per Otto's review; resolve conflict
- Move from C9.5 (messaging/protocol) to C9.6 (authorization/delegation) where it belongs
- Narrow to the net-new concept: approved agent registry/allowlist as a second check beyond authentication (identity and scope validation already covered by 9.4.1, 9.5.1, 9.6.1, 9.6.3)
- Adopt Otto's suggested wording verbatim
- Renumber from 9.5.5 to 9.6.7
- Update Appendix D entry description and ID
- Resolve Appendix D conflict (keep both 9.6.7 and 9.8.7 entries)
2 parents 07cf4ec + c1cf517 commit bb5a740

69 files changed (+341, -230 lines)


1.0/en/0x10-C06-Supply-Chain.md

Lines changed: 8 additions & 7 deletions
@@ -11,7 +11,7 @@ AI supply-chain attacks exploit third-party models, frameworks, or datasets to e
 Assess and authenticate third-party model origins, licenses, and hidden behaviors before any fine-tuning or deployment.

 | # | Description | Level |
-| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---:|
+| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---: |
 | **6.1.1** | **Verify that** every third-party model artifact includes a signed origin-and-integrity record identifying its source, version, and integrity checksum. | 1 |
 | **6.1.2** | **Verify that** models are scanned for malicious layers or Trojan triggers using automated tools before import. | 1 |
 | **6.1.3** | **Verify that** model licenses, export-control tags, and data-origin statements are recorded in an AI BOM entry. | 2 |
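Requirement 6.1.1 in the hunk above calls for a signed origin-and-integrity record per artifact. A minimal sketch of the checksum half of that check, assuming a hypothetical record layout with `source`, `version`, and `sha256` fields (not specified by the standard):

```python
# Sketch of 6.1.1: check a third-party model artifact against the
# integrity checksum declared in its origin record before use.
# The record's field names are illustrative assumptions.
import hashlib

def verify_artifact(data: bytes, origin_record: dict) -> bool:
    # Recompute the digest locally and compare with the declared one.
    digest = hashlib.sha256(data).hexdigest()
    return digest == origin_record["sha256"]

weights = b"fake-model-weights"
record = {
    "source": "internal-registry",
    "version": "1.2.0",
    "sha256": hashlib.sha256(weights).hexdigest(),
}
assert verify_artifact(weights, record)
# Any tampering with the bytes breaks the integrity check.
assert not verify_artifact(weights + b"tampered", record)
```

A production pipeline would additionally verify the signature over the record itself; this sketch covers only the integrity comparison.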
@@ -26,7 +26,7 @@ Assess and authenticate third-party model origins, licenses, and hidden behavior
 Continuously scan AI frameworks and libraries for vulnerabilities and malicious code to keep the runtime stack secure.

 | # | Description | Level |
-| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---:|
+| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---: |
 | **6.2.1** | **Verify that** CI pipelines run dependency scanners on AI frameworks and critical libraries. | 1 |
 | **6.2.2** | **Verify that** critical and high-severity vulnerabilities block promotion to production images. | 2 |
 | **6.2.3** | **Verify that** static code analysis runs on forked or vendored AI libraries. | 2 |
@@ -40,7 +40,7 @@ Continuously scan AI frameworks and libraries for vulnerabilities and malicious
 Pin every dependency to immutable digests and verify builds to guarantee tamper-free artifacts.

 | # | Description | Level |
-| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---:|
+| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---: |
 | **6.3.1** | **Verify that** all package managers enforce version pinning via lockfiles. | 1 |
 | **6.3.2** | **Verify that** immutable digests are used instead of mutable tags in container references. | 1 |
 | **6.3.3** | **Verify that** expired or unmaintained dependencies trigger automated notifications to update or replace pinned versions. | 2 |
@@ -54,12 +54,13 @@ Pin every dependency to immutable digests and verify builds to guarantee tamper-
 Allow artifact downloads only from cryptographically verified, organization-approved sources and block everything else.

 | # | Description | Level |
-| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---:|
+| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---: |
 | **6.4.1** | **Verify that** model weights, datasets, and containers are downloaded only from approved sources or internal registries. | 1 |
 | **6.4.2** | **Verify that** cryptographic signatures validate publisher identity before artifacts are cached locally. | 1 |
 | **6.4.3** | **Verify that** egress controls block unauthenticated artifact downloads to enforce trusted-source policy. | 2 |
 | **6.4.4** | **Verify that** repository allow-lists are reviewed periodically with evidence of business justification for each entry. | 3 |
 | **6.4.5** | **Verify that** policy violations trigger quarantining of artifacts and rollback of dependent pipeline runs. | 3 |
+| **6.4.6** | **Verify that** cryptographic signing keys used to authenticate model publishers are pinned per source registry (e.g., Hugging Face, internal registry), and that key rotation events require explicit re-approval before updated keys are trusted. | 3 |

 ---

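The new requirement 6.4.6 pins publisher keys per source registry and gates key rotation on explicit re-approval. A minimal sketch of that policy, assuming a hypothetical key store and using an HMAC as a stand-in for a real publisher signature scheme (both are illustrative, not part of the standard):

```python
# Sketch of 6.4.6: pin publisher signing keys per source registry and
# require explicit re-approval before a rotated key is trusted.
# Registry names and the HMAC stand-in for signatures are assumptions.
import hashlib
import hmac

class PinnedKeyStore:
    def __init__(self):
        # registry name -> set of approved key fingerprints
        self._approved: dict[str, set[str]] = {}

    @staticmethod
    def fingerprint(key: bytes) -> str:
        return hashlib.sha256(key).hexdigest()

    def approve(self, registry: str, key: bytes) -> None:
        # The explicit approval step: only keys passed here are trusted.
        self._approved.setdefault(registry, set()).add(self.fingerprint(key))

    def verify(self, registry: str, key: bytes, artifact: bytes, sig: bytes) -> bool:
        # An unapproved (e.g., freshly rotated) key fails even if the
        # signature itself is valid.
        if self.fingerprint(key) not in self._approved.get(registry, set()):
            return False
        expected = hmac.new(key, artifact, hashlib.sha256).digest()
        return hmac.compare_digest(expected, sig)

store = PinnedKeyStore()
old_key, new_key = b"publisher-key-v1", b"publisher-key-v2"
store.approve("internal-registry", old_key)

artifact = b"model-weights"
good_sig = hmac.new(old_key, artifact, hashlib.sha256).digest()
rotated_sig = hmac.new(new_key, artifact, hashlib.sha256).digest()

assert store.verify("internal-registry", old_key, artifact, good_sig)
# Key rotation: the new key is rejected until explicitly re-approved.
assert not store.verify("internal-registry", new_key, artifact, rotated_sig)
store.approve("internal-registry", new_key)
assert store.verify("internal-registry", new_key, artifact, rotated_sig)
```

In a real deployment the signature check would use an asymmetric scheme (e.g., Sigstore-style signing); the pinning and re-approval logic is the point of the sketch.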
@@ -68,7 +69,7 @@ Allow artifact downloads only from cryptographically verified, organization-appr
 Evaluate external datasets for poisoning, bias, and legal compliance, and monitor them throughout their lifecycle.

 | # | Description | Level |
-| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---:|
+| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---: |
 | **6.5.1** | **Verify that** external datasets undergo poisoning risk assessment (e.g., data fingerprinting, outlier detection). | 1 |
 | **6.5.2** | **Verify that** disallowed content (e.g., copyrighted material, PII) is detected and removed via automated scrubbing prior to training. | 1 |
 | **6.5.3** | **Verify that** origin, lineage, and license terms for datasets are captured in AI BOM entries. | 2 |
@@ -82,7 +83,7 @@ Evaluate external datasets for poisoning, bias, and legal compliance, and monito
 Detect supply-chain threats early through vulnerability feeds, audit-log analytics, and incident response readiness.

 | # | Description | Level |
-| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---:|
+| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---: |
 | **6.6.1** | **Verify that** incident response playbooks include rollback procedures for compromised models or libraries. | 2 |
 | **6.6.2** | **Verify that** CI/CD audit logs are streamed to centralized security monitoring in real time. | 2 |
 | **6.6.3** | **Verify that** threat-intelligence enrichment tags AI-specific indicators (e.g., model-poisoning indicators of compromise) in alert triage. | 3 |
@@ -95,7 +96,7 @@ Detect supply-chain threats early through vulnerability feeds, audit-log analyti
 Generate and sign detailed AI-specific bills of materials (AI BOMs) so downstream consumers can verify component integrity at deploy time.

 | # | Description | Level |
-| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---:|
+| :--------: | ------------------------------------------------------------------------------------------------------------------- | :---: |
 | **6.7.1** | **Verify that** every model artifact publishes an AI BOM that lists datasets, weights, hyperparameters, and licenses. | 1 |
 | **6.7.2** | **Verify that** AI BOM generation and cryptographic signing are automated in CI and required for merge. | 2 |
 | **6.7.3** | **Verify that** AI BOM completeness checks fail the build if any component metadata (hash and license) is missing. | 2 |

1.0/en/0x10-C07-Model-Behavior.md

Lines changed: 15 additions & 12 deletions
@@ -11,7 +11,7 @@ This control category ensures that model outputs are technically constrained, va
 Ensure the model outputs data in a way that helps prevent injection.

 | # | Description | Level |
-|:--------:|---------------------------------------------------------------------------------------------------------------------|:---:|
+| :--------: | --------------------------------------------------------------------------------------------------------------------- | :---: |
 | **7.1.1** | **Verify that** the application validates all model outputs against a strict schema (like JSON Schema) and rejects any output that does not match. | 1 |
 | **7.1.2** | **Verify that** the system uses "stop sequences" or token limits to strictly cut off generation before it can overflow buffers or executes unintended commands. | 1 |
 | **7.1.3** | **Verify that** components processing model output treat it as untrusted input (e.g., using parameterized queries or safe de-serializers). | 1 |
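Requirements 7.1.1 and 7.1.3 above treat model output as untrusted input validated against a strict schema. A minimal stdlib-only sketch, assuming a hypothetical three-field schema (a real system might use a JSON Schema validator instead):

```python
# Sketch of 7.1.1: validate model output against a strict schema and
# reject anything that deviates. Field names are illustrative.
import json

SCHEMA = {"action": str, "target": str, "confidence": float}

def validate_model_output(raw: str) -> dict:
    data = json.loads(raw)  # raises on malformed JSON
    # Reject extra or missing fields outright, not just wrong types.
    if set(data) != set(SCHEMA):
        raise ValueError(f"unexpected fields: {set(data) ^ set(SCHEMA)}")
    for field, expected in SCHEMA.items():
        if not isinstance(data[field], expected):
            raise ValueError(f"{field}: expected {expected.__name__}")
    return data

ok = validate_model_output(
    '{"action": "lookup", "target": "doc-42", "confidence": 0.93}'
)
assert ok["action"] == "lookup"

rejected = False
try:
    # An extra field smuggled into the output is rejected, not ignored.
    validate_model_output(
        '{"action": "lookup", "target": "doc-42", '
        '"confidence": 0.93, "shell": "rm -rf /"}'
    )
except ValueError:
    rejected = True
assert rejected
```

Rejecting unknown fields (rather than silently dropping them) is what makes the validation "strict" in the sense of 7.1.1.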
@@ -24,7 +24,7 @@ Ensure the model outputs data in a way that helps prevent injection.
 Detect when the model produces potentially inaccurate or fabricated content and prevent unreliable outputs from reaching users or downstream systems.

 | # | Description | Level |
-|:--------:|---------------------------------------------------------------------------------------------------------------------|:---:|
+| :--------: | --------------------------------------------------------------------------------------------------------------------- | :---: |
 | **7.2.1** | **Verify that** the system assesses the reliability of generated answers using a confidence or uncertainty estimation method (e.g., confidence scoring, retrieval-based verification, or model uncertainty estimation). | 1 |
 | **7.2.2** | **Verify that** the application automatically blocks answers or switches to a fallback message if the confidence score drops below a defined threshold. | 2 |
 | **7.2.3** | **Verify that** hallucination events (low-confidence responses) are logged with input/output metadata for analysis. | 2 |
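The blocking behavior in 7.2.2 can be sketched as a simple gate. The threshold value and the fallback wording are illustrative assumptions; how the confidence score itself is produced is the subject of 7.2.1 and is out of scope here:

```python
# Sketch of 7.2.2: block an answer or switch to a fallback message
# when its confidence score falls below a defined threshold.
CONFIDENCE_THRESHOLD = 0.7
FALLBACK = "I am not confident enough to answer; please consult a human."

def gate_response(answer: str, confidence: float) -> tuple[str, bool]:
    # Returns (text served to the user, whether the answer was blocked).
    if confidence < CONFIDENCE_THRESHOLD:
        return FALLBACK, True
    return answer, False

assert gate_response("Paris", 0.95) == ("Paris", False)
served, blocked = gate_response("Maybe Paris?", 0.4)
assert blocked and served == FALLBACK
```

Per 7.2.3, a real implementation would also log the blocked event with input/output metadata for later analysis.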
@@ -38,7 +38,7 @@ Detect when the model produces potentially inaccurate or fabricated content and
 Technical controls to detect and scrub bad content before it is shown to the user.

 | # | Description | Level |
-|:--------:|---------------------------------------------------------------------------------------------------------------------|:---:|
+| :--------: | --------------------------------------------------------------------------------------------------------------------- | :---: |
 | **7.3.1** | **Verify that** automated classifiers scan every response and block content that matches hate, harassment, or sexual violence categories. | 1 |
 | **7.3.2** | **Verify that** the system scans every response for PII (like credit cards or emails) and automatically redacts it before display. | 1 |
 | **7.3.3** | **Verify that** PII detection and redaction events are logged without including the redacted PII values themselves, to maintain an audit trail without creating secondary PII exposure. | 1 |
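Requirements 7.3.2 and 7.3.3 pair redaction with PII-free audit logging. A minimal sketch covering only email addresses (the pattern and log format are illustrative assumptions; production systems use broader detectors):

```python
# Sketch of 7.3.2/7.3.3: redact PII in responses before display, and
# log only the event category, never the redacted value itself.
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
audit_log: list[str] = []

def redact_pii(text: str) -> str:
    def replace(match: re.Match) -> str:
        # Record that a redaction happened, without the matched value.
        audit_log.append("PII_REDACTED:email")
        return "[REDACTED:email]"
    return EMAIL_RE.sub(replace, text)

out = redact_pii("Contact alice@example.com for details.")
assert out == "Contact [REDACTED:email] for details."
assert audit_log == ["PII_REDACTED:email"]
# The audit trail contains no trace of the original address (7.3.3).
assert "alice" not in " ".join(audit_log)
```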
@@ -47,6 +47,7 @@ Technical controls to detect and scrub bad content before it is shown to the use
 | **7.3.6** | **Verify that** the system requires a human approval step or re-authentication if the model generates high-risk content. | 3 |
 | **7.3.7** | **Verify that** output filters detect and block responses that reproduce verbatim segments of system prompt content. | 2 |
 | **7.3.8** | **Verify that** LLM client applications prevent model-generated output from triggering automatic outbound requests (e.g., auto-rendered images, iframes, or link prefetching) to attacker-controlled endpoints, for example by disabling automatic external resource loading or restricting it to explicitly allowlisted origins as appropriate. | 2 |
+| **7.3.9** | **Verify that** generated outputs are analyzed for statistical steganographic covert channels (e.g., biased token-choice patterns or output distribution anomalies) that could encode hidden data across the model's valid output space, and that detections are flagged for review. | 3 |

 ---

@@ -55,10 +56,11 @@ Technical controls to detect and scrub bad content before it is shown to the use
 Prevent the model from doing too much, too fast, or accessing things it should not.

 | # | Description | Level |
-|:--------:|---------------------------------------------------------------------------------------------------------------------|:---:|
+| :--------: | --------------------------------------------------------------------------------------------------------------------- | :---: |
 | **7.4.1** | **Verify that** the system enforces hard limits on requests and tokens per user to prevent cost spikes and denial of service. | 1 |
 | **7.4.2** | **Verify that** the model cannot execute high-impact actions (like writing files, sending emails, or executing code) without explicit user confirmation. | 1 |
-| **7.4.3** | **Verify that** the application or orchestration framework explicitly configures and enforces the maximum depth of recursive calls, delegation limits, and the list of allowed external tools. | 2 |
+| **7.4.3** | **Verify that** the application or orchestration framework explicitly configures and enforces a maximum depth for recursive calls to prevent unbounded recursion. | 2 |
+| **7.4.4** | **Verify that** the application or orchestration framework explicitly configures and enforces a maximum number of sequential or nested sub-task delegations within a single execution chain, and that chains exceeding this limit are halted. For agent-specific tool and action authorization, see C9.6. | 2 |

 ---

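The split of 7.4.3/7.4.4 in the hunk above separates recursion-depth caps from delegation-chain caps. A minimal sketch of the halting behavior, assuming a hypothetical string-based task encoding and an arbitrary limit of 3:

```python
# Sketch of 7.4.3/7.4.4: enforce a hard cap on nested sub-task
# delegations and halt chains that exceed it. The limit value and
# the "delegate:" task encoding are illustrative assumptions.
MAX_DELEGATION_DEPTH = 3

class DelegationLimitExceeded(RuntimeError):
    pass

def run_task(task: str, depth: int = 0) -> str:
    if depth > MAX_DELEGATION_DEPTH:
        # Halt the chain instead of recursing without bound.
        raise DelegationLimitExceeded(
            f"chain exceeded depth {MAX_DELEGATION_DEPTH}"
        )
    if task.startswith("delegate:"):
        # Each delegation to a sub-task consumes one unit of depth.
        return run_task(task[len("delegate:"):], depth + 1)
    return f"done:{task}"

assert run_task("delegate:delegate:summarize") == "done:summarize"
halted = False
try:
    run_task("delegate:" * 10 + "summarize")
except DelegationLimitExceeded:
    halted = True
assert halted
```

In a real orchestration framework the depth counter would travel with the execution context (e.g., a request header or trace attribute) rather than a function argument, so sub-agents cannot reset it.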
@@ -67,10 +69,10 @@ Prevent the model from doing too much, too fast, or accessing things it should n
 Ensure the user knows why a decision was made.

 | # | Description | Level |
-| :-------: | ------------------------------------------------------------------------------------------------------------------------------ | :---:|
+| :-------: | ------------------------------------------------------------------------------------------------------------------------------ | :---: |
 | **7.5.1** | **Verify that** explanations provided to the user are sanitized to remove system prompts or backend data. | 1 |
 | **7.5.2** | **Verify that** the UI displays a confidence score or "reasoning summary" to the user for critical decisions. | 2 |
-| **7.5.3** | **Verify that** technical evidence of the model's decision, such as model interpretability artifacts (e.g., attention maps, feature attributions), are logged.| 3 |
+| **7.5.3** | **Verify that** technical evidence of the model's decision, such as model interpretability artifacts (e.g., attention maps, feature attributions), are logged. | 3 |

 ---

@@ -79,8 +81,8 @@ Ensure the user knows why a decision was made.
 Ensure the application sends the right signals for security teams to watch.

 | # | Description | Level |
-| :-------: | -------------------------------------------------------------------------------------------------------------------------------------------- | :---:|
-| **7.6.1** | **Verify that** the system logs real-time metrics for safety violations (e.g., "Hallucination Detected", "PII Blocked").| 1 |
+| :-------: | -------------------------------------------------------------------------------------------------------------------------------------------- | :---: |
+| **7.6.1** | **Verify that** the system logs real-time metrics for safety violations (e.g., "Hallucination Detected", "PII Blocked"). | 1 |
 | **7.6.2** | **Verify that** the system triggers an alert if safety violation rates exceed a defined threshold within a specific time window. | 2 |
 | **7.6.3** | **Verify that** logs include the specific model version and other details necessary to investigate potential abuse. | 2 |

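The rate-within-a-window alerting of 7.6.2 can be sketched with a sliding window over violation timestamps. Window size and limit are illustrative assumptions:

```python
# Sketch of 7.6.2: alert when safety-violation counts exceed a
# threshold inside a sliding time window.
from collections import deque

class ViolationAlerter:
    def __init__(self, window_seconds: float, max_violations: int):
        self.window = window_seconds
        self.limit = max_violations
        self._events: deque[float] = deque()

    def record(self, timestamp: float) -> bool:
        # Returns True when the rate threshold is breached.
        self._events.append(timestamp)
        # Drop events that have aged out of the window.
        while self._events and timestamp - self._events[0] > self.window:
            self._events.popleft()
        return len(self._events) > self.limit

alerter = ViolationAlerter(window_seconds=60, max_violations=3)
assert not any(alerter.record(t) for t in [0, 10, 20])  # 3 in window: ok
assert alerter.record(30)        # 4th violation within 60s breaches
assert not alerter.record(200)   # earlier events have expired
```

Production systems typically delegate this to the monitoring backend (e.g., an alert rule on a violation counter) rather than in-process state, but the windowing logic is the same.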
@@ -91,7 +93,7 @@ Ensure the application sends the right signals for security teams to watch.
 Prevent the creation of illegal or fake media.

 | # | Description | Level |
-| :-------: | -------------------------------------------------------------------------------------------------------------------------------------------- | :---:|
+| :-------: | -------------------------------------------------------------------------------------------------------------------------------------------- | :---: |
 | **7.7.1** | **Verify that** input filters block prompts requesting explicit or non-consensual synthetic content before the model processes them. | 1 |
 | **7.7.2** | **Verify that** the system refuses to generate media (images/audio) that depicts real people without verified consent. | 2 |
 | **7.7.3** | **Verify that** the system checks generated content for copyright violations before releasing it. | 2 |
@@ -105,9 +107,10 @@ Prevent the creation of illegal or fake media.
 Ensure RAG-grounded outputs are traceable to their source documents and that cited claims are verifiably supported by retrieved content.

 | # | Description | Level |
-| :-------: | -------------------------------------------------------------------------------------------------------------------------------------------- | :---:|
-| **7.8.1** | **Verify that** responses generated using retrieval-augmented generation (RAG) include attribution to the source documents that grounded the response, and that attributions are derived from retrieval metadata rather than generated by the model. | 1 |
+| :-------: | -------------------------------------------------------------------------------------------------------------------------------------------- | :---: |
+| **7.8.1** | **Verify that** responses generated using retrieval-augmented generation (RAG) include attribution to the source documents that grounded the response. | 1 |
 | **7.8.2** | **Verify that** each sourced claim in a RAG-grounded response can be traced to a specific retrieved chunk, and that the system detects and flags responses where claims are not supported by any retrieved content before the response is served. | 3 |
+| **7.8.3** | **Verify that** RAG attributions are derived from retrieval metadata and are not generated by the model, ensuring provenance cannot be fabricated. | 1 |

 ---

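The point of the new 7.8.3 is that attributions come from the retrieval pipeline's own metadata, so the model has no opportunity to fabricate a citation. A minimal sketch, assuming a hypothetical `RetrievedChunk` shape with a `doc_id` field:

```python
# Sketch of 7.8.1/7.8.3: build attributions from retriever metadata
# carried alongside each chunk, never from model-generated text.
# The data shapes here are illustrative assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class RetrievedChunk:
    doc_id: str
    text: str

def answer_with_attribution(model_answer: str,
                            chunks: list[RetrievedChunk]) -> dict:
    # Sources come only from retrieval metadata (doc_id), so the model
    # cannot cite a document that was never actually retrieved.
    return {
        "answer": model_answer,
        "sources": sorted({c.doc_id for c in chunks}),
    }

chunks = [
    RetrievedChunk("policy.pdf#p3", "Refunds are issued within 14 days."),
    RetrievedChunk("faq.md#refunds", "Contact support to start a refund."),
]
resp = answer_with_attribution("Refunds take up to 14 days.", chunks)
assert resp["sources"] == ["faq.md#refunds", "policy.pdf#p3"]
```

Per 7.8.2, a stricter Level 3 implementation would additionally check each claim in the answer against the retrieved chunk texts and flag unsupported claims before serving.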