Skip to content

feat(pq): dependency-level PQ scanning (closes #221)#260

Merged
peaktwilight merged 9 commits intomainfrom
feat/dep-level-pq-scanning
Apr 24, 2026
Merged

feat(pq): dependency-level PQ scanning (closes #221)#260
peaktwilight merged 9 commits intomainfrom
feat/dep-level-pq-scanning

Conversation

@Darkroom4364
Copy link
Copy Markdown
Collaborator

@Darkroom4364 Darkroom4364 commented Apr 22, 2026

Summary

  • Scan Cargo.lock and requirements.txt for dependencies that use PQ-vulnerable cryptography
  • CargoLockPqCrypto: BFS transitive graph traversal with tiered confidence (0.9 for single-algorithm seeds like rsa/ed25519-dalek, 0.6 for multi-algorithm seeds like ring/aws-lc-rs)
  • RequirementsTxtPqCrypto: direct membership check against 11 curated packages with per-package confidence (0.5–0.95)
  • New dep_name field on Finding for manifest-level attribution
  • Replaces is_pq_rule_id substring hack with explicit whitelist (fixes Dockerfile rule ID)
  • Flows into CBOM, SARIF, JSON, TUI, CNSA 2.0 panel with zero output plumbing changes

Test plan

  • cargo test — 531 pass, 0 fail
  • Fixture Cargo.lock with rustls→rsa (tier 1) and reqwest→ring (tier 2) transitive chains
  • Fixture requirements.txt with comments, -e, git+, extras, env markers — all correctly skipped
  • Seed crates themselves are not flagged (only dependents)
  • detect_language recognizes Cargo.lock and requirements.txt as Language::Manifest

Summary by CodeRabbit

  • New Features

    • Manifest-level scanning for Rust (Cargo.lock) and Python (requirements.txt) to detect quantum-vulnerable cryptography with transitive analysis, confidence scoring, and fix suggestions.
    • UI/rules list now includes a "Manifest" rule group and manifest-language detection for relevant files.
    • Findings may include an optional dependency name for manifest-level results.
  • Tests

    • Added unit and integration tests validating manifest parsing, detection, language recognition, offsets, and reported finding fields.

…s.txt (closes #221)

Scan manifest/lock files against curated crypto package databases to
surface PQ-vulnerable dependencies that source-level rules can't see.

- CargoLockPqCrypto: BFS transitive graph traversal with 6 tier-1 seeds
  (RSA, ECDSA, Ed25519, X25519 at 0.9 confidence) and 3 tier-2 seeds
  (ring, openssl-sys, aws-lc-rs at 0.6)
- RequirementsTxtPqCrypto: direct lookup against 11 curated packages
  with per-package confidence (0.5-0.95), PEP 503 normalization
- New dep_name field on Finding for manifest-level attribution
- Replace is_pq_rule_id substring hack with explicit whitelist
- Language::Manifest via bash-dummy tree-sitter pattern
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 22, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds manifest-level PQ crypto scanning: new Language::Manifest, dep_name on Finding, two manifest rules (Cargo.lock and requirements.txt), parser/scanner wiring for manifests, many Finding constructions/tests updated, and rules/UI generation updated.

Changes

Cohort / File(s) Summary
Core Types & deps
Cargo.toml, src/lib.rs
Add toml = "0.8" dependency; add Language::Manifest and new field pub dep_name: Option<String> to Finding (serialized omitted when None).
Manifest Rules
src/rules/manifest.rs, src/rules/mod.rs
Add CargoLockPqCrypto and RequirementsTxtPqCrypto rules; register manifest module and both rules in the RuleRegistry.
Parser & Scanner
src/engine/parser.rs, src/engine/scanner.rs
Treat Cargo.lock and requirements.txt as Language::Manifest; map Manifest to the same tree-sitter parser as Dockerfile/bash; set # as manifest comment prefix for inline ignore parsing.
Rule Finding Construction
src/rules/..., src/rules/common.rs
Initialize dep_name: None across Finding construction sites (Go, JS, Kotlin, Python, semgrep, taint helpers) to match updated Finding shape.
Secrets & Reports
src/secrets.rs, src/report/cbom.rs, src/report/github_pr.rs
Populate dep_name: None for emitted findings and test helpers.
App & PQ ID matching
src/app.rs
Restrict PQ-rule detection by replacing broad substring match with an exact rule-id equality check for a specific config rule.
Test Fixtures & Helpers
src/baseline.rs, src/compliance.rs, src/config.rs, src/diff.rs, src/tui.rs
Update many test fixtures and helpers to include dep_name: None.
Integration Tests & Fixtures
tests/integration.rs, tests/fixtures/deps/requirements.txt
Add integration tests validating manifest PQ detection and language detection; add a requirements.txt fixture.
Rules TS output / UI
src/bin/gen_rules_ts.rs, www/src/data/rules.ts
Add manifestRules and a Manifest group to generated rules.ts; increment productLanguageCount (14 → 15).

Sequence Diagram

sequenceDiagram
    participant File as Manifest file (Cargo.lock / requirements.txt)
    participant Scanner as Scanner (language detection)
    participant Parser as Parser (tree-sitter)
    participant Rule as Manifest Rule
    participant Graph as DepGraph/Matcher
    participant Emitter as Finding Emitter

    File->>Scanner: detect language -> Manifest
    Scanner->>Parser: parse_file(content, Manifest)
    Parser->>Rule: provide AST / content
    Rule->>Graph: extract packages & build dependency graph
    Graph->>Graph: traverse (BFS) to find curated seed reachability
    Graph->>Rule: report matched vulnerable package(s)
    Rule->>Emitter: compute offsets, confidence, crypto_algorithm, dep_name
    Emitter->>File: emit Finding(s)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Possibly related PRs

Suggested reviewers

  • peaktwilight

"I nibble bytes and hop through trees,
Cargo locks and pipled seas,
I trace transitive seeds with glee,
Pin the culprit for you to see,
A rabbit's sniff finds PQ spree! 🐇🔎"

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(pq): dependency-level PQ scanning (closes #221)' clearly and concisely summarizes the main change: adding dependency-level post-quantum cryptography scanning to the codebase.
Docstring Coverage ✅ Passed Docstring coverage is 95.05% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/dep-level-pq-scanning

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Apr 22, 2026

Greptile Summary

This PR adds dependency-level PQ (post-quantum) cryptography scanning for Cargo.lock and requirements.txt, fulfilling issue #221. The two new rules — CargoLockPqCrypto and RequirementsTxtPqCrypto — join the existing source-level PQ rules across Python, Go, JavaScript, and Rust, and flow into CBOM, SARIF, JSON, and TUI outputs without changes to those output layers.

Key changes:

  • src/rules/manifest.rs: New rules using BFS transitive graph traversal (Cargo.lock) and curated list lookup (requirements.txt); tiered confidence scoring (0.9 for single-algorithm seeds, 0.6 for multi-algorithm); CRLF-aware byte offset tracking; find_name_version_offset helper disambiguates duplicate crate versions
  • src/lib.rs: New dep_name: Option<String> field on Finding for manifest-level attribution
  • src/engine/scanner.rs: detect_config_language maps Cargo.lock / requirements.txt to Language::Manifest; comment_markers returns [\"#\"] for Language::Manifest
  • src/rules/mod.rs: Manifest rules registered as standard (non-opt-in) rules
  • Prior review concerns (CRLF drift, duplicate-crate offset, pip_map key normalization, is_pq_rule_id substring hack) are all addressed in this iteration
  • One remaining concern: Language::Manifest is still routed to the Bash tree-sitter grammar in parser.rs, wasting a parse on every manifest file (noted in prior review threads)
  • Minor new P2: pip name normalization uses replace(['_', '.'], \"-\") which doesn't collapse consecutive separators per PEP 503 — low practical risk with the current 11-entry curated list, but worth hardening as the list grows

Confidence Score: 4/5

Safe to merge; prior review concerns are resolved and the two new P2s are non-blocking improvements

The core logic — BFS transitive scanning for Cargo.lock and curated list matching for requirements.txt — is correct and well-tested. The three issues raised in previous review threads (CRLF byte-offset drift, duplicate crate name disambiguation, pip_map key normalization) are all addressed. One lingering architectural concern (Bash tree-sitter grammar used for Manifest) was noted in prior threads and has a known fix but no correctness impact. The only new finding is a minor PEP 503 normalization deviation that doesn't affect any of the current 11 curated packages. 531 tests pass per the PR description.

src/rules/manifest.rs — PEP 503 normalization could be tightened; src/engine/parser.rs — Language::Manifest still parses via Bash grammar (CPU waste, no correctness impact)

Important Files Changed

Filename Overview
src/rules/manifest.rs New manifest scanning rules for Cargo.lock (BFS transitive graph) and requirements.txt (curated list); CRLF handling, duplicate-crate disambiguation, and pip_map key normalization all addressed; minor PEP 503 normalization deviation remains
src/engine/parser.rs Language::Manifest still routed to tree_sitter_bash grammar (noted in previous review threads); manifest rules ignore the tree, so correctness is unaffected
src/engine/scanner.rs Adds Cargo.lock and requirements.txt to detect_config_language; Language::Manifest gets comment_markers = ["#"] for inline-ignore; no logic regressions
src/rules/mod.rs CargoLockPqCrypto and RequirementsTxtPqCrypto registered via register() (not register_opt_in), making them active by default; consistent with other PQ rules
src/lib.rs New dep_name: Option field added to Finding with skip_serializing_if; serializes only when present in JSON output
tests/integration.rs Integration tests cover Cargo.lock transitive chain (tier-1/tier-2), requirements.txt curated matching, non-flagged packages, and language detection for manifest files
tests/fixtures/deps/Cargo.lock Test fixture with my-app→rustls→rsa (tier-1) and reqwest→ring (tier-2) transitive chains; serde has no crypto dependency and is correctly excluded
tests/fixtures/deps/requirements.txt Test fixture covering comments, -e, git+, extras, env markers; python-rsa, cryptography, fabric, paramiko are flagged; flask/requests are not

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[scan_directory] --> B{detect_language}
    B -->|Cargo.lock| C[Language::Manifest]
    B -->|requirements.txt| C
    C --> D[parse_file\ntree-sitter Bash\n*unused tree*]
    D --> E{rule.applies_to_path?}
    E -->|Cargo.lock| F[CargoLockPqCrypto.check]
    E -->|requirements.txt| G[RequirementsTxtPqCrypto.check]
    F --> F1[Parse TOML via toml crate]
    F1 --> F2[Build name to indices map and adjacency graph]
    F2 --> F3[For each non-seed package: BFS over dependency graph]
    F3 --> F4{Reached seed crates?}
    F4 -->|Yes| F5[Pick highest-confidence seed - Emit Finding with dep_name]
    F4 -->|No| F6[Skip]
    G --> G1[Iterate lines with CRLF-aware byte offsets]
    G1 --> G2[Skip comments, -flags, git+, URLs]
    G2 --> G3[Normalize: lowercase + hyphenate]
    G3 --> G4{In PIP_PACKAGES curated map?}
    G4 -->|Yes| G5[Emit Finding with dep_name and fix_suggestion if applicable]
    G4 -->|No| G6[Skip]
    F5 --> H[annotate_cnsa2_deadlines]
    G5 --> H
    H --> I[CBOM / SARIF / JSON / TUI output]
Loading

Reviews (2): Last reviewed commit: "fix(pq): address CI and review feedback ..." | Re-trigger Greptile

Comment thread src/rules/manifest.rs Outdated
Comment thread src/rules/manifest.rs Outdated
Comment thread src/rules/manifest.rs Outdated
Comment thread src/engine/parser.rs
Comment on lines +21 to +22
| Language::Dockerfile
| Language::Manifest => tree_sitter_bash::LANGUAGE.into(),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Language::Manifest routed to Bash tree-sitter grammar unnecessarily

Both CargoLockPqCrypto and RequirementsTxtPqCrypto accept _tree: &tree_sitter::Tree and never use it — they parse the source directly with the toml crate or by iterating over lines. Routing Language::Manifest through tree_sitter_bash::LANGUAGE therefore parses valid TOML (Cargo.lock) and pip syntax as Bash, producing a parse tree that is silently discarded. This wastes CPU time on every manifest scan and may emit confusing error-level log lines from tree-sitter if it encounters Cargo.lock syntax it can't parse as Bash.

Consider returning None early from parse_file for Language::Manifest, or making the tree-sitter step optional for languages that don't need it.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
tests/fixtures/deps/requirements.txt (1)

1-1: Consider explicitly marking this file as a non-production security fixture.

This helps reduce alert-triage confusion when dependency scanners parse test fixtures.

♻️ Suggested tweak
-# App dependencies
+# Test fixture for dependency-level PQ scanning.
+# Intentionally includes packages/versions for detection coverage; not for installation.
+# App dependencies
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/fixtures/deps/requirements.txt` at line 1, Add an explicit
non-production marker to the top of the requirements fixture so dependency
scanners and reviewers know this is a test-only file: update the
tests/fixtures/deps/requirements.txt content to prepend a clear header comment
like a "NON-PRODUCTION TEST FIXTURE" or "TEST-FIXTURE: NOT FOR PRODUCTION" token
and a short sentence explaining it’s only for tests, ensuring scanners will
treat it as a non-production/security fixture.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/rules/manifest.rs`:
- Around line 227-230: The loop that iterates over source.lines() and updates
byte_offset (variables: source.lines(), byte_offset, line_start, line_end)
miscomputes offsets on CRLF files because lines() strips the delimiter; change
the iteration to use the original-slice-aware splitter (e.g.,
split_inclusive('\n') or split_terminator while accounting for delimiter) or
otherwise compute positions from the original source bytes so each slice
includes the newline(s) and you can set line_end = byte_offset + slice.len() and
then advance byte_offset = line_end (no +1 adjustment). Ensure the loop uses the
actual slice length (including CR or LF) to avoid drifting offsets.
- Around line 234-243: In the manifest.rs parsing block that currently uses
trimmed.starts_with('-') to skip entries, special-case editable flags instead:
detect and strip leading "-e" or "--editable" (allowing optional space) and then
if the remaining value is a local path (e.g., "." or starts with "./" or "/")
skip it, otherwise assign the stripped value back to trimmed and continue
parsing so named editables like "-e python-rsa" or "--editable paramiko" are
processed; remove the broad trimmed.starts_with('-') check and only skip
hyphen-prefixed tokens when they indicate local editables.
- Around line 80-103: The graph construction (name_to_indices / graph) must
match package entries by both name and version to avoid attaching spans to the
wrong duplicate-name package: change the keying to include version (e.g.,
name+version from pkg.get("version")) and when parsing dependency strings in the
loop that creates dep_name currently using split_once(' '), extract both name
and version (when present) and only link i->j when both match; update the
span-finding logic that uses snippet_pattern and source.find(...) to search for
the specific package occurrence (e.g., locate the Nth occurrence or compute
offsets by scanning lines until the package header that matches both name and
version) instead of always using the first match. For RequirementsTxtPqCrypto
adjust the byte_offset calculation (currently byte_offset = line_end + 1) to
account for CRLF vs LF by tracking cumulative byte offsets while iterating lines
(use the actual line byte lengths including their line endings or detect
line_ending style once and add 2 for CRLF) and stop skipping all '-' prefixed
lines: allow '-e' editable entries to be processed by treating '-e' or
'--editable' as valid dependency entries rather than filtering them out.

---

Nitpick comments:
In `@tests/fixtures/deps/requirements.txt`:
- Line 1: Add an explicit non-production marker to the top of the requirements
fixture so dependency scanners and reviewers know this is a test-only file:
update the tests/fixtures/deps/requirements.txt content to prepend a clear
header comment like a "NON-PRODUCTION TEST FIXTURE" or "TEST-FIXTURE: NOT FOR
PRODUCTION" token and a short sentence explaining it’s only for tests, ensuring
scanners will treat it as a non-production/security fixture.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: a9ba8d28-0cac-4531-94b7-8c9b211eaecd

📥 Commits

Reviewing files that changed from the base of the PR and between 322883d and b0b2309.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • tests/fixtures/deps/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (26)
  • Cargo.toml
  • src/app.rs
  • src/baseline.rs
  • src/bin/gen_rules_ts.rs
  • src/compliance.rs
  • src/config.rs
  • src/diff.rs
  • src/engine/parser.rs
  • src/engine/scanner.rs
  • src/lib.rs
  • src/report/cbom.rs
  • src/report/github_pr.rs
  • src/rules/common.rs
  • src/rules/go.rs
  • src/rules/javascript.rs
  • src/rules/kotlin.rs
  • src/rules/manifest.rs
  • src/rules/mod.rs
  • src/rules/python.rs
  • src/rules/semgrep_compat.rs
  • src/rules/semgrep_taint.rs
  • src/secrets.rs
  • src/tui.rs
  • tests/fixtures/deps/requirements.txt
  • tests/integration.rs
  • www/src/data/rules.ts

Comment thread src/rules/manifest.rs
Comment thread src/rules/manifest.rs Outdated
Comment thread src/rules/manifest.rs
Comment on lines +234 to +243
// Skip blank, comments, options, URLs
if trimmed.is_empty()
|| trimmed.starts_with('#')
|| trimmed.starts_with('-')
|| trimmed.starts_with("git+")
|| trimmed.starts_with("http://")
|| trimmed.starts_with("https://")
{
continue;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /tmp && find . -name "manifest.rs" -path "*/rules/*" 2>/dev/null | head -5

Repository: PwnKit-Labs/foxguard

Length of output: 46


🏁 Script executed:

cat -n src/rules/manifest.rs | sed -n '220,260p'

Repository: PwnKit-Labs/foxguard

Length of output: 2041


🏁 Script executed:

cat -n src/rules/manifest.rs | sed -n '1,50p'

Repository: PwnKit-Labs/foxguard

Length of output: 2881


🏁 Script executed:

cat -n src/rules/manifest.rs | sed -n '200,280p'

Repository: PwnKit-Labs/foxguard

Length of output: 3744


🏁 Script executed:

rg -n "extract_pip_package_name" src/rules/manifest.rs

Repository: PwnKit-Labs/foxguard

Length of output: 582


🏁 Script executed:

cat -n src/rules/manifest.rs | sed -n '60,120p'

Repository: PwnKit-Labs/foxguard

Length of output: 2938


🏁 Script executed:

cat -n src/rules/manifest.rs | sed -n '296,330p'

Repository: PwnKit-Labs/foxguard

Length of output: 1145


🏁 Script executed:

rg -A 10 "editable\|--editable\|-e " src/rules/manifest.rs

Repository: PwnKit-Labs/foxguard

Length of output: 46


🏁 Script executed:

cat -n src/rules/manifest.rs | sed -n '303,350p'

Repository: PwnKit-Labs/foxguard

Length of output: 905


Strip -e/--editable prefix before skipping local editable installs.

The blanket trimmed.starts_with('-') check prevents packages installed with -e python-rsa or --editable paramiko from being parsed and checked. Named editable installs are valid pip requirements and should be scanned for vulnerabilities. Only local editables like -e . should be skipped.

Suggested fix
-            if trimmed.is_empty()
-                || trimmed.starts_with('#')
-                || trimmed.starts_with('-')
-                || trimmed.starts_with("git+")
+            let trimmed = if let Some(rest) = trimmed.strip_prefix("-e ") {
+                if rest.trim() == "." {
+                    continue;
+                }
+                rest.trim()
+            } else if let Some(rest) = trimmed.strip_prefix("--editable ") {
+                rest.trim()
+            } else {
+                trimmed
+            };
+
+            if trimmed.is_empty()
+                || trimmed.starts_with('#')
+                || trimmed.starts_with("--")
+                || trimmed.starts_with("git+")
                 || trimmed.starts_with("http://")
                 || trimmed.starts_with("https://")
             {
                 continue;
             }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Skip blank, comments, options, URLs
if trimmed.is_empty()
|| trimmed.starts_with('#')
|| trimmed.starts_with('-')
|| trimmed.starts_with("git+")
|| trimmed.starts_with("http://")
|| trimmed.starts_with("https://")
{
continue;
}
// Skip blank, comments, options, URLs
let trimmed = if let Some(rest) = trimmed.strip_prefix("-e ") {
if rest.trim() == "." {
continue;
}
rest.trim()
} else if let Some(rest) = trimmed.strip_prefix("--editable ") {
rest.trim()
} else {
trimmed
};
if trimmed.is_empty()
|| trimmed.starts_with('#')
|| trimmed.starts_with("--")
|| trimmed.starts_with("git+")
|| trimmed.starts_with("http://")
|| trimmed.starts_with("https://")
{
continue;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/manifest.rs` around lines 234 - 243, In the manifest.rs parsing
block that currently uses trimmed.starts_with('-') to skip entries, special-case
editable flags instead: detect and strip leading "-e" or "--editable" (allowing
optional space) and then if the remaining value is a local path (e.g., "." or
starts with "./" or "/") skip it, otherwise assign the stripped value back to
trimmed and continue parsing so named editables like "-e python-rsa" or
"--editable paramiko" are processed; remove the broad trimmed.starts_with('-')
check and only skip hyphen-prefixed tokens when they indicate local editables.

- Fix clippy collapsible_str_replace: use replace(['_', '.'], "-")
- Fix rustfmt formatting for struct initializer arrays
- Fix source.find() returning first occurrence for duplicate crate
  names by searching for unique name+version pair
- Fix CRLF line ending drift in requirements.txt byte offset tracking
- Normalize pip_map keys so future entries with mixed case won't break
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/integration.rs`:
- Around line 3731-3737: The tests capture the child process output into
variable output and parse findings but never assert the process exit code;
update the assertions so that after parsing findings (Vec serde_json::Value from
output.stdout) you assert the process exited with a non-zero status when
findings.len() > 0 (e.g. check !output.status.success() or
output.status.code().unwrap_or(0) != 0), and add the same check in the second
occurrence that parses findings around the other block (the
foxguard_cmd()/output/findings flows).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 2aa2dfa3-3d9b-4852-9b54-adaeadc491ce

📥 Commits

Reviewing files that changed from the base of the PR and between b0b2309 and 3e61964.

📒 Files selected for processing (3)
  • src/rules/manifest.rs
  • src/rules/mod.rs
  • tests/integration.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/rules/mod.rs
  • src/rules/manifest.rs

Comment thread tests/integration.rs
Comment on lines +3731 to +3737
let output = foxguard_cmd()
.args(["pqc", "tests/fixtures/deps/Cargo.lock", "-f", "json"])
.output()
.expect("failed to execute foxguard");

let findings: Vec<serde_json::Value> =
serde_json::from_slice(&output.stdout).expect("invalid JSON output");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Assert pqc exit semantics for vulnerable manifest fixtures.

Line 3731 and Line 3779 validate findings content but don’t assert non-zero exit when findings exist. A CLI exit-code regression could pass these tests unnoticed.

Suggested patch
 fn cargo_lock_pq_finds_transitive_rsa_dep() {
     let output = foxguard_cmd()
         .args(["pqc", "tests/fixtures/deps/Cargo.lock", "-f", "json"])
         .output()
         .expect("failed to execute foxguard");

+    assert!(
+        !output.status.success(),
+        "pqc should exit non-zero when vulnerable dependencies are found"
+    );
+
     let findings: Vec<serde_json::Value> =
         serde_json::from_slice(&output.stdout).expect("invalid JSON output");
@@
 fn requirements_txt_pq_finds_crypto_deps() {
     let output = foxguard_cmd()
         .args(["pqc", "tests/fixtures/deps/requirements.txt", "-f", "json"])
         .output()
         .expect("failed to execute foxguard");

+    assert!(
+        !output.status.success(),
+        "pqc should exit non-zero when vulnerable dependencies are found"
+    );
+
     let findings: Vec<serde_json::Value> =
         serde_json::from_slice(&output.stdout).expect("invalid JSON output");

Also applies to: 3779-3786

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration.rs` around lines 3731 - 3737, The tests capture the child
process output into variable output and parse findings but never assert the
process exit code; update the assertions so that after parsing findings (Vec
serde_json::Value from output.stdout) you assert the process exited with a
non-zero status when findings.len() > 0 (e.g. check !output.status.success() or
output.status.code().unwrap_or(0) != 0), and add the same check in the second
occurrence that parses findings around the other block (the
foxguard_cmd()/output/findings flows).

@Darkroom4364
Copy link
Copy Markdown
Collaborator Author

@greptileai

@Darkroom4364 Darkroom4364 self-assigned this Apr 23, 2026
…-verify

id.contains("insecure-tls") matches non-PQ rules like
go/insecure-tls-skip-verify. Use exact match for the one
Dockerfile rule that belongs in the PQ set.
- Remove find_name_version_offset fallback that silently picked wrong
  package entry when name+version pair wasn't adjacent
- Stop BFS at seed crates instead of traversing through their deps
- Handle bare CR line endings in requirements.txt offset tracking
- Use .first().unwrap() over index [0] for reached_seeds
- Unify CrateEntry/PipEntry into SeedEntry, extract shared constants
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/rules/manifest.rs (1)

435-462: Good test coverage for pip package name extraction.

Tests cover the main parsing scenarios. Consider adding a test case for find_name_version_offset to validate CRLF handling and duplicate name disambiguation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/manifest.rs` around lines 435 - 462, Add a unit test for
find_name_version_offset that ensures it correctly handles CRLF line endings and
disambiguates duplicate package occurrences; specifically, create inputs with
CRLF ("\r\n") separators and repeated package names (e.g., "pkg" appearing in
multiple positions) and assert the returned offsets point to the correct
name/version instance and not a prior duplicate. Locate the function
find_name_version_offset in the manifest parser tests and add cases that check
both CRLF termination and duplicate-name selection behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/rules/manifest.rs`:
- Around line 435-462: Add a unit test for find_name_version_offset that ensures
it correctly handles CRLF line endings and disambiguates duplicate package
occurrences; specifically, create inputs with CRLF ("\r\n") separators and
repeated package names (e.g., "pkg" appearing in multiple positions) and assert
the returned offsets point to the correct name/version instance and not a prior
duplicate. Locate the function find_name_version_offset in the manifest parser
tests and add cases that check both CRLF termination and duplicate-name
selection behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 163e8536-4b4c-4229-9ecb-f2cbe37ae5ff

📥 Commits

Reviewing files that changed from the base of the PR and between 3e61964 and e6061a7.

📒 Files selected for processing (2)
  • src/app.rs
  • src/rules/manifest.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/app.rs

Cover CargoLockPqCrypto (BFS graph walk, seed detection, transitive
deps, version-qualified strings, confidence tiebreaking, BFS stopping
at seeds), RequirementsTxtPqCrypto (PEP 503 normalization, CRLF
offsets, comments/options skipping, environment markers, extras),
and find_name_version_offset (disambiguation, CRLF, missing entries).
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
src/rules/manifest.rs (2)

191-203: ⚠️ Potential issue | 🟠 Major

Match Cargo.lock dependencies by resolved version, not just crate name.

The dependency parser throws away the version from entries like "foo 1.2.3" and then links foo to every package entry with that name. In lockfiles that contain multiple versions of the same crate, this creates false transitive paths and can report PQ findings for packages that do not actually reach the vulnerable branch.


348-357: ⚠️ Potential issue | 🟠 Major

Do not skip named editable requirements.

The blanket trimmed.starts_with('-') filter drops valid requirements such as -e python-rsa and --editable paramiko, so those packages are never scanned. Strip editable prefixes first and only skip local/path editables or true option lines.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/manifest.rs` around lines 348 - 357, The current skip condition
`trimmed.starts_with('-')` in the parsing block accidentally drops named
editable requirements (e.g. `-e pkg`, `--editable pkg`); change the logic to
first detect and strip editable prefixes (`-e ` and `--editable `) from
`trimmed` and treat the remainder as a normal requirement to be scanned (e.g.
`let mut t = trimmed; if t.starts_with("-e ") { t = t.trim_start_matches("-e
").trim(); } else if t.starts_with("--editable ") { t =
t.trim_start_matches("--editable ").trim(); }`), and only after that skip lines
that are true options (other `-` flags) or local/path editables (e.g. after
stripping if `t` starts with `./`, `../`, `/`, or `file:` or looks like a path),
while removing the blanket `trimmed.starts_with('-')` check so `-e`/`--editable`
named packages are not ignored.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/rules/manifest.rs`:
- Around line 348-357: The current skip condition `trimmed.starts_with('-')` in
the parsing block accidentally drops named editable requirements (e.g. `-e pkg`,
`--editable pkg`); change the logic to first detect and strip editable prefixes
(`-e ` and `--editable `) from `trimmed` and treat the remainder as a normal
requirement to be scanned (e.g. `let mut t = trimmed; if t.starts_with("-e ") {
t = t.trim_start_matches("-e ").trim(); } else if t.starts_with("--editable ") {
t = t.trim_start_matches("--editable ").trim(); }`), and only after that skip
lines that are true options (other `-` flags) or local/path editables (e.g.
after stripping if `t` starts with `./`, `../`, `/`, or `file:` or looks like a
path), while removing the blanket `trimmed.starts_with('-')` check so
`-e`/`--editable` named packages are not ignored.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 51bae745-c570-40f2-bcfd-9da3c862fc67

📥 Commits

Reviewing files that changed from the base of the PR and between e6061a7 and 819c5af.

📒 Files selected for processing (1)
  • src/rules/manifest.rs

- Redesign BFS-stops-at-seed test: use ring (0.6) → rsa (0.9) chain
  so confidence value distinguishes correct vs incorrect traversal
- Assert column and end_column in CRLF offset test, not just line
- Add test for name-exists-but-version-mismatches → returns None
- Add multi-version diamond test (syn 1.x vs 2.x with version-
  qualified dep strings)
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
src/rules/manifest.rs (2)

194-202: ⚠️ Potential issue | 🟠 Major

Honor version-qualified Cargo.lock edges.

"syn 2.0.0" is reduced to "syn" here, so the graph links to every syn node. In a multi-version lockfile that can report a PQ path through the wrong version. Filter candidates by the parsed version when the dependency string includes one.

Suggested fix
-                    let dep_name = match dep.as_str() {
-                        Some(s) => s.split_once(' ').map_or(s, |(name, _)| name),
+                    let (dep_name, dep_version) = match dep.as_str() {
+                        Some(s) => s
+                            .split_once(' ')
+                            .map_or((s, None), |(name, version)| (name, Some(version))),
                         None => continue,
                     };
                     if let Some(indices) = name_to_indices.get(dep_name) {
                         for &j in indices {
-                            graph[i].push(j);
+                            let version_matches = match dep_version {
+                                Some(version) => {
+                                    packages[j].get("version").and_then(|v| v.as_str())
+                                        == Some(version)
+                                }
+                                None => true,
+                            };
+                            if version_matches {
+                                graph[i].push(j);
+                            }
                         }
                     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/manifest.rs` around lines 194 - 202, The current code reduces a
dependency like "syn 2.0.0" to "syn" and then links to every syn node; when
dep.as_str() contains a version (split_once returned (_, ver)), parse and use
that version to filter candidate indices from name_to_indices before pushing
into graph: for each &j in indices, only graph[i].push(j) if the package at
index j has the same version string parsed from dep (compare against whatever
structure holds package versions for indices), otherwise skip; keep the existing
fallback of linking all indices when no version is present.

348-357: ⚠️ Potential issue | 🟠 Major

Don't drop named editable requirements.

The blanket hyphen filter skips valid entries like -e python-rsa and --editable paramiko, so those packages are never checked. Strip the editable prefix first, keep skipping local paths like -e ., and then run the normal package-name parsing.

Suggested fix
-            let trimmed = line.trim();
+            let mut trimmed = line.trim();
+            if let Some(rest) = trimmed.strip_prefix("-e ") {
+                trimmed = rest.trim();
+                if trimmed == "." || trimmed.starts_with("./") || trimmed.starts_with('/') {
+                    continue;
+                }
+            } else if let Some(rest) = trimmed.strip_prefix("--editable ") {
+                trimmed = rest.trim();
+                if trimmed == "." || trimmed.starts_with("./") || trimmed.starts_with('/') {
+                    continue;
+                }
+            }
 
             // Skip blank, comments, options, URLs
             if trimmed.is_empty()
                 || trimmed.starts_with('#')
-                || trimmed.starts_with('-')
+                || trimmed.starts_with('-')
                 || trimmed.starts_with("git+")
                 || trimmed.starts_with("http://")
                 || trimmed.starts_with("https://")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/rules/manifest.rs` around lines 348 - 357, The current blanket check uses
trimmed.starts_with('-') and therefore drops named editable requirements like
"-e python-rsa" or "--editable paramiko"; instead, normalize the line by
removing leading editable flags ("-e ", "--editable ") into a new variable
(e.g., normalized) before applying the rest of the filters, then skip only local
editable paths (e.g., normalized == "." or normalized.starts_with("./") or
normalized.starts_with("/") or starts_with("file:")) and continue to run the
existing comment/URL/option checks against normalized; update the logic around
the existing trimmed variable in this block so functions/flows that parse
package names operate on normalized rather than the original trimmed string.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/rules/manifest.rs`:
- Around line 194-202: The current code reduces a dependency like "syn 2.0.0" to
"syn" and then links to every syn node; when dep.as_str() contains a version
(split_once returned (_, ver)), parse and use that version to filter candidate
indices from name_to_indices before pushing into graph: for each &j in indices,
only graph[i].push(j) if the package at index j has the same version string
parsed from dep (compare against whatever structure holds package versions for
indices), otherwise skip; keep the existing fallback of linking all indices when
no version is present.
- Around line 348-357: The current blanket check uses trimmed.starts_with('-')
and therefore drops named editable requirements like "-e python-rsa" or
"--editable paramiko"; instead, normalize the line by removing leading editable
flags ("-e ", "--editable ") into a new variable (e.g., normalized) before
applying the rest of the filters, then skip only local editable paths (e.g.,
normalized == "." or normalized.starts_with("./") or normalized.starts_with("/")
or starts_with("file:")) and continue to run the existing comment/URL/option
checks against normalized; update the logic around the existing trimmed variable
in this block so functions/flows that parse package names operate on normalized
rather than the original trimmed string.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b4475346-e67c-4718-89da-e6b9319d011e

📥 Commits

Reviewing files that changed from the base of the PR and between 819c5af and a9dff77.

📒 Files selected for processing (1)
  • src/rules/manifest.rs

@Darkroom4364
Copy link
Copy Markdown
Collaborator Author

Darkroom4364 commented Apr 23, 2026

Summary of changes since initial push

Round 1 — CI + reviewer feedback (3e61964)

  • Clippy: collapsed consecutive .replace() calls
  • Rustfmt: expanded struct initializer arrays
  • Greptile P1: source.find() returned first occurrence for duplicate crate names — replaced with find_name_version_offset() that searches for unique name+version pair
  • Greptile P1: CRLF line endings caused drifting byte offsets — now checks actual byte at line boundary
  • Greptile P2: pip_map keys normalized at construction time so future entries with mixed case/underscores won't silently fail

Round 2 — audit fixes (53fe6b2, e6061a7)

  • Critical: is_pq_rule_id had id.contains("insecure-tls") which false-positived on go/insecure-tls-skip-verify (not PQ-related) — changed to exact match id == "config/dockerfile-insecure-tls-env"
  • find_name_version_offset fallback: removed silent fallback to first occurrence when name+version pair isn't adjacent — returns None, caller skips gracefully
  • BFS traversal: now stops at seed crates instead of traversing through their deps (prevents accumulating unreachable seeds)
  • Bare CR: offset tracking handles \r-only line endings (not just \r\n and \n)
  • Minor: reached_seeds[0].first().unwrap(), unified CrateEntry/PipEntry into SeedEntry, extracted shared constants

Round 3 — unit test coverage (819c5af, a9dff77)

27 unit tests added covering all previously-untested code paths:

find_name_version_offset (5 tests): exact pair match, duplicate crate disambiguation, missing entry, version mismatch → None, CRLF handling

CargoLockPqCrypto::check (8 tests): direct seed dep, seed crate skipping, transitive deps, BFS stops at seeds (verified via confidence: ring 0.6 → rsa 0.9 chain), tier 1 beats tier 2, clean lockfile, invalid TOML, version-qualified dep strings, multi-version diamond

RequirementsTxtPqCrypto::check (7 tests): known package detection, algorithm attribution, fix suggestions, comments/options/URLs skipped, environment markers, extras, PEP 503 normalization, CRLF byte offsets (asserts column + end_column), empty input

Round 4 — version-qualified dep resolution (d0b4957, 6ba0aee)

  • Bug fix: version-qualified dep strings like "syn 2.0.0" were split on space but the version was discarded — edges fanned out to all indices with that crate name. Added name_ver_to_index lookup so version-qualified strings resolve to exactly one package
  • Test fix: multi-version diamond test was tautological — swapped seeds so syn 1.0 → rsa (0.9) and syn 2.0 → ring (0.6). Assert confidence == 0.6 now genuinely fails on the old buggy code

Intentionally deferred (not blocking)

  • pip_map rebuilt per check() call — trivial perf, not worth OnceLock complexity
  • unwrap_or(trimmed) dead branch on split(';').next() — cosmetic
  • find_name_version_offset assumes version immediately follows name — true for all Cargo versions, undocumented invariant
  • Language::Manifest parses through tree_sitter_bash — tree is unused, but <1ms and matches existing config rule pattern

PQC sitrep

The PQ scanning arc is feature-complete across all three layers:

Layer Rules Status
Source-level {py,js,go,java,rs}/pq-vulnerable-crypto, {py,js,java}/hardcoded-crypto-algorithm Shipped on main
Config-level config/{nginx,apache,haproxy}-pq-vulnerable-tls, config/dockerfile-insecure-tls-env Shipped on main
Dependency-level manifest/cargo-pq-vulnerable-dep, manifest/pip-pq-vulnerable-dep This PR

Supporting infrastructure (all on main): CBOM output, CNSA 2.0 compliance panel in TUI, --pq-mode flag, confidence scoring, cnsa2_deadline annotations.

Not in v1 scope (documented in plan): poetry.lock/uv.lock, go.mod, package-lock.json, pom.xml, cross-rule dedup between source and manifest findings, requirements.txt variants.

556 tests pass, 0 fail. No regressions.

Version-qualified dep strings like "syn 2.0.0" were split on space
but the version was discarded — edges fanned out to all indices with
that crate name regardless of version. Add name_ver_to_index lookup
so version-qualified strings resolve to exactly one package.

Strengthen multi-version diamond test: syn 1.0 now depends on ring
(0.6) while syn 2.0 depends on rsa (0.9), so incorrect edge
resolution would produce a different confidence value.
Swap seeds: syn 1.0 → rsa (0.9), syn 2.0 → ring (0.6). Old buggy
code (fan-out to both versions) would reach rsa and report 0.9.
Correct code resolves to syn 2.0 only → ring → 0.6. The assertion
now actually discriminates the bug from the fix.
Avoids mutating the vec just to pick one element.
@peaktwilight
Copy link
Copy Markdown
Collaborator

Approved. Code quality is strong — BFS dedup is sound, the multi-version-diamond test is exactly the right regression guard, dep_name is safe for baselines (not in the fingerprint hash), all 28 Finding constructors are updated (no merge-skew), and the cnsa2_deadline() metadata on both rules + is_pq_rule_id substring filter means these flow cleanly through --cnsa2 and foxguard pqc.

Merging. Will file a follow-up issue for the non-blocking items below.

Follow-ups (filing as a separate issue, non-blocking)

1. Seed list has real gaps. Missing from CARGO_SEEDS: k256 (Bitcoin/Ethereum ubiquity, same category as p256), secp256k1, openssl (the high-level Rust crate, not just openssl-sys). Missing from PIP_PACKAGES: pyjwt, authlib, python-jose, jwcrypto — JWT libs are pinned to RS256/ES256 by default, arguably higher-signal than cryptography itself.

2. fabric attribution. fabric doesn't implement crypto — it wraps paramiko. Setting crypto_algorithm: Some("RSA") is misleading; should be None like cryptography.

3. SARIF dep_name passthrough. src/report/sarif.rs currently drops it. Add as properties.depName alongside confidence and cnsa2Deadline.

4. Document dev-deps limitation. Cargo.lock doesn't distinguish dev from prod deps, so a cargo-tarpaulin-only rsa dep flags the whole app. Not fixable without parsing Cargo.toml too; worth a line in the rule description.

5. Future lockfile formats. Pipfile.lock, poetry.lock, uv.lock are more canonical than raw requirements.txt for modern Python projects.

None of these block merge. Closes #221.

@peaktwilight peaktwilight merged commit 3edae66 into main Apr 24, 2026
17 checks passed
@peaktwilight peaktwilight deleted the feat/dep-level-pq-scanning branch April 24, 2026 07:37
peaktwilight added a commit that referenced this pull request Apr 24, 2026
…s next

#260 just landed dependency-level PQ scanning for Cargo.lock and
requirements.txt — mention it in the 'What 0.8 does' section and
refresh the 'What's next' section to point at #262 (more lockfile
formats) since #221 is now closed.
peaktwilight added a commit that referenced this pull request Apr 24, 2026
* docs: add v0.8.0 PQ crypto audit blog post (refs #254)

Announces foxguard 0.8's PQ-vulnerable-crypto rules and CNSA 2.0
deadline annotations, with explicit prior-art acknowledgment of
CodeQL's PQC queries, IBM CBOMkit, sonar-cryptography, and cdxgen.

* docs(blog): add slh_dsa to allowlist list, use @latest on npx

Greptile review caught two real issues:
- The PQ-safe allowlist in src/rules/rust_lang.rs has 5 crates
  (ml_kem, ml_dsa, slh_dsa, fn_dsa, hqc) but the post only listed 4.
  Added slh_dsa.
- Other blog posts consistently use 'npx foxguard@latest' to avoid
  npm cache staleness; this post was missing the @latest tag. Fixed.

* docs(blog): add dependency-level PQ scanning paragraph; refresh what's next

#260 just landed dependency-level PQ scanning for Cargo.lock and
requirements.txt — mention it in the 'What 0.8 does' section and
refresh the 'What's next' section to point at #262 (more lockfile
formats) since #221 is now closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants