Problem
Today the only published validation is Semgrep parity (`tests/semgrep_parity.rs`): "foxguard finds what Semgrep finds on this fixed corpus." That is a correctness check against another tool, not a measurement of precision against real code. There is no public FP rate, no labeled corpus, and no per-rule numbers.
Users evaluating the scanner need to know: for rule X, out of N findings on real code, how many are true positives?
Proposed approach
- Labeled corpus. Assemble a small set of real OSS repos (pinned by SHA). For each rule that fires, label every finding TP / FP / unsure with a one-line justification, and store the labels as JSON alongside the corpus (a possible record format is sketched after this list).
- Methodology doc. Write `docs/false-positive-methodology.md` explaining corpus selection, labeling criteria, and how to reproduce the numbers.
- Per-rule precision table. Generate a `rule_id | findings | TP | FP | precision` table from the labeled data, publish it on the docs site, and link it from the README.
- Regression harness. Re-run labeling (or at minimum re-check per-rule finding counts) in CI so precision doesn't silently regress when rules change (see the count-drift check sketched below).
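A minimal sketch of what the label records and the precision rollup could look like, assuming serde/serde_json are available and a hypothetical `benchmarks/precision/labels.json` with one record per finding. The field names (`rule_id`, `repo`, `sha`, `path`, `line`, `label`, `note`) are illustrative, not a committed format.

```rust
use std::collections::BTreeMap;
use std::fs;

use serde::Deserialize;

#[derive(Deserialize)]
#[serde(rename_all = "lowercase")]
enum Label {
    Tp,
    Fp,
    Unsure,
}

// Fields beyond rule_id/label are kept to document the proposed schema.
#[allow(dead_code)]
#[derive(Deserialize)]
struct LabeledFinding {
    rule_id: String,
    repo: String, // OSS repo the finding came from
    sha: String,  // pinned commit the finding was labeled against
    path: String,
    line: u32,
    label: Label,
    note: String, // one-line justification for the label
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let data = fs::read_to_string("benchmarks/precision/labels.json")?;
    let findings: Vec<LabeledFinding> = serde_json::from_str(&data)?;

    // rule_id -> (total findings, TP, FP); "unsure" counts toward total only.
    let mut per_rule: BTreeMap<String, (u32, u32, u32)> = BTreeMap::new();
    for f in &findings {
        let entry = per_rule.entry(f.rule_id.clone()).or_default();
        entry.0 += 1;
        match f.label {
            Label::Tp => entry.1 += 1,
            Label::Fp => entry.2 += 1,
            Label::Unsure => {}
        }
    }

    // Emit the per-rule table as markdown for the docs site.
    println!("| rule_id | findings | TP | FP | precision |");
    println!("|---------|----------|----|----|-----------|");
    for (rule, (total, tp, fp)) in &per_rule {
        // Precision over definitively labeled findings only: TP / (TP + FP).
        let precision = if tp + fp > 0 {
            *tp as f64 / (tp + fp) as f64
        } else {
            f64::NAN
        };
        println!("| {rule} | {total} | {tp} | {fp} | {precision:.2} |");
    }
    Ok(())
}
```

Whether "unsure" findings should be excluded from the denominator (as above) or counted against precision is a methodology decision that belongs in `docs/false-positive-methodology.md`.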
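For the regression harness, one cheap option is a count-drift check rather than full re-labeling on every PR. The sketch below assumes a prior CI step has already scanned the pinned corpus and written its output to a hypothetical `target/precision-findings.json` containing a JSON array of findings with a `rule_id` field; the actual invocation and output format are still to be decided.

```rust
use std::collections::BTreeMap;
use std::fs;

use serde::Deserialize;

// Only rule_id is needed for the drift check; other fields are ignored.
#[derive(Deserialize)]
struct Finding {
    rule_id: String,
}

fn counts(path: &str) -> BTreeMap<String, usize> {
    let data = fs::read_to_string(path).expect("missing findings file");
    let findings: Vec<Finding> = serde_json::from_str(&data).expect("bad findings JSON");
    let mut map = BTreeMap::new();
    for f in findings {
        *map.entry(f.rule_id).or_insert(0) += 1;
    }
    map
}

#[test]
fn precision_counts_do_not_drift() {
    let baseline = counts("benchmarks/precision/labels.json");
    let current = counts("target/precision-findings.json");

    let mut drifted: Vec<String> = Vec::new();
    // Rules whose count changed need their new findings (re)labeled before merge.
    for (rule, n) in &baseline {
        if current.get(rule) != Some(n) {
            drifted.push(rule.clone());
        }
    }
    // Rules that fire now but were never labeled also need attention.
    for rule in current.keys() {
        if !baseline.contains_key(rule) {
            drifted.push(rule.clone());
        }
    }
    assert!(drifted.is_empty(), "per-rule finding counts drifted: {drifted:?}");
}
```

This keeps CI fast (no human labeling in the loop) while forcing a labeling pass whenever a rule change alters what fires on the pinned corpus.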
Non-goals
- Recall measurement (needs ground-truth vuln datasets; separate effort).
- Benchmarking against Semgrep/CodeQL on precision — we can't publish their rules' numbers.
Acceptance
- First version of the labeled corpus committed under `benchmarks/precision/`.
- Methodology doc merged.
- Per-rule precision table published for at least the top 20 most-triggered rules.