UNITARES Verdict Counterfactual — Paper v6.8 Reproducibility Kit

license: cc-by-4.0
language: en
size_categories: 10K<n<100K
task_categories: other
pretty_name: UNITARES Verdict Counterfactual — Paper v6.8 Repro Kit

Reproducibility artifacts for §11.6 of UNITARES: Information-Theoretic Governance of Heterogeneous Agent Fleets (Wang, 2026).

Paper: CIRWEL/unitares-paper-v6, tag paper-v6.8.1
Paper DOI (concept): 10.5281/zenodo.19647159
Data DOI (this release, v6.8.1-repro): 10.5281/zenodo.19705151
GitHub: CIRWEL/unitares-repro-v6
Source: UNITARES production governance database, 30-day rolling window of core.agent_state
License: CC-BY-4.0 (same as paper)

What this is

The paper's §11.6 counterfactual asks: how many basin assignments would flip if we replaced the legacy fleet-wide tanh coherence with a class-conditional grounded coherence? On a 13,310-row production slice it reports a 28.9% flip rate, with directional bias into the low basin — empirical support for the homogenization-failure thesis.
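A minimal sketch of how the directional bias could be checked, using the `basin_legacy` and `basin_grounded` columns described under Columns. The helper names `transition_counts` and `net_inflow` are hypothetical and are not taken from the kit's scripts, and the rows are synthetic.

```python
from collections import Counter

BASINS = ("high", "boundary", "low")  # basin labels listed in the Columns table


def transition_counts(rows):
    """Count (basin_legacy, basin_grounded) pairs across rows."""
    return Counter((r["basin_legacy"], r["basin_grounded"]) for r in rows)


def net_inflow(counts, basin):
    """Rows entering `basin` under grounded coherence minus rows leaving it."""
    entering = sum(n for (src, dst), n in counts.items()
                   if dst == basin and src != basin)
    leaving = sum(n for (src, dst), n in counts.items()
                  if src == basin and dst != basin)
    return entering - leaving


# Synthetic rows for illustration only; not drawn from the published CSVs.
example_rows = [
    {"basin_legacy": "high", "basin_grounded": "low"},
    {"basin_legacy": "boundary", "basin_grounded": "low"},
    {"basin_legacy": "low", "basin_grounded": "boundary"},
    {"basin_legacy": "high", "basin_grounded": "high"},
]

counts = transition_counts(example_rows)
for basin in BASINS:
    print(basin, net_inflow(counts, basin))
```

A positive net inflow for `low` alongside a negative one for `high` is the pattern that "directional bias into the low basin" would produce. The net inflows always sum to zero, since every flipped row leaves one basin and enters another.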

This dataset publishes two snapshots of that counterfactual:

| File | Window | Rows | Flip rate |
| --- | --- | --- | --- |
| `verdict_counterfactual_v6_submission.csv` | 30 days ending 2026-04-18 21:00 MDT | 13,292 | 28.8% |
| `verdict_counterfactual_2026-04-23.csv` | 30 days ending 2026-04-23 | 16,879 | 44.3% |

The submission snapshot is within 18 rows and 0.1pp of the figures in the published paper (13,292 vs. 13,310 rows; 28.8% vs. 28.9%). The exact rerun timing varies by seconds, which accounts for the small difference.

The 2026-04-23 snapshot shows a ~15pp increase in overall flip rate (28.8% to 44.3%) over 4 days of window shift, with the Phase 2 calibration constants held frozen at their v6.8 submission values. We do not claim this is a steady-state drift signal: a 4-day gap with ~87% window overlap is one measurement, not a trend, and the magnitude is within the kind of shift that a single class-conditional re-calibration pass could absorb.
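The overlap and flip-rate figures quoted above follow from simple arithmetic, sketched here. The 30-day window, the 4-day shift, and the two flip rates come from the text and the snapshot table; the function name `window_overlap` is hypothetical.

```python
WINDOW_DAYS = 30  # window length stated for both snapshots
SHIFT_DAYS = 4    # gap between the two window end points, as stated in the text


def window_overlap(window_days, shift_days):
    """Fraction of a fixed-length window still shared after shifting it."""
    shared = max(window_days - shift_days, 0)
    return shared / window_days


overlap = window_overlap(WINDOW_DAYS, SHIFT_DAYS)
delta_pp = round(44.3 - 28.8, 1)  # flip rates from the snapshot table

print(f"window overlap   = {overlap:.1%}")   # 86.7%
print(f"flip-rate change = {delta_pp} pp")   # 15.5 pp
```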

Volume checks confirm the shift is not dominated by observer effects: daily row counts over the interval are ~800–1,500, with no spikes. The per-class pattern is informative: four classes rose (Sentinel +25.4pp, Vigil +19.6pp, Lumen +15.8pp, default +15.3pp) while Watcher fell (−11.3pp). We publish both snapshots so others can reproduce §11.6 exactly and so the v7 empirical agenda has a concrete before/after pair to iterate from. See paper §11.6 and unitares-v7-outline.tex for the open questions this speaks to.
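A per-class comparison between two snapshots could be computed along these lines. This is a stdlib-only sketch under the assumption that each row is a dict keyed by the column names `class` and `flipped`; the function names are hypothetical and the rows are synthetic, so the printed numbers are not the published ones.

```python
from collections import defaultdict


def flip_rate_by_class(rows):
    """Return {class: flip rate in percent} from rows with `class` and `flipped`."""
    totals = defaultdict(int)
    flips = defaultdict(int)
    for r in rows:
        totals[r["class"]] += 1
        flips[r["class"]] += int(r["flipped"])  # CSV values arrive as strings
    return {c: 100.0 * flips[c] / totals[c] for c in totals}


def delta_pp(before, after):
    """Percentage-point change per class, for classes present in both snapshots."""
    return {c: after[c] - before[c] for c in before if c in after}


# Synthetic rows for illustration only.
early = [
    {"class": "Sentinel", "flipped": "0"},
    {"class": "Sentinel", "flipped": "1"},
    {"class": "Watcher", "flipped": "1"},
    {"class": "Watcher", "flipped": "1"},
]
late = [
    {"class": "Sentinel", "flipped": "1"},
    {"class": "Sentinel", "flipped": "1"},
    {"class": "Watcher", "flipped": "0"},
    {"class": "Watcher", "flipped": "1"},
]

change = delta_pp(flip_rate_by_class(early), flip_rate_by_class(late))
for cls, pp in sorted(change.items()):
    print(f"{cls}: {pp:+.1f} pp")
```

With pandas installed, `df.groupby("class")["flipped"].mean()` gives the same per-class rates as fractions.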

Columns

Each row is one agent-state observation, already pseudonymized to a class label — no agent UUIDs, session IDs, prompts, or KG content are in this export.

| Column | Type | Meaning |
| --- | --- | --- |
| `class` | str | Agent class: `Lumen`, `Sentinel`, `Vigil`, `Watcher`, or `default` (tag-derived for non-resident agents) |
| `E` | float ∈ [0, 1] | Energy (productive capacity) |
| `I` | float ∈ [0, 1] | Information integrity |
| `S` | float ∈ [0, 1] | Entropy (lower is better) |
| `V` | float ∈ [-1, 1] | Void / accumulated E–I imbalance |
| `risk` | float ∈ [0, 1] | Risk score at time of observation |
| `c_legacy` | float ∈ [0, 1] | Legacy fleet-wide tanh coherence |
| `c_grounded` | float ∈ [0, 1] | Class-conditional grounded coherence |
| `basin_legacy` | str | Basin assignment under legacy coherence: `high`, `boundary`, `low` |
| `basin_grounded` | str | Basin assignment under grounded coherence |
| `flipped` | int | 1 if `basin_legacy != basin_grounded`, else 0 |

Basin thresholds and class-conditional calibration constants are in scripts/verdict_counterfactual.py.
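Before analysis, the documented ranges and the definition of `flipped` can be sanity-checked row by row. The following is a sketch rather than part of the kit: `row_problems` is a hypothetical helper, and it assumes each row is a dict keyed by the column names above, with values as strings or numbers.

```python
UNIT_COLUMNS = ("E", "I", "S", "risk", "c_legacy", "c_grounded")  # documented as [0, 1]
BASINS = {"high", "boundary", "low"}


def row_problems(row):
    """Return a list of human-readable schema violations for one row."""
    problems = []
    for col in UNIT_COLUMNS:
        if not 0.0 <= float(row[col]) <= 1.0:
            problems.append(f"{col} outside [0, 1]")
    if not -1.0 <= float(row["V"]) <= 1.0:
        problems.append("V outside [-1, 1]")
    for col in ("basin_legacy", "basin_grounded"):
        if row[col] not in BASINS:
            problems.append(f"{col} has unknown label {row[col]!r}")
    # `flipped` is documented as 1 exactly when the two basin labels differ.
    expected = int(row["basin_legacy"] != row["basin_grounded"])
    if int(row["flipped"]) != expected:
        problems.append("flipped disagrees with basin columns")
    return problems


# Synthetic rows for illustration only.
good = {"class": "Lumen", "E": "0.7", "I": "0.8", "S": "0.2", "V": "-0.1",
        "risk": "0.3", "c_legacy": "0.6", "c_grounded": "0.4",
        "basin_legacy": "high", "basin_grounded": "boundary", "flipped": "1"}
bad = dict(good, S="1.2", flipped="0")

print(row_problems(good))  # []
print(row_problems(bad))
```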

How to reproduce

pip install pandas matplotlib
python analysis.py  # prints the Table 5 equivalent + drift comparison
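For readers who want the headline number without running `analysis.py`, the overall flip rate can be recomputed with the standard library alone. This sketch is not the kit's `analysis.py`: it assumes a comma-delimited file with a header row matching the documented column names, and `overall_flip_rate` is a hypothetical helper shown here on a small in-memory sample.

```python
import csv
import io


def overall_flip_rate(csv_file):
    """Return (row count, flip rate in percent) from an open CSV file object."""
    n = flips = 0
    for row in csv.DictReader(csv_file):
        n += 1
        flips += int(row["flipped"])
    if n == 0:
        raise ValueError("no data rows")
    return n, 100.0 * flips / n


# Synthetic sample for illustration only; columns are a subset of the real schema.
sample = io.StringIO(
    "class,basin_legacy,basin_grounded,flipped\n"
    "Lumen,high,high,0\n"
    "Vigil,high,low,1\n"
    "Watcher,low,low,0\n"
    "default,boundary,boundary,0\n"
)

rows, rate = overall_flip_rate(sample)
print(rows, rate)  # 4 25.0
```

To run it on a real snapshot, pass a handle from `open(path, newline="")`, where `path` points at one of the two CSVs (their exact location within the repository is not stated here).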

To regenerate either CSV from a live UNITARES instance (requires access to the governance DB):

python scripts/verdict_counterfactual.py --window-days 30 \
  --end-date "2026-04-18 21:00:00-06:00" \
  --csv --output verdict_counterfactual_v6_submission.csv
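The window implied by `--window-days 30` and the `--end-date` value above can be worked out as follows. This is a sketch of the date arithmetic only; how `scripts/verdict_counterfactual.py` treats the boundaries (inclusive or exclusive) is not shown here, and `window_bounds` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def window_bounds(end_date, window_days):
    """Return (start, end) datetimes for a window ending at `end_date`."""
    end = datetime.fromisoformat(end_date)  # keeps the -06:00 UTC offset
    start = end - timedelta(days=window_days)
    return start, end


start, end = window_bounds("2026-04-18 21:00:00-06:00", 30)
print(start.isoformat())  # 2026-03-19T21:00:00-06:00
print(end.isoformat())    # 2026-04-18T21:00:00-06:00
```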

Limitations

Fleet composition is UNITARES-specific (five resident agents + ephemeral coding assistants + embodied Lumen). The absolute flip rates are not directly transferable to other fleets; the method is.
Paper §12.5 (write-path hygiene) and §11.7 (identity system maturity) describe known caveats in the underlying trajectory data. Read both sections before citing these numbers in derivative work.
Class-conditional calibration constants in the script are frozen at v6.8 submission. Regenerating the 2026-04-23 snapshot against the current production calibration (not frozen) would yield a different number — we deliberately use the frozen constants to isolate fleet drift from calibration drift.

Citation

Cite the paper for the method and the dataset for the data.

@article{wang2026unitares,
  author  = {Wang, Kenny},
  title   = {UNITARES: Information-Theoretic Governance of Heterogeneous Agent Fleets},
  year    = {2026},
  version = {v6.8.1},
  doi     = {10.5281/zenodo.19647159},
  url     = {https://github.com/CIRWEL/unitares-paper-v6}
}

@dataset{wang2026unitares_repro_v6,
  author    = {Wang, Kenny},
  title     = {UNITARES Verdict Counterfactual — Paper v6.8 Reproducibility Kit},
  year      = {2026},
  version   = {v6.8.1-repro},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.19705151},
  url       = {https://github.com/CIRWEL/unitares-repro-v6}
}
