Record: SP8192 QRescue + JEPA-Lite + LQER + Pergroup/lrzip + Legal TTT — val_bpb 1.08064 by H1cSuNtDr4C0n3S · Pull Request #2027 · openai/parameter-golf

H1cSuNtDr4C0n3S · 2026-04-30T21:58:52Z

Summary

Adds records/track_10min_16mb/2026-04-26_SP8192_QRescue_JEPALite_LegalTTT.
Reported 3-seed mean val_bpb = 1.08064386 (mean of the TTT BPB column below), sample std 0.00096256.
Uses SP8192 QRescue/Hessian SDClip stack with training-side JEPA-Lite, LQER rank-4 residuals, pergroup compression with system lrzip -z -L 9, and legal chunkwise score-first full-SGD TTT.
No tokenizer or dataset changes.
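The summary names "chunkwise score-first" TTT without spelling out the loop. As a minimal sketch of one plausible reading, the toy below scores each chunk with the parameters as they stand before any update on that chunk, then takes an SGD step on it, and never rescores. The one-parameter Bernoulli model, the function name `score_first_ttt`, and the `chunk_size` and `lr` values are all hypothetical illustrations; this is not the code in train_gpt.py.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def score_first_ttt(bits, chunk_size=4, lr=0.5):
    """Toy score-first TTT: returns average bits per symbol over a 0/1 stream.

    Hypothetical one-parameter model: p(bit == 1) = sigmoid(logit).
    """
    logit = 0.0
    total_bits = 0.0
    for start in range(0, len(bits), chunk_size):
        chunk = bits[start:start + chunk_size]
        # 1) Score the chunk with the parameters as they are BEFORE any update on it.
        p = sigmoid(logit)
        for b in chunk:
            total_bits += -math.log2(p if b == 1 else 1.0 - p)
        # 2) Only then take one SGD step on the same chunk.
        #    Gradient of the mean Bernoulli NLL with respect to the logit is (p - b).
        grad = sum(p - b for b in chunk) / len(chunk)
        logit -= lr * grad
        # 3) The chunk is never rescored after the update.
    return total_bits / len(bits)


if __name__ == "__main__":
    stream = [1] * 8
    print(score_first_ttt(stream, chunk_size=8))  # one chunk: nothing is scored after an update
    print(score_first_ttt(stream, chunk_size=4))  # second chunk benefits from the first update
```

In this toy, every symbol's score depends only on symbols from earlier chunks, which is the property that the "score before update" and "no rescore" wording appears to describe.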

3-Seed Results

Seed	TTT BPB	Sliding BPB	Artifact bytes	Train seconds	Eval seconds
42	1.07971401	1.08152899	15,693,775	588.075	503.000
314	1.08163610	1.08312592	15,696,850	588.067	512.571
999	1.08058146	1.08208204	15,695,674	588.030	495.416
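The reported mean and std can be reproduced from the TTT BPB column of the table above. The sketch below assumes the std is the sample standard deviation (n - 1 denominator), which is what matches the reported figure; the variable names are illustrative.

```python
import statistics

# TTT BPB per seed, copied from the 3-Seed Results table.
ttt_bpb = {42: 1.07971401, 314: 1.08163610, 999: 1.08058146}

mean_bpb = statistics.mean(ttt_bpb.values())
std_bpb = statistics.stdev(ttt_bpb.values())  # sample std (n - 1 denominator)

print(f"mean={mean_bpb:.8f} std={std_bpb:.8f}")
# prints: mean=1.08064386 std=0.00096256
```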

Compliance

roundtrip_ok: True for all seeds.
compressor_used: pergroup-lrzip for all seeds.
Artifacts are under the decimal 16,000,000 byte cap.
Training and eval are under 600s for all seeds.
TTT logs show chunkwise_score_first_full_sgd, score_before_update: true, and no_rescore: true.
lrzip is documented in the README as a system dependency (apt-get install -y lrzip).
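The size and time claims above can be checked directly against the 3-Seed Results table. This is a small sketch, assuming the 600 s limit applies separately to training and to eval as the wording suggests; the constant and function names are hypothetical and do not come from validate_submission_artifacts.py.

```python
# Per-seed values copied from the 3-Seed Results table.
RESULTS = {
    42: {"artifact_bytes": 15_693_775, "train_s": 588.075, "eval_s": 503.000},
    314: {"artifact_bytes": 15_696_850, "train_s": 588.067, "eval_s": 512.571},
    999: {"artifact_bytes": 15_695_674, "train_s": 588.030, "eval_s": 495.416},
}

ARTIFACT_CAP_BYTES = 16_000_000  # decimal cap stated in the Compliance list
TIME_LIMIT_S = 600.0             # assumed to apply to train and eval separately


def is_compliant(row: dict) -> bool:
    """True if one seed's artifact size and both timings are under the limits."""
    return (
        row["artifact_bytes"] < ARTIFACT_CAP_BYTES
        and row["train_s"] < TIME_LIMIT_S
        and row["eval_s"] < TIME_LIMIT_S
    )


all_ok = all(is_compliant(row) for row in RESULTS.values())
min_headroom = min(ARTIFACT_CAP_BYTES - row["artifact_bytes"] for row in RESULTS.values())

print(all_ok, min_headroom)
# prints: True 303150
```

The tightest margin is seed 314, whose artifact sits 303,150 bytes under the cap.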

Validation

python -m py_compile train_gpt.py parse_run_logs.py update_submission_json.py validate_submission_artifacts.py
python validate_submission_artifacts.py --log train_seed42.log
python validate_submission_artifacts.py --log train_seed314.log
python validate_submission_artifacts.py --log train_seed999.log

@dexhunter

Audits every CaseOps-lineage record-track PR (merged + unmerged) since 2026-04-18 for whether val docs are also in the training set.

Working set: 34 PRs (31 from chronological seed list + 3 discovered ancestors: openai#1908, openai#1923, openai#2007). Boundary nodes openai#1493 / openai#1626 (pre-CaseOps).

Verdicts:
- CLEAN (8): openai#1729, openai#1851, openai#1868, openai#1908, openai#2019, openai#2027, openai#2031, openai#2068
- LEAK (25): openai#1736 (our research baseline) → openai#1769 → openai#1787 → openai#1797 → openai#1855 → V21 family (openai#1945, openai#1923, openai#1953, openai#1967) → openai#2018 → openai#2118 (current claimed frontier 1.04350), plus siblings.
- INHERIT (1): openai#2050 (eval-only on frozen openai#1915)

Code-level evidence (not README claims):
- Every shipped prepare_caseops_data.py is byte-identical: SHARD_TOKENS=10_000_000, default=10_000 for --val-docs
- NO PR overrides --val-docs (searched all .sh files in all 34 PRs)
- cached_challenge_fineweb.py downloads from romeerp/parameter-golf-caseops-v1 HF dataset whose manifest pins docs_val=50000, docs_train=8181945, sums match → CLEAN by construction
- PR openai#2018's DATASET_AUDIT.md is gold-standard explicit leak description
- PR openai#2118's submission.json admits "--val-docs=10000 train shards + 50k val eval"

Three signposts:
- Leak introduced: PR openai#1736 by @dexhunter (Apr 19), first prepare_caseops_data.py default invocation
- Leak fixed: PR openai#1851 by @aquariouseworkman (Apr 27), switched to HF dataset
- Leak re-introduced: PR openai#1855 by @codemath3000 (same day), rebuilt locally

The merged-leaderboard SOTA (openai#1851/openai#1868 at 1.06128/1.06141) is CLEAN. The unmerged frontier (openai#2118 at 1.04350) is LEAK. The 0.018 bpb gap is inflated by val memorization; spec 301 was designed to measure how much remains under clean data.

Files:
- caseops-memory-leakage/README.md: overview, methodology, takeaways
- caseops-memory-leakage/verdicts.md: 34-row master table with evidence
- caseops-memory-leakage/family-tree.md: ASCII trees with [C]/[L] annotations
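The numbers in this audit can be tied together with a little arithmetic. The sketch below assumes that, with the default --val-docs=10000, documents beyond the first 10,000 are written to train shards while evaluation still uses all 50,000 val docs; that reading is an inference from the quoted submission.json line, not something verified here. Variable names are illustrative.

```python
DOCS_VAL_EVAL = 50_000     # docs_val pinned by the HF dataset manifest
VAL_DOCS_DEFAULT = 10_000  # --val-docs default in prepare_caseops_data.py

# Assumption: val docs past the first VAL_DOCS_DEFAULT end up in the train shards.
leaked_docs = DOCS_VAL_EVAL - VAL_DOCS_DEFAULT
leaked_fraction = leaked_docs / DOCS_VAL_EVAL

merged_sota_bpb = 1.06128        # openai#1851, CLEAN
unmerged_frontier_bpb = 1.04350  # openai#2118, LEAK
gap_bpb = merged_sota_bpb - unmerged_frontier_bpb

print(leaked_docs, f"{leaked_fraction:.0%}", f"{gap_bpb:.5f}")
# prints: 40000 80% 0.01778
```

Under that assumption, 40,000 of the 50,000 evaluated val docs would also appear in training, and the gap between the clean SOTA and the leaked frontier is about 0.018 bpb.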

Add SP8192 QRescue LQER pergroup submission

142d27c

H1cSuNtDr4C0n3S marked this pull request as ready for review April 30, 2026 22:01

NewyorkDev mentioned this pull request May 1, 2026

Record: BIJEPAX-lite JEPA + SP8192 CaseOps PPM — val_bpb 0.97271 #2080

Open

leon2k2k2k mentioned this pull request May 1, 2026

Train/val data leakage in CaseOps records — prepare_caseops_data.py default overlaps 80% of val docs with training data #2127

Open

H1cSuNtDr4C0n3S wants to merge 1 commit into openai:main from H1cSuNtDr4C0n3S:codex/sp8192-qrescue-lqer-pergroup-submission
