Record: SP8192 QRescue + JEPA-Lite + LQER + Pergroup/lrzip + Legal TTT — val_bpb 1.08064 #2027

Open
H1cSuNtDr4C0n3S wants to merge 1 commit into openai:main from H1cSuNtDr4C0n3S:codex/sp8192-qrescue-lqer-pergroup-submission

@H1cSuNtDr4C0n3S

Summary

  • Adds records/track_10min_16mb/2026-04-26_SP8192_QRescue_JEPALite_LegalTTT.
  • Reported 3-seed mean val_bpb = 1.08064386, std 0.00096256.
  • Uses SP8192 QRescue/Hessian SDClip stack with training-side JEPA-Lite, LQER rank-4 residuals, pergroup compression with system lrzip -z -L 9, and legal chunkwise score-first full-SGD TTT.
  • No tokenizer or dataset changes.
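The "score-first" ordering named in the last technical bullet can be illustrated with a toy example. This is a minimal sketch, not the code in train_gpt.py: the unigram softmax model, the function name `score_first_ttt`, and all parameters are assumptions chosen for illustration. It shows only the ordering constraint, where each chunk is scored with the current weights before any SGD step sees it, and is never rescored afterwards.

```python
import math


def score_first_ttt(tokens, chunk_size, lr, vocab_size):
    """Toy chunkwise score-first test-time training (hypothetical sketch).

    The "model" is a single vector of unigram logits. Each chunk is
    scored BEFORE the model is updated on it, and no chunk is rescored.
    Returns the mean negative log-likelihood in nats per token.
    """
    logits = [0.0] * vocab_size
    total_nll = 0.0
    for start in range(0, len(tokens), chunk_size):
        chunk = tokens[start:start + chunk_size]

        # Softmax over the current logits (numerically stabilised).
        peak = max(logits)
        exps = [math.exp(value - peak) for value in logits]
        norm = sum(exps)
        probs = [e / norm for e in exps]

        # 1) Score first: accumulate the loss using pre-update weights.
        total_nll += -sum(math.log(probs[t]) for t in chunk)

        # 2) Then update: one SGD step on the chunk's mean NLL, whose
        #    gradient w.r.t. the logits is (probs - empirical frequency).
        counts = [0] * vocab_size
        for t in chunk:
            counts[t] += 1
        for v in range(vocab_size):
            grad = probs[v] - counts[v] / len(chunk)
            logits[v] -= lr * grad
    return total_nll / len(tokens)
```

With a single chunk the score is exactly that of the untouched model, since the update happens only after scoring; with several chunks, later chunks benefit from updates made on earlier ones.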

3-Seed Results

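| Seed | TTT BPB | Sliding BPB | Artifact bytes | Train seconds | Eval seconds |
|------|---------|-------------|----------------|---------------|--------------|
| 42 | 1.07971401 | 1.08152899 | 15,693,775 | 588.075 | 503.000 |
| 314 | 1.08163610 | 1.08312592 | 15,696,850 | 588.067 | 512.571 |
| 999 | 1.08058146 | 1.08208204 | 15,695,674 | 588.030 | 495.416 |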
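The reported mean and standard deviation can be re-derived from the per-seed TTT BPB values above. The sketch below assumes the reported std is the sample standard deviation (n - 1 denominator); that choice is an inference from the numbers, not something the PR states. Under that assumption the recomputed std agrees with the reported 0.00096256 to within about 1e-7, which is consistent with the table values being rounded.

```python
import statistics

# Per-seed TTT BPB values copied from the results table.
ttt_bpb = {42: 1.07971401, 314: 1.08163610, 999: 1.08058146}

mean_bpb = statistics.mean(ttt_bpb.values())
# Sample standard deviation (n - 1 denominator); assumed, see lead-in.
std_bpb = statistics.stdev(ttt_bpb.values())

print(f"mean val_bpb = {mean_bpb:.8f}")
print(f"sample std   = {std_bpb:.8f}")
```

The population standard deviation (`statistics.pstdev`) gives a noticeably smaller value, so it does not appear to be the figure quoted in the summary.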

Compliance

  • roundtrip_ok: True for all seeds.
  • compressor_used: pergroup-lrzip for all seeds.
  • Artifacts are under the decimal 16,000,000 byte cap.
  • Training and eval are under 600s for all seeds.
  • TTT logs show chunkwise_score_first_full_sgd, score_before_update: true, and no_rescore: true.
  • lrzip is documented in the README as a system dependency (apt-get install -y lrzip).
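The size and time limits listed above can be checked mechanically against the per-seed results. This is a minimal sketch using the numbers from the results table; the dictionary layout and the names `check_run` and `headroom` are hypothetical and are not part of validate_submission_artifacts.py.

```python
ARTIFACT_CAP_BYTES = 16_000_000  # decimal cap stated in the PR
TIME_CAP_SECONDS = 600.0         # limit applied to both train and eval

# Values copied from the 3-seed results table.
runs = {
    42: {"artifact_bytes": 15_693_775, "train_s": 588.075, "eval_s": 503.000},
    314: {"artifact_bytes": 15_696_850, "train_s": 588.067, "eval_s": 512.571},
    999: {"artifact_bytes": 15_695_674, "train_s": 588.030, "eval_s": 495.416},
}


def check_run(run):
    """Return True if one run is under the artifact cap and both time caps."""
    return (
        run["artifact_bytes"] < ARTIFACT_CAP_BYTES
        and run["train_s"] < TIME_CAP_SECONDS
        and run["eval_s"] < TIME_CAP_SECONDS
    )


# Bytes remaining under the cap for each seed.
headroom = {seed: ARTIFACT_CAP_BYTES - run["artifact_bytes"] for seed, run in runs.items()}

for seed, run in runs.items():
    print(seed, "ok" if check_run(run) else "FAIL", f"headroom={headroom[seed]:,} bytes")
```

All three seeds leave roughly 0.3 MB of headroom under the artifact cap and about 12 seconds under the training limit.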

Validation

  • python -m py_compile train_gpt.py parse_run_logs.py update_submission_json.py validate_submission_artifacts.py
  • python validate_submission_artifacts.py --log train_seed42.log
  • python validate_submission_artifacts.py --log train_seed314.log
  • python validate_submission_artifacts.py --log train_seed999.log
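The three per-seed validation commands above follow one pattern, so they can be driven from a single loop. A minimal sketch, assuming it is run from the record directory containing validate_submission_artifacts.py and the three log files; the helper names `validation_commands` and `run_all` are hypothetical.

```python
import subprocess
import sys

SEEDS = (42, 314, 999)


def validation_commands(seeds=SEEDS):
    """Build one validator command line per seed log."""
    return [
        [sys.executable, "validate_submission_artifacts.py", "--log", f"train_seed{seed}.log"]
        for seed in seeds
    ]


def run_all(seeds=SEEDS):
    """Run every validation command, raising on the first non-zero exit."""
    for cmd in validation_commands(seeds):
        subprocess.run(cmd, check=True)
```

Calling `run_all()` stops at the first failing seed because `check=True` raises `subprocess.CalledProcessError` on a non-zero exit status.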

H1cSuNtDr4C0n3S marked this pull request as ready for review April 30, 2026 22:01
leon2k2k2k added a commit to leon2k2k2k/parameter-golf that referenced this pull request May 1, 2026
Audits every CaseOps-lineage record-track PR (merged + unmerged) since
2026-04-18 for whether val docs are also in the training set.

Working set: 34 PRs (31 from chronological seed list + 3 discovered ancestors:
openai#1908, openai#1923, openai#2007). Boundary nodes openai#1493 / openai#1626 (pre-CaseOps).

Verdicts:
  - CLEAN (8): openai#1729, openai#1851, openai#1868, openai#1908, openai#2019, openai#2027, openai#2031, openai#2068
  - LEAK (25): openai#1736 (our research baseline) → openai#1769 → openai#1787 → openai#1797 → openai#1855 → V21 family (openai#1945, openai#1923, openai#1953, openai#1967) → openai#2018 → openai#2118
    (current claimed frontier 1.04350), plus siblings.
  - INHERIT (1): openai#2050 (eval-only on frozen openai#1915)

Code-level evidence (not README claims):
  - Every shipped prepare_caseops_data.py is byte-identical:
    SHARD_TOKENS=10_000_000, default=10_000 for --val-docs
  - NO PR overrides --val-docs (searched all .sh files in all 34 PRs)
  - cached_challenge_fineweb.py downloads from romeerp/parameter-golf-caseops-v1
    HF dataset whose manifest pins docs_val=50000, docs_train=8181945,
    sums match → CLEAN by construction
  - PR openai#2018's DATASET_AUDIT.md is gold-standard explicit leak description
  - PR openai#2118's submission.json admits "--val-docs=10000 train shards + 50k val eval"
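
The size of the mismatch described in the evidence above can be estimated with simple range arithmetic. This is a sketch under a loud assumption that the commit message does not confirm: documents form one ordered stream, validation takes the leading documents, and training takes everything after the first `--val-docs` documents. The function name `overlapping_docs` is hypothetical; the counts 10,000, 50,000 and 8,181,945 are taken from the evidence list.

```python
def overlapping_docs(train_skip, eval_val_docs, total_docs):
    """Count documents present in both the training and evaluation ranges.

    ASSUMPTION (not confirmed by the source): training covers documents
    [train_skip, total_docs) and evaluation covers [0, eval_val_docs)
    of the same ordered document stream.
    """
    lo = max(train_skip, 0)
    hi = min(total_docs, eval_val_docs)
    return max(0, hi - lo)


TOTAL_DOCS = 50_000 + 8_181_945  # docs_val + docs_train from the pinned manifest

# Locally rebuilt shards with the default --val-docs=10000, evaluated on 50k val docs.
leaky = overlapping_docs(train_skip=10_000, eval_val_docs=50_000, total_docs=TOTAL_DOCS)

# Pinned manifest: 50,000 val docs held out of training entirely.
clean = overlapping_docs(train_skip=50_000, eval_val_docs=50_000, total_docs=TOTAL_DOCS)

print(leaky, clean)
```

Under that assumption, the default invocation would leave 40,000 of the 50,000 evaluation documents inside the training shards, while the pinned manifest leaves none.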

Three signposts:
  - Leak introduced: PR openai#1736 by @dexhunter (Apr 19) — first prepare_caseops_data.py
    default invocation
  - Leak fixed: PR openai#1851 by @aquariouseworkman (Apr 27) — switched to HF dataset
  - Leak re-introduced: PR openai#1855 by @codemath3000 (same day) — rebuilt locally

The merged-leaderboard SOTA (openai#1851/openai#1868 at 1.06128/1.06141) is CLEAN.
The unmerged frontier (openai#2118 at 1.04350) is LEAK. The 0.018 bpb gap is
inflated by val memorization; spec 301 was designed to measure how much
remains under clean data.

Files:
  caseops-memory-leakage/README.md       — overview, methodology, takeaways
  caseops-memory-leakage/verdicts.md     — 34-row master table with evidence
  caseops-memory-leakage/family-tree.md  — ASCII trees with [C]/[L] annotations