
Non-record: 11L gated Krylov + AR GPTQ int6 + lzma, 1.09596 BPB #1446

Open
LauraGomezjurado wants to merge 9 commits into openai:main from LauraGomezjurado:main

Conversation


LauraGomezjurado commented Apr 7, 2026

Submission

  • Score: 1.09596320 BPB (sliding-window eval; see the sketch after this list), 1.11953265 BPB exact roundtrip
  • Track: non-record-unlimited-compute-16mb
  • Artifact: 15,925,099 bytes (under 16MB cap)
  • Authors: Ganesh Talluri, Laura Gomezjurado, Hiroki Naganuma (@g4nesh, @LauraGomezjurado, @Hiroki11x)
  • Compute: 1×A100 80GB, 8h 52m, seed=1337
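
For context on the first number, here is a minimal sketch of a sliding-window BPB evaluation: each window advances by a fixed stride and only the tail targets are scored, so every scored token keeps long left context. The function name, window/stride values, and harness details are assumptions, not the comp's official eval script; `ids` is the 1-D token stream and `n_bytes` the byte length of the raw eval text.

```python
import math
import torch

@torch.no_grad()
def sliding_window_bpb(model, ids, n_bytes, window=2048, stride=512):
    """Sliding-window bits-per-byte. Hypothetical sketch: `model` maps
    [B, T] token ids to [B, T, V] logits."""
    nll_nats = 0.0
    for end in range(stride, ids.numel(), stride):
        lo = max(0, end - window)
        chunk = ids[lo:end + 1]                       # inputs plus next-token targets
        logits = model(chunk[:-1].unsqueeze(0)).float()
        logp = torch.log_softmax(logits, dim=-1)[0]
        n_score = min(stride, end - lo)               # score only the window tail
        tgt = chunk[-n_score:].unsqueeze(-1)
        nll_nats -= logp[-n_score:].gather(-1, tgt).sum().item()
    # (final partial window omitted for brevity)
    return nll_nats / math.log(2) / n_bytes           # nats -> bits, then per byte
```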

Approach

Standard SentencePiece GPT (11L, 512d, 26.99M params) with two main additions:

Gated Krylov correction on Muon: Estimates the nonnormality of square weight slices via a Hutchinson trace estimate of (W^T W − W W^T)^2, i.e. the squared Frobenius norm of the commutator (the commutator's plain trace is identically zero, so the squared form is the meaningful gate statistic). Slices exceeding a threshold get a small adaptive-rank Krylov residual correction blended in at α=0.05. Muon stays the base optimizer; Krylov fires selectively on nonnormal slices.
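
A minimal sketch of the gate statistic under that reading (the function name, probe count, and threshold wiring are assumptions; the PR does not publish this code):

```python
import torch

def nonnormality_score(W: torch.Tensor, n_probes: int = 8) -> float:
    """Hutchinson estimate of ||W^T W - W W^T||_F^2 for a square slice W.

    C = W^T W - W W^T is symmetric, so tr(C^2) = ||C||_F^2, and for a
    Rademacher probe z, E[||C z||^2] = tr(C^2). C is applied as two
    matvec chains rather than being formed explicitly.
    """
    est = 0.0
    for _ in range(n_probes):
        z = (torch.randint(0, 2, (W.shape[1],), device=W.device) * 2 - 1).to(W.dtype)
        Cz = W.T @ (W @ z) - W @ (W.T @ z)
        est += Cz.pow(2).sum().item()
    return est / n_probes

ALPHA = 0.05  # blend factor for the Krylov residual correction (from the PR)
# update = muon_update + ALPHA * krylov_residual  -- applied only when
# nonnormality_score(W) exceeds the gate threshold.
```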

AR self-generated Full-Hessian GPTQ int6 + lzma: Hessians calibrated on 64×2048 tokens from the model's own autoregressive output (temp=0.8), avoiding val/train data leakage. Percentile clipping across 5 levels, selective ±1 pruning to fit the 16MB cap.
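
A sketch of the calibration step as described: sample sequences from the model itself at temperature 0.8, then accumulate per-layer GPTQ Hessians (H = 2·XᵀX over layer inputs) via hooks. Only the 64×2048 / temp=0.8 / self-generated facts come from the PR; the function names and loop structure are assumptions.

```python
import torch

@torch.no_grad()
def ar_calibration_tokens(model, bos_id, n_seqs=64, seq_len=2048, temp=0.8, device="cuda"):
    """Sample calibration sequences from the model's own AR output,
    so no val/train data touches the quantizer. No KV cache, for brevity."""
    toks = torch.full((n_seqs, 1), bos_id, dtype=torch.long, device=device)
    for _ in range(seq_len - 1):
        logits = model(toks)[:, -1, :] / temp
        nxt = torch.multinomial(torch.softmax(logits, dim=-1), num_samples=1)
        toks = torch.cat([toks, nxt], dim=1)
    return toks

def make_hessian_hook(store, name):
    """Forward pre-hook accumulating the GPTQ Hessian H = 2 * X^T X
    from the inputs of one nn.Linear."""
    def hook(module, inputs):
        x = inputs[0].reshape(-1, inputs[0].shape[-1]).float()  # [N, d_in]
        if name not in store:
            store[name] = torch.zeros(x.shape[-1], x.shape[-1], device=x.device)
        store[name] += 2.0 * x.T @ x
    return hook

# Usage: register on each quantized nn.Linear via register_forward_pre_hook,
# run the sampled tokens through the model once, then hand store[name]
# to the GPTQ solver.
```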

Additional architecture: XSA across all 11 layers, BigramHash, SmearGate, VE128, partial RoPE, U-Net skips, LeakyReLU(0.5)^2, seq_len=2048, EMA (decay=0.997).
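
Of the listed pieces, the two with stated constants are simple enough to sketch (class and function names are mine; the training-loop wiring is assumed):

```python
import copy
import torch
import torch.nn.functional as F

def leaky_relu_sq(x):
    """LeakyReLU(0.5) followed by squaring -- the MLP activation listed above."""
    return F.leaky_relu(x, negative_slope=0.5).square()

class EMA:
    """Exponential moving average of weights with decay=0.997 (from the PR)."""
    def __init__(self, model, decay=0.997):
        self.decay = decay
        self.shadow = copy.deepcopy(model).eval()
        for p in self.shadow.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model):
        for s, p in zip(self.shadow.parameters(), model.parameters()):
            s.lerp_(p, 1.0 - self.decay)  # s <- decay*s + (1-decay)*p
```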

Results

| Metric | Value |
| --- | --- |
| Sliding-window BPB | 1.09596320 |
| Int6 roundtrip exact BPB | 1.11953265 |
| Artifact bytes | 15,925,099 |
| Training time | 8h 52m (1×A100 80GB) |

LauraGomezjurado and others added 3 commits April 7, 2026 10:30
Muon with gated Krylov correction on nonnormal square slices, AR self-gen
Full-Hessian GPTQ int6 + lzma, selective ±1 pruning, sliding-window eval.
26.99M params, 15,925,099 bytes, 1xA100 80GB, 8h 52m.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
taka6745 pushed a commit to taka6745/parameter-golf that referenced this pull request Apr 7, 2026
…ai#1430 stalled, 2 new PRs validate deferred specs

Patches 15/16/21 still uncontested in 150+ open + 10 closed PRs (5 audits
in a row). Strong evidence of true novelty.

PR openai#1430 still OPEN, 0 comments, no comp owner activity since creation.
Increasingly likely to be reverted or outlawed.

NEW PRs validate two of our deferred H100 escalation specs:
  - PR openai#1445 (1.0889): "Depth Recurrence + EMA 0.9965" → validates Patch 17 EMA spec
  - PR openai#1446 (1.0960): "int6 GPTQ + lzma" → validates Patch 23 INT6 GPTQ-Lite spec

Combined with PR openai#1437/openai#1420 already validating Patch 23 N-gram Tilt, the
3-spec H100 escalation bundle (EMA + Tilt + INT6 GPTQ) is now triple-
confirmed by independent comp PRs.

Spend ~$3.00/$36 (8% utilization). Pod healthy at 6h uptime.

Reminder: depth recurrence is back on the table — 5+ records use it now.
LESSONS.md §29 needs another update from "stale" to "real direction".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

MatoTeziTanka commented Apr 11, 2026

Community Review — Non-record: 11L gated Krylov + AR GPTQ int6 + lzma, 1.09596 BPB

Compliance: NEEDS AUTHOR ACTION — train_gpt.py fails to import on CT2038 (Python 3.10 / torch 2.10.0+cpu)

What I found: The CPU smoke test on CT2038 (proteus-engine, 128 GB RAM, Triton 3.6.0, flash_attn stub, cutlass_evt_fusion stub) failed at the import step with:

```
ModuleNotFoundError: No module named 'golf'
```

This matches the most common pattern from the 2026-04-11 sweep for this class of error: an absolute import of a repo-local package (here `golf`) that is not on `sys.path` in the eval image.

Recommendation: Could you run `python3 -c "import py_compile; py_compile.compile('train_gpt.py')"` on your records-folder `train_gpt.py` under Python 3.10 specifically? The eval image is Python 3.10 per Issue #17 / the README, so any parse or import error on 3.10 blocks the submission at import time, before any of the scored-eval logic runs.

Once the parse/import issue is fixed, I'll re-run the compliance audit through the normal pipeline. No other flags identified yet because the audit halts at the import step.


Reviewed by @MatoTeziTanka · The Agora. CPU smoke test (CT2038 proteus-engine, 2026-04-11): IMPORT_FAIL: `ModuleNotFoundError: No module named 'golf'`. Classification via `classify_prs.py` AST-based classifier; full compliance audit deferred until the import issue is resolved. Auto-drafted from a template and spot-checked before posting.


g4nesh commented Apr 11, 2026

@MatoTeziTanka I have fixed the problem. Please let me know if any error persists. Thanks for your help!

@MatoTeziTanka

Re-audited at head SHA 4d4bb02.

Fix confirmed. The `golf` module import is now wrapped in a try/except chain (lines 31-39): it first tries `from hnet_tokenizer import HNetByteLM` (sibling), then `from golf.hnet_tokenizer import HNetByteLM` (package), then falls back to `HNetByteLM = None`. The file compiles clean under Python 3.10.
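
For reference, the shape of that chain as described (reconstructed from this review, not copied from the file):

```python
# Reconstruction of the lines-31-39 fallback chain described above;
# the exact code in train_gpt.py may differ.
try:
    from hnet_tokenizer import HNetByteLM            # sibling-module layout
except ImportError:
    try:
        from golf.hnet_tokenizer import HNetByteLM   # package layout
    except ImportError:
        HNetByteLM = None                            # graceful fallback: import never blocks
```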

The import-blocking issue from my original review is resolved. I haven't run the full compliance audit yet (TTT/SLOT/n-gram checks on the model architecture); I'll queue that for the next sweep. There is no `hnet_tokenizer.py` sibling file in the submission folder, so the HNet path is likely unused in this submission's active code path, but the graceful fallback to `None` means it won't block import regardless.

Thanks for the quick fix @g4nesh.


Re-audit by @MatoTeziTanka. Verified import chain at lines 31-39, py_compile OK under Python 3.10.
