Record: Polar Express NS + SLOT + MuonEq-R + XSA-all — 1.1043 BPB (3-seed mean) by Omrigotlieb · Pull Request #1298 · openai/parameter-golf

Omrigotlieb · 2026-04-03T12:10:12Z

Summary

val_bpb: 1.1043 (3-seed mean, std 0.0009) — beats current SOTA (1.1147) by 0.0104 BPB
Artifact: 15.82 MB (under 16,000,000 byte limit)
8×H100 SXM, PyTorch 2.9.1+cu128, 600s training + ~300s eval

Results

Seed	Post-SLOT bpb	Steps	ms/step	Artifact (bytes)
1337	1.1052	6,899	86.9	15,824,588
42	1.1042	6,886	87.0	15,817,288
2025	1.1035	6,886	87.0	15,810,092
Mean	1.1043 ± 0.0009

Key Innovations (on PR #549 stack)

Polar Express Newton-Schulz (arXiv:2505.16932) — per-iteration minimax-optimal polynomials. 4 PE steps ≈ quality of 5 fixed-coefficient steps, saving ~2ms/step → ~180 extra training steps
SLOT eval-time delta optimization — per-batch additive delta [B,1,d_model] optimized with 8 AdamW steps (lr=0.005), model weights frozen. Contributes -0.015 BPB
MuonEq-R — row-normalize gradient before NS orthogonalization. 2-line change, ~0.001 BPB free
XSA on all 11 layers (XSA_LAST_N=11) — zero new parameters, ~0.002 BPB improvement
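
The two optimizer-side changes above (MuonEq-R and the Newton-Schulz step) can be sketched as follows. This is an illustrative NumPy sketch, not the code from `train_gpt.py`: the function names `row_normalize` and `newton_schulz` are hypothetical, and "row-normalize" is assumed to mean scaling each gradient row to unit L2 norm. The coefficient triple shown is the fixed quintic used by the reference Muon optimizer. Polar Express would instead pass a different `(a, b, c)` per iteration, and those values are deliberately not reproduced here (see arXiv:2505.16932).

```python
import numpy as np

# Fixed quintic coefficients from the reference Muon optimizer.
# Polar Express would supply a different (a, b, c) for each iteration;
# those per-iteration values are NOT reproduced in this sketch.
MUON_FIXED = (3.4445, -4.7750, 2.0315)


def row_normalize(grad, eps=1e-7):
    """Assumed form of MuonEq-R: scale every row of the gradient to unit L2 norm."""
    return grad / (np.linalg.norm(grad, axis=1, keepdims=True) + eps)


def newton_schulz(grad, coeffs):
    """Approximately orthogonalize `grad`, using one (a, b, c) triple per iteration."""
    # Dividing by the Frobenius norm guarantees a spectral norm of at most 1.
    x = grad / (np.linalg.norm(grad) + 1e-7)
    transposed = x.shape[0] > x.shape[1]
    if transposed:  # work on the wide orientation so the Gram matrix is small
        x = x.T
    for a, b, c in coeffs:
        gram = x @ x.T
        x = a * x + (b * gram + c * gram @ gram) @ x
    return x.T if transposed else x


rng = np.random.default_rng(0)
grad = rng.standard_normal((8, 16))           # stand-in for a weight gradient
normed = row_normalize(grad)                  # MuonEq-R step (assumed form)
update = newton_schulz(normed, [MUON_FIXED] * 5)  # 5 fixed-coefficient steps
singular_values = np.linalg.svd(update, compute_uv=False)
print(singular_values.round(3))               # all pushed toward 1
```

Passing a list of four per-iteration triples instead of `[MUON_FIXED] * 5` is where a Polar Express schedule would plug in, which is how one fewer matrix iteration per step could be saved.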

Run Command

BIGRAM_VOCAB_SIZE=1536 BIGRAM_DIM=112 XSA_LAST_N=11 \
WARMDOWN_ITERS=4000 MUON_BACKEND_STEPS=4 \
SLOT_ENABLED=1 SLOT_STEPS=8 SLOT_LR=0.005 \
ITERATIONS=9000 MAX_WALLCLOCK_SECONDS=600 EVAL_STRIDE=64 \
SEED=1337 \
torchrun --standalone --nproc_per_node=8 train_gpt.py
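
The `SLOT_STEPS=8` and `SLOT_LR=0.005` settings in the command above correspond to the eval-time delta optimization described under Key Innovations. A minimal sketch of that idea follows, under several assumptions that the PR text does not confirm: the delta is added to the final hidden states before a frozen output projection, the loss is mean cross-entropy over the batch, and the AdamW hyperparameters other than the learning rate are PyTorch's defaults. NumPy with a hand-written AdamW update is used here only to keep the sketch dependency-light; the helper names and the random stand-in tensors are hypothetical.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def loss_and_grad(hidden, delta, w_out, targets):
    """Mean cross-entropy and its gradient w.r.t. delta; w_out stays frozen."""
    B, T, _ = hidden.shape
    probs = softmax((hidden + delta) @ w_out)        # [B, T, V]
    b_idx, t_idx = np.indices((B, T))
    loss = -np.log(probs[b_idx, t_idx, targets]).mean()
    dlogits = probs.copy()
    dlogits[b_idx, t_idx, targets] -= 1.0            # softmax minus one-hot
    dlogits /= B * T
    dhidden = dlogits @ w_out.T                      # [B, T, d]
    return loss, dhidden.sum(axis=1, keepdims=True)  # delta is shared over T


def slot_delta(hidden, w_out, targets, steps=8, lr=0.005,
               betas=(0.9, 0.999), eps=1e-8, weight_decay=0.01):
    """Optimize a per-batch additive delta of shape [B, 1, d] with AdamW.

    betas, eps and weight_decay are assumed (PyTorch AdamW defaults).
    """
    B, _, d = hidden.shape
    delta = np.zeros((B, 1, d))
    m = np.zeros_like(delta)
    v = np.zeros_like(delta)
    for t in range(1, steps + 1):
        _, g = loss_and_grad(hidden, delta, w_out, targets)
        delta *= 1.0 - lr * weight_decay             # decoupled weight decay
        m = betas[0] * m + (1 - betas[0]) * g
        v = betas[1] * v + (1 - betas[1]) * g * g
        m_hat = m / (1 - betas[0] ** t)
        v_hat = v / (1 - betas[1] ** t)
        delta -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return delta


# Random stand-ins for final hidden states, the output projection, and targets.
rng = np.random.default_rng(0)
B, T, d, V = 2, 8, 16, 32
hidden = rng.standard_normal((B, T, d))
w_out = rng.standard_normal((d, V)) / np.sqrt(d)
targets = rng.integers(0, V, size=(B, T))

loss_before, _ = loss_and_grad(hidden, np.zeros((B, 1, d)), w_out, targets)
delta = slot_delta(hidden, w_out, targets)
loss_after, _ = loss_and_grad(hidden, delta, w_out, targets)
print(loss_before, loss_after)
```

Only `delta` is updated; `hidden` and `w_out` are never modified, which mirrors the "model weights frozen" constraint stated in the PR.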

Statistical Significance

Gap vs SOTA: 0.0104 BPB (about 2× the 0.005 threshold). z-score: 11.6, consistent with gap / seed std = 0.0104 / 0.0009 (p << 0.01).
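
The figures above can be rechecked from the per-seed results table with the standard library. The sketch assumes the reported z-score is the gap divided by the rounded sample standard deviation (0.0009), since that reproduces 11.6. Using the unrounded standard deviation gives a slightly larger value.

```python
import statistics

# Per-seed post-SLOT bpb values from the Results table.
seed_bpb = [1.1052, 1.1042, 1.1035]
sota_bpb = 1.1147

mean_bpb = statistics.mean(seed_bpb)
std_bpb = statistics.stdev(seed_bpb)      # sample standard deviation (n - 1)
gap = sota_bpb - mean_bpb

z_reported = gap / round(std_bpb, 4)      # assumed derivation of the 11.6 figure
z_unrounded = gap / std_bpb

print(f"mean={mean_bpb:.4f} std={std_bpb:.4f} gap={gap:.4f}")
print(f"z (rounded std)={z_reported:.1f} z (unrounded std)={z_unrounded:.1f}")
```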

Test plan

3-seed validation (1337, 42, 2025)
All artifacts under 16,000,000 bytes
Script compiles and runs from records folder
Sliding window eval (stride=64) + SLOT eval
Statistical significance (p < 0.01)

3-seed mean: 1.1043 ± 0.0009 BPB (beats SOTA 1.1147 by 0.0104)
seed 1337: 1.1052 | seed 42: 1.1042 | seed 2025: 1.1035
Artifacts: 15.82 MB (BigramHash 1536x112, int6+lzma)

On PR openai#549 stack:
- Polar Express NS (arXiv:2505.16932, 4 steps)
- SLOT eval-time delta (8 AdamW steps, lr=0.005)
- MuonEq-R row-normalization
- XSA on all 11 layers

Omrigotlieb · 2026-04-04T14:08:01Z

Superseded by PR #1344 (1.0923 BPB, clean, with depth recurrence)

Omrigotlieb closed this Apr 4, 2026

Omrigotlieb wants to merge 1 commit into openai:main from Omrigotlieb:clean-submission
