
Commit 150213c

Octavianclaude committed
Podracing II: Electric Bugaloo — 0.9620 BPB (best seed), mean 0.9823
Multi-order backoff (2-7) + entropy-adaptive alpha on 11L/512d U-Net.
Two seeds sub-1.0. GPTQ calibration inside training phase.
3-seed: 1337=1.0217, 42=0.9631, 2045=0.9620, mean=0.9823

Credits: @deanbrr openai#659, @Asukabot0 openai#727, @signalrush openai#414

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 2c3c10c commit 150213c

6 files changed

Lines changed: 2499 additions & 0 deletions

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
# Podracing II: Electric Bugaloo

## Results

| Seed | Sliding BPB | 7-gram Backoff BPB | Artifact |
|------|-------------|--------------------|----------|
| 1337 | 1.1195 | 1.0217 | 15.59 MB |
| 42 | 1.1210 | **0.9631** | 15.59 MB |
| 2045 | 1.1196 | **0.9620** | 15.71 MB |
| **Mean** | **1.1200** | **0.9823** | |

## What Changed vs Podracing I (#706)

Two eval-time improvements, no training changes:

1. **Multi-order backoff (orders 2-7)**: try the longest available context first, cascade down to shorter orders on a miss (see the sketch after this list).
2. **Entropy-adaptive alpha**: `alpha = 0.05 + 0.55 * sigmoid(2 * (H - 4.0))`, where `H` is the model's softmax entropy, so the blend trusts the n-gram cache more when the model is uncertain.
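
A minimal pure-Python sketch of how the two pieces compose. The dict-of-dicts cache, the base-2 entropy, and all function names here are illustrative assumptions, not the diff's actual code; in particular, the real cache hashes contexts into `NGRAM_EVAL_BUCKETS` buckets rather than storing tuples.

```python
import math

def entropy_adaptive_alpha(model_probs, floor=0.05, span=0.55, scale=2.0, pivot=4.0):
    # alpha = 0.05 + 0.55 * sigmoid(2 * (H - 4.0)).
    # Base-2 entropy is an assumption, chosen to match the BPB framing;
    # higher H (a less certain model) weights the n-gram cache more heavily.
    H = -sum(p * math.log2(p) for p in model_probs if p > 0.0)
    return floor + span / (1.0 + math.exp(-scale * (H - pivot)))

def ngram_backoff_prob(cache, context, token, max_order=7, min_order=2, min_count=2):
    # Longest-context-first backoff over orders 7..2: an order-k lookup keys
    # on the k-1 preceding tokens; on a miss, or when the context was seen
    # fewer than min_count times, drop to order k-1.
    for order in range(max_order, min_order - 1, -1):
        if len(context) < order - 1:
            continue  # not enough history for this order yet
        key = tuple(context[-(order - 1):])
        counts = cache.get(key)
        if counts and sum(counts.values()) >= min_count:
            return counts.get(token, 0) / sum(counts.values())
    return None  # no usable n-gram evidence at any order

def blended_prob(model_probs, cache, context, token):
    # Mix the model's probability with the n-gram estimate via alpha.
    p_model = model_probs[token]
    p_ngram = ngram_backoff_prob(cache, context, token)
    if p_ngram is None:
        return p_model
    alpha = entropy_adaptive_alpha(model_probs)
    return (1.0 - alpha) * p_model + alpha * p_ngram
```

Plugging values into the alpha formula: a confident model at H = 2 bits gets alpha ≈ 0.06 (almost pure model), H = 4 bits gives alpha = 0.325, and a near-uniform H = 6 bits saturates toward the 0.60 ceiling.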

## Compliance

- Score-first, backward-looking: the cache is built from already-scored tokens only (loop sketch after this list)
- Alpha depends solely on the model's own softmax entropy; no target/label access
- No oracle selection, no min-NLL comparison
- GPTQ calibration runs inside the training phase (before the wallclock stop)

## Credits

- N-gram eval cache concept: @deanbrr (PR #659)
- Multi-order backoff + adaptive alpha inspiration: @Asukabot0 (PR #727)
- Base architecture: @signalrush (PR #414)
## Reproduce

```bash
SEED=2045 MLP_ACT=leaky_relu_sq MLP_LEAKY_SLOPE=0.5 XSA_LAST_N=4 \
  BIGRAM_VOCAB_SIZE=1536 ROPE_DIMS=24 \
  NGRAM_EVAL_ORDER=7 NGRAM_EVAL_ADAPTIVE=1 NGRAM_EVAL_ALPHA=0.30 \
  NGRAM_EVAL_MIN_COUNT=2 NGRAM_EVAL_BUCKETS=4194304 \
  TTT_EVAL_ENABLED=0 \
  torchrun --nproc_per_node=8 train_gpt.py
```

8xH100 SXM, 600s training + ~140s eval.
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
{
  "author": "Frosty40",
  "github_id": "newjordan",
  "name": "Podracing II: Multi-Order Backoff + Entropy-Adaptive Alpha",
  "blurb": "11L/512d U-Net with legal score-first 7-gram backoff (orders 2-7) + entropy-adaptive alpha. 3-seed mean val_bpb=0.9823. N-gram concept credited to @deanbrr (PR #659).",
  "date": "2026-03-25T17:30:00Z",
  "val_loss": 1.6585,
  "val_bpb": 0.9823,
  "bytes_total": 15591748,
  "bytes_code": 106211
}
