[Non-record] Meta-Learned TTT + Error-Guided Adaptation Analysis (val_bpb=1.1645)#294

Closed
sseanliu wants to merge 12 commits into openai:main from sseanliu:main

Conversation

@sseanliu

Summary

Non-record research submission exploring test-time adaptation strategies for compressed language models at 16MB scale.

Key findings

  1. Reptile meta-learning improves SmearGate models by 0.011 BPB, 10x the gain of naive TTT (+0.001), partially overcoming the SmearGate/TTT redundancy reported in the competition
  2. Error-guided TTT is a negative result: concentrating the adaptation budget on the highest-loss tokens does not improve val_loss, indicating those tokens are genuinely unpredictable rather than under-adapted
  3. 13 layers beat 10 layers on 8xH100 (val_bpb 1.1884 vs 1.2090) despite 23% fewer training steps
  4. Per-token loss distribution analysis on the full 62M-token val set: the hardest 2.7% of tokens (loss > 7.0) account for ~15% of total loss
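The loss-concentration analysis in finding 4 reduces to a simple computation over per-token losses: count the tokens above a threshold and measure their share of the summed loss. A minimal sketch (the function name and the toy data are illustrative, not from the submission):

```python
def loss_concentration(token_losses, threshold=7.0):
    """Return (fraction of tokens above threshold, their share of total loss)."""
    total = sum(token_losses)
    hard = [l for l in token_losses if l > threshold]
    return len(hard) / len(token_losses), sum(hard) / total

# Toy illustration; the real analysis runs over the full validation set.
losses = [0.5] * 97 + [10.0] * 3   # a few very hard tokens dominate
frac_tokens, frac_loss = loss_concentration(losses)
```

Here 3% of the tokens carry roughly 38% of the loss, the same shape of skew the submission reports (2.7% of tokens, ~15% of loss).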

Score

  • val_bpb: 1.1645 (sliding window, stride=64)
  • Artifact: 12.7MB (well under 16MB)
  • Hardware: 8x H100 SXM, 600s training
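The sliding-window score with stride=64 means each evaluation window contributes only its last 64 tokens to the loss, so every scored token keeps a long left context; summed losses in nats then convert to bits per byte. A sketch under assumptions (the window length of 1024 and both function names are hypothetical; only stride=64 is stated above):

```python
import math

def score_windows(n_tokens, window=1024, stride=64):
    """Yield (ctx_start, end, score_from): each window is scored only on
    tokens from `score_from` to `end`, reusing up to window-stride tokens
    of left context that were already scored by earlier windows."""
    pos = 0
    while pos < n_tokens:
        yield max(0, pos + stride - window), min(pos + stride, n_tokens), pos
        pos += stride

def bpb(total_loss_nats, total_bytes):
    """Bits per byte from a summed next-token loss measured in nats."""
    return total_loss_nats / (math.log(2) * total_bytes)
```

The stride trades evaluation cost for context quality: stride=64 with a 1024 window runs ~16 forward passes per window-length of text but never scores a token with a truncated context.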

Methodology

  • Base: the PR #198 recipe, "11-Layer Int6 + WD=0.04 + SWA + FA3" (val_bpb: 1.1318): 11L, int6+zstd, 3x MLP, SmearGate, BigramHash, SWA, Muon WD=0.04
  • Reptile meta-learning: last 20% of training time, 1576 meta-steps on last 3 blocks' MLPs
  • Error-guided TTT: two-pass eval with rank-4 LoRA on top 2% highest-loss windows
  • Inspired by TTT-E2E (Sun et al., 2025) and SIFT (ICLR 2025 Best Paper)
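The Reptile meta-step used in the methodology above has a very small core: adapt a copy of the weights with a few inner updates, then interpolate the original weights toward the adapted copy. A minimal sketch on flat weight lists, with `inner_update` and both hyperparameter values as placeholder assumptions (the submission applies this to the last 3 blocks' MLPs):

```python
def reptile_step(weights, inner_update, chunks, meta_lr=0.1):
    """One Reptile meta-step: run inner updates on a weight copy, then move
    the original weights a fraction meta_lr toward the adapted copy."""
    adapted = list(weights)
    for chunk in chunks:
        adapted = inner_update(adapted, chunk)   # e.g. an SGD step on one chunk
    return [w + meta_lr * (a - w) for w, a in zip(weights, adapted)]
```

Unlike naive TTT, which commits fully to the adapted weights, the interpolation keeps the base weights in a region from which a few gradient steps adapt well, which is consistent with the 10x gap over naive TTT reported in the findings.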

Files

  • records/track_10min_16mb/2026-03-20_MetaTTT_v2/train_gpt.py — Training with Reptile
  • eval_error_guided_ttt.py — Error-guided TTT evaluation
  • records/track_10min_16mb/2026-03-20_MetaTTT_v2/README.md — Full analysis
  • records/track_10min_16mb/2026-03-20_MetaTTT_v2/submission.json — Metadata

See README for detailed methodology, results, and theoretical context.
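The two-pass error-guided evaluation can be sketched independently of the LoRA details: pass 1 scores every window with the base model, pass 2 re-scores only the top fraction of highest-loss windows after adaptation. The function name and the `adapt_and_score` callback are hypothetical stand-ins for the rank-4 LoRA adaptation described above:

```python
def error_guided_eval(windows, base_loss, adapt_and_score, top_frac=0.02):
    """Two-pass eval: score all windows, then spend the adaptation budget
    only on the top `top_frac` highest-loss windows."""
    losses = [base_loss(w) for w in windows]
    k = max(1, int(len(losses) * top_frac))
    hardest = sorted(range(len(losses)), key=losses.__getitem__, reverse=True)[:k]
    for i in hardest:
        losses[i] = adapt_and_score(windows[i])  # pass 2: adapted re-score
    return sum(losses) / len(losses)
```

The negative result in finding 2 says exactly that the pass-2 re-scores do not come back lower: the highest-loss windows stay hard even after targeted adaptation.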

sseanliu added 12 commits March 19, 2026 17:38
…x 4 recycled = 12 effective layers

Architecture: 3 unique blocks at dim=768 (12 heads, 6 KV heads) recycled 4x each
for 12 effective layers with per-iteration scale/mix params and U-Net skip connections.

13.2M unique params in ~12MB compressed (3.9MB headroom vs 16MB cap).
50% wider representations + 20% more effective depth vs SOTA's 10x512.
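The recycling scheme in this commit, weight-tied blocks reused via modular indexing with cheap per-iteration scale parameters, can be sketched as follows (function and argument names are illustrative; the U-Net skip connections and mix params from the commit message are omitted for brevity):

```python
def recycled_forward(x, blocks, n_cycles=4, scales=None):
    """Apply len(blocks) unique blocks n_cycles times each (weight tying),
    giving len(blocks) * n_cycles effective layers. Each effective layer
    gets its own scalar scale, so depth costs scalars, not full blocks."""
    depth = len(blocks) * n_cycles
    scales = scales or [1.0] * depth
    for layer in range(depth):
        block = blocks[layer % len(blocks)]   # direct modular indexing
        x = x + scales[layer] * block(x)      # residual with per-iteration scale
    return x
```

This is why 13.2M unique parameters can present as 12 effective layers: only the per-iteration scalars grow with effective depth.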
- Extract _apply_block helper for cleaner per-iteration logic
- Remove _block_for_layer, use direct modular indexing
- Reduce code from 1325 to 1321 lines
Three files:
- program.md: Instructions for the AI agent (experiment loop, logging, directions)
- prepare.py: Fixed utilities (data loading, evaluation, quantization, size checking)
- train.py: Modifiable baseline (SOTA architecture, the only file the agent edits)

Based on Karpathy's autoresearch framework, adapted for parameter-golf constraints
(16MB artifact limit, fixed FineWeb dataset, SentencePiece 1024 vocab).
@sseanliu sseanliu closed this Mar 21, 2026