Commit 1572115

leon2k2k2k and claude committed
spec 009: implement spinquant_hotstart.py (baseline + R_a-only modes)
Two new files in the openai#1736 submission dir:

spinquant_hotstart.py (~360 LOC):
- Imports from train_gpt.py for Hyperparameters/GPT/serialize/deserialize/eval_val/eval_val_ttt_phased/BatchedTTTLoRA/etc.
- Modes: baseline, internal_only (R_a only, per-layer per-KV-group, d_head rotation on V-output and O-input).
- full, port_1695 are stubs that raise NotImplementedError with an explanation.
- Pipeline: load FP state_dict from HOTSTART_FP_CKPT -> apply rotations in-place on banked qo_bank/kv_bank -> optional pre-quant diagnostic eval -> call serialize() (GPTQ+compress) -> deserialize() -> quantized eval -> phased TTT eval -> write final.json.
- Reproduces the TTT eval block from train_and_eval (lines 2997-3075) in _run_ttt_eval() rather than refactoring the source file.

test_rotation_invariance.py (~250 LOC):
- CPU-only, standalone (no train_gpt.py import due to flash_attn_3/triton module-level deps).
- Self-contained minimal attention forward: Q/K/V projection from the banked tensors, RMSNorm on Q and K (matches the real model's bound on attention logits; without this, trained weights saturate softmax and float noise in V amplifies catastrophically).
- Tests baseline (bit-exact identity) and internal_only (rel tolerance 1e-4) against either synthetic random weights or spec 008's final_model.pt. Both pass cleanly (rel_max ~1e-6 on the real checkpoint).
- Can load either banked (qo_bank/kv_bank) or unbanked (blocks.N.attn.*.weight) state_dict format.

Spec 009 updated: reduced scope to 2 modes (baseline, internal_only) for this session; full and port_1695 deferred. Rationale in the spec: MLP LeakyReLU-squared breaks R_m float-invariance, and resid_mix can't be cleanly folded through RMSNorm; both need design before implementation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
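The invariance that internal_only relies on can be illustrated with a small NumPy sketch. It is not the code from test_rotation_invariance.py: it assumes plain unbanked weight matrices of shape (out_features, in_features), grouped-query attention with causal masking, and hypothetical helper names (random_orthogonal, rotate_v_o, attention). The rotation convention (R on the V output, R transposed on the O input) is also an assumption. The idea is that the attention weights mix tokens while R mixes head channels, so the two commute and R cancels inside the output projection.

```python
import numpy as np


def random_orthogonal(d, rng):
    # QR of a Gaussian matrix yields an orthogonal Q; the sign fix makes
    # the draw independent of the QR sign convention.
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    return q * np.sign(np.diag(r))


def rms_norm(x, eps=1e-6):
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def attention(x, wq, wk, wv, wo, n_heads, n_kv_heads):
    # Minimal causal GQA forward with RMSNorm on Q and K (assumed layout).
    T = x.shape[0]
    d_head = wq.shape[0] // n_heads
    group = n_heads // n_kv_heads
    q = rms_norm((x @ wq.T).reshape(T, n_heads, d_head))
    k = rms_norm((x @ wk.T).reshape(T, n_kv_heads, d_head))
    v = (x @ wv.T).reshape(T, n_kv_heads, d_head)
    mask = np.triu(np.full((T, T), -np.inf), k=1)  # -inf above the diagonal
    heads = []
    for h in range(n_heads):
        g = h // group  # KV group shared by this query head
        logits = q[:, h] @ k[:, g].T / np.sqrt(d_head) + mask
        heads.append(softmax(logits) @ v[:, g])
    return np.concatenate(heads, axis=-1) @ wo.T


def rotate_v_o(wv, wo, rotations, n_heads):
    # One d_head x d_head rotation per KV group: applied to the V output
    # rows and, transposed, to the matching O input columns.
    n_kv_heads = len(rotations)
    d_head = rotations[0].shape[0]
    group = n_heads // n_kv_heads
    wv_r, wo_r = wv.copy(), wo.copy()
    for g, R in enumerate(rotations):
        rows = slice(g * d_head, (g + 1) * d_head)
        wv_r[rows] = R.T @ wv[rows]  # new v = old v @ R
        for h in range(g * group, (g + 1) * group):
            cols = slice(h * d_head, (h + 1) * d_head)
            wo_r[:, cols] = wo[:, cols] @ R  # cancels R, since R @ R.T = I
    return wv_r, wo_r


rng = np.random.default_rng(0)
T, d_model, n_heads, n_kv_heads, d_head = 8, 32, 4, 2, 8
wq = rng.standard_normal((n_heads * d_head, d_model)) / np.sqrt(d_model)
wk = rng.standard_normal((n_kv_heads * d_head, d_model)) / np.sqrt(d_model)
wv = rng.standard_normal((n_kv_heads * d_head, d_model)) / np.sqrt(d_model)
wo = rng.standard_normal((d_model, n_heads * d_head)) / np.sqrt(n_heads * d_head)
x = rng.standard_normal((T, d_model))

rotations = [random_orthogonal(d_head, rng) for _ in range(n_kv_heads)]
wv_r, wo_r = rotate_v_o(wv, wo, rotations, n_heads)

ref = attention(x, wq, wk, wv, wo, n_heads, n_kv_heads)
rot = attention(x, wq, wk, wv_r, wo_r, n_heads, n_kv_heads)
rel_max = np.abs(rot - ref).max() / np.abs(ref).max()
print(f"rel_max = {rel_max:.3e}")
```

The sketch runs in float64, so its rel_max is not comparable to the ~1e-6 figure reported for the real checkpoint; only the 1e-4 tolerance is carried over from the commit message.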
1 parent a552fba commit 1572115

3 files changed

Lines changed: 897 additions & 6 deletions
