Description
Refactor train_gpt_cpu.py into importable, testable modules:
pgolf/config.py — Hyperparameters dataclass
pgolf/data.py — TokenStream, data loading
pgolf/model.py — GPT, Block, Attention, MLP, RMSNorm, Rotary
pgolf/optim.py — Muon optimizer, LR scheduling
pgolf/quantize.py — int8 quantization + zlib
pgolf/eval.py — BPB evaluation, tokenizer LUTs
train_cpu.py — Thin CLI entrypoint
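A minimal sketch of how the thin entrypoint could sit on top of the config dataclass. The field names, defaults, and the `parse_args` helper are illustrative assumptions and are not taken from train_gpt_cpu.py. In the proposed layout `Hyperparameters` would live in pgolf/config.py and be imported by train_cpu.py; both are shown in one file here so the example runs on its own.

```python
import argparse
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Hyperparameters:
    # Placeholder fields and defaults (assumptions, not the real config).
    num_layers: int = 4
    model_dim: int = 128
    learning_rate: float = 0.01
    iterations: int = 100
    seed: int = 1337


def build_parser() -> argparse.ArgumentParser:
    """Create one CLI flag per dataclass field, e.g. --num-layers."""
    parser = argparse.ArgumentParser(description="Thin CLI entrypoint (sketch)")
    for f in fields(Hyperparameters):
        flag = "--" + f.name.replace("_", "-")
        # f.type is the annotated class (int or float) and doubles as the converter.
        parser.add_argument(flag, type=f.type, default=f.default)
    return parser


def parse_args(argv: Optional[List[str]] = None) -> Hyperparameters:
    """Parse argv into a Hyperparameters instance (hypothetical helper)."""
    namespace = build_parser().parse_args(argv)
    return Hyperparameters(**vars(namespace))


if __name__ == "__main__":
    hp = parse_args()
    # A real entrypoint would hand hp to the data, model, and optimizer modules.
    print(hp)
```

Deriving the flags from the dataclass keeps the entrypoint free of training logic, which is what makes the other modules importable and testable in isolation.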
Acceptance Criteria
train_cpu.py reproduces the exact same behavior as train_gpt_cpu.py
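One way this criterion could be checked is to run both scripts with the same seed and compare their per-step loss traces for exact equality. The sketch below assumes each script can be wrapped in a function that takes a seed and a step count and returns a list of losses; `reference_run`, `refactored_run`, and `first_divergence` are hypothetical names, and the two run functions are stand-ins built on `random` rather than real training loops.

```python
import random
from typing import Callable, List

TrainFn = Callable[[int, int], List[float]]


def reference_run(seed: int, steps: int) -> List[float]:
    """Stand-in for train_gpt_cpu.py: returns a per-step loss trace."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(steps)]


def refactored_run(seed: int, steps: int) -> List[float]:
    """Stand-in for train_cpu.py: same computation, different structure."""
    rng = random.Random(seed)
    losses = []
    for _ in range(steps):
        losses.append(rng.random())
    return losses


def first_divergence(old: TrainFn, new: TrainFn, seed: int = 1234, steps: int = 5) -> int:
    """Return the first step where the traces differ, or -1 if they match exactly."""
    old_trace = old(seed, steps)
    new_trace = new(seed, steps)
    for i, (a, b) in enumerate(zip(old_trace, new_trace)):
        if a != b:  # exact comparison on purpose: no tolerance
            return i
    if len(old_trace) != len(new_trace):
        # One trace is a strict prefix of the other.
        return min(len(old_trace), len(new_trace))
    return -1


if __name__ == "__main__":
    step = first_divergence(reference_run, refactored_run)
    print("identical" if step == -1 else f"diverged at step {step}")
```

Reporting the first diverging step, rather than only pass or fail, narrows down which module introduced a behavioral change during the refactor.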