Add lowercase SP10240 QK 5.125 ablation #1814
Open
suryavanshi wants to merge 1 commit into openai:main
Fija pushed a commit to Fija/parameter-golf that referenced this pull request on Apr 28, 2026
Phase J (one-time data prep, done):
- train_sp10240_caseops.py: train SentencePiece BPE at vocab=10240 over CaseOps-transformed FineWeb. Reserves U+E001..U+E005 as user-defined symbols (matches the PR openai#1729 / SP8192 reservation set). 96 workers, ~25 min.
- prepare_caseops_data_parallel.py with --sp pointing at the new model produces SP10240 caseops shards (~27 GB). Uploaded to the private HF dataset hf://FijaEE/parameter-golf-sp10240-caseops (1434 train + 5 val + 5 val_bytes shards).
- Tokenizer model + vocab file committed under tokenizers/ so they are available on git clone.

Phase K (TTT params budget tradeoff, ready to run):
- runpod/phase_k_ttt_tradeoff.sh: train the SP8192 V2 baseline once on 8xH100 (~10 min, saves model.bin), then run TTT_EVAL_ONLY=1 for 4 configs reusing the saved artifact:
    K0: grad=1 prefix=2000 phases=3 ctx=2048 (V2 baseline)
    K1: grad=2 prefix=2000 phases=3 ctx=2048 (oracle, expected over-budget)
    K2: grad=2 prefix=1500 phases=1 ctx=2048 (cut prefix+phases)
    K3: grad=2 prefix=2000 phases=3 ctx=1024 (cut ctx)
  Auto-picks the lowest-BPB config that fits 600s for Phase L.

Phase L (3-seed combo, parametrized by Phase K winner):
- runpod/phase_l_combo.sh: PR openai#1797 V2 stack + SP10240 + LoRA rank 96 + best TTT params from K. Runs 3 seeds (42, 314, 1234), reports Welch t-test vs PR openai#1797 (1.06157±0.00066) and the 0.005-nat record bar.

Hypothesis (per user observation): the vocab progression 1024→2048→4096→8192 has been monotonically beneficial; no one in the queue has tried sp10240 without PPM-D. PR openai#1814's lowercase-SP10240 single-seed (1.0742) suggests ~-0.0015 BPB delta from vocab alone vs PR openai#1797's V2 SP8192 baseline (1.05998 seed-42). Combined with the TTT 2-step bump (PR openai#1812 showed 4-epoch delivered -0.008 BPB on a different stack) and LoRA rank 96, total expected ~1.045-1.055 BPB if Phase K finds a feasible budget.
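The Phase K selection rule (lowest BPB among configs that fit the 600s budget) and the Phase L Welch t-test against a published mean±std could be sketched roughly as follows. This is an illustration only, not the contents of phase_k_ttt_tradeoff.sh or phase_l_combo.sh: the function names, the `results` dictionary layout, and the `eval_s` key are hypothetical, and it assumes the "±" in the reference figure is a sample standard deviation over `ref_n` seeds, which the commit message does not state.

```python
import math
from statistics import mean, variance


def pick_config(results, budget_s=600.0):
    """Return the name of the lowest-BPB config whose eval time fits the budget.

    `results` maps a config name to {"bpb": float, "eval_s": float}
    (hypothetical layout). Returns None if nothing fits.
    """
    feasible = {k: v for k, v in results.items() if v["eval_s"] <= budget_s}
    if not feasible:
        return None
    return min(feasible, key=lambda k: feasible[k]["bpb"])


def welch_t(sample, ref_mean, ref_std, ref_n):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    `sample` holds per-seed BPB values for the new stack; the reference
    run is given only by summary statistics. Assumes ref_std is a sample
    standard deviation over ref_n runs, and that at least one of the two
    variances is non-zero.
    """
    n = len(sample)
    se2_new = variance(sample) / n      # squared standard error, new stack
    se2_ref = ref_std ** 2 / ref_n      # squared standard error, reference
    t = (mean(sample) - ref_mean) / math.sqrt(se2_new + se2_ref)
    df = (se2_new + se2_ref) ** 2 / (
        se2_new ** 2 / (n - 1) + se2_ref ** 2 / (ref_n - 1)
    )
    return t, df
```

With three seeds on each side, a call such as `welch_t(seed_bpbs, 1.06157, 0.00066, 3)` would return a negative `t` when the new stack's mean BPB is lower than the reference; the resulting `t` and `df` can then be looked up against a Student t distribution.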
No description provided.