Add W104 faithful SP8192 LegalTTT bad-seed probe #9
Merged
Motivation
- Run `seed=314` first and gate additional seeds (42, 999) on its pass, to focus effort on the bad-seed failure mode.

Description
- `records/track_10min_16mb/2026-04-20_SP8192_LegalTTT_W104_FaithfulReplay/`, containing the replay artifacts and guidance.
- `train_gpt.py`, which surfaces source-visible defaults at the top (`VOCAB_SIZE`, `TOKENIZER_PATH`, `DATA_PATH`, `TRAIN_SHARDS_OVERRIDE`, `QK_GAIN_INIT`, `TTT_ENABLED`, `TTT_LR`, `TTT_EPOCHS`) and exports them via `os.environ.setdefault(...)`, while preserving the original packed SP8192 LegalTTT payload and architecture/compression surface (3-layer recurrence, parallel residuals, QK gain 5.25, legal score-first TTT, quantized+brotli target under 16 MB).
- `run_w104_seed314_probe.sh`, which bootstraps a venv and deps, caches the official FineWeb `sp8192` using `MATCHED_FINEWEB_REPO_ID="kevclark/parameter-golf"`, runs only `SEED=314`, writes logs to `/workspace/w104_seed314.log`, and prints the final `quantized_ttt val_bpb`.
- `README.md`, explaining that this is not a new submission, documenting intent and pass criteria for `seed314` (must beat `1.08168719`; strong pass `<1.0812`), and instructing to run seeds 42/999 only after `seed314` passes.

Testing
- Ran `python -m py_compile train_gpt.py` in the new folder, and it succeeded.
- Ran `grep` for the expected lines in `train_gpt.py`, and all expected defaults were found.
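
For reviewers, here is a minimal sketch of the two mechanisms described above: exporting source-visible defaults with `os.environ.setdefault(...)`, and applying the `seed314` pass criteria to a log. It is not the code in this PR. The `DEFAULTS` values other than the QK gain of 5.25, the helper names `classify` and `last_val_bpb`, and the assumed log line shape (`quantized_ttt ... val_bpb:<number>`) are all hypothetical, and "beat" is assumed to mean a strictly lower `val_bpb`.

```python
import os
import re

# Hypothetical defaults for illustration. Only QK_GAIN_INIT=5.25 is stated in
# the PR description; VOCAB_SIZE=8192 is an assumption inferred from "SP8192".
DEFAULTS = {
    "VOCAB_SIZE": "8192",
    "QK_GAIN_INIT": "5.25",
}

# setdefault only writes a value when the variable is not already set,
# so anything exported by the caller's shell still takes precedence.
for key, value in DEFAULTS.items():
    os.environ.setdefault(key, value)

# Thresholds quoted in the PR description.
BASELINE_BPB = 1.08168719  # seed 314 must beat this
STRONG_PASS_BPB = 1.0812   # strong pass is below this


def classify(val_bpb: float) -> str:
    """Classify a seed-314 result, assuming lower val_bpb is better."""
    if val_bpb < STRONG_PASS_BPB:
        return "strong pass"
    if val_bpb < BASELINE_BPB:
        return "pass"
    return "fail"


def last_val_bpb(log_text: str):
    """Return the last quantized_ttt val_bpb in the log, or None if absent.

    The log line format matched here is an assumption, not the real format.
    """
    pattern = r"quantized_ttt.*?val_bpb[:=\s]+([0-9]+\.[0-9]+)"
    matches = re.findall(pattern, log_text)
    return float(matches[-1]) if matches else None
```

Under these assumptions, seeds 42 and 999 would be run only when `classify(...)` on the `seed=314` result returns `"pass"` or `"strong pass"`.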