Record: SP10240 Casefold + TTT + GPTQ + PPM-D — val_bpb 0.82005771 (3-seed mean)#1873

Open
schattenjuwel wants to merge 5 commits into openai:main from schattenjuwel:main

@schattenjuwel

val_bpb: 0.82005771 (3-seed mean, seeds 123/999/42)
Artifact size: ~15.99 MB (all 3 runs < 16,000,000 bytes)
Training time: < 600s wall clock (8×H100 SXM)

Combines SP10240 casefold tokenization, Test-Time Training (TTT), GPTQ quantization, and a novel byte-level PPM-D order-5 mixture. The PPM predictor runs causally on rank 0 after distributed TTT scoring. Its output is mixed with the model's at the token-probability level under a confidence gate (λ=0.05 when PPM confidence ≥ 0.9).
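As a rough illustration of the confidence-gated mixture described above, here is a minimal Python sketch. It is not taken from the PR's code. The description does not say which side λ weights or what happens below the threshold, so this sketch assumes λ is the weight kept on the neural model when the gate fires, and that the model's probability is used unchanged otherwise. The names `gated_mix`, `bits_per_byte`, and `lam_fallback` are hypothetical.

```python
import math


def gated_mix(p_model, p_ppm, ppm_conf, lam=0.05, conf_threshold=0.9,
              lam_fallback=1.0):
    """Mix two probabilities assigned to the observed token.

    Assumption (not stated in the PR): `lam` is the weight on the neural
    model and (1 - lam) goes to the PPM predictor. `lam_fallback` is a
    hypothetical weight used when PPM confidence is below the threshold;
    1.0 means "use the neural model alone".
    """
    weight = lam if ppm_conf >= conf_threshold else lam_fallback
    return weight * p_model + (1.0 - weight) * p_ppm


def bits_per_byte(token_probs, n_bytes):
    """Total code length in bits of the scored tokens, divided by byte count."""
    total_bits = -sum(math.log2(p) for p in token_probs)
    return total_bits / n_bytes


# Gate fires: PPM is confident, so the mixture leans almost entirely on PPM.
confident = gated_mix(p_model=0.2, p_ppm=0.95, ppm_conf=0.95)
# Gate does not fire: the model's probability passes through unchanged.
unsure = gated_mix(p_model=0.2, p_ppm=0.5, ppm_conf=0.5)
print(confident, unsure)
```

Under these assumptions the gate only helps when PPM's confident predictions are usually right, since a confident miss leaves almost no probability mass on the correct token.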
