@valerio-oai @0hq @cocohearts @openai/parameter-golf-team Hi again! I've submitted a new record PR with a substantial improvement over the current 1.0810 BPB SOTA.
It builds on PR #1797's base, with two technical contributions on top:
- SmearGate cross-document BOS leak fix: masks the prev-token term wherever the current token is BOS, so packed-stream eval no longer leaks doc N's last token into doc N+1's BOS embedding.
- Per-group compression pipeline: adds `COMPRESSOR=pergroup` (lrzip ZPAQ + L1 row similarity-sort on hot tensors + brotli remainder), ~280 KB smaller artifact.
Plus a stack of 9 greedy-validated hyperparameter overrides (full table in the PR).
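
For anyone skimming, here is a rough sketch of the idea behind the SmearGate fix. It is a NumPy illustration only, not the code in the PR: the function name `smear_with_bos_mask`, the per-position `gate` array, the additive mixing form, and `bos_id = 0` are all assumptions made for the example.

```python
import numpy as np

def smear_with_bos_mask(emb, tokens, gate, bos_id=0):
    """Mix each position's previous-token embedding into the current one,
    except where the current token is BOS (hypothetical sketch).

    emb:    (T, D) token embeddings for a packed stream of documents
    tokens: (T,)   token ids
    gate:   (T,)   per-position mixing weights (assumed already computed)
    """
    # Shift embeddings right by one so prev[t] == emb[t - 1]; position 0 has no predecessor.
    prev = np.zeros_like(emb)
    prev[1:] = emb[:-1]
    # Zero the prev-token term at BOS positions so the last token of
    # doc N cannot leak into doc N+1's BOS embedding.
    keep = (tokens != bos_id).astype(emb.dtype)[:, None]
    return emb + keep * gate[:, None] * prev

# Two packed documents: [BOS, 5, 6] and [BOS, 7]
tokens = np.array([0, 5, 6, 0, 7])
emb = np.arange(10, dtype=np.float64).reshape(5, 2)
gate = np.ones(5)
out = smear_with_bos_mask(emb, tokens, gate)
```

With the mask in place, `out[3]` (the second document's BOS) equals `emb[3]` unchanged, while non-BOS positions still receive the previous token's contribution.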
Happy to address any concerns — thanks again for taking the time to review!