Commit 1b32847

exp76: full phrase probes [48,36,28,20,16] (PR openai#880 set)
Each additional probe length has lowered BPB by roughly 0.005 to 0.007: probe[28] → -0.007, probe[36] → -0.005. This run tests whether probe[48] captures even longer verbatim patterns.
1 parent fc5f627 commit 1b32847

2 files changed

Lines changed: 2 additions & 1 deletion

results.tsv

Lines changed: 1 addition & 0 deletions
@@ -46,3 +46,4 @@ b224b23 1.1323 15.88 keep TTT AdamW 5ep lr=0.0005 DDP-synced BEATS SOTA! sw=1.13
 49aaca9 0.2532 15.67 discard hierarchical Dirichlet c=5.0 + order-13 no improvement, eval=627s OVER BUDGET
 1a8ee89 0.2534 15.26 discard hierarchical Dirichlet c=0.5 order-9 slightly worse than c=1.0, eval=532s
 cd10ecb 0.2463 15.39 keep flat Dirichlet c=1.0 + phrase[28,20,16] NEW BEST! phrase[28] adds -0.007, eval=529s
+fc5f627 0.2417 15.39 keep flat Dirichlet c=1.0 + phrase[36,28,20,16] NEW BEST! phrase[36] adds -0.005, eval=548s

train_gpt.py

Lines changed: 1 addition & 1 deletion
@@ -896,7 +896,7 @@ def eval_val_sliding(
 class LongPhraseCache:
     """variable-length suffix matcher for verbatim repetition (PR #880).
     probes at lengths [48,36,28,20,16] using rolling hashes."""
-    PROBE_LENGTHS = [36, 28, 20, 16] # extended probes for more phrase matching
+    PROBE_LENGTHS = [48, 36, 28, 20, 16] # full probe set (matching PR #880)
     PRIMES = [np.uint64(p) for p in [
         36313, 27191, 51647, 81929, 131071, 174763, 233017, 299993, 350377,
         412391, 479909, 541267, 613651, 700897, 786433, 850001, 921587,
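
For readers unfamiliar with the idea behind PROBE_LENGTHS, here is a minimal sketch of a longest-first suffix probe. It is not the code in train_gpt.py: the class name PhraseProbeSketch, the BASE multiplier, and the helper suffix_hash are hypothetical, and the hash is recomputed from scratch on every lookup instead of being rolled, which is slower but easier to follow. Only the probe lengths are taken from the diff.

```python
from collections import Counter, defaultdict

PROBE_LENGTHS = [48, 36, 28, 20, 16]  # probe set from this commit
MASK = (1 << 64) - 1                  # wrap hashes to 64 bits
BASE = 1000003                        # hypothetical multiplier, not from train_gpt.py


def suffix_hash(tokens, end, length):
    """Polynomial hash of tokens[end-length:end], wrapped to 64 bits."""
    h = 0
    for t in tokens[end - length:end]:
        h = (h * BASE + t + 1) & MASK
    return h


class PhraseProbeSketch:
    """Hypothetical stand-in for a long-phrase cache: one table per probe length."""

    def __init__(self, probe_lengths=PROBE_LENGTHS):
        # Longest probe first, so the most specific match wins.
        self.probe_lengths = sorted(probe_lengths, reverse=True)
        self.tables = {L: defaultdict(Counter) for L in self.probe_lengths}

    def predict(self, tokens, pos):
        """Return (matched_length, most_common_next_token) or None."""
        for L in self.probe_lengths:
            if pos >= L:
                counts = self.tables[L].get(suffix_hash(tokens, pos, L))
                if counts:
                    return L, counts.most_common(1)[0][0]
        return None

    def update(self, tokens, pos):
        """Record that tokens[pos] followed each length-L suffix ending at pos."""
        for L in self.probe_lengths:
            if pos >= L:
                self.tables[L][suffix_hash(tokens, pos, L)][tokens[pos]] += 1


def run_probes(tokens):
    """Predict-then-update over a token stream; returns one prediction per position."""
    cache = PhraseProbeSketch()
    preds = []
    for pos in range(len(tokens)):
        preds.append(cache.predict(tokens, pos))
        cache.update(tokens, pos)
    return preds
```

Calling run_probes on a stream that repeats a 50-token phrase shows why a longer probe can help: early in the repeat only the 16-token probe has enough context to match, while near the end of the repeat the 48-token probe matches and takes priority. Whether that extra specificity lowers BPB on real data is what this experiment is measuring.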
