Commit e608af8

exp77: order-13 flat Dirichlet + phrase[36,28,20,16]
Extend the n-gram model to order 13 (PR openai#921 validates higher orders: 0.0939). Trim the phrase probe lengths to [36,28,20,16] to fit the eval budget. Keep flat Dirichlet with c=1.0 (highest match only, which avoids hierarchical overhead).
1 parent 1b32847 commit e608af8
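
One way to read "flat Dirichlet c=1.0 (highest match only)" in the commit message: pick the longest n-gram context that has enough observations, then blend its counts with the neural model's probability using a single concentration c, without recursing through lower orders. The sketch below assumes that reading. The function, its arguments, and the count-table layout are hypothetical and not taken from train_gpt.py; only the defaults (order 13, min order 2, min count 2, c=1.0) mirror values in this commit.

```python
def flat_dirichlet_prob(context, token, model_prob, ctx_counts, tok_counts,
                        max_order=13, min_order=2, min_count=2, c=1.0):
    """Blend n-gram counts with a model probability (hypothetical helper).

    ctx_counts maps a context tuple to how often it was seen;
    tok_counts maps (context tuple, token) to how often token followed it.
    """
    # Try the highest order first and stop at the first context with enough evidence.
    for order in range(max_order, min_order - 1, -1):
        n_ctx = order - 1  # an order-n model conditions on n-1 tokens
        if len(context) < n_ctx:
            continue
        key = tuple(context[-n_ctx:])
        total = ctx_counts.get(key, 0)
        if total >= min_count:
            hits = tok_counts.get((key, token), 0)
            # Dirichlet posterior mean with the model probability as the prior.
            return (hits + c * model_prob) / (total + c)
    # No context matched: fall back to the model alone.
    return model_prob
```

A hierarchical variant would instead feed each order's estimate in as the prior for the next order up, which costs one blend per order per token; the flat form does a single blend.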

2 files changed

Lines changed: 3 additions & 2 deletions

results.tsv

Lines changed: 1 addition & 0 deletions
@@ -47,3 +47,4 @@ b224b23 1.1323 15.88 keep TTT AdamW 5ep lr=0.0005 DDP-synced BEATS SOTA! sw=1.13
 1a8ee89 0.2534 15.26 discard hierarchical Dirichlet c=0.5 order-9 slightly worse than c=1.0, eval=532s
 cd10ecb 0.2463 15.39 keep flat Dirichlet c=1.0 + phrase[28,20,16] NEW BEST! phrase[28] adds -0.007, eval=529s
 fc5f627 0.2417 15.39 keep flat Dirichlet c=1.0 + phrase[36,28,20,16] NEW BEST! phrase[36] adds -0.005, eval=548s
+1b32847 0.2380 15.65 keep flat Dirichlet c=1.0 + phrase[48,36,28,20,16] NEW BEST! -0.004, eval=586s (14s spare)
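
The note "eval=586s (14s spare)" implies an evaluation budget of 600 s. The short sketch below works through the cost of each added probe length using only the numbers in the results.tsv rows above. Variable names are illustrative, and "score" simply means the second column, whose metric is not named in this diff.

```python
# (score, eval seconds) copied from the results.tsv rows in this diff
runs = {
    "phrase[28,20,16]": (0.2463, 529),
    "phrase[36,28,20,16]": (0.2417, 548),
    "phrase[48,36,28,20,16]": (0.2380, 586),
}
budget_s = 586 + 14  # implied by "eval=586s (14s spare)"

names = list(runs)
deltas = []
for prev, cur in zip(names, names[1:]):
    d_score = round(runs[cur][0] - runs[prev][0], 4)
    d_time = runs[cur][1] - runs[prev][1]
    spare = budget_s - runs[cur][1]
    deltas.append((cur, d_score, d_time, spare))
    print(f"{cur}: score {d_score:+.4f}, eval {d_time:+d}s, spare {spare}s")
```

On these numbers the length-48 probe costs about 38 s of eval time, so dropping it is what leaves room for the order-13 n-gram tables in this commit.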

train_gpt.py

Lines changed: 2 additions & 2 deletions
@@ -896,7 +896,7 @@ def eval_val_sliding(
 class LongPhraseCache:
     """variable-length suffix matcher for verbatim repetition (PR #880).
     probes at lengths [48,36,28,20,16] using rolling hashes."""
-    PROBE_LENGTHS = [48, 36, 28, 20, 16]  # full probe set (matching PR #880)
+    PROBE_LENGTHS = [36, 28, 20, 16]  # trimmed to fit with order-13
     PRIMES = [np.uint64(p) for p in [
         36313, 27191, 51647, 81929, 131071, 174763, 233017, 299993, 350377,
         412391, 479909, 541267, 613651, 700897, 786433, 850001, 921587,
@@ -2116,7 +2116,7 @@ def lr_mul(step: int, elapsed_ms: float) -> float:
     ngram_enabled = bool(int(os.environ.get("NGRAM_ENABLED", "1")))
     sw_seq_len = effective_eval_seq_len
     if ngram_enabled:
-        ngram_order = int(os.environ.get("NGRAM_ORDER", "9"))
+        ngram_order = int(os.environ.get("NGRAM_ORDER", "13"))
         ngram_min_order = int(os.environ.get("NGRAM_MIN_ORDER", "2"))
         ngram_buckets = int(os.environ.get("NGRAM_BUCKETS", "4194304"))
         ngram_min_count = int(os.environ.get("NGRAM_MIN_COUNT", "2"))
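
The `LongPhraseCache` docstring in the first hunk describes probing suffixes at several lengths, longest first. The sketch below illustrates only that general idea and is not the class from train_gpt.py: it keys plain dicts on token tuples instead of using rolling hashes, and every name in it other than `PROBE_LENGTHS` is hypothetical.

```python
PROBE_LENGTHS = [36, 28, 20, 16]  # the trimmed set from this commit


class SuffixProbe:
    """Toy longest-suffix matcher (tuple keys stand in for rolling hashes)."""

    def __init__(self, probe_lengths=PROBE_LENGTHS):
        # Probe the longest suffix first so the most specific match wins.
        self.probe_lengths = sorted(probe_lengths, reverse=True)
        # One table per length: suffix tuple -> {next_token: count}
        self.tables = {n: {} for n in self.probe_lengths}

    def update(self, tokens, pos):
        """Record tokens[pos] as the continuation of each suffix ending at pos."""
        for n in self.probe_lengths:
            if pos >= n:
                key = tuple(tokens[pos - n:pos])
                nxt = self.tables[n].setdefault(key, {})
                nxt[tokens[pos]] = nxt.get(tokens[pos], 0) + 1

    def lookup(self, tokens, pos):
        """Return (length, counts) for the longest seen suffix before pos, else None."""
        for n in self.probe_lengths:
            if pos >= n:
                counts = self.tables[n].get(tuple(tokens[pos - n:pos]))
                if counts:
                    return n, counts
        return None
```

Calling `lookup` before `update` at each position keeps the cache causal: a token is only predicted from text that precedes it. Each extra probe length adds one table lookup and one table update per token, which is consistent with the eval-time growth recorded in results.tsv.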
