Record: 11-gram Eval Cache + Hedge Mixer (val_bpb: 0.8609) #7

Merged
sunnypatneedi merged 1 commit into main from claude/clever-darwin on Mar 27, 2026

@sunnypatneedi
Owner

Summary

  • Adds a submission for track 10min_16mb: an 11-gram eval cache with entropy-adaptive alpha and a Hedge Mixer
  • 3-seed mean 0.8609 bpb (seed 42→0.8600, 1337→0.8611, 2025→0.8616)
  • All artifacts under 16MB limit
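
For readers unfamiliar with the terms in the summary, here is a minimal sketch of the two ideas it names: a Hedge (multiplicative-weights) mixer over several predictors, and a cache weight that grows with the model's entropy. It is a generic illustration, not the code in train_gpt.py. The names `HedgeMixer`, `entropy_adaptive_alpha`, `eta`, and `alpha_max` are hypothetical, and how the submission actually combines the two pieces is not stated in this PR.

```python
import math


def entropy_adaptive_alpha(model_probs, alpha_max=0.5):
    """Cache weight that scales with the model's normalized entropy.

    Assumed rule for illustration only: alpha = alpha_max * H(p) / log(V).
    """
    h = -sum(p * math.log(p) for p in model_probs if p > 0)
    h_max = math.log(len(model_probs))
    return alpha_max * h / h_max if h_max > 0 else 0.0


class HedgeMixer:
    """Standard Hedge update over experts, using log loss on the true token."""

    def __init__(self, n_experts, eta=1.0):
        self.eta = eta  # learning rate (hypothetical default)
        self.weights = [1.0 / n_experts] * n_experts

    def mix(self, expert_probs):
        """Return the weighted mixture of the experts' distributions."""
        vocab = len(expert_probs[0])
        return [
            sum(w * p[t] for w, p in zip(self.weights, expert_probs))
            for t in range(vocab)
        ]

    def update(self, expert_probs, target):
        """Reweight experts after the target token has been scored."""
        losses = [-math.log(max(p[target], 1e-12)) for p in expert_probs]
        scaled = [w * math.exp(-self.eta * l) for w, l in zip(self.weights, losses)]
        total = sum(scaled)
        self.weights = [s / total for s in scaled]


# Two toy experts over a 2-token vocabulary: a confident one and a uniform one.
mixer = HedgeMixer(n_experts=2)
probs = [[0.9, 0.1], [0.5, 0.5]]
mixed = mixer.mix(probs)        # score the token with the current weights first
mixer.update(probs, target=0)   # only then reveal the target and update
```

Calling `mix` before `update` keeps the mixture causal: the true token influences the weights only after it has been scored.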

Files

  • records/track_10min_16mb/2026-03-26_sunnypatneedi_moonshot/submission.json — submission metadata
  • records/track_10min_16mb/2026-03-26_sunnypatneedi_moonshot/README.md — experiment notes
  • records/track_10min_16mb/2026-03-26_sunnypatneedi_moonshot/train_gpt.py — training code

Test plan

  • Verify submission.json seeds/track fields match requirements
  • Confirm artifact sizes ≤ 16MB
  • Re-run eval on at least one seed to validate bpb score
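
The first two checks in the test plan could be scripted roughly as follows. This is a sketch under stated assumptions: the key names `track` and `seeds` are guessed from the wording of the test plan rather than taken from the actual submission.json schema, "16MB" is read as 16 * 1024 * 1024 bytes, and the limit is applied per file. Adjust all three if the track rules say otherwise.

```python
import json
from pathlib import Path

# Assumption: 16MB means 16 MiB; the track may define it as 16,000,000 bytes.
SIZE_LIMIT_BYTES = 16 * 1024 * 1024


def missing_fields(record_dir, required=("track", "seeds")):
    """Return the required keys absent from submission.json (key names assumed)."""
    meta = json.loads((Path(record_dir) / "submission.json").read_text())
    return [key for key in required if key not in meta]


def oversized_files(record_dir, limit=SIZE_LIMIT_BYTES):
    """Return (path, size) for every file under record_dir larger than limit."""
    return [
        (str(path), path.stat().st_size)
        for path in sorted(Path(record_dir).rglob("*"))
        if path.is_file() and path.stat().st_size > limit
    ]
```

Both helpers return an empty list when the check passes, so `not missing_fields(d) and not oversized_files(d)` is a single pass/fail condition for a record directory `d`.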

🤖 Generated with Claude Code

3-seed mean 0.8609 bpb (42→0.8600, 1337→0.8611, 2025→0.8616).
All artifacts under 16MB. 11-gram n-gram cache with entropy-adaptive
alpha and Hedge Mixer on PR openai#549 base architecture.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
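
The 3-seed mean quoted in the commit message can be re-derived from the per-seed values it lists. The snippet below is only an arithmetic check on those reported numbers; it does not re-run the evaluation.

```python
from statistics import mean

# Per-seed val_bpb values as reported in the commit message.
seed_bpb = {42: 0.8600, 1337: 0.8611, 2025: 0.8616}

mean_bpb = mean(seed_bpb.values())
spread = max(seed_bpb.values()) - min(seed_bpb.values())

print(f"mean val_bpb = {mean_bpb:.4f}")
print(f"seed spread  = {spread:.4f}")
```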

sunnypatneedi merged commit 6e32ea7 into main on Mar 27, 2026

sunnypatneedi pushed a commit that referenced this pull request on May 3, 2026
…AsymLogit Rescale, 7th BPB bug

- logs/daily_research.md: new May 3 entry; DRAFT PR openai#2146 grace-policy audit adds 4 records
  (pending SOTA 1.05651 via PR openai#2135); AsymLogit Rescale documented (~5 lines, zero legality
  risk); PR openai#2124 seed/config inconsistency; PR openai#2138 BPB bug #7 confirmed; data overlap
  hazard in PR openai#2130 flagged; no new high-relevance papers beyond prior scan.
- CLAUDE.md: Competition Strategy updated to reflect closed competition, pending audit status,
  and key post-competition findings (AsymLogit Rescale, GPTQ calibration batches, data overlap
  isolation requirement).

https://claude.ai/code/session_013Q2rFE4xRHRRYaSPfzCiip