v10 moonshot: ternary MLP quant + scaled model + hedge mixer + enhanced n-gram #2

Merged
sunnypatneedi merged 2 commits into main from claude/priceless-rosalind on Mar 26, 2026

@sunnypatneedi
Owner

Re-opened from openai#863, which was originally opened on the wrong repo.

sunnypatneedi and others added 2 commits March 26, 2026 01:05
…hanced n-gram

- train_gpt_v10_safe.py: v9a + Hedge Mixer (multiplicative weights) + add-delta n-gram smoothing, dim=512
- train_gpt_v10_moonshot.py: model_dim=640 (42M params) + adaptive quant (ternary MLP / int4 attn / int6 embed) + Hedge Mixer
- auto_experiment.py: local CPU random search over 20 configs, logs to experiments.jsonl
- submit.sh: packaging and staging script for H100 runs
- PLAN.md: strategy doc with size estimates and run order

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
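
For readers unfamiliar with the two techniques named in the first commit, here is a minimal sketch of add-delta n-gram smoothing and a Hedge (multiplicative weights) mixer. It is not taken from train_gpt_v10_safe.py: the function names `add_delta_prob` and `hedge_mix`, the `delta` and `eta` defaults, and the use of log loss as the per-expert loss are all assumptions made for illustration.

```python
import math


def add_delta_prob(counts, context_total, token, vocab_size, delta=0.1):
    """Add-delta smoothed P(token | context).

    counts: dict mapping token -> count observed after this context.
    context_total: total count of the context (sum of counts.values()).
    Every vocabulary entry receives `delta` pseudo-counts, so unseen
    tokens get a small non-zero probability.
    """
    return (counts.get(token, 0) + delta) / (context_total + delta * vocab_size)


def hedge_mix(expert_probs, eta=1.0, eps=1e-12):
    """Mix experts online with multiplicative weights (Hedge).

    expert_probs: one list per step, holding the probability each expert
    assigned to the token that actually occurred at that step.
    Returns (total mixture log loss in nats, final expert weights).
    Assumption: the loss fed to Hedge is log loss, -log(p).
    """
    n_experts = len(expert_probs[0])
    weights = [1.0 / n_experts] * n_experts
    total_loss = 0.0
    for probs in expert_probs:
        probs = [max(p, eps) for p in probs]  # guard against log(0)
        mixed = sum(w * p for w, p in zip(weights, probs))
        total_loss += -math.log(mixed)
        # w_k <- w_k * exp(-eta * loss_k); with log loss this is w_k * p_k**eta
        weights = [w * p ** eta for w, p in zip(weights, probs)]
        norm = sum(weights)
        weights = [w / norm for w in weights]
    return total_loss, weights
```

With `eta=1.0` and log loss, the update coincides with a Bayesian mixture over the experts, so the mixture's total loss is at most the best expert's loss plus `log(n_experts)`. Other values of `eta` trade off how quickly the weights concentrate on the currently best expert.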
- validate_configs.py: CPU-only artifact size estimator for moonshot configs (no GPU/data needed)
- experiments.jsonl: 20 initial random search results from auto_experiment.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
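
As a rough illustration of how a CPU-only size estimate for a mixed-precision config could work, the sketch below sums bits per weight across parameter groups. It is not the code in validate_configs.py: `estimate_bytes`, the `BITS_PER_WEIGHT` table, and the group names are hypothetical. It assumes ternary weights are packed at the ideal log2(3) bits each, and it ignores per-tensor scales, metadata, and any later compression.

```python
import math

# Assumed storage cost per weight for each scheme named in the PR.
# Ternary uses the information-theoretic minimum, log2(3) ~ 1.585 bits.
BITS_PER_WEIGHT = {
    "ternary": math.log2(3),
    "int4": 4.0,
    "int6": 6.0,
}


def estimate_bytes(param_counts, schemes):
    """Estimate the packed weight size in bytes.

    param_counts: dict mapping group name -> number of parameters.
    schemes: dict mapping group name -> key of BITS_PER_WEIGHT.
    """
    total_bits = 0.0
    for group, n_params in param_counts.items():
        total_bits += n_params * BITS_PER_WEIGHT[schemes[group]]
    return math.ceil(total_bits / 8)


# Toy parameter counts for illustration only (not the real model's split).
toy_counts = {"mlp": 8, "attn": 4, "embed": 4}
toy_schemes = {"mlp": "ternary", "attn": "int4", "embed": "int6"}
toy_size = estimate_bytes(toy_counts, toy_schemes)
```

Because the estimate depends only on parameter counts and bit widths, it needs neither a GPU nor training data, which matches the stated purpose of the estimator.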
@sunnypatneedi sunnypatneedi merged commit c6ec05f into main Mar 26, 2026