
Non-record: Trinity Ternary CPU v3 — Apple M1 Pro 72h, val_bpb 1.5042 #1866

Open
deborahnelson8788726 wants to merge 1 commit into openai:main from deborahnelson8788726:trinity-ternary-cpu

Conversation


@deborahnelson8788726 deborahnelson8788726 commented Apr 27, 2026

Non-record: Trinity Ternary CPU v3 on Apple M1 Pro

val_bpb: 1.5042 (single seed). A 24M-parameter model trained for 72.04 hours on an Apple M1 Pro, CPU only (10 cores, 16 GB RAM, no GPU/MPS/NPU).

This is intentionally a non-record / unlimited-compute / notable submission. It is not a main 10-minute leaderboard claim because training used a laptop CPU for 72 hours rather than 8xH100 for 600 seconds.

Scope after cleanup

This PR now contains one submission folder only:

records/track_non_record_16mb/2026-04-24_Trinity_Ternary_CPU_v2/

Result summary

Metric                  Value
val_bpb                 1.5042
val_loss                2.5479
tokens/byte (SP1024)    0.4092
artifact size           5,525,048 bytes (5.53 MB, decimal)
training time           72.04 h on M1 Pro CPU
total params            24,128,000
ternary params          23,592,960
final ternary blend     alpha = 1.0 (full ternary)
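As a consistency check, the reported val_bpb follows directly from val_loss and the tokens/byte ratio, assuming val_loss is mean cross-entropy in nats per token:

```python
import math

val_loss = 2.5479          # mean cross-entropy, nats per token (from the table)
tokens_per_byte = 0.4092   # SP1024 tokenizer ratio (from the table)

bits_per_token = val_loss / math.log(2)    # convert nats to bits
val_bpb = bits_per_token * tokens_per_byte # bits per byte of raw text
print(round(val_bpb, 4))                   # -> 1.5042
```

The three reported numbers agree to four decimal places, which supports the table's internal consistency.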

Method

  • 10L 512d 8-head transformer, MLP 2.5x, RoPE, RMSNorm, tied 1024-vocab embeddings.
  • BitNet b1.58-style ternary QAT with STE and full alpha=1.0 ternary weights.
  • Trinity base-3 packing: 5 balanced trits per byte, lossless, 99% of the log2(3) theoretical optimum.
  • Step-based ternary ramp plus cosine LR decay, so macOS sleep cannot advance the quantization schedule while training is paused.
  • CPU-only training path for Apple Silicon / commodity laptop reproducibility.
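The ternary QAT step above can be sketched as follows. This is a minimal illustration of BitNet b1.58-style absmean quantization, not code from this PR's train_gpt.py; the function name and per-tensor scaling granularity are assumptions.

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Forward pass of BitNet b1.58-style absmean ternary quantization (sketch).

    Weights are scaled by their mean absolute value, rounded to {-1, 0, +1},
    then rescaled. During training, the straight-through estimator (STE) would
    treat this step's gradient as identity so updates flow to the latent
    full-precision weights.
    """
    scale = max(np.abs(w).mean(), eps)                 # per-tensor absmean scale
    return np.clip(np.round(w / scale), -1, 1) * scale # ternary levels * scale

w = np.array([0.9, -0.4, 0.05, -1.2])
scale = max(np.abs(w).mean(), 1e-8)        # 0.6375 for this example
q = ternary_quantize(w)
print((q / scale).astype(int))             # -> [ 1 -1  0 -1]
```

The "ternary ramp" in the bullet list would blend this quantized forward pass with the full-precision weights on a step-based schedule until alpha reaches 1.0 (fully ternary).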

Compliance

Track A style evaluation: causal attention, normalized full-vocab softmax, single left-to-right pass, no SLOT, no n-gram cache, no pre-quant TTT, no eval-time adaptation.

Reproduction notes

The packed submission artifact is included as final_model_v3.trinity.ptz. The reported v3 training run warm-started from a prior v1 CPU checkpoint that is not included in this PR; exact reruns should set WARM_START_PATH=/path/to/final_model.pt. Without that variable, train_gpt.py runs the same configuration from scratch.

python3 data/cached_challenge_fineweb.py --variant sp1024
WARM_START_PATH=/path/to/final_model.pt caffeinate -i -m -s python3 records/track_non_record_16mb/2026-04-24_Trinity_Ternary_CPU_v2/train_gpt.py
python3 records/track_non_record_16mb/2026-04-24_Trinity_Ternary_CPU_v2/pack_and_eval_v3.py
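For reference, the Trinity base-3 packing named under Method (5 balanced trits per byte, since 3^5 = 243 <= 256) can be sketched as below. This is an illustrative layout only; the actual on-disk format of final_model_v3.trinity.ptz (trit ordering, headers, handling of lengths not divisible by 5) is produced by pack_and_eval_v3.py and is not specified in this PR.

```python
import numpy as np

POWERS = np.array([1, 3, 9, 27, 81], dtype=np.int64)  # base-3 place values

def pack_trits(trits: np.ndarray) -> np.ndarray:
    """Pack balanced trits {-1, 0, +1} into bytes, 5 trits per byte.

    Density is 5 * log2(3) / 8, about 99% of the information-theoretic
    optimum for ternary data, and the packing is lossless.
    """
    t = trits.reshape(-1, 5).astype(np.int64) + 1      # shift to digits {0, 1, 2}
    return (t * POWERS).sum(axis=1).astype(np.uint8)   # byte values in 0..242

def unpack_trits(packed: np.ndarray) -> np.ndarray:
    digits = (packed.astype(np.int64)[:, None] // POWERS) % 3
    return (digits - 1).reshape(-1).astype(np.int8)    # back to {-1, 0, +1}

rng = np.random.default_rng(0)
trits = rng.integers(-1, 2, size=100_000).astype(np.int8)  # values in {-1, 0, 1}
packed = pack_trits(trits)
assert np.array_equal(unpack_trits(packed), trits)         # lossless round trip
print(len(packed), "bytes for", trits.size, "trits")       # -> 20000 bytes for 100000 trits
```

At this density, the 23,592,960 ternary parameters need roughly 23,592,960 / 5 ≈ 4.7 MB, consistent with the 5.53 MB artifact once the non-ternary parameters and any metadata are included.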
