
Non-record: Verifily Three-Tier Token Weighting + DCLS Salience (SP1024, 1.1335 BPB)#1634

Open
arsenis-cmd wants to merge 1 commit into openai:main from arsenis-cmd:verifily-non-record

Conversation


@arsenis-cmd arsenis-cmd commented Apr 15, 2026

Summary

Pure data-quality approach — zero architectural changes. First submission to apply token-level quality signals to training loss weighting without modifying the model architecture.

Three components layered on an SP1024 11L 512d baseline:

  1. Three-tier token weighting: Classify tokens as Predictable (w=0.10), Frontier (w=1.0), or Noise (w=0.70) using GPU-resident bigram statistics + document quality scoring
  2. DCLS salience batch reweighting: Per-batch loss multiplier [0.85, 1.15] based on surprise signal and document quality
  3. Quality-conditioned bigram mixer at eval: Alpha conditioned on document quality (0.15 for high quality, 0.30 for low quality)
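The three components can be sketched as follows. This is a minimal illustration, not the submitted implementation: the function names, classification thresholds, and the exact form of the salience formula are assumptions; only the tier weights (0.10 / 1.0 / 0.70), the multiplier range [0.85, 1.15], and the alpha values (0.15 / 0.30) come from the PR.

```python
# Hypothetical sketch of the three Verifily components. Thresholds and
# helper names are illustrative; the real logic lives in train_gpt.py.
W_PREDICTABLE, W_FRONTIER, W_NOISE = 0.10, 1.0, 0.70

def tier_weight(bigram_logprob, predictable_thresh=-1.0, noise_thresh=-8.0):
    """Per-token loss weight from bigram surprisal: tokens the bigram
    model already predicts well are Predictable, extremely surprising
    tokens are treated as Noise, the rest are full-weight Frontier."""
    if bigram_logprob > predictable_thresh:
        return W_PREDICTABLE
    if bigram_logprob < noise_thresh:
        return W_NOISE
    return W_FRONTIER

def salience_multiplier(batch_surprise, doc_quality):
    """DCLS-style per-batch loss multiplier, clamped to [0.85, 1.15].
    The linear form is an assumption; only the clamp range is from the PR."""
    raw = 1.0 + 0.15 * (batch_surprise - 0.5) * doc_quality
    return max(0.85, min(1.15, raw))

def mixer_alpha(doc_quality, high_quality_thresh=0.5):
    """Quality-conditioned bigram mixing weight at eval time."""
    return 0.15 if doc_quality >= high_quality_thresh else 0.30
```

The key design choice is that all three signals are cheap to compute (bigram statistics and a scalar document-quality score), so they add essentially no training overhead.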

Results

2-seed mean: 1.13350264 BPB on 8×H100 SXM (~#16 on leaderboard).

| Seed | BPB        | Loss       | Steps | Artifact |
|------|------------|------------|-------|----------|
| 314  | 1.13414677 | 1.91495424 | 6524  | 15.8 MB  |
| 42   | 1.13285851 | 1.91277908 | 6732  | 15.9 MB  |
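The reported headline number follows directly from the two per-seed BPBs:

```python
# Arithmetic check of the 2-seed mean reported above.
seed_bpb = {314: 1.13414677, 42: 1.13285851}
mean_bpb = sum(seed_bpb.values()) / len(seed_bpb)
print(round(mean_bpb, 8))  # 1.13350264
```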

Seed 999 was not completed due to pod termination. Submitting as non-record with 2 seeds.

Key Takeaway

Data-quality signals provide a measurable training improvement but cannot close the ~0.05 BPB gap driven by architectural advances (SP8192, depth recurrence, parallel residuals, TTT). A competitive submission integrating these components onto the current SOTA stack is in progress.

Ablation

All components independently controllable via env vars: VERIFILY_ENABLED=0, VERIFILY_SALIENCE=0, VERIFILY_MIXER=0.
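One plausible way the flags could gate each component is sketched below. The variable names match the PR; the default-on, `"0"`-disables semantics are an assumption.

```python
import os

def flag(name, default="1"):
    """Read an ablation flag: any value other than "0" enables the
    component; unset falls back to the default (on)."""
    return os.environ.get(name, default) != "0"

verifily_enabled = flag("VERIFILY_ENABLED")   # three-tier token weighting
use_salience     = flag("VERIFILY_SALIENCE")  # DCLS batch reweighting
use_mixer        = flag("VERIFILY_MIXER")     # quality-conditioned eval mixer
```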

Test plan

  • Verify submission.json schema matches competition spec
  • Verify train_gpt.py passes python3 -c "import ast; ast.parse(open('train_gpt.py').read())"
  • Verify all env var ablation flags are documented

Pure data-quality approach — zero architectural changes. Three components:
1. Three-tier token weighting (Predictable=0.10, Frontier=1.0, Noise=0.70)
2. DCLS salience batch reweighting [0.85, 1.15]
3. Quality-conditioned bigram mixer at eval

2-seed mean: 1.13350264 BPB on 8×H100 SXM (~#16 on leaderboard).
Demonstrates data-quality signals help but can't close architecture gap.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
