From 81b6bd7a20480e2ffd2527eda23247b70ac3c3fb Mon Sep 17 00:00:00 2001
From: Alex
Date: Thu, 9 Apr 2026 14:14:54 -0700
Subject: [PATCH] Update README leaderboard for April records

---
 README.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/README.md b/README.md
index 39012623ea..0393a3b7f2 100644
--- a/README.md
+++ b/README.md
@@ -30,6 +30,15 @@ Happy training!
 
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
+| SP8192 + 3-Layer Recurrence + Parallel Residuals + Legal TTT | 1.0810 | bigbag | On PR #1493: 3-layer recurrence, parallel residuals, QK-Gain 5.25, and legal score-first TTT on the PR #1394 stack | 2026-04-09 | [info](records/track_10min_16mb/2026-04-09_SP8192_3LayerRecur_ParResid_QK525_LegalTTT/README.md) |
+| SP8192 + Parallel Residuals + Score-First TTT | 1.0822 | aryanbhosale | On PR #1477: parallel residuals on the PR #1413 SP8192 + legal score-first TTT stack | 2026-04-08 | [info](records/track_10min_16mb/2026-04-08_SP8192_ParallelResid_ScoreFirstTTT/README.md) |
+| SP8192 + QK-Gain 5 + Legal Score-First TTT | 1.0828 | dexhunter | On PR #1413: QK-Gain 5.0 + legal score-first TTT on the PR #1394 SP8192 stack | 2026-04-06 | [info](records/track_10min_16mb/2026-04-06_SP8192_QK5_LegalTTT_1.0828/README.md) |
+| SP8192 + Parallel Residuals + Hessian-Aware SDClip | 1.0835 | Robby Sneiderman | On PR #1412: parallel residuals, Hessian-aware SDClip, and progressive recurrence on the PR #1394 stack | 2026-04-06 | [info](records/track_10min_16mb/2026-04-06_SP8192_HessianSDClip_ProgressiveRecurrence/README.md) |
+| SP8192 + GPTQ Embeddings + Depth Recurrence + SDClip | 1.0856 | Kevin Clark | On PR #1394: SP8192, GPTQ embeddings, looped layers 4-5, MuonEq-R, and std-based GPTQ clipping | 2026-04-05 | [info](records/track_10min_16mb/2026-04-05_SP8192_GPTQ-Embeddings_SDClip_Loop45x2/README.md) |
+| SP4096 + Depth Recurrence + Parallel Residuals + MuonEq-R | 1.0897 | aryanbhosale | On PR #1334: SP4096 + depth recurrence + parallel residuals + MuonEq-R + QK-Gain 5.0 | 2026-04-04 | [info](records/track_10min_16mb/2026-04-04_SP4096_DepthRecurrence_ParallelResid_MuonEqR/README.md) |
+| MuonEq-R + Depth Recurrence + WD=0.090 + All-Int6 GPTQ | 1.0912 | dexhunter | On PR #1285: MuonEq-R + layers 4-5 recurrence + higher weight decay + all-int6 GPTQ | 2026-04-03 | [info](records/track_10min_16mb/2026-04-03_MuonEqR_DepthRecurrence_WD090_AllInt6/README.md) |
+| 4096-Vocab + Larger Model + High WD + Simplifications | 1.0979 | Kevin Clark | On PR #1218: SP4096 + 4x MLP + high weight decay, with TTT, hash embeddings, SmearGate, and value residuals removed | 2026-04-01 | [info](records/track_10min_16mb/2026-04-01_Vocab4096_MLPMult4_WD085/README.md) |
+| Parallel Residuals + Mini Depth Recurrence | 1.1063 | Marko Sisovic | On PR #1204: mini recurrence on layers 4-5 + parallel attention/MLP residual lanes + AR self-generated GPTQ calibration | 2026-03-31 | [info](records/track_10min_16mb/2026-03-31_ParallelResiduals_MiniDepthRecurrence/README.md) |
 | 11L AR Self-Gen GPTQ + XSA | 1.1147 | abaybektursun | On PR #1019: Self-Generated GPTQ Calibration Data + all-layer XSA on the PR #549 stack | 2026-03-25 | [info](records/track_10min_16mb/2026-03-25_ValCalib_GPTQ_XSA_BigramHash3072/README.md) |
 | LeakyReLU² + Legal Score-First TTT + Parallel Muon | 1.1194 | abaybektursun | On PR #549: LeakyReLU(0.5)^2 + TTT + Parallel Muon on the PR #414 stack | 2026-03-23 | [info](records/track_10min_16mb/2026-03-23_LeakyReLU_LegalTTT_ParallelMuon/README.md) |
 | 11L EMA + GPTQ-lite + warmdown3500 | 1.1228 | signalrush | On PR #374: GPTQ-lite clip search + EMA, plus warmdown3500 and QAT@0.15 | 2026-03-22 | [info](records/track_10min_16mb/2026-03-22_11L_EMA_GPTQ-lite_warmdown3500_QAT015_1.1233/README.md) |