Commit 9090c27

yuyeonclaude committed
1 parent e391487

Update log: VQ compression killed, SLOT compliance confirmed

VQ (vector quantization) compression: 2064× worse MSE than int6. Dead end.
SLOT confirmed competition-legal per PRs openai#1229 and openai#1313.
SLOT debugging: implementation works but needs 8×H100 for proper testing.
Session 3 kill count: 7 (PartialRoPE, DiffAttn, curriculum, shared KV, factored MLP, VQ compression, + DiffAttn)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 file changed: docs/research_log_session3.md (12 additions & 0 deletions)
```diff
@@ -119,6 +119,18 @@ SLOT (Scored-position Learnable Optimization at Test-time) is the new paradigm:
 
 SLOT is test-time only: optimizes per-sample delta + logit_bias on frozen hidden states.
 Architecture-agnostic. ~0.25 BPP gain. Implemented for FiLM in experiments/film_slot/.
+SLOT is competition-legal: score-first protocol, frozen model, torch.no_grad hidden states.
+
+### SLOT debugging (session 3)
+- FiLM+SLOT implementation works (39.6 GB VRAM, 100% GPU)
+- SLOT eval too slow on 1 GPU (~30 min for stride=64 on full val set)
+- #1313's SLOT eval failed on 1 GPU (double torch.compile issue)
+- Proper SLOT testing requires 8×H100
+
+### Vector quantization analysis (novel compression — KILLED)
+VQ with K=256 centroids on FiLM weights: 2064× worse MSE than int6.
+Only helps tall matrices (fc 4096×512: 7.9× ratio). Most weights are
+square or wide → codebook overhead kills compression. Int6+GPTQ is far superior.
 
 Running: SLOT24 (PR #1313) baseline on 1×H100 for comparison.
 
```
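The per-sample "delta + logit_bias on frozen hidden states" scheme the log describes can be sketched roughly as below. This is a minimal illustration, not the code in experiments/film_slot/: the function name `slot_adapt`, the shapes, and the optimizer settings are all assumptions.

```python
# Hypothetical sketch of SLOT-style test-time optimization: the model stays
# frozen, hidden states are cached once (in practice under torch.no_grad),
# and only a tiny per-sample delta + vocab-level logit bias are fitted.
import torch

def slot_adapt(hidden, head_weight, targets, steps=20, lr=1e-2):
    """hidden: (T, D) frozen hidden states for ONE sample;
    head_weight: (V, D) frozen LM head; targets: (T,) next-token ids.
    Returns adapted logits for this sample only."""
    T, D = hidden.shape
    V = head_weight.shape[0]
    delta = torch.zeros(D, requires_grad=True)       # per-sample hidden-state shift
    logit_bias = torch.zeros(V, requires_grad=True)  # per-sample vocabulary bias
    opt = torch.optim.Adam([delta, logit_bias], lr=lr)
    for _ in range(steps):
        # Logits are linear in (delta, logit_bias); the frozen weights get no grads.
        logits = (hidden + delta) @ head_weight.T + logit_bias
        loss = torch.nn.functional.cross_entropy(logits, targets)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return ((hidden + delta) @ head_weight.T + logit_bias).detach()
```

Because the adapted parameters are discarded after each sample and the base weights never change, a procedure of this shape is consistent with the "score-first protocol, frozen model" compliance note above.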
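The codebook-overhead arithmetic behind the VQ verdict can be reproduced with a back-of-envelope model. The bit widths here are assumptions (fp16 weights, 8-bit codes for K=256 centroids, fp32 codebook), chosen because they recover the log's ~7.9× figure for the tall 4096×512 fc matrix:

```python
# Why row-wise VQ only helps tall matrices: the codebook is a fixed cost of
# K * cols values, so wide/square matrices pay for it without enough rows
# to amortize it. All bit-width choices below are assumptions.
from math import log2

def vq_ratio(rows, cols, K=256, weight_bits=16, codebook_bits=32):
    """Compression ratio of row-wise VQ vs. storing the matrix directly."""
    original = rows * cols * weight_bits
    compressed = rows * log2(K) + K * cols * codebook_bits  # codes + codebook
    return original / compressed

print(round(vq_ratio(4096, 512), 1))  # tall fc matrix: ~7.9x, matches the log
print(round(vq_ratio(512, 512), 2))   # square matrix: ~1.0x, overhead dominates
```

Under these assumptions the square case compresses to essentially nothing (ratio below 1), which matches the log's conclusion that codebook overhead kills VQ for most of the weight matrices.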
