Commit 2610c6a
BREAKTHROUGH: SDClip sigma=10 — val_bpb 1.0495 (H200 3-seed)
Key finding: lowering the GPTQ clip threshold from the default sigma=12.85 to 10.0
cuts the quantization gap from 0.043 to 0.024 bpb, a 0.019 bpb improvement.
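To illustrate why a tighter clip can shrink the quantization gap, here is a minimal sketch. It assumes the threshold is a multiple of the per-row standard deviation (which the name SDClip suggests) and uses plain round-to-nearest on a symmetric grid, not GPTQ's error-compensated rounding. The helper name `sd_clip_quantize` and the 6-bit width are hypothetical and are not taken from this commit.

```python
import random
import statistics


def sd_clip_quantize(row, sigma=10.0, bits=6):
    """Hypothetical helper: clip a weight row at sigma * std, then round to a symmetric grid.

    Returns (dequantized_row, scale). The bit width is an assumption for illustration.
    """
    qmax = 2 ** (bits - 1) - 1                 # largest integer level, e.g. 31 for 6 bits
    clip = sigma * statistics.pstdev(row)      # clip threshold in weight units
    if clip == 0.0:                            # constant row: nothing to quantize
        return list(row), 0.0
    scale = clip / qmax                        # grid step size
    deq = []
    for w in row:
        q = round(w / scale)                   # round to nearest level
        q = max(-qmax, min(qmax, q))           # saturate values beyond the clip threshold
        deq.append(q * scale)
    return deq, scale


rng = random.Random(0)
row = [rng.gauss(0.0, 1.0) for _ in range(256)]

deq10, scale10 = sd_clip_quantize(row, sigma=10.0)
deq12, scale12 = sd_clip_quantize(row, sigma=12.85)

# Worst-case rounding error for each threshold
err10 = max(abs(a - b) for a, b in zip(row, deq10))
err12 = max(abs(a - b) for a, b in zip(row, deq12))
print(f"step at sigma=10:    {scale10:.5f}, max error {err10:.5f}")
print(f"step at sigma=12.85: {scale12:.5f}, max error {err12:.5f}")
```

In this sketch the grid step scales linearly with sigma, so moving from 12.85 to 10.0 makes the grid about 22% finer for the bulk of the weights, at the cost of saturating anything beyond 10 standard deviations.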
H200 3-seed: 1.0490, 1.0507, 1.0489 (mean 1.0495)
Beats SOTA openai#1487 (1.0600) by 0.0105 bpb = 0.0073 nats
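The reported figures can be re-derived with a short check. It assumes the nats figure is the bpb gain multiplied by ln 2, which is consistent with the numbers above; the variable names are illustrative only.

```python
import math

# Per-seed validation bpb from the H200 runs quoted above
seeds = [1.0490, 1.0507, 1.0489]
mean_bpb = round(sum(seeds) / len(seeds), 4)

sota_bpb = 1.0600                      # openai#1487, as quoted above
gain_bpb = round(sota_bpb - mean_bpb, 4)

# Assumption: bits -> nats conversion is a plain multiplication by ln 2
gain_nats = round(gain_bpb * math.log(2), 4)

print(mean_bpb, gain_bpb, gain_nats)
```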
H100 validation jobs submitted.
Made-with: Cursor
1 parent cede6ff commit 2610c6a
2 files changed: 25 additions & 13 deletions
Lines changed: 8 additions & 9 deletions
@@ -1,18 +1,17 @@ (changed line content not captured)
Lines changed: 17 additions & 4 deletions
@@ -2257,10 +2257,23 @@ (changed line content not captured)