Commit f589076
Iter 117b-2: verify Triton entmax kernel against deep-spin/entmax reference
Cross-checked the Triton forward and backward formulations against the
official deep-spin/entmax Entmax15Function source (current master as of
2026-04-30):
- Forward: equivalent to the reference up to factoring X/2 versus
  0.5*(z - tau). Algebra: the reference uses X' = z/2 and tau' = tau/2, so
  Y = max(X' - tau', 0)² = max(z/2 - tau/2, 0)² = max(0.5*(z - tau), 0)² = w.
  This also matches our existing pure-PyTorch entmax_1p5 in train_gpt.py.
- Backward: matches deep-spin/entmax line for line.
    Reference:  gppr = sqrt(Y); dX = dY*gppr;
                q = dX.sum(dim)/gppr.sum(dim); dX -= q*gppr
    Our kernel: s = sqrt(w); c = sum(s*grad_w)/sum(s);
                grad_z = s*(grad_w - c)
  Identical: dX.sum = sum(grad_w*sqrt(w)) = sum(s*grad_w), so q = c and
  dX - q*gppr = s*grad_w - c*s = s*(grad_w - c).
- Numerical stability: our discr.clamp_min(1e-6) is stricter than the
  reference's clamp(delta, 0). The reference has a latent backward NaN at
  sqrt(0): the gradient of sqrt at 0 is Inf, and 0*Inf = NaN under the chain
  rule when the downstream coefficient is zero. We already fixed this in
  iter 117 v3 (commit a9ec303 → 339adfc).
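The forward and backward identities above can be spot-checked on a single vector without a GPU. The following is a minimal pure-Python sketch, not the Triton kernel or the deep-spin code itself: threshold_half is a hypothetical helper written here to follow the sort-based threshold search of the reference, and the input z and upstream gradient grad_w are arbitrary illustrative values.

```python
import math

def threshold_half(x_half):
    """Sort-based 1.5-entmax threshold for inputs already divided by 2.
    Hypothetical helper, written for this sketch only."""
    xs = sorted(x_half, reverse=True)
    cum = cum_sq = 0.0
    taus = []
    for k, v in enumerate(xs, start=1):
        cum += v
        cum_sq += v * v
        mean = cum / k
        ss = k * (cum_sq / k - mean * mean)
        delta = (1.0 - ss) / k
        taus.append(mean - math.sqrt(max(delta, 0.0)))  # clamp(delta, 0)
    support = sum(1 for t, v in zip(taus, xs) if t <= v)
    return taus[support - 1]

z = [1.0, 0.5, -1.0, 2.0]              # illustrative logits
shifted = [v - max(z) for v in z]      # subtract the max for stability

# Reference-style forward: halve first, then threshold.
x_half = [v / 2 for v in shifted]
tau_ref = threshold_half(x_half)       # this is tau' in the message
y_ref = [max(v - tau_ref, 0.0) ** 2 for v in x_half]

# Kernel-style forward: tau = 2 * tau', factor 0.5 applied afterwards.
tau = 2 * tau_ref
w = [max(0.5 * (v - tau), 0.0) ** 2 for v in shifted]

grad_w = [0.3, -1.2, 0.7, 0.5]         # illustrative upstream gradient

# Reference-style backward.
gppr = [math.sqrt(v) for v in y_ref]
dx = [g * s for g, s in zip(grad_w, gppr)]
q = sum(dx) / sum(gppr)
dx_ref = [d - q * s for d, s in zip(dx, gppr)]

# Kernel-style backward.
s = [math.sqrt(v) for v in w]
c = sum(si * gi for si, gi in zip(s, grad_w)) / sum(s)
grad_z = [si * (gi - c) for si, gi in zip(s, grad_w)]
```

If the algebra holds, y_ref and w agree elementwise, both sum to 1, and dx_ref agrees with grad_z.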
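The 0*Inf mechanism behind the stability note can be illustrated with plain floating-point arithmetic. This sketch only mirrors the chain-rule step described above; it does not reproduce the autograd graph of either implementation. sqrt_grad is a hypothetical helper, and FLOOR stands in for the 1e-6 passed to clamp_min.

```python
import math

def sqrt_grad(x):
    """Local derivative of sqrt: 0.5 / sqrt(x), infinite at x == 0."""
    return 0.5 / math.sqrt(x) if x > 0.0 else math.inf

FLOOR = 1e-6      # stand-in for discr.clamp_min(1e-6)
upstream = 0.0    # a zero coefficient multiplying the sqrt gradient
discr = 0.0       # discriminant sitting exactly at the clamp boundary

# clamp(delta, 0) leaves 0 in place: 0 * inf evaluates to nan.
unclamped = upstream * sqrt_grad(max(discr, 0.0))

# Clamping to a small positive floor keeps the local gradient finite.
clamped = upstream * sqrt_grad(max(discr, FLOOR))
```

Here unclamped is NaN while clamped is exactly 0.0, because the local gradient is capped at 0.5/sqrt(1e-6) = 500.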
Sources:
- https://github.com/deep-spin/entmax/blob/master/entmax/activations.py
- https://arxiv.org/pdf/1905.05702 (Peters/Niculae/Martins 2019, §3 Algorithm 2 + Proposition 2)
Updated the experiments/test_entmax_triton.py header to document the
verification chain. The kernel is verified by review against the reference
only; empirical numerical-equivalence tests are still blocked until
iter 117b-1 finishes, because its training run currently saturates the GPUs.
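Until the GPU tests can run, a CPU-only finite-difference check gives an idea of what an empirical test of the backward formula could look like. This is a sketch under stated assumptions, not the contents of experiments/test_entmax_triton.py: the forward pass finds tau by bisection rather than by the sort-based algorithm, and all function names and input values are hypothetical.

```python
import math

def entmax15_bisect(z, iters=100):
    """w_i = max(0.5*(z_i - tau), 0)**2, with tau found by bisection
    so that sum(w) = 1. Swapped-in technique, for illustration only."""
    lo, hi = max(z) - 2.0, max(z)   # sum(w) >= 1 at lo, sum(w) == 0 at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        total = sum(max(0.5 * (v - mid), 0.0) ** 2 for v in z)
        if total >= 1.0:
            lo = mid
        else:
            hi = mid
    tau = 0.5 * (lo + hi)
    return [max(0.5 * (v - tau), 0.0) ** 2 for v in z]

def analytic_grad(w, grad_w):
    """Backward formula from the message: grad_z = s*(grad_w - c)."""
    s = [math.sqrt(v) for v in w]
    c = sum(si * gi for si, gi in zip(s, grad_w)) / sum(s)
    return [si * (gi - c) for si, gi in zip(s, grad_w)]

def numeric_grad(z, grad_w, eps=1e-5):
    """Central differences of the scalar loss sum(grad_w * w(z))."""
    def loss(zz):
        return sum(g * w for g, w in zip(grad_w, entmax15_bisect(zz)))
    out = []
    for j in range(len(z)):
        zp = list(z)
        zp[j] += eps
        zm = list(z)
        zm[j] -= eps
        out.append((loss(zp) - loss(zm)) / (2 * eps))
    return out

z = [1.0, 0.5, -1.0, 2.0]          # illustrative logits, away from kinks
grad_w = [0.3, -1.2, 0.7, 0.5]     # illustrative upstream gradient
ga = analytic_grad(entmax15_bisect(z), grad_w)
gn = numeric_grad(z, grad_w)
max_err = max(abs(a - b) for a, b in zip(ga, gn))
```

A small max_err supports the backward formula for this input. The check is only meaningful where the support does not change under the perturbation, so inputs sitting on the threshold should be avoided.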
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 554ad79 commit f589076
1 file changed
Lines changed: 21 additions & 0 deletions