Commit 2b239a8
Sandermage
hybrid: add PN25 — Inductor-safe silu_and_mul pool (sister-patch to PN12)
PN12 patches the eager-mode SiluAndMul.forward_cuda. Under the V1 default
custom_ops=["none"], dispatch routes through forward_native, which
torch.compile/Inductor inlines and lowers to empty_strided_cuda(...),
completely bypassing PN12's pool. Reported by noonghunna in club-3090#16
(VolandBerlioz Reddit + ampersandru cross-rig confirmation): RTX 3090
24 GB + Lorbus 27B + OpenCode 29K-token prefill OOMs at
inductor_cache/...py:1208 allocating (s18, 17408) fp16 = 137.6 MiB.
Genesis stack inherits the same flaw — our PN12 only patches forward_cuda.
We don't see it in PROD only because our 27B Lorbus configs short-circuit
the compile path on this kernel. Future Inductor-default configs would
expose the leak.
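The reported allocation size can be sanity-checked with quick arithmetic. This is a rough sketch, assuming the failing buffer is a dense fp16 tensor of shape (s18, 17408) with no padding. The row count it derives for the symbolic dimension s18 is inferred from the reported numbers, not a value stated in the report.

```python
# Back-of-the-envelope check of the reported OOM allocation.
# Assumption: dense fp16 buffer of shape (s18, 17408), no padding.
BYTES_PER_FP16 = 2
COLS = 17408            # second dimension, from the report
REPORTED_MIB = 137.6    # allocation size, from the report

bytes_per_row = COLS * BYTES_PER_FP16
implied_rows = REPORTED_MIB * 1024**2 / bytes_per_row


def buffer_mib(rows: int, cols: int = COLS, itemsize: int = BYTES_PER_FP16) -> float:
    """Size in MiB of a dense (rows, cols) buffer."""
    return rows * cols * itemsize / 1024**2


print(f"bytes per row: {bytes_per_row}")
print(f"implied s18:   ~{implied_rows:.0f} rows")
print(f"round trip:    {buffer_mib(round(implied_rows)):.1f} MiB")
```

Under these assumptions the buffer corresponds to a few thousand rows. A buffer of this size is allocated fresh on every compiled call unless it comes from a pool.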
PN25 design (complements PN12, does not replace it):
- New kernel module vllm/_genesis/kernels/silu_and_mul_customop.py
registers `genesis::silu_and_mul_pooled` via torch.library.custom_op
with mutates_args=() and device_types=("cuda",). Inductor treats
custom ops as opaque — emits a call, never inlines. Inside the body
we acquire from FFNIntermediateCache.acquire_silu_out (same pool used
by PN12) and dispatch to torch.ops._C.silu_and_mul.
- New wiring patch_N25_silu_inductor_safe_pool.py text-patches
SiluAndMul.forward_native to dispatch through the opaque op. It falls
back to vanilla F.silu * mul math when registration is unavailable
(torch < 2.4 or CPU-only build), so degradation is soft.
- Dispatcher entry declares conflicts_with=[], requires_patches=[].
PN12 + PN25 patch DIFFERENT methods (forward_cuda vs forward_native)
so they can coexist on the same file without anchor collision.
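The registration-with-fallback pattern described above can be sketched as follows. This is a minimal illustration, not the code in silu_and_mul_customop.py: the op name `demo::silu_and_mul_opaque` and the helper `register_opaque_silu_and_mul` are hypothetical, the body uses plain math instead of FFNIntermediateCache plus torch.ops._C.silu_and_mul, and device_types is omitted so the sketch also runs on CPU.

```python
import functools


@functools.lru_cache(maxsize=None)
def register_opaque_silu_and_mul(op_name: str = "demo::silu_and_mul_opaque"):
    """Register an opaque SiLU-and-mul op; return None if that is not possible.

    Cached because registering the same op name twice raises.
    """
    try:
        import torch
        import torch.nn.functional as F
    except ImportError:
        return None  # no torch installed
    custom_op = getattr(getattr(torch, "library", None), "custom_op", None)
    if custom_op is None:
        return None  # torch older than 2.4

    # mutates_args=() declares the op as functional (no in-place writes).
    @custom_op(op_name, mutates_args=())
    def silu_and_mul_opaque(x: torch.Tensor) -> torch.Tensor:
        d = x.shape[-1] // 2
        # Stand-in body: a pooled variant would take its output buffer
        # from a cache and call a fused kernel here instead.
        return F.silu(x[..., :d]) * x[..., d:]

    # Fake implementation: output shape and dtype only, no computation.
    # torch.compile uses it to trace through the opaque call.
    @silu_and_mul_opaque.register_fake
    def _(x):
        return x.new_empty((*x.shape[:-1], x.shape[-1] // 2))

    return silu_and_mul_opaque


op = register_opaque_silu_and_mul()
if op is None:
    print("custom_op unavailable; a caller would use the plain math path")
else:
    import torch

    y = op(torch.randn(2, 16))
    print(tuple(y.shape), bool(torch.isfinite(y).all()))
```

Returning None instead of raising lets the caller keep the plain-math path when registration is unavailable, which matches the soft-degradation behaviour described in the list above.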
Bug fix in PN12 marker logic:
- Removed "FFNIntermediateCache" from upstream_drift_markers — this is
our own internal pool class name and may legitimately appear in
sister-patches (PN25 docstring references it). Drift markers should
signal upstream variants, not Genesis-internal symbols. Without this
fix, applying PN25 to a vanilla file would correctly patch
forward_native, but a subsequent PN12 attempt on the same file would
see "FFNIntermediateCache" via PN25's docstring and skip with
upstream_merged — silently bypassing forward_cuda pool wiring.
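The false positive can be reproduced with a toy version of the marker check. This is a hypothetical sketch: `classify_patch`, the `[PN12]` marker string, and the `upstream_pooled_silu` marker are made up for illustration, and the real dispatcher logic is not shown in this commit. Only the `upstream_merged` status and the `FFNIntermediateCache` name come from the message above.

```python
from typing import Sequence


def classify_patch(source: str, applied_marker: str,
                   drift_markers: Sequence[str]) -> str:
    """Hypothetical pre-apply check: decide what to do with a target file."""
    if applied_marker in source:
        return "already_applied"
    if any(marker in source for marker in drift_markers):
        return "upstream_merged"  # skip: upstream appears to carry the fix
    return "apply"


# A file that PN25 has already patched; its docstring mentions the pool class.
after_pn25 = '''
def forward_native(self, x):
    """[PN25] Route through the opaque op backed by FFNIntermediateCache."""
    ...
def forward_cuda(self, x):
    ...
'''

old_markers = ["FFNIntermediateCache", "upstream_pooled_silu"]
new_markers = ["upstream_pooled_silu"]

before_fix = classify_patch(after_pn25, "[PN12]", old_markers)
after_fix = classify_patch(after_pn25, "[PN12]", new_markers)
print(before_fix, after_fix)
```

With the internal class name in the marker list, PN12 is skipped on a file that only PN25 has touched. With it removed, PN12 proceeds to patch forward_cuda.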
Validation (clean container test):
- register: True
- op_callable: genesis.silu_and_mul_pooled
- eager call: shape (2, 8) bfloat16, finite=True
- torch.compile call: shape (2, 8), finite=True (fake impl validated
for shape inference)
- live container PN12+PN25 sequential apply: both "applied" status,
both markers in activation.py.
PN25 ships opt-in OFF (GENESIS_ENABLE_PN25_SILU_INDUCTOR_SAFE=1) and
is NOT enabled in any launch script. Users with inductor-heavy
configs (e.g. 35B FP8 long-context, future MoE) can pair it with PN12
for full FFN intermediate pooling coverage.
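An opt-in gate of this kind might look like the sketch below. The environment variable name is taken from the message above; the helper `pn25_enabled` and the rule that only the exact value "1" enables the patch are assumptions, since the commit does not show how the flag is parsed.

```python
import os
from typing import Mapping, Optional

PN25_FLAG = "GENESIS_ENABLE_PN25_SILU_INDUCTOR_SAFE"


def pn25_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Opt-in gate: off unless the flag is set to exactly "1" (assumed parsing)."""
    env = os.environ if env is None else env
    return env.get(PN25_FLAG, "0").strip() == "1"


print(pn25_enabled({}))                 # default: disabled
print(pn25_enabled({PN25_FLAG: "1"}))   # explicitly enabled
```

Defaulting to off keeps existing launch scripts unchanged unless a user sets the variable.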
Cross-reference: noonghunna's club-3090#16 work-in-progress on the
same flaw. Independent convergence on the torch.library.custom_op
approach.
1 parent 1ac34a8 commit 2b239a8
5 files changed
Lines changed: 526 additions & 1 deletion
File tree
- vllm/_genesis
- kernels
- patches
- wiring/hybrid
Lines changed: 5 additions & 1 deletion