Commit 2f0bc46

saltyfish66 and yuethe authored and committed
fix: fix illegal cuda memory access at fused_moe_kernel (sgl-project#4727)
Co-authored-by: yuethe <[email protected]>
1 parent cb5d8fa commit 2f0bc46

1 file changed, +1 -0 lines changed

python/sglang/srt/layers/moe/fused_moe_triton/fused_moe.py

Lines changed: 1 addition & 0 deletions
@@ -152,6 +152,7 @@ def fused_moe_kernel(
         return
     offs_token_id = pid_m * BLOCK_SIZE_M + tl.arange(0, BLOCK_SIZE_M)
     offs_token = tl.load(sorted_token_ids_ptr + offs_token_id)
+    offs_token = offs_token.to(tl.int64)
     token_mask = offs_token < num_valid_tokens
 
     offs_bn = (pid_n * BLOCK_SIZE_N + tl.arange(0, BLOCK_SIZE_N)) % N
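A plausible reading of why the cast helps, offered as an illustration and not taken from the commit itself: if the values loaded from sorted_token_ids_ptr are 32-bit integers and are later multiplied by a row stride to form a pointer offset, the product can exceed the signed 32-bit range and wrap to a negative number, which would point outside the tensor. The pure-Python sketch below emulates that wraparound. The helper wrap_int32 and the values of token_index and row_stride are hypothetical and chosen only to trigger the overflow; they do not come from the kernel.

```python
def wrap_int32(value: int) -> int:
    """Emulate two's-complement wraparound of a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


# Hypothetical values (assumptions, not taken from fused_moe_kernel).
token_index = 400_000  # stands in for one entry of offs_token
row_stride = 7_168     # stands in for a per-row stride in elements

# Offset as 32-bit arithmetic would compute it: the product wraps around.
offset_32 = wrap_int32(token_index * row_stride)

# Offset after widening the index to 64 bits first: no wraparound.
offset_64 = token_index * row_stride

print("32-bit offset:", offset_32)
print("64-bit offset:", offset_64)
```

Under these assumptions the 32-bit offset comes out negative while the 64-bit offset stays at the intended value, which is consistent with widening offs_token to tl.int64 before it is used in address arithmetic.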
