Commit 0d29bba

trevor-mtarinkk authored and committed
Cutlass MLA decode - fix dtype error (sgl-project#5868)
1 parent 1dad1a0 commit 0d29bba

1 file changed: +1 -1 lines changed

python/sglang/srt/layers/attention/cutlass_mla_backend.py

Lines changed: 1 addition & 1 deletion
@@ -268,7 +268,7 @@ def forward_decode(
         reshape_q = q.view(-1, layer.tp_q_head_num, layer.head_dim)

         o = cutlass_mla_decode(
-            q_nope_and_q_pe=reshape_q,
+            q_nope_and_q_pe=reshape_q.to(self.q_data_type),
             kv_c_and_k_pe_cache=k_cache.view(-1, PAGE_SIZE, self.kv_cache_dim),
             seq_lens=forward_batch.seq_lens.to(torch.int32),
             page_table=self.forward_metadata.block_kv_indices,
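
The change above casts the query to `self.q_data_type` before it reaches `cutlass_mla_decode`. The commit does not quote the original error, so the following is only a toy sketch of the likely failure mode, under the assumption that the kernel rejects a query whose dtype differs from the KV cache dtype. `ToyTensor` and `toy_mla_decode` are hypothetical stand-ins (not part of sglang or PyTorch) that track nothing but a dtype name, so the sketch runs without a GPU or any third-party package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTensor:
    """Hypothetical stand-in for a tensor; it only tracks a dtype name."""
    dtype: str

    def to(self, dtype: str) -> "ToyTensor":
        # Return self when the dtype already matches, otherwise a converted copy.
        return self if self.dtype == dtype else ToyTensor(dtype)


def toy_mla_decode(q_nope_and_q_pe: ToyTensor, kv_c_and_k_pe_cache: ToyTensor) -> str:
    """Hypothetical strict kernel: assumed to reject mismatched dtypes."""
    if q_nope_and_q_pe.dtype != kv_c_and_k_pe_cache.dtype:
        raise TypeError(
            f"query is {q_nope_and_q_pe.dtype}, cache is {kv_c_and_k_pe_cache.dtype}"
        )
    return q_nope_and_q_pe.dtype  # dtype the output would have


q_data_type = "bfloat16"          # assumed dtype the backend was configured with
reshape_q = ToyTensor("float32")  # query arriving in a different dtype
k_cache = ToyTensor("bfloat16")

# Call shape before the commit: the query is passed through unchanged.
try:
    toy_mla_decode(reshape_q, k_cache)
    before_ok = True
except TypeError:
    before_ok = False

# Call shape after the commit: the query is cast at the call site.
out_dtype = toy_mla_decode(reshape_q.to(q_data_type), k_cache)
```

Casting at the call site keeps the fix local to `forward_decode`; when the query already has the expected dtype, the cast in this sketch is a no-op that returns the same object.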
