Commit 443e94e

pengcuo authored and committed
[Fix] Fix a bug for flashmla to run R1 model (sgl-project#5875)
Co-authored-by: pengcuo <[email protected]>
1 parent 98f4dcb commit 443e94e

1 file changed: +3 -0 lines changed

python/sglang/srt/layers/attention/flashmla_backend.py

Lines changed: 3 additions & 0 deletions
@@ -241,6 +241,9 @@ def init_forward_metadata_replay_cuda_graph(
             seq_lens_cpu,
         )
 
+    def get_cuda_graph_seq_len_fill_value(self):
+        return 1024
+
     def forward_decode(
         self,
         q: torch.Tensor,
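
The commit does not say how the returned value is consumed. Judging only from the method name, it is presumably the sequence length written into unused (padded) batch slots when a CUDA graph captured for a fixed batch size is run with fewer real requests. The following is a minimal pure-Python sketch of that idea under that assumption. `AttentionBackendSketch`, `FlashMLABackendSketch`, `pad_seq_lens`, and the base-class default of 1 are hypothetical names and values made up for illustration, not SGLang's actual code; only the value 1024 comes from the diff.

```python
from typing import List


class AttentionBackendSketch:
    """Hypothetical stand-in for an attention backend base class."""

    def get_cuda_graph_seq_len_fill_value(self) -> int:
        # Assumed default, for illustration only; the real default may differ.
        return 1


class FlashMLABackendSketch(AttentionBackendSketch):
    """Hypothetical stand-in that mirrors the override added in this commit."""

    def get_cuda_graph_seq_len_fill_value(self) -> int:
        # Value taken from the diff above.
        return 1024


def pad_seq_lens(
    seq_lens: List[int],
    graph_batch_size: int,
    backend: AttentionBackendSketch,
) -> List[int]:
    """Pad real sequence lengths up to the batch size the graph was captured with.

    Padded slots receive the backend's fill value (assumed usage).
    """
    if len(seq_lens) > graph_batch_size:
        raise ValueError("more requests than the captured graph batch size")
    fill = backend.get_cuda_graph_seq_len_fill_value()
    return list(seq_lens) + [fill] * (graph_batch_size - len(seq_lens))


if __name__ == "__main__":
    # Three real requests run through a graph captured for a batch of eight.
    print(pad_seq_lens([37, 512, 9], 8, FlashMLABackendSketch()))
```

Keeping the fill value behind a backend method, rather than hard-coding it at the call site, would let each backend choose a padding length its kernels accept without changing the shared graph-runner logic.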

0 commit comments
