[CI/Build] Bump flashinfer to v0.6.10#41711

Open
arpera wants to merge 3 commits into vllm-project:main from arpera:bump-flashinfer-0.6.10

Conversation

@arpera (Contributor) commented May 5, 2026

Purpose

  • Bump FlashInfer from v0.6.8.post1 to v0.6.10.
  • Adjust installation to use flashinfer-python[cu13] extra for cu13 users.
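The conditional installation described above can be sketched roughly as follows. This is a hypothetical shell helper, not the PR's actual Dockerfile/setup.py change; the function name and the version-detection approach are assumptions for illustration:

```shell
# Hypothetical sketch (not the PR's actual logic): pick the pip target
# for FlashInfer based on the CUDA major version.
pick_flashinfer_target() {
    cuda_version="$1"            # e.g. "12.8" or "13.0"
    major="${cuda_version%%.*}"  # keep only the major component
    if [ "$major" -ge 13 ]; then
        # CUDA 13: request the extra so nvidia-cutlass-dsl[cu13] is pulled in
        echo "flashinfer-python[cu13]==0.6.10"
    else
        echo "flashinfer-python==0.6.10"
    fi
}

pick_flashinfer_target "13.0"   # prints flashinfer-python[cu13]==0.6.10
```

The point of gating on the major version is that the `[cu13]` extra only exists for (and only matters on) CUDA 13 environments; older CUDA installs keep the plain package.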

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Signed-off-by: Artem Perevedentsev <aperevedents@nvidia.com>
@claude (Bot) left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@mergify (Bot) commented May 5, 2026

Hi @arpera, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

@gemini-code-assist (Bot) left a comment


Code Review

This pull request updates the FlashInfer version to 0.6.10 across the project's Docker configurations and dependency files. It also introduces conditional logic in the Dockerfile and setup.py to include the [cu13] extra for flashinfer-python when CUDA 13 is detected, facilitating support for SM100 GDN kernels. I have no feedback to provide.

@pavanimajety pavanimajety added the ready-run-all-tests Trigger CI with all tests for wide-ranging PRs label May 5, 2026
@pavanimajety (Collaborator) commented

FYI: 0.6.9 update - #40998

@arpera (Contributor, Author) commented May 5, 2026

Yes, I've seen PR #40998, thanks. It wasn't finished, so I think v0.6.10 makes more sense now.

@arpera (Contributor, Author) commented May 5, 2026

I would also like to point out that in addition to directly integrating the new FI version v0.6.10, this PR makes a small fix that wasn't accounted for in vLLM when previous FI versions were integrated.
Specifically, I added installation of the flashinfer-python[cu13] extra for users who have cu13 installed. This is necessary because, without the extra, FlashInfer does not install nvidia-cutlass-dsl[cu13] by default, which is required in particular for the FI Blackwell GDN implementation whose support I'm currently trying to add in #40717.

There is also a small discussion about this issue in comments: 1, 2.

Since I don't have much experience managing build dependencies in vLLM, I'd be happy to get suggestions for a more correct way to handle this in vLLM.

@wzhao18 (Contributor) commented May 5, 2026

I am noticing some potential numeric issues with the newer FlashInfer versions. Specifically, the generation length for GPQA with DSv4 is significantly longer with the new versions than before (Claude suggests the model gets stuck in a self-doubt loop).

I am still investigating the issue, but I wanted to flag it. It may be worth doing some more eval studies before merging this.

@pavanimajety pavanimajety removed the ready-run-all-tests Trigger CI with all tests for wide-ranging PRs label May 5, 2026
@vadiklyutiy (Collaborator) commented

Do I understand correctly that if you have an environment with cu13 and run pip install flashinfer-python, it doesn't install everything, and users have to additionally run pip install flashinfer-python[cu13]?

@arpera (Contributor, Author) commented May 6, 2026

Yes, that's right.

@wzhao18 (Contributor) commented May 6, 2026

@arpera With more investigation, I think the issue I was hitting was not related to the newer FlashInfer versions but to something else. I tested the v0.6.10 GPQA eval with DeepSeek V4, and it looks good. I have no more concerns about upgrading.

