
Pull requests: HabanaAI/vllm-hpu-extension


Pull requests list

Cherry-pick window FusedSDPA for Gemma3 onto v1.22
#302 opened Jul 18, 2025 by MohitIntel
Fix fallback buckets
#301 opened Jul 18, 2025 by madamczyk-intel
Add block_softmax_adjustment and block_softmax kernels
#289 opened Jul 16, 2025 by czhu15
[V1] Defragmentation support
#275 opened Jul 10, 2025 by madamczyk-intel
skip invalid decoding buckets with bs>blocks
#269 opened Jul 10, 2025 by yangulei
Add pre-commit static checks
#247 opened Jun 30, 2025 by kzawora-intel
Update dependabot.yml
#242 opened Jun 26, 2025 by michalkuligowski
Update linear.py
#239 opened Jun 25, 2025 by michalkuligowski
Exponential bucketing tweaks
#224 opened Jun 13, 2025 by madamczyk-intel
Add useful internal vllm test
#200 opened May 27, 2025 by nirda7 Draft
fix the issue that bmax not in bucket buffer
#191 opened May 22, 2025 by sywangyi
Optimized MoE on Gaudi
#159 opened Apr 18, 2025 by gyou2021 Draft
[FIX] fp8 gc compile error
#110 opened Mar 4, 2025 by maktukmak Draft