forked from NVIDIA/cutlass
Pull requests: codeplaysoftware/cutlass-sycl
Pull requests list
Create a helper for constructing tiled copies of default size
#454
opened Jul 3, 2025 by t4c1
[WIP] FP8 scaledMM with DeepSeek-style dequantization
#453
opened Jul 2, 2025 by sanchitintel
Draft, 1 of 4 tasks
Refactor tests for Flash Attention Prefill Cached
#449
opened Jun 26, 2025 by muhammad-tanvir-1211
Refactor benchmarks for Flash Attention Prefill
#447
opened Jun 26, 2025 by muhammad-tanvir-1211
Simplify Flash Attention Decode benchmarks generation
#437
opened Jun 19, 2025 by muhammad-tanvir-1211
Unify interface for Flash Attention Decode
#423
opened Jun 11, 2025 by muhammad-tanvir-1211
Add tests and benchmark configurations for BF16 | FP16 output for Flash Decode
#408
opened Jun 5, 2025 by muhammad-tanvir-1211