fix KQ FP32 precision for parallel_blocks > 1

44ca576
Merged

CUDA: faster FlashAttention for batch sizes > 1 #6646
