
CUDA: faster FlashAttention for batch sizes > 1 #6646

Merged

JohannesGaessler merged 7 commits into ggml-org:gg/flash-attn from JohannesGaessler:jg/flash-attn-20 on Apr 18, 2024
Commits