CUDA: faster FlashAttention for batch sizes > 1#6646
Merged
JohannesGaessler merged 7 commits into ggml-org:gg/flash-attn on Apr 18, 2024
Commits
Commits on Apr 17, 2024