Fix incorrect attention mask truncate in WhisperFlashAttention2 #36477
Conversation
Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. When it is ready for review, please click the Ready for review button.
cc @eustlb
Co-authored-by: Anton Vlasjuk <[email protected]>
It would be great to merge this fix! cc @eustlb
eustlb
left a comment
LGTM! Great catch, thanks a lot 🤗
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
What does this PR do?
Fixes incorrect attention calculation when training Whisper with Flash Attention 2 and passing a `decoder_attention_mask` with some values set to `False`. The error was likely introduced when the mask-truncation code was copied from the other attention implementations: in `WhisperFlashAttention2` the tensor dimensions are transposed, so the same indexing no longer selects the sequence length.

cc @sanchit-gandhi @ylacombe
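For context, a minimal sketch of the shape mismatch the description refers to. The shapes, variable names, and slicing below are illustrative assumptions, not the actual code in `modeling_whisper.py`:

```python
import torch

# Hypothetical sizes for illustration only
bsz, num_heads, kv_len, head_dim = 2, 8, 100, 64

# Eager/SDPA layout: (batch, num_heads, kv_len, head_dim)
key_states = torch.randn(bsz, num_heads, kv_len, head_dim)
print(key_states.shape[-2])  # 100 -> the key length, so truncating a mask with it is correct

# Flash Attention 2 expects (batch, kv_len, num_heads, head_dim)
key_states_fa2 = key_states.transpose(1, 2)
print(key_states_fa2.shape[-2])  # 8  -> num_heads, NOT the key length
print(key_states_fa2.shape[1])   # 100 -> the key length lives on dim 1 in this layout

# So a mask truncation copied verbatim from the eager path, e.g.
# attention_mask[..., : key_states.shape[-2]], silently cuts the mask down to
# num_heads entries in the FA2 path; it needs the sequence-length dim instead.
```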