[bugfix] fix flash attention 2 unavailable error on Ascend NPU #39166
Conversation
This PR is ready for merge @ArthurZucker @SunMarc, cc @EduardDurech
Thanks, I didn't have a device to test this on, but it looks like you did it correctly.
Looking forward to a review @ArthurZucker @SunMarc
ArthurZucker left a comment
Makes sense, I think this deserves to be in the patch!
Very good work. Can Flash Attention 3 also be used on the Ascend NPU?
What does this PR do?
#38972 introduced flash attention 3 into transformers. However, that change introduced a bug when using flash attention 2 on Ascend NPU. The root cause is a mismatch between the function names:
Functions defined in transformers.integrations.npu_flash_attention:
src/transformers/modeling_flash_attention_utils.py, lines 140 to 153 (at e8e0c76)
Functions actually used:
src/transformers/modeling_flash_attention_utils.py, lines 470 to 475 (at e8e0c76)
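For context, here is a minimal, self-contained sketch of how this kind of name mismatch surfaces as an "unavailable" error. The namespace and lookup below are simplified stand-ins for illustration, not the actual transformers code:

```python
import types

# Stand-in for transformers.integrations.npu_flash_attention *before* this
# PR: the module only exported npu_* names (simplified to a namespace here).
npu_flash_attention = types.SimpleNamespace(
    npu_flash_attn_func=lambda *args, **kwargs: "attention output",
    npu_flash_attn_varlen_func=lambda *args, **kwargs: "attention output",
)

# After #38972, the flash attention 2 code path looks the kernels up under
# names containing the _2 symbol, which the NPU module did not define.
flash_attn_2_func = getattr(npu_flash_attention, "flash_attn_2_func", None)

if flash_attn_2_func is None:
    # This is the reported failure mode: flash attention 2 appears
    # "unavailable" on Ascend NPU even though the NPU kernels exist.
    print("flash attention 2 unavailable on Ascend NPU")
```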
This PR solves that problem by renaming the flash attention 2 related functions (e.g. npu_flash_attn_func) in transformers.integrations.npu_flash_attention to the expected names, which contain the _2 symbol (e.g. flash_attn_2_func).
Fixes # (issue)
Not related.
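As a rough sketch of the renaming described above (hedged: the real signatures and bodies in the NPU integration differ, and the alias at the end is hypothetical, shown only to illustrate backward compatibility):

```python
# Sketch of the renaming pattern in transformers.integrations.npu_flash_attention
# (signature simplified; only the naming matters for this fix).

def flash_attn_2_func(q, k, v, dropout_p=0.0, causal=False, **kwargs):
    """FA2-compatible entry point backed by Ascend NPU kernels.

    Renamed from npu_flash_attn_func so the name matches what the
    flash attention 2 code path in modeling_flash_attention_utils imports.
    """
    raise NotImplementedError("the real module calls into torch_npu fused attention")

# Hypothetical backward-compatible alias for callers that still use the
# old name (illustration only; not necessarily part of this PR).
npu_flash_attn_func = flash_attn_2_func
```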
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.