Fix handling of attention-bias in MHA fusion #2332

Merged
merged 5 commits into from
May 22, 2025

gramalingam
Collaborator

In models generated from PyTorch, masks may have any shape that is broadcastable to (B, H, S, St): e.g., a 2D mask of shape (S, St), or even a mask of shape (1, 1, 1, St) in one example.

ONNX's opset 23 Attention op allows masks of these shapes. However, ORT's contrib ops (MHA, Attention) only allow a mask of shape (1 or B, 1 or H, S, St). That is, they support broadcasting only for the first two dimensions. (Even that is not supported by some earlier versions of ORT, which we don't consider here.)
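
To make the constraint concrete, here is a minimal sketch of the shape check it implies. The function name `satisfies_ort_mask_constraint` and its parameters are hypothetical, chosen for illustration only; this is not code from this PR or from ORT.

```python
def satisfies_ort_mask_constraint(mask_shape, B, H, S, St):
    """Return True if mask_shape is (1 or B, 1 or H, S, St).

    Hypothetical helper: it mirrors the constraint described above,
    where only the first two dimensions may be broadcast (size 1).
    """
    if len(mask_shape) != 4:
        return False
    d0, d1, d2, d3 = mask_shape
    return d0 in (1, B) and d1 in (1, H) and d2 == S and d3 == St


B, H, S, St = 2, 8, 3, 5
# Broadcastable to (B, H, S, St), but not accepted under the constraint:
print(satisfies_ort_mask_constraint((S, St), B, H, S, St))
print(satisfies_ort_mask_constraint((1, 1, 1, St), B, H, S, St))
# Accepted: broadcasting is confined to the first two dimensions.
print(satisfies_ort_mask_constraint((1, 1, S, St), B, H, S, St))
```

Under these assumptions, the 2D mask and the (1, 1, 1, St) mask both fail the check even though they broadcast to (B, H, S, St).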

So, while doing fusion for MHA, we should expand the mask to ensure it satisfies the constraints of MHA/Attention.
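
As a rough illustration of the expansion, the NumPy sketch below left-pads a mask to rank 4 and broadcasts its last two dimensions to (S, St), leaving the first two dimensions as they are. The helper name `expand_mask_for_mha` is hypothetical, and the fusion itself presumably does the equivalent with ONNX ops in the rewritten graph rather than with NumPy.

```python
import numpy as np


def expand_mask_for_mha(mask: np.ndarray, S: int, St: int) -> np.ndarray:
    """Expand a mask broadcastable to (B, H, S, St) into (d0, d1, S, St).

    Hypothetical helper for illustration. d0 and d1 are the mask's own
    leading dimensions (1 if absent), so broadcasting over batch and
    heads is still left to the consumer.
    """
    if mask.ndim > 4:
        raise ValueError(f"expected rank <= 4, got rank {mask.ndim}")
    # Left-pad the shape with 1s, matching standard broadcasting alignment.
    padded = mask.reshape((1,) * (4 - mask.ndim) + mask.shape)
    d0, d1 = padded.shape[:2]
    # Raises ValueError if the last two dims are not broadcastable to (S, St).
    return np.broadcast_to(padded, (d0, d1, S, St))


S, St = 3, 5
print(expand_mask_for_mha(np.zeros((S, St)), S, St).shape)        # (1, 1, 3, 5)
print(expand_mask_for_mha(np.zeros((1, 1, 1, St)), S, St).shape)  # (1, 1, 3, 5)
```

Only the last two dimensions are materialized here, since the text above says broadcasting over the first two is already supported.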

codecov bot commented May 22, 2025

❌ 3 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 16179 | 3 | 16176 | 1701 |
View the top 3 failed test(s) by shortest run time
onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_0649_test_min_float32
Stack Traces | 0.003s run time
onnxscript\backend\onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
C:\hostedtoolcache\windows\Python\3.10.11\x64\lib\importlib\__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'tests.onnx_backend_test_code.test_min_float32'

The above exception was the direct cause of the following exception:
.nox\test\lib\site-packages\parameterized\parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript\backend\onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript\backend\onnx_export_test.py:139: in extract_functions
    raise AssertionError(
E   AssertionError: Unable to import 'tests.onnx_backend_test_code.test_min_float32' (e=No module named 'tests.onnx_backend_test_code.test_min_float32') (file: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_min_float32.py', absolute path: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_min_float32.py', current folder: D:\a\onnxscript\onnxscript
E   ---- CONTENT --
E   import numpy
E   from onnx import TensorProto
E   from onnx.helper import make_tensor
E   from onnxscript import script, external_tensor
E   from onnxscript.values import Opset
E   from onnxscript.onnx_types import FLOAT
E   from onnxscript.onnx_opset import opset13
E   
E   @script()
E   def bck_test_min_float32(data_0: FLOAT[3], data_1: FLOAT[3]) -> (FLOAT[3]):
E       result = opset13.Min(data_0, data_1)
E       return result
onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_0680_test_mul_example
Stack Traces | 0.003s run time
onnxscript\backend\onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
C:\hostedtoolcache\windows\Python\3.10.11\x64\lib\importlib\__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'tests.onnx_backend_test_code.test_mul_example'

The above exception was the direct cause of the following exception:
.nox\test\lib\site-packages\parameterized\parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript\backend\onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript\backend\onnx_export_test.py:139: in extract_functions
    raise AssertionError(
E   AssertionError: Unable to import 'tests.onnx_backend_test_code.test_mul_example' (e=No module named 'tests.onnx_backend_test_code.test_mul_example') (file: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_mul_example.py', absolute path: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_mul_example.py', current folder: D:\a\onnxscript\onnxscript
E   ---- CONTENT --
E   import numpy
E   from onnx import TensorProto
E   from onnx.helper import make_tensor
E   from onnxscript import script, external_tensor
E   from onnxscript.values import Opset
E   from onnxscript.onnx_types import FLOAT
E   from onnxscript.onnx_opset import opset14
E   
E   @script()
E   def bck_test_mul_example(x: FLOAT[3], y: FLOAT[3]) -> (FLOAT[3]):
E       z = opset14.Mul(x, y)
E       return z
onnxscript.backend.onnx_export_test.TestOnnxBackEnd::test_export2python_produces_correct_onnx_script_model_0411_test_gemm_transposeA
Stack Traces | 0.004s run time
onnxscript\backend\onnx_export_test.py:137: in extract_functions
    mod = importlib.import_module(import_name)
C:\hostedtoolcache\windows\Python\3.10.11\x64\lib\importlib\__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
E   ModuleNotFoundError: No module named 'tests.onnx_backend_test_code.test_gemm_transposeA'

The above exception was the direct cause of the following exception:
.nox\test\lib\site-packages\parameterized\parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript\backend\onnx_export_test.py:271: in test_export2python_produces_correct_onnx_script_model
    functions = extract_functions(backend_test.name, code, self.test_folder)
onnxscript\backend\onnx_export_test.py:139: in extract_functions
    raise AssertionError(
E   AssertionError: Unable to import 'tests.onnx_backend_test_code.test_gemm_transposeA' (e=No module named 'tests.onnx_backend_test_code.test_gemm_transposeA') (file: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_gemm_transposeA.py', absolute path: 'D:\\a\\onnxscript\\onnxscript\\tests\\onnx_backend_test_code\\test_gemm_transposeA.py', current folder: D:\a\onnxscript\onnxscript
E   ---- CONTENT --
E   import numpy
E   from onnx import TensorProto
E   from onnx.helper import make_tensor
E   from onnxscript import script, external_tensor
E   from onnxscript.values import Opset
E   from onnxscript.onnx_types import FLOAT
E   from onnxscript.onnx_opset import opset13
E   
E   @script()
E   def bck_test_gemm_transposeA(a: FLOAT[6,3], b: FLOAT[6,4], c: FLOAT[1,4]) -> (FLOAT[3,4]):
E       y = opset13.Gemm(a, b, c, transA=1)
E       return y

justinchuby requested a review from Copilot May 22, 2025 20:17
Contributor

Copilot AI left a comment

Pull Request Overview

This PR enhances attention-bias (mask) handling in the MHA fusion by enforcing ORT contrib ops’ mask shape requirements and expanding 2D masks for broadcasting.

  • Adds shape checks to ensure masks are 2D or 4D with broadcastable first two dims
  • Tracks when mask broadcast is needed via _use_mask_broadcast
  • Inserts an Expand in rewrite() to reshape 2D masks to 4D for MultiHeadAttention
Comments suppressed due to low confidence (1)

onnxscript/rewriter/ort_fusions/mha.py:285

  • [nitpick] The name mask_dim_2 is ambiguous; consider renaming it to something more descriptive like mask_seq_len_dim or mask_S_or_1 to clarify that this binding holds the S-or-1 dimension.
mask_dim_2 = bindings.get("S_or_1")

gramalingam enabled auto-merge (squash) May 22, 2025 20:51
gramalingam merged commit ef7e9e7 into main May 22, 2025
23 of 27 checks passed
gramalingam deleted the rama/attn-bias-shape branch May 22, 2025 20:52
justinchuby added this to the 0.2.7 milestone May 22, 2025