GQA Fusion #2161
Merged
37 commits
88adde4 A couple of MHA extensions (gramalingam)
c4c1f71 Run lint (gramalingam)
0425174 Minor fixes (gramalingam)
4b1a68d Update GQA (gramalingam)
4dbb44c Minor fixes (gramalingam)
248a113 Merge with main (gramalingam)
4d73c6e Switch to new GQA (gramalingam)
9c79b98 Fix variable naming (gramalingam)
0d0e8ae Add num heads attributes (gramalingam)
b588582 Use seqlens and totalseqlen (gramalingam)
3febda2 Add cos and sin cache (gramalingam)
afcf0a7 Fix int32 type (gramalingam)
03c08c7 GQA fusion (gramalingam)
9040172 Merge branch 'rama/GQA2' of https://github.com/microsoft/onnx-script … (gramalingam)
0bc603f Basic GQA test (gramalingam)
794f0dd Minor refactoring (gramalingam)
a7ba01b Switch to script (gramalingam)
d68b0e7 Add blank line (gramalingam)
e535207 Merge branch 'main' into rama/gqa-basic-test (gramalingam)
df5c69d Add test case with past and rotary (gramalingam)
045fc6f Add new test (gramalingam)
edf289f Cleanup test case (gramalingam)
457ac32 Added test with past and rotary (gramalingam)
1efdb26 Remove debug print (gramalingam)
13a71c0 Minor cleanup (gramalingam)
e74000e Merge with main (gramalingam)
e3dadc9 Add causal mask pattern (gramalingam)
ff72ac3 Merge branch 'rama/gqa-basic-test' into rama/GQA2 (gramalingam)
b7a1398 Add test case (gramalingam)
3bdc0b2 Complete GQA tests (gramalingam)
9545e5f Cleanup (gramalingam)
d19367a Merge branch 'main' into rama/GQA2 (gramalingam)
97bb1c2 Address copilot fixes (gramalingam)
56adee8 Merge branch 'rama/GQA2' of https://github.com/microsoft/onnx-script … (gramalingam)
c8bbb02 Add checks (gramalingam)
43b3368 Minor cleanup (gramalingam)
78f9243 Merge with main and address PR comments (gramalingam)