Use to_edge_lower_and_transform for XNNPack #8624
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8624
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 951d91e with merge base 77589c6.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Have you tested the performance of a model (like llama 3B) before and after this change? Asking because export_llama is used by different users to prepare .pte files for CPU. Please make sure there's no perf regression.
Running on-demand perf benchmark here: https://github.com/pytorch/executorch/actions/runs/13505211116
    )
    if args.verbose:
        print_delegation_info(builder.edge_manager.exported_program().graph_module)
    if args.num_sharding > 0 and args.qnn:
This code shouldn't be here, right?
Oh yeah, technically this should be removed since it is QNN-specific. Will remove after benchmarking finishes.
@iseeyuan please see the performance benchmark graph I posted in the PR description.
Can you run internal CI tests before merging? Otherwise looks good.
@jackzhxng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary: Use `to_edge_transform_and_lower` in `export_llama` for XNNPack. As part of these changes, you can no longer specify multiple backends in the `export_llama` args, although I'm not sure whether that is happening anywhere at the moment.

Closes #8621

Performance regression benchmarking for xnnpack (on android) vs. past 3 days:
<img width="1427" alt="Screenshot 2025-02-24 at 11 39 52 AM" src="https://github.com/user-attachments/assets/1640cf2c-a579-491f-8940-7ccfbe464903" />
These benchmark numbers normally fluctuate a bit across runs, and these differences are within the usual fluctuation ranges.

Test Plan: See if CI passes

Differential Revision: D70124742

Pulled By: jackzhxng
b30733d to 1b0c5e4
This pull request was exported from Phabricator. Differential Revision: D70124742
930ec0e to 951d91e
Differential Revision: D70221944
Pull Request resolved: #8717
Summary
Use `to_edge_transform_and_lower` in `export_llama` for XNNPack. As part of these changes, you can no longer specify multiple backends in the `export_llama` args, although I'm not sure whether that is happening anywhere at the moment.

Closes #8621
Performance regression benchmarking for xnnpack (on android) vs. past 3 days:

These benchmark numbers also normally fluctuate a bit across runs and these differences are within the usual fluctuation ranges.
Test plan
See if CI passes
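
For illustration only, the single-backend restriction mentioned in the summary could be enforced with a check along these lines. This is a minimal sketch, not the code in this PR: the helper names and the list of flag names are assumptions (only `args.qnn` appears in the diff quoted in this thread), and the real `export_llama` argument handling may differ.

```python
from argparse import Namespace
from typing import List, Optional

# Hypothetical flag names for illustration; only `qnn` is confirmed by the
# diff snippet quoted in the review comments.
BACKEND_FLAGS = ("xnnpack", "qnn", "coreml", "mps", "vulkan")


def selected_backends(args: Namespace) -> List[str]:
    """Return the names of backend flags that are truthy on `args`."""
    return [name for name in BACKEND_FLAGS if getattr(args, name, False)]


def require_single_backend(args: Namespace) -> Optional[str]:
    """Return the selected backend, or None if no backend flag is set.

    Raises ValueError when more than one backend flag is set, since a
    single-step lowering flow is assumed here to target one backend.
    """
    chosen = selected_backends(args)
    if len(chosen) > 1:
        raise ValueError(
            "Only one backend may be specified, got: " + ", ".join(chosen)
        )
    return chosen[0] if chosen else None
```

Calling `require_single_backend(Namespace(xnnpack=True))` returns `"xnnpack"`, while passing both `xnnpack=True` and `qnn=True` raises a `ValueError` instead of silently picking one.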