Skip to content

Use to_edge_lower_and_transform for XNNPack #8624

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 25, 2025
Merged

Conversation

jackzhxng
Copy link
Contributor

@jackzhxng jackzhxng commented Feb 21, 2025

Summary

Use to_edge_transform_and_lower in export_llama for XNNPack. As part of these changes, this also means that you cannot specify multiple backends in export_llama in the args, although I'm not sure if that is happening anywhere at the moment.

Closes #8621

Performance regression benchmarking for xnnpack (on android) vs. past 3 days:
Screenshot 2025-02-24 at 11 39 52 AM

These benchmark numbers also normally fluctuate a bit across runs and these differences are within the usual fluctuation ranges.

Test plan

See if CI passes

Copy link

pytorch-bot bot commented Feb 21, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8624

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 951d91e with merge base 77589c6 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 21, 2025
@jackzhxng jackzhxng changed the title Use to_edge_lower_and_transform for xnnpack Use to_edge_lower_and_transform for XNNPack Feb 21, 2025
@jackzhxng jackzhxng added topic: not user facing release notes: xnnpack Changes to the XNNPack backend delegate and removed topic: not user facing labels Feb 21, 2025
@mergennachin mergennachin self-requested a review February 21, 2025 20:43
@iseeyuan
Copy link
Contributor

Have you tested the performance of a model (like llama 3B), before and after? Asking because export_llama is used by different users to prepare for the .pte files in cpu. Please make sure there's no perf regress.

@jackzhxng
Copy link
Contributor Author

jackzhxng commented Feb 24, 2025

Running on demand perf benchmark here: https://github.com/pytorch/executorch/actions/runs/13505211116

)
if args.verbose:
print_delegation_info(builder.edge_manager.exported_program().graph_module)
if args.num_sharding > 0 and args.qnn:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code shouldn't be here right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah technically should remove since qnn. Will remove after benchmarking finishes

@jackzhxng jackzhxng temporarily deployed to upload-benchmark-results February 24, 2025 19:19 — with GitHub Actions Inactive
@jackzhxng jackzhxng temporarily deployed to upload-benchmark-results February 24, 2025 19:48 — with GitHub Actions Inactive
@jackzhxng
Copy link
Contributor Author

@iseeyuan please see the performance benchmark graph I posted in the pr description

@mergennachin
Copy link
Contributor

Can you run internal CI tests before merging? Otherwise looks good.

@facebook-github-bot
Copy link
Contributor

@jackzhxng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Feb 25, 2025
Summary:
Use `to_edge_transform_and_lower` in `export_llama` for XNNPack. As part of these changes, this also means that you cannot specify multiple backends in `export_llama` in the args, although I'm not sure if that is happening anywhere at the moment.

Closes #8621

Performance regression benchmarking for xnnpack (on android) vs. past 3 days:
<img width="1427" alt="Screenshot 2025-02-24 at 11 39 52 AM" src="https://github.com/user-attachments/assets/1640cf2c-a579-491f-8940-7ccfbe464903" />

These benchmark numbers also normally fluctuate a bit across runs and these differences are within the usual fluctuation ranges.


Test Plan: See if CI passes

Differential Revision: D70124742

Pulled By: jackzhxng
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D70124742

@facebook-github-bot
Copy link
Contributor

@jackzhxng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Summary:
Use `to_edge_transform_and_lower` in `export_llama` for XNNPack. As part of these changes, this also means that you cannot specify multiple backends in `export_llama` in the args, although I'm not sure if that is happening anywhere at the moment.

Closes #8621

Performance regression benchmarking for xnnpack (on android) vs. past 3 days:
<img width="1427" alt="Screenshot 2025-02-24 at 11 39 52 AM" src="https://github.com/user-attachments/assets/1640cf2c-a579-491f-8940-7ccfbe464903" />

These benchmark numbers also normally fluctuate a bit across runs and these differences are within the usual fluctuation ranges.


Test Plan: See if CI passes

Differential Revision: D70124742

Pulled By: jackzhxng
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D70124742

@jackzhxng jackzhxng merged commit b5344c1 into main Feb 25, 2025
119 of 121 checks passed
@jackzhxng jackzhxng deleted the jz/export_llama_new_api branch February 25, 2025 09:58
@jackzhxng jackzhxng restored the jz/export_llama_new_api branch February 25, 2025 09:59
@jackzhxng jackzhxng temporarily deployed to upload-benchmark-results February 25, 2025 10:45 — with GitHub Actions Inactive
swolchok added a commit that referenced this pull request Feb 26, 2025
@swolchok swolchok mentioned this pull request Feb 26, 2025
swolchok added a commit that referenced this pull request Feb 26, 2025
* Revert "Switch to new ao quant api for 8da4w (#8501)"

This reverts commit f3fc096.

* Revert "Use to_edge_lower_and_transform for XNNPack (#8624)"

This reverts commit b5344c1.

#8624 caused concerning test failure internally -- out of bounds array access. #8501 depends on it per author
facebook-github-bot pushed a commit that referenced this pull request Feb 27, 2025
Summary:
Trying to bring back #8624 after it got reverted due to an internal test failing


Differential Revision: D70221944

Pulled By: jackzhxng
facebook-github-bot pushed a commit that referenced this pull request Feb 27, 2025
Differential Revision: D70221944

Pull Request resolved: #8717
iseeyuan pushed a commit that referenced this pull request Mar 14, 2025
* Revert "Switch to new ao quant api for 8da4w (#8501)"

This reverts commit f3fc096.

* Revert "Use to_edge_lower_and_transform for XNNPack (#8624)"

This reverts commit b5344c1.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/trunk CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported release notes: xnnpack Changes to the XNNPack backend delegate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move export_llama to to_edge_transform_and_lower API
5 participants