Use Core ML Quantizer in Llama Export #4458
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/4458
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure: as of commit 68d345c with merge base 5a20a49, the following job has failed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Add @cccclai @shoumikhin as reviewers (smh I cannot assign reviewers 😂 so @ you here)
@cymbalrush since now we have
Thanks for putting up the PR! For the change in `llama_transformer.py`, there was actually some CI regression, as shown in #3786.
Yeah, this is quite interesting.
Anyway, since that's just a minor fix, I reverted the
Force-pushed from 2902624 to 7ca8eba.
Looks good in general. Can we fix the CI and rename the `coreml_xnnpack`, `coreml_xnnpack_qc4`? We still want it
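For context, these appear to be quantization-mode option names in the Llama export script. A hypothetical invocation (the flag name, option spelling, and file paths here are assumptions based on the discussion above, not the merged interface) might look like:

```bash
# Hypothetical usage sketch; flag and option names are assumptions.
python -m examples.models.llama2.export_llama \
    --checkpoint llama2.pt --params params.json \
    --coreml \
    --pt2e_quantize coreml_xnnpack_qc4
```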
Force-pushed from 7b993ae to 4ef6875.
Ok, reverted that change to still keep
There is also a lint error in the CI. Mind addressing it?
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Applied the lint fix.
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
There are some Pyre errors, but they aren't part of the OSS CI...
Lint error again 😅
Force-pushed from d88bcad to 68d345c (commit message truncated: "…opriate iOS version accordingly").
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Fixed 😅
This PR is an initial step toward adding a Core ML quantizer to the Llama export. We start with "quantize the model with the XNNPACK quantizer, then fully delegate to the Core ML backend"; "quantize with the Core ML quantizer" is still under development.
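Below is a minimal sketch of that flow, assuming PyTorch's PT2E quantization APIs and ExecuTorch's `CoreMLPartitioner`; the import paths, the capture call, and the toy model are assumptions for illustration, not this PR's exact code.

```python
# Sketch: quantize with the XNNPACK quantizer (PT2E flow), then fully
# delegate the quantized graph to the Core ML backend.
import torch
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

from executorch.backends.apple.coreml.partition import CoreMLPartitioner
from executorch.exir import to_edge


class TinyModel(torch.nn.Module):
    """Stand-in for the Llama transformer used by the real export script."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(32, 32)

    def forward(self, x):
        return self.linear(x)


model = TinyModel().eval()
example_inputs = (torch.randn(1, 32),)

# 1) Capture, then quantize with the XNNPACK quantizer (PT2E flow).
captured = torch.export.export_for_training(model, example_inputs).module()
quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
prepared = prepare_pt2e(captured, quantizer)
prepared(*example_inputs)  # run a calibration batch to collect observer stats
quantized = convert_pt2e(prepared)

# 2) Re-export the quantized module and fully delegate it to Core ML.
edge = to_edge(torch.export.export(quantized, example_inputs))
edge = edge.to_backend(CoreMLPartitioner())
program = edge.to_executorch()
```

The follow-up step the description mentions ("quantize with the Core ML quantizer") would presumably swap the XNNPACK quantizer for a Core ML one in step 1 while keeping the delegation step unchanged.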
This PR does 2 things: