Qualcomm AI Engine Direct - Enable 4 bits BW quantization #2506
chunit-quic
commented
Mar 19, 2024
- Add QNN_QUANTIZATION_ENCODING_BW... configs for qnn wrapper
- Add 4 bits quant config
- Add 4 bits quant single op tests
- Add per channel weight setting for quantizer
- Fix convert_to_linear error
- Refine quantizer
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/2506
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 2c0be5e with merge base 588c391.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
It looks like the quantization is applied to the whole model. Is it possible to have more granularity, such as specifying the quantization type for each op? cc: @jerryzh168
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Hi Chen,
Our quantizer provides two ways to quantize ops with different configs. One thing to note is that mixed precision (WIP) is not complete yet, and it might fail to validate some ops.
Finally, we could consider refining _get_quant_config a bit and allowing users to declare their own {op: config} dict. That approach would require us to submit another PR.
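To make the {op: config} idea concrete, here is a minimal sketch of how such a lookup could behave. It is not the quantizer's actual code: `QuantConfig`, `DEFAULT_CONFIG`, `get_quant_config`, the field names, and the op-name strings are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class QuantConfig:
    """Hypothetical stand-in for a quantizer config (not the real class)."""
    act_bits: int
    weight_bits: int
    per_channel: bool = False


# Hypothetical default: 8-bit activations and weights unless an op is overridden.
DEFAULT_CONFIG = QuantConfig(act_bits=8, weight_bits=8)


def get_quant_config(
    op_name: str,
    user_configs: Optional[Dict[str, QuantConfig]] = None,
) -> QuantConfig:
    """Return the user's config for op_name if declared, else the default."""
    if user_configs and op_name in user_configs:
        return user_configs[op_name]
    return DEFAULT_CONFIG


# User-declared {op: config} dict; the op names are illustrative strings.
user_configs = {
    "aten.conv2d.default": QuantConfig(act_bits=8, weight_bits=4, per_channel=True),
    "aten.linear.default": QuantConfig(act_bits=16, weight_bits=4),
}

conv_cfg = get_quant_config("aten.conv2d.default", user_configs)  # overridden
relu_cfg = get_quant_config("aten.relu.default", user_configs)    # falls back
```

Ops missing from the dict fall back to the default, so a user would only need to list the ops they want to treat differently.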
Thank you! In Monday's meeting, we brought up JSON-configurable quantization. Is that something we can target?
In case I have misunderstood anything, could you kindly give us an example of what the content of the JSON would look like? Thank you. :)
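To anchor the question, here is a rough guess at one possible shape, purely as a starting point for discussion. Nothing here is an agreed schema: the `default` and `ops` keys, the field names, the op-name string, and the `load_op_configs` helper are all assumptions.

```python
import json

# Hypothetical JSON content; keys and field names are guesses, not an agreed schema.
EXAMPLE_JSON = """
{
  "default": {"act_bits": 8, "weight_bits": 8, "per_channel": false},
  "ops": {
    "aten.conv2d.default": {"act_bits": 8, "weight_bits": 4, "per_channel": true}
  }
}
"""


def load_op_configs(text: str) -> dict:
    """Parse the JSON and return {op_name: config_dict}.

    Fields missing from an op entry are filled in from the "default" entry.
    """
    raw = json.loads(text)
    default = raw.get("default", {})
    return {op: {**default, **cfg} for op, cfg in raw.get("ops", {}).items()}


op_configs = load_op_configs(EXAMPLE_JSON)
```

If the intended JSON is closer to a list of per-module rules or something else entirely, an example would help us align.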
Hi @cccclai, just a gentle ping. Would you mind giving us an example of the JSON content, so that we can think about how to fulfill your requirements? Thank you. :D