PyTorch 2 Export Quantization for OpenVINO torch.compile backend #3321
Conversation
Co-authored-by: Alexander Suslov <[email protected]>
Co-authored-by: Yamini Nimmagadda <[email protected]>

🔗 Helpful Links: see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3321
Note: Links to docs will display an error until the docs builds have been completed.
✅ No failures as of commit 0a422c2 with merge base a5632da. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
Co-authored-by: Alexander Suslov <[email protected]>
@HamidShojanazeri can someone from the partner's team take a look?
Introduction
------------

**This is an experimental feature, the quantization API is subject to change.**
Just a suggestion, but I'd put this in a note callout: .. note::
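A sketch of what that suggestion could look like in the tutorial's reStructuredText, reusing the wording from the diff (indentation per the standard admonition directive):

```rst
.. note::

   This is an experimental feature, the quantization API is subject to change.
```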
Thanks! Done
(force-pushed from 1c6bc7c to f09a85f)
Thanks, looks great!
FYI, we just copied the PyTorch PT2E quantization flow code to torchao: pytorch/ao#2048
If you have bug fixes or new features that you need in PT2E quantization, please make changes in torchao instead; we are planning to deprecate the code in pytorch/pytorch.
overall plan: https://dev-discuss.pytorch.org/t/torch-ao-quantization-migration-plan/2810
# Capture the FX Graph to be quantized
with torch.no_grad(), nncf.torch.disable_patching():
    exported_model = torch.export.export(model, example_inputs).module()
torch.export.export_for_training is the recommended API now, I think.
Edit: looks like in new PyTorch versions these two are the same; in that case we might be recommending torch.export.export since it's simpler. cc @tugsbayasgalan @gmagogsfm to confirm
OK, I checked with Tugsuu, please continue to use torch.export.export; we'll be migrating to this as well.
from torch.ao.quantization.quantize_pt2e import convert_pt2e
from torch.ao.quantization.quantize_pt2e import prepare_pt2e
For long-term support, importing from torchao.quantization.pt2e.quantize_pt2e might be better.
this requires people to install torchao nightly though: pip install --pre torchao --index-url https://download.pytorch.org/whl/nightly/cu126 # full options are cpu/cu118/cu126/cu128
but this can be done in a separate step since you might need to adapt your code to work with the torchao copy
Just a few suggestions
Co-authored-by: Svetlana Karslioglu <[email protected]>
@daniil-lyakhov there seems to be a merge conflict - can you please take a look?
Sure! Done
@HamidShojanazeri , @williamwen42, could you please take a look?
Hello there!
In this example, we would like to present a quantization pipeline that allows users to run quantized OpenVINO models optimally, without ever leaving the PyTorch ecosystem.
CC: @alexsu52, @ynimmaga, @anzr299 @AlexKoff88
cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @ZailiWang @ZhaoqiongZ @leslie-fang-intel @Xia-Weiwen @sekahler2 @CaoE @zhuhaozhe @Valentine233