Add decorator for custom op and inductor decomp registration #408
CI: 2 new failures as of commit ac2e283 with merge base bc8599f. See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/408.
Force-pushed cf3234c to 2ba121f.
@@ -151,12 +144,12 @@ def quantize_affine(
     output_dtype (torch.dtype): requested dtype (e.g. torch.uint8) for output Tensor
     quant_min (Optional[int]): minimum quantized value for output Tensor, if not specified, it will be derived from dtype
     quant_max (Optional[int]): maximum quantized value for output Tensor, if not specified, it will be derived from dtype
-    zero_point_domain (ZeroPointDomain): the domain that zero_point is in, should be either integer or float
+    zero_point_domain (str): the domain that zero_point is in, should be either "int" or "float"
why did you change these from Enum to str?
oh, it's because the custom op API doesn't support enums right now
@zou3519 any plans to add support for this in the API?
No, the set of types supported by operators is intentionally limited so that people don't have a difficult time working with them (for example, when writing graph passes on FX graphs with operators). You can wrap the call to the operator in a Python function that does support enums.
> You can wrap the call to the operator in a Python function that does support enums

oh, I see, maybe that's what we should be doing here
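For reference, here's a minimal sketch of that suggestion (the op name `quant::quantize_affine_impl`, the simplified signature, and the placeholder body are hypothetical; only the enum-to-str wrapping pattern is the point):

```python
from enum import Enum

import torch


class ZeroPointDomain(Enum):
    INT = "int"
    FLOAT = "float"


# The operator itself only takes types the custom op API supports (here, str).
@torch.library.custom_op("quant::quantize_affine_impl", mutates_args=())
def _quantize_affine_impl(input: torch.Tensor, zero_point_domain: str) -> torch.Tensor:
    assert zero_point_domain in ("int", "float")
    return input.clone()  # placeholder for the real affine quantization logic


@_quantize_affine_impl.register_fake
def _(input: torch.Tensor, zero_point_domain: str) -> torch.Tensor:
    return torch.empty_like(input)


# The public Python wrapper keeps the enum; traced graphs only ever see the string.
def quantize_affine(
    input: torch.Tensor,
    zero_point_domain: ZeroPointDomain = ZeroPointDomain.INT,
) -> torch.Tensor:
    return _quantize_affine_impl(input, zero_point_domain.value)
```

This is essentially the direction the follow-up #434 takes: the enums stay on the Python side and only plain types cross the operator boundary.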
@@ -205,15 +203,16 @@ def quantize_affine(

     return quant

+@register_custom_op("quant::dequantize_affine")
Does this make the default behavior preserve the higher-level op when we run export? What about compile?
yes, and in compile we'll register an inductor decomposition to decompose the op
if TORCH_VERSION_AFTER_2_5:
    opdef = torch.library.custom_op(name, mutates_args=())(fn)
    opdef.register_fake(fn)
    register_decomposition([opdef._opoverload])(fn)
@supriyar here we register a decomp for inductor, so in torch.compile these ops will still be decomposed
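For context, a minimal sketch of what the decorator around this snippet could look like (the helper name `register_custom_op` and the three registration calls come from this PR; the returned value, the fallback branch, and the import path for `TORCH_VERSION_AFTER_2_5` are assumptions):

```python
import torch
from torch._decomp import register_decomposition

from torchao.utils import TORCH_VERSION_AFTER_2_5  # assumed import path


def register_custom_op(name: str):
    def decorator(fn):
        if TORCH_VERSION_AFTER_2_5:
            # Register fn as a real custom op so torch.export keeps it as a
            # single high-level node instead of tracing through its body.
            opdef = torch.library.custom_op(name, mutates_args=())(fn)
            # Reuse the Python implementation as the fake (meta) kernel.
            opdef.register_fake(fn)
            # Register an inductor decomposition so torch.compile still
            # decomposes the op back into its original implementation.
            register_decomposition([opdef._opoverload])(fn)
            return opdef
        # On older PyTorch, fall back to the plain Python function.
        return fn
    return decorator
```

With that in place, applying `@register_custom_op("quant::dequantize_affine")` as in the diff above gives export a single `quant::dequantize_affine` node while compile still sees the decomposed form.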
Requires pytorch/pytorch#129179 and pytorch/pytorch#129189 to land, and the nightly version to be updated, before CI can pass.
Force-pushed 2ba121f to ac2e283.
Summary: This PR adds a decorator to register a custom op and also an inductor decomposition. The goal is for the torch.export path to be able to see high-level ops like quantize_affine instead of breaking the op down, because some backends like xnnpack want to work with these higher-level ops. This is a redo of pytorch#408; the difference is that we can preserve the enums on the Python side in this PR. Test Plan: regression tests: `python test/quantization/test_quant_api.py`, `python test/integration/test_integration.py`; also need to check performance with `python tutorials/quantize_vit/run_vit_b_quant.py`
recreated the PR in #434
Summary:
This PR adds a decorator to register a custom op and also an inductor decomposition.
The goal is for the torch.export path to be able to see high-level ops like quantize_affine instead of breaking the op down, because some backends like xnnpack want to work with these higher-level ops.
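As a rough sanity check of that goal (not part of this PR's test plan; the module, the call arguments, and the import path are illustrative assumptions), the exported graph can be inspected directly:

```python
import torch
from torchao.quantization.quant_primitives import quantize_affine  # assumed import path


class M(torch.nn.Module):
    def forward(self, x):
        # exact arguments are illustrative; see quantize_affine's docstring
        return quantize_affine(
            x,
            block_size=(8,),
            scale=torch.tensor(0.1),
            zero_point=torch.tensor(0),
            output_dtype=torch.uint8,
        )


ep = torch.export.export(M(), (torch.randn(8),))
# Expect a single high-level quant::quantize_affine node here,
# not the sequence of aten ops it decomposes into.
print(ep.graph_module.graph)
```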
Test Plan:
regression tests:
`python test/quantization/test_quant_api.py`
`python test/integration/test_integration.py`
also need to check performance with
`python tutorials/quantize_vit/run_vit_b_quant.py`