Qualcomm AI Engine Direct - Fix UT example script hang when exception happened #4355
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/4355

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit b41626f with merge base 5a20a49:
FLAKY - The following job failed but was likely due to flakiness present on trunk.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @cccclai,

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Force-pushed from d17d18a to b41626f (Compare)
Hi @cccclai, it seems there are some failures in CI, so I have force-pushed another commit to trigger CI. Please have a look.

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
LGTM.
```diff
@@ -187,6 +187,7 @@ def build_executorch_binary(
     quantizer = QnnQuantizer()
     quantizer.add_custom_quant_annotations(custom_annotations)
     quantizer.set_per_channel_linear_quant(per_channel_linear)
+    quantizer.set_per_channel_conv_quant(True)
```
Is it always true or configurable?
I think we always set it to true, since a couple of models had bad accuracy due to turning off per_channel_conv.
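To illustrate the accuracy point above, here is a small self-contained sketch (not the QnnQuantizer implementation) of why per-channel weight quantization tends to preserve accuracy better than per-tensor quantization when weight magnitudes vary a lot across output channels, as they often do in conv layers:

```python
import random

def fake_quant(values, scale):
    # Symmetric int8 fake-quantization: quantize to [-127, 127], then dequantize.
    return [max(-127, min(127, round(v / scale))) * scale for v in values]

def mean_abs_err(values, quantized):
    return sum(abs(a - b) for a, b in zip(values, quantized)) / len(values)

random.seed(0)
big = [random.uniform(-1.0, 1.0) for _ in range(256)]      # wide-range channel
small = [random.uniform(-0.01, 0.01) for _ in range(256)]  # narrow-range channel

# Per-tensor: a single scale, dominated by the widest value in either channel.
scale_t = max(abs(v) for v in big + small) / 127
err_per_tensor = mean_abs_err(big + small, fake_quant(big + small, scale_t))

# Per-channel: each channel gets its own scale, so the narrow-range channel
# is quantized with a much finer step size.
err_per_channel = (
    mean_abs_err(big, fake_quant(big, max(abs(v) for v in big) / 127))
    + mean_abs_err(small, fake_quant(small, max(abs(v) for v in small) / 127))
) / 2

print(err_per_channel < err_per_tensor)  # True: per-channel loses less precision
```

With a shared scale, the narrow-range channel collapses onto just a few quantization levels, which is the kind of accuracy loss the reviewers observed when per_channel_conv was turned off.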
```diff
@@ -182,7 +181,6 @@ def set_per_channel_linear_quant(self, enable: bool) -> None:
         self._update_per_channel_weight_quant_ops(linear_ops, enable)

     def transform_for_annotation(self, model: GraphModule) -> GraphModule:
-        model = RemoveRedundancy()(model).graph_module
```
Should we have a follow-up to add this back? It feels like we may have a perf regression without it.
Thanks for reviewing. Although the RemoveRedundancy() pass is removed from the quantizer, it is still called during capture_program(), so the final performance should be the same.
executorch/backends/qualcomm/utils/utils.py
Line 196 in 7f6a341
RemoveRedundancy()(graph_module)
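The reasoning above can be sketched with a toy pass (hypothetical, not the ExecuTorch implementation): a cleanup pass like RemoveRedundancy is idempotent, so if capture_program() applies it anyway, the earlier application inside the quantizer was redundant and dropping it leaves the final graph unchanged.

```python
def remove_redundancy(ops):
    """Toy cleanup pass: drop ops that do not affect the graph's output.

    'identity' and 'clone' stand in for the kinds of redundant ops a real
    pass would eliminate; the op names here are illustrative only.
    """
    return [op for op in ops if op not in ("identity", "clone")]

graph = ["conv", "identity", "relu", "clone", "linear"]

once = remove_redundancy(graph)                      # only in capture_program()
twice = remove_redundancy(remove_redundancy(graph))  # quantizer + capture_program()

print(once == twice)  # True: the second application changes nothing
```

This is why removing the quantizer-side call should not cause a perf regression: the same redundant ops are still eliminated before the program is captured.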
Summary: