Upcoming changes to export API in ExecuTorch (published on 9/12/2023) #290
Comments
Hi Kimish, I am having trouble with exir.capture(...).to_edge(...). When I run capture using my model as the argument, it fails with the traceback listed below. It looks like the model fails to run to completion, even though it runs to completion when executed outside of ExecuTorch. I thought of adding constraints, but I am not sure how to do that. Please let me know if you need additional information. Thanks
Hi, put another way, what is the equivalent of … Thanks
Let me know if this answers your question.
Hi @kimishpatel, thanks for the response. I did save the model exported with torch.export without any problems. And I did read the examples and related tutorials (several times). Unfortunately, exir.capture does not work for me, as you can see from the traceback I posted in my message yesterday (please see above). I also tried to load a saved model (and its dictionary) and then use it as one of the arguments for exir.capture. It failed with the traceback below. torch.export works with a saved model. I cannot see from the traceback above what causes the exir.capture failure. What do you think? What should I do next? TRACEBACK PRODUCED WHEN USING SAVED MODEL AS ARGUMENT IN exir.capture
Oh, it seems aten.detach is not considered a core op. For now, set this to False: https://github.com/pytorch/executorch/blob/main/exir/capture/_config.py#L34, and try again. Note that you can also pass it via config to to_edge. cc: @guangy10
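A minimal sketch of that workaround, assuming the flag at the linked line is `_check_ir_validity` on `EdgeCompileConfig` (an assumption; check the linked file), with a tiny stand-in model:

```python
import torch
from executorch import exir

class TinyModel(torch.nn.Module):  # hypothetical stand-in for the user's model
    def forward(self, x):
        return x.detach() + 1.0    # detach triggers the aten.detach issue above

model = TinyModel()
example_inputs = (torch.randn(4),)

# Assumption: the flag referenced above is EdgeCompileConfig._check_ir_validity,
# which gates IR validity checks (core-ATen-ness among them) during to_edge.
edge_config = exir.EdgeCompileConfig(_check_ir_validity=False)

edge_program = exir.capture(model, example_inputs).to_edge(edge_config)
```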
I made the change you suggested. Now the code fails with the following error:
Strange. @larryliu0820, can you take a look? alias doesn't have an out variant, but alias_copy does in native_functions.yaml. Not sure why functionalization is not generating alias_copy. Maybe @bdhirsh knows.
So I think …
Did you see …
@kimishpatel, sorry to bother you. I was wondering when you think there will be an update on the issue.
Apologies for the late response; I was on PTO. Let me follow up on the issue.
@kimishpatel, sorry for coming back to you. I have received no response on the two issues that are blocking progress on my work: pytorch/pytorch#120219. I realize that we may still be in a leave period. If that is the case, please let me know when I should touch base again. Thanks for your patience and your help.
@adonnini, don't apologize. You have been very patient. Let me follow up and see what's happening.
Hi @kimishpatel, I hope you are well. I opened two issues:
@kimishpatel
Yeah, I don't know the compatibility with v0.1.0. I …
@kimishpatel Hi, in the next few weeks we will start test deployments of the Android application. I would love to have the model's run-for-inference function using ExecuTorch working by then.
@kimishpatel I hope you are well. In a couple of weeks we will start deployment of my Android application. I would love to be able to include running inference with the two models I am using to predict user location. Two issues:
@adonnini thanks for bringing this back. Let me raise it internally and see what traction we get. I truly appreciate how you have been trying to make this work.
@kimishpatel A quick update.
OK, let me ping them again.
@kimishpatel Thanks for your help. I really appreciate it! I think #2163 is pretty close to being resolved. @kirklandsign was very helpful.
@adonnini no problem, and thank you for your patience. I really appreciate it. I think @kirklandsign should be able to help you resolve it. If not, please bring it to my attention again. Thanks.
@adonnini do you know if the error happens only with the Android app, or have you also tried running the model via a standalone binary, like the executor runner? https://github.com/pytorch/executorch/tree/main/examples/portable/executor_runner
@kimishpatel I am not familiar with …
@kimishpatel It's been just a little over a month since I last heard about the resolution of …
@kimishpatel No response on the two issues that are preventing me from using ExecuTorch. I really would like to use ExecuTorch, but at this point I don't really know what to do. I realize that probably all the messages sent to you, @angelayi, @anijain2305, and @tarun292 are a nuisance.
@kimishpatel I am back to bothering you. I am sorry. There has been no action by @tarun292 on #1350 for a few months.
@kimishpatel Someone with the handle @luyi-711 just contacted me about one of the issues. The message was pretty suspicious. Is this someone on your team? A colleague?
Where are we?
Exporting a PyTorch model for the ExecuTorch runtime goes through multiple AoT (ahead-of-time) stages.
At a high level, there are three stages:
1. exir.capture: captures the model's graph using ATen IR.
2. to_edge: translates the ATen dialect into the edge dialect, with dtype specialization.
3. to_executorch: translates the edge dialect into the executorch dialect, running various passes (e.g., out-variant conversion, memory planning) to make the model ready for the ExecuTorch runtime.

Two important stops in the model's journey to the ExecuTorch runtime are: a) quantization and b) delegation.
Entry points for quantization are between steps 1 and 2. Thus quantization APIs consume ATen IR and are not edge/executorch specific.
Entry points for delegation are between steps 2 and 3. Thus delegation APIs consume edge dialect IR.
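For concreteness, a minimal sketch of this three-stage flow (the model and inputs are placeholders, not from the original post):

```python
import torch
from executorch import exir

class MyModel(torch.nn.Module):
    """Placeholder model standing in for a real network."""
    def forward(self, x):
        return torch.nn.functional.relu(x)

model = MyModel()
example_inputs = (torch.randn(1, 8),)

# Stage 1: capture the model's graph in ATen IR.
captured = exir.capture(model, example_inputs)
# (quantization APIs would run here, on the ATen IR)

# Stage 2: translate the ATen dialect to the edge dialect.
edge = captured.to_edge()
# (delegation APIs would run here, on the edge dialect IR)

# Stage 3: run out-variant, memory-planning, etc. passes and emit the
# executorch dialect, ready for the runtime.
program = edge.to_executorch()

# Serialize the program for the ExecuTorch runtime.
with open("model.pte", "wb") as f:
    f.write(program.buffer)
```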
Need for the export API change.
The quantization workflow is built on top of exir.capture, which is built on top of the torch.export API. To support QAT, exported models need to work with eager-mode autograd. The current export (step 1 above) emits ATen IR with core ATen ops. This is not autograd safe, meaning it is not safe to run such an exported model in eager mode (e.g., in Python) and expect the autograd engine to work. Thus training APIs, such as calculating a loss on the output and calling backward on the loss, are not guaranteed to work with this IR.
It is important that quantization APIs, for both QAT and PTQ, work on the same IR, because a) it provides a better UX to users, and b) it provides a single IR that backend-specific quantizers (read more here) can target.
For this reason we aligned on a two-stage export, rooted in the idea of progressive lowering. The two stages are:
1. Export to ATen IR with the full ATen opset; this graph is autograd safe.
2. Lower the stage 1 output to core ATen IR; this graph is NOT autograd safe.
Because the output of stage 1 is autograd safe, models exported at stage 1 can be trained via the eager-mode autograd engine.
New export API.
We are rolling out the changes related to the new export API in three stages.
Stage 1 (landed):
exir.capture is broken down into two steps:
1. capture_pre_autograd_graph
2. exir.capture
Example of exporting model without quantization:
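The original code block did not survive extraction; below is a minimal reconstruction of the Stage 1 flow, with MyModel, the inputs, and the exact config flags as assumptions:

```python
import torch
from torch._export import capture_pre_autograd_graph

from executorch import exir

class MyModel(torch.nn.Module):  # placeholder model
    def forward(self, x):
        return torch.nn.functional.relu(x)

model = MyModel()
example_inputs = (torch.randn(1, 8),)

# New first step: an autograd-safe graph with the full ATen opset.
pre_autograd = capture_pre_autograd_graph(model, example_inputs)

# exir.capture then lowers to core ATen IR; the rest of the journey is
# unchanged. (enable_aot=True is an assumption about the expected config.)
program = (
    exir.capture(pre_autograd, example_inputs, exir.CaptureConfig(enable_aot=True))
    .to_edge()
    .to_executorch()
)
```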
Example of exporting model with quantization:
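Likewise, a sketch of the quantized flow, inserting the PT2E quantization APIs between the two capture steps. XNNPACKQuantizer is used as an illustrative backend quantizer; the original block may have differed:

```python
import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)

from executorch import exir

class MyModel(torch.nn.Module):  # placeholder model
    def forward(self, x):
        return torch.nn.functional.relu(x)

model = MyModel()
example_inputs = (torch.randn(1, 8),)

# Step 1: autograd-safe ATen graph; the quantization APIs consume this IR,
# so PTQ calibration and QAT training both work on it.
m = capture_pre_autograd_graph(model, example_inputs)

quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
m = prepare_pt2e(m, quantizer)
m(*example_inputs)  # PTQ calibration; for QAT, train the prepared model here
m = convert_pt2e(m)

# Step 2 onward: lower to core ATen, edge, and executorch dialects as before.
program = exir.capture(m, example_inputs).to_edge().to_executorch()
```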
You can see these changes here and here for how quantization APIs fit in.
Stage 2 (coming soon):
We will deprecate exir.capture in favor of directly using torch.export. More updates on this will be posted soon.
Stage 3 (timeline is to be determined):
The two APIs listed in stage 1 will be renamed to:
1. torch.export
2. to_core_aten
torch.export will export a graph in ATen IR, with the full ATen opset, that is autograd safe, while to_core_aten will transform the output of torch.export into core ATen IR that is NOT autograd safe.
Example of exporting model without quantization:
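Since the original example is missing and the stage-3 APIs do not exist yet, the following is purely illustrative: `to_core_aten` and its chaining are assumptions about the planned shape of the API, not working code.

```python
import torch

class MyModel(torch.nn.Module):  # placeholder model
    def forward(self, x):
        return torch.nn.functional.relu(x)

model = MyModel()
example_inputs = (torch.randn(1, 8),)

# Planned stage-3 flow (not runnable today; to_core_aten is a future API):
exported = torch.export.export(model, example_inputs)  # autograd safe, full ATen opset
core = to_core_aten(exported)                          # core ATen IR, NOT autograd safe
program = core.to_edge().to_executorch()
```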
Example of exporting model with quantization:
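And the corresponding quantized flow, again purely illustrative, reusing model, example_inputs, and the quantizer imports from the sketches above; quantization sits between torch.export and to_core_aten, mirroring the stage-1 structure:

```python
# Purely illustrative, not runnable today: to_core_aten is a planned API.
exported = torch.export.export(model, example_inputs)  # autograd-safe ATen IR

quantizer = XNNPACKQuantizer().set_global(get_symmetric_quantization_config())
prepared = prepare_pt2e(exported.module(), quantizer)  # assumption: quantize the unlifted module
prepared(*example_inputs)                              # PTQ calibration (or QAT training)
quantized = convert_pt2e(prepared)

program = to_core_aten(quantized).to_edge().to_executorch()
```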
The timeline for this is to be determined, but it will NOT happen before the PyTorch conference on 10/16/2023.
Why this change?
There are a couple of reasons:
This change aligns well with the long-term state, where capture_pre_autograd_graph is replaced with torch.export to obtain autograd-safe ATen IR, and the current use of exir.capture (or torch.export, once replaced) is replaced with to_core_aten to obtain ATen IR with the core ATen opset.
In the long term, export for quantization won't be separate; quantization will be an optional step, like delegation, in the export journey. Thus aligning with that in the short term helps because:
Why the change now?
To minimize the migration pain later and to align better with the long-term changes.