SDK + Inspector output time format is inconsistent with delegates #4504
From what I understand, the recorded timings in the XNNPACK events are ticks, and the above can be interpreted as nanoseconds. But @Olivia-liu and @GregoryComer can speak more to this and to any plans to unify the timing units.
You can write a custom time converter like this:

```python
from typing import Union

def convert_xnnpack_delegate_time(
    event_name: Union[str, int], input_time: Union[int, float]
) -> Union[int, float]:
    return input_time / (1000 * 1000)
```

And pass the converter to the Inspector like this:

```python
inspector = Inspector(
    etdump_path=args.etdump_path,
    etrecord=args.etrecord_path,
    debug_buffer_path=args.debug_buffer_path,
    source_time_scale=TimeScale(args.source_time_scale),
    target_time_scale=TimeScale(args.target_time_scale),
    delegate_time_scale_converter=convert_xnnpack_delegate_time,
)
```

This should work.
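More generally, the ratio between any two time scales can be computed from their factors to seconds. A minimal, self-contained sketch (the scale names and helper function are illustrative, not part of the Inspector API):

```python
# Conversion factors from each scale to seconds (illustrative values).
SCALE_TO_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}

def convert_time(value: float, source: str = "ns", target: str = "ms") -> float:
    """Convert a duration from `source` scale to `target` scale."""
    return value * SCALE_TO_SECONDS[source] / SCALE_TO_SECONDS[target]
```

For the default ns -> ms case this reduces to dividing by 10^6, which is exactly what `convert_xnnpack_delegate_time` above does.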
Thanks for the prompt response. From the structure of the profiling output, I generally see that a block of delegate calls is followed by a "DELEGATE_CALL" event. Maybe I'm mistaken about this representing the time spent in the delegate from ExecuTorch's side. If I add up the delegate profiling times for the inference and interpret them as nanoseconds, the preceding block of XNNPACK operators' overall time is quite different from the time given by DELEGATE_CALL (20.25 ms vs 28.1 ms). This ~40% of missing inference time relative to the overall method::execute time was one of the reasons I thought I couldn't interpret the results as nanoseconds. Full table below:
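For what it's worth, the discrepancy above is plain arithmetic on the two figures quoted (no ExecuTorch APIs involved):

```python
delegate_ops_ms = 20.25   # sum of the XNNPACK operator times, read as ns -> ms
delegate_call_ms = 28.10  # time reported for the DELEGATE_CALL event

gap_ms = delegate_call_ms - delegate_ops_ms
# Gap relative to the operator sum -- the "~40%" mentioned above.
gap_pct = 100.0 * gap_ms / delegate_ops_ms
print(f"unaccounted: {gap_ms:.2f} ms ({gap_pct:.1f}% of the operator sum)")
```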
Hey @Aeranthes, thanks again for your question! Always glad to see someone trying out the tool :). I think there's some overhead in making the delegate call, so the total time is greater than the sum of the operator times. @cccclai, do you have any knowledge on this?
It should be correct to interpret those values of the delegated operators as nanoseconds for XNNPACK. We're working on making the time units consistent for delegates in the Inspector output. The 20.25 ms vs 28.1 ms gap is indeed odd; there might be some bug unrelated to the time scale that's causing it. I don't think you did anything wrong, since I can reproduce this on my side as well.
Summary: Because the units of delegate ops differ from those of non-delegated ops, this has caused confusion for users: [post](https://fb.workplace.com/groups/pytorch.edge.users/permalink/1562687731268181/), [Issue pytorch#4504](pytorch#4504). This diff gives a quick solution to the problem to unblock Beta. The rationale: for non-delegated ops we have the default time scales

```
source_time_scale: TimeScale = TimeScale.NS,
target_time_scale: TimeScale = TimeScale.MS,
```

and the conversion rate is 10^6 (because 1 ms = 10^6 ns), so it makes sense to set the default converter function for delegated ops to `input_time / (1000 * 1000)` as well. Similar converter functions are already used by [coreml](https://www.internalfb.com/code/fbsource/[76be64aab17c]/fbcode/executorch/examples/apple/coreml/scripts/inspector_utils.py?lines=259-262) and [vulkan](https://fburl.com/code/5cu6fx2b). The proper solution, writing the conversion into etdump, is tracked by T198369344 and T198369419. Differential Revision: D62160650
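The default behavior this summary describes can be sketched as follows (hypothetical helper names; the actual Inspector internals may differ):

```python
from typing import Callable, Optional, Union

ConverterT = Callable[[Union[str, int], Union[int, float]], Union[int, float]]

def default_delegate_time_scale_converter(
    event_name: Union[str, int], input_time: Union[int, float]
) -> float:
    # Mirror the non-delegated NS -> MS conversion: 1 ms = 10**6 ns.
    return input_time / (1000 * 1000)

def resolve_delegate_converter(
    user_converter: Optional[ConverterT] = None,
) -> ConverterT:
    # A user-supplied converter, when provided, takes precedence.
    if user_converter is not None:
        return user_converter
    return default_delegate_time_scale_converter
```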
@Olivia-liu is this resolved?
This should be solved by #5076.
Hi again - I think this is probably setup-agnostic, but as a quick rundown: I'm running on an Android device with the SDK enabled. I've got a binary application that can generate an etdump file while registering the XNNPACK delegate, and I'm running a version of mobilenet_v2 that has been partitioned for the same delegate.
When I print out the resulting event blocks to find the operator timings, I'm seeing the following pattern:
The subject of the issue is specifically that the time format seems inconsistent: the ExecuTorch calls and column headers are recorded in milliseconds, while, from what I gather from XNNProfiler.cpp, the delegate calls are recorded in "PAL ticks" / system ticks?
Is this a bug, and/or is there a way to get the output in a more comparable time unit? Thanks for any assistance.
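For context on the tick question: a raw tick count is only meaningful together with the clock's frequency. A hedged sketch of the conversion (the ticks-per-second value is platform-specific and an assumption here; on PALs built on a nanosecond-resolution monotonic clock, one tick is commonly one nanosecond):

```python
def ticks_to_ns(ticks: int, ticks_per_second: int) -> int:
    """Convert a platform tick count to nanoseconds."""
    return ticks * 1_000_000_000 // ticks_per_second
```

If `ticks_per_second` is 10^9, the conversion is the identity, which is consistent with the advice above to read the XNNPACK values directly as nanoseconds on this setup.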
cc @Olivia-liu @Jack-Khuu