replace new_like with wrap_like #6718
Conversation
LGTM, thanks @pmeier! I only have a couple of nit comments but nothing major.

@vfdev-5 my understanding is that it's still worth proceeding with some of the ideas from #6681 to reduce the `__torch_function__` calls. Could you confirm that this PR doesn't affect the approach you are favouring to solve this?
LGTM, let's move on! Thanks @pmeier
Conflicts:
	torchvision/prototype/transforms/_auto_augment.py
	torchvision/prototype/transforms/_color.py
	torchvision/prototype/transforms/functional/_augment.py
As discussed offline, there are multiple ways we can approach the interface design. Nothing is set in stone here. We'll move forward with what I have proposed, with the strong possibility of refactoring later. The performance gain is too high to hold this up with bikeshedding.
Summary:
* replace new_like with wrap_like
* fix videos
* revert casting in favor of ignoring mypy

Reviewed By: NicolasHug

Differential Revision: D40427465

fbshipit-source-id: 04b854225fe6a886cbe468b1277a0b73ca273885
Throughout this comment I'm using `Image` as a proxy for all of our features, for simplicity.

This is my take on reducing the overhead transforms v2 has. Currently, we are using the idiom from `torchvision/prototype/features/_image.py` (line 136 in 7eb5d7f) everywhere to wrap a plain tensor into the image feature. Doing so results in multiple `__torch_function__` calls, as detailed in #6681. Similar to the constructor, the `Image.new_like` method accepts arbitrary `data: Any` as input and thus has to go through the constructor every time.
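To make the discussion concrete, here is a minimal, self-contained sketch of that pattern. It is a toy stand-in, not the verbatim torchvision code; in particular, the color-space guessing heuristic is invented for this sketch:

```python
import torch


class Image(torch.Tensor):
    """Toy stand-in for features.Image; the real class is more elaborate."""

    color_space: str

    def __new__(cls, data, *, color_space=None):
        # Accepting arbitrary `data` forces a torch.as_tensor round trip
        # plus subclass creation on every wrap, which is where the extra
        # __torch_function__ dispatches come from.
        tensor = torch.as_tensor(data)
        image = tensor.as_subclass(cls)
        # The constructor may also process metadata, e.g. guess the color
        # space from the channel count when none is given (heuristic made
        # up for this sketch).
        image.color_space = color_space or ("GRAY" if tensor.shape[-3] == 1 else "RGB")
        return image

    @classmethod
    def new_like(cls, other, data, *, color_space=None, dtype=None):
        # Round-trips through the constructor even when `data` is already
        # a tensor of the right dtype and device.
        tensor = torch.as_tensor(data, dtype=dtype if dtype is not None else other.dtype)
        return cls(
            tensor,
            color_space=color_space if color_space is not None else other.color_space,
        )
```

A typical call site then looks like `Image.new_like(image, output_tensor)`, paying the full constructor cost on every wrap.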
However, we never call it without a tensor input. Plus, whenever we pass `dtype` to `Image.new_like`, it is not to change the `dtype` of the tensor to be wrapped, but rather to retain it (see `torchvision/prototype/features/_bounding_box.py`, line 158 in 7eb5d7f).
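Illustrated on the toy class above (the permalink points at `BoundingBox`, but the pattern is the same):

```python
# new_like defaults `dtype` to other's dtype, so when a kernel upcasts,
# the call passes the output's own dtype back in purely to retain it:
img_u8 = Image(torch.zeros(3, 4, 4, dtype=torch.uint8))
output = img_u8.as_subclass(torch.Tensor).to(torch.float32) / 255
wrapped = Image.new_like(img_u8, output, dtype=output.dtype)  # stays float32
```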
Taking this one step further, this also means that the `new_like` name is somewhat misleading. Yes, one gets a new `features.Image` object, but unlike the `torch.*_like` methods, we don't get new storage unless the `dtype` or `device` is changed.

This PR proposes to fix the above by refactoring `Image.new_like` into `Image.wrap_like`. As opposed to `new_like`, `wrap_like` only takes a tensor to be wrapped as well as the metadata for the specific type, i.e. `color_space` for `features.Image`. This prevents the need to go through the constructor and results in no `__torch_function__` calls at all:
We can estimate the impact of this change on one classification training. Although the overhead is quite low, at roughly 20 µs per sample or 1 µs per call, the enormous number of calls during a full training blows this up to a significant increase. However, running the same benchmark on `main` yields roughly ten times that per-sample overhead.
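A micro-benchmark sketch along those lines, using the toy class with both methods from the sketches above (absolute numbers will not match the PR's figures, which were measured on the real classes):

```python
import timeit

import torch

img = Image(torch.rand(3, 224, 224))
t = img.as_subclass(torch.Tensor)  # plain tensor, as produced by a kernel

n = 100_000
new = timeit.timeit(lambda: Image.new_like(img, t), number=n)
wrap = timeit.timeit(lambda: Image.wrap_like(img, t), number=n)
print(f"new_like:  {new / n * 1e6:.2f} µs/call")
print(f"wrap_like: {wrap / n * 1e6:.2f} µs/call")
```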
To put it in words, this PR achieves roughly a 10x reduction of the overhead. And the only thing we lose is the ability to pass arbitrary data to the wrapping function or to change the `dtype` and `device` in the process. We do neither today, and I currently don't see a use case for it either.

Note that this doesn't affect the ability to pass arbitrary data to the constructor; that is still supported. Plus, in contrast to the proposed `wrap_like` function, the constructor may also process the metadata, like guessing the `color_space` if none is passed to `features.Image`, while `wrap_like` only takes the correct type or takes the value from the reference.
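On the toy class, that contrast looks like this (the color-space guess is the made-up heuristic from the first sketch):

```python
data = torch.rand(1, 32, 32)
image = Image(data)  # constructor guesses color_space="GRAY" from the channel count

out = (image.as_subclass(torch.Tensor) * 255).to(torch.uint8)
wrapped = Image.wrap_like(image, out)  # no guessing; copies color_space from `image`
assert wrapped.color_space == image.color_space
```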