Transforms without dispatcher #5421
Conversation
💊 CI failures summary (automatically generated by Dr. CI): as of commit 0943de0, there are no failures.
After some offline discussion with @vfdev-5, it became clear that always having a primitive transform will significantly increase the number of transforms that we provide. Thus, although this PR removes the high-level kernels, effectively turning 3 API layers into 2, the surface reduction is not as significant as originally thought. There will still be some reduction, since some transforms are already primitives, e.g.
Overall I like this approach a lot. I've added a few comments, let's sync offline for next steps.
I think it's looking very good. A few more comments:
Nits and notes:
```python
    raise RuntimeError


def gaussian_blur_image_pil(img: PIL.Image, kernel_size: List[int], sigma: Optional[List[float]] = None) -> PIL.Image:
    return to_pil_image(gaussian_blur_image_tensor(to_tensor(img), kernel_size=kernel_size, sigma=sigma))
```
@vfdev-5 using `to_tensor` is heavily discouraged in favour of `pil_to_tensor`. We made this change a few months back. Could you send a separate PR that modifies the `gaussian_blur` from the main area and this `gaussian_blur_image_pil` kernel from the prototype to use the new method?
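A minimal sketch of the suggested change, assuming `gaussian_blur_image_tensor` keeps the signature shown above (note that, unlike `to_tensor`, `pil_to_tensor` returns a uint8 tensor without rescaling to [0, 1]):

```python
from typing import List, Optional

import PIL.Image
from torchvision.transforms.functional import pil_to_tensor, to_pil_image


def gaussian_blur_image_pil(img: PIL.Image.Image, kernel_size: List[int], sigma: Optional[List[float]] = None) -> PIL.Image.Image:
    # gaussian_blur_image_tensor is the tensor kernel defined alongside this
    # one in the PR; it is assumed here to accept the uint8 tensor as-is.
    t_img = pil_to_tensor(img)
    output = gaussian_blur_image_tensor(t_img, kernel_size=kernel_size, sigma=sigma)
    return to_pil_image(output)
```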
```python
def _grayscale_to_rgb_tensor(grayscale: torch.Tensor) -> torch.Tensor:
    return grayscale.expand(3, 1, 1)
```
Non-blocking TODO: This kernel assumes the tensor is a single image and not a batch. We shouldn't make this assumption; we should fix it in a separate PR. A hypothetical batch-aware version is sketched below.
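One possible batch-aware variant, not part of this PR, expands the channel dimension relative to the end of the shape so that both `(1, H, W)` images and `(N, 1, H, W)` batches work:

```python
import torch


def _grayscale_to_rgb_tensor(grayscale: torch.Tensor) -> torch.Tensor:
    # Hypothetical fix: expand the channel dimension (third from the end) to 3
    # while keeping any leading batch dimensions and the spatial size intact;
    # -1 means "keep this dimension's existing size".
    return grayscale.expand(*grayscale.shape[:-3], 3, -1, -1)
```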
```python
    expand: bool = False,
    fill: Optional[List[float]] = None,
    center: Optional[List[float]] = None,
) -> torch.Tensor:
```
@vfdev-5 FYI some transforms have been copy-pasted here to make refactoring easier, so any changes in the main area need to be made here as well.
I think this is a major improvement over the previous proposal. We should merge. I'll follow up with a more thorough review of the code-base that we have for the prototype transforms.
Summary:
* add prototype transforms that don't need dispatchers
* cleanup
* remove legacy_transform decorator
* remove legacy classes
* remove explicit param passing
* streamline extra_repr
* remove obsolete ._supports() method
* cleanup
* remove Query
* cleanup
* fix tests
* kernels -> functional
* move image size and num channels extraction to functional
* extend legacy function to extract image size and num channels
* implement dispatching for auto augment
* fix auto augment dispatch
* revert some naming changes
* remove ability to pass params to autoaugment
* fix legacy image size extraction
* align prototype.transforms.functional with transforms.functional
* cleanup
* fix image size and channels extraction
* fix affine and rotate
* revert image size to (width, height)
* Minor corrections

Reviewed By: datumbox
Differential Revision: D34579512
fbshipit-source-id: 2044269d771b3488010b62ff611cf4f75ef75ed4
Co-authored-by: Vasilis Vryniotis <[email protected]>
This is the alternative proposal to #5418. The main difference is that this PR eliminates the high-level kernels, a.k.a. dispatchers, introduced in #5407 by merging them into the transforms. That means that, in contrast to #5418, in this PR the transforms are responsible for the dispatch.
Pros

With the dispatchers, `get_params` needs to return all parameters that are needed, whereas in this PR we can again only generate the dynamic ones there and simply access the constant ones. A hypothetical sketch of this difference follows below.
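The transform, kernel, and parameter names below are illustrative stand-ins, not the actual prototype API:

```python
from typing import Any, Dict

import torch
from torch import nn


class RandomRotateWithDispatcher(nn.Module):
    """With a dispatcher: get_params() must return *all* parameters."""

    def __init__(self, degrees: float, expand: bool = False) -> None:
        super().__init__()
        self.degrees = degrees
        self.expand = expand

    def get_params(self) -> Dict[str, Any]:
        angle = float(torch.empty(1).uniform_(-self.degrees, self.degrees))
        # The constant `expand` has to be included too, since the dispatcher
        # only ever sees this dict.
        return dict(angle=angle, expand=self.expand)


class RandomRotateWithoutDispatcher(nn.Module):
    """Without a dispatcher: only the dynamic parameters are generated."""

    def __init__(self, degrees: float, expand: bool = False) -> None:
        super().__init__()
        self.degrees = degrees
        self.expand = expand

    def _get_params(self) -> Dict[str, Any]:
        angle = float(torch.empty(1).uniform_(-self.degrees, self.degrees))
        return dict(angle=angle)

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        angle = self._get_params()["angle"]
        # The constant parameters are read straight from the instance when the
        # transform itself calls the low-level kernel (stubbed out here).
        _ = (angle, self.expand)
        return input
```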
Cons

By removing the magic of the dispatchers, we will need to write more boilerplate code ourselves. Note that this does not necessarily translate to more LoC; compare the two variants sketched below.
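The original comparison snippets are not reproduced here; this is a purely illustrative, self-contained sketch in which every name is a hypothetical stand-in for the prototype API (e.g. kernels following the `*_image_tensor` naming convention), not the actual torchvision code:

```python
import torch
from torch import nn


# Low-level kernels, one per input type.
def horizontal_flip_image_tensor(img: torch.Tensor) -> torch.Tensor:
    return img.flip(-1)


def horizontal_flip_bounding_box(box: torch.Tensor, image_width: int) -> torch.Tensor:
    flipped = box.clone()
    flipped[..., [0, 2]] = image_width - box[..., [2, 0]]
    return flipped


# Variant A (#5418): a high-level dispatcher owns the per-type routing ...
def horizontal_flip(input: torch.Tensor, image_width: int = 0) -> torch.Tensor:
    if input.ndim == 2 and input.shape[-1] == 4:  # crude "is this a bounding box?" check
        return horizontal_flip_bounding_box(input, image_width)
    return horizontal_flip_image_tensor(input)


class HorizontalFlipA(nn.Module):
    def forward(self, input: torch.Tensor) -> torch.Tensor:
        # ... so the transform stays a one-liner.
        return horizontal_flip(input)


# Variant B (this PR): the transform performs the routing itself; this is the
# boilerplate this con refers to.
class HorizontalFlipB(nn.Module):
    def forward(self, input: torch.Tensor, image_width: int = 0) -> torch.Tensor:
        if input.ndim == 2 and input.shape[-1] == 4:
            return horizontal_flip_bounding_box(input, image_width)
        return horizontal_flip_image_tensor(input)
```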
Logging API calls is easier with dispatchers, since the logging call can be placed inside the decorator. Since this layer of the API is removed here, but we still want to log the calls to the kernels, we need to perform the logging there. #5424 (try api call logging with decorator) investigates whether this is possible through a decorator, which would weaken this con. A rough sketch of the decorator idea follows below.
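This is only an assumption about the kind of approach #5424 explores, not its actual implementation:

```python
import functools
import logging

logger = logging.getLogger(__name__)


def log_api_usage(fn):
    """Hypothetical decorator that logs every call to the wrapped kernel."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        logger.debug("torchvision kernel called: %s", fn.__qualname__)
        return fn(*args, **kwargs)

    return wrapper


@log_api_usage
def gaussian_blur_image_tensor(img, kernel_size, sigma=None):
    ...  # actual kernel implementation elided
```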
Writing custom composite transforms is harder. With the dispatchers, the composite can simply call the dispatcher for each step. Without dispatchers we could either call the low-level kernels directly or rely on the transform objects; both options are sketched below. Out of the two options, I prefer the latter, since we have less duplicated code, especially if the dispatch is not as straightforward as in the example.
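An illustrative, self-contained sketch of the two dispatcher-free options; all names, including `affine_image_tensor` and `Affine`, are hypothetical stand-ins rather than the real prototype API:

```python
import torch
from torch import nn


def affine_image_tensor(img: torch.Tensor, angle: float) -> torch.Tensor:
    # Stand-in for a real low-level kernel; a no-op keeps the sketch runnable.
    return img


class Affine(nn.Module):
    # Stand-in primitive transform; in real code it would dispatch to the
    # appropriate kernel for images, bounding boxes, etc.
    def __init__(self, angle: float) -> None:
        super().__init__()
        self.angle = angle

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        return affine_image_tensor(input, self.angle)


class MyCompositeViaKernels(nn.Module):
    """Option 1: call the low-level kernels directly."""

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        # Any per-type dispatch (images vs. boxes vs. ...) has to be repeated here.
        return affine_image_tensor(input, angle=30.0)


class MyCompositeViaTransforms(nn.Module):
    """Option 2: rely on the transform objects (the option preferred above)."""

    def __init__(self) -> None:
        super().__init__()
        self.affine = Affine(angle=30.0)

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        # The dispatch logic lives inside Affine and is not duplicated here.
        return self.affine(input)
```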
(This only applies if we go with option 2 from the point above.) If we use other transforms to write composite ops, we need to have a primitive for every transform, that is, a version without any random functionality. For example, we would need to have a `transforms.Affine` as well as a `transforms.RandomAffine`. From an implementation perspective this is fairly easy to achieve with `RandomAffine` subclassing `Affine` and overriding the `_get_params()` method with random sampling; a minimal sketch of this pattern follows below. This also ties directly into a comment on #5418 (add prototype transforms that use the prototype dispatchers).
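A minimal sketch of that subclassing pattern with hypothetical signatures, not the actual prototype code:

```python
from typing import Any, Dict

import torch
from torch import nn


def affine_image_tensor(img: torch.Tensor, angle: float) -> torch.Tensor:
    # Stand-in for the real low-level kernel.
    return img


class Affine(nn.Module):
    def __init__(self, angle: float) -> None:
        super().__init__()
        self.angle = angle

    def _get_params(self) -> Dict[str, Any]:
        # The primitive transform has no randomness; it just echoes its arguments.
        return dict(angle=self.angle)

    def forward(self, input: torch.Tensor) -> torch.Tensor:
        return affine_image_tensor(input, **self._get_params())


class RandomAffine(Affine):
    def __init__(self, degrees: float) -> None:
        super().__init__(angle=0.0)
        self.degrees = degrees

    def _get_params(self) -> Dict[str, Any]:
        # Only the parameter sampling changes; forward() is inherited from Affine.
        angle = float(torch.empty(1).uniform_(-self.degrees, self.degrees))
        return dict(angle=angle)
```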