Skip to content

Update low-level functional transforms with value_range #5502

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
vfdev-5 opened this issue Mar 1, 2022 · 11 comments
Closed

Update low-level functional transforms with value_range #5502

vfdev-5 opened this issue Mar 1, 2022 · 11 comments

Comments

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Mar 1, 2022

Following #5500 (comment) we may want to update low-level functional transforms with value_range argument to avoid implicit hard-coded max range definition:

  • 255 for uint8
  • 1.0 for float

Today this is done for

We can introduce new argument value_range and use it explicitly for these ops.
In general we can think of value_range as a tuple (min, max) which would cover majority of imagery where channels ranges are similar. There could be however other type of images (e.g. think of non-RGB color spaces or particular imagery) where value ranges could vary per channel, thus we may need to represent value_range as a list of 2-tuples: [(min_1, max_1), (min_2, max_2), ...]

cc @bjuncek @datumbox

@datumbox
Copy link
Contributor

datumbox commented Mar 1, 2022

I agree with the idea of passing value_range on our existing kernels. When None we can estimate the bound using our existing approach.

Concerning using single values vs single MinMax vs Multi MinMax, I think all boils down to what is our target. Do we see TorchVision offering native support for Transforms for more exotic non-RGB/Grayscale spaces? Perhaps that could be useful for medical or astronomical applications but we should be careful not to overcomplicate the API with features we won't use. Are transforms that use _blend even likely to be used in such domains?

I would like to get the advise of you and @pmeier on this one for how we proceed.

@NicolasHug
Copy link
Member

Apart from making the implicit ranges more explicit, what are other problems that having value_range would address?

I agree that the way we currently handle ranges isn't nice (read: not documented enough). But from my non-informed POV, it looks like adding value_range would add a lot of extra complexity for something that might be solved with better docs / user awareness?

@datumbox
Copy link
Contributor

datumbox commented Mar 1, 2022

@NicolasHug Thanks for the feedback, I think we have a description/context gap on this issue. Instead of jumping directly to the problem, we should provide info on what we try to solve and why.

The current API uses ToTensor and PILToTensor to not only convert the image to Tensor but also scale it to [0,1]. This is somehow problematic because the ToTensor does too many things. As a result of this implicit scaling, the low-level kernels try to guess the max pixel value of the image from its type.

Though we will maintain this guessing for BC, providing a range_value to the low-level kernels can free us from using ToTensor. In fact we can just maintain the image in its original space whatever it is and not have to normalize it until the very end when it's thrown to the model. This will make the pictures easier to visualize and combine. It will also remove the need for calling methods like ToTensor, ConvertImageDtype etc.

@pmeier
Copy link
Collaborator

pmeier commented Mar 2, 2022

Summarizing an internal discussion about this: apart from Keras, all other frameworks also make assumptions on the value range of images. From all the approaches we found, we think our approach that is also used by Albumentations is the most reasonable. We use [0, 1 if dtype.is_floating_point else torch.iinfo(dtype).max]. Thus, although we missed something we probably don't need the explicit value range.

We could provide a utility function that returns the assumed value range for a given dtype. This could also be used for convert_image_dtype and _blend from above.

@datumbox
Copy link
Contributor

datumbox commented Mar 2, 2022

If you check our current code you will see a TODO for using iinfo once it becomes JIT-scriptable. Instead of that, we got some methods that try to find the max value. I think we should check if iinfo is scriptable and if it's not perhaps simplify the code.

@vfdev-5 I wonder if you can help us with the above?

@vfdev-5
Copy link
Collaborator Author

vfdev-5 commented Mar 2, 2022

no, neither torch.iinfo nor torch.finfo is scriptable, pytorch/pytorch#41492

@datumbox
Copy link
Contributor

datumbox commented Mar 4, 2022

@vfdev-5 Thanks for checking. Is it possible to simplify the _max_value() ? Is there are reason why we don't have these pre-estimated and stored in mapping for all the supported types?

@vfdev-5
Copy link
Collaborator Author

vfdev-5 commented Mar 8, 2022

I think the reason is that torch script does not support global. We could do however something like:

def convert_image_dtype(image: torch.Tensor, dtype: torch.dtype = torch.float) -> torch.Tensor:

    # ...
    _max_values: Dict[torch.dtype, int] = {
        torch.uint8: 255,
        torch.int8: 127,
        torch.int16: int(2 ** 15),
        torch.int32: int(2 ** 31),
        torch.int64: int(2 ** 63),
        ...
    }

    input_max = _max_values[image.dtype]

@datumbox do you think it worth a change ?

EDIT: updated 256 -> 255

@datumbox
Copy link
Contributor

datumbox commented Mar 8, 2022

I think that makes sense. There is typo on the first record for torch.uint8, it should be 255 right?

@pmeier
Copy link
Collaborator

pmeier commented Mar 8, 2022

For the integer types, there is a -1 missing:

>>> assert torch.iinfo(torch.int16).max == 2**15 - 1

vfdev-5 added a commit to vfdev-5/vision that referenced this issue Mar 8, 2022
vfdev-5 added a commit that referenced this issue Mar 9, 2022
* Removed _max_value method and added a dictionary

Related to #5502

* Addressed failing tests and restored _max_value method

* Added xfailing test to switch quicker

* Switch to if/else impl
@datumbox
Copy link
Contributor

I think we should close this for now.

facebook-github-bot pushed a commit that referenced this issue Mar 15, 2022
Summary:
* Removed _max_value method and added a dictionary

Related to #5502

* Addressed failing tests and restored _max_value method

* Added xfailing test to switch quicker

* Switch to if/else impl

Reviewed By: vmoens

Differential Revision: D34878986

fbshipit-source-id: 2e8268eda1bff6f5375fc1b1c946a68af539689b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants