🚀 Feature
Similar to the channels argument of tf.io.decode_image, or PIL.Image.open(..).convert('format'), allow specifying the output channel 'format' so that a sensible conversion is performed automatically when decoding an image.
Motivation
Avoid having to manually check and convert colors (RGB -> grayscale or grayscale -> RGB) when a dataset contains a mix of RGB and grayscale images. For many typical use cases, the color format produced by image loading in the data pipeline needs to be consistent.
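A minimal sketch (not from the issue) of the manual check-and-convert step this feature would make unnecessary, assuming torchvision.io.read_image, which returns a uint8 tensor of shape (C, H, W):

```python
import torch
from torchvision.io import read_image

def load_rgb(path: str) -> torch.Tensor:
    img = read_image(path)            # channel count depends on the file on disk
    if img.shape[0] == 1:             # grayscale -> replicate to 3 channels
        img = img.expand(3, -1, -1)
    elif img.shape[0] == 4:           # RGBA -> drop the alpha channel
        img = img[:3]
    return img
```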
The lower-level libraries (libjpeg, libpng, etc.) have facilities for specifying the output color space and will do the conversion for you if you set it up (e.g. out_color_space for libjpeg). This is what TensorFlow's decode_image does, and it is the most efficient approach.
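For reference, a minimal sketch of the TensorFlow and PIL calls referred to above (standard public APIs; the file name is just a placeholder):

```python
import tensorflow as tf
from PIL import Image

data = tf.io.read_file("example.png")
rgb = tf.io.decode_image(data, channels=3)    # decoder performs the conversion to 3 channels
gray = tf.io.decode_image(data, channels=1)   # single-channel output regardless of the source

pil_rgb = Image.open("example.png").convert("RGB")  # PIL equivalent of the RGB case
```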
Pitch
Add a channels= argument to read_image that matches the TensorFlow semantics (a usage sketch follows the list below):
channels=0 - leave as the original: a grayscale image decodes to 1 channel, an RGB image to 3
channels=1 - grayscale output
channels=3 - RGB output
channels=4 - RGBA output (PNG or other formats that support alpha; not valid for JPEG)
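A minimal usage sketch of the proposed API; the channels= keyword is hypothetical and does not exist in torchvision's read_image today:

```python
from torchvision.io import read_image

img_orig = read_image("photo.jpg")               # hypothetical channels=0 default: keep the stored layout
img_rgb  = read_image("photo.jpg", channels=3)   # hypothetical: always decode to 3-channel RGB
img_gray = read_image("scan.png", channels=1)    # hypothetical: always decode to single-channel grayscale
img_rgba = read_image("icon.png", channels=4)    # hypothetical: RGBA, only for formats with alpha
```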
Alternatives
Additional context
I think this is a nice addition to have, and we will be looking into adding support for this feature in torchvision.
We are already planning a few improvements to PNG support, as @datumbox mentioned, so we could go one step further and also allow finer-grained control over the output format of read_image.