Allow ToImage to handle image paths #8261

Closed
mantasu opened this issue Feb 7, 2024 · 3 comments · Fixed by #8262

Comments

@mantasu
Contributor

mantasu commented Feb 7, 2024

🚀 The feature

Was just wondering if it was possible to add support for handling image paths when using ToImage. Here's the desired feature:

import numpy as np
from PIL import Image
from torchvision.transforms.v2 import ToImage

img_np = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
Image.fromarray(img_np).save("test_img.jpg")

print(ToImage()("test_img.jpg").shape) # torch.Size([3, 100, 100])

Motivation, pitch

This simply allows us to avoid manually loading images, for example in custom Datasets:

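import torch
from torchvision.transforms import v2

class MyDataset(torch.utils.data.Dataset):
    def __init__(self):
        # Desired behavior: ToImage() would accept a file path and decode it.
        # ToDtype is included because Normalize expects a float tensor.
        self.transform = v2.Compose([v2.ToImage(), v2.ToDtype(torch.float32, scale=True), v2.Normalize([0.5]*3, [0.3]*3)])
    def __getitem__(self, idx):
        return self.transform(f"path/to/img-{idx}.jpg")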

Also, this would be incredibly handy when using the standalone to_image utility.

Alternatives

No response

Additional context

No response

@NicolasHug
Member

Hi @mantasu , thanks for the feature request. Traditionally, the decoding utilities are kept separate from the transforms, as those tend to have fairly different parametrization.

Is read_image what you're looking for?
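As a rough illustration of keeping decoding separate from the transforms, the sketch below loads each file first and only then applies the transforms. `PathDataset`, `loader`, and `transform` are hypothetical names used for illustration and are not part of torchvision; the loader is passed in as a plain callable so the decoding step stays swappable.

```python
from typing import Any, Callable, Sequence


def _identity(x: Any) -> Any:
    """Default transform: return the decoded image unchanged."""
    return x


class PathDataset:
    """Map-style dataset (hypothetical) that decodes first, then transforms.

    `loader` is any callable that takes a path and returns a decoded image,
    e.g. torchvision.io.read_image. `transform` is any callable applied to
    the decoded image, e.g. a v2.Compose pipeline.
    """

    def __init__(
        self,
        paths: Sequence[str],
        loader: Callable[[str], Any],
        transform: Callable[[Any], Any] = _identity,
    ) -> None:
        self.paths = list(paths)
        self.loader = loader
        self.transform = transform

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, idx: int) -> Any:
        # Decoding step, kept outside the transform pipeline.
        image = self.loader(self.paths[idx])
        # Transforms only ever see an already-decoded image.
        return self.transform(image)
```

With torchvision, this would presumably be wired up as `PathDataset(paths, loader=read_image, transform=...)`, where the transform pipeline no longer needs to know anything about file paths.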

@mantasu
Contributor Author

mantasu commented Feb 7, 2024

Yeah, I was wondering if this could be added to to_image with an additional final check for str. I created a PR with example functionality. But feel free to close it if it is better to keep those separate!

@NicolasHug
Member

But feel free to close it if it is better to keep those separate!

Thanks for understanding @mantasu - yes, let's keep those separate. Our conversion transforms (e.g. ToImage, ToTensor, ToPILImage, etc.) have been the source of a lot of confusion in the past, e.g. ToTensor() would silently scale the values of the input and convert a uint8 PIL image to float... We've tried to clean that up a little, but backward-compatibility commitments make that hard. I'm afraid that adding another surface to those APIs (i.e. the ability to pass paths) would add a new layer of complexity to this already-messy space, so I prefer to avoid this. Thank you!
