[feature request] Allow passing in image reading/loading function to old-style datasets constructors #4991
Comments
@vadimkantorov Thanks for the proposal. It's unlikely that we will add many new features to the old-style datasets, but this is certainly good feedback for the ones in the prototype. As far as I understand, the new API has a mechanism for this. @pmeier Is supporting TorchVision's io within your current plans?
Yes, the new datasets will use this mechanism. There are still some design choices to be made about what a decoder is and what it should return. For example, how do we handle the case where more than one type of file needs to be decoded? Furthermore, how do we handle the case of multiple return values, for example the audio and video tensors after decoding a video?
The proposed change is simple, low risk, and immediately useful to hundreds of people working with the existing dataset APIs; it will take some time until the new ones are adopted by the community.
Unfortunately, I don't think the change is simple. The functionality you propose is implemented in the new-style datasets, but we already ran into trouble and I see more to come in the future. I've opened #5075 for a discussion of how we want to handle decoding going forward. While we will find a solution for the new-style datasets, I'm not eager to also maintain the same functionality in the old-style ones. From your request I gather that you only want to disable decoding, not use a custom decoder. If you don't care about the actual data and only use it for testing, can't you simply patch it out?

```python
from unittest import mock

from torchvision import datasets

dataset = datasets.VOCDetection(...)

print(dataset[0][0])

with mock.patch("torchvision.datasets.voc.Image.open"):
    print(dataset[0][0])
```
With the patch, iterating over the complete dataset takes ~1.5 seconds on my machine.
Yes, I have ways around it of course, as mentioned above, but it's cumbersome. What kind of trouble could there be if one just accepted a lambda that reads an image by path, or at least made it a semi-standard private method that is easy to patch?
As detailed in #5075:
About (1), in the context of the old-style datasets, one can start with accepting a loading callable in the constructor (see the sketch below). About (2): an old-style dataset may have a suitable default loading function that is used if None is passed or if …
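A minimal sketch of what such a constructor could look like; this is only an illustration, assuming a hypothetical `loader` parameter (not part of the current old-style API) that falls back to PIL-based decoding when `None` is passed:

```python
# Hypothetical sketch only: `loader` is not an argument of the existing
# old-style constructors; class and parameter names below are illustrative.
from typing import Any, Callable, Optional

from PIL import Image


def default_loader(path: str) -> Image.Image:
    # default behavior: decode the image with PIL
    return Image.open(path).convert("RGB")


class DetectionDatasetWithLoader:
    def __init__(self, image_paths: list, annotations: list,
                 loader: Optional[Callable[[str], Any]] = None):
        self.image_paths = image_paths
        self.annotations = annotations
        # fall back to the PIL-based decoding when nothing is passed
        self.loader = loader if loader is not None else default_loader

    def __getitem__(self, index: int):
        # a no-op loader (e.g. loader=lambda path: path) skips decoding entirely
        image = self.loader(self.image_paths[index])
        return image, self.annotations[index]

    def __len__(self) -> int:
        return len(self.image_paths)
```

A no-op loader would then give exactly the annotation-only access discussed later in this thread, without any patching.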
Our datasets are map-style datasets. Thus, they are fully indexed as soon as they are instantiated. What kind of information do you want to extract that is not available at that stage?
My point is that this solution will not work for all datasets, as detailed in the other issue. I don't want one-off solutions for a problem that already has a workaround and will be solved in a more general way in the future. I'm ok with a solution like you proposed in #4975, because it doesn't increase the API surface. Still, this new "feature" won't be available immediately. The next release is planned for early March. This is also the time we plan on moving the new-style datasets out of the prototype state.
The vast majority of code, packages, and projects in development use the old-style datasets. It's your call of course, but adding such small features, even if they don't "solve data loading", is useful and helpful for a lot of people.
What exactly is the use case? As stated above, the datasets are already indexed after they are instantiated. What kind of data do you want to extract? |
E.g. I want to access only the instance annotations without spending time on loading and decoding the image, i.e. I need an indexable dataset instance that's fast. Currently I hack around it, but it would be much better if the datasets were more amenable to this scenario (e.g. accepted an image loader as an argument in the constructor or used a private method that is easy to override).
If you regularly do this, wouldn't it be a lot better to simply store the annotations in an indexable way and compute everything you need based on that? That would also save you instantiating the dataset and decoding the annotations:

```python
import pickle
from unittest import mock

from torchvision import datasets

dataset = datasets.VOCDetection(...)

ann_file = "voc_annotations.pt"

with mock.patch("torchvision.datasets.voc.Image.open"):
    with open(ann_file, "wb") as file:
        pickle.dump(dict(enumerate([ann["annotation"] for _, ann in dataset])), file)

with open(ann_file, "rb") as file:
    anns = pickle.load(file)

anns[3141]
```
True, and this is why the new-style datasets will solve this.
If you are ok with patching a private method at runtime, I fail to see why my workaround from #4991 (comment) is too "cumbersome".
Introducing a whole mocking concept just to disable image loading is overkill IMO, compared to providing a super-simple extension point (method override/patch). It also requires knowing the mock framework to do this simple task. Of course, it's a matter of personal preference. If that's the recommended way, it would be best to have it in the docs, as the use case is frequent when writing evaluation code that relies on the existing annotation reading / transforms code in the dataset. Does the mock still apply if we just create an object in the …
All existing code and projects / examples / tutorials / university lab exercises use the old datasets and will continue to do so for a long time. New projects will probably benefit from the new functionality, but a lot of users will continue to use the old datasets for quite some time.
Caching always comes at the cost of keeping track of whether the cached file was created with the right hyper-params (e.g. transforms that may change), and lambdas can't be saved because of pickle issues. The coco dataset already has such a …
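As a concrete illustration of the "method override" extension point, here is a minimal sketch, assuming CocoDetection exposes a private `_load_image` helper (as recent torchvision versions do) and using placeholder paths; if the helper is named differently in a given release, the override target would need to change:

```python
# Sketch under the assumption that CocoDetection implements image decoding
# in a private _load_image method; paths below are placeholders.
from torchvision import datasets


class AnnotationsOnlyCoco(datasets.CocoDetection):
    def _load_image(self, id: int):
        # skip decoding entirely and return a placeholder instead of a PIL image
        return None


dataset = AnnotationsOnlyCoco(
    root="path/to/images",
    annFile="path/to/annotations.json",
)
_, target = dataset[0]  # annotations are available without paying the decode cost
```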
The mock stays in place until you exit the context manager. Afterwards you'll have the normal behavior again. You can also start and stop it without a context manager:

```python
patcher = mock.patch(...)
patcher.start()
...
patcher.stop()
```
The current plan is to remove the old datasets as soon as the new ones are stable, plus one deprecation release. Thus, if users want to use the latest releases, they will have to move eventually.
If you want to have such a method, I'm not going to oppose it if it is a 1-to-1 copy of the current behavior. I don't think there will be any changes to the old datasets that would break this in the future, but since it is a private method, I'm not going to guarantee it. Given that this feature is already requested in #4975, I'm going to close this issue.
🚀 The feature
This makes more sense now that torchvision is compiled with its own image reading functions (`read_image`), so an easier way to test pipelines without PIL would be nice. In addition, this would allow easily stubbing out image loading when wanted, solving #4975.
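For comparison, the folder-style datasets already accept a loader callable; a small sketch (paths are placeholders) of plugging torchvision's `read_image` into `ImageFolder` so the pipeline bypasses PIL for decoding:

```python
# ImageFolder already takes a `loader` callable; passing torchvision's native
# read_image makes samples come back as uint8 CHW tensors instead of PIL images.
from torchvision import datasets
from torchvision.io import read_image

dataset = datasets.ImageFolder(
    root="path/to/imagefolder",  # placeholder path
    loader=read_image,
)
image, label = dataset[0]
```

Note that transforms expecting PIL images would not apply directly to the tensor output in this setup.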
Motivation, pitch
N/A
Alternatives
No response
Additional context
No response
cc @pmeier