train_tensors in BaseDataset is mis-interpreted #356

Closed
nabenabe0928 opened this issue Dec 6, 2021 · 1 comment
Labels
fix-later Issues that have been close for a review later

Comments

@nabenabe0928
Collaborator

nabenabe0928 commented Dec 6, 2021

This is a continuation of issue #352.

If I understand the code correctly, we assume that train_tensors[0] is always the feature tensor and train_tensors[1] is always the label tensor.
However, Dataset in torch assumes the following structure:

In [1]: import torchvision.datasets

In [2]: train_tensors = torchvision.datasets.CIFAR10('cifar10/')

In [3]: train_tensors[0]
Out[3]: (<PIL.Image.Image image mode=RGB size=32x32 at 0x7F8B6DAD3370>, 6)

In [4]: train_tensors[1]
Out[4]: (<PIL.Image.Image image mode=RGB size=32x32 at 0x7F8B6DAD33A0>, 9)

This indicates that the __getitem__ of Dataset (and of its child classes as well) returns:

train_tensors[i] := the i-th instance in the given dataset

The mismatch does not matter unless we use torch utilities, but those utilities will cause many issues when we try to merge image tasks.
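To make the two interpretations concrete, here is a minimal sketch in plain Python. It does not import torch; `MapStyleDataset` is a hypothetical stand-in that only mimics the `__getitem__`/`__len__` protocol shown in the CIFAR10 session above, and the feature and label values are made up for illustration.

```python
from typing import List, Tuple


class MapStyleDataset:
    """Hypothetical stand-in: indexing with i returns the i-th (x, y) pair."""

    def __init__(self, features: List[List[float]], labels: List[int]) -> None:
        assert len(features) == len(labels)
        self.features = features
        self.labels = labels

    def __len__(self) -> int:
        return len(self.features)

    def __getitem__(self, i: int) -> Tuple[List[float], int]:
        return self.features[i], self.labels[i]


# Made-up toy data, not taken from any real dataset.
features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
labels = [6, 9, 4]

# Interpretation A: a plain (features, labels) pair.
as_pair = (features, labels)
# Interpretation B: a map-style dataset indexed by instance.
as_dataset = MapStyleDataset(features, labels)

first_of_pair = as_pair[0]        # all feature rows
first_of_dataset = as_dataset[0]  # only the first instance and its label
```

The same expression, `obj[0]`, yields the whole feature collection under interpretation A but a single (feature, label) instance under interpretation B, which is the ambiguity described above.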

@ravinkohli
Contributor

ravinkohli commented Dec 10, 2021

Hey, actually we do follow the structure of a Dataset as defined in torch (see here). For image datasets specifically, FilePathDataset also follows that structure (see here). I don't think we will have issues when we try to merge image tasks. Moreover, if this were a problem, we would already have seen it for tabular tasks, since we use a torch DataLoader there and it works fine. We may still have issues with other parts of the ImageDataset code, but we can look at those when we start working with images.
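The DataLoader argument can be illustrated with a rough sketch of how a loader consumes a dataset that is indexed per instance. This is plain Python in the spirit of a batch loader, not torch's actual implementation; `iterate_batches` and the toy data are hypothetical names and values introduced only for this example.

```python
from typing import Iterator, List, Sequence, Tuple

Sample = Tuple[List[float], int]


def iterate_batches(
    dataset: Sequence[Sample], batch_size: int
) -> Iterator[Tuple[List[List[float]], List[int]]]:
    """Index the dataset one instance at a time and collate each batch."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        # Each dataset[i] is one (x, y) instance.
        samples = [dataset[i] for i in range(start, stop)]
        # Collate: turn a list of (x, y) pairs into (batch of x, batch of y).
        xs, ys = zip(*samples)
        yield list(xs), list(ys)


# Made-up toy data; any object with __len__ and per-instance __getitem__ works.
toy = [([0.1, 0.2], 6), ([0.3, 0.4], 9), ([0.5, 0.6], 4)]
batches = list(iterate_batches(toy, batch_size=2))
```

Because the loop only ever calls `dataset[i]` for an integer `i`, it works only if the dataset returns one instance per index, which is consistent with the point that a working DataLoader implies the per-instance structure is already being followed.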

ravinkohli added the fix-later label on Aug 17, 2022