Add Flowers102 dataset #5177
Merged
Changes from 8 commits
Commits
28 commits
53fd509
Add Flowers102 datasets
zhiqwang 79d0596
Fix initialization of images and labels
zhiqwang 3b7ce90
Fix _check_exists in Flowers102
zhiqwang 38df988
Add Flowers102 to datasets and docs
zhiqwang ba8889b
Add Flowers102TestCase to unittest
zhiqwang bf3e8d5
Fixing Python type statically
zhiqwang 8cf1bfd
Shuffle the fake labels
zhiqwang f3949ab
Merge branch 'main' into datasets/flowers-102
zhiqwang 4792f9e
Update test/test_datasets.py
zhiqwang fb3ae0d
Apply the suggestions by pmeier
zhiqwang d4b00a3
Use check_integrity to check file existence
zhiqwang b55568f
Save the labels to base_folder
zhiqwang 52b6bb8
Merged with upstream
zhiqwang 7fb9876
Minor fixes
zhiqwang 87cc4f1
Using a loop makes this more concise without reducing readability
zhiqwang d84399e
Using a loop makes this more concise without reducing readability
zhiqwang 6adabad
Remove self.labels and self.label_to_index attributes
zhiqwang 8618415
minor simplification
pmeier 2bb1ee6
Check the exitence of image folder
zhiqwang 9fef169
Revert the check
zhiqwang d8a343a
Check the existence of image folder
zhiqwang d3d0698
valid -> val
NicolasHug 7fa9c67
keep some stuff private
NicolasHug ce957c6
minor doc arrangements
NicolasHug a5b701e
remove default FEATURE_TYPES
NicolasHug 4b21a2f
Simplify the datasets existence
zhiqwang 53ad2c9
check if the image folder exists
zhiqwang 0791dfc
isdir -> is_dir
NicolasHug
@@ -0,0 +1,116 @@
from pathlib import Path
from typing import Any, Tuple, Callable, Optional

import numpy as np
import PIL.Image

from .utils import verify_str_arg, download_and_extract_archive, download_url
from .vision import VisionDataset


class Flowers102(VisionDataset):
    """`Oxford 102 Flower <https://www.robots.ox.ac.uk/~vgg/data/flowers/102/>`_ Dataset.

    .. warning::

        This class needs `scipy <https://docs.scipy.org/doc/>`_ to load target files from `.mat` format.

    Oxford 102 Flower is an image classification dataset consisting of 102 flower categories. The
    flowers were chosen to be flowers commonly occurring in the United Kingdom. Each class consists of
    between 40 and 258 images.

    The images have large scale, pose and light variations. In addition, there are categories that
    have large variations within the category and several very similar categories.

    Args:
        root (string): Root directory of the dataset.
        split (string, optional): The dataset split, supports ``"train"`` (default), ``"val"``, or ``"test"``.
        transform (callable, optional): A function/transform that takes in a PIL image and returns a
            transformed version. E.g., ``transforms.RandomCrop``.
        target_transform (callable, optional): A function/transform that takes in the target and transforms it.
    """

    def __init__(
        self,
        root: str,
        split: str = "train",
        download: bool = True,
        transform: Optional[Callable] = None,
        target_transform: Optional[Callable] = None,
    ) -> None:
        super().__init__(root, transform=transform, target_transform=target_transform)
        self._split = verify_str_arg(split, "split", ("train", "val", "test"))
        self._base_folder = Path(self.root) / "flowers-102"
        self._meta_folder = self._base_folder / "labels"
        self._images_folder = self._base_folder / "jpg"

        if download:
            self._download()

        if not self._check_exists():
            raise RuntimeError("Dataset not found. You can use download=True to download it")

        self._labels = []
        self._image_files = []

        from scipy.io import loadmat

        # Read the label ids
        label_mat = loadmat(self._meta_folder / "imagelabels.mat")
        labels = label_mat["labels"][0]

        self.classes = np.unique(labels).tolist()
        self.class_to_idx = dict(zip(self.classes, range(len(self.classes))))

        # Read the image ids
        set_ids = loadmat(self._meta_folder / "setid.mat")
        splits_map = {"train": "trnid", "val": "valid", "test": "tstid"}

        image_ids = set_ids[splits_map[self._split]][0]

        for image_id in image_ids:
            self._labels.append(self.class_to_idx[labels[image_id - 1]])
            self._image_files.append(self._images_folder / f"image_{image_id:05d}.jpg")

    def __len__(self) -> int:
        return len(self._image_files)

    def __getitem__(self, idx) -> Tuple[Any, Any]:
        image_file, label = self._image_files[idx], self._labels[idx]
        image = PIL.Image.open(image_file).convert("RGB")

        if self.transform:
            image = self.transform(image)

        if self.target_transform:
            label = self.target_transform(label)

        return image, label

    def extra_repr(self) -> str:
        return f"split={self._split}"

    def _check_exists(self) -> bool:
        return all(folder.exists() and folder.is_dir() for folder in (self._meta_folder, self._images_folder))

    def _download(self) -> None:
        if self._check_exists():
            return

        download_and_extract_archive(
            "https://www.robots.ox.ac.uk/~vgg/data/flowers/102/102flowers.tgz",
            download_root=str(self._base_folder),
            md5="52808999861908f626f3c1f4e79d11fa",
        )

        download_url(
            "https://www.robots.ox.ac.uk/~vgg/data/flowers/102/setid.mat",
            str(self._meta_folder),
            md5="a5357ecc9cb78c4bef273ce3793fc85c",
        )

        download_url(
            "https://www.robots.ox.ac.uk/~vgg/data/flowers/102/imagelabels.mat",
            str(self._meta_folder),
            md5="e0620be6f572b9609742df49c70aed4d",
        )