Read COCO dataset from ZIP file #950
Closed
51 commits
e7f6f66
Read COCO dataset images from its zipfile when it is there
koenvandesande 5aebeae
Also do it for CocoCaptions
koenvandesande 7803588
Move code into utils.py, remove the magic constants and import them i…
koenvandesande 3e5e03d
Add test for zip lookup class
koenvandesande 4e39618
Fix for Python versions < 3.6
koenvandesande 9a59666
Generalize to CelebA, move part of shared logic into VisionDataset
koenvandesande 0bc30f2
Fix import
koenvandesande a8b483a
flake8 fixes
koenvandesande 26d51d0
Simplify implementation of ZipLookup by not keeping file descriptor open
koenvandesande 29d7df8
Remove unused import
koenvandesande 46ceaf0
Support reading images from ZIP for Omniglot dataset
koenvandesande 3f10d56
Add common get_path_or_fp function
koenvandesande 534d35d
Forgot one spot
koenvandesande 26227de
Remove syntax unsupported by Python 2, replace argument with code tha…
koenvandesande 547b618
Delete _C.cp37-win_amd64.pyd
koenvandesande 559a5cf
Fixes and extra unit tests
741e3bb
Fixes and extra unit tests
koenvandesande f037fe9
Merge branch 'read_zipped_data' of github.com:koenvandesande/vision i…
koenvandesande c0d4dbf
Fix
koenvandesande 481d45c
Fix
koenvandesande 32b2311
Need to rewrite Omniglot ZIP-file because it uses compression
koenvandesande 255a6f9
Fix flake8
koenvandesande 8adf9af
Omniglot depends on pandas, and that is tested now in test_datasets
koenvandesande 7753710
Fix
koenvandesande bfa7510
Add extra check
koenvandesande afd2d04
Refactor
koenvandesande 2b7a044
Add test
koenvandesande f4f1905
Flake8
koenvandesande 05e8401
tqdm too old in Travis?
koenvandesande 5280d5a
For the 'smart' person who uses a symlink to their data, and then mod…
koenvandesande 28b74b9
Fix mistake
koenvandesande f0f0f5d
Merge branch 'master' into read_zipped_data
koenvandesande bbfd16a
Flake8
koenvandesande 39ea27a
Fix extension
koenvandesande 2ac26ce
Add ZippedImageFolder class which reads a zipped version of the data …
koenvandesande 629c851
Merge branch 'master' into read_zipped_data
koenvandesande 9396c35
Fix flake8
koenvandesande 48894bf
Update test_zippedfolder.py
koenvandesande 0fa8035
Fix test
koenvandesande c05281a
Fix omniglot
koenvandesande ef3ba78
0.4.0 packaging (#1212)
ezyang 66bc6f9
Don't build nightlies on 0.4.0 branch.
ezyang a1ed206
Refactor version suffix so conda packages don't get suffixes. (#1218)…
ezyang 7a8b133
Merge remote-tracking branch 'upstream/v0.4.0' into read_zipped_data
koenvandesande b8c2c5d
Merge remote-tracking branch 'upstream/master' into read_zipped_data
koenvandesande 8df35fa
Merge branch 'master' into read_zipped_data
koenvandesande d68ce83
Merge branch 'master' into read_zipped_data
koenvandesande 17de30d
Merge branch 'master' into read_zipped_data
koenvandesande 393cfd6
Update config.yml
koenvandesande f28b324
Remove EOL
koenvandesande 6247d96
Merge branch 'master' into read_zipped_data
koenvandesande
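One commit above notes that the Omniglot ZIP file had to be rewritten because it uses compression. The PR's own rewriting code is not shown here, so the following is only a minimal sketch of how an archive could be re-stored without compression using the standard `zipfile` module. The function name `rewrite_zip_stored` is hypothetical and not taken from the PR.

```python
import io
import zipfile


def rewrite_zip_stored(src, dst):
    """Copy every member of `src` into `dst` without compression.

    Hypothetical helper for illustration; `src` and `dst` may be
    paths or binary file-like objects.
    """
    with zipfile.ZipFile(src, "r") as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_STORED, allowZip64=True) as zout:
        for info in zin.infolist():
            # Reading decompresses the member; writing it back with
            # ZIP_STORED keeps the raw bytes in the new archive.
            zout.writestr(info.filename, zin.read(info),
                          compress_type=zipfile.ZIP_STORED)


if __name__ == "__main__":
    # Build a small compressed archive in memory, then re-store it.
    compressed = io.BytesIO()
    with zipfile.ZipFile(compressed, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("images/a.txt", b"example" * 50)
    compressed.seek(0)

    stored = io.BytesIO()
    rewrite_zip_stored(compressed, stored)
```

Note that this sketch does not preserve per-member timestamps or permissions; whether the PR needed to keep them is not stated.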
@@ -0,0 +1,50 @@
import unittest

import tempfile
import os
import shutil
import zipfile
from common_utils import get_tmp_dir

from torchvision.datasets import ZippedImageFolder
from torch._utils_internal import get_file_path_2


class Tester(unittest.TestCase):
    FAKEDATA_DIR = get_file_path_2(os.path.dirname(os.path.abspath(__file__)), 'assets', 'fakedata')

    def test_zipped_image_folder(self):
        temp_dir = tempfile.mkdtemp()
        temp_filename = os.path.join(temp_dir, "dataset.zip")
        try:
            with get_tmp_dir(src=os.path.join(Tester.FAKEDATA_DIR, 'imagefolder')) as root:
                classes = sorted(['a', 'b'])
                class_a_image_files = [os.path.join(root, 'a', file)
                                       for file in ('a1.png', 'a2.png', 'a3.png')]
                class_b_image_files = [os.path.join(root, 'b', file)
                                       for file in ('b1.png', 'b2.png', 'b3.png', 'b4.png')]

                zf = zipfile.ZipFile(temp_filename, "w", zipfile.ZIP_STORED, allowZip64=True)
                for dirname, subdirs, files in os.walk(root):
                    for filename in files:
                        zf.write(os.path.join(dirname, filename),
                                 os.path.relpath(os.path.join(dirname, filename), root))
                zf.close()

                dataset = ZippedImageFolder(root=temp_filename)
                for cls in classes:
                    self.assertEqual(cls, dataset.classes[dataset.class_to_idx[cls]])
                class_a_idx = dataset.class_to_idx['a']
                class_b_idx = dataset.class_to_idx['b']
                imgs_a = [(img_path.replace(root + os.path.sep, '').replace(os.path.sep, "/"), class_a_idx)
                          for img_path in class_a_image_files]
                imgs_b = [(img_path.replace(root + os.path.sep, '').replace(os.path.sep, "/"), class_b_idx)
                          for img_path in class_b_image_files]
                imgs = sorted(imgs_a + imgs_b)
                self.assertEqual(imgs, dataset.imgs)
        finally:
            shutil.rmtree(temp_dir)


if __name__ == '__main__':
    unittest.main()
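
The test above expects `dataset.imgs` to be a sorted list of `(relative path with "/" separators, class index)` pairs, with class names taken from the top-level folders in the archive. The implementation of `ZippedImageFolder` is not part of this diff, so the following is only a sketch of how such an index could be derived from ZIP member names with the standard library. The helper `index_zip_by_class` is hypothetical and is not the PR's code.

```python
import io
import zipfile


def index_zip_by_class(zip_source):
    """Return (classes, class_to_idx, samples) for a ZIP laid out as
    <class>/<file>. Hypothetical helper, for illustration only."""
    with zipfile.ZipFile(zip_source) as zf:
        # Keep only file members that sit inside a top-level folder.
        names = [n for n in zf.namelist()
                 if not n.endswith("/") and "/" in n]
    classes = sorted({n.split("/", 1)[0] for n in names})
    class_to_idx = {c: i for i, c in enumerate(classes)}
    samples = sorted((n, class_to_idx[n.split("/", 1)[0]]) for n in names)
    return classes, class_to_idx, samples


if __name__ == "__main__":
    # Build a tiny uncompressed archive in memory, mirroring the test layout.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for name in ("a/a1.png", "a/a2.png", "b/b1.png"):
            zf.writestr(name, b"")
    buf.seek(0)
    classes, class_to_idx, samples = index_zip_by_class(buf)
    print(classes, samples)
```

ZIP member names always use `/` as the separator, which is consistent with the test normalizing `os.path.sep` to `"/"` before comparing against `dataset.imgs`.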