-
-
Notifications
You must be signed in to change notification settings - Fork 329
Zarr DirectoryStore: getsize is wrong #253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Thanks Thomas, currently on leave but will follow up next week.
…On Wed, 4 Apr 2018, 15:35 Thomas Kriechbaumer, ***@***.***> wrote:
https://github.com/zarr-developers/zarr/blob/3c8e9291e96e090e20caf6950da69a3cc180652f/zarr/storage.py#L864-L871
This code does not account for subfolders created by this example:
store = zarr.TempStore(dir='/tmpfs')for name, values in data.items():
zarr.array(values, store=store, path=name, compressor=compressor, filters=filter)
total_size = store.getsize()
total_size is 26, (24+2 for .zattrs and .zgroup), which is wrong.
Subfolders are not accounted for.
My TempStore has the following structure:
.zattrs
.group
name1/
1
2
3
name2/
1
2
3
The file sizes of name1 and name2 are unaccounted for.
Tested with Zarr 2.2.0.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#253>, or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAq8QlwB9pNbyPcTsylV21CTnmH_XefSks5tlNo8gaJpZM4TG6oy>
.
|
Just to say the getsize() method was originally added to provide information about stored size of a single array, and so recursing into subdirectories was not needed. However I think it would be better in general if it did report recursive size. I don't have time to work on this right now but a PR would be welcome. Also if this was changed to recursive size in DirectoryStore then might want to also review implementation in NestedDirectoryStore and ZipStore classes. |
So I guess |
Yes |
It would be pretty simple to change this to If we want to get fancier, we use |
PR ( #362 ) should fix this. Just used |
When looking into this further, it seems that |
@alimanfoo, do you have any thoughts on the comment above? Basically the current behavior seems intentional (as it is tested) so am trying to understand how we should proceed here. |
Sorry for slow follow up. FWIW I think it would be reasonable to change the behaviour and make it recursive. Should not affect results for reporting stored size of an array in a DirectoryStore, as there shouldn't be any sub-directories there anyway. (E.g., I often call |
Sorry I missed your comment, @alimanfoo. Thanks for the clarification. Happy to break the current tests as well. Just wanted to make sure there wasn't something I was missing. Will try to update PR ( #362 ) when I have a chance. |
Is the only way to get the current estimated size by parsing the returned value of |
Dear all, Are there any updates on this functionality (getting size of subfolders and arrays recursively)? Thank you. |
https://github.com/zarr-developers/zarr/blob/3c8e9291e96e090e20caf6950da69a3cc180652f/zarr/storage.py#L864-L871
This code does not account for subfolders created by this example:
total_size
is26
, (24+2 for.zattrs
and.zgroup
), which is wrong. Subfolders are not accounted for.My TempStore has the following structure:
The file sizes of name1 and name2 are unaccounted for.
Tested with Zarr 2.2.0.
The text was updated successfully, but these errors were encountered: