-
-
Notifications
You must be signed in to change notification settings - Fork 329
Add visitor pattern methods #122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add visitor pattern methods #122
Conversation
Only has doctests ATM. Could use some unit tests as well. That said, please let me know what you think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Only very minor suggestions.
zarr/hierarchy.py
Outdated
|
||
Examples | ||
-------- | ||
>>> from __future__ import print_function, unicode_literals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is needed in the example. For Zarr docstrings are standardised to PY3(6).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I figured it couldn't hurt.
zarr/hierarchy.py
Outdated
>>> g2 = g1.create_group('foo') | ||
>>> g3 = g1.create_group('bar') | ||
>>> d1 = g1.create_dataset('baz', shape=100, chunks=10, fill_value=0) | ||
>>> d2 = g1.create_dataset('quux', shape=200, chunks=20, fill_value=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use 'foo/quux' as the dataset name or something like that, to show what happens when you have a hierarchy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do that. Was thinking about doing something similar. Though the tests explore this as well.
May also drop datasets (or at least disable compression) as the compression levels seem to vary, which would make the doctest annoying to keep.
zarr/hierarchy.py
Outdated
>>> g2 = g1.create_group('foo') | ||
>>> g3 = g1.create_group('bar') | ||
>>> d1 = g1.create_dataset('baz', shape=100, chunks=10, fill_value=0) | ||
>>> d2 = g1.create_dataset('quux', shape=200, chunks=20, fill_value=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again suggest to have an example showing what happens with a hierarchy, e.g., 'foo/quux'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zarr/hierarchy.py
Outdated
@@ -414,6 +416,78 @@ def arrays(self): | |||
chunk_store=self._chunk_store, | |||
synchronizer=self._synchronizer) | |||
|
|||
def visit(self, func): | |||
"""Run callable on each object name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest to replace callable
with `func`
. Also maybe worth adding a note about how the return value is handled, e.g., from h5py source: "Returning None continues iteration; returning anything else aborts iteration and returns that value."
zarr/hierarchy.py
Outdated
return value | ||
|
||
def visititems(self, func): | ||
"""Run callable on each object name and object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for visit()
, suggest to replace callable
with `func`
and adding a note about how the return value is handled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zarr/hierarchy.py
Outdated
|
||
Examples | ||
-------- | ||
>>> from __future__ import print_function, unicode_literals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed I don't think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
zarr/hierarchy.py
Outdated
def _visit(obj, func=func): | ||
keys = sorted(getattr(obj, "keys", lambda : [])()) | ||
for each_key in keys: | ||
for value in _visit(obj[each_key], func=func): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think there's a call to func(each_key)
missing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was playing with stuff and pushed a bad commit. Unfortunately have been AFK until now. Should be fixed.
zarr/hierarchy.py
Outdated
|
||
def _visit(obj, func=func): | ||
keys = sorted(getattr(obj, "keys", lambda : [])()) | ||
for each_key in keys: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing call to func(each_key, obj[each_key])
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
Ok, addressed comments and fixed bugs. This is ok on my end. |
Something weird happening with CI, maybe related to install of appdirs? Try upgrading to |
Yeah, this cropped up with the release of 1.4.1. Have raised upstream to see if they have any idea. xref: ActiveState/appdirs#89 |
Put together PR ( https://github.com/alimanfoo/zarr/pull/124 ) to try and fix the CI issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of small things, otherwise looks good.
zarr/hierarchy.py
Outdated
|
||
Examples | ||
-------- | ||
>>> from __future__ import print_function, unicode_literals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you remove this, just for consistency with other docstrings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, done.
zarr/hierarchy.py
Outdated
|
||
Examples | ||
-------- | ||
>>> from __future__ import print_function, unicode_literals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also done.
zarr/hierarchy.py
Outdated
|
||
""" | ||
|
||
def _visit(obj): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now a duplicate of _visit above. Suggest to factor out into a single private function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, was actually thinking of refactoring it differently though. I think we could have a visitvalues
function that both visit
and visititems
use. Have a WIP branch for this, but it isn't quite done yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have pushed a refactoring using visitvalues
, which seems to work nicely.
Also made changes so that all of these will show up in the docs. Built the docs locally to make sure. |
Thanks for the changes, looks good. So I'm just reading the h5py docs, and apparently the name passed into the visiting function "will be the name of the object relative to the current group". As currently coded, this PR will pass the object path, which is the name relative to the root group. I guess if we are aiming for h5py compatibility where possible then we need to follow the same behaviour? Unless you have a reason to do it differently? |
Just to make sure we are on the same page, by current group, I believe they mean the group we called That's a good point. Thanks for raising it. This was definitely an oversight on my part. Not intentional. Do you have any suggestions for how we should do this? If not, I'm thinking we could just strip out that first |
No problem, yes that's how I read the docs, name is supposed to be relative to the group we called |
Provides `visit` and `visititem` that behave identically to their h5py counterparts. This should make it easier to traverse the full hierarchy of a Zarr group. Also should make it easier for users with h5py code to try out using Zarr with fewer changes.
This method seems to be at the core of `visit` and `visititems`. So it makes sense to refactor it out. Though it also provides usable functionality of its own. So it makes sense to expose it as part of the API too.
Alright, I have now updated the |
Great, thank you! |
Thanks for the good review. |
Fixes https://github.com/alimanfoo/zarr/issues/92
Provides
visit
andvisititem
that behave identically to their h5py counterparts. This should make it easier to traverse the full hierarchy of a Zarr group. Also should make it easier for users with h5py code to try out using Zarr with fewer changes.