Dataset.reduce methods #137
else:
    dims = set(dimension)

variables = {}
Let's make this OrderedDict() instead of an unordered dictionary, just so the result will be less surprising (that is, with variables in the same order as the original).
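A minimal sketch of the ordering point, using plain Python containers rather than the project's own classes. The helper name `reduce_variables` and the sample variable names are hypothetical and only illustrate that an OrderedDict filled while iterating the original keeps the original order:

```python
from collections import OrderedDict


def reduce_variables(variables, func):
    """Apply func to every value, keeping the input's variable order.

    Hypothetical helper for illustration; not part of the pull request.
    """
    reduced = OrderedDict()
    for name, values in variables.items():
        reduced[name] = func(values)
    return reduced


# Sample data (made up): three variables in a deliberate order.
original = OrderedDict([
    ('tmin', [1, 2, 3]),
    ('tmax', [4, 5, 6]),
    ('precip', [0, 0, 1]),
])
result = reduce_variables(original, min)
```

Here `list(result)` lists the variables in the same order as `original`, which is the behavior the comment asks for.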
Very nice start! Please also add a
self.assertDatasetEqual(data.min(dimension=['dim1']),
                        data.min(dimension='dim1'))
Please add a test for dimension=[]:
self.assertDatasetEqual(data.mean(dimension=[]), data)
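The reason this edge case matters can be sketched as follows. This is an illustration under assumptions, not the code in the pull request: the helper `normalize_dims` is hypothetical, and it assumes that the default (None) means "reduce over every dimension". Under that assumption, an empty list must stay distinct from None so that it reduces over nothing:

```python
def normalize_dims(dimension, all_dims):
    """Turn the `dimension` argument into a set of dimension names.

    Hypothetical helper. Assumes None means "all dimensions".
    """
    if dimension is None:
        return set(all_dims)
    if isinstance(dimension, str):
        # A single name; without this check, set('dim1') would split
        # the string into individual characters.
        return {dimension}
    return set(dimension)


all_dims = ['time', 'dim1', 'dim2']
from_string = normalize_dims('dim1', all_dims)
from_list = normalize_dims(['dim1'], all_dims)
from_empty = normalize_dims([], all_dims)
from_default = normalize_dims(None, all_dims)
```

A check such as `if not dimension:` would treat [] like None and reduce over everything, which is exactly what the requested test guards against.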
actual = data.min(dimension=reduct).dimensions
self.assertItemsEqual(actual, expected)

data.__delitem__('time')  # removes unused time dim/var that is dropped
This suggests another reason why it might make more sense to loop over variables (and handle coordinates explicitly) instead of only looping over noncoordinates: it's kind of weird to lose a dimension that wasn't summed over.
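A rough sketch of the alternative being suggested, tracking only dimension names and no array data. The function `dims_after_reduce` and the sample variables are hypothetical; the assumed rule is that a coordinate is dropped only when its own dimension is reduced, while every other variable simply loses the reduced dimensions:

```python
from collections import OrderedDict


def dims_after_reduce(variables, coordinates, reduce_over):
    """Return the dimensions each surviving variable has after a reduction.

    Hypothetical bookkeeping-only helper; `variables` maps name -> dims.
    """
    reduce_over = set(reduce_over)
    result = OrderedDict()
    for name, dims in variables.items():
        if name in coordinates:
            # Coordinates are handled explicitly: keep them untouched
            # unless their own dimension is being reduced away.
            if name not in reduce_over:
                result[name] = dims
        else:
            result[name] = tuple(d for d in dims if d not in reduce_over)
    return result


variables = OrderedDict([
    ('time', ('time',)),
    ('dim1', ('dim1',)),
    ('dim2', ('dim2',)),
    ('var1', ('dim1', 'dim2')),
])
coordinates = {'time', 'dim1', 'dim2'}
remaining = dims_after_reduce(variables, coordinates, ['dim1'])
```

With this loop, 'time' survives a reduction over 'dim1' even though no noncoordinate variable uses it, so a test would not need to delete it beforehand.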
dims = set(dimension)

if any([True for dim in dims if dim not in self.coordinates]):
    bad_dims = [dim for dim in dims if dim not in self.coordinates]
You could simply make this:
bad_dims = [dim for dim in dims if dim not in self.coordinates]
if bad_dims:
    raise ValueError
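As a standalone sketch of that simplification: an empty list is falsy, so the list of bad dimensions can serve as its own check. The function name `check_dims` and the error message text are assumptions made for illustration, not the wording used in the pull request:

```python
def check_dims(dims, coordinates):
    """Raise ValueError if any requested dimension is unknown.

    Hypothetical helper; the message wording is an assumption.
    """
    bad_dims = [dim for dim in dims if dim not in coordinates]
    if bad_dims:
        raise ValueError('unknown dimensions: {0}'.format(bad_dims))


coordinates = ['time', 'dim1', 'dim2']

# Known dimensions pass silently.
check_dims({'dim1'}, coordinates)

# An unknown dimension is reported by name.
message = None
try:
    check_dims({'dim1', 'bogus'}, coordinates)
except ValueError as err:
    message = str(err)
```

This computes the comprehension once instead of twice and avoids building a throwaway list of True values.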
This is getting pretty close, but I would like more comprehensive tests to be confident that it is working properly:
@shoyer - re. the two tests you requested.
'{0}'.format(bad_dims))

variables = OrderedDict()
for name, da in iteritems(self.variables):
To make this slightly less dissonant, let's rename da to something like var, which doesn't suggest this is a DataArray variable.
`f(x, axis=axis, **kwargs)` to return the result of reducing an
np.ndarray over an integer valued axis.
dimension : str or sequence of str, optional Dimension(s) over which
to apply `func`. If `dimension`(default=None) `func` is applied
"If dimension (default=None)" doesn't quite make sense to me. How about replacing that with just "By default"?
OK, this looks good to me now! Since the history is a little messy (given the number of revisions), could you please squash this into a single commit so that I can merge?
Alright, the rebase/squash is done.
Thanks!
A first attempt at implementing Dataset reduction methods.
#131