Modular encodings (rebased) #245
Restructured Backends to make CF convention application more consistent.

Amongst other things this includes:
- EncodedDataStores, which can wrap other stores and allow for modular encoding/decoding.
- Trivial indices such as ds['x'] = ('x', np.arange(10)) are no longer stored on disk and are only created when accessed.
- AbstractDataStore API change. Shouldn't affect external users.
- missing_value attributes now function like _FillValue.

All current tests are passing (though it could use more new ones).

Post rebase notes (shoyer, Oct 2, 2014):
Most tests are passing, though a couple are broken:
- test_roundtrip_mask_and_scale (because this change needs a fix to not break the current API)
- test_roundtrip_strings_with_fill_value on TestCFEncodedDataStore (I don't entirely understand why, let's come back to it later)
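As a rough illustration of the "wrap other stores" idea, here is a minimal sketch of a store that encodes on write and decodes on read by delegating to an inner store. The class names `InMemoryStore` and `ScaledStore` and the `scale_factor` parameter are hypothetical and are not taken from this PR; plain dicts and lists stand in for real variables.

```python
# Hypothetical sketch: these classes are NOT the PR's implementation.
class InMemoryStore:
    """Toy store that keeps variables in a plain dict."""

    def __init__(self):
        self.variables = {}

    def store(self, variables):
        self.variables.update(variables)

    def load(self):
        return dict(self.variables)


class ScaledStore:
    """Wraps another store: divides by a scale factor on write, multiplies on read."""

    def __init__(self, wrapped, scale_factor):
        self.wrapped = wrapped
        self.scale_factor = scale_factor

    def store(self, variables):
        # Encode, then hand the encoded values to the wrapped store.
        encoded = {name: [v / self.scale_factor for v in values]
                   for name, values in variables.items()}
        self.wrapped.store(encoded)

    def load(self):
        # Read from the wrapped store, then undo the encoding.
        return {name: [v * self.scale_factor for v in values]
                for name, values in self.wrapped.load().items()}
```

Because the wrapper only relies on `store` and `load`, wrappers of this kind could in principle be stacked, which is one reading of "modular" here.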
closing this for now until the discussion in #175 is resolved.
a model where encoding/decoding happens when a dataset is stored to or loaded from a DataStore. Conventions can now be enforced at the DataStore level by overriding the DataStore.store() and DataStore.load() methods, or as an optional arg to Dataset.load_store, Dataset.dump_to_store. Includes miscellaneous cleanup.
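A minimal sketch of enforcing a convention by overriding `store()` and `load()` on a subclass. Everything here is an assumption made for illustration: the toy `DataStore` base class, the `FillValueStore` subclass, and the `-9999.0` sentinel are hypothetical and do not reflect the real signatures in this PR.

```python
import math


class DataStore:
    """Toy base class; the real store/load signatures may differ."""

    def __init__(self):
        self._variables = {}

    def store(self, variables):
        self._variables.update(variables)

    def load(self):
        return dict(self._variables)


class FillValueStore(DataStore):
    """Hypothetical subclass: writes NaN as a sentinel and restores it on load."""

    FILL = -9999.0  # assumed sentinel, chosen only for this example

    def store(self, variables):
        encoded = {name: [self.FILL if math.isnan(v) else v for v in values]
                   for name, values in variables.items()}
        super().store(encoded)

    def load(self):
        raw = super().load()
        return {name: [math.nan if v == self.FILL else v for v in values]
                for name, values in raw.items()}
```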
'supported on Python 2.6')
try:
    store = backends.ScipyDataStore(gzip.open(nc), *args, **kwargs)
except TypeError, e:
the more modern syntax is except TypeError as e
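For reference, `except TypeError, e:` is a syntax error on Python 3, while the `as` form works on Python 2.6+ and Python 3. A small self-contained example of the `as` form (the function `describe_length` is made up for illustration):

```python
def describe_length(obj):
    """Return len(obj), or a message if obj has no length."""
    try:
        return len(obj)
    except TypeError as e:  # valid on Python 2.6+ and Python 3
        return "no length: {0}".format(e)
```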
Is there a reason you switched to using accessors (e.g.,
re: using accessors such as get_variables. To be fair, this pull request didn't really switch the behavior; it was already very similar but with different names (store_variables, open_store_variable), which are now changed to get_variables etc. I chose to continue with getters and setters because it makes it fairly clear what needs to be implemented to make a new DataStore, and allows for some post-processing such as _decode_variable_name and encoding/decoding. It's not entirely clear to me how that would fit into a properties-based approach, so that sounds like a good follow-up pull request to me.
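A rough sketch of the getter-based design described above: the base class declares what a backend must implement and applies shared post-processing in one place. Only the names `get_variables` and `_decode_variable_name` come from this thread; `get_attributes`, `load`, `AbstractStore`, `PrefixedDictStore`, and the `var_` prefix are assumptions made for the example.

```python
class AbstractStore:
    """Subclasses implement the getters; shared post-processing lives here."""

    def get_variables(self):
        raise NotImplementedError

    def get_attributes(self):
        raise NotImplementedError

    def _decode_variable_name(self, name):
        # Default: names are used exactly as stored.
        return name

    def load(self):
        variables = {self._decode_variable_name(name): value
                     for name, value in self.get_variables().items()}
        return variables, dict(self.get_attributes())


class PrefixedDictStore(AbstractStore):
    """Toy backend whose stored names carry a 'var_' prefix (hypothetical)."""

    def __init__(self, raw_variables, attributes):
        self._raw = raw_variables
        self._attrs = attributes

    def get_variables(self):
        return self._raw

    def get_attributes(self):
        return self._attrs

    def _decode_variable_name(self, name):
        return name[len("var_"):] if name.startswith("var_") else name
```

A new backend only has to fill in the two getters; the name decoding hook runs for every backend through `load()`.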
K, fine by me.
self._encoder_args = args
self._encoder_kwdargs = kwdargs

def store(self, variables, attributes):
can we move this method to a mixin class (something like AlwaysWriteCFEncoded) and use multiple inheritance to add it to NetCDF4DataStore and ScipyDataStore?
Doing that would get a little awkward since the encoding is now embedded within a store, so arguments to the encoder are passed into the DataStore constructor. As a result the mixin class would need to implement a constructor which stored arguments which are intended for the encoder, and the DataStore that extends the mixin would need to distinguish between arguments intended for the store/mixin. Doing this in any sort of automated way leads to the nasty bit of logic (encoding_decorator) that I just removed.
One alteration which might make that all less nasty is to have any encoding/decoding arguments be directly passed into DataStore store/load. Though, in the interest of not further bloating this PR, I'd like to save that for a future change.
Yes, I do like the idea of passing encoding/decoding arguments directly into store/load.
On second thought, I do agree with you -- this is not enough redundant code to worry about factoring out.
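The alternative floated in this thread, passing encoding/decoding arguments directly to store/load instead of through the constructor, might look roughly like the sketch below. The `ToyStore` class, the `encode`/`decode` helpers, and the `add_offset` keyword are all hypothetical and chosen only to keep the example small.

```python
def encode(values, add_offset=0):
    """Toy encoder: subtract an offset before writing."""
    return [v - add_offset for v in values]


def decode(values, add_offset=0):
    """Toy decoder: add the offset back after reading."""
    return [v + add_offset for v in values]


class ToyStore:
    """Constructor takes no encoder arguments; they arrive per call instead."""

    def __init__(self):
        self._variables = {}

    def store(self, variables, **encoder_kwargs):
        for name, values in variables.items():
            self._variables[name] = encode(values, **encoder_kwargs)

    def load(self, **decoder_kwargs):
        return {name: decode(values, **decoder_kwargs)
                for name, values in self._variables.items()}
```

With this shape the store never has to separate its own constructor arguments from the encoder's, which is the awkwardness described for the mixin approach.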
if variables is None:
    self._variables = OrderedDict()
else:
    self._variables = variables
could also put these on one line each:
self._variables = OrderedDict() if variables is None else variables
self._attributes = OrderedDict() if attributes is None else attributes
should be good to go once you get all the tests passing.
b33b747 to ad7df2b
@akleeman I went ahead and pushed fixes for the failing tests. Feel free to merge when you're ready.
Merged, so I can do some more encodings work on top of this.
This change is rebased on master and should let us pick up from #175. CC @akleeman
Restructured Backends to make CF convention application more consistent.