Skip to content

Remove the encoding attribute from xray.DataArray? #628

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
shoyer opened this issue Oct 19, 2015 · 4 comments
Closed

Remove the encoding attribute from xray.DataArray? #628

shoyer opened this issue Oct 19, 2015 · 4 comments

Comments

@shoyer
Copy link
Member

shoyer commented Oct 19, 2015

As described in the dev version of our documentation on encoding, we now support a keyword argument for controlling how netCDF files are written to disk with to_netcdf:
http://xray.readthedocs.org/en/latest/io.html#reading-encoded-data

We still retain the feature that there is an "encoding" dictionary that sticks around with xray Variable objects, which stores how they were compressed/encoded on disk. This can be occasionally handy. It means, for example, that we always write out netCDF files with the same time units as the files we read from disk.

It might make sense to eliminate this feature for the sake of significantly simplifying xray's internal data model. For cases where it really matters, users can now use the encoding keyword argument to to_netcdf instead. This would leave three attributes on the Variable object: dims, _data and attrs.

Thoughts?

@jhamman
Copy link
Member

jhamman commented Oct 19, 2015

@shoyer - It would be nice to simplify the Variable object, however, until NumPy can support custom datatypes, I don't think this is very practical. How would you propose we round trip the coordinates and time variables? Reading a time variable with a noleap calendar and writing it out as a standard calendar seems like a bad idea.

In the climate modeling world, non-standard calendars are ubiquitous and I know of many people that have come to xray because we (partially) support those calendars. Leaving the encoding attribute/dictionary in Variable objects seems necessary to continue supporting the netCDF variable/attribute relationships.

@shoyer
Copy link
Member Author

shoyer commented Oct 19, 2015

I agree that reading a noleap calendar and writing it as standard seems like a very bad idea. What about simply leaving the calendar attribute intact when reading from netCDF files?

I'm less sure about what a good solution looks like for preserving variable specific coordinates. I know you just added that fix, but what are the actual use cases for that information? If coordinates are do not apply to all variables in your dataset, then perhaps an xray.Dataset is not the right model for your problem.

@jhamman
Copy link
Member

jhamman commented Oct 21, 2015

Certainly, if we get rid of the encoding attribute, we need to keep the time attributes intact. That is a fine solution in my view.

I'm not tied to my fix for the coordinates attribute. The balance we need to strike is to not go too far in supporting everything in the CF conventions while not working against supported features in the conventions.

@stale
Copy link

stale bot commented Jan 30, 2019

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity
If this issue remains relevant, please comment here; otherwise it will be marked as closed automatically

@stale stale bot added the stale label Jan 30, 2019
@stale stale bot closed this as completed Mar 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants