Expose more information in DataArray.__repr__ #128

Merged on May 29, 2014 (1 commit)

@shoyer shoyer commented May 14, 2014

This PR changes the `DataArray` representation so that it displays more of the
information associated with a data array:

- "Coordinates" are indicated by their name and the `repr` of the
  corresponding pandas.Index object (to indicate how they are used as
  indices).
- "Linked" dataset variables are also listed.
   * These are other variables in the dataset associated with a DataArray
     which are also indexed along with the DataArray.
   * They are accessible from the `dataset` attribute or by indexing the data
     array with a string.
   * Perhaps their most convenient aspect is that they enable [`groupby`
     operations by
     name](http://xray.readthedocs.org/en/latest/tutorial.html#apply) for
     DataArray objects.
   * This is an admittedly somewhat confusing (though convenient) notion that I
     am considering
     [removing](https://github.com/xray-pydata/xray/issues/117), but if we
     don't remove them we should certainly expose their existence more
     clearly, given the potential benefits in expressiveness and costs in
     performance.
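
To make the shape of the proposed output concrete, here is a minimal sketch of a repr that lists coordinates by name alongside the `repr` of their index, followed by the linked variables. `ToyDataArray` and all of its attributes are hypothetical stand-ins for illustration, not the xray implementation, and a plain tuple takes the place of a `pandas.Index`.

```python
class ToyDataArray:
    """Hypothetical stand-in for a DataArray; not the xray implementation."""

    def __init__(self, name, shape, coordinates, linked=()):
        self.name = name
        self.shape = shape
        # Maps dimension name -> index-like object (a pandas.Index in xray;
        # a plain tuple here to keep the sketch dependency-free).
        self.coordinates = dict(coordinates)
        # Names of other dataset variables indexed along with this array.
        self.linked = list(linked)

    def __repr__(self):
        dims = ", ".join(
            f"{dim}: {size}" for dim, size in zip(self.coordinates, self.shape)
        )
        lines = [f"<ToyDataArray {self.name!r} ({dims})>", "Coordinates:"]
        # Show each coordinate by name with the repr of its index object.
        lines += [f"    {dim}: {index!r}" for dim, index in self.coordinates.items()]
        lines.append("Linked dataset variables:")
        lines += [f"    {name}" for name in self.linked] or ["    None"]
        return "\n".join(lines)


arr = ToyDataArray(
    "temperature",
    (3,),
    {"time": ("2014-05-14", "2014-05-15", "2014-05-16")},
    linked=["season"],
)
print(arr)
```

The actual format in this PR is shown in the nbviewer examples; the sketch only illustrates the idea of surfacing both kinds of information in one place.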

Questions to resolve:

- Is "Linked dataset variables" the best name for these?
- Perhaps it would be useful to show more information about these linked
  variables, such as their dimensions and/or shape?

Examples of the new repr are on nbviewer:
http://nbviewer.ipython.org/gist/shoyer/94936e5b71613683d95a

@shoyer added this to the 0.2 milestone May 20, 2014
@shoyer added a commit that referenced this pull request May 29, 2014: Expose more information in DataArray.__repr__
@shoyer merged commit e64a202 into master May 29, 2014
@shoyer deleted the data-array-repr branch May 29, 2014 04:19
keewis pushed a commit to keewis/xarray that referenced this pull request Jan 17, 2024