Skip to content

Remove .T as shortcut for transpose() #2509

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Oct 26, 2018

Conversation

dcherian
Copy link
Contributor

  • Closes Remove the Dataset.T property? #1232
  • Fully documented, including whats-new.rst for all changes and api.rst for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)

@pep8speaks
Copy link

Hello @dcherian! Thanks for submitting the PR.

@jhamman
Copy link
Member

jhamman commented Oct 25, 2018

For posterity, what did we decide to do with the DataArray.T property?

@dcherian
Copy link
Contributor Author

For posterity, what did we decide to do with the DataArray.T property?

I'm not sure but I've fixed the whats-new entry to say that only Dataset.T has been removed currently.

d = super(Dataset, self).__dir__()
d.remove('T')
return d
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop this method entirely in favor of the super class method.

@shoyer
Copy link
Member

shoyer commented Oct 25, 2018

For posterity, what did we decide to do with the DataArray.T property?

I think this is arguably still useful/expected since DataArray is modeled on a NumPy array. But I could probably be convinced otherwise...

@dcherian
Copy link
Contributor Author

I think this is arguably still useful/expected since DataArray is modeled on a NumPy array.

I agree

@jhamman
Copy link
Member

jhamman commented Oct 25, 2018

Feeling like playing devils advocate here. I would expect we'll run into similar confusion with the DataArray transpose property.

>>> da = xr.DataArray([1, 2, 3, 4], dims=('time', ), coords={'time': pd.date_range('2018-10-24', periods=4)})
>>> da.time

<xarray.DataArray 'time' (time: 4)>
array(['2018-10-24T00:00:00.000000000', '2018-10-25T00:00:00.000000000',
       '2018-10-26T00:00:00.000000000', '2018-10-27T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2018-10-24 2018-10-25 2018-10-26 2018-10-27

>>> da = xr.DataArray([1, 2, 3, 4], dims=('T', ), coords={'T': pd.date_range('2018-10-24', periods=4)})
>>> da.T

<xarray.DataArray (T: 4)>
array([1, 2, 3, 4])
Coordinates:
  * T        (T) datetime64[ns] 2018-10-24 2018-10-25 2018-10-26 2018-10-27

@dcherian
Copy link
Contributor Author

@jhamman one hiccup is that we haven't been issuing a DeprecationWarning for DataArray.T

I'm OK with removing it though.

@max-sixty
Copy link
Collaborator

I'm +0.2 for merging this, but at least issuing a DeprecationWarning before removing from DataArray.T (with no strong view on whether we do remove it)

@jhamman
Copy link
Member

jhamman commented Oct 25, 2018

Sure, that's a good point. I just wanted to illustrate that these are the same problems and that only removing the property from the Dataset may lead to continued confusion. @shoyer thoughts?

@shoyer
Copy link
Member

shoyer commented Oct 25, 2018

I suppose the case for removing DataArray.T is that you don't really need a shortcut for transposing arrays in xarray, since most functions don't rely on dimension order. And of course the real motivation: some weather/climate scientists like to use a variable named T for time :).

The counter-argument would be that T is found on most other numpy array-like types, including pandas.DataFrame. I've definitely used .T as a convenient shortcut myself in some cases.

I guess that puts me at -0 for deprecating DataArray.T. But we can merge this already, and leave that for a separate discussion.

@max-sixty
Copy link
Collaborator

I just wanted to illustrate that these are the same problems and that only removing the property from the Dataset may lead to continued confusion.

I think that's a good point. There are a few methods on DataArrays but not Datasets, but not many:

In [4]: {x for x in set(dir(xr.DataArray)) - set(dir(xr.Dataset)) if not x.startswith("_")}
Out[4]:
{'data',
 'dot',
 'dt',
 'dtype',
 'from_cdms2',
 'from_iris',
 'from_series',
 'get_axis_num',
 'item',
 'name',
 'ndim',
 'plot',
 'searchsorted',
 'shape',
 'size',
 'to_cdms2',
 'to_dataset',
 'to_index',
 'to_iris',
 'to_masked_array',
 'to_pandas',
 'to_series',
 'variable'}

Regardless, sounds like @shoyer thinks we should merge (and I'm probably still +0.1)

@@ -33,6 +33,8 @@ v0.11.0 (unreleased)
Breaking changes
~~~~~~~~~~~~~~~~

- ``Dataset.T`` has been removed as a shortcut for :py:meth:`transpose()`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally we don't put () in the docs - is this our convention or will the docs not link?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also in the line below!
I really don't think it matters if it's only cosmetic, though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

@max-sixty max-sixty merged commit 5940100 into pydata:master Oct 26, 2018
@jhamman
Copy link
Member

jhamman commented Oct 26, 2018

Thanks @dcherian!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants