Skip to content

DOC: Fixed timedelta docstring and examples #23259

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Nov 20, 2018

Conversation

ryankarlos
Copy link
Contributor

@ryankarlos ryankarlos commented Oct 20, 2018

@pep8speaks
Copy link

pep8speaks commented Oct 20, 2018

Hello @ryankarlos! Thanks for updating the PR.

Comment last updated on October 27, 2018 at 13:50 Hours UTC

@ryankarlos
Copy link
Contributor Author

I've added in a few examples to illustrate 'M' and 'Y' outputs along with generating new columns in a dataframe using timedelta. Also corrected some linting issues and added a better summary. Any feedback would be welcome.

@ryankarlos
Copy link
Contributor Author

Travis failing in feather --- ive seen other PRs having the same problem - not sure why it is ?

@@ -64,6 +69,31 @@ def to_timedelta(arg, unit='ns', box=True, errors='raise'):
TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'],
dtype='timedelta64[ns]', freq=None)

For `M` and `Y` units, `1M = 30D` and 1Y = 365D:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there's an issue to deprecate 'M' and 'Y' for Timedelta, so I don't think it's a good idea to advertise these argument that are slated for deprecation.

#16344

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mroeschke sorry wasn't aware about the issue - ill remove it. Im thinking giving example for all the other units would be overkill - maybe a couple more ?
Maybe a reference to DateOffset in the description or examples would help ?

Copy link
Member

@mroeschke mroeschke Oct 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're up for it, you could address #16344 first and then add a note here saying "Day frequencies and higher are only supported.

Otherwise, I think this docstring has a good number of already examples, but demonstrating box would be a nice to have.

Copy link
Contributor Author

@ryankarlos ryankarlos Oct 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just sent an initial PR for #16344 , still have work to do as its my first time fixing deprecation issues before.
What is the protocol for displaying deprecation warnings within docstrings - does it go under summary section or within a separate Notes section ? I guess it would make more sense to have a line under 'units' stating M an Y are not supported and then reference to pd.DateOffset or similar for M and Y ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can mention the deprecated parameters in the Notes section

Convert argument to timedelta
Convert argument to timedelta.

Timedeltas are differences in times, expressed in difference units
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you mention they are absolute differences in time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@ryankarlos ryankarlos changed the title DOC: Fixed timedelta docstring to include month and year units and examples DOC: Fixed timedelta docstring examples Oct 21, 2018
@ryankarlos ryankarlos changed the title DOC: Fixed timedelta docstring examples DOC: Fixed timedelta docstring and examples Oct 21, 2018
>>> pd.to_timedelta(np.arange(5), box=False)
array([0, 1, 2, 3, 4], dtype='timedelta64[ns]')

Add new column of dates from existing dates in a `DataFrame`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove this example as well; it demonstrates assignment more than to_timedelta

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the box example or the one below it ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one below it demonstrating DataFrame column assignment

@codecov
Copy link

codecov bot commented Oct 25, 2018

Codecov Report

Merging #23259 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #23259   +/-   ##
=======================================
  Coverage   92.28%   92.28%           
=======================================
  Files         161      161           
  Lines       51451    51451           
=======================================
  Hits        47483    47483           
  Misses       3968     3968
Flag Coverage Δ
#multiple 90.68% <ø> (ø) ⬆️
#single 42.31% <ø> (+0.03%) ⬆️
Impacted Files Coverage Δ
pandas/core/tools/timedeltas.py 98.03% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update deb7b4d...a67649a. Read the comment docs.

@mroeschke
Copy link
Member

cc @datapythonista if you could have a look.

@ryankarlos
Copy link
Contributor Author

ryankarlos commented Oct 25, 2018

@mroeschke I haven't added the notes section for deprecated units yet - i was working concurrently on #23264 so wasn't sure if i should add that in before finishing that

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work on this. Added some comments about the formatting.

- If True returns a Timedelta/TimedeltaIndex of the results
- if False returns a np.timedelta64 or ndarray of values of dtype
timedelta64[ns]
arg : String, timedelta, list, tuple, 1-d array, or Series
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

str instead of String

timedelta64[ns]
arg : String, timedelta, list, tuple, 1-d array, or Series
The argument which needs to be converted to timedelta.
unit : Integer or float, default ns
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int instead of Integer. But doesn't make sense, seemsl like it should be a string.

Quotes around 'ns'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh Sorry yes - my bad - thanks for spotting this.

The argument which needs to be converted to timedelta.
unit : Integer or float, default ns
Denotes the unit (D,h,m,s,ms,us,ns) of the arg.
box : Boolean, default True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool instead of Boolean

Denotes the unit (D,h,m,s,ms,us,ns) of the arg.
box : Boolean, default True
If True returns a Timedelta/TimedeltaIndex of the results.
if False returns a np.timedelta64 or ndarray of values of dtype
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a list, it'll render better. Ignore the validation errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, thats why i removed them below to get rid of the validation errors - will include them.

- If 'ignore', then invalid parsing will return the input
If 'raise', then invalid parsing will raise an exception.
If 'coerce', then invalid parsing will be set as NaT.
If 'ignore', then invalid parsing will return the input.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep the list.


>>> pd.to_timedelta(np.arange(5), box=False)
array([0, 1, 2, 3, 4], dtype='timedelta64[ns]')

See also
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

capital A


>>> pd.to_timedelta(np.arange(5), box=False)
array([0, 1, 2, 3, 4], dtype='timedelta64[ns]')

See also
--------
pandas.DataFrame.astype : Cast argument to a specified dtype.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the pandas. prefix

@ryankarlos
Copy link
Contributor Author

ryankarlos commented Oct 26, 2018

@datapythonista Thanks for the review. Added changes. Also rendered the docstring below:

screen shot 2018-10-27 at 00 40 25

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good, added a few more comments.

Convert argument to timedelta.

Timedeltas are absolute differences in times, expressed in difference
units e.g. days, hours, minutes, seconds. This method converts an argument
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd look better to have the e.g. in brackets.

arg : str, timedelta, list, tuple, 1-d array, or Series
The argument which needs to be converted to timedelta.
unit : str, default 'ns'
Denotes the unit (D,h,m,s,ms,us,ns) of the arg.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add spaces after the commas.

- If 'ignore', then invalid parsing will return the input
- If 'raise', then invalid parsing will raise an exception.
- If 'coerce', then invalid parsing will be set as NaT.
- If 'ignore', then invalid parsing will return the input.

Returns
-------
ret : timedelta64/arrays of timedelta64 if parsing succeeded
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can get rid of the ret, and leave here just the types timedelta64 or numpy.array of timedelta64. We should be able to automatically parse the line with the types, so no description should go there. A description on what is being returned, and clarification on when every type is being returned should go in the next line (indented).

unit : str, default 'ns'
Denotes the unit (D,h,m,s,ms,us,ns) of the arg.
box : bool, default True
- If True returns a Timedelta/TimedeltaIndex of the results.
- if False returns a np.timedelta64 or ndarray of values of dtype
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capital I

@ryankarlos
Copy link
Contributor Author

ryankarlos commented Oct 27, 2018

@datapythonista Thanks, added an update. Im working concurrently on a related PR #23264 -- not sure if i should add the notes section on deprecated units here or in that PR ?

from a recognized timedelta format / value into a Timedelta type.

Parameters
----------
arg : str, timedelta, list, tuple, 1-d array, or Series
arg : str, timedelta, list, tuple, 1-d array, or Series
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the missing spaces were down, in the (D,h,m,s...), which should be (D, h, m, s,...).

Just leave a space after the commas here.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just spotted couple of last things. If the examples are now working, looks good to me after these last fixes.

Convert argument to timedelta.

Timedeltas are absolute differences in times, expressed in difference
units e.g. (days, hours, minutes, seconds). This method converts an argument
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move the e.g. inside the parenthesis I think that makes more sense.

- If True returns a Timedelta/TimedeltaIndex of the results
- if False returns a np.timedelta64 or ndarray of values of dtype
timedelta64[ns]
arg : str, timedelta, list, tuple, 1-d array, or Series
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd replace list, tuple, 1-d array by list-like. I'd also remove the comma vefore the or.

@ryankarlos
Copy link
Contributor Author

ryankarlos commented Oct 27, 2018

@datapythonista done :)

screen shot 2018-10-27 at 15 05 57

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks!

@sinhrks sinhrks mentioned this pull request Nov 1, 2018
4 tasks
@ryankarlos
Copy link
Contributor Author

@datapythonista There is a now a conflict with the most recent merge #23439 - with more additions in the units parameters. Should I revert to the version we have in this - or do you prefer to include all those units ('Y', 'M' shouldn't be in there for eg ?). Let me know and I can fix.

@datapythonista
Copy link
Member

Unlucky this was not merged earlier.

You need to fix the conflict in a way that you get the lastest changes (from the conflicting PR in this case), and you make any fixes to that.

This PR is a diff of what you have and master, if you take your changes, you'll be deleting the work in that PR here.

I'm not sure of how the long set of units in that PR renders. If it doesn't render correctly, you can fix it here (instead of having the units in the type, just say str and then you show the units as a list in the description).

@ryankarlos
Copy link
Contributor Author

@datapythonista ok thanks - ill fetch the latest changes from master - so is it ok for me to list the units as the short format (D, h, m, s, ms, us, ns) in the description if the long format looks odd on rendering. Is expanding on all the different unit formats necessary ?

@datapythonista
Copy link
Member

Yes, if they were added that's for a reason. Avoid changing the content unless you really know what you're doing. But make the format look as good as possible. And in this case, I'm not sure how sphinx handles multilines in the types, we had problems with that in the past, so may be the current format is displayed broken.

@ryankarlos
Copy link
Contributor Author

@datapythonista Ive fixed the merge conflict to include changes from upstream. The units were not rendering properly when including in the type (the format was getting broken after the first line) so I shifted it to the description as you had suggested.

screen shot 2018-11-13 at 00 10 16

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in reviewing.

Looks good, just couple of minor things, and I think it's ready to be merged.

@jreback jreback added this to the 0.24.0 milestone Nov 20, 2018
@jreback jreback added the Timedelta Timedelta data type label Nov 20, 2018
@jreback
Copy link
Contributor

jreback commented Nov 20, 2018

@datapythonista over to you. merge when satisfied.

@datapythonista datapythonista merged commit 83c7e28 into pandas-dev:master Nov 20, 2018
@datapythonista
Copy link
Member

Thanks for fixing this @ryankarlos

thoo added a commit to thoo/pandas that referenced this pull request Nov 20, 2018
…fixed

* upstream/master:
  DOC: Removing rpy2 dependencies, and converting examples using it to regular code blocks (pandas-dev#23737)
  BUG: Fix dtype=str converts NaN to 'n' (pandas-dev#22564)
  DOC: update pandas.core.resample.Resampler.nearest docstring (pandas-dev#20381)
  REF/TST: Add more pytest idiom to parsers tests (pandas-dev#23810)
  Added support for Fraction and Number (PEP 3141) to pandas.api.types.is_scalar (pandas-dev#22952)
  DOC: Updating to_timedelta docstring (pandas-dev#23259)
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Datetime units not listed in pd.to_timedelta docstring
6 participants