BUG: combination of out of bound date and nan with errors='ignore' gives nonsense data #26493

Closed
jorisvandenbossche opened this issue May 22, 2019 · 5 comments · Fixed by #50242
Labels
Bug Datetime Datetime data dtype

Comments

@jorisvandenbossche
Member

In [46]: pd.__version__        
Out[46]: '0.25.0.dev0+593.g307265e28'

In [47]: pd.to_datetime(['15010101', '20150101', np.nan], format="%Y%m%d", errors='ignore')                                                                      
Out[47]: DatetimeIndex(['2085-07-20 23:34:33.709551616', '2015-01-01 00:00:00', 'NaT'], dtype='datetime64[ns]', freq=None)
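
The odd first value is consistent with signed 64-bit wraparound: 1501-01-01 lies further from the epoch than an int64 nanosecond counter can represent, and adding 2**64 ns to it lands exactly on the reported timestamp. The sketch below reproduces that arithmetic with the standard library only. It is an illustration of the overflow, not a trace of the pandas code path, and `wrap_int64` is a hypothetical helper.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
NS_PER_SECOND = 10**9


def wrap_int64(value):
    """Reinterpret an arbitrary Python int as a two's-complement int64."""
    return (value + 2**63) % 2**64 - 2**63


# Nanoseconds from the epoch to 1501-01-01 (a large negative number).
delta = datetime(1501, 1, 1) - EPOCH
true_ns = (delta.days * 86400 + delta.seconds) * NS_PER_SECOND

# What an int64 would hold if the out-of-bounds value were stored unchecked.
wrapped_ns = wrap_int64(true_ns)

# datetime has microsecond resolution, so the last three digits are dropped.
wrapped = EPOCH + timedelta(microseconds=wrapped_ns // 1000)

print(true_ns < -2**63)  # True: the real value does not fit in int64
print(wrapped)           # 2085-07-20 23:34:33.709551
```

The fractional seconds `.709551616` in the reported output match the trailing digits of 2**64 (18446744073709551616), which points to the same conclusion.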
@jorisvandenbossche jorisvandenbossche added Bug Datetime Datetime data dtype labels May 22, 2019
@jorisvandenbossche jorisvandenbossche added this to the Contributions Welcome milestone May 22, 2019
@makbigc
Contributor

makbigc commented May 25, 2019

result[mask] = masked_result.astype('M8[ns]')

The astype conversion converts every element, regardless of whether the element is out of bounds.

@jorisvandenbossche
Member Author

I can't say for sure (without looking in more detail) whether that is indeed the reason (I would think that the array_to_datetime call in the calc function just above should still raise for out-of-bounds datetimes). But you're welcome to check further! E.g., does it work correctly if you leave out the astype?

@another-green
Contributor

What is the expected behavior of to_datetime in this case? If a date is out of bounds, should it leave every element in the array unconverted and return an Index? For example:

Index(['15010101', '20150101', nan], dtype='object')

Or do we want to convert the date-like elements, such as:

Index([1501-01-01 00:00:00, 2015-01-01 00:00:00, 'NaT'], dtype='object')
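
The two candidate behaviors can be sketched in plain Python to make the difference concrete. This is only an illustration under stated assumptions: `parse_all_or_nothing` and `parse_elementwise` are hypothetical names, `None` stands in for NaT, and the bounds check assumes an int64 nanosecond counter with the minimum value reserved. None of this is pandas API.

```python
import math
from datetime import datetime

EPOCH = datetime(1970, 1, 1)


def is_nan(value):
    return isinstance(value, float) and math.isnan(value)


def fits_in_ns(dt):
    """True if dt is representable as int64 nanoseconds since the epoch."""
    delta = dt - EPOCH
    ns = (delta.days * 86400 + delta.seconds) * 10**9 + delta.microseconds * 1000
    # Assumption: the minimum int64 value is reserved as a missing-value marker.
    return -2**63 < ns <= 2**63 - 1


def parse_elementwise(values, fmt):
    """Option 2: convert every date-like element, keeping an object container."""
    return [None if is_nan(v) else datetime.strptime(v, fmt) for v in values]


def parse_all_or_nothing(values, fmt):
    """Option 1: return the input untouched if any element is out of bounds."""
    parsed = parse_elementwise(values, fmt)
    if any(p is not None and not fits_in_ns(p) for p in parsed):
        return list(values)
    return parsed


values = ["15010101", "20150101", float("nan")]
print(parse_all_or_nothing(values, "%Y%m%d"))  # original strings and nan
print(parse_elementwise(values, "%Y%m%d"))     # datetime objects and None
```

Neither variant produces a wrapped-around timestamp: the first gives up on conversion entirely, while the second trades the nanosecond dtype for generic objects.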

@jorisvandenbossche
Member Author

That's not fully clear. See also #14487

@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@MarcoGorelli
Member

this'll be addressed by #50242

5 participants