Datetime parsing (PDEP-4): allow mixture of ISO formatted strings #50939
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -445,7 +445,8 @@ def _convert_listlike_datetimes( | |
if format is None: | ||
format = _guess_datetime_format_for_array(arg, dayfirst=dayfirst) | ||
|
||
- if format is not None: | ||
+ # `format` could be inferred, or user didn't ask for mixed-format parsing. | ||
+ if format is not None and format != "mixed": | ||
return _array_strptime_with_fallback(arg, name, utc, format, exact, errors) | ||
|
||
result, tz_parsed = objects_to_datetime64ns( | ||
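The branch changed in this hunk can be read as a small dispatch rule. The sketch below is a simplified, standalone illustration and not the pandas implementation: `choose_parsing_path` and its return labels are hypothetical names, and `guessed_format` stands in for whatever `_guess_datetime_format_for_array` would return.

```python
def choose_parsing_path(format, guessed_format=None):
    """Return a label for the parsing path (hypothetical helper, not pandas API)."""
    if format is None:
        # Stand-in for the format-guessing step; may still be None.
        format = guessed_format
    if format is not None and format != "mixed":
        # A single format applies to every element (this includes "ISO8601").
        return "strptime"
    # No usable single format, or the caller explicitly asked for "mixed".
    return "per-element"


print(choose_parsing_path(None, guessed_format="%Y-%m-%d"))
print(choose_parsing_path("mixed"))
```

Under this reading, `format="mixed"` skips the single-format path on purpose, while a failed guess falls through to the same per-element path as before.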
|
@@ -687,7 +688,7 @@ def to_datetime( | |
yearfirst: bool = False, | ||
utc: bool = False, | ||
format: str | None = None, | ||
|
||
- exact: bool = True, | ||
+ exact: bool | lib.NoDefault = lib.no_default, | ||
unit: str | None = None, | ||
infer_datetime_format: lib.NoDefault | bool = lib.no_default, | ||
origin: str = "unix", | ||
|
@@ -717,9 +718,7 @@ def to_datetime( | |
.. warning:: | ||
|
||
``dayfirst=True`` is not strict, but will prefer to parse | ||
- with day first. If a delimited date string cannot be parsed in | ||
- accordance with the given `dayfirst` option, e.g. | ||
- ``to_datetime(['31-12-2021'])``, then a warning will be shown. | ||
+ with day first. | ||
|
||
yearfirst : bool, default False | ||
Specify a date parse order if `arg` is str or is list-like. | ||
|
@@ -759,13 +758,20 @@ def to_datetime( | |
<https://docs.python.org/3/library/datetime.html | ||
#strftime-and-strptime-behavior>`_ for more information on choices, though | ||
note that :const:`"%f"` will parse all the way up to nanoseconds. | ||
+ You can also pass: | ||
|
||
+ - "ISO8601", to parse any `ISO8601 <https://en.wikipedia.org/wiki/ISO_8601>`_ | ||
+ time string (not necessarily in exactly the same format); | ||
+ - "mixed", to infer the format for each element individually. This is risky, | ||
+ and you should probably use it along with `dayfirst`. | ||
It's true that in that case that is best, but the typical caveat of that this
true, but it would solve the typical example at least
In [2]: pd.to_datetime(['12-01-2000 00:00:00', '13-01-2000 00:00:00'], format='mixed', dayfirst=True)
Out[2]: DatetimeIndex(['2000-01-12', '2000-01-13'], dtype='datetime64[ns]', freq=None)
Yes, I was just wondering if it would be useful to still call it out here explicitly (but it's quite explicit in the |
||
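To make the "risky" caveat concrete, here is a small standard-library sketch. It uses `datetime.strptime` rather than pandas, so it only illustrates the ambiguity and says nothing about how pandas resolves it; the variable names are illustrative assumptions.

```python
from datetime import datetime

# One delimited string that is valid under two different conventions.
ambiguous = "01-02-2000"

month_first = datetime.strptime(ambiguous, "%m-%d-%Y")
day_first = datetime.strptime(ambiguous, "%d-%m-%Y")

print(month_first.date())  # read as 2 January 2000
print(day_first.date())    # read as 1 February 2000
```

When each element is inferred on its own, nothing in the string itself says which reading is intended, which is why the docstring suggests pairing "mixed" with `dayfirst`.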
exact : bool, default True | ||
Control how `format` is used: | ||
|
||
- If :const:`True`, require an exact `format` match. | ||
- If :const:`False`, allow the `format` to match anywhere in the target | ||
string. | ||
|
||
+ Cannot be used alongside ``format='ISO8601'`` or ``format='mixed'``. | ||
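As a rough analogy for the two `exact` modes described above, the sketch below uses a regular expression in place of a strptime format. `DATE_RE` and `matches` are hypothetical names introduced for illustration; this is not the matching code pandas uses.

```python
import re

# Hypothetical regex standing in for the format "%Y-%m-%d".
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def matches(value, exact=True):
    """Mimic the documented `exact` semantics with a regex (illustrative only)."""
    if exact:
        # The whole string must match the pattern.
        return DATE_RE.fullmatch(value) is not None
    # The pattern may match anywhere in the string.
    return DATE_RE.search(value) is not None


print(matches("2000-01-12"))
print(matches("logged on 2000-01-12 at noon"))
print(matches("logged on 2000-01-12 at noon", exact=False))
```

This also hints at why `exact` has no obvious meaning for "ISO8601" or "mixed": in those modes there is no single pattern to match exactly or loosely.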
unit : str, default 'ns' | ||
The unit of the arg (D,s,ms,us,ns) denote the unit, which is an | ||
integer or float number. This will be based off the origin. | ||
|
@@ -997,6 +1003,8 @@ def to_datetime( | |
DatetimeIndex(['2018-10-26 12:00:00+00:00', '2020-01-01 18:00:00+00:00'], | ||
dtype='datetime64[ns, UTC]', freq=None) | ||
""" | ||
+ if exact is not lib.no_default and format in {"mixed", "ISO8601"}: | ||
+ raise ValueError("Cannot use 'exact' when 'format' is 'mixed' or 'ISO8601'") | ||
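The sentinel default is what lets this check tell "the caller passed `exact`" apart from "the caller left it out". A minimal standalone sketch of that pattern follows; `_NO_DEFAULT` and `resolve_exact` are hypothetical stand-ins (for `lib.no_default` and the surrounding function), and falling back to `True` when the argument is omitted is an assumption based on the documented default.

```python
_NO_DEFAULT = object()  # hypothetical stand-in for lib.no_default


def resolve_exact(format=None, exact=_NO_DEFAULT):
    """Validate `exact` against `format` (illustrative, not the pandas function)."""
    if exact is not _NO_DEFAULT and format in {"mixed", "ISO8601"}:
        # Any explicit value, True or False, is rejected for these formats.
        raise ValueError("Cannot use 'exact' when 'format' is 'mixed' or 'ISO8601'")
    # Assumption: an omitted `exact` behaves like the documented default, True.
    return True if exact is _NO_DEFAULT else exact


print(resolve_exact(format="mixed"))
print(resolve_exact(format="%Y-%m-%d", exact=False))
```

With a plain `exact=True` default, the function could not distinguish an explicit `exact=True` from an omitted argument, so it could not raise only when the user actually supplied it.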
if infer_datetime_format is not lib.no_default: | ||
warnings.warn( | ||
"The argument 'infer_datetime_format' is deprecated and will " | ||
|
any particular reason this is changed from failed? doesn't really matter to me, just curious
it just simplifies the logic