datetime.strptime(dt.strftime("%c"), "%c"))
fails when year is <1000.
#124529
The year for datetime.datetime must be and is allowed to be anything in range |
Considering these results:
>>> datetime(999, 1, 1).strftime("%c")
'Tue Jan 1 00:00:00 999'
>>> datetime.strptime("Tue Jan 1 00:00:00 999", "%c") # as from strftime() above => the error described above
[snip]
ValueError: time data 'Tue Jan 1 00:00:00 999' does not match format '%c'
>>> datetime.strptime("Tue Jan 1 00:00:00 0999", "%c") # adding 0 before 999 to have 4-digit width year => success
datetime.datetime(999, 1, 1, 0, 0)
...and the following fragment of the docs (https://docs.python.org/3/library/datetime.html#technical-detail):
Another person, however, could argue that:
What do you think? [EDIT] The quoted note refers to the |
PS It seems that for
$ ./python
Python 3.14.0a0 (heads/main:a4d1fdfb15, Sep 26 2024, 22:47:21) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> t_tuple = time.strptime("Tue Jan 1 00:00:00 0999", '%c')
>>> t_tuple
time.struct_time(tm_year=999, tm_mon=1, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=1, tm_isdst=-1)
>>> time.strftime('%c', t_tuple)
'Tue Jan 1 00:00:00 999'
>>> time.strptime(_, '%c')
Traceback (most recent call last):
File "<python-input-4>", line 1, in <module>
time.strptime(_, '%c')
~~~~~~~~~~~~~^^^^^^^^^
File "/home/zuo/cpython/Lib/_strptime.py", line 567, in _strptime_time
tt = _strptime(data_string, format)[0]
~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
File "/home/zuo/cpython/Lib/_strptime.py", line 352, in _strptime
raise ValueError("time data %r does not match format %r" %
(data_string, format))
ValueError: time data 'Tue Jan 1 00:00:00 999' does not match format '%c' |
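For anyone who wants to check their own platform, the round trip described above can be wrapped in a small helper. This is only a sketch: the function name `roundtrips` is made up for illustration, and the result for small years depends on the platform and locale (and on whether the interpreter already contains a fix), so no particular outcome is claimed for years below 1000.

```python
from datetime import datetime


def roundtrips(year, fmt="%c"):
    """Return True if strptime() can parse strftime()'s own output for `year`.

    `roundtrips` is a hypothetical helper name, not part of any library.
    """
    dt = datetime(year, 1, 1)
    try:
        text = dt.strftime(fmt)
        return datetime.strptime(text, fmt) == dt
    except ValueError:
        # Formatting or parsing rejected this year on this platform/locale.
        return False


# Print one line per year; the outcome for years < 1000 is platform-dependent.
for year in (9, 99, 999, 1000, 2024):
    print(year, roundtrips(year))
```

On a system affected by this issue, the years below 1000 would be expected to report `False` while 1000 and later report `True`.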
Hypothesis
It seems that the source of the problem is that (at least typically – for the
...whereas...
Observation
I checked that: (1) When formatting that example year 999, the results are:
Conclusion: (2) When parsing that example year
Possible fix
In the (Another theoretically possible variant: just make the |
I'd be happy to implement the fix – if you decide that this should be fixed. |
No issue on my Macbook laptop
|
Could you please check what string is returned on your system from the following call?
>>> datetime(999, 1, 1).strftime("%c")
Thanks :) PS My guess is that, for your locale, a |
@zuo I tried it just now
|
Thank you! Yeah, that leading zero your platform/locale provides makes Anyway, now it's quite clear to me what the fix should be. |
Proof of concept:
diff --git a/Lib/_strptime.py b/Lib/_strptime.py
index a3f8bb544d..6a2527b75c 100644
--- a/Lib/_strptime.py
+++ b/Lib/_strptime.py
@@ -213,8 +213,10 @@ def __init__(self, locale_time=None):
'Z'),
'%': '%'})
base.__setitem__('W', base.__getitem__('U').replace('U', 'W'))
- base.__setitem__('c', self.pattern(self.locale_time.LC_date_time))
- base.__setitem__('x', self.pattern(self.locale_time.LC_date))
+ base.__setitem__(
+ 'c', self.__pattern_with_lax_year(self.locale_time.LC_date_time))
+ base.__setitem__(
+ 'x', self.__pattern_with_lax_year(self.locale_time.LC_date))
base.__setitem__('X', self.pattern(self.locale_time.LC_time))
def __seqToRE(self, to_convert, directive):
@@ -236,6 +238,21 @@ def __seqToRE(self, to_convert, directive):
regex = '(?P<%s>%s' % (directive, regex)
return '%s)' % regex
+ def __pattern_with_lax_year(self, format):
+ """Like pattern(), but making %y and %Y accept also fewer digits.
+
+ Necessary to ensure that strptime() is able to parse strftime()'s
+ output when the %c or %x format code is used -- considering that
+ for some locales/platforms (e.g., 'C.UTF-8' on Linux), formatting
+ with either %c or %x may cause year numbers, if a number is small,
+ to have fewer digits than usual (e.g., '999' instead of '0999', or
+ '9' instead of '0009' or '09').
+ """
+ pattern = self.pattern(format)
+ pattern = pattern.replace(self['y'], r"(?P<y>\d{1,2})")
+ pattern = pattern.replace(self['Y'], r"(?P<Y>\d{1,4})")
+ return pattern
+
def pattern(self, format):
"""Return regex pattern for the format string.
[EDIT] After applying the above patch, the error does not occur anymore:
>>> import time
>>> t_tuple = time.strptime("Tue Jan 1 00:00:00 0999", '%c')
>>> t_tuple
time.struct_time(tm_year=999, tm_mon=1, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=1, tm_isdst=-1)
>>> time.strftime('%c', t_tuple)
'Tue Jan 1 00:00:00 999'
>>> time.strptime(_, '%c')
time.struct_time(tm_year=999, tm_mon=1, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=1, tm_isdst=-1)
>>>
>>> from datetime import datetime
>>> datetime(999, 1, 1).strftime('%c')
'Tue Jan 1 00:00:00 999'
>>> datetime.strptime(_, '%c')
datetime.datetime(999, 1, 1, 0, 0) |
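The effect of the proof-of-concept patch can be illustrated outside CPython with plain regular expressions. The sketch below is not the actual `_strptime` code: `build_c_regex` and the hand-written pattern for the C-locale `%c` layout are assumptions made for illustration, and the strict four-digit year pattern is assumed from the patch's replacement of `self['Y']` with `\d{1,4}`.

```python
import re

# Assumed strict year pattern (four digits) versus the lax one from the patch.
STRICT_YEAR = r"(?P<Y>\d\d\d\d)"
LAX_YEAR = r"(?P<Y>\d{1,4})"


def build_c_regex(year_pattern):
    """Build a simplified, hypothetical regex for the C-locale %c layout."""
    return re.compile(
        r"(?P<a>[A-Za-z]{3})\s+(?P<b>[A-Za-z]{3})\s+(?P<d>\d{1,2})\s+"
        r"(?P<H>\d{2}):(?P<M>\d{2}):(?P<S>\d{2})\s+" + year_pattern + r"$"
    )


strict = build_c_regex(STRICT_YEAR)
lax = build_c_regex(LAX_YEAR)

unpadded = "Tue Jan  1 00:00:00 999"   # what strftime('%c') produced above
padded = "Tue Jan  1 00:00:00 0999"    # the manually zero-padded variant

print(strict.match(unpadded))            # no match with the strict year
print(lax.match(unpadded).group("Y"))    # the lax year accepts '999'
print(strict.match(padded).group("Y"), lax.match(padded).group("Y"))
```

Because the lax pattern is substituted only into the `%c` and `%x` regexes, an explicit `%Y` in a user-supplied format would keep its stricter behaviour under this approach.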
@pganssle @terryjreedy @Mariatta @serhiy-storchaka OK, I'd like to propose the fix, implemented in the linked PR #124778 – considering that:
|
|
I do not think that #124778 is the right solution. We should fix %Y, %G and maybe %y. This will automatically fix %c and %x. And I consider this a bugfix which should be backported. |
But wouldn't making My proposal in #124778 is similar, just much more conservative (as |
Let me also emphasize that (And there are no such statements in the docs when it comes to |
Bug report
Bug description:
Discovered this when adding some hypothesis tests for strptime/strftime. I doubt this is a real problem anyone is going to have in the real world, but maybe. I do not know if this is locale-specific or OS-specific.
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux
Linked PRs
_strptime to make %c/%x accept a year with fewer digits #124778