Skip to content

gh-124529: Fix _strptime to make %c/%x accept a year with fewer digits #124778

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 25 commits into
base: main
Choose a base branch
from

Conversation

zuo
Copy link
Contributor

@zuo zuo commented Sep 30, 2024

_strptime.TimeRE has been modified, regarding the %c and %x format
codes, to make datetime.datetime.strptime(), datetime.date.strptime()
and time.strptime() accept inputs that contain year numbers consisting
of fewer digits than usual, i.e., not zero-padded to the usual width
of 2 or 4 (which is appropriate for the format code and current locale).

Thanks to that, now the desired behavior described in gh-124529 should be
consistently observed; certain strftime/strptime round trips (involving
%c/%x and small year numbers) no longer raise ValueError for some
platforms/locales (in particular, this was the case on Linux, for various
locales, including C/C.UTF-8).

@zuo zuo force-pushed the strptime-c-and-y-fix branch from 33d5109 to d3ae17b Compare September 30, 2024 02:51
@zuo zuo changed the title gh-124529: Fix _strptime to make %c/%y accept a year with fewer digits gh-124529: Fix _strptime to make %c/%x accept a year with fewer digits Sep 30, 2024
@zuo zuo force-pushed the strptime-c-and-y-fix branch from d3ae17b to 91b6bca Compare September 30, 2024 02:52
@zuo zuo force-pushed the strptime-c-and-y-fix branch from 91b6bca to 1d916a0 Compare September 30, 2024 02:54
@@ -0,0 +1,9 @@
Fix :meth:`datetime.datetime.strptime`, :meth:`datetime.date.strptime` as
well as :func:`time.strptime` (by modifying :class:`_strptime.TimeRE`) to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you shouldn't write classes or methods with underscores to NEWS, because they're not public.
And, Maybe you can try to shorten the description of NEWS, it's too long.

Copy link
Contributor Author

@zuo zuo Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, shortened the text and removed the reference to a non-public API. :)

@zuo zuo requested a review from rruuaanng September 30, 2024 19:01
@zuo zuo force-pushed the strptime-c-and-y-fix branch from bce4b6d to 56c84dc Compare September 30, 2024 20:44
``dt.year`` less than 1000) no longer raise ``ValueError`` for some
locales/platforms (this was the case, e.g., on Linux -- for various
locales, including ``C``/``C.UTF-8``).
Fix :meth:`datetime.datetime.strptime`, :meth:`datetime.date.strptime`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should write this

Fixed ``%c``/``%x`` in :meth:`datetime.datetime.strptime`, :meth:`datetime.date.strptime` and :func:`time.strptime` to accept years with  fewer digits than the usual 4 or 2 (not zero-padded).

This will be more concise and intuitive.

Copy link
Contributor Author

@zuo zuo Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've shortened it; yet I'd prefer to keep the information that the fix prevents the error for certain cases of a strftime/strptime round trip (this is the key element of the gh-124529 issue's description).

@zuo
Copy link
Contributor Author

zuo commented Oct 1, 2024

PS Note that the changes concern only the %c/%x-parsing-related behavior, and nothing changes when it comes to %y, %Y and %G (and any non-standard year-representation-related format codes).

@zuo
Copy link
Contributor Author

zuo commented Oct 1, 2024

PPS I added (obviously) the necessary tests (directly related to the changes), but also some indirectly-related ones (especially confirming the unchanged behavior for %y/%Y/%G).

PPPS A suitable mention in the docs has also been added.

@zuo zuo requested a review from rruuaanng October 1, 2024 13:23
@zuo
Copy link
Contributor Author

zuo commented Oct 1, 2024

I hope it's ready. [EDIT: not yet]

@zuo zuo force-pushed the strptime-c-and-y-fix branch from c16b5e3 to 059d4c5 Compare October 1, 2024 20:33
@zuo
Copy link
Contributor Author

zuo commented Oct 1, 2024

OK, I hope now it's ready. :)

and :func:`time.strptime` to make ``%c`` and ``%x`` accept year numbers
with fewer digits than the usual 4 or 2 (not zero-padded). This prevents
:exc:`ValueError` in certain cases of ``strftime/strptime`` round trips.
Patch by Jan Kaliszewski.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you should remove the author, because this PR will not be deleted after merging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't adding such information a common and documented practice?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the user-oriented update record, users do not care who implemented it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most users don't care, indeed; but for authors the presence of such information is a symbolic reward for the contribution. I find this custom nice.

And anyway, as I wrote above, we are talking about a common practice. I am not the right person to ask about changing established practices regarding Python development. :)

('y', ('9', '100', 'ni')),
('Y', ('7', '42', '999', '10000', 'SPAM')),
):
fmt = "%" + directive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can use f-string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both ways are valid; the one I chose is more consistent with the existing code in this module.

@zuo
Copy link
Contributor Author

zuo commented Oct 4, 2024

TODO: if this PR's approach is accepted at all, it would be good to adjust it to the changes from PR #124946 (after it is merged).

@zuo zuo requested a review from rruuaanng October 11, 2024 05:10
self.fail(str(exc))

try:
output_year = _strptime._strptime(input_string, fmt)[0][0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is't possible for _strptime._strptime to return a one-dim list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It always returns a 3-tuple, the 0th item of which is a 9-tuple, the 0th item of which is a int representing the year number.

fmt = "%" + directive

try:
(input_string,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, Wouldn't this be weird?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? IMHO, OK.

@donBarbos
Copy link
Contributor

looks like good idea and important fix :)
@zuo are you ready to continue?

@zuo zuo requested a review from rruuaanng March 3, 2025 02:46
@zuo
Copy link
Contributor Author

zuo commented Mar 3, 2025

looks like good idea and important fix :) @zuo are you ready to continue?

Yes, I am :)

Yet I'm not sure what is the next step...

@@ -2648,6 +2648,17 @@ Notes:
example, "month/day/year" versus "day/month/year"), and the output may
contain non-ASCII characters.

.. versionchanged:: 3.14
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as far as i know we use next construction

Suggested change
.. versionchanged:: 3.14
.. versionchanged:: next

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and I think you could mark many comments and suggestions as resolved, otherwise they can get in the way of reading and seem ignored :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and I think you could mark many comments and suggestions as resolved, otherwise they can get in the way of reading and seem ignored :)

Shouldn't it be done by the reviewer who asked a question?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can mark as solved.
at least I can't see this button as the author of suggestion,
so I think only you can close them, especially if there are no plans to continue this topic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants