gh-53203: Fix strptime() for %c and %x formats on many locales #124946

Merged

serhiy-storchaka
Member

@serhiy-storchaka serhiy-storchaka commented Oct 3, 2024

Fixed locales:
Arabic, Bislama, Breton, Bodo, Kashubian, Chuvash, Estonian, French, Irish,
Ge'ez, Gujarati, Manx Gaelic, Hebrew, Hindi, Chhattisgarhi, Haitian Kreyol,
Japanese, Kannada, Korean, Marathi, Malay, Norwegian, Nynorsk, Punjabi,
Rajasthani, Tok Pisin, Yoruba, Yue Chinese, Yau/Nungon and Chinese.

In some locales (for example French and Hebrew), the default month used in __calc_date_time has the same name in full and abbreviated form. In others, it conflicts with the name of the day of the week or with other parts of the representation. As a result, the code failed to correctly distinguish the %b and %B formats.
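
To make the ambiguity concrete, here is a minimal sketch. It is not the code in `_strptime`: the month tables are hand-written, illustrative French names (an assumption, not values read from a real locale), and `render` and `guess_from_one_sample` are hypothetical helpers. It shows how inferring the directive from a single sample month can pick %B where %b was meant.

```python
# Illustrative French month names (assumed values, not read from a real locale).
FULL = ["janvier", "février", "mars", "avril", "mai", "juin",
        "juillet", "août", "septembre", "octobre", "novembre", "décembre"]
ABBR = ["janv.", "févr.", "mars", "avril", "mai", "juin",
        "juil.", "août", "sept.", "oct.", "nov.", "déc."]


def render(day, month, year):
    """Hypothetical locale representation that uses the abbreviated month."""
    return f"{day:02d} {ABBR[month - 1]} {year}"


def guess_from_one_sample(rendered, month):
    """Naively infer the directive from one sample month's names."""
    # Try the full name first, then the abbreviated one.
    for directive, names in (("%B", FULL), ("%b", ABBR)):
        name = names[month - 1]
        if name in rendered:
            return rendered.replace(name, directive)
    return rendered


# "mars" is both the full and the abbreviated name, so the guess is wrong.
print(guess_from_one_sample(render(17, 3, 1999), 3))  # 17 %B 1999
# "janv." differs from "janvier", so the guess is right.
print(guess_from_one_sample(render(17, 1, 1999), 1))  # 17 %b 1999
```

With March as the sample the format is misread as %B, so a later attempt to parse "17 janv. 1999" with that format would fail.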

Co-authored-by: Eli Bendersky <[email protected]>
@serhiy-storchaka serhiy-storchaka force-pushed the strptime-short-month-names branch from 2c81238 to 1305285 Compare October 3, 2024 18:11
@serhiy-storchaka serhiy-storchaka changed the title gh-53203: Fix strptime(..,'%c') on locales with short month names gh-53203: Fix strptime() for %c and %x formats on many locales Oct 9, 2024
@serhiy-storchaka serhiy-storchaka added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 10, 2024
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @serhiy-storchaka for commit ef8c18e 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 10, 2024
@serhiy-storchaka
Member Author

!buildbot RHEL8

@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @serhiy-storchaka for commit fdfbef6 🤖

The command will test the builders whose names match the following regular expression: RHEL8

The builders matched are:

  • aarch64 RHEL8 PR
  • s390x RHEL8 LTO + PGO PR
  • s390x RHEL8 PR
  • AMD64 RHEL8 LTO + PGO PR
  • AMD64 RHEL8 PR
  • aarch64 RHEL8 LTO + PGO PR
  • PPC64LE RHEL8 Refleaks PR
  • s390x RHEL8 Refleaks PR
  • AMD64 RHEL8 FIPS Only Blake2 Builtin Hash PR
  • PPC64LE RHEL8 PR
  • AMD64 RHEL8 LTO PR
  • aarch64 RHEL8 Refleaks PR
  • aarch64 RHEL8 LTO PR
  • PPC64LE RHEL8 LTO + PGO PR
  • AMD64 RHEL8 FIPS No Builtin Hashes PR
  • s390x RHEL8 LTO PR
  • AMD64 RHEL8 Refleaks PR
  • PPC64LE RHEL8 LTO PR

@serhiy-storchaka serhiy-storchaka enabled auto-merge (squash) October 12, 2024 17:40
@serhiy-storchaka serhiy-storchaka merged commit c05f9dd into python:main Oct 12, 2024
34 checks passed
@miss-islington-app

Thanks @serhiy-storchaka for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12, 3.13.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 12, 2024
…ythonGH-124946)

In some locales (like French or Hebrew) the full or abbreviated names of
the default month and weekday used in __calc_date_time can be part of
another name or of a constant part of the %c format. The month name can
also match %m with a constant suffix (as in Japanese). So the code failed
to correctly distinguish formats %a, %A, %b, %B and %m.

Cycle through all months and all days of the week to find the variable
part and distinguish %a from %A and %b from %B or %m.
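
The idea of checking every month instead of one sample can be sketched as follows. This is a simplified illustration, not the implementation merged in this PR: the name tables are hand-written French values (an assumption), `detect_month_directive` is a hypothetical helper, and weekdays, %m and constant suffixes are not handled.

```python
# Illustrative French month names (assumed values, not read from a real locale).
FULL = ["janvier", "février", "mars", "avril", "mai", "juin",
        "juillet", "août", "septembre", "octobre", "novembre", "décembre"]
ABBR = ["janv.", "févr.", "mars", "avril", "mai", "juin",
        "juil.", "août", "sept.", "oct.", "nov.", "déc."]


def detect_month_directive(render):
    """Return '%B', '%b' or None by checking all twelve months.

    `render` is a hypothetical callable mapping a month number (1-12)
    to the locale's date representation for that month.
    """
    samples = [render(month) for month in range(1, 13)]
    # Check full names first: an abbreviation is often a prefix of the
    # full name, so it would also match a full-name representation.
    for directive, names in (("%B", FULL), ("%b", ABBR)):
        if all(name in text for name, text in zip(names, samples)):
            return directive
    return None


print(detect_month_directive(lambda m: f"17 {ABBR[m - 1]} 1999"))  # %b
print(detect_month_directive(lambda m: f"17 {FULL[m - 1]} 1999"))  # %B
print(detect_month_directive(lambda m: f"17/{m:02d}/1999"))        # None
```

Because months such as "mars" and "mai" no longer decide the result alone, the abbreviated and full forms are told apart by the months where they differ.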

Fixed locales for the following languages:
Arabic, Bislama, Breton, Bodo, Kashubian, Chuvash, Estonian, French, Irish,
Ge'ez, Gujarati, Manx Gaelic, Hebrew, Hindi, Chhattisgarhi, Haitian Kreyol,
Japanese, Kannada, Korean, Marathi, Malay, Norwegian, Nynorsk, Punjabi,
Rajasthani, Tok Pisin, Yoruba, Yue Chinese, Yau/Nungon and Chinese.

(cherry picked from commit c05f9dd)

Co-authored-by: Serhiy Storchaka <[email protected]>
Co-authored-by: Eli Bendersky <[email protected]>
@bedevere-app

bedevere-app bot commented Oct 12, 2024

GH-125369 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Oct 12, 2024
@bedevere-app

bedevere-app bot commented Oct 12, 2024

GH-125370 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 only security fixes label Oct 12, 2024
serhiy-storchaka added a commit that referenced this pull request Oct 12, 2024
…GH-124946) (GH-125370)

In some locales (like French or Hebrew) the full or abbreviated names of
the default month and weekday used in __calc_date_time can be part of
another name or of a constant part of the %c format. The month name can
also match %m with a constant suffix (as in Japanese). So the code failed
to correctly distinguish formats %a, %A, %b, %B and %m.

Cycle through all months and all days of the week to find the variable
part and distinguish %a from %A and %b from %B or %m.

Fixed locales for the following languages:
Arabic, Bislama, Breton, Bodo, Kashubian, Chuvash, Estonian, French, Irish,
Ge'ez, Gujarati, Manx Gaelic, Hebrew, Hindi, Chhattisgarhi, Haitian Kreyol,
Japanese, Kannada, Korean, Marathi, Malay, Norwegian, Nynorsk, Punjabi,
Rajasthani, Tok Pisin, Yoruba, Yue Chinese, Yau/Nungon and Chinese.

(cherry picked from commit c05f9dd)

Co-authored-by: Serhiy Storchaka <[email protected]>
Co-authored-by: Eli Bendersky <[email protected]>
serhiy-storchaka added a commit that referenced this pull request Oct 12, 2024
…GH-124946) (GH-125369)

In some locales (like French or Hebrew) the full or abbreviated names of
the default month and weekday used in __calc_date_time can be part of
another name or of a constant part of the %c format. The month name can
also match %m with a constant suffix (as in Japanese). So the code failed
to correctly distinguish formats %a, %A, %b, %B and %m.

Cycle through all months and all days of the week to find the variable
part and distinguish %a from %A and %b from %B or %m.

Fixed locales for the following languages:
Arabic, Bislama, Breton, Bodo, Kashubian, Chuvash, Estonian, French, Irish,
Ge'ez, Gujarati, Manx Gaelic, Hebrew, Hindi, Chhattisgarhi, Haitian Kreyol,
Japanese, Kannada, Korean, Marathi, Malay, Norwegian, Nynorsk, Punjabi,
Rajasthani, Tok Pisin, Yoruba, Yue Chinese, Yau/Nungon and Chinese.

(cherry picked from commit c05f9dd)

Co-authored-by: Serhiy Storchaka <[email protected]>
Co-authored-by: Eli Bendersky <[email protected]>
@serhiy-storchaka serhiy-storchaka deleted the strptime-short-month-names branch October 13, 2024 17:21

3 participants