Refactor codecs error handlers to use `_PyUnicodeError_GetParams` and extract complex logic into separate functions #129173
Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), topic-unicode, type-feature (A feature request or enhancement)
Comments
This was referenced Jan 22, 2025
The merge plan is as follows:
picnixz added a commit that referenced this issue on Jan 24, 2025:
…129174) We also clean up `PyCodec_StrictErrors` and the error message rendered when an object of incorrect type is passed to codec error handlers.
encukou pushed a commit that referenced this issue on Feb 8, 2025
This was referenced Feb 9, 2025
encukou pushed a commit that referenced this issue on Feb 14, 2025
encukou pushed a commit that referenced this issue on Feb 20, 2025
picnixz added a commit that referenced this issue on Feb 25, 2025:
…129893) The logic of `PyCodec_ReplaceErrors` is now split into separate functions, each of which handles a specific exception type.
picnixz added a commit that referenced this issue on Mar 3, 2025:
Writing the decimal representation of a Unicode codepoint only requires knowing the number of digits.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
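As a rough Python illustration of the digit-count idea in the commit message above (this is not the C code from the commit, and `n_decimal_digits` and `xml_charref` are hypothetical helper names): once the number of decimal digits of a code point is known, the size of the `&#...;` reference is known up front and the digits can be written from right to left into a pre-sized buffer.

```python
def n_decimal_digits(cp: int) -> int:
    """Number of decimal digits of a code point (0 <= cp <= 0x10FFFF)."""
    digits = 1
    while cp >= 10:
        cp //= 10
        digits += 1
    return digits


def xml_charref(cp: int) -> str:
    """Build "&#<decimal>;" in a pre-sized buffer, filling digits right to left."""
    digits = n_decimal_digits(cp)
    # Layout: '&', '#', <digits placeholders>, ';'
    buf = ["&", "#"] + ["0"] * digits + [";"]
    for i in range(digits):
        # The last digit lives at index 1 + digits, the first at index 2.
        buf[1 + digits - i] = chr(ord("0") + cp % 10)
        cp //= 10
    return "".join(buf)


# The built-in handler produces the same text for an unencodable character.
builtin = "\u20ac".encode("ascii", "xmlcharrefreplace").decode("ascii")
sketch = xml_charref(0x20AC)
```

Here both `builtin` and `sketch` should be `"&#8364;"`, since U+20AC is 8364 in decimal.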
picnixz added a commit that referenced this issue on Mar 3, 2025:
…nctions (#129895) The logic of `PyCodec_BackslashReplaceErrors` is now split into separate functions, each of which handles a specific exception type.
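The three exception types that this handler has to distinguish can be observed from Python through `codecs.backslashreplace_errors`, the Python-level entry point for the same handler. The snippet below only illustrates the observable behaviour, not the C code from the commit; the sample strings and the `"demo reason"` text are made up for the example.

```python
import codecs

# Encoding error: each unencodable character becomes \xNN, \uNNNN or \UNNNNNNNN.
enc = UnicodeEncodeError("ascii", "a\xe9\u20ac", 1, 3, "demo reason")
enc_result = codecs.backslashreplace_errors(enc)   # ('\\xe9\\u20ac', 3)

# Decoding error: each undecodable byte becomes \xNN.
dec = UnicodeDecodeError("ascii", b"a\xff\xfe", 1, 3, "demo reason")
dec_result = codecs.backslashreplace_errors(dec)   # ('\\xff\\xfe', 3)

# Translation error: handled like the encoding case, on the str object.
tra = UnicodeTranslateError("a\U0001f600", 1, 2, "demo reason")
tra_result = codecs.backslashreplace_errors(tra)   # ('\\U0001f600', 2)
```

Each call returns a `(replacement, resume_position)` tuple, which is the contract every codec error handler follows.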
We're done with this task! The next one is to address #129894 (comment).
seehwan pushed a commit to seehwan/cpython that referenced this issue on Apr 16, 2025:
…ions (python#129893) The logic of `PyCodec_ReplaceErrors` is now split into separate functions, each of which handles a specific exception type.
seehwan pushed a commit to seehwan/cpython that referenced this issue on Apr 16, 2025:
…thon#129894) Writing the decimal representation of a Unicode codepoint only requires knowing the number of digits.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
seehwan pushed a commit to seehwan/cpython that referenced this issue on Apr 16, 2025:
…ate functions (python#129895) The logic of `PyCodec_BackslashReplaceErrors` is now split into separate functions, each of which handles a specific exception type.
Feature or enhancement
Proposal:
I want to refactor the different codecs handlers in `Python/codecs.c` to use `_PyUnicodeError_GetParams`. Some codecs handlers will be refactored as part of #126004, but others are not subject to those issues (namely, the `ignore`, `namereplace`, `surrogateescape`, and `surrogatepass` handlers do not suffer from crashes, or at least I wasn't able to make them crash easily).

In addition, I also plan to split the handlers into functions instead of 2 or 3 big blocks of code, each handling a specific exception. For that reason, I will introduce the following helper macros:

For handlers that need to be fixed, I will first fix them in place (no refactoring). Afterwards, I will refactor them and extract the relevant part of the code into functions. That way, the diff will be easier to follow (I've observed that it's much harder to read the diff where I did both, so I will revert that part in the existing PRs; EDIT: actually there is no PR doing both fixes and split...).

I'm creating this issue to track the progress of the refactoring, assuming no issue occurs.
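The split into per-exception functions can be pictured, at the Python level, as one small function per exception type plus a thin dispatcher. This is only an analogy for the C refactoring: the handler name `demo-replace` and the three `_replace_*` helpers are invented for the example, and nothing here reflects the actual helper macros or the signature of `_PyUnicodeError_GetParams`.

```python
import codecs


def _replace_encode(exc: UnicodeEncodeError):
    # One "?" per unencodable character.
    return "?" * (exc.end - exc.start), exc.end


def _replace_decode(exc: UnicodeDecodeError):
    # A single U+FFFD for the whole undecodable byte range.
    return "\ufffd", exc.end


def _replace_translate(exc: UnicodeTranslateError):
    # One U+FFFD per untranslatable character.
    return "\ufffd" * (exc.end - exc.start), exc.end


def demo_replace(exc):
    """Thin dispatcher: pick the helper matching the exception type."""
    if isinstance(exc, UnicodeEncodeError):
        return _replace_encode(exc)
    if isinstance(exc, UnicodeDecodeError):
        return _replace_decode(exc)
    if isinstance(exc, UnicodeTranslateError):
        return _replace_translate(exc)
    raise TypeError(f"don't know how to handle {type(exc).__name__}")


# Register under a made-up name so it can be used like a built-in handler.
codecs.register_error("demo-replace", demo_replace)

encoded = "h\xe9llo".encode("ascii", "demo-replace")
decoded = b"h\xffi".decode("ascii", "demo-replace")
```

With this layout each helper can be read and changed on its own, which is the same motivation given above for keeping fixes and refactoring in separate diffs.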
cc @vstinner @encukou
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response
Linked PRs
- `_PyUnicodeError_GetParams` in `PyCodec_IgnoreErrors` #129174
- `_PyUnicodeError_GetParams` in `PyCodec_NameReplaceErrors` #129135
- `_PyUnicodeError_GetParams` in `PyCodec_SurrogatePassErrors` #129134
- `_PyUnicodeError_GetParams` in `PyCodec_SurrogateEscapeErrors` #129175
- `PyCodec_ReplaceErrors` into separate functions #129893
- `PyCodec_XMLCharRefReplaceErrors` logic #129894
- `PyCodec_BackslashReplaceErrors` into separate functions #129895