gh-89653: PEP 670: Convert unicodeobject.h macros to functions #91696

vstinner
Member

Convert unicodeobject.h macros to static inline functions:

  • Reorder functions to declare functions before their first usage.
  • PyUnicode_READ_CHAR() and PyUnicode_MAX_CHAR_VALUE() now only call
    PyUnicode_KIND() once.
  • Simplify PyUnicode_GET_SIZE().
  • PyUnicode_READ_CHAR() now uses PyUnicode_1BYTE_DATA(),
    PyUnicode_2BYTE_DATA() and PyUnicode_4BYTE_DATA().
  • Remove redundant PyUnicode_Check() assertions.
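As a rough illustration of the PyUnicode_READ_CHAR() bullets above, the sketch below reads the kind once into a local variable and then dispatches to per-width data accessors. Every name in it (`SketchStr`, `sketch_kind`, `sketch_read_char`, and so on) is a hypothetical stand-in, and the struct layout is an assumption made for the example, not CPython's actual definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kinds: the value is the width in bytes of one character. */
enum { SKETCH_1BYTE = 1, SKETCH_2BYTE = 2, SKETCH_4BYTE = 4 };

/* Hypothetical, simplified string object (not CPython's real layout). */
typedef struct {
    unsigned int kind;   /* one of the SKETCH_*BYTE values */
    const void *data;    /* character buffer of the matching width */
} SketchStr;

static inline unsigned int sketch_kind(const SketchStr *s)
{
    return s->kind;
}

/* One typed accessor per character width. */
static inline const uint8_t *sketch_1byte_data(const SketchStr *s)
{
    return (const uint8_t *)s->data;
}

static inline const uint16_t *sketch_2byte_data(const SketchStr *s)
{
    return (const uint16_t *)s->data;
}

static inline const uint32_t *sketch_4byte_data(const SketchStr *s)
{
    return (const uint32_t *)s->data;
}

/* Read one character: the kind is fetched once, then reused. */
static inline uint32_t sketch_read_char(const SketchStr *s, size_t index)
{
    unsigned int kind = sketch_kind(s);
    if (kind == SKETCH_1BYTE) {
        return sketch_1byte_data(s)[index];
    }
    if (kind == SKETCH_2BYTE) {
        return sketch_2byte_data(s)[index];
    }
    assert(kind == SKETCH_4BYTE);
    return sketch_4byte_data(s)[index];
}

/* Small demo: read the second character of a 2-byte string. */
static inline uint32_t sketch_demo_read(void)
{
    static const uint16_t buf[] = { 0x0041, 0x20AC };
    SketchStr s = { SKETCH_2BYTE, buf };
    return sketch_read_char(&s, 1);
}
```

A macro version that mentions the kind expression in several branches can evaluate it more than once; a function body with a local variable avoids that.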

Static inline functions are wrapped in macros which cast pointer
types (PyObject*, void*) to avoid introducing new compiler warnings
when passing const pointers (e.g. PyUnicode_WRITE).

The PyUnicode_KIND() return type is "unsigned int" rather than "enum
PyUnicode_Kind" to avoid introducing new compiler warnings.
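The wrapping pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions: `SketchObj`, `SketchStrObj`, `sketch_length` and `SKETCH_LENGTH` are hypothetical names invented for the example and are not part of the CPython API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a base object and a string object. */
typedef struct { int refcnt; } SketchObj;
typedef struct { SketchObj base; size_t length; } SketchStrObj;

/* The converted function has a typed, non-const parameter. */
static inline size_t sketch_length(SketchObj *op)
{
    return ((SketchStrObj *)op)->length;
}

/* The wrapper macro casts its argument, so existing callers that pass
 * a const pointer, a void pointer or a SketchStrObj pointer keep
 * compiling without new warnings, as they did with the old macro. */
#define SKETCH_LENGTH(op) sketch_length((SketchObj *)(op))

/* Demo: the caller only holds a const pointer to the derived type. */
static inline size_t sketch_demo_const(void)
{
    SketchStrObj s = { { 1 }, 5 };
    const SketchStrObj *view = &s;
    /* sketch_length(view) would be diagnosed as an incompatible
     * pointer type; the macro's cast hides the difference. */
    return SKETCH_LENGTH(view);
}
```

The trade-off is that the cast also silences mistakes such as passing an unrelated pointer type, which is the same behavior the original macros had.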

@vstinner
Member Author

Changes since my #31221 POC:

  • Add macros to avoid emitting new compiler warnings, so C code using these functions doesn't need to be modified to fix const vs non-const pointers. PEP 670 requires this.
  • PyUnicode_KIND() return type is now unsigned int, not enum PyUnicode_Kind.
  • The following functions return type is now unsigned int instead of int: PyUnicode_IS_COMPACT(), PyUnicode_CHECK_INTERNED(), PyUnicode_IS_READY(), PyUnicode_IS_ASCII(). Previously, the macros just exposed structure members of type unsigned int.
  • Don't deprecate PyUnicode_GET_SIZE(), PyUnicode_GET_DATA_SIZE(), PyUnicode_AS_UNICODE(), PyUnicode_AS_DATA().
  • Don't add new assertions, to ease review.
  • Don't convert the _PyUnicodeWriter_Prepare() and _PyUnicodeWriter_PrepareKind() macros to functions, to keep this PR as small as possible.

I will redo some of these changes (like deprecating functions) as separate PRs once this PR is merged.

@vstinner
Member Author

Ah, and I also updated comments :-)

@erlend-aasland @gpshead: Would you mind reviewing this PR?

@vstinner
Member Author

@erlend-aasland: Tell me if you prefer that I split this PR into smaller PRs.

@vstinner
Member Author

The following functions return type is now unsigned int instead of int: PyUnicode_IS_COMPACT(), PyUnicode_CHECK_INTERNED(), PyUnicode_IS_READY(), PyUnicode_IS_ASCII(). Previously, the macros just exposed structure members of type unsigned int.

Oh, it seems like this is wrong. The following code emits a new compiler warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare].

    int skind = PyUnicode_KIND(self);
    ...
    int rkind = skind;
    ...
    assert(PyUnicode_KIND(u) == rkind);
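A possible explanation, offered as an assumption rather than something stated in this thread: when a macro exposes a narrow `unsigned int` bit-field directly, C's integer promotions convert the value to `int`, so comparing it with an `int` variable is int against int. A function declared to return `unsigned int` yields a real `unsigned int`, and the same comparison then mixes signedness. The sketch below uses hypothetical names (`SketchState`, `sketch_kind`, `sketch_kinds_match`) and a 3-bit field chosen for illustration.

```c
#include <assert.h>

/* Hypothetical state struct with a narrow unsigned bit-field. */
typedef struct {
    unsigned int kind : 3;
} SketchState;

/* Macro form: the bit-field is promoted to int when compared. */
#define SKETCH_KIND_MACRO(s) ((s)->kind)

/* Function form: the result is a genuine unsigned int. */
static inline unsigned int sketch_kind(const SketchState *s)
{
    return s->kind;
}

static inline int sketch_kinds_match(const SketchState *a,
                                     const SketchState *b)
{
    /* Existing callers often keep the kind in a plain int. */
    int skind = (int)sketch_kind(a);

    /* Writing `sketch_kind(b) == skind` here would compare an
     * unsigned int with an int, which is what -Wsign-compare flags.
     * Converting to one signedness first avoids the warning. */
    unsigned int rkind = (unsigned int)skind;
    return sketch_kind(b) == rkind;
}

/* Demo helper: build two states and compare their kinds. */
static inline int sketch_demo_match(unsigned int ka, unsigned int kb)
{
    SketchState a = { ka };
    SketchState b = { kb };
    return sketch_kinds_match(&a, &b);
}
```

Callers can avoid the warning either by storing the kind in an `unsigned int` or by casting at the comparison; which one fits depends on the surrounding code.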

@erlend-aasland
Contributor

@erlend-aasland: Tell me if you prefer that I split this PR into smaller PRs.

I would prefer that, yes :)

@vstinner
Member Author

I would prefer that, yes :)

Here is a shorter PR: #91705

@vstinner
Member Author

I split this PR into smaller PRs.

@vstinner vstinner closed this Apr 21, 2022
@vstinner vstinner deleted the unicode_static_inline2 branch April 21, 2022 21:01