python · methane · May 12, 2022 · May 8, 2022 · May 9, 2022 · May 9, 2022
diff --git a/Doc/c-api/arg.rst b/Doc/c-api/arg.rst
@@ -136,48 +136,6 @@ which disallows mutable objects such as :class:`bytearray`.
    attempting any conversion.  Raises :exc:`TypeError` if the object is not
    a :class:`bytearray` object. The C variable may also be declared as :c:type:`PyObject*`.
 
-``u`` (:class:`str`) [const Py_UNICODE \*]
-   Convert a Python Unicode object to a C pointer to a NUL-terminated buffer of
-   Unicode characters.  You must pass the address of a :c:type:`Py_UNICODE`
-   pointer variable, which will be filled with the pointer to an existing
-   Unicode buffer.  Please note that the width of a :c:type:`Py_UNICODE`
-   character depends on compilation options (it is either 16 or 32 bits).
-   The Python string must not contain embedded null code points; if it does,
-   a :exc:`ValueError` exception is raised.
-
-   .. versionchanged:: 3.5
-      Previously, :exc:`TypeError` was raised when embedded null code points
-      were encountered in the Python string.
-
-   .. deprecated-removed:: 3.3 3.12
-      Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
-      :c:func:`PyUnicode_AsWideCharString`.
-
-``u#`` (:class:`str`) [const Py_UNICODE \*, :c:type:`Py_ssize_t`]
-   This variant on ``u`` stores into two C variables, the first one a pointer to a
-   Unicode data buffer, the second one its length.  This variant allows
-   null code points.
-
-   .. deprecated-removed:: 3.3 3.12
-      Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
-      :c:func:`PyUnicode_AsWideCharString`.
-
-``Z`` (:class:`str` or ``None``) [const Py_UNICODE \*]
-   Like ``u``, but the Python object may also be ``None``, in which case the
-   :c:type:`Py_UNICODE` pointer is set to ``NULL``.
-
-   .. deprecated-removed:: 3.3 3.12
-      Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
-      :c:func:`PyUnicode_AsWideCharString`.
-
-``Z#`` (:class:`str` or ``None``) [const Py_UNICODE \*, :c:type:`Py_ssize_t`]
-   Like ``u#``, but the Python object may also be ``None``, in which case the
-   :c:type:`Py_UNICODE` pointer is set to ``NULL``.
-
-   .. deprecated-removed:: 3.3 3.12
-      Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
-      :c:func:`PyUnicode_AsWideCharString`.
-
 ``U`` (:class:`str`) [PyObject \*]
    Requires that the Python object is a Unicode object, without attempting
    any conversion.  Raises :exc:`TypeError` if the object is not a Unicode
@@ -247,6 +205,11 @@ which disallows mutable objects such as :class:`bytearray`.
    them. Instead, the implementation assumes that the byte string object uses the
    encoding passed in as parameter.
 
+.. versionchanged:: 3.12
+   ``u``, ``u#``, ``Z``, and ``Z#`` were removed because they used the legacy
+   ``Py_UNICODE*`` representation.
+
+
 Numbers
 -------
 

diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
@@ -17,26 +17,12 @@ of Unicode characters while staying memory efficient.  There are special cases
 for strings where all code points are below 128, 256, or 65536; otherwise, code
 points must be below 1114112 (which is the full Unicode range).
 
-:c:type:`Py_UNICODE*` and UTF-8 representations are created on demand and cached
-in the Unicode object.  The :c:type:`Py_UNICODE*` representation is deprecated
-and inefficient.
-
-Due to the transition between the old APIs and the new APIs, Unicode objects
-can internally be in two states depending on how they were created:
-
-* "canonical" Unicode objects are all objects created by a non-deprecated
-  Unicode API.  They use the most efficient representation allowed by the
-  implementation.
-
-* "legacy" Unicode objects have been created through one of the deprecated
-  APIs (typically :c:func:`PyUnicode_FromUnicode`) and only bear the
-  :c:type:`Py_UNICODE*` representation; you will have to call
-  :c:func:`PyUnicode_READY` on them before calling any other API.
+The UTF-8 representation is created on demand and cached in the Unicode object.
 
 .. note::
-   The "legacy" Unicode object will be removed in Python 3.12 with deprecated
-   APIs. All Unicode objects will be "canonical" since then. See :pep:`623`
-   for more information.
+   The :c:type:`Py_UNICODE` representation was removed in Python 3.12,
+   together with the deprecated APIs that used it.
+   See :pep:`623` for more information.
 
 
 Unicode Type
@@ -101,18 +87,12 @@ access to internal read-only data of Unicode objects:
 
 .. c:function:: int PyUnicode_READY(PyObject *o)
 
-   Ensure the string object *o* is in the "canonical" representation.  This is
-   required before using any of the access macros described below.
-
-   .. XXX expand on when it is not required
-
-   Returns ``0`` on success and ``-1`` with an exception set on failure, which in
-   particular happens if memory allocation fails.
+   Returns ``0``. This API is kept only for backward compatibility.
 
    .. versionadded:: 3.3
 
-   .. deprecated-removed:: 3.10 3.12
-      This API will be removed with :c:func:`PyUnicode_FromUnicode`.
+   .. deprecated:: 3.10
+      This API does nothing since Python 3.12. Please remove code that calls this function.
 
 
 .. c:function:: Py_ssize_t PyUnicode_GET_LENGTH(PyObject *o)
@@ -130,23 +110,21 @@ access to internal read-only data of Unicode objects:
    Return a pointer to the canonical representation cast to UCS1, UCS2 or UCS4
    integer types for direct character access.  No checks are performed if the
    canonical representation has the correct character size; use
-   :c:func:`PyUnicode_KIND` to select the right function.  Make sure
-   :c:func:`PyUnicode_READY` has been called before accessing this.
+   :c:func:`PyUnicode_KIND` to select the right function.
 
    .. versionadded:: 3.3
 
 
-.. c:macro:: PyUnicode_WCHAR_KIND
-             PyUnicode_1BYTE_KIND
+.. c:macro:: PyUnicode_1BYTE_KIND
              PyUnicode_2BYTE_KIND
              PyUnicode_4BYTE_KIND
 
    Return values of the :c:func:`PyUnicode_KIND` macro.
 
    .. versionadded:: 3.3
 
-   .. deprecated-removed:: 3.10 3.12
-      ``PyUnicode_WCHAR_KIND`` is deprecated.
+   .. versionchanged:: 3.12
+      ``PyUnicode_WCHAR_KIND`` has been removed.
 
 
 .. c:function:: int PyUnicode_KIND(PyObject *o)
@@ -155,8 +133,6 @@ access to internal read-only data of Unicode objects:
    bytes per character this Unicode object uses to store its data.  *o* has to
    be a Unicode object in the "canonical" representation (not checked).
 
-   .. XXX document "0" return value?
-
    .. versionadded:: 3.3
 
 
@@ -208,49 +184,6 @@ access to internal read-only data of Unicode objects:
    .. versionadded:: 3.3
 
 
-.. c:function:: Py_ssize_t PyUnicode_GET_SIZE(PyObject *o)
-
-   Return the size of the deprecated :c:type:`Py_UNICODE` representation, in
-   code units (this includes surrogate pairs as 2 units).  *o* has to be a
-   Unicode object (not checked).
-
-   .. deprecated-removed:: 3.3 3.12
-      Part of the old-style Unicode API, please migrate to using
-      :c:func:`PyUnicode_GET_LENGTH`.
-
-
-.. c:function:: Py_ssize_t PyUnicode_GET_DATA_SIZE(PyObject *o)
-
-   Return the size of the deprecated :c:type:`Py_UNICODE` representation in
-   bytes.  *o* has to be a Unicode object (not checked).
-
-   .. deprecated-removed:: 3.3 3.12
-      Part of the old-style Unicode API, please migrate to using
-      :c:func:`PyUnicode_GET_LENGTH`.
-
-
-.. c:function:: Py_UNICODE* PyUnicode_AS_UNICODE(PyObject *o)
-                const char* PyUnicode_AS_DATA(PyObject *o)
-
-   Return a pointer to a :c:type:`Py_UNICODE` representation of the object.  The
-   returned buffer is always terminated with an extra null code point.  It
-   may also contain embedded null code points, which would cause the string
-   to be truncated when used in most C functions.  The ``AS_DATA`` form
-   casts the pointer to :c:type:`const char *`.  The *o* argument has to be
-   a Unicode object (not checked).
-
-   .. versionchanged:: 3.3
-      This function is now inefficient -- because in many cases the
-      :c:type:`Py_UNICODE` representation does not exist and needs to be created
-      -- and can fail (return ``NULL`` with an exception set).  Try to port the
-      code to use the new :c:func:`PyUnicode_nBYTE_DATA` macros or use
-      :c:func:`PyUnicode_WRITE` or :c:func:`PyUnicode_READ`.
-
-   .. deprecated-removed:: 3.3 3.12
-      Part of the old-style Unicode API, please migrate to using the
-      :c:func:`PyUnicode_nBYTE_DATA` family of macros.
-
-
 .. c:function:: int PyUnicode_IsIdentifier(PyObject *o)
 
    Return ``1`` if the string is a valid identifier according to the language
@@ -436,12 +369,17 @@ APIs:
 
    Create a Unicode object from the char buffer *u*.  The bytes will be
    interpreted as being UTF-8 encoded.  The buffer is copied into the new
-   object. If the buffer is not ``NULL``, the return value might be a shared
-   object, i.e. modification of the data is not allowed.
+   object.
+   The return value might be a shared object, i.e. modification of the data is
+   not allowed.
 
-   If *u* is ``NULL``, this function behaves like :c:func:`PyUnicode_FromUnicode`
-   with the buffer set to ``NULL``.  This usage is deprecated in favor of
-   :c:func:`PyUnicode_New`, and will be removed in Python 3.12.
+   This function raises :exc:`SystemError` when:
+
+   * *size* < 0,
+   * *u* is ``NULL`` and *size* > 0
+
+   .. versionchanged:: 3.12
+      *u* == ``NULL`` with *size* > 0 is not allowed anymore.
 
 
 .. c:function:: PyObject *PyUnicode_FromString(const char *u)
@@ -680,79 +618,6 @@ APIs:
    .. versionadded:: 3.3
 
 
-Deprecated Py_UNICODE APIs
-""""""""""""""""""""""""""
-
-.. deprecated-removed:: 3.3 3.12
-
-These API functions are deprecated with the implementation of :pep:`393`.
-Extension modules can continue using them, as they will not be removed in Python
-3.x, but need to be aware that their use can now cause performance and memory hits.
-
-
-.. c:function:: PyObject* PyUnicode_FromUnicode(const Py_UNICODE *u, Py_ssize_t size)
-
-   Create a Unicode object from the Py_UNICODE buffer *u* of the given size. *u*
-   may be ``NULL`` which causes the contents to be undefined. It is the user's
-   responsibility to fill in the needed data.  The buffer is copied into the new
-   object.
-
-   If the buffer is not ``NULL``, the return value might be a shared object.
-   Therefore, modification of the resulting Unicode object is only allowed when
-   *u* is ``NULL``.
-
-   If the buffer is ``NULL``, :c:func:`PyUnicode_READY` must be called once the
-   string content has been filled before using any of the access macros such as
-   :c:func:`PyUnicode_KIND`.
-
-   .. deprecated-removed:: 3.3 3.12
-      Part of the old-style Unicode API, please migrate to using
-      :c:func:`PyUnicode_FromKindAndData`, :c:func:`PyUnicode_FromWideChar`, or
-      :c:func:`PyUnicode_New`.
-
-
-.. c:function:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode)
-
-   Return a read-only pointer to the Unicode object's internal
-   :c:type:`Py_UNICODE` buffer, or ``NULL`` on error. This will create the
-   :c:type:`Py_UNICODE*` representation of the object if it is not yet
-   available. The buffer is always terminated with an extra null code point.
-   Note that the resulting :c:type:`Py_UNICODE` string may also contain
-   embedded null code points, which would cause the string to be truncated when
-   used in most C functions.
-
-   .. deprecated-removed:: 3.3 3.12
-      Part of the old-style Unicode API, please migrate to using
-      :c:func:`PyUnicode_AsUCS4`, :c:func:`PyUnicode_AsWideChar`,
-      :c:func:`PyUnicode_ReadChar` or similar new APIs.
-
-
-.. c:function:: Py_UNICODE* PyUnicode_AsUnicodeAndSize(PyObject *unicode, Py_ssize_t *size)
-
-   Like :c:func:`PyUnicode_AsUnicode`, but also saves the :c:func:`Py_UNICODE`
-   array length (excluding the extra null terminator) in *size*.
-   Note that the resulting :c:type:`Py_UNICODE*` string
-   may contain embedded null code points, which would cause the string to be
-   truncated when used in most C functions.
-
-   .. versionadded:: 3.3
-
-   .. deprecated-removed:: 3.3 3.12
-      Part of the old-style Unicode API, please migrate to using
-      :c:func:`PyUnicode_AsUCS4`, :c:func:`PyUnicode_AsWideChar`,
-      :c:func:`PyUnicode_ReadChar` or similar new APIs.
-
-
-.. c:function:: Py_ssize_t PyUnicode_GetSize(PyObject *unicode)
-
-   Return the size of the deprecated :c:type:`Py_UNICODE` representation, in
-   code units (this includes surrogate pairs as 2 units).
-
-   .. deprecated-removed:: 3.3 3.12
-      Part of the old-style Unicode API, please migrate to using
-      :c:func:`PyUnicode_GET_LENGTH`.
-
-
 .. c:function:: PyObject* PyUnicode_FromObject(PyObject *obj)
 
    Copy an instance of a Unicode subtype to a new true Unicode object if
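
Note on the storage kinds this file refers to (code points below 128, 256, or 65536, and the ``PyUnicode_1BYTE_KIND`` / ``2BYTE`` / ``4BYTE`` values): the following is a minimal Python sketch, not CPython's implementation, of how the narrowest width follows from the largest code point in a string. The helper name ``storage_kind`` is hypothetical, and the return value is meant as bytes per character, in the sense used in the ``PyUnicode_KIND`` description.

```python
def storage_kind(text: str) -> int:
    """Return the bytes per character a string of this content needs (1, 2 or 4)."""
    # Hypothetical helper for illustration only; CPython decides this in C.
    largest = max(map(ord, text), default=0)
    if largest < 256:
        # ASCII (< 128) and Latin-1 (< 256) both fit in one byte per character.
        return 1
    if largest < 65536:
        # Everything else in the Basic Multilingual Plane fits in two bytes.
        return 2
    # Code points up to 1114111 need four bytes per character.
    return 4


for sample in ("abc", "caf\u00e9", "\u20ac", "\U0001F600"):
    print(ascii(sample), storage_kind(sample))
```

With the ``Py_UNICODE*`` representation gone, a width chosen this way is the only character storage the docs above still describe, which is why ``PyUnicode_WCHAR_KIND`` has no remaining meaning.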

diff --git a/Doc/data/stable_abi.dat b/Doc/data/stable_abi.dat
@@ -848,15 +848,15 @@ on the right is the text you'd replace it with.
 ``'s#'``    ``str(zeroes=True)``
 ``'s*'``    ``Py_buffer(accept={buffer, str})``
 ``'U'``     ``unicode``
-``'u'``     ``Py_UNICODE``
-``'u#'``    ``Py_UNICODE(zeroes=True)``
+``'u'``     ``wchar_t``
+``'u#'``    ``wchar_t(zeroes=True)``
 ``'w*'``    ``Py_buffer(accept={rwbuffer})``
 ``'Y'``     ``PyByteArrayObject``
 ``'y'``     ``str(accept={bytes})``
 ``'y#'``    ``str(accept={robuffer}, zeroes=True)``
 ``'y*'``    ``Py_buffer``
-``'Z'``     ``Py_UNICODE(accept={str, NoneType})``
-``'Z#'``    ``Py_UNICODE(accept={str, NoneType}, zeroes=True)``
+``'Z'``     ``wchar_t(accept={str, NoneType})``
+``'Z#'``    ``wchar_t(accept={str, NoneType}, zeroes=True)``
 ``'z'``     ``str(accept={str, NoneType})``
 ``'z#'``    ``str(accept={str, NoneType}, zeroes=True)``
 ``'z*'``    ``Py_buffer(accept={buffer, str, NoneType})``

diff --git a/Doc/whatsnew/3.12.rst b/Doc/whatsnew/3.12.rst
@@ -66,6 +66,9 @@ Summary -- Release highlights
 
 .. PEP-sized items next.
 
+Important deprecations, removals or restrictions:
+
+* :pep:`623`, Remove wstr from Unicode
 
 
 New Features
@@ -91,7 +94,9 @@ Improved Modules
 Optimizations
 =============
 
-
+* Removed ``wstr`` and ``wstr_length`` members from Unicode objects.
+  This reduces object size by 8 or 16 bytes on 64-bit platforms. (:pep:`623`)
+  (Contributed by Inada Naoki in :gh:`92536`.)
 
 
 Deprecated
@@ -140,6 +145,13 @@ New Features
 Porting to Python 3.12
 ----------------------
 
+* Legacy Unicode APIs based on the ``Py_UNICODE*`` representation have been removed.
+  Please migrate to APIs based on UTF-8 or ``wchar_t*``.
+
+* Argument parsing functions like :c:func:`PyArg_ParseTuple` no longer support
+  ``Py_UNICODE*``-based formats (e.g. ``u``, ``Z``). Please migrate
+  to other formats for Unicode like ``s``, ``z``, ``es``, and ``U``.
+
 Deprecated
 ----------
 
@@ -150,3 +162,15 @@ Removed
   API. The ``token.h`` header file was only designed to be used by Python
   internals.
   (Contributed by Victor Stinner in :gh:`92651`.)
+
+* Legacy Unicode APIs have been removed. See :pep:`623` for details.
+
+   * :c:macro:`PyUnicode_WCHAR_KIND`
+   * :c:func:`PyUnicode_AS_UNICODE`
+   * :c:func:`PyUnicode_AsUnicode`
+   * :c:func:`PyUnicode_AsUnicodeAndSize`
+   * :c:func:`PyUnicode_AS_DATA`
+   * :c:func:`PyUnicode_FromUnicode`
+   * :c:func:`PyUnicode_GET_SIZE`
+   * :c:func:`PyUnicode_GetSize`
+   * :c:func:`PyUnicode_GET_DATA_SIZE`
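
Note on the ``wchar_t*`` migration path named in the porting section: the sketch below calls ``PyUnicode_AsWideCharString`` (the replacement the removed ``u`` / ``u#`` entries pointed to) through ``ctypes.pythonapi``, so it can be tried without compiling an extension. It assumes a CPython interpreter; the helper name ``as_wide`` and its ``want_size`` parameter are hypothetical and exist only for illustration.

```python
import ctypes

api = ctypes.pythonapi  # the running interpreter's C API, loaded as a PyDLL

api.PyUnicode_AsWideCharString.argtypes = [
    ctypes.py_object,
    ctypes.POINTER(ctypes.c_ssize_t),
]
api.PyUnicode_AsWideCharString.restype = ctypes.c_void_p
api.PyMem_Free.argtypes = [ctypes.c_void_p]
api.PyMem_Free.restype = None


def as_wide(text, want_size=True):
    """Round-trip text through a freshly allocated wchar_t buffer."""
    size = ctypes.c_ssize_t(0)
    # A NULL size pointer behaves like the removed ``u`` format: embedded
    # null characters raise ValueError. A real pointer behaves like ``u#``:
    # the length is reported and embedded nulls are allowed.
    size_arg = ctypes.byref(size) if want_size else None
    ptr = api.PyUnicode_AsWideCharString(text, size_arg)
    try:
        if want_size:
            return ctypes.wstring_at(ptr, size.value)
        return ctypes.wstring_at(ptr)
    finally:
        # Unlike the old Py_UNICODE pointer, this buffer is owned by the
        # caller and must be released with PyMem_Free.
        api.PyMem_Free(ptr)


print(ascii(as_wide("h\u00e9llo")))
print(ascii(as_wide("a\0b")))
```

The main behavioural difference to keep in mind when porting C code is ownership: the removed formats handed back a pointer into the string object, while ``PyUnicode_AsWideCharString`` returns a new buffer that the caller frees.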