Skip to content

_socket extension C capsule cannot be visited by the GC and so _socket.socket type may stay alive longer than expected #108240

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
2 tasks done
vstinner opened this issue Aug 21, 2023 · 19 comments
Labels
type-bug An unexpected behavior, bug, or error

Comments

@vstinner
Copy link
Member

vstinner commented Aug 21, 2023

Bug report

Checklist

  • I am confident this is a bug in CPython, not a bug in a third-party project
  • I have searched the CPython issue tracker,
    and am confident this bug has not been reported before

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.13.0a0 (heads/main:d63972e289, Aug 21 2023, 22:29:24) [GCC 13.2.1 20230728 (Red Hat 13.2.1-1)]

A clear and concise description of the bug:

The following script will never delete _socket.socket type:

import _socket

# uncomment to workaround the bug
#_socket.CAPI = None

_socket = None

import sys
del sys.modules['_socket']

import gc
gc.collect()

print("exit")

Example on a Python debug build:

$ ./python -X showrefcount script.py 
exit
[408 refs, 260 blocks]

The GC cannot visit the strong reference to the type stored in _socket.CAPI capsule (C API). In Python, creating a heap type creates a reference cycle. For example, a type MRO contains the type. Methods also contain a strong reference to their type.

$ ./python
Python 3.13.0a0 (heads/main:d63972e289, Aug 21 2023, 22:29:24) [GCC 13.2.1 20230728 (Red Hat 13.2.1-1)] on linux
>>> import _socket
>>> t=_socket.socket
>>> t.__mro__[0] is t
True

See also:

Linked PRs

@vstinner vstinner added the type-bug An unexpected behavior, bug, or error label Aug 21, 2023
vstinner added a commit to vstinner/cpython that referenced this issue Aug 21, 2023
_socket.CAPI capsule contains a strong reference to _socket.socket
type. The garbage collector cannot visit this reference and so cannot
create the reference cycle involving _socket.socket (a heap type
creates a reference cycle to inside, via MRO and methods). At Python,
_PyImport_ClearModules() sets _socket attributes to None which works
around the issue.

If the module is cleared from sys.modules manually,
_PyImport_ClearModules() cannot set _socket.CAPI to None and so the
issue cannot be worked around.

Change _socket.CAPI to use a borrowed reference instead of a strong
reference to allow clearing _socket.socket in this case.

Co-authored-by: Kirill Podoprigora <[email protected]>
@vstinner
Copy link
Member Author

I wrote PR #108241 to fix the issue for the _socket extension: avoid strong references in the capsule object.

@vstinner
Copy link
Member Author

_curses extension is affected by the same bug: _curses._C_API capsule stores a strong reference to _curses.window heap type.

Extensions not affected by the issue, using borrowed references:

  • _datetime.datetime_CAPI stores borrowed references to _datetime heap types
  • pyexpat.expat_CAPI does not store types
  • unicodedata._ucnhash_CAPI does not store types
  • _testcapi capsules don't store types

@vstinner
Copy link
Member Author

By the way, commit 46366ca may have introduced a leak: see PR #107577.

@vstinner
Copy link
Member Author

I closed issue #107789 as a duplicate of this issue.

@vstinner
Copy link
Member Author

vstinner commented Aug 21, 2023

Oh, @Eclips4 found that the _decimal extension is also affected: #107577 (comment)

@Eclips4
Copy link
Member

Eclips4 commented Aug 21, 2023

Oh, @Eclips4 found that the _decimal extension is also affected: #107577 (comment)

Probably related: #107994

@vstinner
Copy link
Member Author

Probably related: #107994

Oh right, the _decimal extension has no capsule, so the refleak is unrelated to this issue.

@vstinner
Copy link
Member Author

cc @erlend-aasland @corona10 @pablogsal: yet another complex GC issue involving extension multi-phase initialization and the garbage collector.

@vstinner
Copy link
Member Author

The _curses extension was not ported to multi-phase init API (PEP 489) yet. It still uses static types.

@Eclips4
Copy link
Member

Eclips4 commented Aug 21, 2023

The _curses extension was not ported to multi-phase init API (PEP 489) yet. It still uses static types.

You mean leak in the _curses is a known issue?

import importlib
import sys

def foo():
    name = "_curses"
    importlib.import_module(name)
    del sys.modules[name]


for _ in range(4):
    foo()

./python.exe -X showrefcount example.py
[1006 refs, 491 blocks]

@vstinner
Copy link
Member Author

You mean leak in the _curses is a known issue?

Yes. It should be ported to multi-phase init and its static types should be converted to heap types.

@CharlieZhao95
Copy link
Contributor

Yes. It should be ported to multi-phase init and its static types should be converted to heap types.

FYI, it seems that work on isolating the _curses extension module is suspended. #101714 (comment)

@Eclips4
Copy link
Member

Eclips4 commented Aug 22, 2023

FYI, it seems that work on isolating the _curses extension module is suspended. #101714 (comment)

So, should we re-open @chgnrdv PR about isolating _curses? Seems work almost done, but isolating module only for avoid reference leak looks kinda weird.

@erlend-aasland
Copy link
Contributor

Using curses in a subinterpreter makes as little sense as using any other terminal-dependent extension module (like for example the readline module). Those extension modules only make sense if used from the main interpreter. Is there a use case where you want two extension modules trying to use the terminal from two different interpreters?

@vstinner
Copy link
Member Author

The issue with _curses is that it leaks references, and so it makes debugging real leaks more complicated at Python exit:

$ ./python -X showrefcount -c 'pass'
[0 refs, 0 blocks]
$ ./python -X showrefcount -c 'import _curses'
[90 refs, 87 blocks]

I fixed some stdlib extensions by adding _PyTypes_FiniTypes() function which is called at Python exit. The problem here is that the _curses extension is a shared library, so Python cannot track its types. Maybe an "at exit" callback can be used for such cases (but not Py_AtExit() which calls the callback after the interpreter is deleted). Converting _curses to multi-phase init would give more freedom on how resources are released.

@erlend-aasland
Copy link
Contributor

The problem with curses and readline is that they rely on process wide globals, which mean module state just won't do it. Try converting readline to multi-phase init, and you'll see what I mean :) Tkinter has similar issues.

@vstinner
Copy link
Member Author

The _ast extension caused me lot of troubles since there are APIs to access its state without importing the extension. Its state is now stored in the interpreter :-)

@erlend-aasland
Copy link
Contributor

Yes, storing the state in the interpreter is an option, but I don't think we want to do that for tkinter, readline, and curses.

vstinner added a commit to vstinner/cpython that referenced this issue Aug 22, 2023
The _socket extension uses PyCapsule_SetTraverse() to visit and clear
the socket type in the garbage collector. So the _socket.socket type
can be cleared in some corner cases when it wasn't possible before.
vstinner added a commit to vstinner/cpython that referenced this issue Aug 22, 2023
The _socket extension uses PyCapsule_SetTraverse() to visit and clear
the socket type in the garbage collector. So the _socket.socket type
can be cleared in some corner cases when it wasn't possible before.
vstinner added a commit to vstinner/cpython that referenced this issue Aug 22, 2023
The _socket extension uses PyCapsule_SetTraverse() to visit and clear
the socket type in the garbage collector. So the _socket.socket type
can be cleared in some corner cases when it wasn't possible before.
vstinner added a commit to vstinner/cpython that referenced this issue Aug 23, 2023
The _socket extension uses PyCapsule_SetTraverse() to visit and clear
the socket type in the garbage collector. So the _socket.socket type
can be cleared in some corner cases when it wasn't possible before.
vstinner added a commit to vstinner/cpython that referenced this issue Aug 23, 2023
The _socket extension uses _PyCapsule_SetTraverse() to visit and clear
the socket type in the garbage collector. So the _socket.socket type
can be cleared in some corner cases when it wasn't possible before.
vstinner added a commit to vstinner/cpython that referenced this issue Aug 23, 2023
The _socket extension uses _PyCapsule_SetTraverse() to visit and clear
the socket type in the garbage collector. So the _socket.socket type
can be cleared in some corner cases when it wasn't possible before.
vstinner added a commit that referenced this issue Aug 23, 2023
The _socket extension uses _PyCapsule_SetTraverse() to visit and clear
the socket type in the garbage collector. So the _socket.socket type
can be cleared in some corner cases when it wasn't possible before.
@vstinner
Copy link
Member Author

Fixed by commit 513c89d

Thanks for your bug report @Eclips4.

vstinner added a commit to vstinner/cpython that referenced this issue Aug 24, 2023
vstinner added a commit to vstinner/cpython that referenced this issue Aug 24, 2023
vstinner added a commit to vstinner/cpython that referenced this issue Aug 29, 2023
Move _PyCapsule_SetTraverse() definition to a new internal header
file: pycore_capsule.h.
vstinner added a commit that referenced this issue Aug 29, 2023
Move _PyCapsule_SetTraverse() definition to a new internal header
file: pycore_capsule.h.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants