Skip to content

gh-130202: Fix bug in _PyObject_ResurrectEnd in free threaded build #130281

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 25, 2025

Conversation

colesbury
Copy link
Contributor

@colesbury colesbury commented Feb 18, 2025

This fixes a fairly subtle bug involving finalizers and resurrection in debug free threaded builds: if _PyObject_ResurrectEnd returns 1 (i.e., the object was resurrected by a finalizer), it's not safe to access the object because it might still be deallocated. For example:

  • The finalizer may have exposed the object to another thread. That thread may hold the last reference and concurrently deallocate it any time after _PyObject_ResurrectEnd() returns 1.
  • _PyObject_ResurrectEnd() may call _Py_brc_queue_object(), which may internally deallocate the object immediately if the owning thread is dead.

Therefore, it's important not to access the object after it's resurrected. We only violate this in two cases, and only in debug builds:

  • We assert that the object is tracked appropriately. This is now moved up betewen the finalizer and the _PyObject_ResurrectEnd() call.

  • The --with-trace-refs builds may need to remember the object if it's resurrected. This is now handled by _PyObject_ResurrectStart() and _PyObject_ResurrectEnd().

Note that --with-trace-refs is currently disabled in --disable-gil builds because the refchain hash table isn't thread-safe, but this refactoring avoids an additional thread-safety issue.

@colesbury colesbury added skip news 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section topic-free-threading labels Feb 18, 2025
@colesbury colesbury added 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section and removed 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section labels Feb 18, 2025
@colesbury

This comment was marked as resolved.

@bedevere-bot

This comment was marked as resolved.

… build

This fixes a fairly subtle bug involving finalizers and resurrection in
debug free threaded builds: if `_PyObject_ResurrectEnd` returns `1`
(i.e., the object was resurrected by a finalizer), it's not safe to
access the object because it might still be deallocated. For example:

 * The finalizer may have exposed the object to another thread. That
   thread may hold the last reference and concurrently deallocate it any
   time after `_PyObject_ResurrectEnd()` returns `1`.
 * `_PyObject_ResurrectEnd()` may call `_Py_brc_queue_object()`, which
   may internally deallocate the object immediately if the owning thread
   is dead.

Therefore, it's important not to access the object after it's
resurrected. We only violate this in two cases, and only in debug
builds:

 * We assert that the object is tracked appropriately. This is now moved
   up betewen the finalizer and the `_PyObject_ResurrectEnd()` call.

 * The `--with-trace-refs` builds may need to remember the object if
   it's resurrected. This is now handled by `_PyObject_ResurrectStart()`
   and `_PyObject_ResurrectEnd()`.

Note that `--with-trace-refs` is currently disabled in `--disable-gil`
builds because the refchain hash table isn't thread-safe, but this
refactoring avoids an additional thread-safety issue.
@colesbury colesbury removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Feb 18, 2025
@colesbury colesbury marked this pull request as ready for review February 18, 2025 21:42
@colesbury colesbury added 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section and removed 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section labels Feb 18, 2025
Copy link
Contributor

@mpage mpage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@colesbury colesbury merged commit f963239 into python:main Feb 25, 2025
49 checks passed
@colesbury colesbury deleted the gh-130202-resurrect branch February 25, 2025 17:03
@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot iOS ARM64 Simulator 3.x has failed when building commit f963239.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/1380/builds/2687) and take a look at the build logs.
  4. Check if the failure is related to this commit (f963239) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/1380/builds/2687

Failed tests:

  • test.test_concurrent_futures.test_interpreter_pool

Failed subtests:

  • test_map_exception - test.test_concurrent_futures.test_interpreter_pool.InterpreterPoolExecutorTest.test_map_exception

Summary of the results of the build (if available):

==

Click to see traceback logs
Traceback (most recent call last):
  File "/Users/buildbot/Library/Developer/XCTestDevices/03BBFD20-9BB7-463D-8836-E126B03F9D57/data/Containers/Bundle/Application/2A7C7908-357F-4202-9B95-5FD766753D27/iOSTestbed.app/python/lib/python3.14/test/test_concurrent_futures/executor.py", line 54, in test_map_exception
    self.assertEqual(i.__next__(), (0, 1))
                     ~~~~~~~~~~^^
  File "/Users/buildbot/Library/Developer/XCTestDevices/03BBFD20-9BB7-463D-8836-E126B03F9D57/data/Containers/Bundle/Application/2A7C7908-357F-4202-9B95-5FD766753D27/iOSTestbed.app/python/lib/python3.14/concurrent/futures/_base.py", line 611, in result_iterator
    yield _result_or_cancel(fs.pop())
          ~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "/Users/buildbot/Library/Developer/XCTestDevices/03BBFD20-9BB7-463D-8836-E126B03F9D57/data/Containers/Bundle/Application/2A7C7908-357F-4202-9B95-5FD766753D27/iOSTestbed.app/python/lib/python3.14/concurrent/futures/_base.py", line 309, in _result_or_cancel
    return fut.result(timeout)
           ~~~~~~~~~~^^^^^^^^^
  File "/Users/buildbot/Library/Developer/XCTestDevices/03BBFD20-9BB7-463D-8836-E126B03F9D57/data/Containers/Bundle/Application/2A7C7908-357F-4202-9B95-5FD766753D27/iOSTestbed.app/python/lib/python3.14/concurrent/futures/_base.py", line 448, in result
    return self.__get_result()
           ~~~~~~~~~~~~~~~~~^^
  File "/Users/buildbot/Library/Developer/XCTestDevices/03BBFD20-9BB7-463D-8836-E126B03F9D57/data/Containers/Bundle/Application/2A7C7908-357F-4202-9B95-5FD766753D27/iOSTestbed.app/python/lib/python3.14/concurrent/futures/_base.py", line 393, in __get_result
    raise self._exception
concurrent.futures.interpreter.BrokenInterpreterPool: A thread initializer failed, the thread pool is not usable anymore


Traceback (most recent call last):
  File "/Users/buildbot/Library/Developer/XCTestDevices/03BBFD20-9BB7-463D-8836-E126B03F9D57/data/Containers/Bundle/Application/2A7C7908-357F-4202-9B95-5FD766753D27/iOSTestbed.app/python/lib/python3.14/concurrent/futures/thread.py", line 99, in _worker
    ctx.initialize()
    ~~~~~~~~~~~~~~^^
  File "/Users/buildbot/Library/Developer/XCTestDevices/03BBFD20-9BB7-463D-8836-E126B03F9D57/data/Containers/Bundle/Application/2A7C7908-357F-4202-9B95-5FD766753D27/iOSTestbed.app/python/lib/python3.14/concurrent/futures/interpreter.py", line 132, in initialize
    self.interpid = _interpreters.create(reqrefs=True)
                    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
interpreters.InterpreterError: interpreter creation failed
ERROR

@freakboy3742
Copy link
Contributor

Not sure what's going on with the failure here - there was a failure trying to initialise a sub interpreter. The subsequent CI run passed without error.

seehwan pushed a commit to seehwan/cpython that referenced this pull request Apr 16, 2025
… build (pythongh-130281)

This fixes a fairly subtle bug involving finalizers and resurrection in
debug free threaded builds: if `_PyObject_ResurrectEnd` returns `1`
(i.e., the object was resurrected by a finalizer), it's not safe to
access the object because it might still be deallocated. For example:

 * The finalizer may have exposed the object to another thread. That
   thread may hold the last reference and concurrently deallocate it any
   time after `_PyObject_ResurrectEnd()` returns `1`.
 * `_PyObject_ResurrectEnd()` may call `_Py_brc_queue_object()`, which
   may internally deallocate the object immediately if the owning thread
   is dead.

Therefore, it's important not to access the object after it's
resurrected. We only violate this in two cases, and only in debug
builds:

 * We assert that the object is tracked appropriately. This is now moved
   up betewen the finalizer and the `_PyObject_ResurrectEnd()` call.

 * The `--with-trace-refs` builds may need to remember the object if
   it's resurrected. This is now handled by `_PyObject_ResurrectStart()`
   and `_PyObject_ResurrectEnd()`.

Note that `--with-trace-refs` is currently disabled in `--disable-gil`
builds because the refchain hash table isn't thread-safe, but this
refactoring avoids an additional thread-safety issue.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants