gh-123271: Make builtin zip method safe under free-threading #123272
colesbury merged 15 commits into python:main
| @@ -0,0 +1 @@ | |||
| Make :meth:`zip` safe under free-threading. | |||
| Make :meth:`zip` safe under free-threading. | |
| Make :func:`zip` thread-safe without the :term:`GIL`. |
Even with this PR the zip function is not thread-safe: when iterating over zip(range(100), range(100)) from several threads, one can obtain a tuple such as (2, 3). What this PR (hopefully) does is modify the code so that the interpreter does not crash. For this reason I would like to be careful with the term "thread-safe". I chose not to make zip fully thread-safe because of the performance impact; see the discussion in #120496.
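To make the distinction concrete, here is a minimal sketch (not taken from the PR) of several threads draining one shared zip iterator. The helper name `consume_shared_zip` and its parameters are made up for illustration. On a free-threaded build, mismatched tuples may show up and some items may be dropped, but the interpreter should not crash.

```python
import threading


def consume_shared_zip(n=1000, number_of_threads=4):
    """Drain one zip iterator from several threads (hypothetical helper).

    Returns all tuples that were produced and the subset whose two
    elements differ. Mismatches are possible without the GIL because the
    two underlying iterators are not advanced as one atomic step.
    """
    shared = zip(range(n), range(n))
    results = []
    lock = threading.Lock()

    def worker():
        local = []
        for pair in shared:  # every thread pulls from the same iterator
            local.append(pair)
        with lock:
            results.extend(local)

    worker_threads = [
        threading.Thread(target=worker) for _ in range(number_of_threads)
    ]
    for t in worker_threads:
        t.start()
    for t in worker_threads:
        t.join()

    mismatched = [pair for pair in results if pair[0] != pair[1]]
    return results, mismatched


if __name__ == "__main__":
    results, mismatched = consume_shared_zip()
    print(len(results), "tuples,", len(mismatched), "mismatched")
```

The counts printed depend on the build and on scheduling, so no particular output should be expected.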
Thanks for clarifying. The word "safe" is a bit confusing; I guess a lot of other people would read it as thread-safe. I think we need to clarify the reason why zip isn't safe in the free-threaded build.
Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
Python/bltinmodule.c
Outdated
| #ifdef Py_GIL_DISABLED | ||
| int reuse_tuple = 0; | ||
| #else | ||
| int reuse_tuple = Py_REFCNT(result) == 1; | ||
| #endif | ||
| if (reuse_tuple) { |
I think we want to enable re-use under certain conditions. We should probably put the logic in some internal function, possibly in pycore_object.h. I think the condition for the free-threaded build is:
- ob_tid matches _Py_ThreadId()
- ob_ref_local is 1
- ob_ref_shared is 0
The logic is that no other thread calling zip_next can re-use lz->result because of the thread id condition (condition 1). And no thread outside of zip_next can incref that object because the combined conditions ensure that lz->result is the only reference.
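The three conditions can be sketched as a toy model in Python. This is only an illustration of the logic, not the C implementation in CPython: the class `ToyObjectHeader` and the function `can_reuse` are hypothetical names, and only the field names come from the comment above.

```python
from dataclasses import dataclass


@dataclass
class ToyObjectHeader:
    """Toy stand-in for the refcount fields of a free-threaded object."""
    ob_tid: int         # id of the owning thread
    ob_ref_local: int   # references held by the owning thread
    ob_ref_shared: int  # shared refcount field (count plus flag bits)


def can_reuse(header: ToyObjectHeader, current_thread_id: int) -> bool:
    """Return True only if the calling thread holds the sole reference."""
    return (
        header.ob_tid == current_thread_id  # condition 1: we own the object
        and header.ob_ref_local == 1        # condition 2: exactly one local ref
        and header.ob_ref_shared == 0       # condition 3: nothing shared
    )
```

For example, `can_reuse(ToyObjectHeader(ob_tid=7, ob_ref_local=1, ob_ref_shared=0), 7)` is true, while the same header checked from thread 8 is rejected.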
@colesbury Thanks for the hint! I implemented the suggestion.
- I placed the logic in a single method _Py_Reuse_Immutable_Object so we can use the trick for several other cases (e.g. itertools.pairwise, enumerate). The location and name of the new method can be changed, though.
- I re-ordered the three conditions because I suspect that ob_ref_local is the most informative and the fastest to check.
- I am now checking ob_ref_shared to be zero, but perhaps we should check on Py_ARITHMETIC_RIGHT_SHIFT(Py_ssize_t, ob_ref_shared, _Py_REF_SHARED_SHIFT)?
Misc/NEWS.d/next/Core_and_Builtins/2024-08-23-21-20-34.gh-issue-123271.xeVViR.rst
Outdated
…e-123271.xeVViR.rst Co-authored-by: Sam Gross <colesbury@gmail.com>
colesbury left a comment
Thanks @eendebakpt. The check that ob_ref_shared is entirely zero is correct as you wrote it. We don't want non-zero flags in this case as that could allow another thread to concurrently incref the object in some circumstances.
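The difference between the two checks can be illustrated with a small Python model. It assumes that the low bits of ob_ref_shared hold flags and the remaining bits hold the count, with a shift of 2; that value and the helper names below are assumptions for illustration, not read from the CPython headers in this thread.

```python
# Assumed layout: count in the high bits, flags in the low REF_SHARED_SHIFT bits.
REF_SHARED_SHIFT = 2  # assumption, stands in for _Py_REF_SHARED_SHIFT
FLAG_MASK = (1 << REF_SHARED_SHIFT) - 1


def shared_count(ob_ref_shared: int) -> int:
    """Count part only; Python's >> on ints is an arithmetic shift."""
    return ob_ref_shared >> REF_SHARED_SHIFT


def shared_flags(ob_ref_shared: int) -> int:
    """Flag bits only."""
    return ob_ref_shared & FLAG_MASK


def strict_check(ob_ref_shared: int) -> bool:
    """The check kept in the PR: the whole field must be zero."""
    return ob_ref_shared == 0


def count_only_check(ob_ref_shared: int) -> bool:
    """The alternative that was asked about: ignore the flag bits."""
    return shared_count(ob_ref_shared) == 0


# A field with a zero count but a flag bit set passes the count-only
# check and fails the strict one.
flag_only = 0b01
print(count_only_check(flag_only), strict_check(flag_only))
```

A value with a flag set and a zero count is exactly the case the strict check is meant to reject.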
| _ = [t.start() for t in worker_threads] | ||
| _ = [t.join() for t in worker_threads] |
The preferred style is to use for loops instead of using list comprehensions for side effects.
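A minimal sketch of the suggested style, with a hypothetical helper `run_workers` wrapped around the two lines from the test:

```python
import threading


def run_workers(target, number_of_threads):
    """Start and join worker threads using plain for loops."""
    worker_threads = [
        threading.Thread(target=target) for _ in range(number_of_threads)
    ]
    # Plain loops make it clear that these calls are made for their side
    # effects; no throwaway list of None values is built.
    for t in worker_threads:
        t.start()
    for t in worker_threads:
        t.join()
    return worker_threads
```

The comprehension is still used to build the list of threads, since there the resulting list is what we want.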
| number_of_threads = 8 | ||
| number_of_iterations = 40 | ||
| n = 40_000 |
We run lots of unit tests in CI, so they individually need to be very fast. For the free-threaded tests, we should aim for <0.1 seconds on two cores (i.e., when run with taskset -c 0-1 on Linux)
On my system (Windows) the test takes 0.07 seconds on the free-threading build. I will reduce the number of threads and iterations a bit so it is even faster.
Co-authored-by: Sam Gross <colesbury@gmail.com>
Thanks for the fix @eendebakpt! There's maybe a dozen or so similar patterns in CPython and I searched for the regex
@colesbury Thanks for the guidance here! I was indeed planning on addressing some of the similar patterns. I'll open a new issue for this.
In this PR we make zip safe under free-threading.