gh-123271: Make builtin zip method safe under free-threading #123272
@@ -0,0 +1 @@
Make :meth:`zip` safe under free-threading.
Make :meth:`zip` safe under free-threading.
Make :func:`zip` thread-safe without the :term:`GIL`.
Even with this PR the zip function is not thread-safe: when iterating over zip(range(100), range(100)) one can obtain a tuple (2, 3). What this PR (hopefully) does is modify the code so that the interpreter does not crash. For this reason I would like to be careful with the term "thread-safe". I chose not to make zip fully thread-safe because of the performance impact; see the discussion in #120496.
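The distinction between "does not crash" and "thread-safe" can be illustrated with a small experiment. This is a minimal sketch, not the test from this PR: the helper name consume_shared_zip and its parameters are made up for illustration. Several threads pull from one shared zip object. On a free-threaded build the collected pairs may include mismatches such as (2, 3), while the interpreter itself should keep running.

```python
import threading


def consume_shared_zip(n_items=1000, n_threads=4):
    """Let several threads drain one shared zip object (hypothetical helper)."""
    shared = zip(range(n_items), range(n_items))
    buckets = [[] for _ in range(n_threads)]

    def worker(bucket):
        # Every thread advances the same iterator; the for loop ends
        # when the shared zip raises StopIteration.
        for pair in shared:
            bucket.append(pair)

    threads = [threading.Thread(target=worker, args=(b,)) for b in buckets]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [pair for bucket in buckets for pair in bucket]


pairs = consume_shared_zip()
# Pairs whose two elements differ show that zip is not fully thread-safe.
# Whether any appear depends on the build and on thread scheduling.
mismatched = [p for p in pairs if p[0] != p[1]]
print(len(pairs), len(mismatched))
```

The sketch only relies on every yielded item being a well-formed 2-tuple; it deliberately makes no claim about how many mismatched pairs show up.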
Thanks for clarifying. The word "safe" is a bit confusing; I guess a lot of other people would read it as "thread-safe". I think we need to clarify the reason why zip isn't safe in the free-threaded build.
Co-authored-by: Kirill Podoprigora <[email protected]>
Python/bltinmodule.c
Outdated
#ifdef Py_GIL_DISABLED
int reuse_tuple = 0;
#else
int reuse_tuple = Py_REFCNT(result) == 1;
#endif
if (reuse_tuple) {
I think we want to enable re-use under certain conditions. We should probably put the logic in some internal function, possibly in pycore_object.h. I think the condition for the free-threaded build is:
- ob_tid matches _Py_ThreadId()
- ob_ref_local is 1
- ob_ref_shared is 0
The logic is that no other thread calling zip_next can re-use lz->result because of the thread id condition (condition 1). And no thread outside of zip_next can incref that object because the combined conditions ensure that lz->result is the only reference.
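The three conditions can be sketched as a single predicate. The following is a plain Python model, not CPython code: ModelObject and can_reuse are hypothetical names, and the fields only mirror the ob_tid, ob_ref_local, and ob_ref_shared fields named above.

```python
from dataclasses import dataclass


@dataclass
class ModelObject:
    """Toy stand-in for an object header in the free-threaded build."""
    ob_tid: int         # id of the owning thread
    ob_ref_local: int   # references held by the owning thread
    ob_ref_shared: int  # shared reference field (count plus flag bits)


def can_reuse(obj: ModelObject, current_tid: int) -> bool:
    """Return True only if all three conditions from the comment hold."""
    return (
        obj.ob_tid == current_tid    # condition 1: owned by the calling thread
        and obj.ob_ref_local == 1    # condition 2: exactly one local reference
        and obj.ob_ref_shared == 0   # condition 3: shared field entirely zero
    )


print(can_reuse(ModelObject(ob_tid=7, ob_ref_local=1, ob_ref_shared=0), 7))  # True
print(can_reuse(ModelObject(ob_tid=7, ob_ref_local=1, ob_ref_shared=0), 8))  # False
print(can_reuse(ModelObject(ob_tid=7, ob_ref_local=2, ob_ref_shared=0), 7))  # False
```

If any one condition fails, the model falls back to "do not reuse", which corresponds to allocating a fresh result tuple.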
@colesbury Thanks for the hint! I implemented the suggestion:
- I placed the logic in a single function, _Py_Reuse_Immutable_Object, so we can use the same trick for several other cases (e.g. itertools.pairwise, enumerate). The location and name of the new function can be changed, though.
- I re-ordered the three conditions because I suspect that ob_ref_local is the most informative and the fastest to check.
- I am now checking ob_ref_shared to be zero, but perhaps we should check on Py_ARITHMETIC_RIGHT_SHIFT(Py_ssize_t, ob_ref_shared, _Py_REF_SHARED_SHIFT)?
Misc/NEWS.d/next/Core_and_Builtins/2024-08-23-21-20-34.gh-issue-123271.xeVViR.rst
Outdated
…e-123271.xeVViR.rst Co-authored-by: Sam Gross <[email protected]>
Thanks @eendebakpt. The check that ob_ref_shared is entirely zero is correct as you wrote it. We don't want non-zero flags in this case as that could allow another thread to concurrently incref the object in some circumstances.
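The difference between the two checks can be shown with plain integers. This is a sketch under an assumed layout: the low two bits of ob_ref_shared hold flags and the remaining bits hold the shared count, so the shift value of 2 and the helper names below are assumptions for illustration, not taken from this PR.

```python
# Assumed shift: flag bits occupy the low two bits of the shared field.
REF_SHARED_SHIFT = 2


def shared_is_entirely_zero(ob_ref_shared: int) -> bool:
    """The check used in the PR: both the count and the flags must be zero."""
    return ob_ref_shared == 0


def shared_count_is_zero(ob_ref_shared: int) -> bool:
    """The alternative check: ignores the flag bits.

    Python's >> on ints is an arithmetic shift, like Py_ARITHMETIC_RIGHT_SHIFT.
    """
    return (ob_ref_shared >> REF_SHARED_SHIFT) == 0


flags_only = 0b01  # shared count of 0, but one flag bit set

print(shared_is_entirely_zero(flags_only))  # False
print(shared_count_is_zero(flags_only))     # True
```

A value with only flag bits set passes the shifted check but fails the stricter one, which is the case the comment above wants to exclude.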
_ = [t.start() for t in worker_threads]
_ = [t.join() for t in worker_threads]
The preferred style is to use for loops instead of using list comprehensions for side effects.
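The same start/join pattern written with for loops might look like the sketch below. The work function and the results list are placeholders invented for this example; only the worker_threads name comes from the reviewed code.

```python
import threading


def work(results, index):
    # Placeholder task: each thread writes to its own slot.
    results[index] = index * index


results = [None] * 4
worker_threads = [
    threading.Thread(target=work, args=(results, i)) for i in range(4)
]

# Plain loops make it explicit that start() and join() are called for
# their side effects; no throwaway list is built.
for t in worker_threads:
    t.start()
for t in worker_threads:
    t.join()

print(results)  # [0, 1, 4, 9]
```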
number_of_threads = 8
number_of_iterations = 40
n = 40_000
We run lots of unit tests in CI, so they individually need to be very fast. For the free-threaded tests, we should aim for <0.1 seconds on two cores (i.e., when run with taskset -c 0-1 on Linux).
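One way to check a test against such a time budget is to measure its wall-clock time directly. The sketch below is an assumption-laden illustration, not the test from this PR: run_small_stress_test is a hypothetical helper, and the reduced values for number_of_threads, number_of_iterations, and n are chosen arbitrarily.

```python
import threading
import time


def run_small_stress_test(number_of_threads=4, number_of_iterations=10, n=1_000):
    """Drain a shared zip object from several threads, repeatedly."""

    def worker(it):
        for _ in it:
            pass

    for _ in range(number_of_iterations):
        it = zip(range(n), range(n))
        threads = [
            threading.Thread(target=worker, args=(it,))
            for _ in range(number_of_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()


start = time.perf_counter()
run_small_stress_test()
elapsed = time.perf_counter() - start
print(f"elapsed: {elapsed:.3f} s")
```

To approximate the two-core setting on Linux, the script can be launched under taskset -c 0-1; the measured time will vary by machine and build.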
On my system (Windows) the test takes 0.07 seconds on the free-threading build. I will reduce the number of threads and iterations a bit so it is even faster.
Co-authored-by: Sam Gross <[email protected]>
Thanks for the fix @eendebakpt! There's maybe a dozen or so similar patterns in CPython and I searched for the regex
@colesbury Thanks for the guidance here! I was indeed planning on addressing some of the similar patterns. I'll open a new issue for this.
In this PR we make zip safe under free-threading.