-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
gh-120608: Make reversed thread-safe #120609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Misc/NEWS.d/next/Core and Builtins/2024-06-16-22-42-47.gh-issue-120608._-YtWX.rst
Outdated
Show resolved
Hide resolved
@@ -98,6 +98,8 @@ extern "C" { | |||
# define Py_END_CRITICAL_SECTION() \ | |||
_PyCriticalSection_End(&_cs); \ | |||
} | |||
# define Py_EXIT_CRITICAL_SECTION() \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is better to define a local API instead of a global one? For example END_TYPE_LOCK()
does that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just extract the critical section into a helper function and then wrap the helper with the existing macros.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this PR, I'd just extract most of reversed_next
into a reversed_next_impl
and wrap that with the Py_{BEGIN,END}_CRITICAL_SECTION
macros. See also #120318.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW we ended up not doing that in #120318. There's this comment though that explains the approach and gives an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback everyone! I added the Py_EXIT_CRITICAL_SECTION
in this PR (and some others) to deal with return and goto statements. I am not in favor of adding a local API, since it will leak details of the locking implementation to modules. Changing the Py_EXIT_CRITICAL_SECTION
to a more private _Py_EXIT_CRITICAL_SECTION
(or something like that) would be fine with me.
In this case we can indeed refactor to create a reversed_next_impl
and put a global lock around it. I am not too worried about performance here, and if it turns out performance is an issue we can change later.
Unless more suggestions come it, I will refactor to reversed_next_impl
in the coming days.
…e-120608._-YtWX.rst Co-authored-by: Nikita Sobolev <[email protected]>
Do you have a benchmark for the single thread case? |
@@ -431,16 +432,20 @@ reversed_next(reversedobject *ro) | |||
PyObject *item; | |||
Py_ssize_t index = ro->index; | |||
|
|||
Py_BEGIN_CRITICAL_SECTION(ro); | |||
if (index >= 0) { | |||
item = PySequence_GetItem(ro->seq, index); | |||
if (item != NULL) { | |||
ro->index--; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't declare the critical section for the list_next
Lines 3890 to 3911 in dacc5ac
listiter_next(PyObject *self) | |
{ | |
_PyListIterObject *it = (_PyListIterObject *)self; | |
Py_ssize_t index = FT_ATOMIC_LOAD_SSIZE_RELAXED(it->it_index); | |
if (index < 0) { | |
return NULL; | |
} | |
PyObject *item = list_get_item_ref(it->it_seq, index); | |
if (item == NULL) { | |
// out-of-bounds | |
FT_ATOMIC_STORE_SSIZE_RELAXED(it->it_index, -1); | |
#ifndef Py_GIL_DISABLED | |
PyListObject *seq = it->it_seq; | |
it->it_seq = NULL; | |
Py_DECREF(seq); | |
#endif | |
return NULL; | |
} | |
FT_ATOMIC_STORE_SSIZE_RELAXED(it->it_index, index + 1); | |
return item; | |
} |
Why do we have to declare the critical section here?
And what about just handling as the out-of-bound,
and if the ro->index needs to be decremented, why not just use atomic operation in here?
cc @colesbury
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check index >= 0
and decrement ro->index--
need to happen atomically. We could use the atomic _Py_atomic_compare_exchange
for this, but it would increase code complexity. If performance is a concern, I can update the PR with this approach. The PySequence_GetItem
is not inside the critical section, but I assumed it to be (or to be made later) thread safe
@corona10 Here is a benchmark for the PR using a critical section. Based on comments on the issue #120496 I believe that using atomics might be a better option, I will look into that a bit later. Benchmark script:
Result
The performance for |
The
reversed
builtin is not thread-safe. Inreversed_next
the objectindex
can be read and decremented by multiple threads leading to incorrect results. The part that needs to happen atomically is the check(index >= 0)
and the decrementro->index--;
.A reproducing example will be included in the tests later.
_Py_atomic_compare_exchange
on thero->index
(exchanging withro->index--
). This would make the code more complex and I am not sure lock contention is a performance issue here.Py_EXIT_CRITICAL_SECTION
is the same as proposed in gh-120496: Make enum_iter thread safe #120591