gh-110206: Fix multiprocessing test_notify_all #130933
The test could deadlock trying to join the worker processes due to a combination of behaviors:

* The use of `assertReachesEventually` did not ensure that the workers actually called `woken.release()`, because the SyncManager's Semaphore does not implement `get_value`.
* This meant that the test could finish and the variable `sleeping` would go out of scope and be collected. This unregisters the proxy, leading to failures in the worker or possibly the manager.
* The subsequent call to `p.join()` during cleanup therefore never finished.

This takes two approaches to fix this:

1) Use `woken.acquire()` to ensure that the workers actually finish calling `woken.release()`.
2) Wait until the workers finish during the test, while `cond`, `sleeping`, and `woken` are still valid.
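The ordering behind the two fixes can be sketched as follows. This is a minimal illustration, not the actual test code: it assumes plain `threading` primitives in place of the SyncManager proxies and worker processes, and the names `worker` and `run_notify_all` are hypothetical. It only shows the sequence "acquire `woken` once per worker, then join the workers before the synchronization objects go out of scope".

```python
import threading


def worker(cond, sleeping, woken):
    # Hypothetical worker: signal that it is about to sleep, wait for the
    # notification, then signal that it has woken up.
    with cond:
        sleeping.release()
        cond.wait()
    woken.release()


def run_notify_all(n_workers=3, timeout=10):
    # Assumption: threads stand in for the worker processes of the real test.
    cond = threading.Condition()
    sleeping = threading.Semaphore(0)
    woken = threading.Semaphore(0)

    workers = [
        threading.Thread(target=worker, args=(cond, sleeping, woken))
        for _ in range(n_workers)
    ]
    for t in workers:
        t.start()

    # Wait until every worker has announced that it is going to sleep.
    for _ in workers:
        assert sleeping.acquire(timeout=timeout)

    # Taking the lock here succeeds only once all workers are inside wait().
    with cond:
        cond.notify_all()

    # Fix 1: block on woken.acquire() once per worker instead of polling a
    # counter, so every worker is known to have called woken.release().
    released = sum(woken.acquire(timeout=timeout) for _ in workers)

    # Fix 2: join the workers here, while cond, sleeping, and woken are
    # still referenced, rather than leaving the join to later cleanup.
    for t in workers:
        t.join(timeout)

    return released, [t.is_alive() for t in workers]
```

Calling `run_notify_all(3)` returns the number of successful `woken.acquire()` calls and the liveness of each worker after the join. With manager proxies the second step matters more than it does here, because dropping the last reference to a proxy unregisters it while workers may still be using it.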
🤖 New build scheduled with the buildbot fleet by @colesbury for commit c06d0f1 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F130933%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.
🤖 New build scheduled with the buildbot fleet by @gpshead for commit 19c049d 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F130933%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.
LGTM
nogil refleak buildbot failures will be fixed by #130901
Thanks @colesbury for the PR 🌮🎉. I'm working now to backport this PR to: 3.12, 3.13.
The test could deadlock trying to join the worker processes due to a combination of behaviors:

* The use of `assertReachesEventually` did not ensure that the workers actually called `woken.release()`, because the SyncManager's Semaphore does not implement `get_value`.
* This meant that the test could finish and the variable `sleeping` would go out of scope and be collected. This unregisters the proxy, leading to failures in the worker or possibly the manager.
* The subsequent call to `p.join()` during cleanup therefore never finished.

This takes two approaches to fix this:

1) Use `woken.acquire()` to ensure that the workers actually finish calling `woken.release()`.
2) At the end of the test, wait until the workers are finished, while `cond`, `sleeping`, and `woken` are still valid.

(cherry picked from commit c476410)
Co-authored-by: Sam Gross <[email protected]>
GH-130950 is a backport of this pull request to the 3.13 branch.
GH-130951 is a backport of this pull request to the 3.12 branch.
The test could deadlock trying to join the worker processes. Apply the same technique as gh-130933: join the process before the test ends in `test_notify` as well.
The test could deadlock trying to join the worker processes. Apply the same technique as gh-130933: join the process before the test ends in `test_notify` as well.

(cherry picked from commit edd1eca)
Co-authored-by: Sam Gross <[email protected]>