-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
gh-109593: Fix reentrancy issue in multiprocessing resource_tracker #109629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
|
@@ -51,15 +51,31 @@ | |||||||||
}) | ||||||||||
|
||||||||||
|
||||||||||
class ReentrantCallError(RuntimeError): | ||||||||||
pass | ||||||||||
|
||||||||||
|
||||||||||
class ResourceTracker(object): | ||||||||||
|
||||||||||
def __init__(self): | ||||||||||
self._lock = threading.Lock() | ||||||||||
self._lock = threading.RLock() | ||||||||||
self._fd = None | ||||||||||
self._pid = None | ||||||||||
|
||||||||||
def _reentrant_call_error(self): | ||||||||||
# gh-109629: this happens if an explicit call to the ResourceTracker | ||||||||||
# gets interrupted by a garbage collection, invoking a finalizer (*) | ||||||||||
# that itself calls back into ResourceTracker. | ||||||||||
# (*) for example the SemLock finalizer | ||||||||||
raise ReentrantCallError( | ||||||||||
"Reentrant call into the multiprocessing resource tracker") | ||||||||||
|
||||||||||
def _stop(self): | ||||||||||
with self._lock: | ||||||||||
# This should not happen (_stop() isn't called by a finalizer) | ||||||||||
# but we check for it anyway. | ||||||||||
if self._lock._recursion_count() > 1: | ||||||||||
return self._reentrant_call_error() | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would it be possible to create the exception in the function and use raise here to make it more explicit that this line raises an exception?
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, it is. |
||||||||||
if self._fd is None: | ||||||||||
# not running | ||||||||||
return | ||||||||||
|
@@ -81,6 +97,9 @@ def ensure_running(self): | |||||||||
This can be run from any process. Usually a child process will use | ||||||||||
the resource created by its parent.''' | ||||||||||
with self._lock: | ||||||||||
if self._lock._recursion_count() > 1: | ||||||||||
# The code below is certainly not reentrant-safe, so bail out | ||||||||||
return self._reentrant_call_error() | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would prefer:
Suggested change
or:
Suggested change
Since the |
||||||||||
if self._fd is not None: | ||||||||||
# resource tracker was launched before, is it still running? | ||||||||||
if self._check_alive(): | ||||||||||
|
@@ -159,7 +178,17 @@ def unregister(self, name, rtype): | |||||||||
self._send('UNREGISTER', name, rtype) | ||||||||||
|
||||||||||
def _send(self, cmd, name, rtype): | ||||||||||
self.ensure_running() | ||||||||||
try: | ||||||||||
self.ensure_running() | ||||||||||
except ReentrantCallError: | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't get this code path. So it's possible that self._fd is None, since something is calling ensure_running(), no? But the code below doesn't check if self._fd is None or not. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is true, but |
||||||||||
# The code below might or might not work, depending on whether | ||||||||||
# the resource tracker was already running and still alive. | ||||||||||
# Better warn the user. | ||||||||||
# (XXX is warnings.warn itself reentrant-safe? :-) | ||||||||||
warnings.warn( | ||||||||||
f"ResourceTracker called reentrantly for resource cleanup, " | ||||||||||
f"which is unsupported. " | ||||||||||
f"The {rtype} object {name!r} might leak.") | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @vstinner This might raise a warning from time to time during the test suite, will it fail the buildbots? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hum. If you run tests with -Werror, it will raises an UserWarning exception, and can make the tests fail, no? |
||||||||||
msg = '{0}:{1}:{2}\n'.format(cmd, name, rtype).encode('ascii') | ||||||||||
if len(msg) > 512: | ||||||||||
# posix guarantees that writes to a pipe of less than PIPE_BUF | ||||||||||
|
@@ -176,6 +205,7 @@ def _send(self, cmd, name, rtype): | |||||||||
unregister = _resource_tracker.unregister | ||||||||||
getfd = _resource_tracker.getfd | ||||||||||
|
||||||||||
|
||||||||||
def main(fd): | ||||||||||
'''Run resource tracker.''' | ||||||||||
# protect the process from ^C and "killall python" etc | ||||||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Avoid deadlocking on a reentrant call to the multiprocessing resource tracker. Such a reentrant call, though unlikely, can happen if a GC pass invokes the finalizer for a multiprocessing object such as SemLock. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you prefer to raise an exception (I'm fine with it), do we still need a reentrant lock?
Would it be possible to use a regular lock, try to acquire it, raise if it's a reentrant lock, or use the lock and release it afterwards?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you "raise if it's a reentrant lock" with a regular lock? If it fails locking, you don't know if it's because of a reentrant call, or simply another thread.