bpo-30891: Fix importlib _find_and_load() race condition #2646
message = ('import of {} halted; '
           'None in sys.modules'.format(name))
raise ModuleNotFoundError(message, name=name)
My patch changes how the global import lock is handled in _find_and_load(). Before my change, it was held for the whole function (for simplicity). With my change, it is now acquired and released twice when we take the _lock_unlock_module() path. IMHO this isn't an issue; I prefer finer-grained locking.
Even with your improvements to the lock handling, this still looks a bit race-prone to me, since we have the classic "query before use" pattern of:
    if name not in sys.modules:
        ...
    module = sys.modules[name]
That is, just because the module was there when we checked "if name not in sys.modules" doesn't mean it's still going to be there when we run "module = sys.modules[name]".
Previously, holding _imp.acquire_lock() for the whole function would at least protect this from other _find_and_load() calls, but as far as I can see it's never been protected from other threads doing "del sys.modules[name]" without holding the relevant module lock.
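The window described in this comment can be reproduced deterministically. The sketch below is illustrative only: DeletingDict, check_then_use and lookup_once are hypothetical names, and the dict subclass merely simulates another thread deleting the entry right after the membership check. It is not how importlib is implemented.

```python
class DeletingDict(dict):
    """Simulates another thread running `del d[key]` right after a check."""

    def __contains__(self, key):
        found = super().__contains__(key)
        self.pop(key, None)  # the "other thread" removes the entry
        return found


def check_then_use(modules, name):
    # Two separate dict operations: the entry can vanish in between.
    if name not in modules:
        return None
    return modules[name]


def lookup_once(modules, name):
    # One dict operation: either we get the entry or we get KeyError.
    try:
        return True, modules[name]
    except KeyError:
        return False, None


racy = DeletingDict(spam="module object")
try:
    check_then_use(racy, "spam")
    outcome = "ok"
except KeyError:
    outcome = "KeyError"
```

Here check_then_use() passes the membership check and then fails on the subscript, while lookup_once() has no gap between the two steps. Neither variant stops another thread from deleting the entry afterwards; that still requires every writer to hold the relevant lock.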
Oh, you are right: I proposed PR #2665.
I confirm with my Windows VM that "./python -m test -R 3:100 -m test_concurrency test_import" doesn't fail anymore with this change.
LGTM except one outdated sentence.
Lib/importlib/_bootstrap.py
Outdated
 def _lock_unlock_module(name):
-    """Release the global import lock, and acquires then release the
-    module lock for a given module name.
+    """Acquires then release the module lock for a given module name.
     This is used to ensure a module is completely initialized, in the
     event it is being imported by another thread.

     Should only be called with the import lock taken."""
No longer true.
Oh sorry, I misunderstood your comment. It should now be fixed.
Lib/importlib/_bootstrap.py
Outdated
with _ModuleLockManager(name):
    try:
        module = sys.modules[name]
    except KeyError:
KeyError is raised in the common case (_find_and_load is called from accelerated C code when name is not in sys.modules). Catching exceptions in Python code is slow. If you want to avoid KeyError in the following code, it would be better to use sys.modules.get(name, sentinel).
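A dedicated sentinel is needed rather than the default of None because None is itself a meaningful sys.modules value (see the "import of {} halted; None in sys.modules" message in the diff). A minimal sketch of the suggested lookup, assuming a hypothetical sentinel name _SENTINEL and a hypothetical helper classify() that are not taken from the patch:

```python
import sys

_SENTINEL = object()  # hypothetical name; any unique object works


def classify(name, modules):
    # Single lookup, no exception raised when the entry is missing.
    module = modules.get(name, _SENTINEL)
    if module is _SENTINEL:
        return "absent"   # never imported, or removed: needs loading
    if module is None:
        return "blocked"  # None entry: the import is explicitly halted
    return "loaded"
```

For example, classify("sys", sys.modules) reports "loaded", while a dict mapping a name to None reports "blocked" instead of being confused with a missing entry.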
Ok ok, I reverted this unrelated change.
* Rewrite importlib _get_module_lock(): it is now responsible to hold the imp lock directly.
* _find_and_load() now holds the module lock to check if name is in sys.modules to prevent a race condition
(cherry picked from commit 4f9a446)