Skip to content

bpo-30891: Fix importlib _find_and_load() race condition #2646

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 10, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
58 changes: 30 additions & 28 deletions Lib/importlib/_bootstrap.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ def _new_module(name):
# Module-level locking ########################################################

# A dict mapping module names to weakrefs of _ModuleLock instances
# Dictionary protected by the global import lock
_module_locks = {}
# A dict mapping thread ids to _ModuleLock instances
_blocking_on = {}
Expand Down Expand Up @@ -144,10 +145,7 @@ def __init__(self, name):
self._lock = None

def __enter__(self):
try:
self._lock = _get_module_lock(self._name)
finally:
_imp.release_lock()
self._lock = _get_module_lock(self._name)
self._lock.acquire()

def __exit__(self, *args, **kwargs):
Expand All @@ -159,31 +157,37 @@ def __exit__(self, *args, **kwargs):
def _get_module_lock(name):
"""Get or create the module lock for a given module name.

Should only be called with the import lock taken."""
lock = None
Acquire/release internally the global import lock to protect
_module_locks."""

_imp.acquire_lock()
try:
lock = _module_locks[name]()
except KeyError:
pass
if lock is None:
if _thread is None:
lock = _DummyModuleLock(name)
else:
lock = _ModuleLock(name)
def cb(_):
del _module_locks[name]
_module_locks[name] = _weakref.ref(lock, cb)
try:
lock = _module_locks[name]()
except KeyError:
lock = None

if lock is None:
if _thread is None:
lock = _DummyModuleLock(name)
else:
lock = _ModuleLock(name)
def cb(_):
del _module_locks[name]
_module_locks[name] = _weakref.ref(lock, cb)
finally:
_imp.release_lock()

return lock


def _lock_unlock_module(name):
"""Release the global import lock, and acquires then release the
module lock for a given module name.
"""Acquires then releases the module lock for a given module name.

This is used to ensure a module is completely initialized, in the
event it is being imported by another thread.

Should only be called with the import lock taken."""
"""
lock = _get_module_lock(name)
_imp.release_lock()
try:
lock.acquire()
except _DeadlockError:
Expand Down Expand Up @@ -587,7 +591,6 @@ def _module_repr_from_spec(spec):
def _exec(spec, module):
"""Execute the spec's specified module in an existing module's namespace."""
name = spec.name
_imp.acquire_lock()
with _ModuleLockManager(name):
if sys.modules.get(name) is not module:
msg = 'module {!r} not in sys.modules'.format(name)
Expand Down Expand Up @@ -670,7 +673,6 @@ def _load(spec):
clobbered.

"""
_imp.acquire_lock()
with _ModuleLockManager(spec.name):
return _load_unlocked(spec)

Expand Down Expand Up @@ -957,16 +959,16 @@ def _find_and_load_unlocked(name, import_):

def _find_and_load(name, import_):
"""Find and load the module."""
_imp.acquire_lock()
if name not in sys.modules:
with _ModuleLockManager(name):
with _ModuleLockManager(name):
if name not in sys.modules:
return _find_and_load_unlocked(name, import_)

module = sys.modules[name]
if module is None:
_imp.release_lock()
message = ('import of {} halted; '
'None in sys.modules'.format(name))
raise ModuleNotFoundError(message, name=name)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My patch changes how the global import lock is handled in _find_and_load(). Before my change, it was held for the whole function (to simplify). With my change, it is now acquired/released twice when we take the _lock_unlock_module() path. IMHO it isn't an issue, I prefer finer grain lock.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with your improvements to the lock handling, this still looks a bit race-prone to me, since we have the classic "query before use" pattern of:

if name not in sys.modules:
    ...
module = sys.modules[name]

That is, just because the module was there when we checked if name not in sys.modules doesn't mean it's still going to be there when we run module = sys.modules[name].

Previously, holding _imp.acquire_lock() for the whole function would at least protect this from other _find_and_load() calls, but as far as I can see it's never been protected from other threads doing del sys.modules[name] without holding the relevant module lock.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you are right: I proposed PR #2665.

_lock_unlock_module(name)
return module

Expand Down
4 changes: 0 additions & 4 deletions Python/import.c
Original file line number Diff line number Diff line change
Expand Up @@ -1559,10 +1559,6 @@ PyImport_ImportModuleLevelObject(PyObject *name, PyObject *globals,
if (initializing == -1)
PyErr_Clear();
if (initializing > 0) {
#ifdef WITH_THREAD
_PyImport_AcquireLock();
#endif
/* _bootstrap._lock_unlock_module() releases the import lock */
value = _PyObject_CallMethodIdObjArgs(interp->importlib,
&PyId__lock_unlock_module, abs_name,
NULL);
Expand Down
Loading