C API: Design an API to disallow creating more than one extension instance per Python process #111088

vstinner · 2023-10-19T23:10:09Z

Somewhere in PyModuleDef, we should add a way to disallow globally the creation of more than one extension instance.

Or maybe the restriction should be that an extension cannot be loaded in two different interpreters. Honestly, I'm not sure.

@ericsnowcurrently no longer wants to treat the main interpreter differently: see issue gh-109857.

Previously, I proposed to directly expose PyInterpreterState_IsMain() to the limited C API, but the idea got rejected.

The syslog module behaves differently if it's used in the main interpreter:

static inline int
is_main_interpreter(void)
{
    return (PyInterpreterState_Get() == PyInterpreterState_Main());
}

...
    if (!is_main_interpreter()) {
        PyErr_SetString(PyExc_RuntimeError, "subinterpreter can't use syslog.openlog()");
        return NULL;
    }

The syslog module stores a Python object in static PyObject *S_ident_o = NULL;. It's a way to keep a string passed to C openlog() function alive until closelog() is called. The problem is to decide which extension instance and/or which Python interpreter is responsible for that memory: the memory must be kept alive until closelog() is called. But what if each Python interpreter call openlog()? Calling openlog() more than once overrides the previous call. But if closelog() is called, if multiple interpreters called openlog(), you might think that syslog is still usable, whereas closelog() has been called.

See issue gh-99127 and PR gh-99128 which added is_main_interpreter() function to syslog.

Cython already implements a similar, but different check: it raises "this module can only be loaded into one interpreter per process" error message.

Cython uses a different logic to handle PEP 489 "Multi-phase extension module initialization". It uses a static PY_INT64_T main_interpreter_id = -1; variable which is initialized at the first call of __Pyx_check_single_interpreter().

//#if CYTHON_PEP489_MULTI_PHASE_INIT
static CYTHON_SMALL_CODE int __Pyx_check_single_interpreter(void) {
    #if PY_VERSION_HEX >= 0x030700A1
    static PY_INT64_T main_interpreter_id = -1;
    PY_INT64_T current_id = PyInterpreterState_GetID(PyThreadState_Get()->interp);
    if (main_interpreter_id == -1) {
        main_interpreter_id = current_id;
        return (unlikely(current_id == -1)) ? -1 : 0;
    } else if (unlikely(main_interpreter_id != current_id))

    #else
    static PyInterpreterState *main_interpreter = NULL;
    PyInterpreterState *current_interpreter = PyThreadState_Get()->interp;
    if (!main_interpreter) {
        main_interpreter = current_interpreter;
    } else if (unlikely(main_interpreter != current_interpreter))
    #endif

    {
        PyErr_SetString(
            PyExc_ImportError,
            "Interpreter change detected - this module can only be loaded into one interpreter per process.");
        return -1;
    }
    return 0;
}

The text was updated successfully, but these errors were encountered:

vstinner · 2023-10-19T23:15:54Z

Calling some C functions have a process-wide effect (ex: not limited to the current thread), and some also expect that the caller keeps the memory alive.

In the past, os.putenv() was implemented with C putenv() which expects the caller to keep the memory alive. Python had to store the string in an internal list (posix_putenv_garbage) until Python exit. Python started to crash when the os module was converted to multi-phase init API (PEP 489).

See:

issue Implement os.putenv() with setenv() if available #83587
PR bpo-39406: Implement os.putenv() with setenv() if available #18128: Implement os.putenv() with setenv() if available
PR bpo-39406: os.putenv() avoids putenv_dict on Windows #18126: os.putenv() avoids putenv_dict on Windows

encukou · 2023-10-23T09:53:52Z

The mechanism is currently documented as a recipe in the howto: https://docs.python.org/3/howto/isolating-extensions.html#opt-out-limiting-to-one-module-object-per-process

Or maybe the restriction should be that an extension cannot be loaded in two different interpreters. Honestly, I'm not sure.

IMO, this depends on the extension.
The recipe is relatively easy to extend in this way -- reset the flag when freeing the module. (That's the power of recipes -- they spell out what's happening, and you can adjust them.)

The downside of this recipe is that it will need locking under nogil. (But it seems to me that it's not the only thing to need that, and so nogil should add a general mechanism for process-global locking.)

ericsnowcurrently · 2023-10-23T15:32:23Z

nogil should add a general mechanism for process-global locking

FWIW, this could be helpful for some extensions even before nogil. See the previous DPO discussion (and another related DPO discussion).

vstinner added the topic-C-API label Oct 19, 2023

vstinner mentioned this issue Oct 19, 2023

[C API] Convert a few stdlib extensions to the limited C API (PEP 384) #85283

Closed

vstinner closed this as completed Dec 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

C API: Design an API to disallow creating more than one extension instance per Python process #111088

C API: Design an API to disallow creating more than one extension instance per Python process #111088

vstinner commented Oct 19, 2023

vstinner commented Oct 19, 2023

Uh oh!

encukou commented Oct 23, 2023

Uh oh!

ericsnowcurrently commented Oct 23, 2023

Uh oh!

Uh oh!

C API: Design an API to disallow creating more than one extension instance per Python process #111088

C API: Design an API to disallow creating more than one extension instance per Python process #111088

Comments

vstinner commented Oct 19, 2023

vstinner commented Oct 19, 2023

Uh oh!

encukou commented Oct 23, 2023

Uh oh!

ericsnowcurrently commented Oct 23, 2023

Uh oh!