gh-80406: Finalise subinterpreters in Py_FinalizeEx() #17575

Open
Wants to merge 21 commits into base: main
Changes from 5 commits

Commits (21)
23af5f5
Add test suggested by ncoghlan
LewisGaul Nov 21, 2019
433663c
Finalise sub-interpreters in Py_FinalizeEx()
LewisGaul Dec 11, 2019
48e1cfc
Improve test name
LewisGaul Dec 13, 2019
0400634
Switch back to main threadstate in test_audit_subinterpreter before c…
LewisGaul Dec 13, 2019
b79649c
📜🤖 Added by blurb_it.
blurb-it[bot] Dec 14, 2019
8b1e7d9
Markups including: switch from 'finalizing' flag to 'allow_new', add …
LewisGaul Jan 21, 2020
fd6073a
Merge branch 'finalise-subinterps' of github.com:LewisGaul/cpython in…
LewisGaul Jan 21, 2020
4bbd58f
Merge branch 'master' into finalise-subinterps
LewisGaul Oct 20, 2020
1095e66
Use '_' for unused variable in test_embed.py
LewisGaul Oct 20, 2020
675285d
Fix struct position of 'allow_new' flag
LewisGaul Oct 22, 2020
8e21788
Add handling for unsupported case of calling Py_Finalize() from a sub…
LewisGaul Oct 22, 2020
606c068
Emit resource warning when calling Py_Finalize() with unfinalized sub…
LewisGaul Oct 22, 2020
e0789b0
Update Py_FinalizeEx() docs
LewisGaul Oct 22, 2020
dda99ce
Update test for resource warning when implicitly finalizing subinterp…
LewisGaul Oct 23, 2020
847e8d2
Tidy up test_finalize_subinterps() testcase
LewisGaul Oct 23, 2020
a2fb0fc
Add testcase for calling Py_Finalize() from a subinterpreter
LewisGaul Oct 23, 2020
d234528
Tweak subinterpreters still running ResourceWarning handling
LewisGaul Nov 23, 2020
46a8619
Make calling PyFinalizeEx() from a subinterpreter a Py_FatalError
LewisGaul Nov 23, 2020
c89c0e5
Acquire interpreters mutex before setting allow_new=0 in PyFinalizeEx()
LewisGaul Nov 23, 2020
c285f52
Merge remote-tracking branch 'upstream/master' into finalise-subinterps
LewisGaul Nov 23, 2020
95cbfd4
Add back in the 'interp' variable to PyFinalizeEx() to fix the build
LewisGaul Nov 23, 2020
1 change: 1 addition & 0 deletions Include/internal/pycore_pystate.h
@@ -226,6 +226,7 @@ typedef struct pyruntimestate {
           If that becomes a problem later then we can adjust, e.g. by
           using a Python int. */
        int64_t next_id;
        int finalizing;
    } interpreters;
    // XXX Remove this field once we have a tp_* slot.
    struct _xidregistry {
@@ -0,0 +1 @@
:func:`Py_FinalizeEx()` now implicitly cleans up subinterpreters, as the C API documentation suggests.
46 changes: 46 additions & 0 deletions Programs/_testembed.c
@@ -85,6 +85,47 @@ static int test_repeated_init_and_subinterpreters(void)
    return 0;
}

/* bpo-36225: Implicitly tear down subinterpreters with Py_Finalize() */
static int test_finalize_subinterps(void)
{
    PyThreadState *mainstate;
    PyGILState_STATE gilstate;
    int i, j;

    for (i=0; i<15; i++) {
        printf("--- Pass %d ---\n", i);
        _testembed_Py_Initialize();
        mainstate = PyThreadState_Get();

        PyEval_ReleaseThread(mainstate);

        gilstate = PyGILState_Ensure();
        print_subinterp();
        PyThreadState_Swap(NULL);

        for (j=0; j<2; j++) {
            Py_NewInterpreter();
            print_subinterp();
        }

        PyThreadState_Swap(mainstate);
        print_subinterp();

        for (j=0; j<2; j++) {
            Py_NewInterpreter();
            print_subinterp();
        }

        PyThreadState_Swap(mainstate);
        print_subinterp();
        PyGILState_Release(gilstate);

        PyEval_RestoreThread(mainstate);
        Py_Finalize();

Member

This certainly helps verify that finalization still works. There should probably also be something verifying that the subinterpreters were properly cleaned up at the beginning of finalization. (...perhaps with some artifact generated when each sub-interp is finalized.)

Also, what about the cases where:

  • the subinterpreter has multiple threads still running?
    • what about daemon threads? (yeah, it's mean of me to ask 😉)
  • the subinterpreter's tstate_head is still running?
  • someone calls Py_NewInterpreter() while interpreters are being cleaned up?
  • someone calls Py_NewInterpreter() while finalization is otherwise still running?
  • someone calls Py_NewInterpreter() after finalization is finished?

Contributor

Perhaps registering an "atexit" handler in each subinterpreter that prints something, and then confirming in the Python test case code that all the subinterpreter exit messages appear before the main interpreter's exit message?

Contributor Author

This all sounds worth checking in a test, but I'm unclear how to implement it. Any advice would be appreciated, specifically:

  • Should all of the test logic be in _testembed.c, or should that just be performing the C API calls with most of the test logic being in test_embed.py?
  • I'm only aware of how to register an atexit handler from Python code.
  • How should the interaction between test_embed.py and _testembed.c work?

With a better understanding of the above I may be able to have a go at covering these points in the tests, although it'll likely take quite a lot of thought given I'm pretty new to the C API! Any further guidance is very welcome, and I can hopefully get this finished off without such a delay this time.

Member

> Should all of the test logic be in _testembed.c, or should that just be performing the C API calls with most of the test logic being in test_embed.py?

Put as much logic as you can in the Python code. _testembed.c should mostly contain only what can't be done from Python (with exceptions where practicality dictates more).

> I'm only aware of how to register an atexit handler from Python code.

You can call Python code from C if needed. Import the atexit module, get the appropriate function, and call it, all using the C-API. We do the same thing in various places, like Python/import.c. For me (not a C expert) searching the code base has always been the easiest way to see how to do something. 😄

(That assumes there isn't a C-API for atexit handlers.)

> How should the interaction between test_embed.py and _testembed.c work?

I'll need more context.

Member

This might be a good reason to pair up on a video call. Then we could walk through this stuff a bit more efficiently. What do you think?

Contributor

At least some of the embedding tests already use PyRun_SimpleString() to run Python code inside the created interpreter, and that's also what I had in mind for the suggested atexit test case above.
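
One way the atexit idea could be wired up is to build a small Python snippet per interpreter and hand it to PyRun_SimpleString(). Below is a minimal sketch of the snippet-building half only, written in plain C so it compiles without Python.h. The helper name `format_exit_marker` and the message wording are hypothetical and not part of this PR, and the label is assumed to contain no quotes or backslashes.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of the PR): write Python source that
   registers an atexit handler printing "<label> exiting".
   Returns 0 on success, -1 if the source does not fit in buf. */
int
format_exit_marker(char *buf, size_t size, const char *label)
{
    int n = snprintf(buf, size,
                     "import atexit\n"
                     "atexit.register(print, '%s exiting', flush=True)\n",
                     label);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

In _testembed.c the resulting buffer would be passed to PyRun_SimpleString() once in the main interpreter and once after each Py_NewInterpreter() call. test_embed.py could then check that every subinterpreter line appears before the main interpreter's line in the captured output.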

Contributor Author

It looks like there are a lot of considerations and things to check here, which Victor's message at bpo-36225#msg371571 also fleshes out. Does everything here need addressing in this one PR, or can some of these points be split into separate issues to follow this fix? This feels like rather a lot to tackle in one go. Which of the cases you listed would you suggest starting with, @ericsnowcurrently (perhaps the simplest to test)?

    }
    return 0;
}

/*****************************************************
* Test forcing a particular IO encoding
*****************************************************/
@@ -1181,10 +1222,14 @@ static int test_audit_subinterpreter(void)
    PySys_AddAuditHook(_audit_subinterpreter_hook, NULL);
    _testembed_Py_Initialize();

    PyThreadState *mainstate = PyThreadState_Get();

Member

You may want to double-check with @zooba on his intention here. It's pretty important to make sure that the auditing functionality works as expected.

Contributor Author

Hi @zooba, as a consequence of my changes here, test_audit_subinterpreter() in _testembed.c started failing.

The change I'm making is to make Py_Finalize() implicitly clean up subinterpreters.

In the test, multiple subinterpreters are created, and then Py_Finalize() is called from the last-created subinterpreter. It seems there's currently an issue with calling Py_Finalize() from a subinterpreter (see bpo-37776), which caused this test to fail when getting Py_Finalize() to clean up subinterpreters.

The test passes if Py_Finalize() is instead called from the main interpreter's tstate, which is the change I've made to the test. I just want to check whether that takes anything away from what this test is intentionally checking.

Contributor Author (@LewisGaul, Oct 20, 2020)

See #19063 from @vstinner, which was not merged but also proposed changing the logic of this testcase. It seems this testcase does something that does not currently work and, according to bpo-38865#msg357331, may not be supported in general?

Contributor Author

I also confirmed that this test still fails on my branch without this change.


    Py_NewInterpreter();
    Py_NewInterpreter();
    Py_NewInterpreter();

    // Currently unable to call Py_Finalize from subinterpreter thread, see bpo-37776.
    PyThreadState_Swap(mainstate);
    Py_Finalize();

    switch (_audit_subinterpreter_interpreter_count) {
@@ -1603,6 +1648,7 @@ struct TestCase
static struct TestCase TestCases[] = {
    {"test_forced_io_encoding", test_forced_io_encoding},
    {"test_repeated_init_and_subinterpreters", test_repeated_init_and_subinterpreters},
    {"test_finalize_subinterps", test_finalize_subinterps},
    {"test_pre_initialization_api", test_pre_initialization_api},
    {"test_pre_initialization_sys_options", test_pre_initialization_sys_options},
    {"test_bpo20891", test_bpo20891},
18 changes: 18 additions & 0 deletions Python/pylifecycle.c
@@ -1257,6 +1257,20 @@ Py_FinalizeEx(void)
    PyThreadState *tstate = _PyRuntimeState_GetThreadState(runtime);
    PyInterpreterState *interp = tstate->interp;

    // Finalize sub-interpreters.
    runtime->interpreters.finalizing = 1;
    PyInterpreterState *subinterp = PyInterpreterState_Head();
    PyInterpreterState *next_interp;
    while (subinterp != NULL) {
        next_interp = PyInterpreterState_Next(subinterp);
        if (subinterp != PyInterpreterState_Main()) {
            PyThreadState_Swap(subinterp->tstate_head);
            Py_EndInterpreter(subinterp->tstate_head);

Member

This fails if the interp.tstate_head is still running (has a frame). We may want to consider a more graceful approach to dealing with subinterpreters that are still doing work.

Contributor Author

Hmm yes, do you have any further thoughts on this?

Member

I'll have to give it more thought (and take the time to queue up details into my mental cache 😄).

Contributor

We've recently hit this problem with a lingering subinterpreter that still had existing frames.

I solved it by adding Py_CLEAR before the Py_EndInterpreter call (which is similar to how PyThreadState_Clear handles existing frames), and now it seems to work and end gracefully. That might not be the best solution (and I don't know subinterpreters well enough to see possible issues with it), but I guess it's no worse than a fatal error.

Contributor Author

@ericsnowcurrently any new thoughts here? We might be better off having a call to chat through everything :)

        }
        subinterp = next_interp;
    }
    PyThreadState_Swap(tstate);

    // Wrap up existing "threading"-module-created, non-daemon threads.
    wait_for_thread_shutdown(tstate);
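
The loop in this hunk reads `next_interp` before calling Py_EndInterpreter(), because ending an interpreter destroys its state and the pointer can no longer be followed afterwards. The same pattern can be shown in a standalone model. This is only a sketch: `struct interp_node`, `push_interp` and `end_sub_interps` are hypothetical stand-ins, not CPython APIs.

```c
#include <stdlib.h>

/* Hypothetical stand-in for PyInterpreterState: a singly linked list
   in which one node is flagged as the main interpreter. */
struct interp_node {
    int id;
    int is_main;
    struct interp_node *next;
};

/* Prepend a node and return the new head, or NULL if allocation fails
   (the old list is left untouched in that case). */
struct interp_node *
push_interp(struct interp_node *head, int id, int is_main)
{
    struct interp_node *node = malloc(sizeof(*node));
    if (node == NULL) {
        return NULL;
    }
    node->id = id;
    node->is_main = is_main;
    node->next = head;
    return node;
}

/* Free every non-main node and return how many were freed.
   `next` is read before free(), mirroring how the loop in the diff
   calls PyInterpreterState_Next() before Py_EndInterpreter(). */
int
end_sub_interps(struct interp_node **head)
{
    int ended = 0;
    struct interp_node **link = head;
    struct interp_node *cur = *head;
    while (cur != NULL) {
        struct interp_node *next = cur->next;
        if (cur->is_main) {
            link = &cur->next;   /* keep the main node linked */
        }
        else {
            *link = next;        /* unlink, then release */
            free(cur);
            ended++;
        }
        cur = next;
    }
    return ended;
}
```

The model covers only the traversal order. It says nothing about the harder case raised in the review, where a subinterpreter's thread state is still running when finalization reaches it.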

@@ -1454,6 +1468,10 @@ new_interpreter(PyThreadState **tstate_p)
        return _PyStatus_ERR("Py_Initialize must be called first");
    }

    if (runtime->interpreters.finalizing) {
        return _PyStatus_ERR("Interpreters are being finalized");
    }

    /* Issue #10915, #15751: The GIL API doesn't work with multiple
       interpreters: disable PyGILState_Check(). */
    _PyGILState_check_enabled = 0;
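
Later commits in this PR rename the flag to `allow_new` and acquire the interpreters mutex before clearing it. The sketch below models that kind of guard with a pthread mutex. It assumes creation checks the flag under the same lock, which the diff shown here does not do, and `struct registry` and its functions are hypothetical names rather than CPython internals.

```c
#include <pthread.h>

/* Hypothetical model of the runtime's interpreter registry. */
struct registry {
    pthread_mutex_t mutex;
    int allow_new;   /* cleared once finalization starts */
    int count;       /* registered subinterpreters */
};

void
registry_init(struct registry *reg)
{
    pthread_mutex_init(&reg->mutex, NULL);
    reg->allow_new = 1;
    reg->count = 0;
}

/* Returns 0 and bumps the count, or -1 if finalization has begun.
   The check and the increment happen under one lock, so a creation
   cannot slip in after the flag has been cleared. */
int
registry_try_add(struct registry *reg)
{
    int status = -1;
    pthread_mutex_lock(&reg->mutex);
    if (reg->allow_new) {
        reg->count++;
        status = 0;
    }
    pthread_mutex_unlock(&reg->mutex);
    return status;
}

/* Block new registrations and return how many entries need cleanup. */
int
registry_begin_finalize(struct registry *reg)
{
    pthread_mutex_lock(&reg->mutex);
    reg->allow_new = 0;
    int pending = reg->count;
    pthread_mutex_unlock(&reg->mutex);
    return pending;
}
```

With this shape, a Py_NewInterpreter() call racing with finalization either registers before the flag flips, and is then cleaned up with the rest, or is rejected. That corresponds to two of the cases listed in the review of the test.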