Skip to content

[BUG] gil_scoped_acquire fails with cpython 3.9 debug build #2681

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
albanD opened this issue Nov 19, 2020 · 10 comments
Closed

[BUG] gil_scoped_acquire fails with cpython 3.9 debug build #2681

albanD opened this issue Nov 19, 2020 · 10 comments
Labels
Milestone

Comments

@albanD
Copy link
Contributor

albanD commented Nov 19, 2020

Issue description

Trying to use gil_scoped_acquire with a cpython 3.9 debug build fails with the following stack trace:

[New Thread 0x7ffff6910700 (LWP 28320)]
python: Python/ceval.c:135: is_tstate_valid: Assertion `!_PyMem_IsPtrFreed(tstate->interp)' failed.

Thread 2 "python" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff6910700 (LWP 28320)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff7c6b859 in __GI_abort () at abort.c:79
#2  0x00007ffff7c6b729 in __assert_fail_base (fmt=0x7ffff7e01588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x555555864f00 "!_PyMem_IsPtrFreed(tstate->interp)", file=0x5555558648e3 "Python/ceval.c", line=135, 
    function=<optimized out>) at assert.c:92
#3  0x00007ffff7c7cf36 in __GI___assert_fail (assertion=assertion@entry=0x555555864f00 "!_PyMem_IsPtrFreed(tstate->interp)", 
    file=file@entry=0x5555558648e3 "Python/ceval.c", line=line@entry=135, 
    function=function@entry=0x555555866760 <__PRETTY_FUNCTION__.16391> "is_tstate_valid") at assert.c:101
#4  0x0000555555679fdc in is_tstate_valid (tstate=tstate@entry=0x7ffff0000b70) at ./Include/internal/pycore_pymem.h:49
#5  0x000055555567eaa1 in take_gil (tstate=tstate@entry=0x7ffff0000b70) at Python/ceval_gil.h:227
#6  0x000055555567f053 in PyEval_AcquireThread (tstate=0x7ffff0000b70) at Python/ceval.c:385
#7  0x00007ffff6b31087 in pybind11::gil_scoped_acquire::gil_scoped_acquire (this=0x7ffff690fe90)
    at /tmp/pip-req-build-c1357uvy/.eggs/pybind11-2.6.1-py3.9.egg/pybind11/include/pybind11/pybind11.h:2097
#8  0x00007ffff6b274f0 in thread_worker () at main.cpp:7
#9  0x00007ffff6b27abf in std::__invoke_impl<void, void (*)()> (__f=<optimized out>) at /usr/include/c++/8/bits/invoke.h:89
#10 std::__invoke<void (*)()> (__fn=<optimized out>) at /usr/include/c++/8/bits/invoke.h:95
#11 std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul> (this=<optimized out>) at /usr/include/c++/8/thread:244
#12 std::thread::_Invoker<std::tuple<void (*)()> >::operator() (this=<optimized out>) at /usr/include/c++/8/thread:253
#13 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> > >::_M_run (this=<optimized out>)
    at /usr/include/c++/8/thread:196
#14 0x00007ffff6a02d84 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#15 0x00007ffff7f9b609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#16 0x00007ffff7d68103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Checking the code, this part looks especially suspicious:

/* Work around an annoying assertion in PyThreadState_Swap */
#if defined(Py_DEBUG)
PyInterpreterState *interp = tstate->interp;
tstate->interp = nullptr;
#endif
PyEval_AcquireThread(tstate);
#if defined(Py_DEBUG)
tstate->interp = interp;
#endif

Reproducible example code

This is the python side

import pybind_exit_test as m
import time

m.new_thread()
time.sleep(2)

And the cpp lib

#include <pybind11/pybind11.h>
#include <thread>

namespace py = pybind11;

void thread_worker() {
    py::gil_scoped_acquire gil;
}

void new_thread() {
    std::thread t(&thread_worker);
    t.detach();
}

PYBIND11_MODULE(pybind_exit_test, m) {
    m.def("new_thread", &new_thread);
}
@YannickJadoul
Copy link
Collaborator

Which version of pybind11 are you using? This seems familiar, somehow.

@albanD
Copy link
Contributor Author

albanD commented Nov 19, 2020

I ran the tests from https://github.com/pybind/pybind11/tree/59a2ac2745d8a57ac94c6accced73620d59fb844
But the linked code that looks suspicious is from current master.

@YannickJadoul
Copy link
Collaborator

Thanks!
It does look suspicious, yes :-/

It also seems very familiar, somehow, but I can't find another existing issue right now :-(

@YannickJadoul YannickJadoul added this to the v2.6.2 milestone Nov 19, 2020
@albanD
Copy link
Contributor Author

albanD commented Nov 19, 2020

From checking the code,
I think that the addition of e2b884c made this trick unnecessary and the two #if defined(Py_DEBUG) blocks can just be removed?

@YannickJadoul
Copy link
Collaborator

Would you mind trying that out, @albanD?

@henryiii, can/should we build with Python debug versions for some CI runs? (not poking you to do that; just wondering if we should/if it's easy)

@albanD
Copy link
Contributor Author

albanD commented Nov 20, 2020

It runs fine for now but there are other issues with debug build :/ I'll open an issue about it shortly and once I can actually get all the tests to pass without crashing I try and submit a patch.

@YannickJadoul
Copy link
Collaborator

Thanks, that's very much appreciated! :-)
Also, if already some debug crashes get fixed, that's already progress anyway.

@henryiii
Copy link
Collaborator

Is this closable with the merged PR mentioned above?

@bstaletic
Copy link
Collaborator

If I remember correctly, yes, this can be closed.

@henryiii
Copy link
Collaborator

Good enough. If someone else remembers differently, they can reopen. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants