Skip to content

Crash when Python lambdas (with captured state) are called from multiple C++ threads (fix inside) #1587

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
uentity opened this issue Nov 2, 2018 · 1 comment · Fixed by #1595

Comments

@uentity
Copy link
Contributor

uentity commented Nov 2, 2018

Intro

Current pybind11 functional type_caster support invoking callbacks passed from Python in async way, i.e. from multiple C++ threads. It implements that by holding GIL while functor is being executed according to following essential code from type_caster::load() that initializes value member:

...
		value = [func](Args... args) -> Return {
			gil_scoped_acquire acq;
			object retval(func(std::forward<Args>(args)...));
			/* Visual studio 2015 parser issue: need parentheses around this expression */
			return (retval.template cast<Return>());
		};

Notice the sentence gil_scoped_acquire acq; that captures and releases GIL in RAII fashion.

Problem

Problem with code above is that destruction of value (and captured Python functor func) happens after GIL has been released. Then, if functor func is for example Python lambda that captures some variables, these variables are being freed (reference counter decremented) when GIL is no longer held.

Notice that all this process of functor invoke and destruction can execute in some worker C++ thread and that leads to UB (immediate terminate in my experience).

Problem isn't arising If func is pure function or a stateless lambda.

Solution

I've made a workaround to this issue by replacing the code above with the following:

...
		// dynamically allocated lambda that actually invokes passed functor
		auto f = new auto([func](Args... args) -> Return {
			object retval(func(std::forward<Args>(args)...));
			/* Visual studio 2015 parser issue: need parentheses around this expression */
			return (retval.template cast<Return>());
		});
		if(!f) return false;

		// ensure GIL is released AFTER functor destructor is called
		value = [f](Args... args) -> Return {
			gil_scoped_acquire acq;
			(*f)(std::forward<Args>(args)...);
			delete f;
		};

Basically what it does -- it keeps captured GIL until functor is finished and completely destructed. This approach completely cures the problem.
However the downside is that it dynamically allocates the inner lambda (that actually invokes func). With C++17 lambda will be constructed in-place without copy/move involved. But still this proposal may be sub-optimal.

So, I'm calling for core devs here for looking into this issue because it leads to rather severe limitations of Python callbacks usage.

@uentity uentity changed the title Crash when Python lambdas (with captured state) are called from miltiple C++ threads (fix inside) Crash when Python lambdas (with captured state) are called from multiple C++ threads (fix inside) Nov 2, 2018
uentity added a commit to uentity/pybind11 that referenced this issue Nov 6, 2018
…1587)

Ensure GIL is released after functor destructor finished (not only
during functor execution as in previous implementation).
uentity added a commit to uentity/pybind11 that referenced this issue Nov 6, 2018
…1587)

Ensure GIL is released after functor destructor finished (not only
during functor execution as in previous implementation).
@uentity
Copy link
Contributor Author

uentity commented Nov 6, 2018

I came with a much nicer solution in PR #1595. Please, take a look.

uentity added a commit to uentity/pybind11 that referenced this issue Nov 9, 2018
uentity added a commit to uentity/pybind11 that referenced this issue Nov 9, 2018
uentity added a commit to uentity/pybind11 that referenced this issue Nov 13, 2018
uentity added a commit to uentity/pybind11 that referenced this issue Dec 2, 2018
uentity added a commit to uentity/pybind11 that referenced this issue Dec 10, 2018
uentity added a commit to uentity/pybind11 that referenced this issue Mar 29, 2019
uentity added a commit to uentity/pybind11 that referenced this issue Jun 6, 2019
uentity added a commit to uentity/pybind11 that referenced this issue Jun 6, 2019
wjakob pushed a commit that referenced this issue Jun 11, 2019
…1595)

* Fix async Python functors invoking from multiple C++ threads (#1587)

Ensure GIL is held during functor destruction.

* Add async Python callbacks test that runs in separate Python thread
wjakob pushed a commit that referenced this issue Jun 11, 2019
…1595)

* Fix async Python functors invoking from multiple C++ threads (#1587)

Ensure GIL is held during functor destruction.

* Add async Python callbacks test that runs in separate Python thread
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant