-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Allow binding factory functions as constructors #805
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
(Credit to @YannickJadoul and @bmerry for the idea). |
cf0dcce
to
462db1b
Compare
Also worth noting: this isn't affected by #804, because the |
48665d6
to
2eb05d6
Compare
This strikes me a fairly intrusive change to support a relatively narrow set of situations. (In particular, the As an alternative, could we have something like: m.def("__init__", py::factory_adapter(&Example::create)); where the |
Yeah, that should be doable. I'll take another shot at this in the next day or two. |
And that MSVC job failure is more bizarre than ever. |
I don't want to come across whiny, but could I bring up that gist I've mentioned in the gitter chat, again? I'm sure it could be improved a lot with better knowledge of pybind11, but what it more or less does is:
|
2eb05d6
to
16567c3
Compare
Redesigned this. The redesign is far less intrusive, and avoids some unnecessary Python object creation in several cases. I still want to add some tests to verify all the various edge cases (e.g. for things like returning an alias pointer or value, or a subclass pointer). |
docs/advanced/classes.rst
Outdated
// Bind an existing pointer-returning factory function: | ||
.def(py::init_factory(&Example::create)) | ||
// Similar, but returns the pointer wrapped in a holder: | ||
.def("__init__", []() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
init_factory
missing.
Nifty -- +1 for covering all the different kinds of things that could be returned. Since the |
As far as organization, moving it to another header is a good idea. It can't quite be all moved, though: the |
Yep, that is to be expected. With a forward declaration, it should be possible though, no?. |
I assume you meant making it an optional (i.e. requires an extra include before using) header. |
fcef625
to
dfaec31
Compare
I don't know about all the subtleties of handling e.g. I've got two possibly useful things:
But for the rest, just "wow", again! |
Re: The SFINAE was necessary because the |
Aha, my bad, sorry But thanks again for the quick and elegant solution! |
513bfaf
to
650b65b
Compare
I've added various tests and fixes related to value and alias initialization, and moved the tests out into a new |
d8d66ea
to
19d0654
Compare
3ba8721
to
b52a0dc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I've said before, I support the inclusion of the factory by default and the changes to the docs to prefer py::init(...)
over placement new. The lazy allocation added to the latest revision is also great and removes any runtime overhead concerns of the factory. It also gives users more freedom regarding various new
operators (avoiding placement new
), which is a nice bonus.
However, the latest commit, which implements py::init<...>()
in terms of the factory, results in a pretty large +5% increase in binary size and about +2% for compile time (in both cases, not counting the new tests).
This could be addressed either by reverting the py::init<...>()
changes or by further slimming down the factory implementation. Preferably the latter. A comment below points out the possibility of replacing a couple of templates with a single non-template function. For me, that reduced the binary size gap to +2% (from +5%). More code could similarly be detemplatized to bring the binary size back on par with the original py::init<...>()
.
include/pybind11/factory_support.h
Outdated
#if defined(PYBIND11_CPP14) | ||
cl.def("__init__", [cl_type, func = std::move(class_factory)] | ||
#else | ||
CFuncType func(std::move(class_factory)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const auto &func = class_factory;
should be enough for the capture (just copy, instead of move + copy). And similar below for class_func
and alias_func
.
include/pybind11/factory_support.h
Outdated
NAMESPACE_END(factory) | ||
|
||
// Simple, load-only type caster that does a type check and value_and_holder construction. | ||
template <typename T> class type_caster<factory::v_h_proxy<T>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can see, these are two extra template instantiations per class (proxy + caster specialization) which could be replaced with a non-template function like the one below (in that case the type of the first argument of the __init__
lambda changes from v_h_proxy<Cpp<Class>> &
to object
).
PYBIND11_NOINLINE inline value_and_holder load_v_h(handle src, type_info *tinfo) {
if (!src || !tinfo)
throw type_error();
auto *inst = reinterpret_cast<instance *>(src.ptr());
auto result = inst->get_value_and_holder(tinfo, false);
if (!result.inst)
throw type_error();
return result;
}
include/pybind11/factory_support.h
Outdated
no_nullptr(ptr); | ||
if (Py_TYPE(v_h.inst) != cl_type) { | ||
// Inherited from on the Python side: an alias instance is required | ||
if (!is_alias<Class>(ptr)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this code path covered by the tests? Commenting out everything below except for throw type_error(...)
doesn't seem to produce any test errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should have been, but may have been dropped accidentally somewhere along the line. I'll investigate.
include/pybind11/factory_support.h
Outdated
static void construct(value_and_holder &v_h, Cpp<Class> &&result, PyTypeObject *) { | ||
static_assert(std::is_move_constructible<Cpp<Class>>::value, | ||
"pybind11::init() return-by-value factory function requires a movable class"); | ||
new (preallocate(v_h)) Cpp<Class>(std::move(result)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to do deallocate(v_h) = new Cpp<Class>(std::move(result));
? Here, and in other places where preallocate
. It would reduce the reliance on placement new.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
preallocate
doesn't do anything if the value is already preallocated, so if you mix old-style placement-new __init__
s with factory inits you get fewer allocation/deallocations. If placement-new constructors are rare/deprecated, it seems a reasonable approach though.
(And on a side note, non-factory inits have a pretty big issue with the constructor being invoked multiple time if being used under multiple inheritance, so discouraging them is probably a good thing).
@@ -203,6 +199,8 @@ class cpp_function : public function { | |||
a.descr = strdup(a.value.attr("__repr__")().cast<std::string>().c_str()); | |||
} | |||
|
|||
rec->is_constructor = !strcmp(rec->name, "__init__") || !strcmp(rec->name, "__setstate__"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this need to be moved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was part of an earlier commit that was since dropped to fix the constructor signature when the first argument was a handle
, but that got replaced with v_h_proxy<Cpp<Class>>
. I think you want me to go back to that, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The constructor signature rewriting got reintroduced with removal of the proxy class, so I kept this moved.
include/pybind11/pybind11.h
Outdated
} | ||
|
||
template <typename, typename, typename, typename, typename...> friend struct detail::factory::init; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't need to be friend
.
include/pybind11/factory_support.h
Outdated
|
||
// Makes sure the `value` for the given value_and_holder is not allocated; if it is, deallocate it. | ||
// Returns the (null) value pointer reference. | ||
inline void *&deallocate(value_and_holder &v_h) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under what circumstances is the value already allocated? If __init__
is called multiple times or something else?
BTW In this PR, a good chunk of the doc comments for these internal functions are just restating what the code is doing, but not really why that needs to be done, which is the more important part. E.g. in this case, I find that if (v_h) v_h.type->dealloc(v_h);
is a more concise statement of what is happening compared to the comment, but neither really say why that's needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One possible case is when you have overloaded constructors when you have mixed placement new and init factory constructors, such as:
cl.def("__init__", [](Class &self, double i) { new (&self) Class(i); });
cl.def("__init__", py::init([](int j) { return new Class(j); });
when called from Python as Class(3)
to go to the second overload. Because the lazy allocation happens during argument loading, the &self
will get preallocated even though the overload fails, and the other init factories need to deal with it.
I suppose we could deprecated placement new constructors and get rid of this eventually (i.e. in an eventual 3.0), in which case we wouldn't need to worry about this case.
Re: comments, that's a fair criticism; I'll edit them to be more "why" than "what".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that makes sense with the mixed constructors. Deprecating the placement new constructors sounds good to me.
I've been a bit swamped with other things for the last few days, but I'll try to get this PR updated tomorrow. |
b52a0dc
to
972a4de
Compare
The last commit just pushed should address the outstanding comments. There is still some growth in .so size (~19KB) from the If all looks good, there's one last cleanup I think we should do which is to rename |
a3c7608
to
64075f9
Compare
I've renamed |
(The change looks bigger than it actually is; most of the commit diff is just from moving the static functions in the factory class out, and moving the |
d6aa3ff
to
2629ad0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me overall. Moving all the helper functions out of the factory class is very nice. I left some comments below, but nothing major. The file size is significantly lower than the previous iteration of this PR. I think the current minor increase is well worth it for the multiple inheritance fix and it also reduces the reliance on placement-new and eliminates the allocation/construction split in most cases, which is a nice advantage.
Oh, there seems to be a bit of a leak in the test output capture: https://travis-ci.org/pybind/pybind11/jobs/264412442#L1305
(I've been experimenting a bit and I think there's a way to shave off a bit more size from the binary, but I need to test it more thoroughly and I don't want to hold up this PR anymore. I think it's good as is and I'll see about this extra size saving separately.)
include/pybind11/pybind11.h
Outdated
@@ -244,6 +242,12 @@ class cpp_function : public function { | |||
.cast<std::string>() + "."; | |||
#endif | |||
signature += tinfo->type->tp_name; | |||
} else if (rec->is_constructor && detail::same_type(typeid(handle), *t) && rec->scope) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this match only self
or any argument of type handle
? Add arg_index == 0
?
include/pybind11/detail/init.h
Outdated
|
||
// Takes a Cpp pointer and returns true if it actually is a polymorphic Alias instance. | ||
template <typename Class, enable_if_t<Class::has_alias, int> = 0> | ||
static bool is_alias(Cpp<Class> *ptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftover from old implementation: this and many of the functions below are static
but they shouldn't be.
include/pybind11/detail/init.h
Outdated
return dynamic_cast<Alias<Class> *>(ptr) != nullptr; | ||
} | ||
// Failing fallback version of the above for a no-alias class (always returns false) | ||
template <typename Class> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused template parameters should be unnamed or commented out.
tests/test_factory_constructors.py
Outdated
|
||
@pytest.unsupported_on_py2 | ||
@pytest.mark.parametrize('which, bad', [(1, 1), (1, 2), (6, 1), (6, 2), (6, 3), (6, 4)]) | ||
def test_invalid_self(which, bad, capture): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused capture
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was to suppress the capture leak (which it does on my system, but apparently doesn't on one of the travis-ci builds).
I don't quite understand why that capture leak is happening at all, though. Could the mark.parametrized
be involved somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deparameterizing it worked around the issue.
tests/test_factory_constructors.py
Outdated
a = m.TestFactory2(tag.pointer, 1) | ||
m.TestFactory1.__init__(a, tag.pointer) | ||
elif bad == 2: | ||
m.TestFactory1.__init__(NotPybindDerived.__new__(NotPybindDerived), tag.pointer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why __new__
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking for something allocated but uninitialized. Will it hurt anything? (Edit: It also came straight out of one of the suggestions by @YannickJadoul earlier in the PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to just pass an ordinary non-pybind instance (essentially matching to previous case, except for the non-pybind aspect).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't hurt anything I just thought it looked like unusual syntax for the functionally equivalent NotPybindDerived()
. In general, for a native Python class A
, the constructor A() = A.__new__() + A.__init__()
. If A
doesn't have __init__()
then A() == A.__new__()
, which is the case for NotPybindDerived
in the test.
include/pybind11/detail/init.h
Outdated
}; | ||
|
||
// Implementation class for py::init(Func) and py::init(Func, AliasFunc) | ||
template <typename CFunc, typename CReturn, typename AFuncIn, typename AReturn, typename... Args> struct factory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CReturn
and AReturn
are not used anymore.
include/pybind11/detail/init.h
Outdated
// Fallback of the above for aliases without an `Alias(Cpp &&) constructor; does nothing and | ||
// returns false to signal the non-availability of the special constructor. | ||
template <typename Class, enable_if_t<!std::is_constructible<Alias<Class>, Cpp<Class> &&>::value, int> = 0> | ||
static constexpr bool construct_alias_from_cpp(value_and_holder &, Cpp<Class> &&) { return false; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The control flow built around this feels a bit more complicated than it should be. This return false
results in two different throw type_error
messages below, but they have the same cause. Consider switching to void
functions and place the throw
statement in the false
fallback. (I'm also using simple tag dispatch in the example below because it's a nice lightweight option, especially for binary overloads, and it also nicely maps to the future if constexpr
.)
template <typename Class>
using is_alias_constructible = std::is_constructible<Alias<Class>, Cpp<Class> &&>;
template <typename Class>
void construct_alias_from_cpp(std::true_type /*is_alias_constructible*/,
value_and_holder &v_h, Cpp<Class> &&base) {
deallocate(v_h) = new Alias<Class>(std::move(base));
}
template <typename Class>
[[noreturn]] void construct_alias_from_cpp(std::false_type /*is_alias_constructible*/,
value_and_holder &, Cpp<Class> &&) {
throw type_error("pybind11::init(): unable to convert returned instance to required "
"alias class: no `Alias<Class>(Class &&)` constructor available");
}
if (Class::has_alias && need_alias && !is_alias<Class>(ptr)) {
...
construct_alias_from_cpp<Class>(is_alias_constructible<Class>{}, v_h, std::move(*ptr));
} else {
deallocate(v_h) = ptr;
}
We currently allocate instance values when creating the instance itself (except when constructing the instance for a `cast()`), but there is no particular reason to do so: the instance itself and the internals (for a non-simple layout) are allocated via Python, with no reason to expect better locality from the invoked `operator new`. Moreover, it makes implementation of factory function constructors trickier and slightly less efficient: they don't use the pre-eallocate the memory, which means there is a pointless allocation and free. This commit makes the allocation lazy: instead of preallocating when creating the instance, the allocation happens when the instance is first loaded (if null at that time). In addition to making it more efficient to deal with cases that don't need preallocation, this also allows for a very slight performance increase by not needing to look up the instances types during allocation. (There is a lookup during the eventual load, of course, but that is happening already).
`function_signature_t` extracts the function type from a function, function pointer, or lambda. `is_lambda` (which is really `is_not_a_function_or_pointer_or_member_pointer`, but that name is a bit too long) checks whether the type is (in the approprate context) a lambda. `is_function_pointer` checks whether the type is a pointer to a function.
An alias can be used for two main purposes: to override virtual methods, and to add some extra data to a class needed for the pybind-wrapper. Both of these absolutely require that the wrapped class be polymorphic so that virtual dispatch and destruction, respectively, works.
8fd16e2
to
2dc9078
Compare
This allows you to use: cls.def(py::init(&factory_function)); where `factory_function` returns a pointer, holder, or value of the class type (or a derived type). Various compile-time checks (static_asserts) are performed to ensure the function is valid, and various run-time type checks where necessary. Some other details of this feature: - The `py::init` name doesn't conflict with the templated no-argument `py::init<...>()`, but keeps the naming consistent: the existing templated, no-argument one wraps constructors, the no-template, function-argument one wraps factory functions. - If returning a CppClass (whether by value or pointer) when an CppAlias is required (i.e. python-side inheritance and a declared alias), a dynamic_cast to the alias is attempted (for the pointer version); if it fails, or if returned by value, an Alias(Class &&) constructor is invoked. If this constructor doesn't exist, a runtime error occurs. - for holder returns when an alias is required, we try a dynamic_cast of the wrapped pointer to the alias to see if it is already an alias instance; if it isn't, we raise an error. - `py::init(class_factory, alias_factory)` is also available that takes two factories: the first is called when an alias is not needed, the second when it is. - Reimplement factory instance clearing. The previous implementation failed under python-side multiple inheritance: *each* inherited type's factory init would clear the instance instead of only setting its own type value. The new implementation here clears just the relevant value pointer. - dealloc is updated to explicitly set the leftover value pointer to nullptr and the `holder_constructed` flag to false so that it can be used to clear preallocated value without needing to rebuild the instance internals data. - Added various tests to test out new allocation/deallocation code. - With preallocation now done lazily, init factory holders can completely avoid the extra overhead of needing an extra allocation/deallocation. - Updated documentation to make factory constructors the default advanced constructor style. - If an `__init__` is called a second time, we have two choices: we can throw away the first instance, replacing it with the second; or we can ignore the second call. The latter is slightly easier, so do that.
This reimplements the py::init<...> implementations using the various functions added to support `py::init(...)`, and moves the implementing structs into `detail/init.h` from `pybind11.h`. It doesn't simply use a factory directly, as this is a very common case and implementation without an extra lambda call is a small but useful optimization. This, combined with the previous lazy initialization, also avoids needing placement new for `py::init<...>()` construction: such construction now occurs via an ordinary `new Type(...)`. A consequence of this is that it also fixes a potential bug when using multiple inheritance from Python: it was very easy to write classes that double-initialize an existing instance which had the potential to leak for non-pod classes. With the new implementation, an attempt to call `__init__` on an already-initialized object is now ignored. (This was already done in the previous commit for factory constructors). This change exposed a few warnings (fixed here) from deleting a pointer to a base class with virtual functions but without a virtual destructor. These look like legitimate warnings that we shouldn't suppress; this adds virtual destructors to the appropriate classes.
2dc9078
to
8a48285
Compare
Fixed up everything and squashed the commits down; I think this is ready to go (once the builds finish). |
Merged! |
auto *cl_type = get_type_info(typeid(Cpp<Class>)); | ||
cl.def("__init__", [cl_type](handle self_, Args... args) { | ||
auto v_h = load_v_h(self_, cl_type); | ||
if (v_h.instance_registered()) return; // Ignore duplicate __init__ calls (see above) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stumbled across this comment (duplicate init calls) while looking at init.h
and was a bit puzzled. I wasn't able to find what the "see above" comment refers to. Would you mind clarifying this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the code got shuffled around; the comment it's referring to is at lines 258/259 now. I'll fix it up on master
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(They used to be right next to each other).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 9f6a636.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, just got a conflict with that commit :) Those checks are gone now in #1014.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was pretty incredibly bad timing :)
It was only necessary because of a bug the pull request (pybind/pybind11#805).
Edit:
Redid this (a few times). The interface is now:
where
factory_function
is some pointer- or holder-generating factory function or lambda of the type thatcls
binds, such as:Internally, this still results in a (non-static)
__init__
method, but we handle the C++-static-factory <-> python method translation by calling the factory function, then stealing the resulting object internal pointer and holder.