Matrix conversion and variant issue #916

classner · 2017-06-21T22:27:44Z

Hi!

I came across the following issue working with variants of Eigen matrices (which is an amazing feature, by the way):

Issue description

When calling the function, parameterized with a variant over Eigen const matrix references with an Eigen matrix over a certain size that needs to be converted, I get a segmentation fault. This works flawlessly with smaller matrices. I do not suspect the variant implementation is the problem, because if I shift the conversion out of pybind11 and do it myself in a c++ function, it works flawlessly, also it works if the matrix doesn't have to be converted.

Reproducible example code

example.py:

#!/usr/bin/env python2
import example
import numpy as np

# Works.
example.test_v(np.ones((30, 10))[:, (7, 3, 9, 5)])

# Segmentation fault (conversion because of stride).
#example.test_v(np.ones((500000, 10))[:, (7, 3, 9, 5)])
# Segmentation fault (conversion because of dtype).
for i in range(71475):
    print i
    # Fails for me at i = 71474, i.e., 71475 is the first with too much data.
    example.test_v(np.ones((i+1,), dtype=np.uint8))

# Raises correctly.
# example.test_v_noconv(np.ones((500000, 10))[:, (7, 3, 9, 5)])

CMakeLists.txt:

cmake_minimum_required(VERSION 3.2)
project(example)

add_subdirectory(pybind11-master)
include_directories (eigen3-v3.3)
include_directories (variant-v1.1.4/include)

pybind11_add_module(example MODULE example.cpp)

add_custom_command (TARGET example POST_BUILD
  COMMAND "${CMAKE_COMMAND}" -E copy
     "$<TARGET_FILE:example>"
     "${PROJECT_SOURCE_DIR}/$<TARGET_FILE_NAME:example>"
  COMMENT "Copying to root directory")

example.cpp

#include <pybind11/pybind11.h>
#include <pybind11/eigen.h>
#include <pybind11/stl.h>
#include <Eigen/Dense>
#include <mapbox/variant.hpp>

namespace py = pybind11;
namespace mu = mapbox::util;

// Register the variant.
namespace pybind11 { namespace detail {

    template <typename... Ts>
    struct type_caster<mapbox::util::variant<Ts...>> :
    variant_caster<mapbox::util::variant<Ts...>> {};

    template <>
    struct visit_helper<mapbox::util::variant> {
      template <typename... Args>
      static auto call(Args &&...args)
        -> decltype(mapbox::util::apply_visitor(std::forward<Args>(args)...)) {
        return mapbox::util::apply_visitor(std::forward<Args>(args)...);
      }
    };

} }  // namespace detail  namespace pybind11

template <typename DT>
using Mat = Eigen::Matrix<DT,
                          Eigen::Dynamic,
                          Eigen::Dynamic,
                          Eigen::RowMajor>;

template <typename DT>
using MatCRef = Eigen::Ref<const Mat<DT>>;

struct Empty{};

PYBIND11_MODULE(example, m) {
  m.doc() = "Example module";

  m.def("test_v", [](const mu::variant<Empty,
                                       MatCRef<float>,
                                       MatCRef<double>> &val_v) {
          val_v.match([&](const auto &val) {
              std::cout << val(0, 0) << std::endl;
            },
            [](const Empty &) {});
        });

  m.def("test_v_noconv", [](const mu::variant<Empty,
                                              MatCRef<float>,
                                              MatCRef<double>> &val_v) {
          val_v.match([&](const auto &val) {
              std::cout << val(0, 0) << std::endl;
            },
            [](const Empty &) {});
        },
        py::arg().noconvert());

}

This assumes that pybind11-master, eigen3-v3.3 and variant-v1.1.4 are subfolders, variant can be downloaded from here, but probably std::variant can be used in the same way. If .noconvert() is used, the function works just as expected, which I use as a workaround.

Another probably only slightly related issue I came across in this setting is the keep_alive policy, which also doesn't work for these kind of converted arguments (I assume the original object that is passed to the variant is kept alive, but the converted object that the Eigen::Ref points to in the end is not). I'm not 100% sure on this one, though, and may open a dedicated issue once I have a proper test case.

The text was updated successfully, but these errors were encountered:

jagerman · 2017-06-22T03:32:32Z

The core issue here is that pybind's variant caster doesn't keep the subcaster alive: it creates subcasters on the fly until one succeeds, at which point it copies out the value into its own variant<X,Y,Z> but doesn't preserve the X, Y, or Z caster.

This breaks the eigen caster for types like Eigen::Ref<Eigen::MatrixXd>: the object that we load() into (i.e. the type caster value) needs to reference existing memory, which we do via an Eigen::Map that references the source numpy array data. If no conversion is needed, this is fine (for example, if you change the dtype=np.uint8 to dtype=np.float32, i.e. one of the variant types, in example.py): the Eigen::Ref directly references the numpy data.

When a conversion is needed, on the other hand—as in your test case, with either incompatible strides or an incompatible type (np.uint8)—the eigen caster holds the temporary, converted py::array as a caster member and points the Eigen::Ref into this temporary array data. The variant caster only keeps the value (the Eigen::Ref), but since it doesn't keep the eigen caster object itself around, by the time we extract the value from the variant caster it's just an Eigen::Ref that is pointing into invalid memory. You hit the segfault at 71475, but that's just a combination of luck and how numpy/python allocate and free memory: you could have hit it at 1 because the whole thing is invalid memory.

I pushed a fix to https://github.com/jagerman/pybind11/tree/variant-hold-subcaster that fixes it for variant (it's a pretty elegant fix, I think: reusing the variant template type itself to make a subcaster variant) but I haven't pushed it as a PR yet because the same issue is present in optional, map_caster, list_caster, etc. For optional storing a subcaster (or even an optional<subcaster>) would be fine, but I don't think it's a great approach for, e.g., list_caster: we'd have to store a parallel list of subcasters with one stored subcaster for every list conversion, even though the vast majority of the time we aren't likely to need the caster to be preserved. (Even the Eigen::Ref caster itself only needs the subcaster to be preserved when an intermediate copy is required).

A nicer alternative is to allow type casters to export a py::object dependency, and have variant/list/etc. casters hold the dependency when exported by a type caster. (If multiple dependences are needed, the py::object could easily hold a python list).

The caster overhaul proposed in #864 suggests a nicer way of doing this: maybe<T> could store an optional dependency in itself. A successful variant/optional/list/etc. subcaster load would then steal the dependency, if set; the stl container type casters would simply add any subcaster dependencies into their own maybe<T> dependency (which would presumably be a py::list). The Eigen::Ref caster would then simply set the dependency field to its internal temporary py::array—when it actually needs it (i.e. when a copy is needed, as in the original issue). In the vast majority of cases, no dependencies are needed, but for things like Eigen::Ref that can't avoid one, things don't break when going through subcasters.

Cc @dean0x7d, @wjakob for thoughts.

jagerman · 2017-06-22T05:18:39Z

#917 ought to fix this by keeping a dependency (or list of dependencies) as required when using stl.h subcasters. #864's maybe<T> will need to incorporate something similar, as I described above, but that PR should fix the issue in the meantime.

classner · 2017-06-22T09:38:29Z

This is great, thank you for the quick fix and the detailed discussion of the issue! The concept described in the caster overhaul sounds sensible and like a good solution on the longer run. How would keep_alives be handled in that case transparently? There are multiple scenarios possible:

No keep_alive: dependencies don't have to be stored, no problem.
keep_alive fully applied: the unconverted and the converted objects are being kept alive. Advantage: transparent for the library user, because the object that is provided as argument 'really' stays alive in any case, not only the converted one. Downside: maybe it's not necessary to keep both alive and it would be sufficient to keep the converted one, memory overhead.
keep_alive only applies to the possibly converted object. Advantage: memory efficient. Disadvantage: maybe the user expects the object to stick around, due to gc effects, destructor operations, etc. Broadly speaking, this may not be what the user 'semantically' expects when using keep_alive.

Maybe two versions of keep_alive would be appropriate, transporting this choice to the user. Or am I thinking along the wrong lines here?

jagerman · 2017-06-27T16:46:12Z

Closed #917 in favour or #924, which solves it in a nicer way (by tying the temporaries lifespan to the current in-progress pybind11 function call).

classner · 2017-06-28T17:51:08Z

I read through the comments of #924 and am not sure which one of the ways for handling keep_alives is used in the end. I assume if keep_alive is set, both, original and temporary objects are being kept alive? This is great because it ensures correctness for now. Are you considering to also introduce a keep_alive option that only either keeps the converted object alive, if required, otherwise the original one?

dean0x7d · 2017-06-28T18:07:18Z

Special effort is only taken to keep the converted object alive. The original stays alive on its own because it's owned by the caller. The original object comes into the function as a borrowed reference and we're not allowed to shorten its lifetime. On the other hand, Its lifetime could be extended if the converted object depends on it, but that usually happens automatically within Python so we don't need to worry about it (that is, we don't need a special life support system like for the converted object).

jagerman · 2017-06-28T18:23:50Z

The keep-alive in question is entirely internal to the pybind implementation. Essentially what we do is created an internal python list and, if a temporary array is needed, we store a reference to it in that list, which keeps it (and more importantly, its data) alive.

The list itself is cleared when the function call ends, at which point destruction of the temporary is allowed to occur. So essentially, when you call a bound function with a conforming type (i.e. matching dtype and compatible strides), the Eigen::Ref points directly into the input array data. If you call the bound function with a non-conforming type, it'll work similarly to calling a wrapper like this:

def f_wrap(m):
    x = np.array(m, dtype='double', order='F')
    f(x)
f_wrap(myarray)

where the temporary x lives for the duration of the f() call. (Of course we're not actually calling another function, and this works for an arbitrary number of arguments, etc., but the basic idea of keeping a reference alive for the duration of the call is the same).

classner · 2017-06-28T19:32:31Z

Ok, great! Thanks for the quick answers and the good solution!

This was referenced Jun 22, 2017

Revising the design of type_caster<T>::load() #864

Open

Allow type_casters to expose temporary dependencies #917

Closed

dean0x7d mentioned this issue Jun 26, 2017

Add a life support system for type_caster temporaries (+ binary size reduction) #924

Merged

classner closed this as completed Jun 28, 2017

rwgk mentioned this issue Feb 9, 2023

FWD pybind11 google/pybind11clif#916

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Matrix conversion and variant issue #916

Matrix conversion and variant issue #916

classner commented Jun 21, 2017

jagerman commented Jun 22, 2017

jagerman commented Jun 22, 2017

classner commented Jun 22, 2017

jagerman commented Jun 27, 2017

classner commented Jun 28, 2017

dean0x7d commented Jun 28, 2017

jagerman commented Jun 28, 2017

classner commented Jun 28, 2017

Matrix conversion and variant issue #916

Matrix conversion and variant issue #916

Comments

classner commented Jun 21, 2017

Issue description

Reproducible example code

jagerman commented Jun 22, 2017

jagerman commented Jun 22, 2017

classner commented Jun 22, 2017

jagerman commented Jun 27, 2017

classner commented Jun 28, 2017

dean0x7d commented Jun 28, 2017

jagerman commented Jun 28, 2017

classner commented Jun 28, 2017