Skip to content

Matrix conversion and variant issue #916

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
classner opened this issue Jun 21, 2017 · 8 comments · Fixed by #924
Closed

Matrix conversion and variant issue #916

classner opened this issue Jun 21, 2017 · 8 comments · Fixed by #924

Comments

@classner
Copy link

Hi!

I came across the following issue working with variants of Eigen matrices (which is an amazing feature, by the way):

Issue description

When calling the function, parameterized with a variant over Eigen const matrix references with an Eigen matrix over a certain size that needs to be converted, I get a segmentation fault. This works flawlessly with smaller matrices. I do not suspect the variant implementation is the problem, because if I shift the conversion out of pybind11 and do it myself in a c++ function, it works flawlessly, also it works if the matrix doesn't have to be converted.

Reproducible example code

example.py:

#!/usr/bin/env python2
import example
import numpy as np

# Works.
example.test_v(np.ones((30, 10))[:, (7, 3, 9, 5)])

# Segmentation fault (conversion because of stride).
#example.test_v(np.ones((500000, 10))[:, (7, 3, 9, 5)])
# Segmentation fault (conversion because of dtype).
for i in range(71475):
    print i
    # Fails for me at i = 71474, i.e., 71475 is the first with too much data.
    example.test_v(np.ones((i+1,), dtype=np.uint8))

# Raises correctly.
# example.test_v_noconv(np.ones((500000, 10))[:, (7, 3, 9, 5)])

CMakeLists.txt:

cmake_minimum_required(VERSION 3.2)
project(example)

add_subdirectory(pybind11-master)
include_directories (eigen3-v3.3)
include_directories (variant-v1.1.4/include)

pybind11_add_module(example MODULE example.cpp)

add_custom_command (TARGET example POST_BUILD
  COMMAND "${CMAKE_COMMAND}" -E copy
     "$<TARGET_FILE:example>"
     "${PROJECT_SOURCE_DIR}/$<TARGET_FILE_NAME:example>"
  COMMENT "Copying to root directory")

example.cpp

#include <pybind11/pybind11.h>
#include <pybind11/eigen.h>
#include <pybind11/stl.h>
#include <Eigen/Dense>
#include <mapbox/variant.hpp>

namespace py = pybind11;
namespace mu = mapbox::util;

// Register the variant.
namespace pybind11 { namespace detail {

    template <typename... Ts>
    struct type_caster<mapbox::util::variant<Ts...>> :
    variant_caster<mapbox::util::variant<Ts...>> {};

    template <>
    struct visit_helper<mapbox::util::variant> {
      template <typename... Args>
      static auto call(Args &&...args)
        -> decltype(mapbox::util::apply_visitor(std::forward<Args>(args)...)) {
        return mapbox::util::apply_visitor(std::forward<Args>(args)...);
      }
    };

} }  // namespace detail  namespace pybind11

template <typename DT>
using Mat = Eigen::Matrix<DT,
                          Eigen::Dynamic,
                          Eigen::Dynamic,
                          Eigen::RowMajor>;

template <typename DT>
using MatCRef = Eigen::Ref<const Mat<DT>>;

struct Empty{};

PYBIND11_MODULE(example, m) {
  m.doc() = "Example module";

  m.def("test_v", [](const mu::variant<Empty,
                                       MatCRef<float>,
                                       MatCRef<double>> &val_v) {
          val_v.match([&](const auto &val) {
              std::cout << val(0, 0) << std::endl;
            },
            [](const Empty &) {});
        });

  m.def("test_v_noconv", [](const mu::variant<Empty,
                                              MatCRef<float>,
                                              MatCRef<double>> &val_v) {
          val_v.match([&](const auto &val) {
              std::cout << val(0, 0) << std::endl;
            },
            [](const Empty &) {});
        },
        py::arg().noconvert());

}

This assumes that pybind11-master, eigen3-v3.3 and variant-v1.1.4 are subfolders, variant can be downloaded from here, but probably std::variant can be used in the same way. If .noconvert() is used, the function works just as expected, which I use as a workaround.

Another probably only slightly related issue I came across in this setting is the keep_alive policy, which also doesn't work for these kind of converted arguments (I assume the original object that is passed to the variant is kept alive, but the converted object that the Eigen::Ref points to in the end is not). I'm not 100% sure on this one, though, and may open a dedicated issue once I have a proper test case.

@jagerman
Copy link
Member

The core issue here is that pybind's variant caster doesn't keep the subcaster alive: it creates subcasters on the fly until one succeeds, at which point it copies out the value into its own variant<X,Y,Z> but doesn't preserve the X, Y, or Z caster.

This breaks the eigen caster for types like Eigen::Ref<Eigen::MatrixXd>: the object that we load() into (i.e. the type caster value) needs to reference existing memory, which we do via an Eigen::Map that references the source numpy array data. If no conversion is needed, this is fine (for example, if you change the dtype=np.uint8 to dtype=np.float32, i.e. one of the variant types, in example.py): the Eigen::Ref directly references the numpy data.

When a conversion is needed, on the other hand—as in your test case, with either incompatible strides or an incompatible type (np.uint8)—the eigen caster holds the temporary, converted py::array as a caster member and points the Eigen::Ref into this temporary array data. The variant caster only keeps the value (the Eigen::Ref), but since it doesn't keep the eigen caster object itself around, by the time we extract the value from the variant caster it's just an Eigen::Ref that is pointing into invalid memory. You hit the segfault at 71475, but that's just a combination of luck and how numpy/python allocate and free memory: you could have hit it at 1 because the whole thing is invalid memory.

I pushed a fix to https://github.com/jagerman/pybind11/tree/variant-hold-subcaster that fixes it for variant (it's a pretty elegant fix, I think: reusing the variant template type itself to make a subcaster variant) but I haven't pushed it as a PR yet because the same issue is present in optional, map_caster, list_caster, etc. For optional storing a subcaster (or even an optional<subcaster>) would be fine, but I don't think it's a great approach for, e.g., list_caster: we'd have to store a parallel list of subcasters with one stored subcaster for every list conversion, even though the vast majority of the time we aren't likely to need the caster to be preserved. (Even the Eigen::Ref caster itself only needs the subcaster to be preserved when an intermediate copy is required).

A nicer alternative is to allow type casters to export a py::object dependency, and have variant/list/etc. casters hold the dependency when exported by a type caster. (If multiple dependences are needed, the py::object could easily hold a python list).

The caster overhaul proposed in #864 suggests a nicer way of doing this: maybe<T> could store an optional dependency in itself. A successful variant/optional/list/etc. subcaster load would then steal the dependency, if set; the stl container type casters would simply add any subcaster dependencies into their own maybe<T> dependency (which would presumably be a py::list). The Eigen::Ref caster would then simply set the dependency field to its internal temporary py::array—when it actually needs it (i.e. when a copy is needed, as in the original issue). In the vast majority of cases, no dependencies are needed, but for things like Eigen::Ref that can't avoid one, things don't break when going through subcasters.

Cc @dean0x7d, @wjakob for thoughts.

@jagerman
Copy link
Member

#917 ought to fix this by keeping a dependency (or list of dependencies) as required when using stl.h subcasters. #864's maybe<T> will need to incorporate something similar, as I described above, but that PR should fix the issue in the meantime.

@classner
Copy link
Author

This is great, thank you for the quick fix and the detailed discussion of the issue! The concept described in the caster overhaul sounds sensible and like a good solution on the longer run. How would keep_alives be handled in that case transparently? There are multiple scenarios possible:

  • No keep_alive: dependencies don't have to be stored, no problem.
  • keep_alive fully applied: the unconverted and the converted objects are being kept alive. Advantage: transparent for the library user, because the object that is provided as argument 'really' stays alive in any case, not only the converted one. Downside: maybe it's not necessary to keep both alive and it would be sufficient to keep the converted one, memory overhead.
  • keep_alive only applies to the possibly converted object. Advantage: memory efficient. Disadvantage: maybe the user expects the object to stick around, due to gc effects, destructor operations, etc. Broadly speaking, this may not be what the user 'semantically' expects when using keep_alive.

Maybe two versions of keep_alive would be appropriate, transporting this choice to the user. Or am I thinking along the wrong lines here?

@jagerman
Copy link
Member

Closed #917 in favour or #924, which solves it in a nicer way (by tying the temporaries lifespan to the current in-progress pybind11 function call).

@classner
Copy link
Author

I read through the comments of #924 and am not sure which one of the ways for handling keep_alives is used in the end. I assume if keep_alive is set, both, original and temporary objects are being kept alive? This is great because it ensures correctness for now. Are you considering to also introduce a keep_alive option that only either keeps the converted object alive, if required, otherwise the original one?

@dean0x7d
Copy link
Member

Special effort is only taken to keep the converted object alive. The original stays alive on its own because it's owned by the caller. The original object comes into the function as a borrowed reference and we're not allowed to shorten its lifetime. On the other hand, Its lifetime could be extended if the converted object depends on it, but that usually happens automatically within Python so we don't need to worry about it (that is, we don't need a special life support system like for the converted object).

@jagerman
Copy link
Member

The keep-alive in question is entirely internal to the pybind implementation. Essentially what we do is created an internal python list and, if a temporary array is needed, we store a reference to it in that list, which keeps it (and more importantly, its data) alive.

The list itself is cleared when the function call ends, at which point destruction of the temporary is allowed to occur. So essentially, when you call a bound function with a conforming type (i.e. matching dtype and compatible strides), the Eigen::Ref points directly into the input array data. If you call the bound function with a non-conforming type, it'll work similarly to calling a wrapper like this:

def f_wrap(m):
    x = np.array(m, dtype='double', order='F')
    f(x)
f_wrap(myarray)

where the temporary x lives for the duration of the f() call. (Of course we're not actually calling another function, and this works for an arbitrary number of arguments, etc., but the basic idea of keeping a reference alive for the duration of the call is the same).

@classner
Copy link
Author

Ok, great! Thanks for the quick answers and the good solution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants