Skip to content

[BUG]: Segmentation fault on Windows with abstract classss #3217

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
3 tasks done
TomWildenhain-Microsoft opened this issue Aug 25, 2021 · 9 comments
Open
3 tasks done
Labels
triage New bug, unverified

Comments

@TomWildenhain-Microsoft
Copy link

TomWildenhain-Microsoft commented Aug 25, 2021

Required prerequisites

Problem description

A have two abstract C++ class: Door and Knob. Door returns a Knob with Door::GetKnob. The knob can be pulled with Knob::Pull(). Door successfully calls Door::GetKnob and gets a non-null knob, but the code crashes when Knob::Pull is called.

Reproducible example code

C++

#include <iostream>
#include <pybind11/pybind11.h>

namespace py = pybind11;

int simple_add(int i, int j) {
    return i + j;
}

class Knob {
 public:
  virtual void Pull() = 0;
};

class Door {
 public:
  virtual std::string_view Name() = 0;
  virtual Knob& GetKnob() = 0;
};

void open_door(Door& door) {
  std::string_view name = door.Name();
  std::cout << name << std::endl;
  Knob& knob = door.GetKnob();
  std::cout << "Got the knob" << std::endl;
  knob.Pull();
  std::cout << "Pulled the knob" << std::endl;
}

class PyKnob : public Knob {
 public:
  using Knob::Knob;
  void Pull() {
    PYBIND11_OVERRIDE_PURE(void, Knob, Pull);
  }
};

class PyDoor : public Door {
 public:
  using Door::Door;
  std::string_view Name() {
    PYBIND11_OVERRIDE_PURE(std::string_view, Door, Name);
  }
  Knob& GetKnob() {
    PYBIND11_OVERRIDE_PURE(Knob&, Door, GetKnob);
  }
};

PYBIND11_MODULE(pybind_test, m) {
  m.def("simple_add", &simple_add);
  m.def("open_door", &open_door);
  py::class_<Knob, PyKnob>(m, "Knob")
    .def(py::init<>())
    .def("Pull", &Knob::Pull);
  py::class_<Door, PyDoor>(m, "Door")
    .def(py::init<>())
    .def("Name", &Door::Name)
    .def("GetKnob", &Door::GetKnob);
}

Python

import pybind_test

print(pybind_test.simple_add(3, 4))

class MyKnob(pybind_test.Knob):
    def __init__(self):
        pybind_test.Knob.__init__(self)

    def Pull(self):
        print("Hey!")

class MyDoor(pybind_test.Door):
    def __init__(self, name):
        pybind_test.Door.__init__(self)
        self.name = name

    def Name(self):
        return self.name
    
    def GetKnob(self):
        return MyKnob()

print("Opening")
pybind_test.open_door(MyDoor("I'm a door."))
print("Done")

Output

7
Opening
I'm a door.
Got the knob

Program crashes at this point. (while calling knob.Pull();)

@TomWildenhain-Microsoft TomWildenhain-Microsoft added the triage New bug, unverified label Aug 25, 2021
@TomWildenhain-Microsoft
Copy link
Author

The problem appears to be that Knob is getting deallocated before it is returned by

    def GetKnob(self):
        return MyKnob()

Setting a global
MYKNOB = MyKnob()
and returning it

    def GetKnob(self):
        return MYKNOB 

works.

@TomWildenhain-Microsoft
Copy link
Author

I tried the return value policies but it's a little tricky here, since the passing by reference means the C++ can't take ownership of the knob and python will always destroy the object if it is returned without referencing it elsewhere. The above hack actually works fairly well for my purposes. Feel free to close this if there is no better resolution.

btw the classes page of the documentation should maybe link to the return value policies page.

@henryiii
Copy link
Collaborator

I get:

RuntimeError: Tried to call pure virtual function "Knob::Pull"

@henryiii
Copy link
Collaborator

henryiii commented Aug 26, 2021

Also, this looks very problematic to me. You are returning a reference to an object you've created, but how does C++ know when to remove that object, and how does it know the object is actually larger than the reference you have (slicing issue)? If you can use smart pointers, I think this would be much simpler and easier. If not, I'd try to use move, not references. Expecting a user to remember to add a & for a return value is very error prone.

In general, no references to virtual classes in C++. They get sliced. Always use (smart) pointers.

@laramiel
Copy link
Contributor

laramiel commented Sep 2, 2021

Try allocating the knob in the MyDoor constructor:

class MyDoor(pybind_test.Door):
    def __init__(self, name):
        pybind_test.Door.__init__(self)
        self.name = name
        self.knob = MyKnob()

    def Name(self):
        return self.name
    
    def GetKnob(self):
        return self.knob

Then at least you're not returning a temporary.

@TomWildenhain-Microsoft
Copy link
Author

@laramiel yes that works but isn't the behavior I want. Pretend it's called "Knob factory" instead ;-). Also I tried shared_pointers and return value policies and couldn't get it to stop deallocating on return. The failure of shared_pointer really surprised me.

@henryiii
Copy link
Collaborator

henryiii commented Sep 3, 2021

I've already said - you can't return a reference to a temporary object with a virtual class. It doesn't work. The compiler slices off part of your object. It's UB.

Using shared pointers should be just fine. Do you have the shared pointer version to look at?

@TomWildenhain-Microsoft
Copy link
Author

Using shared pointers should be just fine. Do you have the shared pointer version to look at?

Thanks for your help. Replace all Knob& with std::shared_ptr<Knob> to get the shared pointer version.

@TomWildenhain-Microsoft
Copy link
Author

The cause of the issue is described here:
#673 (comment)

You were on the right track with shared_ptr. The reason you get the error about calling a virtual call() is that the Python object doesn't exist anymore at the time you're making the call. You still have the C++ instance, in the shared_ptr, but the Python instance (which was returned by A.create()) isn't around anymore because nothing references it once the call is over: pybind11 itself only holds on to it long enough to convert the value to a C++ type (std::shared_ptr in this case), which means that in your testing function, your t will be a shared pointer to a PyAnimal, but when pybind11 tries to look up the python instance associated with t to find the overridden call function there is no Python instance anymore.

It would be great if pybind could keep the python instance around when there is a shared pointer to it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
triage New bug, unverified
Projects
None yet
Development

No branches or pull requests

3 participants