Skip to content

Check scope's __dict__ instead of using hasattr when registering classes and exceptions #2335

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Oct 8, 2020

Conversation

YannickJadoul
Copy link
Collaborator

@YannickJadoul YannickJadoul commented Jul 27, 2020

This allows registering the same name in a derived class scope, and fixes #1624

Disadvantage:

  • The check is slower.
  • Wouldn't work with custom __getattribute__/__setattribute__ (maybe now it already wouldn't, either?).

@@ -917,7 +917,7 @@ class generic_type : public object {
PYBIND11_OBJECT_DEFAULT(generic_type, object, PyType_Check)
protected:
void initialize(const type_record &rec) {
if (rec.scope && hasattr(rec.scope, rec.name))
if (rec.scope && hasattr(rec.scope, "__dict__") && rec.scope.attr("__dict__").contains(rec.name))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, we could drop hasattr(rec.scope, "__dict__"), if we're happy throwing an exception when the user tries to use a scope that doesn't have a __dict__ attribute?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of throwing an exception in this situation worries me, a lot.

I'm surprised to see the strong runtime performance concerns from both Yannick and Boris. Is it correct that this code only runs when a pybind11 module is imported? Is that so slow that people complain already? What's the incremental extra runtime overhead? My first guess is it's no measurable, given all the other things happening at import time (files are opened, dlls loaded), but maybe I'm overlooking something?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised to see the strong runtime performance concerns from both Yannick and Boris.

It's not really a concern. It is only run once at import time, so it hardly matter.

What's the incremental extra runtime overhead?

I wasn't measuring import whatever, but rather this specific patch in isolation.

>>> timeit.timeit(stmt='B().__dict__.get("X")', setup='class A:\n  def X(self):pass\nclass B(A):pass')
0.17064448199380422
>>> timeit.timeit(stmt='hasattr(B(), "X")', setup='class A:\n  def X(self):pass\nclass B(A):pass')
0.12798299700079951

Again, this is executed exactly once. It really shouldn't be a concern. I just wanted to satisfy my own curiosity.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not too worried about it, either! Just want to point it out.
The reasoning is/was, if hasattr(rec.scope, "__dict__") fails, most likely the later scope.attr(name) = *this; will also fail/throw an exception?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I misunderstood this code initially (github is helpfully hiding the next line...). Would this be logically equivalent?

if (rec.scope && hasattr(rec.scope, rec.name) && hasattr(rec.scope, "__dict__") && rec.scope.attr("__dict__").contains(rec.name))

Now suspecting: the existing hasattr returns true, but we want to raise an error only if we find that the attribute is defined in __dict__?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how to solve this, though. Checking hasattr/getattr is too wide (i.e., checks base classes), while __dict__ is too strict (would miss the crazy case where __getattr__/__setattr__, or __getattribute__/__setattribute__, would be overwritten).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we shouldn't fix it? If it's this hard to do it right, perhaps we shouldn't "fix" it and break things that currently work.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting suggestion, yes. I must say the current situation feels wrong, though, where it matters in which order you register a class first in the base class or first in a derived class.

And the above shouldn't break too much. there's a thing where, if getattr and setattr are overwritten, things might already be broken anyway. In any case, the scope should be a class or a module, no?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, the scope should be a class or a module, no?

Do you want the good news or the bad news? The bad news?

>>> def f():print('f')
...
>>> def g():print('g')
...
>>> f.g = g
>>> del g
>>> f.g()
g
>>> f()
f

The good news is that using functions as scopes isn't allowed with pybind. I tried. I really really tried to make pybind accept a function object as a scope.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm indeed not sure if this is good or bad news :-/

@YannickJadoul YannickJadoul requested review from rwgk and bstaletic July 27, 2020 23:46
@bstaletic
Copy link
Collaborator

Wouldn't work with custom getattribute/setattribute (maybe now it already wouldn't, either?).

It used to work. PyObject_GetAttrString(o, "attr") calls

This is so horribly expensive. By the way, there's a detail I omitted. PyObject_HasAttrString(o, "str"). Because it just defers to PyObject_GetAttrString(o, "str") and if that throws just calls PyErr_Clear() to swallow it.

Oh, by the way, tp_getattr is deprecated.

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Yannick and Boris for the explanations re runtime performance! Based on that this fix looks great to me, my only suggestion is to harden the test a bit.

@henryiii
Copy link
Collaborator

henryiii commented Oct 2, 2020

Needs rebasing.

@YannickJadoul
Copy link
Collaborator Author

Rebased. I do still want to give this solution another look, btw, now that we're a few months later and potentially wiser.

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR was included in Google-global testing: no issues found.

@wjakob
Copy link
Member

wjakob commented Oct 5, 2020

This change looks fine to me -- runtime cost of pybind11 module initialization is usually not much of a concern (it's very fast.), so the extra check is negligible. One question though: you added some tests that ensure that the new kinds of scoped class declarations work, but are we ever checking that one cannot define the same class twice within the same module? IIRC no, in which case it would be good to add an extra test for that.

@YannickJadoul
Copy link
Collaborator Author

OK, if everyone's fine with not supporting weird metaclasses overriding __getattribute__ and __setattribute__ and bypassing __dict__, I'm fine.

On potential improvement (mentioned above) would be this:

Would if (rec.scope && hasattr(rec.scope, rec.name) && (!hasattr(rec.scope, "__dict__") || rec.scope.attr("__dict__").contains(rec.name))) even be better?

I.e.:

  • if we have no scope: weird, but OK, nevermind, we can't override anything

  • if hasattr is false: fine, not problem, we can setattr the new class

  • if hasattr is true, and we don't have a __dict__: aaaargh, something weird is going on in this type; and the getattr (and probably setattr as well) mechanism has been overwritten; so, custom case: go ahead, but you are probably specialized enough to find a way around it

  • if hasattr is true, and we do have a __dict__: check if the dict contains it, if not, we probably found it in a parent class's scope

But this is too complex?

I'll still quickly add a test! :-)

@YannickJadoul
Copy link
Collaborator Author

One question though: you added some tests that ensure that the new kinds of scoped class declarations work, but are we ever checking that one cannot define the same class twice within the same module? IIRC no, in which case it would be good to add an extra test for that.

I searched for the error message, couldn't find it in the tests, and added test_register_duplicate_class.

@YannickJadoul YannickJadoul force-pushed the fix-derived-class-scope branch from aa61989 to 8368c76 Compare October 6, 2020 14:28
@henryiii henryiii closed this Oct 6, 2020
@henryiii henryiii reopened this Oct 6, 2020
@YannickJadoul YannickJadoul force-pushed the fix-derived-class-scope branch 3 times, most recently from c2e4dba to 8368c76 Compare October 6, 2020 19:54
YannickJadoul added a commit to YannickJadoul/pybind11 that referenced this pull request Oct 8, 2020
@YannickJadoul YannickJadoul force-pushed the fix-derived-class-scope branch 3 times, most recently from 584046f to 7b4a47c Compare October 8, 2020 21:42
…cate_class and test_factory_constructors.py::test_init_factory_alias
@YannickJadoul YannickJadoul force-pushed the fix-derived-class-scope branch from 7b4a47c to 0f66572 Compare October 8, 2020 21:47
@YannickJadoul
Copy link
Collaborator Author

YannickJadoul commented Oct 8, 2020

OK, this is finally green, and I'm fed up with this. I'm merging this.

I still have another idea, but that'll be for a separate issue/next release :-)
Nvm, probably a bad idea, I now realize. I'm probably just unnecessarily worried about this setattr and getattr.

@YannickJadoul YannickJadoul merged commit 71aea49 into pybind:master Oct 8, 2020
@YannickJadoul
Copy link
Collaborator Author

Thanks everyone for all the reviews and help! :-)

@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Oct 8, 2020
@YannickJadoul YannickJadoul deleted the fix-derived-class-scope branch October 8, 2020 23:10
YannickJadoul added a commit to YannickJadoul/pybind11 that referenced this pull request Oct 9, 2020
YannickJadoul added a commit to YannickJadoul/pybind11 that referenced this pull request Oct 9, 2020
…re-enabling test_register_duplicate_class again
YannickJadoul added a commit to YannickJadoul/pybind11 that referenced this pull request Oct 9, 2020
YannickJadoul added a commit to YannickJadoul/pybind11 that referenced this pull request Oct 9, 2020
YannickJadoul added a commit to YannickJadoul/pybind11 that referenced this pull request Oct 15, 2020
YannickJadoul added a commit to YannickJadoul/pybind11 that referenced this pull request Oct 15, 2020
henryiii pushed a commit that referenced this pull request Oct 16, 2020
…duplicate_class (#2564)

* Demonstrate test_factory_constructors.py failure without functional changes from #2335

* Revert "Demonstrate test_factory_constructors.py failure without functional changes from #2335"

This reverts commit ca33a80.

* Fix test crash where registered Python type gets garbage collected

* Clean up some more internal structures when class objects go out of scope

* Reduce length of std::erase_if-in-C++20 comment

* Clean up code for cleaning up type internals

* Move cleaning up of type info in internals to tp_dealloc on pybind11_metaclass
@henryiii henryiii removed the needs changelog Possibly needs a changelog entry label Nov 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

how to overwrite parent class's enum with same name?
5 participants