Skip to content

Fix typing_extensions to support PEP 560 #549

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Apr 6, 2018

Conversation

ilevkivskyi
Copy link
Member

The main fix is rewriting Protocol. Although the fix is straightforward, there are some observations/details I wanted to discuss before moving Protocol to typing:

  • It looks like we still need a metaclass for Protocol even with PEP 560. The problem is a certain asymmetry in ABCs API, there is __subclasshook__, but there is no such thing as __instancehook__. Because of this we need a metaclass with a single method. __instancehook__ (or currently the metaclass) is needed to support structural isinstance() checks for concrete classes that set attributes on self so that fallback to issubclass() is not possible. Although protocols with non-method members are relatively rare, in view of planned support for static type checking in Bazel (and other modular build systems) protocols describing certain data will be needed. So we have two options here:
    • Continue to use metaclass, since explicitly subclassing protocols (and thus having metaclass conflicts) is rare. (-0)
    • Just add __instancehook__ exactly the same way there is __subclasshook__, then we can keep typing metaclass-free even with protocols. (+1)
  • Currently behaviour of .register() with protocols is not specified. As a result it is determined by the current implementation and has some (weird) corner cases. Maybe .register() should be actually prohibited for protocols (normal ABCs can still use it)? Protocols are supposed to be purely structural, allowing .register() breaks this logic and may introduce some type un-safety. So the options here are simple:
    • Specify the rules for how .register() should work for protocols and try to implement these rules. (-1)
    • Prohibit using .register() on protocols (still OK for normal ABCs). (+0)
  • Third question is about what to do with extending "built-in" protocols? Existing implementation actually prohibits class FancyIter(Iterable[str], Protocol): ... (because protocols can only inherit form other protocols, but technically collections.abc.Iterable is a nominal class). However allowing this is not easy because machinery in abc (and functools) indiscriminately calls issubclass() on all subclasses of Iterable, but it is prohibited for non-runtime protocols. Here we have three options:
    • Don't allow extending built-in protocols. I don't like this because it reminds me UserDict and UserList. (-1)
    • Allow this and just continue to use the "good old" sys._getframe() hack (which I wanted to remove). (-0)
    • Allow this and patch __bases__ in __init_subclass__ to remove built-in protocols from there. We will also need to copy the methods to allow using them by explicit subclasses. (+0)

For clarity I added my votes for all options. @gvanrossum what is your opinion on these points. It is not urgent to discuss them, (I didn't act on any of these points in the PR, it just mirrors the pre-PEP 560 implementation so can be merged before we agree on these points) but we should discuss these before moving protocols to typing.

@gvanrossum
Copy link
Member

gvanrossum commented Apr 2, 2018

This is a huge thing. Maybe you can find a new co-conspirator amongst Python core devs or mypy contributors? I also am not sure at all that I fully appreciate all the subtleties in the issues you bring up. I wish we could whiteboard in person -- maybe at or after PyCon?

in view of planned support for static type checking in Bazel (and other modular build systems) protocols describing certain data will be needed.

Can you clarify? Why is Bazel used as a forcing function? What specific examples do you have in mind? Runtime checking for Protocol conformance is pretty iffy even if just methods are involved (there's no signature checking, hence the @runtime* decorator) and I would be okay if Protocols just didn't support this if data attributes are present. Then again maybe data attributes are a major use case somehow?

Just add __instancehook__ exactly the same way there is __subclasshook__, then we can keep typing metaclass-free even with protocols. (+1)

I wonder what Mark Shannon would think of this (he's on record as not wanting to enable runtime checks at all).

Prohibit using .register() on protocols (still OK for normal ABCs). (+0)

Makes some sense to me, though it depends on how good the solution is for the previous issue. I could imagine that in some cases it's easier to use .register() (and then accept instances of the registered class as members of the protocol without any attribute checks) than to make the attribute check work. Also I'm curious about the corner cases you've found.

Third question is about what to do with extending "built-in" protocols?

I'm not following all the details of your description here -- I haven't thought about any of this code in a really long time. :-(

Sorry...

@ilevkivskyi
Copy link
Member Author

This is a huge thing.

Do you mean the PR, my comment, or the protocols in general? The comment is indeed long but it was inspired by thinking about where we can go with PEP 560 and protocols. As I mentioned the PR itself doesn't act yet on any of this points it just fixes the support for protocols in Python 3.7 broken by the changes in internal typing API.

I wish we could whiteboard in person -- maybe at or after PyCon?

Discussion of these issues is not urgent, it can wait until then, but I would appreciate if you review the PR, so that typing_extensions stops being broken on Python 3.7 (now it raises on import).

Can you clarify? Why is Bazel used as a forcing function?

We had a discussion on Slack with Jared Hance about what to do with circular imports necessary only for type checking and Bazel that doesn't like circular imports (even under if MYPY: ...). He reasonably proposed that all such cycles should be broken by introducing protocols for the function/method argument types where necessary. After thinking some time about this it seems a very good idea to me: modular (e.g. in the sense of having well defined Bazel targets) static type checking will require protocols. Then since many argument types for functions will have some data components, support for non-method members is needed. The question about why we need the runtime support is a bit more subtle (also see my answer below). The problem will happen if someone needs to transition from one API to another. Imagine this (simplified) scenario:

class LegacyData(Protocol):
    x: int
    y: int
    label: str

class NewData(Protocol):
    coord: Tuple[int, int]
    label: str

def process(items: Iterable[Union[LegacyData, NewData]]) -> None:
    for item in items:
        if <some check here>:
             # unpack and process legacy API
        else:
             # other logic for new API

IMO ideally we should provide a way to destructure a union of protocols (otherwise we would induce growing amount of rigidity in any protocol specified APIs) so the check above can be just isinstace(item, LegacyData). Of course one can use some hasattr() hacks instead, but then this function will be not statically checked.

I wonder what Mark Shannon would think of this (he's on record as not wanting to enable runtime checks at all).

On my record he is not participating in protocol discussion at all :-) But seriously this situation is special. Namely, mypy already has reasonable support of isinstance() for nominal types, and this is important for example for destructuring unions. We should have the same mechanism to distinguish structural types, and it already exists -- collections.abc.Iterable and friends already work with isinstance() this way. Arguably, mypy implementation for this should be improved, and we have an issue for this python/mypy#3827, whatever we do at runtime should be matched 1:1 statically, so that we are completely safe, for example:

@runtime_checkable
class ProtoA(Protocol):
    attr: int
class ProtoB(Protocol):
    attr: str

x: Union[ProtoA, ProtoB]
if isinstance(x, ProtoA):
    x.attr + 1  # the binder type is still a union here, so mypy will complain after the issue is fixed.

Maybe you can find a new co-conspirator amongst Python core devs or mypy contributors?

I hope @JukkaL is still interested in participating in this discussion. My ideal goal would be to have protocols story settled by the end of June, but I don't want to distract Jukka to much, there is still some work needed on fine-grained mode.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a sense of deja vu -- did I see an earlier version of this code in a different PR?

The list of supported versions in setup.py needs to be extended to include 3.7. (Not that anything checks for this. :-)

return attrs


def _callable_members_only(cls):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if this function followed the (very informal) naming convention for booleans to start with "has" or "is". Initially I mistook it for a filter over _get_protocol_attrs().

@@ -985,6 +997,188 @@ def __new__(cls, *args, **kwds):
Protocol.__doc__ = Protocol.__doc__.format(bases="Protocol, Generic[T]" if
OLD_GENERICS else "Protocol[T]")


if PEP_560:
# We need to rewrite these from scratch.
Copy link
Member

@gvanrossum gvanrossum Apr 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate in the comment why? Also, it seems weird that there are now two conditional definitions of _ProtocolMeta and Protocol, one pair under if HAVE_PROTOCOLS above and then this pair. Why not at least execute precisely one pair of definitions at runtime? E.g. the first if could be if HAVE_PROTOCOLS and not PEP_560.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(What I meant is that apparently both pairs are executed in 3.7?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this totally makes sense. There is no need to execute both versions in 3.7. I will re-organise the ifs.



class Protocol(metaclass=_ProtocolMeta):
# There is quite a few overlapping code with typing.Generic.
Copy link
Member

@gvanrossum gvanrossum Apr 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar nit: s/few/lot of/.

@gvanrossum
Copy link
Member

Whoops, sent the review too soon. Expect more to come.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly I don't have the concentration to review the new versions of _ProtocolMeta and Protocol carefully; I haven't deeply studied this code for a long time. :-( So here are just some superficial nits.

This is the end of today's review.

@@ -985,6 +997,188 @@ def __new__(cls, *args, **kwds):
Protocol.__doc__ = Protocol.__doc__.format(bases="Protocol, Generic[T]" if
OLD_GENERICS else "Protocol[T]")


if PEP_560:
# We need to rewrite these from scratch.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(What I meant is that apparently both pairs are executed in 3.7?)


class Protocol(metaclass=_ProtocolMeta):
# There is quite a few overlapping code with typing.Generic.
# Unfortunarely it is hard to avoid this while these live in two different modules.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: unfortuna/r/t/ely.

class Protocol(metaclass=_ProtocolMeta):
# There is quite a few overlapping code with typing.Generic.
# Unfortunarely it is hard to avoid this while these live in two different modules.
# The duplicated code will be removed when Protocol is moved to typing.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Would that be in 3.8 or could it be e.g. in 3.7.1?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can move it already in 3.7.1, however this will require _ProtocolMeta, since we can add __instancehook__ only in 3.8. Although this is probably not a big deal, since I expect that explicit subclassing of protocols by concrete classes will be rare. One of the main goals of PEP 544 is to allow not to subclass them :-)

@typing_extensions.runtime act as simple-minded runtime protocol that checks
only the presence of given attributes, ignoring their type signatures.

Protocol classes can be generic, they are defined as::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention restrictions, e.g. you cannot use a generic protocol with issubclass? (Also doesn't that mean that @runtime with a generic protocol should just be an error?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Protocols are not different from normal generics in this aspect, for example:

@runtime_checkable
class GenP(Protocol[T]):
    def meth(self) -> T:
        ...
class C:
    ...
issubclass(C, GenP)  # This is OK, False or True depending on C
issubclass(C, GenP[int])  # This is an error both in mypy and at runtime.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still my suggestion stands -- @runtime (by whichever name) should complain when the decorated class is a generic protocol. Right? Since the only reason to add @runtime is to enable isinstance() checks and that is not allowed for generic protocols.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait. After reading the relevant passage in PEP 544 again I finally realize that @isinstance(x, GenP) is still allowed, it's only @isinstance(x, GenP[int]) that's disallowed. So forget everything I said here. (Except that as I said in a previous comment it's been a long time since I thought about this. :-( )

def __new__(cls, *args, **kwds):
if cls is Protocol:
raise TypeError("Type Protocol cannot be instantiated; "
"it can be used only as a base class")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: I'd reorder this as "it can only be used as a base class" (but what you wrote isn't a grammatical error, it just sounds a bit formal or dated).

# Generic can only be subscripted with unique type variables.
if not all(isinstance(p, TypeVar) for p in params):
raise TypeError(
"Parameters to Protocol[...] must all be type variables")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you at least add the index of the first one that's not a type variable to the message? (1-based, I'm guessing. :-)

self.assertIsSubclass(PG[int], PG)
self.assertIsSubclass(BadPG[int], P)
self.assertIsSubclass(BadPG[T], PG)
if not PEP_560:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a similar regression for non-protocol generics? If so, fine. If not, this is a bit unfortunate (though it may be unavoidable).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is not different from normal generics. This is already mentioned in the PEP 560 as one of the few user visible changes.

@@ -471,7 +474,10 @@ def test_counter_instantiation(self):
class C(typing_extensions.Counter[T]): ...
if TYPING_3_5_3:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of unfortunate that these variables are named in such a way that it's not clear that they imply "or all higher versions".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but I would say it is too late to change this.

@@ -1091,7 +1102,9 @@ def test_generic_protocols_repr(self):
T = TypeVar('T')
S = TypeVar('S')
class P(Protocol[T, S]): pass
self.assertTrue(repr(P).endswith('P'))
# After PEP 560 unsubscripted generics have standard repr.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/have/have a/

self.assertEqual(typing_extensions._get_protocol_attrs(P), {'meth'})
self.assertEqual(typing_extensions._get_protocol_attrs(PR), {'x'})
self.assertEqual(frozenset(typing_extensions._get_protocol_attrs(PG)),
frozenset({'x', 'meth'}))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you intentionally delete the test for PG[int]?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I think it is better to keep it under if not PEP_560:.

@ilevkivskyi
Copy link
Member Author

I have a sense of deja vu -- did I see an earlier version of this code in a different PR?

Thanks for review! Yes, this just mirrors the logic in pre-PEP 560 version. I tried to share some code, but it is not so easy. I will check if there are some more places where common logic can be factored out. I will reply to some comments now, but will push the new version later.

@ilevkivskyi
Copy link
Member Author

@gvanrossum I addressed the previous set or your comments. Could you please take a look again if this PR is ready?

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, LGTM!

@ilevkivskyi ilevkivskyi merged commit dbac8c2 into python:master Apr 6, 2018
@ilevkivskyi ilevkivskyi deleted the fix-extensions-560 branch April 6, 2018 11:59
@markshannon
Copy link
Member

markshannon commented Apr 7, 2018

I wonder what Mark Shannon would think of this (he's on record as not wanting to enable runtime checks at all).

I am still opposed to runtime checks, for the same reasons as before. If any type implements, then all must and they cannot be implemented correctly (for [], (), etc) or efficiently (for [1]*(10**7)).
Plus, it adds to the inevitable confusion about types vs. classes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants