Skip to content

Equality of Type objects with NNBD types. #444

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
lrhn opened this issue Jul 10, 2019 · 14 comments
Closed

Equality of Type objects with NNBD types. #444

lrhn opened this issue Jul 10, 2019 · 14 comments
Labels
nnbd NNBD related issues

Comments

@lrhn
Copy link
Member

lrhn commented Jul 10, 2019

The equality (==) of Type objects is currently undefined, but generally implemented as equal only if it is the same type. It distinguishes "equivalent" types (mutual subtypes) if they are not the same type, so Object == dynamic is false, and the Type objects for List<Object> and List<dynamic> are also distinct.

This is the least equivalence that is still useful. It does not attempt normalization.

With NNBD types, we get further ways to have different-but-equivalent types, like int? and int?? (etc.). Should that affect the behavior of Type.==?

The runtimeType of any object is either a non-nullable type or Null, so runtime types will never be a ? type or a * type, or even a FutureOr type.

The runtie type might contain one of those union types, though.
Given a C<int?> and a C<int??>, keeping the current behavior and retaining the ?? type will cause the runtime types to differ. Do we want that?

A similar problem can happen with * types. If I pass a C<int?> instance and a C<int*> instance, their runtime types will not compare equal. That's probably for the best, since we can't have equality here, and for C<int> and C<int*> too, and not break transitivity. So, making those types distinct is the right thing.

The safest approach is probably to keep the current behavior, and not try to be clever in any way.
It makes Type equality less useful in the presence of non-migrated libraries, but int* really is a different type than both int and int?.

@lrhn lrhn added the nnbd NNBD related issues label Jul 10, 2019
@eernstg
Copy link
Member

eernstg commented Jul 10, 2019

[Edit: Corrected F3 and updated result: Now all cases in the example print 'true'.]

We discussed that and specified that == on instances of Type that are obtained as the reified representation of a Dart type must evaluate to true if and only if the two types are mutual subtypes (it's line 14408 in the current spec, but github only loads about 13000 lines of the file; a search for 'When types ' in the 'Raw' presentation of dartLangSpec.tex will find it). The issue was dart-lang/sdk#34447.

Object and dynamic are probably considered different by at least some implementations because of the long history of giving dynamic special powers. There is a known bug around different syntactic forms of the same function type, but otherwise we do get results consistent with the "mutual subtype" rule:

Type typeOf<X>() => X;
typedef F2 = void Function(int);
typedef void F3(int i);
bool mustBeTrue4 = F2 == F3;

class C<X> {
  Type get typeArgument => X;
}

C<C<dynamic>> c = C<C<int>>();

main() {
  print(typeOf<List<int>>() == typeOf<List<int>>()); // 'true'
  print(typeOf<List<F2>>() == typeOf<List<F3>>()); // 'true'
  print(typeOf<Function(double)>() == typeOf<dynamic Function(double d)>()); // 'true'
  print(c.runtimeType == C<C<int>>().runtimeType); // 'true'
  print(c.runtimeType == typeOf<C<C<int>>>()); // 'true'
  print(c.typeArgument == typeOf<C<int>>()); // 'true'
}

So the starting point should be closer to "Type.== means mutual subtype" than "Type.== is random".

I'd hope that we could eliminate T?? and similar stacks like T*? or T?* in reified types, such that there is no need to decide on how to handle them: So eager normalization for that sounds good!

But then we have the remaining conflict: Should we let the subtyping relation dictate that List<int> == List<int*> == List<int?>?

This is of course not nice, because it breaks the transitivity of ==. However, the mixture of legacy and NNBD types is already not sound, and I think we should at least consider carefully what works better in practice: Break the transitivity of == for reified Types in order to maintain that Type.== is based on subtyping, or break the rule that Type.== is based on subtyping in order to maintain transitivity.

I tend to prefer the subtype rule: Both reified types and dynamic errors associated with type checks are run-time phenomena, and the latter are solely based on subtype requirements, so a strictly subtype based semantics will ensure a certain level of consistency.

@lrhn
Copy link
Member Author

lrhn commented Jul 12, 2019

The actual implementation is more like "equal if same type" than "equal if mutulal subtype". It just matters what the implementation in question considers to be the "same type".

Examples:

import "dart:async";
Type typeOf<X>() => X;
main() {
  print(typeOf<dynamic>() == typeOf<Object>());  // false
  print(typeOf<dynamic>() == typeOf<void>());  // false
  print(typeOf<FutureOr<Object>>() == typeOf<Object>());  // false
  print(typeOf<FutureOr<Future<Object>>>() == typeOf<Future<Object>>());  // false
  print(typeOf<FutureOr<Null>>() == typeOf<Future<Null>>());  // false
  print(typeOf<void Function<T>(T) Function()>() == typeOf<void Function<T>(T) Function()>());  // false VM, true dart2js
}

All of these are examples of types that are mutual subtypes, but are also "different" types.
The implementations report them as different, except the function type, which dart2js gets right, and the VM does not (no matter what we decide to do for equality, structural types with the same structure must be the "same" type in every regard).

@lrhn
Copy link
Member Author

lrhn commented Jul 12, 2019

So, what worries me here is that if we try to make implementations use "mutual subtype" as the rule (which they don't now), then we will have non-transitive behavior for NNBD legacy * types.
That might not be a problem in practice, and legacy types will disappear when everybody has moved to NNBD anyway. When that has happened, we should be back to our current state, just with more union types, so we need to figure out whether List<int?> and List<int??> is the "same type" or not.

@eernstg
Copy link
Member

eernstg commented Jul 12, 2019

Examples:

That's a very nice list of cases to test! ;-)

Given that the implementations do not agree completely, it is not an option to just say "this is what we have, let's put that into the specification". I think the model which is based on mutual subtyping is simple, consistent, and conceptually well motivated; so it would be nice if we can get that without too much breakage.

Is there a simple and consistent rule which would justify all those 'false' results as shown in the example?

whether List<int?> and List<int??> is the "same type"

I think we'd want to ensure some kind of equality and normalization such that reified types will never need to have an unbounded number of modifiers. Then we will never have List<int??> as a reified type. We still have to choose normalization rules that do not introduce unsoundness, but it will hopefully never be a run-time issue.

@rrousselGit
Copy link

Is there a simple and consistent rule which would justify all those 'false' results as shown in the example?

Flutter with Widget.canUpdate

Using such code, if we replace MyStatefulWidget<int> with MyStatefulWidget<int?>, that will make Widget.canUpdate return true but make State.didUpdateWidget throw because it still expects a MyStatefulWidget<int>

@eernstg
Copy link
Member

eernstg commented Jul 12, 2019

We cannot allow t1 == t2 to evaluate to true if t1 reifies MyStatefulWidget<int> and t2 reifies MyStatefulWidget<int?>. But there is no mutual subtype relationship there anyway, we only have one direction: MyStatefulWidget<int> <: MyStatefulWidget<int?>.

@leafpetersen
Copy link
Member

See #428 for some discussion around what to do with the * types before strong checking is turned on.

@rrousselGit There are essentially two situations we need to consider: the migration period, in which we allow non-migrated and migrated code to interact freely; and the post-migration period (strong checking turned on) in which we use the final NNBD semantics.

In the migration period, note that State.didUpdateWidget will not throw if it is passed a type which has the wrong nullability, so I think we'd be ok there even if canUpdate returned the wrong result.

In the post-migration period (strong checking turned on), State.didUpdateWidget would throw on an incorrectly nullable type, so we would definitely want canUpdate to return the correct answer (which it should).

I think my tentative thinking here is that:

  • mutual subtyping is the correct semantic behavior
  • in the pre-strong checking NNBD world, that means mutual subtyping ignoring nullability

@lrhn
Copy link
Member Author

lrhn commented Jul 15, 2019

Keeping the current behavior is also an option: Then int*, int? and int would all be un-equal at run-time. They are different types after all, which is what the type objects represent.

That would mean that StatefulWidget<int*> is not equal to StatefulWidget<int>, which may or may not be a problem for partially migrated code (but only if it actually relies on Type object equality).

(I'm more worried about int?? being different from int? because that's likely to be confusing and happening by accident).

@rakudrama
Copy link
Member

If the definition of == is mutual subtyping, this raises some issues.

Switch case expressions are compared with the case expression's == method. A mutual subtype test seems expensive in the context of a switch statement. For example, case dynamic: should match all top types. In specifying equality of Type objects, do you have in mind some implementation strategies that would make the case usage efficient? Is there a canonical projection from Type-->Type where structural equivalence of the projection is equality on the domain?

Would equal types in case expressions be some kind of error or warning?

typedef F1 = void Function();
typedef F2 = dynamic Function();
typedef F3 = FutureOr<dynamic> Function();
...
case F1: ...
case F2: ...
case F3: ...

Collections (Maps, Sets) require a coherent hashCode. The hashCode is computed independently of the mutual subtyping test, so is there a principle for the construction of a hasCode function that is both 'good' (dispersive) and easy to maintain? At the minimum we need an extensive test that the hashCode is coherent.

@eernstg
Copy link
Member

eernstg commented Sep 9, 2019

tl;dr Type == was specified, but not implemented; is it now a breaking change? rd;lt

Equal cases

@rakudrama wrote:

Would equal types in case expressions be some kind of error or warning?

I cannot find any location in the language specification where it is specified to be an error to have several switch case expressions that are equal. I also don't get any errors from the analyzer nor from the CFE on the following:

main() {
  var i = 42;
  switch (i) {
    case 1: break;
    case 1: break;
    default: break;
  }
}

So it would make sense to indicate that there's something fishy in this situation, but it would be a general switch thing, not just a type literal-in-switch-case thing. Also, it could be at any level (it could be hint or lint), because the semantics is well-defined, and no language invariants are violated.

hashCode

I agree that it may not be easy (nor cheap, in terms of run-time performance) to perform this computation for a complex type in a way that matches the implementation of ==. However, I think it would make sense to consider this to be part of the implementation of an operator == on instances of Type that are reified types. So if we want the "nice" operator ==, we'd also want a hashCode that works correctly together with that ==.

Can we project the issue away?

Is there a canonical projection from Type-->Type where structural
equivalence of the projection is equality on the domain?

Maybe we want structural equivalence of the original representation instead?, such that we could evaluate e1 == e2 which is a comparison of expressions of type Type is follows:

Evaluate e1 to an object o1, and let o1c be the canonicalized object, if any, representing the same type (where 'same type' is based on mutual subtyping). Same for e2, o2, o2c. Then e1 == e2 evaluates to true if identical(o1c, o2c) evaluates to true. Otherwise (if identical(o1c, o2c) yields false), we need to run the structural equality test on o1 and o2.

I think we could find such a projection, but it might not help much — it's just a different way to ask for the same computation.

Do we want the specified semantics?

I would consider it more important that a switch statement behaves in a meaningful way (e.g., that it doesn't decide that void Function(int) is not the same thing as void Function(int), just because the latter was obtained at run time), and less important that a switch on types may be slower than other switch statements.

@lrhn argued here that

We have always worked under the assumption that type objects are equal
if they represent the same type (otherwise they are completely useless)

and I agree that it would be less meaningful and less useful if we drop that property.

So the implementation effort matters, but otherwise I think the specified behavior is preferable over any alternatives.

Breaking change? Alternatives?

With that, the main issue here is that this was specified in commit dart-lang/sdk@4816b60 (Sep 2018), but there has been a communication glitch, and now this rule appears to come as a surprise. Whenever we have a property which is specified but not implemented after a while, we need to consider whether it should be revisited as a breaking change.

If so, we'd need to consider some alternatives:

  • We drop all guarantees: Operator == on Type instances that reify types is unspecified (so it can return whatever it wants), but identical could be used in some situations where only constants are involved. In practice, many things will just work anyway.

  • We specify that the guarantee that e1 == e2 where e1 and e2 reify types evaluates to true iff they reify mutual subtypes only holds for some types (say, generic classes and function types are not included, top types are not included). This would make it possible for developers to stick to the cases "where it works", and we could emit diagnostic messages for all other cases (except cases like X == Y where X and Y are type variables whose value may or may not have this guarantee).

  • We restrict the set of constant expressions denoting reified types. So maybe a type alias name is no longer constant, or it is only constant if it doesn't use the upcoming generalization (e.g., typedef C = D<int>;). This would require operator == and hashCode to be implemented to perform the mutual subtype test, but it would ensure that switch statements aren't slower when they compare types than they are otherwise.

@rakudrama is this a reasonable description of the situation as you see it?

The treatment of == for reified types in general could be folded into the breaking changes of NNBD, which would make the management of the breakage a part of the bigger NNBD porting effort.

@lrhn, @leafpetersen, @munificent, WDYT?

@lrhn
Copy link
Member Author

lrhn commented Sep 10, 2019

I'm all for making Type.operator== even more useless.

Changing the implemented behavior to the specified behavior is technically a breaking change—It's possible that there exists a program which will change behavior when it can no longer distinguish the runtime type of a List<Object> from a List<dynamic>, or if we change the value if <dynamic>[].runtimeType.toString(). They may deserve to fail, but they would be broken.
("Break them. Break them all!")

We originally decided that there is no List<dynamic> at run-time, it's all List<Object>, and the difference between the two static types is entirely a static property. We erase dynamicity at run-time, dynamic is a static-only flag on a static Object type. As such, making a distinction between dynamic and Object should not be an issue because there are no dynamic types at all.
We also erase void to Object. You can retain the flags so that you can print the original source type (I wouldn't, but you can), but it should not affect equality.
So, the implementation strategy to help doing equality would be erasure.

However, that would still not help us for situations like:

Type typeOf<T>() => T;
switch (typeOf<FutureOr<Object>>()) {
  case Object: 
    print("Did I get here?");
    break;
  default: throw "Nope";
}

Here the FutureOr<Object> and Object types are mutual subtypes, but it's not a matter of simple erasure of dynamism/voidness to see that. So, no simple strategy.

Also: print(typeOf<Future<Object>>() == typeOf<FutureOr<Future<Object>>>()); should print true. So should print(typeOf<FutureOr<Null>>() == typeOf<Future<Null>>()); (for now).
There is no easy way to recognize this short of special-casing FutureOr as a type constructor to degenerate in some cases, which would add complexity to the situation where we generate a new type at run-time.

So, what is the currently implemented behavior?
What implementations actually do differs (obviously!).

Both VM and dart2js consider dynamic not equal to Object. Types are only equal if they are the exact same type. The VM seems to canonicalize all types, so they are identical if they are equal, and dart2js does not canonicalize, so only the result of type literal expressions are identical (when they are equal).

For switch cases, we know that the case expressions are constant, so they are literal expressions, but the switch expression is not constant, so we can't use identical for the comparison. So, you'll have to do sequential equality checks unless you have some backchannel to the implementation that allows you to cheat. Since constant types are system types, and system types are never equal to non-system implementations of Type, then you can start by rejecting any non-system Type object from matching Type object case expression. Then you can use the internal representation, including hashCode if necessary. It still doesn't prevent complex equality in some cases.

Another option is to specify the current behavior. Type objects are equal if they represent the same type ... for some definition of same type. It's probably "obvious" what that definition is, but we say that FutureOr<int> introduces a new type which is a super-type of int and Future<int>. We do not say that writing it twice represents the same type, because it doesn't matter since the two types are equivalent. We probably want them to have structural equality. In any case, if we go for "same type", we do need to specify that. It's not trivial.

With NNBD in mind, we should worry about whether int?? == int?. Are they the same type, or are they different types that are mutual subtypes? Is FutureOr<Null> the same union type as Future<Null>?, or are they different because the we use a different syntactic type constructor?

So, the options I see are:

  1. Current specification: Types are equal if they are mutual subtypes. That may require checking both directions in order to figure it out, and it makes hashCode a pain to compute, going on intractable. Eager canonicalization is an option, but it's probably not memory-safe for programs generating new types at run-time.

  2. Current behavior: Types are equal if they are the same type. We need to formally specify what "same type" means, likely with structural equivalence for function and union types, recursive equality for "same" generic types, and nominative equality for everything else. The hashCode can be computed recursively.

  3. Even less promises: We promise nothing for equality except perhaps for constant types. The Type class is now completely useless if you don't have dart:mirrors. It will break every current use of someType == something.runtimeType..

I think the second option is the most realistic. It is still not trivial, but we pretty much already do it correctly, we just have to specify the same thing.

We can still erase dynamic and void at run-time if we want to. (I want to).

@eernstg
Copy link
Member

eernstg commented Sep 10, 2019

No matter what else we are doing, I assume that we agree to start by expanding type aliases, cf. dart-lang/sdk#32782.

Option 1

For option 1 we could use type normalization to make hashCode tractable. @leafpetersen and I discussed having normalization for types, and the point is that normalized types can be compared structurally (whereas non-normalized types can have a very different structure and still be equal in this case).

We'd want normalization to normalize all top types to Object? and all bottom types to Never (there are a few more cycles of two elements and we'd choose a canonical representative for each of them).

That kind of normalization goes one step further than the kind that we would want to use during static analysis. For instance, we would want static normalization to replace T?? by T? and Null? by Null, but we do not want to forget about the difference between dynamic and Object? everywhere.

This approach makes sense conceptually: Mutual subtypes are "the same type" according to any soundness consideration (because it would always be possible to switch from using one or the other type for an expression via an upcast).

Option 1.1

An option 1.1 could be obtained as a variant of option 1: It would again be based on type normalization, but in this case we would use exactly the same normalization at run time as the one that we would be using during static analysis. This normalization would preserve the distinction between void, dynamic, and Object?, possibly by mapping FutureOr<T> to T and removing ? until we have one of those "atomic" top types. Bottom types would be normalized to Never. Null types are more tricky (Never* comes to mind).

This option is a refinement of option 1 (whenever option 1 considers a set of types as the same, this option could further divide them into smaller sets), and that's also soundness preserving.

We need the distinction between atomic top types during static analysis, and it may be considered as "noise" to preserve a distinction which goes beyond that which is needed for soundness, but it will hardly be considered very strange to keep it at run time as well.

Option 2

Option 2 corresponds to having a structural equality test where equality only needs to consider different syntactic forms denoting the same entity. So void is void, but Object could be p.Object where p is a prefix, and user-defined types are the same if they resolve to the same declaration. Everything else is based on establishing equality of wholes by showing equality of corresponding parts. I don't even think we should consider union types as structural: Future<Null>? and FutureOr<Null> are spelled quite differently, so why would you expect them to test equal?

I agree that this one is quite practical. If we say that it's 'the current behavior' then it's non-breaking by definition, but I also think that we can specify this in a manner which is both reasonable and non-breaking in practice (and if there is an esoteric corner case where the current behavior amounts to a surprising exception then we should be able to handle the discrepancies as bugs).

The conceptual justification here would be that types that look different are just different (except for type aliases and prefixes).

Can we have it both ways?

If we settle on option 2 because that's the non-breaking path ahead, and it's relatively justifiable conceptually, do we then make it harder to use normalization during static analysis? Would that imply that we need to have a dynamic representation of X? which is int?? in this case?:

class C<X> { List<X?> xs = []; }
main() {
  C<int?> c = ...;
  print(c.xs.runtimeType); // `List<int??>`.
}

I think it could be a source of serious confusion if we make a distinction between such types as int? and int?? etc. at run time, but also that there is an inherent contradiction between using option 2 ("a type is just what it looks like") and normalizing such elements and the ? suffix ("the meaning of a type is its normalized form").

@leafpetersen
Copy link
Member

I have a proposal to resolve this here. The proposal is essentially as follows:

  1. Runtime type object equality is defined to equate types with the same normal form
  2. Runtime type object equality is defined to ignore the * modifier on types.

@leafpetersen
Copy link
Member

I've landed my proposal in the feature spec, closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
nnbd NNBD related issues
Projects
None yet
Development

No branches or pull requests

5 participants