Skip to content

Meta issue for implementation of strong mode instantiate to bound for Dart 1.5 #27526

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
3 tasks done
leafpetersen opened this issue Oct 6, 2016 · 24 comments
Closed
3 tasks done
Assignees
Labels
area-language Dart language related items (some items might be better tracked at github.com/dart-lang/language). area-meta Cross-cutting, high-level issues (for tracking many other implementation issues, ...). P1 A high priority bug; for example, a single project is unusable or has many test failures
Milestone

Comments

@leafpetersen
Copy link
Member

leafpetersen commented Oct 6, 2016

Meta issue tracking the final implementation of the instantiate to bound behavior for Dart 1.5 (that is, how to treat uses of generic types/methods where no type arguments are specified).

A bit more context (but see the relevant discussion in #26265):
Dart tries to infer the type of generic arguments based on the given bounds. However, that can be complicated (and sometimes impossible). For example the following example should probably be an error:

class A<G extends B> {}
class B<H extends A> {}
A x = null;  // What should the generic type of A be?

Different approaches possible. One restriction (for example) would require to write the generic type in the bound:

class A<G extends B<must_write_something_here>> {}
class B<H extends A<must_write_something_here>> {}

Tracking bugs:

In a following release:

@leafpetersen leafpetersen added the area-language Dart language related items (some items might be better tracked at github.com/dart-lang/language). label Oct 6, 2016
@leafpetersen leafpetersen added this to the 1.50 milestone Oct 6, 2016
@leafpetersen leafpetersen self-assigned this Oct 6, 2016
@leafpetersen
Copy link
Member Author

Relevant discusssion here: #26265

@floitschG
Copy link
Contributor

Status update:
I updated the description with a bit more context.

@leafpetersen is assigned to propose rules. Once they are in, we will possibly reassign this bug to make the changes (and/or update the spec).

@leafpetersen
Copy link
Member Author

We have two proposals, which I'm going to summarize here and solicit feedback on.

@leafpetersen
Copy link
Member Author

leafpetersen commented Nov 11, 2016

Proposal 1 (preferred).

Terms: a named type F is a "raw type" if F is the name of a parameterized type, and no arguments are passed for its parameters. Example:

class A<T extends num> {}

// A here is a raw type
A x; 
// A<num> here is not a raw type
A<num> y; //

Given a parameterized type F with type parameters T0 extends B0, ..., Tk extends Bk, a use of F with no type arguments is treated as if it were called with a set of explicit type arguments D0, ..., Dk which are computed as described below.

The high level intuition is that to compute the default bounds for F, we first compute the default bounds of any raw types that appear in its bounds, erroring out if that involves a cycle. So for example, with class A<T extends B> {}; class B<T extends A> {}, neither A nor B have default bounds, since computing the default bound for A requires computing the default bound for B, which in turn requires computing the default bound for A. We then check to see if the dependencies induced by the free occurrences of the type variables in each other's bounds forms a cycle (including possibly a self-cycle): if so, we error out (e.g. class Foo<T extends Comparable<T>> has no default bound). Otherwise, there is a total order on the type variables such that we can substitute away all occurrences of each type variable in order, replacing them with a default bound which has no free type variables (e.g. Foo<T extends num, S extends Comparable<T>> has default bounds <num, Comparable<num>>.

More precisely:

  1. For every type name G which appears in the Bi, if G is a raw type, attempt to compute its default arguments. If computing its default arguments transitively requires computing the default arguments for F, it is an error. No default arguments for F can be computed, and the user will be required to explicitly provide type arguments for F.
  2. Start with D0, ..., Dk == B0, ..., Bk.
  • Choose i such that none of T0, ..., Tk occurs in Di, and replace all uses of Ti with Di in D0, ..., Dk.
  • Repeat this until either every i has been chosen once, or there is no valid choice of i
  • If every i has been chosen once, then the solution is D0, ..., Dk, otherwise there is a cycle, and we emit an error indicating that default bounds cannot be computed for this type, and hence that this type cannot be used as a raw type.

Examples:

// Default is A<int>.
class A<T extends int> {}

// Type completed to A<int>.
A x;

// Default is B<A<int>>.
class B<T extends A> {}

// Type completed to B<A<int>>.
B x;

// Default is C<int, A<int>>.
class C<T extends int, S extends A<T>> {}

// Type completed to C<int, A<int>>.
C x;

// No error here, but no default can be computed.
class D<T extends Comparable<T>> {}

// Error: no bounds can be computed for x
D x;

// Error: D has no default
class E<T extends D>

// Error: G has no default bounds, because of a cycle
class F<T extends G> {}
// Error: F has no default bounds, because of a cycle
class G<T extends F> {}

Notes:

I've presented the two phases as ordered, but I believe that they are independent and could be computed in either order, so long as we treat typedefs as strict in their arguments. I have chosen this ordering (which is different from the original proposal) because I believe that this might yield better error messages, but either ordering should be fine.

The major downside of this approach is that computing the default bounds for a type requires a recursive computation of default bounds for any raw types mentioned in its bounds, with cycle detection. This is fairly non-local, and somewhat heavyweight. We would like feedback on any implementation concerns.

@leafpetersen
Copy link
Member Author

leafpetersen commented Nov 11, 2016

Proposal 2 (less preferred)

  1. All types used in an extends clause must be explicitly instantiated, otherwise it is an error.
     class Foo<T extends List> {} // ERROR, List must be instantiated
  1. Outside of an extends clause, missing type arguments are implicitly filled in with defaults that are computed from the bounds in the following manner:
  • Given a list of type parameters <T0 extends B0, ..., Tn extends Bn>
  • We say that a bound B is ground if the free type variables of B do not include any of the Ti.
  • The default values <D0,..., Dn> for the Bi are computed as follows:
    • We start with <D0, ..., Dn> == <B0,..., Bn>
    • For every D0 which is ground, we replace T0 with D0 in all of the Di
    • We repeat the previous step until we reach a fixed point (guaranteed, each step either eliminates a type variable or is at a fixed point)
    • If all the Di are ground, those are the defaults
    • If any of the Di is non-ground, we issue an error.

Basically, this algorithm will require explicit arguments for F-bounded cases, but will do the right thing for non-recursive cases (where "the right thing" is using the bound as a default value).

Examples:

// Default is A<int>.
class A<T extends int> {}

// Type completed to A<int>.
A x;

// Error: bound is not explicitly instantiated.
class B<T extends A> {}

// Default is C<int, A<int>>.
class C<T extends int, S extends A<T>> {}

// Type completed to C<int, A<int>>.
C x;

// No error here, but no default can be computed.
class D<T extends Comparable<T>> {}

// Error: no bounds can be computed for x
D x;

// Error: D has no default
class E<T extends D>

// Error: G has no default bounds
class F<T extends G> {}
// Error: F has no default bounds
class G<T extends F> {}

Notes:

This proposal is less preferred because it makes more cases that seem reasonable an error.

This proposal has the advantage that the calculation of the default bounds of a type is entirely local to the type declaration.

@leafpetersen
Copy link
Member Author

@jmesserly @sigmundch @bwilkerson @stereotype441 @scheglov @floitschG @eernstg @lrhn @munificent

Feedback on usability and implementation concerns welcome.

Given that proposal 2 is just an extension of proposal 1, it might be worth implementing proposal 2 and testing it against a large corpus to see how many cases are rejected that could be potentially handled by proposal 2.

@stereotype441
Copy link
Member

I'd be fine with proposal 2 without any further investigation. If we want to consider proposal 1, then I would want us to first do a trial implementation inside the summary linker to make sure the effect on summaries isn't too onerous.

@bwilkerson
Copy link
Member

I do have a concern with the non-locality of the first proposal, both from an implementation standpoint (I expect you're aware of many of the implementation issues) and from a usability standpoint. It tends to be confusing to users if a change they're making in one library causes errors to appear in other libraries. While I haven't constructed an example to prove that this could occur, it looks like a likely problem.

It would definitely be interesting to know how often the problems we're trying to solve and the problems we're (potentially) introducing occur in practice. In particular, how often do the "cases that seem reasonable" mentioned in the second proposal occur? If they are very rare, then it would be worth the simplicity of the model to accept the limitations.

I'm also wondering whether we really want to produce errors at the declaration site. One of the examples is:

// No error here, but no default can be computed.
class D<T extends Comparable<T>> {}

// Error: D has no default
class E<T extends D>

It might be frustrating to users for this to be disallowed if all of the uses of E include explicit type parameters and hence aren't a problem in practice.

On a minor note, I think one of the examples is wrong:

// Default is C<int, B<int>>.
class C<T extends int, S extends B<T>> {}

Assuming that the definition of B is

class B<T extends A> {}

then, C<int, B<int>> isn't a valid type because int doesn't extend A. I assume that an attempt to build a default would fail if the default itself would end up being invalid.

@leafpetersen
Copy link
Member Author

It tends to be confusing to users if a change they're making in one library causes errors to appear in other libraries. While I haven't constructed an example to prove that this could occur, it looks like a likely problem.

This could certainly happen, but I would expect it to be fairly rare. It can only arise if you have types which refer to each cyclically in their bounds. This isn't unheard of, but I would expect it to only happen in fairly tightly coupled code.

In particular, how often do the "cases that seem reasonable" mentioned in the second proposal occur? If they are very rare, then it would be worth the simplicity of the model to accept the limitations.

Given that the 1st is an extension of the 2nd, perhaps it would be worth implementing it and testing it out. I think that there will be quite a bit of code that looks like the following:

class A<T extends List> {}

A x;

which will be an error. If we made the modest extension to proposal 2 to allow raw types in bounds for raw types with implicit bounds, then it's plausible that this might catch most of the remaining cases.

I'm also wondering whether we really want to produce errors at the declaration site. One of the examples is:

Unfortunately, this one pretty much has to be an error. It doesn't matter whether E is every used without bounds or not. We're simply unable to compute what its own bound means, and hence we cannot even type check the body of E.

On a minor note, I think one of the examples is wrong:

Fixed, thanks.

@leafpetersen
Copy link
Member Author

I did a little bit of code search spelunking, and actually didn't find many uses of the obvious culprits. I think it's worth either implementing proposal 2 and testing it out, or possibly just implementing the "make a raw type in a bound an error" part and testing it out.

@leafpetersen
Copy link
Member Author

status: ready for implementation. I will work with implementers to get proposal 2 implemented, run a pre-submit, and make go/no-go on the extension to proposal 1 based on data.

@MichaelRFairhurst
Copy link
Contributor

MichaelRFairhurst commented Nov 28, 2016

Not sure anyone has specifically brought up where the generic types are recursive to the class that's defined.

class C<T extends Comparable<T>> {} // anything comparable to itself. Unresolvable.
class C<T extends C<T>> {} // think java's Enum class. Resolvable!

In the latter case, it should resolve to Enum<Enum<Enum<Enum..... That sounds like an error, but its not! Think of the methods on the enum class: things like .nextValue().

abstract class Enum<E extends Enum<E>> {
  E nextValue();
}

Enum x = ...;
x = x.nextValue();

You can see how we know that x.nextValue() will be an Enum. And we can keep calling it forever and keeping an Enum. It's a valid infinite type.

At the very least this is a case to be caught at declaration time in the spec to make sure it doesn't crash in an infinite loop!

But maybe it is best represented by

class InfiniteRootType extends DartType {
   DartType loopType;
}
class InfiniteRecurseType extends DartType {
   InfiniteRootType recursePoint;
}

in this case, Enum<E extends Enum<E>> could be constructed:

DartType enumType = new InfiniteRootType();
DartType enumClassType = ....;
DartType enumRecurseType = new InfiniteRecurseType();
enumType.loopType = enumClassType;
enumClassType.parameters[0] = enumRecurseType;
enumRecurseType.recursePoint = enumType;

These classes could include state to guard against infinite loops to report internal errors, or know when they are complete, and algorithms could detect them to do special comparisons. I think there's a proof to be made about the typing relationship of of infinite types which is something along the lines of "infinite type a is a subtype of type b if b is an infinite type where any point in the tree traced forward until it repeats perfectly matches the root of a traced forward until it repeats itself." Probably not the hardest code to write, though there may be like...one or two algorithms that are real hell to describe with infinite types as a possibility.

Maybe easier to report an error when its declared :) But this is not outside the bounds of a typechecking system. Its really just a question of how complicated the algorithms that handle it become.

@munificent
Copy link
Member

where the generic types are recursive to the class that's defined.

Yes, Dart supports F-bounded quantification. Like you note, it comes into play in Comparable and a couple of other places (though surprisingly few for the complexity it adds to the type system).

We should be handling the potentially infinite types that come from that in the analyzer already (though it has certainly been a source of bugs over the years!). :)

@MichaelRFairhurst
Copy link
Contributor

MichaelRFairhurst commented Nov 30, 2016

I think F-Bounded (also, didn't know that's what they were called! Thanks!) types are supported and working, but F-Bounded types with undeclared parameters, and therefore instantiated to bounds, are not working as well as they could:

class Enum<E extends Enum<E>> {
  E nextValue() {
    return this;
  }
}

testfn() {
  Enum e;
  e = e.nextValue();
}

with dartanalyzer --strong gives me the warning:

[warning] Unsound implicit cast from Enum<dynamic> to Enum<Enum<dynamic>> (/home/mfairhurst/dart/angular-dart-analyzer/server_plugin/bin/test.dart, line 9, col 7)

Instead of Enum e being resolved to an infinite type, it looks like its resolved to Enum<Enum<dynamic>>, making the e.nextValue() typed to Enum<dynamic>, and then lack of covariance correctly steps in to complain about what's happening.

Its also worth noting that Enum<dynamic> and Enum<Enum<dynamic>> are not valid types! Declaring them yourself generates these types of errors:

[error] 'Enum' does not extend 'Enum<Enum>' (/home/mfairhurst/dart/angular-dart-analyzer/server_plugin/bin/test.dart, line 7, col 6)
[error] 'dynamic' does not extend 'Enum' (/home/mfairhurst/dart/angular-dart-analyzer/server_plugin/bin/test.dart, line 7, col 11)

@leafpetersen
Copy link
Member Author

Recursive types generally don't fit well into Dart's type system - there's not way to express them directly in the syntax (though they are in a certain sense there implicitly).

Both of the proposals that I sketched out in comments above deal with the Enum example by making uses of Enum as a raw type into an error (since we cannot compute a sensible and finite default bound for them).

@MichaelRFairhurst
Copy link
Contributor

That's probably the best, at least from a difficulty-to-implement point of view.

I don't yet see a problem with letting Enum be semantically equivalent to the syntactically impossible to express type Enum<Enum<Enum<Enum<..., but that does not mean problems with it don't exist! I know for instance that there is an effort right now to expose the bottom type, which previously existed semantically and not syntactically.

As someone who likes type theory (only well enough to not know the name for F-Bound quantification), I enjoy thinking about this problem, but there are bigger fish to fry.

@mit-mit mit-mit added the area-meta Cross-cutting, high-level issues (for tracking many other implementation issues, ...). label Dec 14, 2016
@mit-mit
Copy link
Member

mit-mit commented Dec 14, 2016

@leafpetersen #27526 (comment) suggests that this is specified, can we get the checklist updated? And is there a bug tracking the implementation in analyzer?

@munificent
Copy link
Member

And is there a bug tracking the implementation in analyzer?

There is now. :)

@mit-mit mit-mit modified the milestones: 1.22, 1.50 Dec 16, 2016
@mit-mit
Copy link
Member

mit-mit commented Dec 16, 2016

Moving to 1.22 for implementation per status update in email from @leafpetersen. Informal spec is included directly in this bug itself.

@kevmoo
Copy link
Member

kevmoo commented Jan 19, 2017

@leafpetersen @mit-mit @munificent Checking that this is critical for 1.22 w/ the spec work moved to 1.50?

@leafpetersen
Copy link
Member Author

My take (which I've shared with the analyzer folks) is that the syntax changes which affect both strong mode and spec mode are critical for 1.50 (I see this as the main point of 1.50 as it stands).

Strong mode only changes to bring strong mode into alignment with various language team decisions are a step down in priority: good to have in 1.50, but 1.50 still makes sense without them if they just can't happen in time.

This particular issue is second on my list of strong mode only changes in terms of priority, below toplevel inference. But if this is feasible to land in that time frame and toplevel inference is not, then feel free to prioritize this.

@mit-mit mit-mit added P1 A high priority bug; for example, a single project is unusable or has many test failures and removed P1 A high priority bug; for example, a single project is unusable or has many test failures labels Jan 19, 2017
@mit-mit
Copy link
Member

mit-mit commented Jan 24, 2017

Changelog underway: https://codereview.chromium.org/2648203003/

@leafpetersen
Copy link
Member Author

We (the language team) have decided on a small extension to the feature, but otherwise this is set to go as implemented.

#28580

@mit-mit
Copy link
Member

mit-mit commented Jan 31, 2017

Closing this; the follow-up issue #28580 will happen in 1.23.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-language Dart language related items (some items might be better tracked at github.com/dart-lang/language). area-meta Cross-cutting, high-level issues (for tracking many other implementation issues, ...). P1 A high priority bug; for example, a single project is unusable or has many test failures
Projects
None yet
Development

No branches or pull requests

8 participants