-
Notifications
You must be signed in to change notification settings - Fork 1.1k
[Prototype] Better type inference for lambdas (e.g., as used in folds) #9076
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In `mergeIfSuper`, to simplify `tp1 | tp2`, we first check if `tp2` can be made a subtype of `tp1`. If so we could just return `tp1` but this isn't what we did before this commit: at that point we checked if `tp1` could be made a subtype of `tp2` and in that case prefered returning `tp2` to `tp1`. I haven't been able to find a reason for this (the comment says "keep existing type if possible" which I don't understand). On the other hand, I've found cases where this logic broke type inference because the second subtype check inferred extraneous constraints (see added testcase). So this commit simply removes this logic, it does the same for `mergeIfSub` which contains similar logic to simplify `tp1 & tp2`, though this one is less problematic since it always runs with frozen constraints.
This is just a workaround, awaiting further discussion in the linked issue.
This is the minimum extra check needed to avoid loops after the inference changes in this PR and is incomplete, see test case.
This is needed for type inference with a type variable selection (added in the next commit of this PR) to work properly, but it also affects master. Because dependencies can now span multiple TypeLambdas, `addToConstraint` had to be adjusted to properly propagate all constraints. i5735 doesn't compile anymore but I don't think it ever really made sense.
No version of Scala has ever been able to infer the following: val xs = List(1, 2, 3) xs.foldLeft(Nil)((acc, x) => x :: acc) To understand why, let's have a look at the signature of `List[A]#foldLeft`: def foldLeft[B](z: B)(op: (B, A) => B): B When typing the foldLeft call in the previous expression, the compiler starts by creating an unconstrained type variable ?B, the challenge is then to successfully type the expression and instantiate `?B := List[Int]`. Typing the first argument is easy: `Nil` is a valid argument if we add a constraint: ?B >: Nil.type Typing the second argument is where we get stuck normally: we need to choose a type for the binding `acc`, but `?B` is a type variable and not a fully-defined type, this is solved by instantiating `?B` to one of its bound, but no matter what bound we choose, the rest of the expression won't typecheck: - if we instantiate `?B := Nil.type`, then the body of the lambda `x :: acc` is not a subtype of the expected result type `?B`. - if we instantiate `?B := Any`, then the body of the lambda does not typecheck since there is no method `::` on `Any`. But... what if we just let `acc` have type `?B` without instantiating it first? This is not completely meaningless: `?B` behaves like an abstract type except that its bounds might be refined as we typecheck code, as long as narrowing holds (#), this should be safe. The remaining challenge then is to type the body of the lambda `x :: acc` which desugars to `acc.::(x)`, this won't typecheck as-is since `::` is not defined on the upper bound of `?B`, so we need to refine this upper bound somehow, the heuristic we use is: 1) Look for `::` in the lower bound of `?B >: Nil.type`, Nil does have such a member! 2) Find the class where this member is defined: it's `List` 3) If the class has type parameters, create one fresh type variable for each parameter slot, the resulting type is our new upper bound, so here we get `?B <: List[?X]` where `?X` is a fresh type variable. We can then proceed to type the body of the lambda: acc.::(x) This first creates a type variable `?B2 >: ?X`, because `::` has type: def :: [B >: A](elem: B): List[B] Because the result type of the lambda is `?B`, we get an additional constraint: List[?B2] <: ?B We know that `?B <: List[?X]` so this means that `?B2 <: ?X`, but we also know that `B2 >: ?X`, so we can instantiate `?B2 := ?X` and `?B := List[?X]`. Finally, because `x` has type Int we have `?B2 >: Int` which simplifies to: ?X >: Int Therefore, the result type of the foldLeft is `List[?X]` where `?X >: Int`, because `List` is covariant, we instantiate `?X := Int` to get the most precise result type `List[Int]`. Note that the the use of fresh type variables in 3) was crucial here: if we had instead used wildcards and added an upper bound `?B <: List[_]`, then we would have been able to type `acc.::(x)`, but the result would have type `List[Any]`, meaning the result of the foldLeft call would be `List[Any]` when we wanted `List[Int]`. \# Status All the compiler tests pass, including bootstrapping, but one of third of the community build breaks currently. Even if this PR never makes it in, it has been very useful for stress-testing our constraint solver and lead to several PRs I made over the past few days: of this PR that would be worth getting in by themselves. \# Open questions - Is this actually sound? - Are there other compelling examples where this useful, besides folds? - Is the performance impact of this stuff acceptable? - How do we deal with overloads? - How do we deal with overrides? - How does this interact with implicit conversions? - How does this interact with implicit search in general, we might find one implicit at a given point, but then as we add more constraints to the same type variable, the same implicit search could find a different result. How big of a problem is that? (#): narrowing in fact does not hold when `@uncheckedVariance` is used, which is why we special-case it in `typedSelect` in this commit.
17fb4f9
to
e15c580
Compare
@Blaisorblade I make some claims above that what I'm doing is sound as long as narrowing holds, do you think that makes sense? (And do we in fact expect narrowing to hold in Scala?) |
(EDIT) TLDR: I find this cool and let me encourage you! The concerns shouldn't affect soundness, but with lots of question marks.
Otherwise, narrowing holds without restrictions in both my model and Nada's thesis; only "good bounds" can be broken by narrowing. I expect it should hold in Scala (ignoring transitivity) because it corresponds to Liskov's substitution principle — but if it doesn't, you should get a soundness bug (at least in a calculus and probably in Scala) as follows: If some judgment
|
In fact, I found a case where Ycheck is slacking on the job: e15c580#diff-7c63b7bfffa9340fb48ac9b61fa56beeR601, but in principle yes. However, anything that is only caught by Ycheck is still a soundness bug since we don't run it by default.
Yes that's inevitable I think, I think the best we can do is tune our heuristics (like the one we use to decide which type variables to instantiate before doing an implicit search) to keep existing code behaving more or less the same.
Right, I think that's fine since the constraint solver itself will never narrow an upper bound constraint |
Agreed; I was only hoping that those bugs would be easier to catch.
But it could narrow |
Note that it matters whether you use scala> List(1, 2, 3).foldLeft(List.empty)((x, y) => y :: x)
val res0: List[Any] = List(3, 2, 1) I'm pleased to see that this PR improves this example too: scala> List(1, 2, 3).foldLeft(List.empty)((x, y) => y :: x)
val res0: List[Int] = List(3, 2, 1) |
There was no activity on this one for a long while, so let's close it. |
No version of Scala has ever been able to infer the following:
To understand why, let's have a look at the signature of
List[A]#foldLeft
:When typing the foldLeft call in the previous expression, the compiler
starts by creating an unconstrained type variable ?B, the challenge is then to
successfully type the expression and instantiate
?B := List[Int]
.Typing the first argument is easy:
Nil
is a valid argument if we add aconstraint:
Typing the second argument is where we get stuck normally: we need to choose a type
for the binding
acc
, but?B
is a type variable and not a fully-defined type,this is solved by instantiating
?B
to one of its bound, but no matter whatbound we choose, the rest of the expression won't typecheck:
?B := Nil.type
, then the body of the lambdax :: acc
isnot a subtype of the expected result type
?B
.?B := Any
, then the body of the lambda does nottypecheck since there is no method
::
onAny
.But... what if we just let
acc
have type?B
without instantiating it first?This is not completely meaningless:
?B
behaves like an abstract type exceptthat its bounds might be refined as we typecheck code, as long as narrowing
holds (#), this should be safe.
The remaining challenge then is to type the body of the lambda
x :: acc
whichdesugars to
acc.::(x)
, this won't typecheck as-is since::
is not defined onthe upper bound of
?B
, so we need to refine this upper bound somehow, theheuristic we use is:
::
in the lower bound of?B >: Nil.type
, Nil does have such a member!List
for each parameter slot, the resulting type is our new upper bound,
so here we get
?B <: List[?X]
where?X
is a fresh type variable.We can then proceed to type the body of the lambda:
acc.::(x)
This first creates a type variable
?B2 >: ?X
, because::
has type:Because the result type of the lambda is
?B
, we get an additional constraint:We know that
?B <: List[?X]
so this means that?B2 <: ?X
, but wealso know that
B2 >: ?X
, so we can instantiate?B2 := ?X
and?B := List[?X]
.Finally, because
x
has type Int we have?B2 >: Int
which simplifies to:Therefore, the result type of the foldLeft is
List[?X]
where?X >: Int
,because
List
is covariant, we instantiate?X := Int
to get the most preciseresult type
List[Int]
.Note that the the use of fresh type variables in 3) was crucial here: if we had
instead used wildcards and added an upper bound
?B <: List[_]
, then we wouldhave been able to type
acc.::(x)
, but the result would have typeList[Any]
,meaning the result of the foldLeft call would be
List[Any]
when we wantedList[Int]
.Status
All the compiler tests pass, including bootstrapping, but one of third of the
community build breaks currently.
Even if this PR never makes it in, it has been very useful for stress-testing
our constraint solver and lead to several PRs I made over the past few days:
#9012, #9019, #9030, #9053, and there's some more changes in the early commits
of this PR that would be worth getting in by themselves.
Open questions
implicit at a given point, but then as we add more constraints to the same
type variable, the same implicit search could find a different result. How big
of a problem is that?
(#): narrowing in fact does not hold when
@uncheckedVariance
is used, which iswhy we special-case it in
constrainSelectionQualifier