[Prototype] Better type inference for lambdas (e.g., as used in folds)

smarter · smarter · commit e15c5805a54a · 2020-05-29T17:14:21.000+02:00
No version of Scala has ever been able to infer the following:

  val xs = List(1, 2, 3)
  xs.foldLeft(Nil)((acc, x) => x :: acc)

To understand why, let's have a look at the signature of `List[A]#foldLeft`:

  def foldLeft[B](z: B)(op: (B, A) => B): B

When typing the foldLeft call in the previous expression, the compiler
starts by creating an unconstrained type variable `?B`. The challenge is then to
successfully type the expression and instantiate `?B := List[Int]`.
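
For reference, here is a minimal sketch of how this is usually worked around today, by making sure `B` is already `List[Int]` before the lambda is typed. The object and value names are only illustrative.

```scala
object FoldWorkarounds {
  val xs = List(1, 2, 3)

  // Workaround 1: give the initial value a precise type, so that B is
  // already List[Int] when the lambda is typed.
  val viaEmpty = xs.foldLeft(List.empty[Int])((acc, x) => x :: acc)

  // Workaround 2: pass the type argument explicitly.
  val viaTypeArg = xs.foldLeft[List[Int]](Nil)((acc, x) => x :: acc)
}
```

Both values are the reversed list; the point of this PR is to get the same result without either annotation.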

Typing the first argument is easy: `Nil` is a valid argument if we add a
constraint:

  ?B >: Nil.type

Typing the second argument is where we normally get stuck: we need to choose a
type for the binding `acc`, but `?B` is a type variable and not a fully-defined
type. This is solved by instantiating `?B` to one of its bounds, but no matter
which bound we choose, the rest of the expression won't typecheck:
- if we instantiate `?B := Nil.type`, then the type of the lambda body
  `x :: acc` is not a subtype of the expected result type `?B`.
- if we instantiate `?B := Any`, then the body of the lambda does not
  typecheck since there is no method `::` on `Any`.
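
The two dead ends can be reproduced by hand by passing each bound as an explicit type argument. This is only a sketch: the rejected calls are left in comments so the block still compiles, and the names are illustrative.

```scala
object BoundChoices {
  val xs = List(1, 2, 3)

  // ?B := Nil.type: the body `x :: acc` has type List[Int], which does not
  // conform to Nil.type, so this call is rejected:
  //   xs.foldLeft[Nil.type](Nil)((acc, x) => x :: acc)
  val body: List[Int] = 1 :: (Nil: Nil.type)

  // ?B := Any: `acc` is an Any, which has no `::` member, so this call is
  // rejected:
  //   xs.foldLeft[Any](Nil)((acc, x) => x :: acc)
  // It only compiles if the lambda recovers the list type by hand, and the
  // static result type is still Any:
  val viaAny: Any = xs.foldLeft[Any](Nil) {
    case (acc: List[_], x) => x :: acc
    case (other, _)        => other
  }
}
```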

But... what if we just let `acc` have type `?B` without instantiating it first?
This is not completely meaningless: `?B` behaves like an abstract type except
that its bounds might be refined as we typecheck code. As long as narrowing
holds (#), this should be safe.

The remaining challenge then is to type the body of the lambda `x :: acc`, which
desugars to `acc.::(x)`. This won't typecheck as-is since `::` is not defined on
the upper bound of `?B`, so we need to refine this upper bound somehow. The
heuristic we use is:

1) Look for `::` in the lower bound of `?B >: Nil.type`: `Nil` does have such a
   member!
2) Find the class where this member is defined: it's `List`.
3) If the class has type parameters, create one fresh type variable
   for each parameter slot. The resulting type is our new upper bound,
   so here we get `?B <: List[?X]` where `?X` is a fresh type variable.
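
The three steps can be sketched on a toy model. None of the types below are compiler data structures: `Cls`, `Tpe`, `Applied`, `TypeVar`, `definingClass` and `newUpperBound` are hypothetical names made up for illustration, and the actual logic lives in `constrainSelectionQualifier` in the diff.

```scala
object UpperBoundHeuristic {
  // Hypothetical toy model of a class: its parent, the members it declares
  // itself, and the names of its type parameters.
  final case class Cls(name: String, parent: Option[Cls],
                       members: Set[String], tparams: List[String])

  sealed trait Tpe
  final case class Applied(cls: Cls, args: List[Tpe]) extends Tpe
  final case class TypeVar(id: Int) extends Tpe // stands for a fresh ?X

  // Steps 1 and 2: starting from the class of the lower bound, walk up to
  // the class that declares `member`, if any.
  def definingClass(cls: Cls, member: String): Option[Cls] =
    if (cls.members.contains(member)) Some(cls)
    else cls.parent.flatMap(definingClass(_, member))

  // Step 3: apply the defining class to one fresh variable per parameter.
  def newUpperBound(lower: Cls, member: String, fresh: () => TypeVar): Option[Tpe] =
    definingClass(lower, member).map { owner =>
      Applied(owner, owner.tparams.map(_ => fresh()))
    }

  val listCls = Cls("List", None, Set("::", "head"), List("A"))
  val nilCls  = Cls("Nil", Some(listCls), Set.empty, Nil)

  private var counter = 0
  def fresh(): TypeVar = { counter += 1; TypeVar(counter) }

  // Models `?B >: Nil.type` gaining the upper bound `List[?X]`.
  val bound: Option[Tpe] = newUpperBound(nilCls, "::", () => fresh())
}
```

Here `bound` is `Some(Applied(listCls, List(TypeVar(1))))`, the toy equivalent of `?B <: List[?X]`.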

We can then proceed to type the body of the lambda:

  acc.::(x)

This first creates a type variable `?B2 >: ?X`, because `::` has type:

  def :: [B >: A](elem: B): List[B]

Because the result type of the lambda is `?B`, we get an additional constraint:

  List[?B2] <: ?B

We know that `?B <: List[?X]` so this means that `?B2 <: ?X`, but we
also know that `?B2 >: ?X`, so we can instantiate `?B2 := ?X` and `?B := List[?X]`.

Finally, because `x` has type `Int` we have `?B2 >: Int` which simplifies to:

  ?X >: Int

Therefore, the result type of the foldLeft is `List[?X]` where `?X >: Int`.
Because `List` is covariant, we instantiate `?X := Int` to get the most precise
result type `List[Int]`.
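
As a sanity check, the solution derived above can be written out by hand with every type variable replaced by its instantiation (`?B := List[Int]`, `?B2 := ?X := Int`). This sketch compiles on existing compilers because nothing is left to infer; the names are illustrative.

```scala
object ElaboratedFold {
  val xs = List(1, 2, 3)

  // ?B := List[Int] is the type argument of foldLeft and the type of `acc`;
  // ?B2 := Int is the type argument of `::`.
  val result: List[Int] =
    xs.foldLeft[List[Int]](Nil)((acc: List[Int], x: Int) => acc.::[Int](x))
}
```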

Note that the use of fresh type variables in 3) was crucial here: if we had
instead used wildcards and added an upper bound `?B <: List[_]`, then we would
have been able to type `acc.::(x)`, but the result would have type `List[Any]`,
meaning the result of the foldLeft call would be `List[Any]` when we wanted
`List[Int]`.
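
The loss of precision with a wildcard can be seen in isolation, assuming a qualifier whose static type is `List[_]`. This is only a sketch with illustrative names.

```scala
object WildcardBound {
  // The element type of `acc` is unknown to the typechecker.
  val acc: List[_] = Nil

  // `::` needs a B that is a supertype of both the unknown element type and
  // Int. With nothing known about the element type, the result is only
  // usable as a List[Any]: the Int is no longer visible in the type.
  val widened: List[Any] = acc.::(1)
}
```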

# Status

All the compiler tests pass, including bootstrapping, but one third of the
community build currently breaks.

Even if this PR never makes it in, it has been very useful for stress-testing
our constraint solver and led to several PRs I made over the past few days.
Parts of this PR would also be worth getting in by themselves.

# Open questions

- Is this actually sound?
- Are there other compelling examples where this is useful, besides folds?
- Is the performance impact of this stuff acceptable?
- How do we deal with overloads?
- How do we deal with overrides?
- How does this interact with implicit conversions?
- How does this interact with implicit search in general? We might find one
  implicit at a given point, but then as we add more constraints to the same
  type variable, the same implicit search could find a different result. How big
  of a problem is that?
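
To make the last question concrete, here is a sketch of how the outcome of an implicit search depends on what a type variable has been narrowed to. The `Describe` type class and its instances are hypothetical, and the two explicit type arguments stand in for an "early" and a "late" state of the same type variable.

```scala
object ImplicitDrift {
  // Hypothetical invariant type class with one instance per candidate type.
  trait Describe[T] { def text: String }
  object Describe {
    implicit val forNil: Describe[Nil.type] =
      new Describe[Nil.type] { def text = "Nil.type" }
    implicit def forList[A]: Describe[List[A]] =
      new Describe[List[A]] { def text = "List" }
  }

  def describe[T](x: T)(implicit d: Describe[T]): String = d.text

  // Same argument, but a different instance is selected depending on the
  // type that T stands for when the search runs.
  val early = describe[Nil.type](Nil)
  val late  = describe[List[Int]](Nil)
}
```

With `T` fixed to `Nil.type` the search picks `forNil`, while with `List[Int]` it picks `forList`, so a search performed before all constraints are known could commit to an instance that a later search would not choose.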

(#): narrowing in fact does not hold when `@uncheckedVariance` is used, which is
     why we special-case it in `typedSelect` in this commit.
diff --git a/compiler/src/dotty/tools/dotc/ast/Desugar.scala b/compiler/src/dotty/tools/dotc/ast/Desugar.scala
@@ -681,7 +681,12 @@ object desugar {
               if (restrictedAccess) mods.withPrivateWithin(constr1.mods.privateWithin)
               else mods
             }
-            val appParamss =
+            // FIXME: This now infers `List[List[DefTree]]`, the issue
+            // is that `withMods` is defined in `DefTree` so that becomes the
+            // upper bound of the type variable (see logic in `constrainSelectionQualifier`),
+            // but the result type of `withMods` is a type member which is
+            // refined in `ValDef`.
+            val appParamss: List[List[ValDef]] =
               derivedVparamss.nestedZipWithConserve(constrVparamss)((ap, cp) =>
                 ap.withMods(ap.mods | (cp.mods.flags & HasDefault)))
             val app = DefDef(nme.apply, derivedTparams, appParamss, applyResultTpt, widenedCreatorExpr)
diff --git a/compiler/src/dotty/tools/dotc/core/OrderingConstraint.scala b/compiler/src/dotty/tools/dotc/core/OrderingConstraint.scala
@@ -310,6 +310,9 @@ class OrderingConstraint(private val boundsMap: ParamBounds,
   private def ensureNonCyclic(param: TypeParamRef, inst: Type)(using Context): Type =
 
     def recur(tp: Type, fromBelow: Boolean): Type = tp match
+      case tp: NamedType =>
+        val underlying1 = recur(tp.underlying, fromBelow)
+        if underlying1 ne tp.underlying then underlying1 else tp
       case tp: AndOrType =>
         val r1 = recur(tp.tp1, fromBelow)
         val r2 = recur(tp.tp2, fromBelow)
@@ -613,6 +616,8 @@ class OrderingConstraint(private val boundsMap: ParamBounds,
   def occursAtToplevel(param: TypeParamRef, inst: Type)(implicit ctx: Context): Boolean =
 
     def occurs(tp: Type)(using Context): Boolean = tp match
+      case tp: NamedType =>
+        occurs(tp.underlying)
       case tp: AndOrType =>
         occurs(tp.tp1) || occurs(tp.tp2)
       case tp: TypeParamRef =>
diff --git a/compiler/src/dotty/tools/dotc/core/Types.scala b/compiler/src/dotty/tools/dotc/core/Types.scala
@@ -3927,7 +3927,7 @@ object Types {
         NoType
     }
 
-    def tyconTypeParams(implicit ctx: Context): List[ParamInfo] = {
+    def tyconTypeParams(implicit ctx: Context): List[TypeApplications.TypeParamInfo] = {
       val tparams = tycon.typeParams
       if (tparams.isEmpty) HKTypeLambda.any(args.length).typeParams else tparams
     }
diff --git a/compiler/src/dotty/tools/dotc/typer/Inferencing.scala b/compiler/src/dotty/tools/dotc/typer/Inferencing.scala
@@ -381,7 +381,7 @@ object Inferencing {
    *
    *  we want to instantiate U to x.type right away. No need to wait further.
    */
-  private def variances(tp: Type)(using Context): VarianceMap = {
+  def variances(tp: Type)(using Context): VarianceMap = {
     Stats.record("variances")
     val constraint = ctx.typerState.constraint
 
diff --git a/compiler/src/dotty/tools/dotc/typer/ProtoTypes.scala b/compiler/src/dotty/tools/dotc/typer/ProtoTypes.scala
@@ -135,7 +135,12 @@ object ProtoTypes {
      *  or as an upper bound of a prefix or underlying type.
      */
     private def hasUnknownMembers(tp: Type)(using Context): Boolean = tp match {
-      case tp: TypeVar => !tp.isInstantiated
+      case tp: TypeVar =>
+        // FIXME: This used to be `!tp.isInstantiated` but that prevents
+        // extension methods from being selected with the changes in this PR.
+        // This change doesn't break any testcase, can we construct a testcase
+        // where this matters?
+        false
       case tp: WildcardType => true
       case NoType => true
       case tp: TypeRef =>
@@ -152,20 +157,30 @@ object ProtoTypes {
       case _ => false
     }
 
-    override def isMatchedBy(tp1: Type, keepConstraint: Boolean)(using Context): Boolean =
-      name == nme.WILDCARD || hasUnknownMembers(tp1) ||
-      {
-        val mbr = if (privateOK) tp1.member(name) else tp1.nonPrivateMember(name)
+    override def isMatchedBy(tp1: Type, keepConstraint: Boolean)(using Context): Boolean = {
+      if name == nme.WILDCARD || hasUnknownMembers(tp1) then
+        return true
+
+      def go(pre: Type): Boolean = {
+        val mbr = if (privateOK) pre.member(name) else pre.nonPrivateMember(name)
         def qualifies(m: SingleDenotation) =
           memberProto.isRef(defn.UnitClass) ||
-          tp1.isValueType && compat.normalizedCompatible(NamedType(tp1, name, m), memberProto, keepConstraint)
+          pre.isValueType && compat.normalizedCompatible(NamedType(pre, name, m), memberProto, keepConstraint)
             // Note: can't use `m.info` here because if `m` is a method, `m.info`
             //       loses knowledge about `m`'s default arguments.
         mbr match { // hasAltWith inlined for performance
           case mbr: SingleDenotation => mbr.exists && qualifies(mbr)
           case _ => mbr hasAltWith qualifies
         }
       }
+      tp1.widenDealias.stripTypeVar match {
+        case tp: TypeParamRef =>
+          val bounds = ctx.typeComparer.bounds(tp)
+          go(bounds.hi) || go(bounds.lo)
+        case _ =>
+          go(tp1)
+      }
+    }
 
     def underlying(using Context): Type = WildcardType
 
diff --git a/compiler/src/dotty/tools/dotc/typer/Typer.scala b/compiler/src/dotty/tools/dotc/typer/Typer.scala
@@ -516,13 +516,220 @@ class Typer extends Namer
     tree
   }
 
+  /** Try to add constraints to type a selection where the qualifier is a type variable.
+   *
+   *  Currently, this should only happen with lambdas, for example when typechecking:
+   *
+   *    def foo[T <: List[Any]](x: T => T)
+   *    foo(x => x.head: Int)
+   *
+   *  In the past, `typedFunctionValue` would have instantiated the type
+   *  variable corresponding to the type parameter `T` to `List[Any]` before
+   *  typing the lambda, which would then fail because `x.head` has type `Any`.
+   *  But we now leave such type variables uninstantiated, which means we need
+   *  to figure out how to type a selection where the prefix is an
+   *  uninstantiated type variable, and in particular how to propagate
+   *  constraints from typing this selection back to that type variable.
+   *
+   *  @param qual           The type of the qualifier of the selection
+   *  @param name           The name of the member being selected
+   *  @param underlyingVar  The uninstantiated type variable underlying the type of the qualifier
+   *  @param pt             The expected type of the selection
+   */
+  private def constrainSelectionQualifier(
+    qual: Type, name: Name, underlyingVar: TypeVar, pt: Type)(using Context): Boolean = {
+
+    /** Return `tycon[?A, ?B, ...]` where `?A`, `?B`, ... are fresh type variables
+     *  conforming to the corresponding type parameter in `tparams`.
+     */
+    def appliedWithVars(tycon: Type, tparams: List[TypeApplications.TypeParamInfo]): Type = {
+      if (tparams.isEmpty)
+        tycon
+      else {
+        val tl = tycon.EtaExpand(tparams).asInstanceOf[HKTypeLambda]
+        val tvars = constrained(tl, untpd.EmptyTree, alwaysAddTypeVars = true)._2.map(_.tpe)
+        tycon.appliedTo(tvars)
+      }
+    }
+
+    /** Replace all applied types `tycon[T, S, ...]` by `tycon[?A, ?B, ...]`
+     *  where `?A`, `?B`, ... are fresh type variables.
+     */
+    def replaceArgsByVars = new TypeMap {
+      def apply(t: Type): Type = t match {
+        case tp: TypeLambda =>
+          tp
+        case tp @ AppliedType(tycon, args) =>
+          // Note that we don't constrain the fresh type variables
+          // such that the mapped type is a subtype of `tp`, we let
+          // the caller deal with that.
+          appliedWithVars(tycon, tp.tyconTypeParams)
+        case _ =>
+          mapOver(t)
+      }
+    }
+
+    /** Does `@uncheckedVariance` appears somewhere in the type of `d` ? */
+    def hasUncheckedVariance(d: SingleDenotation) = d.info.widen.existsPart {
+      case tp @ AnnotatedType(_, annot) =>
+        annot.symbol eq defn.UncheckedVarianceAnnot
+      case tp =>
+        false
+    }
+
+    /** The members of one of the bound of `underlyingVar` which the selection
+     *  could resolve to.
+     *
+     * @param isUpper  If true, look for candidates in the upper bound,
+     *                 otherwise look in the lower bound.
+     */
+    def candidatesInBound(isUpper: Boolean): List[SingleDenotation] = {
+      val bounds = ctx.typeComparer.bounds(underlyingVar.origin)
+      val bound = if (isUpper) bounds.hi else bounds.lo
+      val d = bound.member(name)
+      d.alternatives
+    }
+
+    /** Try to add additional constraints on `underlyingVar`
+     *  to allow a selection of `candidate` to typecheck.
+     *
+     *  @param  isUpper  Does `candidate` come from the upper bound
+     *                   of the qualifier type?
+     */
+    def constrainTo(candidate: SingleDenotation, isUpper: Boolean): Boolean = {
+      if (hasUncheckedVariance(candidate)) {
+        // If `@uncheckedVariance` appears in the type of the candidate, give up
+        // on delaying instantiation and just instantiate the type variable at
+        // that point. If we don't do that, the type variable might later
+        // be constrained in a way that prevents the selection from typechecking,
+        // because narrowing does not hold with unchecked variance.
+        // See tests/pos/fold-infer-uncheckedVariance.scala for an example
+        // which did not pass `-Ytest-pickler` before.
+        // FIXME: `-Ycheck:all` did not pick up on this issue in
+        // fold-infer-uncheckedVariance.scala because the ReTyper never retypes
+        // selections, I think it's important to get TreeChecker to actually
+        // verify this stuff, especially with everything going on in this PR.
+        underlyingVar.instantiate(fromBelow = !isUpper)
+        return true
+      }
+
+      val owner = candidate.symbol.maybeOwner
+      // TODO: Deal with methods in structural types?
+      if (!owner.exists || !owner.isClass)
+        return false
+
+      if (isUpper) {
+        // The candidate comes from the upper bound of the qualifier.
+        // In that case we replace type arguments in the upper bound
+        // by fresh type variables to make it more flexible.
+        //
+        // For example, we might have `qual: ?T` where `?T <: List[AnyVal]`.
+        // in which case `qual.head` will have type `qual.A` where `A`
+        // is an abstract type >: Nothing <: AnyVal.
+        // Therefore, when typechecking `qual.head: Int`, we get:
+        //
+        //   qual.A <:< Int
+        //   AnyVal <:< Int
+        //         false
+        //
+        // The problem is that subtype checks on `qual.A` do
+        // not allow us to constraint `?T` further.
+        //
+        // To fix this, we need a more precise upper-bound for `?T`:
+        // we can safely rewrite the constraint:
+        //
+        //   ?T <: List[AnyVal]
+        //
+        // as:
+        //
+        //   ?T <: List[?X]
+        //   ?X <: AnyVal
+        //
+        // Now, if we try to typecheck `qual.head: Int`, we get:
+        //
+        //   qual.A <:< Int
+        //       ?X <:< Int
+        //         true, with extra constraint `?X <: Int`
+        //
+        // And at a later point, `?T` will be instantiated to a
+        // subtype of `List[Int]` as expected.
+
+        val base = qual.baseType(owner)
+        // FIXME: this is wasteful: if we have multiple selections with the
+        // same qualifier, we'll create fresh type variables every time.
+        val newUpperBound = replaceArgsByVars(base)
+
+        if newUpperBound ne base then
+          underlyingVar <:< newUpperBound
+      } else {
+        // The candidate comes from the lower bound of the qualifier.
+        // In that case, we need to constrain the upper bound of the
+        // qualifier to be able to typecheck the selection at all,
+        // and like in the isUpper case, we want type variables in
+        // the arguments of that upper bound for flexibility.
+        //
+        // For example, if we have `qual: ?T` where `?T >: Nil`, then
+        // `qual.::` will fail as there is no member named `::`
+        // defined on `Any`, so we need to further constrain the upper
+        // bound. We know that `::` is defined on `List`, so we can add
+        // a constraint:
+        //
+        //   ?T <: List[?X]
+        //   ?X
+        //
+        // (in this example, the fresh type variable `?X` can stay
+        // unconstrained since `Nil <:< List[?X]` is true for all `?X`)
+
+        // FIXME: better handling of overrides: `candidate` might be an
+        // override of some member defined in a parent class, in which
+        // case we're overconstraining the upper bound.
+        val newUpperBound = appliedWithVars(owner.typeRef, owner.typeParams)
+        underlyingVar <:< newUpperBound
+      }
+
+      // FIXME: it would be nice if we could use the expected type
+      // to filter out some candidates, but it's hard to rule out
+      // anything since some implicit conversion might kick in
+      // during adaptation.
+      true
+    }
+
+    /** Try to add additional constraints on `underlyingVar`
+     *  to allow a selection based on the candidates found
+     *  in one of its own bound.
+     *
+     *  @param isUpper  If true, look for candidates in the upper bound,
+     *                  otherwise look in the lower bound.
+     */
+    def constrainInBound(isUpper: Boolean): Boolean = {
+      // FIXME: we just stop after finding a matching candidate, should we
+      // take the union of the constraints they add instead?
+      candidatesInBound(isUpper).exists(constrainTo(_, isUpper))
+    }
+
+    // FIXME: We currently only look at the lower bound if we don't find a
+    // matching member in the upper bound, but that could exclude
+    // the right candidate.
+    constrainInBound(isUpper = true) || constrainInBound(isUpper = false)
+  }
+
   def typedSelect(tree: untpd.Select, pt: Type, qual: Tree)(using Context): Tree = qual match {
     case qual @ IntegratedTypeArgs(app) =>
       pt.revealIgnored match {
         case _: PolyProto => qual // keep the IntegratedTypeArgs to strip at next typedTypeApply
         case _ => app
       }
     case qual =>
+      qual.tpe.widenDealias.stripTypeVar match {
+        case tp: TypeParamRef =>
+          ctx.typerState.constraint.typeVarOfParam(tp) match {
+            case tvar: TypeVar =>
+              constrainSelectionQualifier(qual.tpe, tree.name, tvar, pt)
+            case _ =>
+          }
+        case _ =>
+      }
+
       val select = assignType(cpy.Select(tree)(qual, tree.name), qual)
       val select1 = toNotNullTermRef(select, pt)
 
@@ -1096,6 +1303,21 @@ class Typer extends Namer
       case _ =>
     }
 
+    // The set of type variables in the prototype which appear only in covariant or
+    // contravariant positions. These should be instantiatable without
+    // preventing the body of the lambda from typechecking (...except in situations
+    // like `def foo[T, U <: T](x: T => U)`, where instantiating `T` to a specific
+    // type might overconstrain `U`).
+    //
+    // This doesn't exclude type variables which appear with a different
+    // variance at a later point in the same method call, or a subsequent chained
+    // call.
+    //
+    // TODO: try to replace this by an empty list and see how that affects
+    // inference and performance (we would end up creating a lot more type
+    // variables in `typedSelect`).
+    lazy val protoVariantVars = variances(pt).toList.filter(_._2 != 0).map(_._1)
+
     val (protoFormals, resultTpt) = decomposeProtoFunction(pt, params.length)
 
     /** The inferred parameter type for a parameter in a lambda that does
@@ -1118,7 +1340,7 @@ class Typer extends Namer
      *  If all attempts fail, issue a "missing parameter type" error.
      */
     def inferredParamType(param: untpd.ValDef, formal: Type): Type =
-      if isFullyDefined(formal, ForceDegree.failBottom) then return formal
+      if isFullyDefined(formal, ForceDegree.none) then return formal
       val target = calleeType.widen match
         case mtpe: MethodType =>
           val pos = paramIndex(param.name)
@@ -1128,9 +1350,17 @@ class Typer extends Namer
           else NoType
         case _ => NoType
       if target.exists then formal <:< target
-      if isFullyDefined(formal, ForceDegree.flipBottom) then formal
+      // if isFullyDefined(formal, ForceDegree.flipBottom) then formal
+      if isFullyDefined(formal, ForceDegree.none) then formal
       else if target.exists && isFullyDefined(target, ForceDegree.flipBottom) then target
-      else errorType(AnonymousFunctionMissingParamType(param, params, tree, formal), param.sourcePos)
+      else if !formal.isInstanceOf[WildcardType] then
+        instantiateSelected(formal, protoVariantVars)
+        // Intentionally leave uninstantiated type variables in the types of parameters,
+        // this works because `typedSelect` special cases the handling of qualifiers
+        // whose type is a type variable.
+        formal
+      else
+        errorType(AnonymousFunctionMissingParamType(param, params, tree, formal), param.sourcePos)
 
     def protoFormal(i: Int): Type =
       if (protoFormals.length == params.length) protoFormals(i)
diff --git a/tests/pos/fold-infer-2.scala b/tests/pos/fold-infer-2.scala
diff --git a/tests/pos/fold-infer-uncheckedVariance.scala b/tests/pos/fold-infer-uncheckedVariance.scala
diff --git a/tests/pos/fold-infer.scala b/tests/pos/fold-infer.scala
