
Use a universal abstraction for constrained collection types #39

Closed · wants to merge 3 commits

Conversation

szeiger
Contributor

@szeiger commented Feb 27, 2017

This could be used for different types of constraints to implement
types such as `Map`, `BitSet`, `SortedSet`. Computing the result type
works similarly to `CanBuild` but the eligible implicit scopes and the
relationship of different implicits are much simpler. In particular,
there is never a `From` type and there are only two choices for the
`To` type: the desired constrained type (which does not need to be a
type constructor) and an unconstrained fallback type constructor.

The current implementation shows that inheritance works (with plenty of
`@uncheckedVariance` tricks) if the constraint is the same. We add an
`Ordering` constraint in `SortedSet` and extend that to `TreeSet`. If
the element type has an `Ordering`, poly-transforms produce another
`TreeSet` (or `SortedSet`, depending on the original type), otherwise a
`Set`.

Unfortunately, it does not look like it will be possible to compose
multiple constraints, which would be needed for types like `TreeMap`
(combining the `Tuple2` constraint of `Map` with the `Ordering`
constraint of `Sorted`).
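
A minimal sketch of that two-choice lookup, under assumed names (`Build`, `LowPriorityBuild`) and with standard-library collections standing in for the strawman types; the PR's actual code may differ:

    import scala.collection.immutable.{Set, SortedSet, TreeSet}

    // Evidence that elements of type E can be collected into a To.
    // Note that To is a plain type, not a type constructor.
    trait Build[E, To] {
      def fromIterable(it: Iterable[E]): To
    }

    trait LowPriorityBuild {
      // Fallback: any element type can go into an unconstrained Set.
      implicit def buildSet[E]: Build[E, Set[E]] =
        new Build[E, Set[E]] { def fromIterable(it: Iterable[E]): Set[E] = it.toSet }
    }

    object Build extends LowPriorityBuild {
      // Preferred: if an Ordering exists, the constrained SortedSet wins.
      implicit def buildSortedSet[E](implicit ord: Ordering[E]): Build[E, SortedSet[E]] =
        new Build[E, SortedSet[E]] {
          def fromIterable(it: Iterable[E]): SortedSet[E] = TreeSet.empty[E] ++ it
        }
    }
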
sbt build definition (diff context for the review comment below):

    scalaVersion in ThisBuild := "2.12.1"
    resolvers in ThisBuild += "scala-pr" at "https://scala-ci.typesafe.com/artifactory/scala-pr-validation-snapshots"

    scalaVersion in ThisBuild := "2.12.2-ebe1180-SNAPSHOT" // "2.12.1"
@szeiger (Contributor Author) commented:

Scala build from scala/scala#5742

szeiger added 2 commits March 2, 2017 13:46
Adding an upper bound to element types allows `FromIterable` and
`IterableFactory` to be used for building constrained collection types.
In cases where the constraint is only a type proof (e.g. `BitSet`,
`String`, `Map`) they are on equal footing with unconstrained
collections.

As a proof of concept, I added a `BitSet` class. If you have an
`Iterable[Int]` you can now call `.to(BitSet)` with no overloading or
runtime overhead (like passing an additional implicit parameter).

Sort-of-problem: we need to fake unary type constructors for all
collection types, so instead of a `BitSet` you really get a
`BitSet with Iterable[T]`.

This still doesn’t help with the actual implementation of constrained
collection types (only with their companion objects).

Next step: Implicit lookup of companion objects and extending it to
constraints with a runtime component.
- Requires special syntax for `to` when building collections with
  a constraint value.

- No fallback to unconstrained parent type in poly-transforms
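
The following is a minimal, self-contained sketch of the idea from the second commit above, under assumed names (`FromIterable`, `IterableTo`, a toy `BitSet`); the strawman's real definitions may differ. The upper bound `B` on element types lets one `to` method serve unconstrained factories (`B = Any`) and constrained ones like `BitSet` (`B = Int`), including the "pretend" unary type constructor:

    import scala.annotation.unchecked.uncheckedVariance

    // Factories carry an upper bound B for their element types.
    trait FromIterable[+C[_ <: B], B] {
      def fromIterable[E <: B](it: Iterable[E]): C[E]
    }

    trait IterableTo[+A] { self: Iterable[A] =>
      // A single `to` for constrained and unconstrained factories alike:
      // no overloading, no extra implicit parameter at the call site.
      def to[C[_ <: B], B >: A](factory: FromIterable[C, B]): C[A @uncheckedVariance] =
        factory.fromIterable[A](this)
    }

    class BitSet // toy stand-in for the PR's BitSet implementation
    // The "pretend" unary type constructor: BitSet with Iterable[E] for E <: Int.
    object BitSet extends FromIterable[({ type L[E <: Int] = BitSet with Iterable[E] })#L, Int] {
      def fromIterable[E <: Int](it: Iterable[E]): BitSet with Iterable[E] = ???
    }

    // With the strawman's Iterable mixing in IterableTo:
    //   (i: Iterable[Int]).to(BitSet)    // compiles, yields BitSet with Iterable[Int]
    //   (i: Iterable[String]).to(BitSet) // rejected: B = Int is not a supertype of String
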
@szeiger
Contributor Author

szeiger commented Mar 3, 2017

I'm done with my experiments here, so I'll summarize what I learned in order to make an informed decision on how we want to integrate constrained collection types. Rather than just unconstrained and constrained, we can make further distinctions which may prove useful:

  • The simplest case: an unconstrained collection type with a unary type constructor `C[E]` that can be instantiated for any `E`.
  • A unary type constructor with a type constraint on the element type. Not all collection types are covariant, but their factories are, so this can always be viewed as an upper bound `C[E <: B]`. We don't have any instance of this case at the moment.
  • A collection type with an upper bound on the element type but no natural unary type constructor. For the purpose of getting a type constructor inferred from a value (e.g. a factory object passed to `Iterable.to`) we can "fake" a unary type constructor like `BitSet with Iterable[E]` for any `E <: Int`, so we can treat it like the previous case. This can be used to model `Map`, `BitSet` and `String`.
  • A collection type with a unary type constructor that requires implicit evidence (other than a generalized type constraint). Instances are `SortedSet[E : Ordering]`, its subclass `TreeSet[E : Ordering]`, and `Array[E : ClassTag]`. In the current collections library we also have `BitSet <: SortedSet[Int]`, but at least for the purpose of creating collections, `BitSet` is different from `SortedSet`-like types.
  • A collection type that combines multiple inherited constraints. This affects `SortedMap` / `TreeMap`, which combine the constraints of `Map` and `Sorted`.

In the first commit I implement a `ConstrainedIterableFactory` which duplicates `IterableFactory`, except that all methods take an additional `Build` typeclass similar to `CanBuild`. We don't need the full `CanBuildFrom` in this design because the required `Build` type is always specific to a collection type's companion object. There are always two possible results for the implicit lookup: if the implicit evidence (as determined by the companion object) is available, you get the most specific constrained type; otherwise the most specific unconstrained supertype. Callers can also request only the preferred type, so that `(s: BitSet).map(_.toString)` falls back to building a `Set[String]` while `(i: Iterable[String]).to(BitSet)` fails to compile instead of building a `Set[String]`.
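
As a sketch of this shape (hypothetical names again; `Build` repeats the form sketched in the description above), the poly-transforms thread the evidence through an implicit parameter, and only `to` insists on the constrained result:

    // Evidence that elements of type E can be collected into a To.
    trait Build[E, To] { def fromIterable(it: Iterable[E]): To }

    trait ConstrainedIterablePolyTransforms[+A] {
      // The Build instance found by implicit search determines the result
      // type: constrained if the evidence exists, unconstrained otherwise.
      def map[B, To](f: A => B)(implicit build: Build[B, To]): To = ???
    }

    // (s: BitSet).map(_.toString)      // no constrained instance for String:
    //                                  //   resolves the Set[String] fallback
    // (i: Iterable[String]).to(BitSet) // requests only the preferred type:
    //                                  //   compile error, no silent Set[String]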

One downside of this design is that constraints do not compose. My attempts to solve this problem inevitably led back to the full `CanBuildFrom`, but I think we can live with this limitation. The only problematic case is `SortedMap`, and maps are such a basic abstraction that we should add direct support for them. We need a `MapLike` abstraction anyway, so we may as well overload the `IterablePolyTransforms` methods specifically for maps. The necessary constraints for `Sorted` can then be added on top via the generic abstraction.

Another downside is the runtime overhead, which should be the same as for `CanBuildFrom`: all implicit evidence parameters need to be wrapped in a `Build` value, and the builder is constructed from an implicit parameter instead of the receiver of a poly-transform method, which gives HotSpot fewer options for optimization.

The final state of this PR after commit 3 shows the alternative: pass implicit evidence values directly to the overloaded methods in `ConstrainedIterablePolyTransforms`. This is just as generic as the `Build`-based version (because we can still abstract over the evidence type), but it avoids the runtime overhead and leads to simpler method signatures.

The big problem of this version is that there is no fallback. Assuming an implementation that doesn't rely on generalized type constraints, `(s: BitSet).map(i => i: Any)` compiles because overload resolution is driven by the types, but `(s: SortedSet[Int]).map(i => i: Any)` fails to compile because there is no `Ordering[Any]` and it is already too late to fall back to the other overload that doesn't require the evidence. (Note that default values for implicit parameters do not help; we need different return types, not just different implementations.)
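
A reconstruction of the overload pair and the failure mode described here, with hypothetical trait names standing in for the strawman's:

    import scala.collection.immutable.{Set, SortedSet}

    trait SetLike[+A] {
      def map[B](f: A => B): Set[B]
    }
    trait SortedSetLike[+A] extends SetLike[A] {
      // Same name, an extra evidence list, and a more specific result type.
      def map[B](f: A => B)(implicit ord: Ordering[B]): SortedSet[B]
    }

    // For a receiver typed as SortedSetLike[Int], `s.map(i => i: Any)` commits
    // to the evidence-taking overload during overload resolution; the search
    // for an Ordering[Any] then fails, with no way back to SetLike.map.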

This problem is independent of having a generic abstraction for constraints. All implementations of constrained collections that require implicit evidence will run into this issue, so the biggest question is: are we OK with this limitation, or do we need to go back to a `Build` typeclass approach? This will affect all `Sorted` collections (and only those, as far as I can tell; `Array` is immune because a `ClassTag` is available for every type).

For collections that only have type constraints, the actual collection type needs to implement specific overloads for those constraints. We cannot have generic supertraits because methods with a generic constraint would have the same erasure as the unconstrained overloads. We can provide such an abstraction for the collection factories though, which is what I did in the second commit. This requires an additional upper-bound type parameter in `FromIterable` and `IterableFactory`. With this in place we can write `Iterable.to` in a way that allows `(i: Iterable[Int]).to(BitSet)` to compile without any runtime overhead or special-casing, but I'm not sure the more complicated `FromIterable` really carries its weight here.
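
A minimal illustration of that erasure clash (hypothetical trait, assuming a `BitSet` type is in scope for the commented-out overload):

    trait BitSetOps {
      def map[B](f: Int => B): Set[B]           // unconstrained overload
      // def map[B <: Int](f: Int => B): BitSet // constrained overload: after
      //   erasure both are map(Function1), so this is a double definition
    }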

To summarize my recommendations and open questions:

  • We should not use any generic abstractions for `Map`.
  • Are we OK with operations on `Sorted` types failing to compile when the target type is not `Sorted` as well?
  • Is it worth adding an upper bound to `FromIterable`?

@Ichoran
Contributor

Ichoran commented Mar 4, 2017

I am not only okay with operations on `Sorted` types failing to compile when the target type is wrong; I actually think it is an improvement over the current state of affairs, where you can get all the way through a set of fluently chained methods and not realize that your assumption about being sorted was broken somewhere in the middle. I think we should view this as a feature, not a drawback.

However, being unable to map to your target at all is terribly inconvenient, so we should always provide a method that will upcast the type to something that has lost the constraint.

Regarding the FromIterable constraints, I tentatively think that it is worth it because special-casing element types seems like a natural use-case for extending collections. For instance, the XML library did exactly that.

I guess that for mutable collections we deal with element constraints as usual, by making collections with constrained element types invariant in the constrained parameter. For other constraints I think there is nothing extra to do: the in-place operations don't change the collection type, so previous evidence (e.g. an `Ordering`) still applies, and the not-in-place operations follow the same logic as here.

So I vote for the final strategy as the best alternative. I would like to hear from @julienrf as well, however.

@odersky
Contributor

odersky commented Mar 4, 2017

I agree that issuing type errors when constraints are lost is probably desirable. At least it is worth checking out in depth.

However, being unable to map to your target at all is terribly inconvenient, so we should always provide a method that will upcast the type to something that has lost the constraint.

Isn't that just a widening? I.e.

    val s: SortedSet[Int] = ???
    val f: Int => Any = ???
    s.map(f)             // fails, but
    (s: Set[Int]).map(f) // succeeds

If it's just that, I am OK with it!

@Ichoran
Contributor

Ichoran commented Mar 5, 2017

Yes, just a widening.

@odersky
Contributor

odersky commented Mar 5, 2017

@Ichoran @szeiger That seems perfectly reasonable then. 👍

@julienrf
Contributor

julienrf commented Mar 6, 2017

Are we OK with operations on Sorted types failing to compile when the target type is not Sorted as well?

Definitely.

Is it worth adding an upper bound to FromIterable?

I don’t think so.

Regarding the FromIterable constraints, I tentatively think that it is worth it because special-casing element types seems like a natural use-case for extending collections. For instance, the XML library did exactly that.

I’m not sure I want all the collection factories to extend this factory. I think it will be only marginally useful (e.g. for `BitSet` and `XmlNodes`); I would prefer the `List.apply` factory to take an unconstrained `A` type parameter rather than a useless `A <: Any` one.

@szeiger
Contributor Author

szeiger commented Mar 6, 2017

I would prefer the List.apply factory to take an unconstrained A type parameter rather than a useless A <: Any one.

What's the difference? An unconstrained `A` is automatically `A >: Nothing <: Any`.

@julienrf
Contributor

julienrf commented Mar 6, 2017

The difference is only visible in tools (autocompletion, scaladoc). But if you all think this is not a problem I’m happy to go with this solution.

@szeiger
Contributor Author

szeiger commented Mar 6, 2017

It looks to me like the tooling should be improved for this case rather than the API. Both IntelliJ and scaladoc show `fromIterable` without a type bound and the other methods with an explicit `<: Any`. The only difference between these cases is whether the type bound was inherited or defined locally. If you override `empty` as

  override def empty[E <: Any]: List[E] = ???

both IntelliJ and scaladoc show it without an explicit type bound, even though one was provided in the definition. I can't think of any reason why an inherited method should be treated differently.

@julienrf
Contributor

julienrf commented Mar 6, 2017

Still, if you “go to definition” you will see the actual source code that does have this useless upper bound. But, as I previously said, this is just a minor inconvenience in my opinion.

@szeiger
Contributor Author

szeiger commented Mar 8, 2017

New implementation of these ideas in #45

@szeiger closed this Mar 8, 2017