Require an Order instance for NonEmptyList's groupBy function #1964
2516f36 to f5397be
|
Hmmm, |
|
We could use Hash but we don’t yet have containers that use Hash. |
|
@johnynek I suggested that we could use the hashcode from |
|
we have to return a This seems a bit complex to me, but it could work. So something like:
class MapFromHashBuckets[K, V](h: Hash[K], m: Map[Int, List[(K, NonEmptyList[V])]]) extends Map[K, NonEmptyList[V]] {
  def get(k: K): Option[NonEmptyList[V]] = m.get(h.hash(k)).flatMap {
    case Nil => None
    case candidates => candidates.collectFirst { case (k0, vs) if h.eqv(k, k0) => vs }
  }
}
then we just need to implement the other 3 methods of Map. I guess it might not be so bad if we have tests. Probably better to use |
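The bucket idea above can be sketched with the standard library alone. This is only an illustration under stated assumptions: `SimpleHash` is a hypothetical stand-in for `cats.kernel.Hash`, and `HashBuckets`, `put`, `get`, and `caseInsensitive` are names invented here, not anything from the PR.

```scala
// Hypothetical stand-in for cats.kernel.Hash, so the sketch needs no dependencies.
trait SimpleHash[K] {
  def hash(k: K): Int
  def eqv(x: K, y: K): Boolean
}

object HashBuckets {
  // Illustrative instance: case-insensitive hashing and equality on strings.
  val caseInsensitive: SimpleHash[String] = new SimpleHash[String] {
    def hash(k: String): Int = k.toLowerCase.hashCode
    def eqv(x: String, y: String): Boolean = x.toLowerCase == y.toLowerCase
  }

  // Insert into the bucket for hash(k), replacing any entry whose key is eqv to k.
  def put[K, V](buckets: Map[Int, List[(K, V)]], k: K, v: V)(h: SimpleHash[K]): Map[Int, List[(K, V)]] = {
    val bucket = buckets.getOrElse(h.hash(k), Nil).filterNot { case (k0, _) => h.eqv(k, k0) }
    buckets.updated(h.hash(k), (k, v) :: bucket)
  }

  // Locate the bucket by hash, then confirm the key with eqv.
  def get[K, V](buckets: Map[Int, List[(K, V)]], k: K)(h: SimpleHash[K]): Option[V] =
    buckets.get(h.hash(k)).flatMap(_.collectFirst { case (k0, v) if h.eqv(k, k0) => v })
}
```

The outer `Map[Int, ...]` still uses universal equality, but only on `Int` hash codes, where that is harmless; every comparison on `K` goes through the supplied instance.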
|
Update: Please disregard. The two maps below aren't consistent, as @igstan pointed out.
def groupBy[B](f: A => B)(implicit B: Hash[B]): Map[B, NonEmptyList[A]] = {
val m = mutable.LongMap.empty[(B, mutable.Builder[A, List[A]])]
for { elem <- toList } {
val b = f(elem)
m.getOrElseUpdate(B.hash(b), (b, List.newBuilder[A]))._2 += elem
}
val b = immutable.Map.newBuilder[B, NonEmptyList[A]]
for { (k, bs) <- m.values } {
val head :: tail = bs.result // we only create non empty list inside of the map `m`
b += ((k, NonEmptyList(head, tail)))
}
b.result
}
|
|
I was going to propose the exact same thing that @kailuowang did in the last snippet. However, I'm thinking now that whatever constraint we're going to put on |
|
@igstan ah you are right. I guess we have to return a wrapper or a new implementation of Map if we really want to do this. |
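One way to read the inconsistency discussed here: grouping through a custom hash and then copying the result into a plain immutable `Map` leaves lookups governed by universal equality again. The sketch below assumes a hypothetical case-insensitive hash (`ciHash`) in place of a real `Hash[String]` instance; the object and method names are invented for illustration.

```scala
import scala.collection.mutable

object InconsistentMaps {
  // Hypothetical case-insensitive hash, standing in for a Hash[String] instance.
  def ciHash(s: String): Int = s.toLowerCase.hashCode

  // Groups with the custom hash, then copies the groups into a plain immutable Map.
  def groupCaseInsensitive(xs: List[String]): Map[String, List[String]] = {
    val m = mutable.LongMap.empty[(String, mutable.Builder[String, List[String]])]
    xs.foreach { s =>
      m.getOrElseUpdate(ciHash(s).toLong, (s, List.newBuilder[String]))._2 += s
    }
    m.values.map { case (k, bs) => (k, bs.result()) }.toMap
  }

  val grouped: Map[String, List[String]] = groupCaseInsensitive(List("foo", "FOO", "bar"))

  // The grouping treated "foo" and "FOO" as the same key...
  val underFoo: Option[List[String]] = grouped.get("foo")
  // ...but the resulting Map looks keys up with equals/hashCode, so "FOO" is not found.
  val underUpperFoo: Option[List[String]] = grouped.get("FOO")
}
```

The grouping step and the returned map disagree about which keys are equal, which is why a wrapper or a different map implementation comes up in the replies.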
|
@kailuowang I'm actually thinking of overriding |
|
Sounds like we are moving into creating our own version of |
|
why don't we do the |
|
The |
|
Right. I can change to an immutable |
|
BTW, I've tried the overriding version and can't do it because immutable.HashMap is sealed. |
|
Yeah, I was assuming we would use TreeMap
On Wed, Oct 11, 2017 at 08:55 Ionuț G. Stan ***@***.***> wrote:
BTW, I've tried the overriding version and can't do it because
immutable.HashMap is sealed.
|
|
I'm going to force-push the changes as soon as I finish running the tests locally. |
f5397be to 685e949
|
Force-pushed. |
|
Not sure why the build is failing, but seems to be a spurious one. |
|
There doesn't seem to be a |
|
Oh, silly me. I didn't see that error message when I looked earlier. Suddenly this "low-hanging fruit" bug fix seems to be very high :) |
|
Ok, so here's a summary of the issues so far. If we want to go ahead with If we want to go ahead with Not sure how to reconcile these. I'm out of ideas right now. |
|
Well, we could of course drop the |
|
That brings back my point of such map already exists in dog |
|
Then maybe |
Codecov Report
@@ Coverage Diff @@
## master #1964 +/- ##
==========================================
+ Coverage 96.07% 96.08% +<.01%
==========================================
Files 273 273
Lines 4539 4541 +2
Branches 122 119 -3
==========================================
+ Hits 4361 4363 +2
Misses 178 178
|
|
I've changed to use just an immutable |
    b.result
  }

  m.mapValues(v => NonEmptyList.fromListUnsafe(v.result))
IIRC, mapValues is by-name and full of surprises and not something you would ever want to use.
The alternative would be to construct another TreeMap. Calling map instead of mapValues wouldn't work because we'd have to call toMap on it at the end, and that uses the universal equality and hashing we're trying to get rid of.
What do you think?
BTW, I never had problems with mapValues, which is why I used it.
Another surprise of .mapValues is that the result is not Serializable.
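The surprise with `mapValues` seems to be that it returns a lazy view rather than a new map, so the function is re-run on every lookup. A small sketch using only the standard library; the object name and counters are invented for illustration. On Scala 2.13 and later the call compiles with a deprecation warning (`.view.mapValues` is the suggested replacement and is just as lazy).

```scala
object MapValuesLaziness {
  var viewCalls = 0
  var strictCalls = 0

  val source: Map[String, Int] = Map("a" -> 1)

  // mapValues wraps the original map: the function runs again on each lookup.
  val viewed = source.mapValues { v => viewCalls += 1; v * 2 }

  // map builds a new strict Map: the function runs once per entry, up front.
  val strict = source.map { case (k, v) => strictCalls += 1; (k, v * 2) }

  // Look each key up twice to expose the difference.
  viewed("a"); viewed("a")
  strict("a"); strict("a")
}
```

After the two lookups, `viewCalls` is 2 while `strictCalls` stays at 1, which matters when the mapped function is expensive or, as here, drains a builder.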
Why don't you use Functor[Map].map(m)(v => NonEmptyList.fromListUnsafe(v.result))? :)
Erm, of course meant:
Functor[Map[B, ?]].map(m)(v => NonEmptyList.fromListUnsafe(v.result))
Ok, I think I was wrong when I said we can't use map instead of mapValues. The result of TreeMap.map is a TreeMap as well, which means the ordering is preserved. I'll change to that.
I don't think we need the redundant Functor[Map[B, ?]] summoning here.
    b.result
  }

  m.map { case (k, v) => (k, NonEmptyList.fromListUnsafe(v.result)) }
can we ascribe a type to this to make sure we are not converting to using hashCode and equals at the end?
m.map { case (k, v) => (k, NonEmptyList.fromListUnsafe(v.result)) } : TreeMap[B, NonEmptyList[A]]
Sure. One further thought. Shall we also change the return type to SortedMap? That would explain why we require an Order instance for B.
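A minimal sketch of the ascription idea, using only the standard library. `Key`, `keyOrdering`, and the object name are hypothetical; the ordering is deliberately descending so that a result sorted some other way would be easy to spot.

```scala
import scala.collection.immutable.TreeMap

object TreeMapMapDemo {
  final case class Key(id: Int)

  // Descending by id. Key has no default Ordering, so this is the only candidate.
  implicit val keyOrdering: Ordering[Key] = Ordering.by[Key, Int](k => -k.id)

  val builders: TreeMap[Key, List[String]] =
    TreeMap(Key(1) -> List("a"), Key(3) -> List("c", "d"), Key(2) -> List("b"))

  // The declared type makes compilation fail if map ever yields a plain Map.
  val mapped: TreeMap[Key, Int] = builders.map { case (k, v) => (k, v.length) }

  // Keys come back in the order defined by keyOrdering.
  val keysInOrder: List[Int] = mapped.keys.toList.map(_.id)
}
```

One caveat worth keeping in mind: as far as I can tell, `map` builds the new `TreeMap` from the implicit `Ordering` in scope at the call site, which is the same instance here only because it is the one implicit available.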
     */
-   def groupBy[B](f: A => B): Map[B, NonEmptyList[A]] = {
-     val m = mutable.Map.empty[B, mutable.Builder[A, List[A]]]
+   def groupBy[B](f: A => B)(implicit B: Order[B]): Map[B, NonEmptyList[A]] = {
Can we add to the scaladoc a brief description of why Order is required?
|
actually, yes, I think that's a good idea. It makes it clear the `Order` is
being used and it allows callers to leverage that fact, which is nice.
Incidentally, `SortedMap` is a pretty nice type for other reasons. It
doesn't have some of the issues `Map` and `Set` have (see #1831)
…On Thu, Oct 12, 2017 at 8:46 AM, Ionuț G. Stan ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In core/src/main/scala/cats/data/NonEmptyList.scala
<#1964 (comment)>:
>
+ m.map { case (k, v) => (k, NonEmptyList.fromListUnsafe(v.result)) }
Sure. One further thought. Shall we also change the return type to
SortedMap? That would explain why we require an Order instance for B.
|
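As a rough illustration of what callers gain from a `SortedMap` return type, the sketch below uses only the standard library; the data and names are made up for the example.

```scala
import scala.collection.immutable.SortedMap

object SortedMapCallers {
  // Words grouped by length, as a groupBy returning a SortedMap might produce.
  val byLength: SortedMap[Int, List[String]] =
    SortedMap(5 -> List("apple", "melon"), 3 -> List("fig"), 4 -> List("kiwi", "pear"))

  // The static type guarantees ordered access without re-sorting.
  val smallest: Int = byLength.firstKey
  val largest: Int = byLength.lastKey
  val orderedKeys: List[Int] = byLength.keys.toList
}
```

With a plain `Map` return type, none of these would be available without sorting the keys again at the call site.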
     */
    def toNel: Option[NonEmptyList[A]] = NonEmptyList.fromList(la)
-   def groupByNel[B](f: A => B): Map[B, NonEmptyList[A]] =
+   def groupByNel[B : Order](f: A => B): Map[B, NonEmptyList[A]] =
shouldn't the type here also be SortedMap?
Yes, it should. Working on it.
Well... current status: yak-shaving inside NonEmptyTraverse doctests, where SortedMap needs an Apply instance.
I've pushed something that addresses this. I've fixed the NonEmptyTraverse doctests by upcasting to a vanilla Map for which there's an Apply instance. I'm not completely satisfied with this approach, but it's the best I've got right now.
I think making an Apply and Traversable for SortedMap is probably out of scope, so I think that's okay.
This addresses: #1959
I didn't see a way to require just an `Eq` instance for `B` while preserving the performance characteristics. A `mutable.Map` uses more than just universal equality, it also uses the built-in JVM hashing. So, what I did was to switch to a `mutable.TreeMap` which requires a `scala.math.Ordering` instance, and we get that by constraining `B` to have a `cats.Order` instance.
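The approach described above can be approximated without cats. This is a sketch under assumptions, not the code merged in the PR: `Nel` is a hypothetical stand-in for `cats.data.NonEmptyList`, `scala.math.Ordering` is used directly in place of `cats.Order`, and the return type follows the `SortedMap` suggestion from the review.

```scala
import scala.collection.immutable.SortedMap
import scala.collection.mutable

object GroupBySketch {
  // Hypothetical stand-in for cats.data.NonEmptyList.
  final case class Nel[A](head: A, tail: List[A]) {
    def toList: List[A] = head :: tail
  }

  // scala.math.Ordering stands in for cats.Order in this sketch.
  def groupBy[A, B](nel: Nel[A])(f: A => B)(implicit B: Ordering[B]): SortedMap[B, Nel[A]] = {
    // The TreeMap places and finds keys by comparing them with the supplied Ordering.
    val m = mutable.TreeMap.empty[B, mutable.Builder[A, List[A]]]
    nel.toList.foreach { a =>
      m.getOrElseUpdate(f(a), List.newBuilder[A]) += a
    }
    // Each builder received at least one element, so the Nil case is never hit in practice.
    m.foldLeft(SortedMap.empty[B, Nel[A]]) { case (acc, (k, builder)) =>
      builder.result() match {
        case h :: t => acc.updated(k, Nel(h, t))
        case Nil    => acc
      }
    }
  }
}
```

For example, `GroupBySketch.groupBy(GroupBySketch.Nel(12, List(-2, 3, -5)))(_ >= 0)` groups the negatives under `false` and the non-negatives under `true`, with `false` first because that is how `Ordering[Boolean]` sorts.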