IntSet: reverse bitmap for faster comparison? #674

Closed
jwaldmann opened this issue Jul 28, 2019 · 5 comments

@jwaldmann
Contributor

This is just an idea to improve instance Ord IntSet (related to #470 ). It's quite a pervasive change, and it'd help only in a special case - that may occur frequently, though.

When all elements of the IntSet are small, the tree is in fact a single Tip prefix bitmap node. For just that special case, instance Eq IntSet is just two comparisons of machine words, but instance Ord IntSet (as suggested in #670) needs more ops (more than 10, see relateTipTip, relateBM).
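
To give a sense of where those extra operations come from, here is a minimal sketch of ordering two bitmaps that share a prefix, following the ordering of their ascending element lists. It is not the code from #670: `compareBM` and `elems` are hypothetical names, and a 64-bit bitmap in which bit x stands for element x is assumed.

```haskell
import Data.Bits (complement, testBit, xor, (.&.), (.|.))
import Data.Word (Word64)

-- Reference semantics: the ascending list of elements in a bitmap,
-- assuming bit x represents element x.
elems :: Word64 -> [Int]
elems bm = [i | i <- [0 .. 63], testBit bm i]

-- Hypothetical helper (not the containers implementation): order two
-- bitmaps the way `compare (elems a) (elems b)` would.
compareBM :: Word64 -> Word64 -> Ordering
compareBM a b
  | a == b = EQ
  -- a owns the lowest differing element d: a is smaller exactly when
  -- b still has some element above d, otherwise b is a proper prefix of a.
  | a .&. low /= 0 = if b .&. above /= 0 then LT else GT
  -- b owns d: the mirror image of the case above.
  | otherwise = if a .&. above /= 0 then GT else LT
  where
    diff = a `xor` b                         -- positions where the sets differ
    low = diff .&. negate diff               -- lowest differing bit
    above = complement (low .|. (low - 1))   -- all bits strictly above it
```

Counting the equality test, the xor, the isolation of the lowest differing bit, the mask and the final tests, this comes to roughly ten word operations, against two comparisons for equality.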

It would be much easier if compare (Tip p bm1) (Tip p bm2) = compare bm1 bm2
but since the comparison must have fromAscList semantics, we need
= compare (reverse bm1) (reverse bm2) (the implementation does not actually use revNat)

Instead of doing the reversal here, we could define bitmapOf x not as 2^x but 2^(wordSize - 1 - x)
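
The two encodings can be put side by side in a small sketch. The names `bitmapOfRev` and `reverseBits` are hypothetical, a 64-bit word is assumed, and x is assumed to be already reduced to the range 0 to 63 (the real `bitmapOf` in containers takes a full key). The sketch only checks that the proposed encoding is the bit reversal of the current one.

```haskell
import Data.Bits (bit, finiteBitSize, setBit, testBit)
import Data.List (foldl')
import Data.Word (Word64)

wordSize :: Int
wordSize = finiteBitSize (0 :: Word64)

-- Current encoding: element x (0 <= x < wordSize) sets bit x, i.e. 2^x.
bitmapOf :: Int -> Word64
bitmapOf x = bit x

-- Proposed encoding (hypothetical name): element x sets bit wordSize - 1 - x.
bitmapOfRev :: Int -> Word64
bitmapOfRev x = bit (wordSize - 1 - x)

-- Naive bit reversal, used here for checking only.
reverseBits :: Word64 -> Word64
reverseBits w =
  foldl' step 0 [0 .. wordSize - 1]
  where
    step acc i
      | testBit w i = setBit acc (wordSize - 1 - i)
      | otherwise = acc
```

With these definitions, `reverseBits (bitmapOf x) == bitmapOfRev x` holds for every x in range, so storing the reversed form up front would avoid reversing at comparison time.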

In the general case (comparing Tips that sit below Bin) we need the Relation result (that encodes 5 possible results) so there's no hope of doing this in one op.

I think that the underlying reason for all this is that some ops on machine words are uniform (direction does not matter, as in .&.), some are symmetric (two directions, but identical cost, e.g., shift-left, shift-right), but some are asymmetric (one direction, the other one is missing: lexicographic comparison, carry propagation in arithmetical operations).

Now everything regarding bitmaps (not prefixes!) in IntSet is uniform or symmetric - except for this comparison?

@treeowl
Contributor

treeowl commented Jul 28, 2019

I remember those options being discussed in the original paper; you may want to take a look and see to what extent the times have changed. Another thought, for IntSet but not IntMap: what if we store the mask index (instead of the mask) in the low bits of the prefix? That lets us drop a word from each Bin node. Some extra operations may be required in some places, so I don't know how performance will compare, ultimately.
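
One possible reading of that packing idea, as a rough sketch: it assumes 64-bit words and that the six low bits of a Bin prefix are always zero (plausible because a Tip bitmap already covers the low six bits of each key, but an assumption here). The names `pack` and `unpack` are hypothetical and do not exist in containers.

```haskell
import Data.Bits (bit, complement, (.&.), (.|.))
import Data.Word (Word64)

-- The six low bits are assumed to be free in every Bin prefix.
indexBits :: Word64
indexBits = 63

-- Store the index of the branching bit (0..63) in the free low bits.
pack :: Word64 -> Int -> Word64
pack prefix maskIx = prefix .|. fromIntegral maskIx

-- Recover (prefix, mask); rebuilding the mask costs an extra shift.
unpack :: Word64 -> (Word64, Word64)
unpack packed =
  ( packed .&. complement indexBits
  , bit (fromIntegral (packed .&. indexBits))
  )
```

The and-plus-shift in `unpack` is the kind of extra operation mentioned above; whether it pays for the saved word would need measuring.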

@jwaldmann
Contributor Author

"options being discussed in the original paper" - you mean Okasaki and Gill 1998 https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.37.5452 ?

This is for Maps, not Sets, so they don't have bitmaps in the leaves.

They discuss little endian vs. big endian patricia trees - and make this remark regarding asymmetric operations: "there does not appear to be a clever bit-twiddling solution to calculate the highest one bit in a number, as there was for the lowest one bit".

This point is moot now since containers uses clz/ctz primops?
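
For illustration, a small sketch of both directions using `countTrailingZeros` and `countLeadingZeros` from `Data.Bits`, next to the classic lowest-bit trick. The function names are made up for this example, inputs are assumed to be nonzero, and the sketch does not show how containers itself uses these operations.

```haskell
import Data.Bits (bit, countLeadingZeros, countTrailingZeros, finiteBitSize, (.&.))
import Data.Word (Word64)

-- Lowest set bit via the two's-complement trick (x must be nonzero).
lowestBitTrick :: Word64 -> Word64
lowestBitTrick x = x .&. negate x

-- Lowest set bit via count-trailing-zeros (x must be nonzero).
lowestBit :: Word64 -> Word64
lowestBit x = bit (countTrailingZeros x)

-- Highest set bit via count-leading-zeros (x must be nonzero).
highestBit :: Word64 -> Word64
highestBit x = bit (finiteBitSize x - 1 - countLeadingZeros x)
```

For example, with x = 20 (bits 2 and 4 set), `lowestBit` and `lowestBitTrick` both give 4 and `highestBit` gives 16.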

I am not seeing any reference to bitwise tricks in Morrison 1968 http://www.mathcs.emory.edu/~cheung/papers/XML/PatriciaTrie-JACM1968.pdf (if you meant that with "original paper")

Before we go any further with this idea (of reversing bitmaps for faster instance Ord IntSet) I will look at what happens with type IntSet = Word in my NFA->DFA benchmark (where I know that all the sets have small elements).

jwaldmann added a commit to jwaldmann/containers that referenced this issue Aug 2, 2019
@jwaldmann
Contributor Author

type IntSet = Word

see https://github.com/jwaldmann/containers/blob/intset%3Dword/containers-tests/benchmarks/IntSet.hs#L53

this gives results like

benchmarked instanceOrd:dense/16/@IntSet
time                 336.4 ms   (333.0 ms .. 341.3 ms)

benchmarked instanceOrd:dense/16/@Word
time                 274.1 ms   (272.8 ms .. 274.9 ms)

I conclude that a more efficient implementation of instance Ord IntSet
can at best save another 20 percent runtime w.r.t. the current proposal
(for the case that the sets are really Tips).

I am not really certain about these data: I sprinkled the code with INLINE and SPECIALIZE pragmas. Some of them do change the runtimes. I have no precise idea why.

As I currently do not have an actual use case (I deal with automata sometimes, but I don't need NFA->DFA right now), I will not push this any further.

@treeowl
Contributor

treeowl commented Aug 2, 2019 via email

@meooow25
Contributor

> Now everything regarding bitmaps (not prefixes!) in IntSet is uniform or symmetric - except for this comparison?

The other asymmetric operation I'm aware of is iteration: it is faster to iterate low to high bit than vice versa. This translates to foldr being a little faster than foldl and foldl' being a little faster than foldr' (see #1079 for some benchmarks). Reversing the bitmap would flip this difference.
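
The two iteration directions can be sketched as follows. This is a simplified illustration with hypothetical function names, assuming a 64-bit bitmap in which bit x stands for element x. It is not the folding code in containers and makes no attempt to reproduce the timing difference.

```haskell
import Data.Bits (clearBit, countLeadingZeros, countTrailingZeros, finiteBitSize, (.&.))
import Data.Word (Word64)

-- Walk from the lowest set bit upwards: bm .&. (bm - 1) clears the
-- lowest set bit without needing its index.
elemsLowToHigh :: Word64 -> [Int]
elemsLowToHigh 0 = []
elemsLowToHigh bm = countTrailingZeros bm : elemsLowToHigh (bm .&. (bm - 1))

-- Walk from the highest set bit downwards: the index has to be computed
-- first and is then used to clear that bit.
elemsHighToLow :: Word64 -> [Int]
elemsHighToLow 0 = []
elemsHighToLow bm = i : elemsHighToLow (clearBit bm i)
  where
    i = finiteBitSize bm - 1 - countLeadingZeros bm
```

For the bitmap 22 (bits 1, 2 and 4 set), the first function yields `[1,2,4]` and the second `[4,2,1]`.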

Since people tend to prefer foldr and foldl' (carried over from lists), I think the cost outweighs the benefit that we would get on comparing Tips.

> As I currently do not have an actual use case (I deal with automata sometimes, but I don't need NFA->DFA right now), I will not push this any further.

Thanks for proposing this anyway, it is an interesting idea to consider. Please reopen if you have more to discuss on this.
