-
Notifications
You must be signed in to change notification settings - Fork 182
Speed up folds on Sequences #510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
(oh, and I noticed a typo I had added in the sequence benchmark file) |
Just a couple more things to check:
1. How are the foldl and foldr default definitions using your new foldMap?
2. How are the foldl' and foldr' default definitions using your new foldr
and foldl?
I don't imagine we'll be able to use the defaults, but it's worth a shot to
keep the already-enormous source code size down.
…On Tue, Jan 23, 2018 at 3:10 PM, Donnacha Oisín Kidney < ***@***.***> wrote:
(oh, and I noticed a typo I had added in the sequence benchmark file)
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#510 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABzi_ducqfSjOOFE0GK7lOGEYNjuZQtiks5tNjzSgaJpZM4RqPjy>
.
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much better. I really like the fact that the Seq
and Elem
business stays out of the FingerTree
code. There are a few other places where I ended up mashing those together because I couldn't find a better way. If you think we can unmash them, it would be really awesome to split off an entire Data.Sequence.Internal.FingerTree
module. I suspect doing so may reduce the amount of recompilation we have to do to run the test suite.
Data/Sequence/Internal.hs
Outdated
(.#) f _ = coerce f | ||
#else | ||
(.#) :: (b -> c) -> (a -> b) -> a -> c | ||
(.#) f g = \x -> f (g x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be written (.#) = (.)
. Should we add a hidden Coercions
module in Utils
for this and (#.)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that'd be a good idea. Another candidate for it would be liftLeftFold
(or something, I'm not sure of the convention) which could be instead of lift_elem
Any sense of why the old versions of |
I'm not sure your
I want things like this to be efficient, if possible: f x0 y0 z0 as= foldr go stop as x0 y0 z0 where
go a r !x !y !z = ...
stop !x !y !z = ...
g x0 as = foldr go stop as x0 where
go a r !x = let !(a',x') = p a x in a' : r x'
stop !x = [] |
The old |
Oh, sorry, my memory was out of date. We used to use the defaults for |
FYI: two mashed places that come to mind are |
I'll look for a good candidate for multiple strict accumulators for
|
GHC 7.8 needs a different implementation of |
Actually, we don't need to be as fancy as |
So I've taken some of the "FB" forms of functions from Data.List: foldrTake :: Int -> Seq.Seq Int -> [Int]
foldrTake n xs = foldr (\x xs m -> case m of 1 -> [x]; _ -> x : xs (m-1)) (const []) xs n
foldrScanl :: Seq.Seq Int -> [Int]
foldrScanl bs = 0 : foldr (\b g -> oneShot (\x -> let !b' = x + b in b' : g b')) (const []) bs 0 And here are the results:
|
All right. I'll give this one more look later and merge. Great work.
…On Jan 23, 2018 4:45 PM, "Donnacha Oisín Kidney" ***@***.***> wrote:
So I've taken some of the "FB" forms of functions from Data.List:
foldrTake :: Int -> Seq.Seq Int -> [Int]
foldrTake n xs = foldr (\x xs m -> case m of 1 -> [x]; _ -> x : xs (m-1)) (const []) xs n
foldrScanl :: Seq.Seq Int -> [Int]
foldrScanl bs = 0 : foldr (\b g -> oneShot (\x -> let !b' = x + b in b' : g b')) (const []) bs 0
And here are the results:
benchmarking 500000/foldrScanl/new
time 13.35 ms (12.04 ms .. 14.71 ms)
0.963 R² (0.933 R² .. 0.987 R²)
mean 14.09 ms (13.62 ms .. 14.76 ms)
std dev 1.413 ms (1.070 ms .. 1.841 ms)
variance introduced by outliers: 48% (moderately inflated)
benchmarking 500000/foldrScanl/old
time 55.00 ms (52.27 ms .. 58.46 ms)
0.993 R² (0.985 R² .. 0.997 R²)
mean 47.31 ms (44.97 ms .. 49.78 ms)
std dev 4.565 ms (3.825 ms .. 5.301 ms)
variance introduced by outliers: 37% (moderately inflated)
benchmarking 500000/foldrTake/new
time 6.626 ms (6.199 ms .. 7.081 ms)
0.952 R² (0.911 R² .. 0.980 R²)
mean 6.061 ms (5.839 ms .. 6.432 ms)
std dev 871.2 μs (629.0 μs .. 1.357 ms)
variance introduced by outliers: 73% (severely inflated)
benchmarking 500000/foldrTake/old
time 23.09 ms (21.65 ms .. 25.00 ms)
0.981 R² (0.960 R² .. 0.997 R²)
mean 22.55 ms (21.52 ms .. 23.37 ms)
std dev 1.999 ms (1.438 ms .. 2.884 ms)
variance introduced by outliers: 35% (moderately inflated)
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#510 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABzi_ZeIZI-v7VkwrhOlZ8GDEcBxRsYeks5tNlMNgaJpZM4RqPjy>
.
|
Just a couple notes on this pull request:
|
Let's leave the strictness alone for right now. That needs a separate PR,
and probably also discussion on the libraries list. And likely a major
version bump. Blech.
…On Jan 24, 2018 1:55 PM, "Donnacha Oisín Kidney" ***@***.***> wrote:
Just a couple notes on this pull request:
-
This includes the changes to the strictness of foldr' and foldl' in
the initial accumulator. Removing those changes isn't a problem (it doesn't
affect performance, as far as I can tell), and it might be better to put
those in their own pull request (and add some tests on them also).
-
From some initial benchmarks, it looks like other functions were also
suffering from the lack of inlining and specialisation that was affecting
the folds. In particular, traverse: a simplistic implementation (using
the new foldr):
traverse f = foldr (liftA2 (<|) . f) (pure empty)
looks like it's slightly faster than the current one.
-
I would like to split out the finger tree stuff from the rest of it
also, somehow (I think it would be cool to be able to use it like
Data.FingerTree), although I'm not at all familiar with the Applicative
and splitMap code yet.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#510 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABzi_S6jIUvaVm269s7iHiBX0TXVVl7Qks5tN3ycgaJpZM4RqPjy>
.
|
Sounds good! I'll change it to the old strictness, and add tests for it |
Sweet. |
As referenced in #504.
After noticing that
foldMapWithIndex
was significantly faster thanfoldMap
, I rewrote theFoldable
methods to mimic the style offoldMapWithIndex
. Writing the first level of recursion out on the finger tree manually, and specialising the folds on nodes and digits manually, yields a significant speedup. From my testing, there's a ~3.8x speedup forfoldMap
, a ~1.8x speedup forfoldl'
andfoldr'
, and an ~8x speedup forfoldl
andfoldr
.These are the functions benchmarked:
And these are the results, when run on random sequences of length 500000:
For reference, the
foldMapWithIndex
function could be used to sum as well:And these are its times (unchanged between old/new):