Skip to content

A new List.map that is both stack-safe and fast #131

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
paulyoung opened this issue Sep 22, 2017 · 1 comment
Closed

A new List.map that is both stack-safe and fast #131

paulyoung opened this issue Sep 22, 2017 · 1 comment

Comments

@paulyoung
Copy link

I came across this and thought it was worth sharing: https://discuss.ocaml.org/t/a-new-list-map-that-is-both-stack-safe-and-fast/865

matthewleon added a commit to matthewleon/purescript-lists that referenced this issue Nov 21, 2017
Addresses purescript#131

The relevant chunk sizes (5 for the initial list segment), (3 for the
tail-recursive remainder) were arrived at through benchmarked
experimentation, mapping a simple (_ + 1) through lists of various
sizes.

Relevant figures:
list of 1000 elems:   142.61 μs -> 36.97 μs
list of 2000 elems:   275.17 μs -> 55.33 μs
list of 10000 elems:  912.73 μs -> 208.39 μs
list of 100000 elems: 34.56 ms  -> 1.24 ms

The ~30x speed increase for long lists is probably explained by the lack
of GC thrashing with this approach.
matthewleon added a commit to matthewleon/purescript-lists that referenced this issue Nov 21, 2017
Addresses purescript#131

The relevant chunk sizes (5 for the initial list segment), (3 for the
tail-recursive remainder) were arrived at through benchmarked
experimentation, mapping a simple (_ + 1) through lists of various
sizes.

Relevant figures:
list of 1000 elems:   142.61 μs -> 36.97 μs
list of 2000 elems:   275.17 μs -> 55.33 μs
list of 10000 elems:  912.73 μs -> 208.39 μs
list of 100000 elems: 34.56 ms  -> 1.24 ms

The ~30x speed increase for long lists is probably explained by the lack
of GC thrashing with this approach.
matthewleon added a commit to matthewleon/purescript-lists that referenced this issue Nov 21, 2017
Addresses purescript#131

The relevant chunk sizes (5 for the initial list segment), (3 for the
tail-recursive remainder) were arrived at through benchmarked
experimentation, mapping a simple (_ + 1) through lists of various
sizes.

Relevant figures:
list of 1000 elems:   142.61 μs -> 36.97 μs
list of 2000 elems:   275.17 μs -> 55.33 μs
list of 10000 elems:  912.73 μs -> 208.39 μs
list of 100000 elems: 34.56 ms  -> 1.24 ms

The ~30x speed increase for long lists is probably explained by the lack
of GC thrashing with this approach.

Benchmarked on 2017 Macbook Pro, 2.3 GHz Intel Core i5, 8 GB RAM.
macOS Sierra 10.12.6
node v8.9.1
matthewleon added a commit to matthewleon/purescript-lists that referenced this issue Nov 21, 2017
Addresses purescript#131

The relevant chunk sizes (5 for the initial list segment), (3 for the
tail-recursive remainder) were arrived at through benchmarked
experimentation, mapping a simple (_ + 1) through lists of various
sizes.

Relevant figures:
list of 1000 elems:   142.61 μs -> 36.97 μs
list of 2000 elems:   275.17 μs -> 55.33 μs
list of 10000 elems:  912.73 μs -> 208.39 μs
list of 100000 elems: 34.56 ms  -> 1.24 ms

The ~30x speed increase for long lists is probably explained by the lack
of GC thrashing with this approach.

Benchmarked on 2017 Macbook Pro, 2.3 GHz Intel Core i5, 8 GB RAM.
macOS Sierra 10.12.6
node v8.9.1
matthewleon added a commit to matthewleon/purescript-lists that referenced this issue Nov 23, 2017
Addresses purescript#131

The relevant chunk sizes (5 for the initial list segment), (3 for the
tail-recursive remainder) were arrived at through benchmarked
experimentation, mapping a simple (_ + 1) through lists of various
sizes.

Relevant figures:
list of 1000 elems:   142.61 μs -> 36.97 μs
list of 2000 elems:   275.17 μs -> 55.33 μs
list of 10000 elems:  912.73 μs -> 208.39 μs
list of 100000 elems: 34.56 ms  -> 1.24 ms

The ~30x speed increase for long lists is probably explained by the lack
of GC thrashing with this approach.

Benchmarked on 2017 Macbook Pro, 2.3 GHz Intel Core i5, 8 GB RAM.
macOS Sierra 10.12.6
node v8.9.1
matthewleon added a commit to matthewleon/purescript-lists that referenced this issue Dec 4, 2017
Addresses purescript#131

The relevant chunk sizes (5 for the initial list segment), (3 for the
tail-recursive remainder) were arrived at through benchmarked
experimentation, mapping a simple (_ + 1) through lists of various
sizes.

Relevant figures:
list of 1000 elems:   142.61 μs -> 36.97 μs
list of 2000 elems:   275.17 μs -> 55.33 μs
list of 10000 elems:  912.73 μs -> 208.39 μs
list of 100000 elems: 34.56 ms  -> 1.24 ms

The ~30x speed increase for long lists is probably explained by the lack
of GC thrashing with this approach.

Benchmarked on 2017 Macbook Pro, 2.3 GHz Intel Core i5, 8 GB RAM.
macOS Sierra 10.12.6
node v8.9.1
hdgarrood added a commit that referenced this issue Mar 10, 2019
* List Functor: mix unrolled and reverse map

Addresses #131

The relevant chunk sizes (5 for the initial list segment), (3 for the
tail-recursive remainder) were arrived at through benchmarked
experimentation, mapping a simple (_ + 1) through lists of various
sizes.

Relevant figures:
list of 1000 elems:   142.61 μs -> 36.97 μs
list of 2000 elems:   275.17 μs -> 55.33 μs
list of 10000 elems:  912.73 μs -> 208.39 μs
list of 100000 elems: 34.56 ms  -> 1.24 ms

The ~30x speed increase for long lists is probably explained by the lack
of GC thrashing with this approach.

Benchmarked on 2017 Macbook Pro, 2.3 GHz Intel Core i5, 8 GB RAM.
macOS Sierra 10.12.6
node v8.9.1

* initial benchmarks for List.map

2017 MacBook Pro 2.3 GHz Intel Core i5, 8 GB 2133 MHz LPDDR3
Node v8.9.1

List
====
map
---
map: empty list
mean   = 1.31 μs
stddev = 11.87 μs
min    = 799.00 ns
max    = 375.82 μs
map: singleton list
mean   = 2.40 μs
stddev = 11.03 μs
min    = 1.03 μs
max    = 342.18 μs
map: list (1000 elems)
mean   = 143.41 μs
stddev = 225.12 μs
min    = 97.16 μs
max    = 2.03 ms
map: list (2000 elems)
mean   = 274.16 μs
stddev = 295.84 μs
min    = 199.66 μs
max    = 2.06 ms
map: list (5000 elems)
mean   = 531.84 μs
stddev = 512.61 μs
min    = 229.45 μs
max    = 2.95 ms
map: list (10000 elems)
mean   = 895.24 μs
stddev = 777.87 μs
min    = 464.59 μs
max    = 2.94 ms
map: list (100000 elems)
mean   = 33.45 ms
stddev = 7.65 ms
min    = 22.07 ms
max    = 63.47 ms

* style tweak

* test stack-safety of strict map

* lower unrolled map iteration limit

this lower the probability of stack-size troubles

* restore un-exported functions from Data.List.Types

* add failing map test

* fix a logic error in List.map chunkedRevMap

* make map stack safe(r) again

begin with reverse unrolled map

* Update for 0.12 id -> identity

* Update benchmark code for 0.12

* Remove outdated comment

* 🤦‍♂️
@hdgarrood
Copy link
Contributor

Fixed by #135 / #157.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants