-
Notifications
You must be signed in to change notification settings - Fork 18k
runtime: preallocate some overflow buckets when map size hint provided #19931
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Started on this. Getting a good estimator is proving to be non-trivial; it depends on both the absolute size and how close the resulting map is to loadFactor. |
This simplifies the code, as well as providing a single place to modify to change the allocation of new overflow buckets. Updates #19931 Updates #19992 Change-Id: I77070619f5c8fe449bbc35278278bca5eda780f2 Reviewed-on: https://go-review.googlesource.com/40975 Run-TryBot: Josh Bleecher Snyder <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Keith Randall <[email protected]>
Any change to how we allocate overflow buckets will require some extra hmap storage, but we don't want hmap to grow, particular as small maps usually don't need overflow buckets. This CL converts the existing hmap overflow field, which is usually used for pointer-free maps, into a generic extra field. This extra field can be used to hold data that is optional. If it is valuable enough to do have special handling of overflow buckets, which are medium-sized, it is valuable enough to pay an extra alloc and two extra words for. Adding fields to extra would entail adding overhead to pointer-free maps; any mapextra fields added would need to be weighed against that. This CL is just rearrangement, though. Updates #19931 Updates #19992 Change-Id: If8537a206905b9d4dc6cd9d886184ece671b3f80 Reviewed-on: https://go-review.googlesource.com/40976 Run-TryBot: Josh Bleecher Snyder <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Keith Randall <[email protected]>
bmap already has a overflow (getter) method. Add a setoverflow (setter) method, for readability. Updates #19931 Updates #19992 Change-Id: I00b3d94037c0d75508a7ebd51085c5c3857fb764 Reviewed-on: https://go-review.googlesource.com/40977 Run-TryBot: Josh Bleecher Snyder <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Keith Randall <[email protected]>
Updates #19931 Updates #19992 Change-Id: Ib2d4e6b9b89a49caa443310d896dce8d6db06050 Reviewed-on: https://go-review.googlesource.com/40978 Run-TryBot: Josh Bleecher Snyder <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Keith Randall <[email protected]>
When allocating a non-small array of buckets for a map, also preallocate some overflow buckets. The estimate of the number of overflow buckets is based on a simulation of putting mid=(low+high)/2 elements into a map, where low is the minimum number of elements needed to reach this value of b (according to overLoadFactor), and high is the maximum number of elements possible to put in this value of b (according to overLoadFactor). This estimate is surprisingly reliable and accurate. The number of overflow buckets needed is quadratic, for a fixed value of b. Using this mid estimate means that we will overallocate a few too many overflow buckets when the actual number of elements is near low, and underallocate significantly too few overflow buckets when the actual number of elements is near high. The mechanism introduced in this CL can be re-used for other overflow bucket optimizations. For example, given an initial size hint, we could estimate quite precisely the number of overflow buckets. This is #19931. We could also change from "non-nil means end-of-list" to "pointer-to-hmap.buckets means end-of-list", and then create a linked list of reusable overflow buckets when they are freed by map growth. That is #19992. We could also use a similar mechanism to do bulk allocation of overflow buckets. All these uses can co-exist with only the one additional pointer in mapextra, given a little care. name old time/op new time/op delta MapPopulate/1-8 60.1ns ± 2% 60.3ns ± 2% ~ (p=0.278 n=19+20) MapPopulate/10-8 577ns ± 1% 578ns ± 1% ~ (p=0.140 n=20+20) MapPopulate/100-8 8.06µs ± 1% 8.19µs ± 1% +1.67% (p=0.000 n=20+20) MapPopulate/1000-8 104µs ± 1% 104µs ± 1% ~ (p=0.317 n=20+20) MapPopulate/10000-8 891µs ± 1% 888µs ± 1% ~ (p=0.101 n=19+20) MapPopulate/100000-8 8.61ms ± 1% 8.58ms ± 0% -0.34% (p=0.009 n=20+17) name old alloc/op new alloc/op delta MapPopulate/1-8 0.00B 0.00B ~ (all equal) MapPopulate/10-8 179B ± 0% 179B ± 0% ~ (all equal) MapPopulate/100-8 3.33kB ± 0% 3.38kB ± 0% +1.48% (p=0.000 n=20+16) MapPopulate/1000-8 55.5kB ± 0% 53.4kB ± 0% -3.84% (p=0.000 n=19+20) MapPopulate/10000-8 432kB ± 0% 428kB ± 0% -1.06% (p=0.000 n=19+20) MapPopulate/100000-8 3.65MB ± 0% 3.62MB ± 0% -0.70% (p=0.000 n=20+20) name old allocs/op new allocs/op delta MapPopulate/1-8 0.00 0.00 ~ (all equal) MapPopulate/10-8 1.00 ± 0% 1.00 ± 0% ~ (all equal) MapPopulate/100-8 18.0 ± 0% 17.0 ± 0% -5.56% (p=0.000 n=20+20) MapPopulate/1000-8 96.0 ± 0% 72.6 ± 1% -24.38% (p=0.000 n=20+20) MapPopulate/10000-8 625 ± 0% 319 ± 0% -48.86% (p=0.000 n=20+20) MapPopulate/100000-8 6.23k ± 0% 4.00k ± 0% -35.79% (p=0.000 n=20+20) Change-Id: I01f41cb1374bdb99ccedbc00d04fb9ae43daa204 Reviewed-on: https://go-review.googlesource.com/40979 Run-TryBot: Josh Bleecher Snyder <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Keith Randall <[email protected]>
This is partly done in CL 40979, insofar as the hint guides the number of buckets, and the number of buckets now guides the number of preallocated overflow buckets. It could be refined further, using the mechanisms of CL 40979, by estimating more precisely the number of overflow buckets needed for the hint, rather than taking the midpoint of the sizes for that its bucket range. |
CL https://golang.org/cl/40976 mentions this issue. |
CL https://golang.org/cl/40975 mentions this issue. |
CL https://golang.org/cl/40977 mentions this issue. |
CL https://golang.org/cl/40979 mentions this issue. |
CL https://golang.org/cl/40978 mentions this issue. |
Josh, anything else to do here? |
We could improve the estimation (along the lines of #19931 (comment)), but it's a low enough priority that I think we can call this done. |
When populating a large map, there will ~always be overflow. Currently, the map size hint is used to allocate an initial set of buckets, but not to allocate any overflow buckets. As a result, even if you provide a perfect hint to the runtime, populating a large map allocates a lot of times. A quick simulation suggests that adding 100,000 ints to a perfectly-hinted map allocates about 2700 overflow buckets.
Work require to implement this:
cc @randall77
The text was updated successfully, but these errors were encountered: