Commit e8ebf3a

craig[bot] and nvanbenschoten committed
Merge #32165
32165: storage/cmdq: create new specialized augmented interval btree r=nvanbenschoten a=nvanbenschoten

This is a component of the larger change in #31997.

The first few commits here modify the existing interval btree implementation, allowing us to properly benchmark against it.

The second to last commit forks https://github.com/petermattis/pebble/blob/master/internal/btree/btree.go, specializes it to the command queue, and rips out any references to pebble. There are a number of changes we'll need to make to it:
1. Add synchronized node and leafNode freelists
2. Add Clear method to release owned nodes into freelists
3. Introduce immutability and a copy-on-write policy

The next commit modifies the btree type added in the previous commit and turns it into an augmented interval tree. The tree represents intervals and permits an interval search operation following the approach laid out in CLRS, Chapter 14. The B-Tree stores cmds in order based on their start key and each B-Tree node maintains the upper-bound end key of all cmds in its subtree. This is close to what `util/interval.btree` does, although the new version doesn't maintain the lower-bound start key of all cmds in each node.

The new interval btree is significantly faster than both the old interval btree and the old interval llrb tree because it minimizes key comparisons while scanning for overlaps. This includes avoiding all key comparisons for cmds with start keys that are greater than the search range's start key. See the comment on `overlapScan` for an explanation of how this is possible.

The new interval btree is also faster because it has been specialized for the `storage/cmdq` package. This allows it to avoid interfaces and dynamic dispatch throughout its operations, which showed up prominently on profiles of the other two implementations.

A third benefit of the rewrite is that it inherits the optimizations made in pebble's btree. This includes inlining the btree items and child pointers in nodes instead of using slices.

### Benchmarks:

_The new interval btree:_
```
Insert/count=16-4                 76.1ns ± 4%
Insert/count=128-4                 156ns ± 4%
Insert/count=1024-4                259ns ± 8%
Insert/count=8192-4                386ns ± 1%
Insert/count=65536-4               735ns ± 5%
Delete/count=16-4                  129ns ±16%
Delete/count=128-4                 189ns ±12%
Delete/count=1024-4                338ns ± 7%
Delete/count=8192-4                547ns ± 4%
Delete/count=65536-4              1.22µs ±12%
DeleteInsert/count=16-4            168ns ± 2%
DeleteInsert/count=128-4           375ns ± 8%
DeleteInsert/count=1024-4          562ns ± 1%
DeleteInsert/count=8192-4          786ns ± 3%
DeleteInsert/count=65536-4        2.31µs ±26%
IterSeekGE/count=16-4             87.2ns ± 3%
IterSeekGE/count=128-4             141ns ± 3%
IterSeekGE/count=1024-4            227ns ± 4%
IterSeekGE/count=8192-4            379ns ± 2%
IterSeekGE/count=65536-4           882ns ± 1%
IterSeekLT/count=16-4             89.5ns ± 3%
IterSeekLT/count=128-4             145ns ± 1%
IterSeekLT/count=1024-4            226ns ± 6%
IterSeekLT/count=8192-4            379ns ± 1%
IterSeekLT/count=65536-4           891ns ± 1%
IterFirstOverlap/count=16-4        184ns ± 1%
IterFirstOverlap/count=128-4       260ns ± 3%
IterFirstOverlap/count=1024-4      685ns ± 7%
IterFirstOverlap/count=8192-4     1.23µs ± 2%
IterFirstOverlap/count=65536-4    2.14µs ± 1%
IterNext-4                        3.82ns ± 2%
IterPrev-4                        14.8ns ± 2%
IterNextOverlap-4                 8.57ns ± 2%
IterOverlapScan-4                 25.8µs ± 3%
```

_Compared to old llrb interval tree (currently in use):_
```
Insert/count=16-4             323ns ± 7%     76ns ± 4%  -76.43%  (p=0.008 n=5+5)
Insert/count=128-4            539ns ± 2%    156ns ± 4%  -71.05%  (p=0.008 n=5+5)
Insert/count=1024-4           797ns ± 1%    259ns ± 8%  -67.52%  (p=0.008 n=5+5)
Insert/count=8192-4          1.30µs ± 5%   0.39µs ± 1%  -70.38%  (p=0.008 n=5+5)
Insert/count=65536-4         2.69µs ±11%   0.74µs ± 5%  -72.65%  (p=0.008 n=5+5)
Delete/count=16-4             438ns ± 7%    129ns ±16%  -70.44%  (p=0.008 n=5+5)
Delete/count=128-4            785ns ± 6%    189ns ±12%  -75.89%  (p=0.008 n=5+5)
Delete/count=1024-4          1.38µs ± 2%   0.34µs ± 7%  -75.44%  (p=0.008 n=5+5)
Delete/count=8192-4          2.36µs ± 2%   0.55µs ± 4%  -76.82%  (p=0.008 n=5+5)
Delete/count=65536-4         4.73µs ±13%   1.22µs ±12%  -74.19%  (p=0.008 n=5+5)
DeleteInsert/count=16-4       920ns ± 2%    168ns ± 2%  -81.76%  (p=0.008 n=5+5)
DeleteInsert/count=128-4     1.73µs ± 4%   0.37µs ± 8%  -78.35%  (p=0.008 n=5+5)
DeleteInsert/count=1024-4    2.69µs ± 3%   0.56µs ± 1%  -79.15%  (p=0.016 n=5+4)
DeleteInsert/count=8192-4    4.55µs ±25%   0.79µs ± 3%  -82.70%  (p=0.008 n=5+5)
DeleteInsert/count=65536-4   7.53µs ± 6%   2.31µs ±26%  -69.32%  (p=0.008 n=5+5)
IterOverlapScan-4             285µs ± 7%     26µs ± 3%  -90.96%  (p=0.008 n=5+5)
```

_Compared to old btree interval tree (added in a61191e, never enabled):_
```
Insert/count=16-4             231ns ± 1%     76ns ± 4%  -66.99%  (p=0.008 n=5+5)
Insert/count=128-4            351ns ± 2%    156ns ± 4%  -55.53%  (p=0.008 n=5+5)
Insert/count=1024-4           515ns ± 5%    259ns ± 8%  -49.73%  (p=0.008 n=5+5)
Insert/count=8192-4           786ns ± 3%    386ns ± 1%  -50.85%  (p=0.008 n=5+5)
Insert/count=65536-4         1.50µs ± 3%   0.74µs ± 5%  -50.97%  (p=0.008 n=5+5)
Delete/count=16-4             363ns ±11%    129ns ±16%  -64.33%  (p=0.008 n=5+5)
Delete/count=128-4            466ns ± 9%    189ns ±12%  -59.42%  (p=0.008 n=5+5)
Delete/count=1024-4           806ns ± 6%    338ns ± 7%  -58.01%  (p=0.008 n=5+5)
Delete/count=8192-4          1.43µs ±13%   0.55µs ± 4%  -61.71%  (p=0.008 n=5+5)
Delete/count=65536-4         2.75µs ± 1%   1.22µs ±12%  -55.57%  (p=0.008 n=5+5)
DeleteInsert/count=16-4       557ns ± 1%    168ns ± 2%  -69.87%  (p=0.008 n=5+5)
DeleteInsert/count=128-4      953ns ± 8%    375ns ± 8%  -60.71%  (p=0.008 n=5+5)
DeleteInsert/count=1024-4    1.19µs ± 4%   0.56µs ± 1%  -52.72%  (p=0.016 n=5+4)
DeleteInsert/count=8192-4    1.84µs ±17%   0.79µs ± 3%  -57.22%  (p=0.008 n=5+5)
DeleteInsert/count=65536-4   3.20µs ± 3%   2.31µs ±26%  -27.86%  (p=0.008 n=5+5)
IterOverlapScan-4            70.1µs ± 2%   25.8µs ± 3%  -63.23%  (p=0.008 n=5+5)
```

Co-authored-by: Nathan VanBenschoten <[email protected]>
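
For readers unfamiliar with the CLRS-style augmentation the commit message refers to, here is a minimal sketch of the idea: items are ordered by start key, and each node records the maximum end key in its subtree so that whole subtrees can be skipped during an overlap search. It is not the `storage/cmdq` code. It uses a plain binary tree instead of a B-tree, half-open string spans instead of cmds, and hypothetical names (`span`, `node`, `maxEnd`, `newTree`, `overlaps`), and it does not attempt the comparison-avoiding trick described in the `overlapScan` comment.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a hypothetical half-open interval [start, end) over string keys.
type span struct{ start, end string }

// node is a binary-tree stand-in for a B-tree node. maxEnd is the
// augmentation: the largest end key of any span in this subtree.
type node struct {
	s           span
	maxEnd      string
	left, right *node
}

// build constructs a balanced tree from spans already sorted by start key,
// computing maxEnd bottom-up.
func build(sorted []span) *node {
	if len(sorted) == 0 {
		return nil
	}
	mid := len(sorted) / 2
	n := &node{s: sorted[mid], maxEnd: sorted[mid].end}
	n.left = build(sorted[:mid])
	n.right = build(sorted[mid+1:])
	for _, c := range []*node{n.left, n.right} {
		if c != nil && c.maxEnd > n.maxEnd {
			n.maxEnd = c.maxEnd
		}
	}
	return n
}

// newTree sorts a copy of spans by start key and builds the tree.
func newTree(spans []span) *node {
	sorted := append([]span(nil), spans...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].start < sorted[j].start })
	return build(sorted)
}

// overlaps appends every span under n that overlaps q, in start-key order.
func overlaps(n *node, q span, out []span) []span {
	// Prune: no span in this subtree ends after q starts.
	if n == nil || n.maxEnd <= q.start {
		return out
	}
	out = overlaps(n.left, q, out)
	if n.s.start < q.end && q.start < n.s.end {
		out = append(out, n.s)
	}
	// Spans to the right start at or after n.s.start, so they can only
	// overlap q if n.s.start is still before q.end.
	if n.s.start < q.end {
		out = overlaps(n.right, q, out)
	}
	return out
}

func main() {
	t := newTree([]span{{"a", "c"}, {"b", "f"}, {"d", "e"}, {"g", "k"}, {"m", "p"}})
	fmt.Println(overlaps(t, span{"e", "h"}, nil)) // prints [{b f} {g k}]
}
```

In this sketch the `maxEnd` check lets the search discard the `[a, c)` leaf without comparing its start key, and the start-key ordering lets it stop before visiting anything to the right of `[m, p)`.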
2 parents ec54e45 + cc9c227 commit e8ebf3a

6 files changed: +2407 -55 lines changed
