Commit e8ebf3a
Merge #32165
32165: storage/cmdq: create new specialized augmented interval btree r=nvanbenschoten a=nvanbenschoten
This is a component of the larger change in #31997.
The first few commits here modify the existing interval btree implementation,
allowing us to properly benchmark against it.
The second-to-last commit forks https://github.com/petermattis/pebble/blob/master/internal/btree/btree.go, specializes
it to the command queue, and rips out any references to pebble. We'll still need
to make a number of changes to it:
1. Add synchronized node and leafNode freelists
2. Add Clear method to release owned nodes into freelists
3. Introduce immutability and a copy-on-write policy
The next commit modifies the btree type added in the previous commit
and turns it into an augmented interval tree. The tree represents
intervals and permits an interval search operation following the
approach laid out in CLRS, Chapter 14. The B-Tree stores cmds in
order based on their start key and each B-Tree node maintains the
upper-bound end key of all cmds in its subtree. This is close to
what `util/interval.btree` does, although the new version doesn't
maintain the lower-bound start key of all cmds in each node.
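The augmentation described above can be sketched on a much smaller structure. This is a minimal illustration only, not the code in this change: it uses a static binary tree instead of a B-tree, and the names `span`, `node`, `build`, and `overlaps` are hypothetical. It assumes half-open `[start, end)` intervals over string keys.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a hypothetical half-open interval [start, end) over string keys.
type span struct{ start, end string }

// node is a binary-tree stand-in for a B-tree node. Spans are ordered by
// start key, and maxEnd caches the largest end key in the node's subtree.
type node struct {
	s           span
	maxEnd      string
	left, right *node
}

// build constructs a balanced tree from spans already sorted by start key.
func build(sorted []span) *node {
	if len(sorted) == 0 {
		return nil
	}
	mid := len(sorted) / 2
	n := &node{s: sorted[mid], maxEnd: sorted[mid].end}
	n.left, n.right = build(sorted[:mid]), build(sorted[mid+1:])
	for _, c := range []*node{n.left, n.right} {
		if c != nil && c.maxEnd > n.maxEnd {
			n.maxEnd = c.maxEnd
		}
	}
	return n
}

// overlaps appends every span in n's subtree that overlaps q, in start-key order.
func overlaps(n *node, q span, out []span) []span {
	// Prune: nothing in this subtree ends after q starts.
	if n == nil || n.maxEnd <= q.start {
		return out
	}
	out = overlaps(n.left, q, out)
	if n.s.start < q.end && q.start < n.s.end {
		out = append(out, n.s)
	}
	// The right subtree only holds start keys >= n.s.start, so it can be
	// skipped once this node already starts at or past q's end.
	if n.s.start < q.end {
		out = overlaps(n.right, q, out)
	}
	return out
}

func main() {
	spans := []span{{"a", "c"}, {"b", "z"}, {"d", "e"}, {"f", "h"}, {"m", "n"}}
	sort.Slice(spans, func(i, j int) bool { return spans[i].start < spans[j].start })
	fmt.Println(overlaps(build(spans), span{"e", "g"}, nil))
}
```

The cached `maxEnd` is what lets a search discard a whole subtree with one comparison. No lower-bound start key is needed because the start-key ordering already bounds the subtree from below.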
The new interval btree is significantly faster than both the old
interval btree and the old interval llrb tree because it minimizes
key comparisons while scanning for overlaps. This includes avoiding
all key comparisons for cmds with start keys that are greater than
the search range's start key. See the comment on `overlapScan` for
an explanation of how this is possible.
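One way to see why start-key ordering can remove comparisons is a flat sorted slice. This is a sketch of the general idea only, not the `overlapScan` algorithm from this change. The names `span` and `overlapsSorted` are hypothetical, and it assumes non-empty half-open intervals.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a hypothetical non-empty half-open interval [start, end).
type span struct{ start, end string }

// overlapsSorted returns the spans in sorted (ordered by start key) that
// overlap q, along with the number of per-item end-key comparisons made.
func overlapsSorted(sorted []span, q span) (out []span, endCmps int) {
	// lo: first index whose start key is >= q.start.
	lo := sort.Search(len(sorted), func(i int) bool { return sorted[i].start >= q.start })
	// hi: first index whose start key is >= q.end.
	hi := sort.Search(len(sorted), func(i int) bool { return sorted[i].start >= q.end })
	if hi < lo {
		hi = lo // guards against a malformed query with q.end < q.start
	}
	// Spans before lo start before q does, so only their end keys decide.
	for _, s := range sorted[:lo] {
		endCmps++
		if s.end > q.start {
			out = append(out, s)
		}
	}
	// Spans in [lo, hi) start inside q, so they overlap with no comparison.
	out = append(out, sorted[lo:hi]...)
	return out, endCmps
}

func main() {
	sorted := []span{{"a", "c"}, {"b", "z"}, {"d", "e"}, {"f", "h"}, {"m", "n"}}
	out, cmps := overlapsSorted(sorted, span{"e", "g"})
	fmt.Println(out, cmps)
}
```

In this flat version every span before `lo` still costs one end-key comparison. In a tree, a per-node upper-bound end key lets a search skip many of those as well.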
The new interval btree is also faster because it has been specialized
for the `storage/cmdq` package. This allows it to avoid interfaces
and dynamic dispatch throughout its operations, which showed up
prominently on profiles of the other two implementations.
A third benefit of the rewrite is that it inherits the optimizations
made in pebble's btree. This includes inlining the btree items and
child pointers in nodes instead of using slices.
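The layout difference can be sketched as follows. This is an illustration under assumptions, not the node definition from this change or from pebble: `maxItems`, `item`, `sliceNode`, `arrayNode`, and `insertAt` are hypothetical names, and the real node degree is not stated here.

```go
package main

import "fmt"

// maxItems is a hypothetical node capacity chosen for illustration.
const maxItems = 4

// item is a hypothetical stand-in for a stored cmd.
type item struct{ start, end string }

// sliceNode keeps items and children in separate heap-allocated backing
// arrays, so reaching them costs an extra pointer dereference.
type sliceNode struct {
	items    []item
	children []*sliceNode
}

// arrayNode inlines fixed-size arrays, so the node, its items, and its
// child pointers live in a single allocation.
type arrayNode struct {
	count    int
	items    [maxItems]item
	children [maxItems + 1]*arrayNode
}

// insertAt shifts items right and stores it at index i. It reports false
// if the node is full or i is out of range (a real tree would split).
func (n *arrayNode) insertAt(i int, it item) bool {
	if n.count == maxItems || i < 0 || i > n.count {
		return false
	}
	copy(n.items[i+1:n.count+1], n.items[i:n.count])
	n.items[i] = it
	n.count++
	return true
}

func main() {
	var n arrayNode
	n.insertAt(0, item{"b", "c"})
	n.insertAt(0, item{"a", "b"})
	fmt.Println(n.count, n.items[:n.count])
	_ = sliceNode{} // shown only for contrast with arrayNode
}
```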
### Benchmarks:
_The new interval btree:_
```
name time/op
Insert/count=16-4 76.1ns ± 4%
Insert/count=128-4 156ns ± 4%
Insert/count=1024-4 259ns ± 8%
Insert/count=8192-4 386ns ± 1%
Insert/count=65536-4 735ns ± 5%
Delete/count=16-4 129ns ±16%
Delete/count=128-4 189ns ±12%
Delete/count=1024-4 338ns ± 7%
Delete/count=8192-4 547ns ± 4%
Delete/count=65536-4 1.22µs ±12%
DeleteInsert/count=16-4 168ns ± 2%
DeleteInsert/count=128-4 375ns ± 8%
DeleteInsert/count=1024-4 562ns ± 1%
DeleteInsert/count=8192-4 786ns ± 3%
DeleteInsert/count=65536-4 2.31µs ±26%
IterSeekGE/count=16-4 87.2ns ± 3%
IterSeekGE/count=128-4 141ns ± 3%
IterSeekGE/count=1024-4 227ns ± 4%
IterSeekGE/count=8192-4 379ns ± 2%
IterSeekGE/count=65536-4 882ns ± 1%
IterSeekLT/count=16-4 89.5ns ± 3%
IterSeekLT/count=128-4 145ns ± 1%
IterSeekLT/count=1024-4 226ns ± 6%
IterSeekLT/count=8192-4 379ns ± 1%
IterSeekLT/count=65536-4 891ns ± 1%
IterFirstOverlap/count=16-4 184ns ± 1%
IterFirstOverlap/count=128-4 260ns ± 3%
IterFirstOverlap/count=1024-4 685ns ± 7%
IterFirstOverlap/count=8192-4 1.23µs ± 2%
IterFirstOverlap/count=65536-4 2.14µs ± 1%
IterNext-4 3.82ns ± 2%
IterPrev-4 14.8ns ± 2%
IterNextOverlap-4 8.57ns ± 2%
IterOverlapScan-4 25.8µs ± 3%
```
_Compared to old llrb interval tree (currently in use):_
```
name old time/op new time/op delta
Insert/count=16-4 323ns ± 7% 76ns ± 4% -76.43% (p=0.008 n=5+5)
Insert/count=128-4 539ns ± 2% 156ns ± 4% -71.05% (p=0.008 n=5+5)
Insert/count=1024-4 797ns ± 1% 259ns ± 8% -67.52% (p=0.008 n=5+5)
Insert/count=8192-4 1.30µs ± 5% 0.39µs ± 1% -70.38% (p=0.008 n=5+5)
Insert/count=65536-4 2.69µs ±11% 0.74µs ± 5% -72.65% (p=0.008 n=5+5)
Delete/count=16-4 438ns ± 7% 129ns ±16% -70.44% (p=0.008 n=5+5)
Delete/count=128-4 785ns ± 6% 189ns ±12% -75.89% (p=0.008 n=5+5)
Delete/count=1024-4 1.38µs ± 2% 0.34µs ± 7% -75.44% (p=0.008 n=5+5)
Delete/count=8192-4 2.36µs ± 2% 0.55µs ± 4% -76.82% (p=0.008 n=5+5)
Delete/count=65536-4 4.73µs ±13% 1.22µs ±12% -74.19% (p=0.008 n=5+5)
DeleteInsert/count=16-4 920ns ± 2% 168ns ± 2% -81.76% (p=0.008 n=5+5)
DeleteInsert/count=128-4 1.73µs ± 4% 0.37µs ± 8% -78.35% (p=0.008 n=5+5)
DeleteInsert/count=1024-4 2.69µs ± 3% 0.56µs ± 1% -79.15% (p=0.016 n=5+4)
DeleteInsert/count=8192-4 4.55µs ±25% 0.79µs ± 3% -82.70% (p=0.008 n=5+5)
DeleteInsert/count=65536-4 7.53µs ± 6% 2.31µs ±26% -69.32% (p=0.008 n=5+5)
IterOverlapScan-4 285µs ± 7% 26µs ± 3% -90.96% (p=0.008 n=5+5)
```
_Compared to old btree interval tree (added in a61191e, never enabled):_
```
name old time/op new time/op delta
Insert/count=16-4 231ns ± 1% 76ns ± 4% -66.99% (p=0.008 n=5+5)
Insert/count=128-4 351ns ± 2% 156ns ± 4% -55.53% (p=0.008 n=5+5)
Insert/count=1024-4 515ns ± 5% 259ns ± 8% -49.73% (p=0.008 n=5+5)
Insert/count=8192-4 786ns ± 3% 386ns ± 1% -50.85% (p=0.008 n=5+5)
Insert/count=65536-4 1.50µs ± 3% 0.74µs ± 5% -50.97% (p=0.008 n=5+5)
Delete/count=16-4 363ns ±11% 129ns ±16% -64.33% (p=0.008 n=5+5)
Delete/count=128-4 466ns ± 9% 189ns ±12% -59.42% (p=0.008 n=5+5)
Delete/count=1024-4 806ns ± 6% 338ns ± 7% -58.01% (p=0.008 n=5+5)
Delete/count=8192-4 1.43µs ±13% 0.55µs ± 4% -61.71% (p=0.008 n=5+5)
Delete/count=65536-4 2.75µs ± 1% 1.22µs ±12% -55.57% (p=0.008 n=5+5)
DeleteInsert/count=16-4 557ns ± 1% 168ns ± 2% -69.87% (p=0.008 n=5+5)
DeleteInsert/count=128-4 953ns ± 8% 375ns ± 8% -60.71% (p=0.008 n=5+5)
DeleteInsert/count=1024-4 1.19µs ± 4% 0.56µs ± 1% -52.72% (p=0.016 n=5+4)
DeleteInsert/count=8192-4 1.84µs ±17% 0.79µs ± 3% -57.22% (p=0.008 n=5+5)
DeleteInsert/count=65536-4 3.20µs ± 3% 2.31µs ±26% -27.86% (p=0.008 n=5+5)
IterOverlapScan-4 70.1µs ± 2% 25.8µs ± 3% -63.23% (p=0.008 n=5+5)
```
Co-authored-by: Nathan VanBenschoten <[email protected]>
6 files changed (+2407, -55 lines) under pkg: storage/cmdq, util/interval