
Conversation


@fulmicoton fulmicoton commented Jul 29, 2021

binary search

The code is simpler and faster.

Before
test postings::bench::bench_segment_intersection                                                                         ... bench:   2,093,697 ns/iter (+/- 115,509)
test postings::bench::bench_skip_next_p01                                                                                ... bench:      58,585 ns/iter (+/- 796)
test postings::bench::bench_skip_next_p1                                                                                 ... bench:     160,872 ns/iter (+/- 5,164)
test postings::bench::bench_skip_next_p10                                                                                ... bench:     615,229 ns/iter (+/- 25,108)
test postings::bench::bench_skip_next_p90                                                                                ... bench:   1,120,509 ns/iter (+/- 22,271)
After
test postings::bench::bench_skip_next_p01                                                                                ... bench:      51,701 ns/iter (+/- 719)
test postings::bench::bench_skip_next_p1                                                                                 ... bench:     124,848 ns/iter (+/- 2,475)
test postings::bench::bench_skip_next_p10                                                                                ... bench:     453,206 ns/iter (+/- 7,315)
test postings::bench::bench_skip_next_p90                                                                                ... bench:     880,564 ns/iter (+/- 6,158)
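
For context, the change switches the block search to a plain binary search over each (sorted) block of doc ids. A minimal sketch of the idea, assuming a non-decreasing block and a lower-bound contract; the function name and signature below are placeholders, not the actual tantivy API:

/// Placeholder sketch (not the actual tantivy code): return the index of the
/// first element of `block` that is >= `target`.
/// Assumes `block` is sorted in non-decreasing order (duplicates allowed).
fn search_in_block(block: &[u32], target: u32) -> usize {
    let (mut lo, mut hi) = (0usize, block.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if block[mid] < target {
            lo = mid + 1; // the answer is strictly after `mid`
        } else {
            hi = mid; // `mid` is a candidate, keep searching to the left
        }
    }
    lo
}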

@fulmicoton fulmicoton requested a review from PSeitz July 29, 2021 12:20

codecov bot commented Jul 29, 2021

Codecov Report

Merging #1124 (3716285) into main (b8a10c8) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 3716285 differs from pull request most recent head a8d5dbc. Consider uploading reports for the commit a8d5dbc to get more accurate results

@@           Coverage Diff           @@
##             main    #1124   +/-   ##
=======================================
  Coverage   89.50%   89.51%           
=======================================
  Files         203      203           
  Lines       20231    20201   -30     
=======================================
- Hits        18107    18082   -25     
+ Misses       2124     2119    -5     
Impacted Files Coverage Δ
src/postings/mod.rs 98.78% <ø> (ø)
src/postings/block_search.rs 100.00% <100.00%> (+2.46%) ⬆️
src/postings/block_segment_postings.rs 94.56% <100.00%> (ø)
src/postings/compression/mod.rs 98.62% <100.00%> (+2.06%) ⬆️
src/postings/segment_postings.rs 91.52% <100.00%> (-0.15%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data


@PSeitz PSeitz left a comment


Simpler and faster, really cool. One minor comment. Also I don't know if the data is monotonically increasing or if we can have duplicates.

1041, 1048, 1061, 1062, 1069, 1072, 1079, 1082, 1083, 1102, 1140, 1165, 1170, 1176,
1189, 1195, 1198, 1200, 1201, 1208, 1223, 1240, 1252, 1276, 1307,
];
for target in block.iter().flat_map(|&el| vec![el - 1, el].into_iter()) {
@PSeitz (Collaborator)

I would add
vec![el - 1, el, el+1] to test with a jump size larger than 1 to the next bucket (except the last element .take(block.len() * 3 -1) )
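
A rough sketch of what that suggestion could look like (the block values are shortened and the loop body is a placeholder):

#[test]
fn probe_targets_sketch() {
    // For each element probe el - 1, el and el + 1, dropping the final
    // el + 1 so the last target does not jump past the block
    // (assumes every element is >= 1).
    let block: Vec<u32> = vec![3, 7, 12, 31, 64];
    for target in block
        .iter()
        .flat_map(|&el| vec![el - 1, el, el + 1])
        .take(block.len() * 3 - 1)
    {
        // run the block search / skip under test against `target` here
        let _ = target;
    }
}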

@fulmicoton (Author)

el + 1 is the same as el - 1 for the preceding element.

@fulmicoton (Author)

I'll test all elements in [0..1308).
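
A sketch of that exhaustive variant, assuming 1307 is the largest value in the test block:

#[test]
fn exhaustive_targets_sketch() {
    // Probe every candidate target up to the largest block value (1307),
    // instead of only values adjacent to block elements.
    for target in 0u32..1308 {
        // compare the search result against a naive linear scan here
        let _ = target;
    }
}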

@fulmicoton (Author)

> Simpler and faster, really cool. One minor comment. Also I don't know if the data is monotonically increasing or if we can have duplicates.

Good point. I modified the contract; there was an error in it too.
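
The updated contract itself isn't quoted in the thread; a plausible restatement, assuming duplicates are allowed and a lower-bound result (the function name is a placeholder):

/// Hypothetical wording of the clarified contract (the real doc comment is
/// not shown here): `block` is sorted in non-decreasing order, so duplicates
/// are allowed; returns the index of the first element that is >= `target`.
fn search_in_block(block: &[u32], target: u32) -> usize {
    block.partition_point(|&el| el < target)
}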

binary search

The code is simpler and faster.

Before
test postings::bench::bench_segment_intersection                                                                         ... bench:   2,093,697 ns/iter (+/- 115,509)
test postings::bench::bench_skip_next_p01                                                                                ... bench:      58,585 ns/iter (+/- 796)
test postings::bench::bench_skip_next_p1                                                                                 ... bench:     160,872 ns/iter (+/- 5,164)
test postings::bench::bench_skip_next_p10                                                                                ... bench:     615,229 ns/iter (+/- 25,108)
test postings::bench::bench_skip_next_p90                                                                                ... bench:   1,120,509 ns/iter (+/- 22,271)

After
test postings::bench::bench_segment_intersection                                                                         ... bench:   1,747,726 ns/iter (+/- 52,867)
test postings::bench::bench_skip_next_p01                                                                                ... bench:      55,205 ns/iter (+/- 714)
test postings::bench::bench_skip_next_p1                                                                                 ... bench:     131,433 ns/iter (+/- 2,814)
test postings::bench::bench_skip_next_p10                                                                                ... bench:     478,830 ns/iter (+/- 12,794)
test postings::bench::bench_skip_next_p90                                                                                ... bench:     931,082 ns/iter (+/- 31,468)