Use ReadAdvice.RANDOM by default. #13244

jpountz · 2024-03-29T09:32:00Z

This switches the default ReadAdvice from NORMAL to RANDOM, which is a better fit for the kind of access pattern that Lucene has. This is expected to reduce page cache thrashing and contention on the page table.

NORMAL is still available, but never used by any of the file formats.

This effectively forces index inputs to be opened with either a SEQUENTIAL or a RANDOM advice, with nothing in between.
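To make the page cache argument concrete, here is a minimal sketch of a toy LRU page cache, not Lucene or kernel code. The class `ReadaheadSim`, its parameters, and the uniform random access pattern are all assumptions made for illustration. It counts how many pages get pulled from disk when every miss also loads the following pages (readahead) compared with loading only the requested page.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

/** Toy LRU page-cache model for illustration; not Lucene or kernel code. */
class ReadaheadSim {

  /** Counters collected by one simulated run. */
  static final class Stats {
    final long misses;     // accesses that did not find their page in the cache
    final long pagesRead;  // pages loaded from "disk", including readahead pages

    Stats(long misses, long pagesRead) {
      this.misses = misses;
      this.pagesRead = pagesRead;
    }
  }

  /**
   * Replays uniformly random single-page reads against an LRU cache.
   * On a miss, up to {@code readahead} consecutive pages are loaded
   * (readahead = 1 means only the requested page is loaded).
   */
  static Stats simulate(final int filePages, final int cachePages, int readahead,
                        int accesses, long seed) {
    // Access-ordered LinkedHashMap that evicts the least recently used page.
    Map<Integer, Boolean> cache = new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
        return size() > cachePages;
      }
    };
    Random random = new Random(seed);
    long misses = 0;
    long pagesRead = 0;
    for (int i = 0; i < accesses; i++) {
      int page = random.nextInt(filePages);
      if (cache.get(page) != null) {
        continue; // hit; get() also refreshes the page's LRU position
      }
      misses++;
      int end = Math.min(filePages, page + readahead);
      for (int p = page; p < end; p++) {
        // Only pages that were not already cached count as disk reads.
        if (cache.put(p, Boolean.TRUE) == null) {
          pagesRead++;
        }
      }
    }
    return new Stats(misses, pagesRead);
  }

  public static void main(String[] args) {
    Stats none = simulate(10_000, 1_000, 1, 20_000, 42L);
    Stats ahead = simulate(10_000, 1_000, 32, 20_000, 42L);
    System.out.println("no readahead: misses=" + none.misses + " pagesRead=" + none.pagesRead);
    System.out.println("readahead=32: misses=" + ahead.misses + " pagesRead=" + ahead.pagesRead);
  }
}
```

Both runs replay the same access trace because they share a seed. Under these assumptions the prefetched pages are rarely the ones requested next, so the cost of readahead shows up as extra pages read and extra evictions rather than as useful hits. Real kernels adapt their readahead window, so this is only a rough intuition for why RANDOM may suit search-time access better.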

uschindler · 2024-03-29T14:34:18Z

I think this idea is not too bad, because as Robert said, unless we merge or flush, access is always random, so readahead is bad.

We should still compare results with and without memory pressure to see how it behaves.

For consistency we should still keep the enum constant, so one can have the option to use another one.

Maybe make the default configurable via sysprop?
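As a rough idea of what a sysprop-driven default could look like, here is a minimal sketch. The enum is a local stand-in reduced to the constants discussed in this thread, and the property name `example.defaultReadAdvice` is hypothetical; neither is taken from Lucene's code.

```java
import java.util.Locale;

/** Sketch of a system-property-driven default; names are illustrative only. */
class DefaultAdviceConfig {

  /** Local stand-in for the ReadAdvice constants discussed in this thread. */
  enum ReadAdvice { NORMAL, RANDOM, SEQUENTIAL }

  /** Hypothetical property name, chosen for this sketch only. */
  static final String PROPERTY = "example.defaultReadAdvice";

  /** Parses a property value, falling back to RANDOM when it is missing or unknown. */
  static ReadAdvice parse(String value) {
    if (value == null || value.trim().isEmpty()) {
      return ReadAdvice.RANDOM;
    }
    try {
      return ReadAdvice.valueOf(value.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      return ReadAdvice.RANDOM; // unknown value: keep the new default
    }
  }

  /** Reads the default once from the system property. */
  static ReadAdvice defaultReadAdvice() {
    return parse(System.getProperty(PROPERTY));
  }

  public static void main(String[] args) {
    // Run with -Dexample.defaultReadAdvice=normal to override the default.
    System.out.println("default advice: " + defaultReadAdvice());
  }
}
```

Falling back silently on an unknown value is a design choice of this sketch; failing fast with an error would be an equally reasonable option.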

mikemccand · 2024-04-01T16:40:15Z

Unfortunately, benchmarking the cold index case correctly is not so easy ... I would not trust luceneutil to give accurate results (its queries are synthetically generated).

We would rather need a real-world large index (or use ramhog to cut back on free OS RAM), and, importantly, real-world and matching query traffic that shows the typical/realistic Zipfian distribution on search terms.

Not only should the queries be realistic, they should also be delivered to Lucene accurately by time (i.e. at the actual arrival times that the queries came to the search engine), asynchronously ("open loop"), to avoid the coordinated omission bug.
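The coordinated omission point can be illustrated with a small deterministic model. This is a sketch under simplifying assumptions (a single FIFO server, made-up arrival and service times, a hypothetical class `OpenLoopLatency`), not luceneutil code. An open-loop client measures latency from the time a query was supposed to arrive, while a closed-loop client waits for each response before sending the next query and therefore only records service time.

```java
import java.util.Arrays;

/** Deterministic single-server model contrasting open-loop and closed-loop latency. */
class OpenLoopLatency {

  /** Latency measured from each query's scheduled arrival time (open loop). */
  static long[] openLoop(long[] arrivalMs, long[] serviceMs) {
    long[] latency = new long[arrivalMs.length];
    long serverFreeAt = 0;
    for (int i = 0; i < arrivalMs.length; i++) {
      // A query starts when it has arrived AND the server has finished earlier work.
      long start = Math.max(arrivalMs[i], serverFreeAt);
      serverFreeAt = start + serviceMs[i];
      latency[i] = serverFreeAt - arrivalMs[i]; // queueing time is included
    }
    return latency;
  }

  /** What a closed-loop client records: service time only, queueing is never observed. */
  static long[] closedLoop(long[] serviceMs) {
    return serviceMs.clone();
  }

  public static void main(String[] args) {
    long[] arrivals = {0, 10, 20, 30}; // intended arrival times in ms
    long[] service = {35, 5, 5, 5};    // the first query is slow (e.g. a cold page cache)
    System.out.println("open loop:   " + Arrays.toString(openLoop(arrivals, service)));
    System.out.println("closed loop: " + Arrays.toString(closedLoop(service)));
  }
}
```

With these inputs the open-loop latencies are 35, 30, 25 and 20 ms, while the closed-loop view reports 35, 5, 5 and 5 ms: the one slow query delays the three behind it, and only the open-loop measurement captures that.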

jpountz · 2024-04-03T09:48:25Z

@mikemccand I wonder if we need to create such a sophisticated benchmark. If we could confirm that performance is not affected when the cache is hot, and better when the cache is cold, maybe that would be good enough?

mikemccand · 2024-04-03T12:38:55Z

Yeah +1 I don't think we should block this change on sophisticated benchmarking! If we "first do no harm" (hot case not affected), and if we can show some improvement (or even no degradation?) in a simple cold benchmark then we should make this simplification!

uschindler · 2024-04-03T16:13:43Z

Could we still keep the NORMAL ReadAdvice constant and its mappings? We should just change the default!

I can add a system property to make it configurable like the other MMapDir options.

jpountz · 2024-04-04T09:57:02Z

That works for me @uschindler. I updated the code and the PR description.

uschindler

I think a system property can be added in a separate PR.

jpountz · 2024-04-04T11:38:56Z

Here's a luceneutil run on wikibigall and data hot in the page cache. No difference, which is what was expected I guess:

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                         MedTerm      539.57      (7.5%)      528.42      (6.0%)   -2.1% ( -14% -   12%) 0.338
                         LowTerm      988.10      (5.9%)      970.05      (5.4%)   -1.8% ( -12% -   10%) 0.308
                    OrHighNotLow      317.47      (6.2%)      312.12      (5.0%)   -1.7% ( -12% -   10%) 0.343
                        HighTerm      252.16      (7.3%)      248.37      (5.6%)   -1.5% ( -13% -   12%) 0.464
                   OrHighNotHigh      288.74      (6.3%)      284.71      (4.5%)   -1.4% ( -11% -    9%) 0.419
                        Wildcard      100.66      (2.5%)       99.37      (2.6%)   -1.3% (  -6% -    3%) 0.117
                    OrHighNotMed      374.03      (6.2%)      369.44      (5.0%)   -1.2% ( -11% -   10%) 0.491
                    OrNotHighMed      248.17      (3.7%)      245.38      (4.0%)   -1.1% (  -8% -    6%) 0.359
                   OrNotHighHigh      156.42      (6.3%)      154.80      (5.1%)   -1.0% ( -11% -   11%) 0.568
                        PKLookup      295.59      (2.0%)      293.25      (3.0%)   -0.8% (  -5% -    4%) 0.329
                         Respell       50.41      (2.0%)       50.06      (2.0%)   -0.7% (  -4% -    3%) 0.264
                      AndHighLow     1056.24      (3.4%)     1049.79      (3.6%)   -0.6% (  -7% -    6%) 0.582
                    OrNotHighLow      704.71      (3.4%)      702.19      (3.6%)   -0.4% (  -7% -    6%) 0.748
                          IntNRQ      750.87      (4.4%)      748.32      (3.6%)   -0.3% (  -8% -    8%) 0.791
                     LowSpanNear        2.50      (1.1%)        2.50      (1.2%)   -0.2% (  -2% -    2%) 0.519
            HighTermTitleBDVSort       12.75      (8.5%)       12.72      (5.6%)   -0.2% ( -13% -   15%) 0.935
                       OrHighMed      219.68      (2.1%)      219.28      (2.6%)   -0.2% (  -4% -    4%) 0.805
            HighIntervalsOrdered        3.32      (4.3%)        3.31      (4.4%)   -0.2% (  -8% -    8%) 0.902
                 LowSloppyPhrase       17.94      (1.6%)       17.94      (2.2%)   -0.0% (  -3% -    3%) 0.972
                     MedSpanNear       12.89      (1.4%)       12.89      (1.3%)    0.0% (  -2% -    2%) 0.990
                      OrHighHigh       80.78      (1.3%)       80.79      (1.5%)    0.0% (  -2% -    2%) 0.978
             LowIntervalsOrdered       17.67      (2.4%)       17.69      (2.3%)    0.1% (  -4% -    4%) 0.910
                      AndHighMed       86.41      (2.1%)       86.50      (2.1%)    0.1% (  -4% -    4%) 0.870
                          Fuzzy1      104.55      (2.6%)      104.71      (1.8%)    0.2% (  -4% -    4%) 0.825
                         Prefix3      190.94      (4.2%)      191.33      (3.7%)    0.2% (  -7% -    8%) 0.872
                       MedPhrase       82.82      (5.1%)       82.99      (2.8%)    0.2% (  -7% -    8%) 0.872
               HighTermMonthSort     2859.54      (7.2%)     2866.04      (7.6%)    0.2% ( -13% -   16%) 0.923
                      TermDTSort      420.48      (5.5%)      421.49      (5.1%)    0.2% (  -9% -   11%) 0.888
                       OrHighLow      610.57      (2.6%)      612.29      (2.4%)    0.3% (  -4% -    5%) 0.724
               HighTermTitleSort      149.70      (1.5%)      150.16      (1.2%)    0.3% (  -2% -    3%) 0.477
                HighSloppyPhrase        3.04      (3.3%)        3.05      (2.8%)    0.4% (  -5% -    6%) 0.672
                      HighPhrase       20.25      (4.3%)       20.34      (2.2%)    0.4% (  -5% -    7%) 0.685
                     AndHighHigh       45.77      (2.2%)       45.98      (2.6%)    0.5% (  -4% -    5%) 0.537
             MedIntervalsOrdered        1.93      (1.6%)        1.94      (2.9%)    0.6% (  -3% -    5%) 0.395
                 MedSloppyPhrase       10.80      (2.8%)       10.88      (2.4%)    0.7% (  -4% -    6%) 0.407
                       LowPhrase       12.05      (4.6%)       12.14      (2.4%)    0.7% (  -6% -    8%) 0.534
                          Fuzzy2       95.06      (3.6%)       95.98      (2.6%)    1.0% (  -5% -    7%) 0.327
           HighTermDayOfYearSort      569.67      (3.4%)      575.25      (2.5%)    1.0% (  -4% -    7%) 0.302
                    HighSpanNear        5.34      (5.6%)        5.44      (6.2%)    1.8% (  -9% -   14%) 0.329

jpountz · 2024-04-04T13:49:04Z

I hacked luceneutil to clear my page cache with echo 1 > /proc/sys/vm/drop_caches before each query:

diff --git a/src/main/perf/TaskThreads.java b/src/main/perf/TaskThreads.java
index 313664f..b42a1bd 100644
--- a/src/main/perf/TaskThreads.java
+++ b/src/main/perf/TaskThreads.java
@@ -89,6 +89,12 @@ public class TaskThreads {
                                                // Done
                                                break;
                                        }
+                                       ProcessBuilder pb = new ProcessBuilder("/bin/bash", "/home/jpountz/drop_caches.sh");
+                                       Process p = pb.start();
+                                       int code = p.waitFor();
+                                       if (code != 0) {
+                                         throw new Error();
+                                       }
                                        final long t0 = System.nanoTime();
                                        try {
                                                task.go(indexState, taskParser);

This gives the following output on wikibigall. Differences are somewhat bigger and some p-values are low-ish, e.g. OrHighNotLow and IntNRQ (faster) or HighTermDayOfYearSort and HighTermTitleBDVSort (slower), but nothing jumps out as being a blocker for this change.

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
           HighTermDayOfYearSort      356.29      (7.2%)      346.35      (5.8%)   -2.8% ( -14% -   11%) 0.178
            HighTermTitleBDVSort       16.47      (7.1%)       16.02      (4.3%)   -2.7% ( -13% -    9%) 0.139
                        HighTerm      284.20      (8.8%)      276.94     (10.1%)   -2.6% ( -19% -   17%) 0.395
                         Respell       51.19      (3.7%)       50.49      (4.4%)   -1.4% (  -9% -    6%) 0.286
                     LowSpanNear       19.00      (3.3%)       18.76      (3.4%)   -1.3% (  -7% -    5%) 0.222
                        PKLookup      238.02      (4.4%)      235.09      (6.7%)   -1.2% ( -11% -   10%) 0.494
                        Wildcard      183.79      (5.4%)      181.86      (6.3%)   -1.1% ( -12% -   11%) 0.571
                       LowPhrase       67.32      (5.7%)       66.67      (5.6%)   -1.0% ( -11% -   11%) 0.591
                    HighSpanNear       26.83      (4.8%)       26.58      (3.6%)   -0.9% (  -8% -    7%) 0.501
            HighIntervalsOrdered        3.10      (3.6%)        3.07      (3.9%)   -0.9% (  -8% -    6%) 0.451
                      HighPhrase      122.98      (5.7%)      121.92      (4.0%)   -0.9% (  -9% -    9%) 0.578
                          Fuzzy2       61.85      (3.3%)       61.56      (4.1%)   -0.5% (  -7% -    7%) 0.693
                HighSloppyPhrase       12.79      (2.9%)       12.73      (3.2%)   -0.4% (  -6% -    5%) 0.642
                          Fuzzy1       78.45      (3.4%)       78.18      (3.0%)   -0.3% (  -6% -    6%) 0.728
                     MedSpanNear        5.64      (2.8%)        5.63      (2.6%)   -0.3% (  -5% -    5%) 0.735
                 LowSloppyPhrase        2.31      (3.6%)        2.31      (3.7%)   -0.2% (  -7% -    7%) 0.843
                 MedSloppyPhrase       13.62      (3.5%)       13.59      (4.0%)   -0.2% (  -7% -    7%) 0.862
             MedIntervalsOrdered       11.97      (4.2%)       11.94      (5.0%)   -0.2% (  -9% -    9%) 0.891
                     AndHighHigh       45.42      (5.0%)       45.33      (4.2%)   -0.2% (  -8% -    9%) 0.893
                       OrHighMed      214.12      (8.0%)      214.07      (5.9%)   -0.0% ( -12% -   15%) 0.992
               HighTermMonthSort     1762.09      (6.4%)     1764.93      (5.3%)    0.2% ( -10% -   12%) 0.931
                    OrHighNotMed      331.20      (7.1%)      331.97      (8.7%)    0.2% ( -14% -   17%) 0.926
                    OrNotHighMed      276.66      (7.0%)      277.84     (10.0%)    0.4% ( -15% -   18%) 0.876
             LowIntervalsOrdered       18.28      (4.3%)       18.41      (3.7%)    0.7% (  -6% -    9%) 0.593
                         Prefix3      429.15      (3.0%)      433.02      (3.2%)    0.9% (  -5% -    7%) 0.353
               HighTermTitleSort      124.84      (4.8%)      126.06      (5.9%)    1.0% (  -9% -   12%) 0.565
                         MedTerm      405.21     (10.5%)      409.60      (8.7%)    1.1% ( -16% -   22%) 0.722
                      AndHighMed      216.72      (6.6%)      219.11      (4.7%)    1.1% (  -9% -   13%) 0.541
                       MedPhrase       79.97      (5.5%)       80.87      (6.0%)    1.1% (  -9% -   13%) 0.540
                      TermDTSort      165.81      (7.8%)      167.85      (9.0%)    1.2% ( -14% -   19%) 0.644
                    OrNotHighLow      588.69      (3.4%)      596.51      (5.2%)    1.3% (  -7% -   10%) 0.338
                       OrHighLow      436.10      (5.7%)      442.05      (6.9%)    1.4% ( -10% -   14%) 0.497
                         LowTerm      725.03      (5.4%)      735.84      (6.4%)    1.5% (  -9% -   14%) 0.426
                   OrHighNotHigh      171.64      (8.7%)      174.23      (8.7%)    1.5% ( -14% -   20%) 0.583
                      OrHighHigh       72.38      (5.3%)       73.65      (4.8%)    1.8% (  -7% -   12%) 0.269
                      AndHighLow      560.60      (5.0%)      572.60      (5.6%)    2.1% (  -8% -   13%) 0.200
                   OrNotHighHigh      157.28      (6.5%)      160.77      (9.3%)    2.2% ( -12% -   19%) 0.382
                          IntNRQ      964.08      (8.8%)     1010.37      (5.3%)    4.8% (  -8% -   20%) 0.036
                    OrHighNotLow      276.39     (10.3%)      290.60      (8.3%)    5.1% ( -12% -   26%) 0.082

mikemccand

Thank you for the benchmarking @jpountz!

uschindler · 2024-04-04T15:00:14Z

Thanks! I will provide a PR to make it configurable.

uschindler · 2024-04-04T15:50:25Z

See #13264

* Lucene 10 changed the IOContext.DEFAULT from sequential to random, which makes sense for search use case: apache/lucene#13244
* place we read a file only once, its better to switch to READONLY(sequential)
* this should only be in cases the file is read by the same thread that opened it, e.g. it won't work for RemoteStore that does async upload

Signed-off-by: Asim Mahmood <[email protected]>


Remove ReadAdvice.NORMAL.

8f70840

This effectively forces index inputs to be either open with a SEQUENTIAL or RANDOM advice, with nothing in between.

jpountz mentioned this pull request Mar 29, 2024

Recommend lowering the default mmap readahead. #13223

Closed

Merge branch 'main' into remove_ReadAdvice_NORMAL

28cefd0

jpountz added 2 commits April 4, 2024 11:45

Merge branch 'main' into remove_ReadAdvice_NORMAL

683956e

Add back ReadAdvice.NORMAL, keep RANDOM the default.

b64fd4f

jpountz changed the title ~~Remove ReadAdvice.NORMAL.~~ Use ReadAdvice.RANDOM by default. Apr 4, 2024

jpountz marked this pull request as ready for review April 4, 2024 09:56

uschindler approved these changes Apr 4, 2024

rmuir approved these changes Apr 4, 2024

jpountz added this to the 10.0.0 milestone Apr 4, 2024

uschindler approved these changes Apr 4, 2024

mikemccand approved these changes Apr 4, 2024

CHANGES

d0ea3a7

jpountz merged commit a2676b1 into apache:main Apr 4, 2024

jpountz deleted the remove_ReadAdvice_NORMAL branch April 4, 2024 14:45

uschindler mentioned this pull request Apr 4, 2024

Make the default ReadAdvice configurable by sysprop #13264

Merged

This was referenced Mar 24, 2025

Switch from IOContext.DEFAULT(RANDOM) to READONLY for sequential cases opensearch-project/OpenSearch#17670

Merged

Switch from IOContext.DEFAULT(RANDOM) to READONLY for sequential cases opensearch-project/OpenSearch#17672

Closed

bharath-techie mentioned this pull request Apr 3, 2025

[BUG] Force merge in 3.0 OS is slower than 2.19 OS opensearch-project/OpenSearch#17722

Open
