Use ReadAdvice.RANDOM by default. #13244

Merged on Apr 4, 2024 (5 commits)

Conversation

jpountz
Contributor

@jpountz jpountz commented Mar 29, 2024

This switches the default ReadAdvice from NORMAL to RANDOM, which is a better fit for the kind of access pattern that Lucene has. This is expected to reduce page cache thrashing and contention on the page table.

NORMAL is still available, but never used by any of the file formats.

This effectively forces index inputs to be opened with either SEQUENTIAL or
RANDOM advice, with nothing in between.
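
To illustrate the "either SEQUENTIAL or RANDOM" idea, here is a minimal, self-contained sketch. It is not Lucene's code: the `Advice` and `Access` enums and both methods are hypothetical names made up for this example, and the flag values are the Linux `POSIX_MADV_*` constants, which may differ on other platforms. It also assumes merges are the one-shot sequential case.

```java
public class ReadAdviceSketch {
  /** Hypothetical mirror of the three advice levels discussed in this PR. */
  enum Advice { NORMAL, RANDOM, SEQUENTIAL }

  /** Hypothetical access contexts, invented for this sketch. */
  enum Access { SEARCH, MERGE }

  /** Linux values of POSIX_MADV_NORMAL / RANDOM / SEQUENTIAL (assumption: Linux). */
  static int madviseFlag(Advice advice) {
    switch (advice) {
      case NORMAL:
        return 0;
      case RANDOM:
        return 1;
      case SEQUENTIAL:
        return 2;
      default:
        throw new AssertionError(advice);
    }
  }

  /** One-shot sequential consumers get SEQUENTIAL, everything else gets RANDOM. */
  static Advice adviceFor(Access access) {
    switch (access) {
      case MERGE:
        return Advice.SEQUENTIAL;
      default:
        return Advice.RANDOM;
    }
  }

  public static void main(String[] args) {
    for (Access access : Access.values()) {
      Advice advice = adviceFor(access);
      System.out.println(access + " -> " + advice + " (madvise flag " + madviseFlag(advice) + ")");
    }
  }
}
```

NORMAL stays in the enum so that callers can still ask for it explicitly, but nothing in `adviceFor` ever returns it.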
@uschindler
Contributor

I think this idea is not too bad, because as Robert said, unless we merge or flush, access is always random, so readahead is bad.

We should still compare results under memory pressure and also without pressure to compare how it behaves.

For consistency we should still keep the enum constant, so one can have the option to use another one.

Maybe make the default configurable via sysprop?

@mikemccand
Member

Unfortunately, benchmarking the cold index case correctly is not so easy ... I would not trust luceneutil to give accurate results (its queries are synthetically generated).

We would rather need a real-world large index (or use ramhog to cut back on free OS RAM), and, importantly, real-world and matching query traffic that shows the typical/realistic Zipfian distribution on search terms.

The queries should not only be realistic, they should also be delivered to Lucene accurately by time (i.e. at the actual arrival times that the queries came to the search engine), asynchronously ("open loop"), to avoid the coordinated omission bug.
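
A small sketch of why open-loop delivery matters, assuming a single-threaded server and made-up arrival and service times (all names here are hypothetical and unrelated to luceneutil). Open-loop latency is measured from the intended arrival time, so queries queued behind a slow one are charged for the wait. A closed-loop client only sends the next query once the previous one returns, so it never sees that wait.

```java
import java.util.Arrays;

public class OpenLoopSketch {
  /** Open loop: latency runs from the intended arrival time to completion. */
  static long[] openLoopLatencies(long[] arrivalMs, long[] serviceMs) {
    long[] latency = new long[arrivalMs.length];
    long serverFreeAt = 0;
    for (int i = 0; i < arrivalMs.length; i++) {
      // A query cannot start before it arrives, nor before the server is free.
      long start = Math.max(arrivalMs[i], serverFreeAt);
      serverFreeAt = start + serviceMs[i];
      latency[i] = serverFreeAt - arrivalMs[i];
    }
    return latency;
  }

  /** Closed loop: the client waits for each response, so it only observes service time. */
  static long[] closedLoopLatencies(long[] serviceMs) {
    return serviceMs.clone();
  }

  public static void main(String[] args) {
    long[] arrivalMs = {0, 10, 20, 30}; // intended arrival times (made up)
    long[] serviceMs = {100, 5, 5, 5};  // one slow query followed by fast ones (made up)
    System.out.println("open loop:   " + Arrays.toString(openLoopLatencies(arrivalMs, serviceMs)));
    System.out.println("closed loop: " + Arrays.toString(closedLoopLatencies(serviceMs)));
  }
}
```

With these inputs the open-loop latencies are 100, 95, 90 and 85 ms, while the closed-loop view reports 100, 5, 5 and 5 ms: the stall caused by the first query is omitted from the closed-loop numbers.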

@jpountz
Contributor Author

jpountz commented Apr 3, 2024

@mikemccand I wonder if we need to create such a sophisticated benchmark. If we could confirm that performance is not affected when the cache is hot, and better when the cache is cold, maybe that would be good enough?

@mikemccand
Member

Yeah +1, I don't think we should block this change on sophisticated benchmarking! If we "first do no harm" (hot case not affected), and if we can show some improvement (or even no degradation?) in a simple cold benchmark, then we should make this simplification!

@uschindler
Contributor

Could we still keep the NORMAL ReadAdvice constant and its mappings? We should just change the default!

I can add a system property to make it configurable like the other MMapDir options.
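
For illustration, a configurable default could look roughly like the sketch below. This is an assumption about the shape of such a change, not the actual follow-up: the property name `example.defaultReadAdvice` and the `Advice` enum are hypothetical placeholders.

```java
import java.util.Locale;

public class DefaultAdviceSketch {
  /** Hypothetical mirror of the advice levels; NORMAL is kept as an option. */
  enum Advice { NORMAL, RANDOM, SEQUENTIAL }

  /** Hypothetical property name, chosen for this example only. */
  static final String PROP = "example.defaultReadAdvice";

  /** Falls back to RANDOM when unset; throws IllegalArgumentException on unknown values. */
  static Advice parse(String value) {
    if (value == null || value.isBlank()) {
      return Advice.RANDOM;
    }
    return Advice.valueOf(value.trim().toUpperCase(Locale.ROOT));
  }

  static Advice defaultAdvice() {
    return parse(System.getProperty(PROP));
  }

  public static void main(String[] args) {
    System.out.println("default advice: " + defaultAdvice());
  }
}
```

Running with `-Dexample.defaultReadAdvice=normal` would select NORMAL in this sketch; leaving the property unset keeps RANDOM.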

@jpountz jpountz changed the title from "Remove ReadAdvice.NORMAL." to "Use ReadAdvice.RANDOM by default." Apr 4, 2024
@jpountz jpountz marked this pull request as ready for review April 4, 2024 09:56
@jpountz
Contributor Author

jpountz commented Apr 4, 2024

That works for me @uschindler. I updated the code and the PR description.

Contributor

@uschindler uschindler left a comment

I think a system property can be added in a separate PR.

@jpountz
Contributor Author

jpountz commented Apr 4, 2024

Here's a luceneutil run on wikibigall and data hot in the page cache. No difference, which is what was expected I guess:

                            Task  QPS baseline    StdDev  QPS my_modified_version    StdDev                Pct diff  p-value
                         MedTerm      539.57      (7.5%)      528.42      (6.0%)   -2.1% ( -14% -   12%) 0.338
                         LowTerm      988.10      (5.9%)      970.05      (5.4%)   -1.8% ( -12% -   10%) 0.308
                    OrHighNotLow      317.47      (6.2%)      312.12      (5.0%)   -1.7% ( -12% -   10%) 0.343
                        HighTerm      252.16      (7.3%)      248.37      (5.6%)   -1.5% ( -13% -   12%) 0.464
                   OrHighNotHigh      288.74      (6.3%)      284.71      (4.5%)   -1.4% ( -11% -    9%) 0.419
                        Wildcard      100.66      (2.5%)       99.37      (2.6%)   -1.3% (  -6% -    3%) 0.117
                    OrHighNotMed      374.03      (6.2%)      369.44      (5.0%)   -1.2% ( -11% -   10%) 0.491
                    OrNotHighMed      248.17      (3.7%)      245.38      (4.0%)   -1.1% (  -8% -    6%) 0.359
                   OrNotHighHigh      156.42      (6.3%)      154.80      (5.1%)   -1.0% ( -11% -   11%) 0.568
                        PKLookup      295.59      (2.0%)      293.25      (3.0%)   -0.8% (  -5% -    4%) 0.329
                         Respell       50.41      (2.0%)       50.06      (2.0%)   -0.7% (  -4% -    3%) 0.264
                      AndHighLow     1056.24      (3.4%)     1049.79      (3.6%)   -0.6% (  -7% -    6%) 0.582
                    OrNotHighLow      704.71      (3.4%)      702.19      (3.6%)   -0.4% (  -7% -    6%) 0.748
                          IntNRQ      750.87      (4.4%)      748.32      (3.6%)   -0.3% (  -8% -    8%) 0.791
                     LowSpanNear        2.50      (1.1%)        2.50      (1.2%)   -0.2% (  -2% -    2%) 0.519
            HighTermTitleBDVSort       12.75      (8.5%)       12.72      (5.6%)   -0.2% ( -13% -   15%) 0.935
                       OrHighMed      219.68      (2.1%)      219.28      (2.6%)   -0.2% (  -4% -    4%) 0.805
            HighIntervalsOrdered        3.32      (4.3%)        3.31      (4.4%)   -0.2% (  -8% -    8%) 0.902
                 LowSloppyPhrase       17.94      (1.6%)       17.94      (2.2%)   -0.0% (  -3% -    3%) 0.972
                     MedSpanNear       12.89      (1.4%)       12.89      (1.3%)    0.0% (  -2% -    2%) 0.990
                      OrHighHigh       80.78      (1.3%)       80.79      (1.5%)    0.0% (  -2% -    2%) 0.978
             LowIntervalsOrdered       17.67      (2.4%)       17.69      (2.3%)    0.1% (  -4% -    4%) 0.910
                      AndHighMed       86.41      (2.1%)       86.50      (2.1%)    0.1% (  -4% -    4%) 0.870
                          Fuzzy1      104.55      (2.6%)      104.71      (1.8%)    0.2% (  -4% -    4%) 0.825
                         Prefix3      190.94      (4.2%)      191.33      (3.7%)    0.2% (  -7% -    8%) 0.872
                       MedPhrase       82.82      (5.1%)       82.99      (2.8%)    0.2% (  -7% -    8%) 0.872
               HighTermMonthSort     2859.54      (7.2%)     2866.04      (7.6%)    0.2% ( -13% -   16%) 0.923
                      TermDTSort      420.48      (5.5%)      421.49      (5.1%)    0.2% (  -9% -   11%) 0.888
                       OrHighLow      610.57      (2.6%)      612.29      (2.4%)    0.3% (  -4% -    5%) 0.724
               HighTermTitleSort      149.70      (1.5%)      150.16      (1.2%)    0.3% (  -2% -    3%) 0.477
                HighSloppyPhrase        3.04      (3.3%)        3.05      (2.8%)    0.4% (  -5% -    6%) 0.672
                      HighPhrase       20.25      (4.3%)       20.34      (2.2%)    0.4% (  -5% -    7%) 0.685
                     AndHighHigh       45.77      (2.2%)       45.98      (2.6%)    0.5% (  -4% -    5%) 0.537
             MedIntervalsOrdered        1.93      (1.6%)        1.94      (2.9%)    0.6% (  -3% -    5%) 0.395
                 MedSloppyPhrase       10.80      (2.8%)       10.88      (2.4%)    0.7% (  -4% -    6%) 0.407
                       LowPhrase       12.05      (4.6%)       12.14      (2.4%)    0.7% (  -6% -    8%) 0.534
                          Fuzzy2       95.06      (3.6%)       95.98      (2.6%)    1.0% (  -5% -    7%) 0.327
           HighTermDayOfYearSort      569.67      (3.4%)      575.25      (2.5%)    1.0% (  -4% -    7%) 0.302
                    HighSpanNear        5.34      (5.6%)        5.44      (6.2%)    1.8% (  -9% -   14%) 0.329

@jpountz jpountz added this to the 10.0.0 milestone Apr 4, 2024
@jpountz
Contributor Author

jpountz commented Apr 4, 2024

I hacked luceneutil to clear my page cache with echo 1 > /proc/sys/vm/drop_caches before each query:

diff --git a/src/main/perf/TaskThreads.java b/src/main/perf/TaskThreads.java
index 313664f..b42a1bd 100644
--- a/src/main/perf/TaskThreads.java
+++ b/src/main/perf/TaskThreads.java
@@ -89,6 +89,12 @@ public class TaskThreads {
                                                // Done
                                                break;
                                        }
+                                       ProcessBuilder pb = new ProcessBuilder("/bin/bash", "/home/jpountz/drop_caches.sh");
+                                       Process p = pb.start();
+                                       int code = p.waitFor();
+                                       if (code != 0) {
+                                         throw new Error();
+                                       }
                                        final long t0 = System.nanoTime();
                                        try {
                                                task.go(indexState, taskParser);

This gives the following output on wikibigall. Differences are somewhat bigger and some p values are low-ish, e.g. OrHighNotLow and IntNRQ (faster) or HighTermDayOfYearSort and HighTermTitleBDVSort (slower) but nothing jumps out as being a blocker for this change.

                            Task  QPS baseline    StdDev  QPS my_modified_version    StdDev                Pct diff  p-value
           HighTermDayOfYearSort      356.29      (7.2%)      346.35      (5.8%)   -2.8% ( -14% -   11%) 0.178
            HighTermTitleBDVSort       16.47      (7.1%)       16.02      (4.3%)   -2.7% ( -13% -    9%) 0.139
                        HighTerm      284.20      (8.8%)      276.94     (10.1%)   -2.6% ( -19% -   17%) 0.395
                         Respell       51.19      (3.7%)       50.49      (4.4%)   -1.4% (  -9% -    6%) 0.286
                     LowSpanNear       19.00      (3.3%)       18.76      (3.4%)   -1.3% (  -7% -    5%) 0.222
                        PKLookup      238.02      (4.4%)      235.09      (6.7%)   -1.2% ( -11% -   10%) 0.494
                        Wildcard      183.79      (5.4%)      181.86      (6.3%)   -1.1% ( -12% -   11%) 0.571
                       LowPhrase       67.32      (5.7%)       66.67      (5.6%)   -1.0% ( -11% -   11%) 0.591
                    HighSpanNear       26.83      (4.8%)       26.58      (3.6%)   -0.9% (  -8% -    7%) 0.501
            HighIntervalsOrdered        3.10      (3.6%)        3.07      (3.9%)   -0.9% (  -8% -    6%) 0.451
                      HighPhrase      122.98      (5.7%)      121.92      (4.0%)   -0.9% (  -9% -    9%) 0.578
                          Fuzzy2       61.85      (3.3%)       61.56      (4.1%)   -0.5% (  -7% -    7%) 0.693
                HighSloppyPhrase       12.79      (2.9%)       12.73      (3.2%)   -0.4% (  -6% -    5%) 0.642
                          Fuzzy1       78.45      (3.4%)       78.18      (3.0%)   -0.3% (  -6% -    6%) 0.728
                     MedSpanNear        5.64      (2.8%)        5.63      (2.6%)   -0.3% (  -5% -    5%) 0.735
                 LowSloppyPhrase        2.31      (3.6%)        2.31      (3.7%)   -0.2% (  -7% -    7%) 0.843
                 MedSloppyPhrase       13.62      (3.5%)       13.59      (4.0%)   -0.2% (  -7% -    7%) 0.862
             MedIntervalsOrdered       11.97      (4.2%)       11.94      (5.0%)   -0.2% (  -9% -    9%) 0.891
                     AndHighHigh       45.42      (5.0%)       45.33      (4.2%)   -0.2% (  -8% -    9%) 0.893
                       OrHighMed      214.12      (8.0%)      214.07      (5.9%)   -0.0% ( -12% -   15%) 0.992
               HighTermMonthSort     1762.09      (6.4%)     1764.93      (5.3%)    0.2% ( -10% -   12%) 0.931
                    OrHighNotMed      331.20      (7.1%)      331.97      (8.7%)    0.2% ( -14% -   17%) 0.926
                    OrNotHighMed      276.66      (7.0%)      277.84     (10.0%)    0.4% ( -15% -   18%) 0.876
             LowIntervalsOrdered       18.28      (4.3%)       18.41      (3.7%)    0.7% (  -6% -    9%) 0.593
                         Prefix3      429.15      (3.0%)      433.02      (3.2%)    0.9% (  -5% -    7%) 0.353
               HighTermTitleSort      124.84      (4.8%)      126.06      (5.9%)    1.0% (  -9% -   12%) 0.565
                         MedTerm      405.21     (10.5%)      409.60      (8.7%)    1.1% ( -16% -   22%) 0.722
                      AndHighMed      216.72      (6.6%)      219.11      (4.7%)    1.1% (  -9% -   13%) 0.541
                       MedPhrase       79.97      (5.5%)       80.87      (6.0%)    1.1% (  -9% -   13%) 0.540
                      TermDTSort      165.81      (7.8%)      167.85      (9.0%)    1.2% ( -14% -   19%) 0.644
                    OrNotHighLow      588.69      (3.4%)      596.51      (5.2%)    1.3% (  -7% -   10%) 0.338
                       OrHighLow      436.10      (5.7%)      442.05      (6.9%)    1.4% ( -10% -   14%) 0.497
                         LowTerm      725.03      (5.4%)      735.84      (6.4%)    1.5% (  -9% -   14%) 0.426
                   OrHighNotHigh      171.64      (8.7%)      174.23      (8.7%)    1.5% ( -14% -   20%) 0.583
                      OrHighHigh       72.38      (5.3%)       73.65      (4.8%)    1.8% (  -7% -   12%) 0.269
                      AndHighLow      560.60      (5.0%)      572.60      (5.6%)    2.1% (  -8% -   13%) 0.200
                   OrNotHighHigh      157.28      (6.5%)      160.77      (9.3%)    2.2% ( -12% -   19%) 0.382
                          IntNRQ      964.08      (8.8%)     1010.37      (5.3%)    4.8% (  -8% -   20%) 0.036
                    OrHighNotLow      276.39     (10.3%)      290.60      (8.3%)    5.1% ( -12% -   26%) 0.082
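
For readers unfamiliar with the table, the "Pct diff" column appears to be the relative change in mean QPS from baseline to the modified version. The sketch below recomputes it for three rows of the cold-cache run. It is an illustration under that assumption, not luceneutil's code, and it does not attempt to reproduce the bracketed range or the p-value.

```java
import java.util.Locale;

public class PctDiffSketch {
  /** Relative change of the candidate's QPS versus the baseline, in percent. */
  static double pctDiff(double baselineQps, double candidateQps) {
    return 100.0 * (candidateQps - baselineQps) / baselineQps;
  }

  public static void main(String[] args) {
    // Mean QPS values copied from the cold-cache table above.
    String[] tasks = {"HighTermDayOfYearSort", "IntNRQ", "OrHighNotLow"};
    double[] baseline = {356.29, 964.08, 276.39};
    double[] candidate = {346.35, 1010.37, 290.60};
    for (int i = 0; i < tasks.length; i++) {
      System.out.println(
          String.format(Locale.ROOT, "%s: %.1f%%", tasks[i], pctDiff(baseline[i], candidate[i])));
    }
  }
}
```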

Member

@mikemccand mikemccand left a comment

Thank you for the benchmarking @jpountz!

@jpountz jpountz merged commit a2676b1 into apache:main Apr 4, 2024
@jpountz jpountz deleted the remove_ReadAdvice_NORMAL branch April 4, 2024 14:45
@uschindler
Contributor

Thanks! I will provide a PR to make it configurable.

@uschindler
Contributor

See #13264

asimmahmood1 added a commit to asimmahmood1/OpenSearch that referenced this pull request Mar 24, 2025
* Lucene 10 changed the IOContext.DEFAULT from sequential to random,
  which makes sense for search use case: apache/lucene#13244
* place we read a file only once, its better to switch to
  READONLY(sequential)
* this should only be in cases the file is read by the same thread that
  opened it, e.g. it won't work for RemoteStore that does async upload

Signed-off-by: Asim Mahmood <[email protected]>
msfroh pushed a commit to opensearch-project/OpenSearch that referenced this pull request Apr 8, 2025
#17670)

* Lucene 10 changed the IOContext.DEFAULT from sequential to random,
  which makes sense for search use case: apache/lucene#13244
* place we read a file only once, its better to switch to
  READONLY(sequential)
* this should only be in cases the file is read by the same thread that
  opened it, e.g. it won't work for RemoteStore that does async upload

---------

Signed-off-by: Asim Mahmood <[email protected]>
Harsh-87 pushed a commit to Harsh-87/OpenSearch that referenced this pull request May 7, 2025
opensearch-project#17670)

* Lucene 10 changed the IOContext.DEFAULT from sequential to random,
  which makes sense for search use case: apache/lucene#13244
* place we read a file only once, its better to switch to
  READONLY(sequential)
* this should only be in cases the file is read by the same thread that
  opened it, e.g. it won't work for RemoteStore that does async upload

---------

Signed-off-by: Asim Mahmood <[email protected]>
Signed-off-by: Harsh Kothari <[email protected]>