Skip to content

POC: Parquet predicate results cache #7760

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Jun 24, 2025

Which issue does this PR close?

TODO:

  • Avoid double buffering of intermediate results (via coalesce)
  • File ticket to optimize coalesce kernel more: [EPIC] Optimize the coalesece kernel (BatchCoalescer) #7761
  • Add memory limit for results cache
  • Review actual predicates that are pushed down in DataFusion (are they multiple single column predicates or is it a single set of cached results....) -- aka do we have to handle multiple ArrowPredicates (probably...)

Rationale for this change

I am working on not decoding predicate columns twice when evaluating filters in the reader

In #7513 we prototyped several APIs that we have now started making real (like BatchCoalescer) so I made a new PR that used those APIs which doesn't have hundreds of comments.

I am pleased with how it is looking now, and as before I don't really plan to merge this PR as is, I am using it as a design vehicle

What changes are included in this PR?

  1. Add code to cache columns which are reused in filter and scan

Are there any user-facing changes?

Not yet,

@github-actions github-actions bot added parquet Changes to the parquet crate arrow Changes to the arrow crate labels Jun 24, 2025
@alamb
Copy link
Contributor Author

alamb commented Jun 24, 2025

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.11.0-1015-gcp #15~24.04.1-Ubuntu SMP Thu Apr 24 20:41:05 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/cache_filter_result2 (025d411) to 2b40d1d diff
BENCH_NAME=arrow_reader_clickbench
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_clickbench
BENCH_FILTER=
BENCH_BRANCH_NAME=alamb_cache_filter_result2
Results will be posted here when complete

@alamb
Copy link
Contributor Author

alamb commented Jun 24, 2025

🤖: Benchmark completed

Details

group                                alamb_cache_filter_result2             main
-----                                --------------------------             ----
arrow_reader_clickbench/async/Q1     1.00      2.1±0.02ms        ? ?/sec    1.14      2.4±0.01ms        ? ?/sec
arrow_reader_clickbench/async/Q10    1.00     13.7±0.13ms        ? ?/sec    1.06     14.6±0.15ms        ? ?/sec
arrow_reader_clickbench/async/Q11    1.00     15.7±0.17ms        ? ?/sec    1.05     16.5±0.15ms        ? ?/sec
arrow_reader_clickbench/async/Q12    1.00     27.2±0.26ms        ? ?/sec    1.46     39.5±0.31ms        ? ?/sec
arrow_reader_clickbench/async/Q13    1.00     40.1±0.30ms        ? ?/sec    1.34     53.8±0.50ms        ? ?/sec
arrow_reader_clickbench/async/Q14    1.00     38.2±0.31ms        ? ?/sec    1.36     51.8±0.60ms        ? ?/sec
arrow_reader_clickbench/async/Q19    1.03      5.3±0.06ms        ? ?/sec    1.00      5.1±0.06ms        ? ?/sec
arrow_reader_clickbench/async/Q20    1.00    109.4±0.67ms        ? ?/sec    1.66    181.8±1.02ms        ? ?/sec
arrow_reader_clickbench/async/Q21    1.00    126.9±0.70ms        ? ?/sec    1.88    239.3±0.97ms        ? ?/sec
arrow_reader_clickbench/async/Q22    1.00    194.8±1.08ms        ? ?/sec    2.63    512.7±1.91ms        ? ?/sec
arrow_reader_clickbench/async/Q23    1.00    436.2±3.40ms        ? ?/sec    1.17   509.6±13.94ms        ? ?/sec
arrow_reader_clickbench/async/Q24    1.00     44.7±0.34ms        ? ?/sec    1.32     59.1±0.67ms        ? ?/sec
arrow_reader_clickbench/async/Q27    1.00    131.9±2.08ms        ? ?/sec    1.27    167.4±1.00ms        ? ?/sec
arrow_reader_clickbench/async/Q28    1.00    125.1±7.17ms        ? ?/sec    1.33    165.9±0.71ms        ? ?/sec
arrow_reader_clickbench/async/Q30    1.00     65.6±0.49ms        ? ?/sec    1.00     65.8±0.46ms        ? ?/sec
arrow_reader_clickbench/async/Q36    1.00    129.1±0.68ms        ? ?/sec    1.33    171.2±1.34ms        ? ?/sec
arrow_reader_clickbench/async/Q37    1.00    101.1±0.66ms        ? ?/sec    1.03    104.4±0.43ms        ? ?/sec
arrow_reader_clickbench/async/Q38    1.02     40.4±0.33ms        ? ?/sec    1.00     39.7±0.21ms        ? ?/sec
arrow_reader_clickbench/async/Q39    1.00     49.8±0.35ms        ? ?/sec    1.00     49.7±0.33ms        ? ?/sec
arrow_reader_clickbench/async/Q40    1.00     49.6±0.40ms        ? ?/sec    1.09     54.3±0.47ms        ? ?/sec
arrow_reader_clickbench/async/Q41    1.01     41.3±0.43ms        ? ?/sec    1.00     41.0±0.34ms        ? ?/sec
arrow_reader_clickbench/async/Q42    1.01     14.5±0.13ms        ? ?/sec    1.00     14.3±0.12ms        ? ?/sec
arrow_reader_clickbench/sync/Q1      1.00   1883.0±7.17µs        ? ?/sec    1.17      2.2±0.01ms        ? ?/sec
arrow_reader_clickbench/sync/Q10     1.00     12.4±0.13ms        ? ?/sec    1.07     13.2±0.11ms        ? ?/sec
arrow_reader_clickbench/sync/Q11     1.00     14.3±0.18ms        ? ?/sec    1.05     15.0±0.08ms        ? ?/sec
arrow_reader_clickbench/sync/Q12     1.00     32.5±2.51ms        ? ?/sec    1.17     38.0±0.30ms        ? ?/sec
arrow_reader_clickbench/sync/Q13     1.00     37.0±0.23ms        ? ?/sec    1.37     50.9±0.46ms        ? ?/sec
arrow_reader_clickbench/sync/Q14     1.00     36.0±0.27ms        ? ?/sec    1.37     49.5±0.33ms        ? ?/sec
arrow_reader_clickbench/sync/Q19     1.04      4.5±0.03ms        ? ?/sec    1.00      4.3±0.04ms        ? ?/sec
arrow_reader_clickbench/sync/Q20     1.00     97.6±0.51ms        ? ?/sec    1.84    179.6±0.76ms        ? ?/sec
arrow_reader_clickbench/sync/Q21     1.00    113.9±0.68ms        ? ?/sec    2.11    240.4±1.42ms        ? ?/sec
arrow_reader_clickbench/sync/Q22     1.00    174.7±1.12ms        ? ?/sec    2.83    495.4±2.41ms        ? ?/sec
arrow_reader_clickbench/sync/Q23     1.00    362.5±2.28ms        ? ?/sec    1.26   458.3±12.59ms        ? ?/sec
arrow_reader_clickbench/sync/Q24     1.00     41.7±0.34ms        ? ?/sec    1.36     56.6±0.61ms        ? ?/sec
arrow_reader_clickbench/sync/Q27     1.00    121.9±2.14ms        ? ?/sec    1.30    158.1±0.87ms        ? ?/sec
arrow_reader_clickbench/sync/Q28     1.00    112.1±1.13ms        ? ?/sec    1.40    156.5±1.13ms        ? ?/sec
arrow_reader_clickbench/sync/Q30     1.00     63.8±0.40ms        ? ?/sec    1.00     63.8±0.44ms        ? ?/sec
arrow_reader_clickbench/sync/Q36     1.00    117.6±1.00ms        ? ?/sec    1.37    161.6±0.99ms        ? ?/sec
arrow_reader_clickbench/sync/Q37     1.00     94.9±0.57ms        ? ?/sec    1.02     97.0±0.53ms        ? ?/sec
arrow_reader_clickbench/sync/Q38     1.01     32.6±0.24ms        ? ?/sec    1.00     32.4±0.23ms        ? ?/sec
arrow_reader_clickbench/sync/Q39     1.01     35.7±0.30ms        ? ?/sec    1.00     35.2±0.41ms        ? ?/sec
arrow_reader_clickbench/sync/Q40     1.00     46.2±0.48ms        ? ?/sec    1.09     50.3±0.30ms        ? ?/sec
arrow_reader_clickbench/sync/Q41     1.02     38.5±0.43ms        ? ?/sec    1.00     37.6±0.25ms        ? ?/sec
arrow_reader_clickbench/sync/Q42     1.01     13.8±0.14ms        ? ?/sec    1.00     13.7±0.14ms        ? ?/sec

@Dandandan
Copy link
Contributor

Nice, no regressions?

@alamb
Copy link
Contributor Author

alamb commented Jun 24, 2025

Nice, no regressions?

Seems like the coalesce batches work is paying off !

@alamb alamb force-pushed the alamb/cache_filter_result2 branch from d236925 to 025d411 Compare June 24, 2025 12:42
@alamb
Copy link
Contributor Author

alamb commented Jun 24, 2025

I have been thinking about memory usage as well as that will be a major factor in if we can cache the predicate results.

I annotated the Q22 with code to calculate memory usage of the cached results:

  • Cached predicate memory usage: 426344 bytes
    That looks strong

Q22 (some of the best performance gains):

 arrow_reader_clickbench/async/Q22    1.00    194.8±1.08ms        ? ?/sec    2.63    512.7±1.91ms        ? ?/sec
        // Q22: SELECT "SearchPhrase", MIN("URL"), MIN("Title"), COUNT(*) AS c, COUNT(DISTINCT "UserID") FROM hits WHERE "Title" LIKE '%Google%' AND "URL" NOT LIKE '%.google.%' AND "SearchPhrase" <> '' GROUP BY "SearchPhrase" ORDER BY c DESC LIMIT 10;
        Query {
            name: "Q22",
            filter_columns: vec!["Title", "URL", "SearchPhrase"],
            projection_columns: vec!["SearchPhrase", "URL", "Title", "UserID"],
            predicates: vec![
                ClickBenchPredicate::like_Google(0),
                ClickBenchPredicate::nlike_google(1),
                ClickBenchPredicate::not_empty(2),
            ],
            expected_row_count: 46,
        },

for hits_1.parquet, the data sizes are:

  • Title: 0.46MB (row group 0), 1.0MB (row group 1), 71.97MB (row group 2)
  • URL: 4.93MB (row group 0), 43.63MB (row group 1), 40.19MB (row group 2)
  • SearchPhrase: 0.03MB (row group 0), 0.11MB (row group 1), 8.6MB (row group 2)

(I totally used @XiangpengHao 's https://parquet-viewer.xiangpeng.systems/ for this analysis)

Screenshot 2025-06-24 at 8 59 18 AM

I will try and add some additional debugging / annotation code to see what the peak memory usage was (and if I limit it to 1MB if that will get triggered for any query)

@alamb
Copy link
Contributor Author

alamb commented Jun 24, 2025

Q22: Cached predicate result size: 827752
Q22: Cached predicate result size: 442728

Seems reasonable to me. I need to think about memory handling a bit more carefully now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate parquet Changes to the parquet crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants