POC: Test DataFusion with experimental Parquet Filter Pushdown (try 2) #16562

alamb · 2025-06-26T09:56:03Z

Which issue does this PR close?

Related to Enable parquet filter pushdown (filter_pushdown) by default #3463
Updated version of WIP: Test DataFusion with experimental Parquet Filter Pushdown WIP: Test DataFusion with experimental Parquet Filter Pushdown #16222

Rationale for this change

I am doing end to end testing of new parquet pushdown techniques

My plan is to use this PR and analysis to guide additional work needed to get filter pushdown on by default

What changes are included in this PR?

Pin to use pushdown in this branch: POC: Parquet predicate results cache arrow-rs#7760
Force filter pushdown to true

Test Plan

Change the default filter_pushdown and reorder_filters to true
Run clickbench benchmarks (see where we are)
Profile those queries where there is a slowdown

Profiling Anaylsis

(in progress)

alamb · 2025-06-26T11:04:32Z

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1015-gcp #15~24.04.1-Ubuntu SMP Thu Apr 24 20:41:05 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/new_parquet_result_caching (0d4f506) to db13dd9 diff
Benchmarks: clickbench_partitioned clickbench_1
Results will be posted here when complete

alamb · 2025-06-26T11:33:51Z

🤖: Benchmark completed

Details

Comparing HEAD and alamb_new_parquet_result_caching
--------------------
Benchmark clickbench_1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_new_parquet_result_caching ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     0.59 ms │                          0.57 ms │     no change │
│ QQuery 1     │    74.83 ms │                         76.78 ms │     no change │
│ QQuery 2     │   108.75 ms │                        105.56 ms │     no change │
│ QQuery 3     │   130.57 ms │                        131.14 ms │     no change │
│ QQuery 4     │   634.26 ms │                        628.89 ms │     no change │
│ QQuery 5     │   881.36 ms │                        842.59 ms │     no change │
│ QQuery 6     │     0.65 ms │                          0.65 ms │     no change │
│ QQuery 7     │    82.37 ms │                         87.41 ms │  1.06x slower │
│ QQuery 8     │   890.73 ms │                        873.34 ms │     no change │
│ QQuery 9     │  1203.12 ms │                       1189.95 ms │     no change │
│ QQuery 10    │   289.41 ms │                        294.86 ms │     no change │
│ QQuery 11    │   325.92 ms │                        330.25 ms │     no change │
│ QQuery 12    │   882.85 ms │                        892.12 ms │     no change │
│ QQuery 13    │  1269.86 ms │                       1452.48 ms │  1.14x slower │
│ QQuery 14    │   823.95 ms │                        989.67 ms │  1.20x slower │
│ QQuery 15    │   815.75 ms │                        823.76 ms │     no change │
│ QQuery 16    │  1639.70 ms │                       1643.62 ms │     no change │
│ QQuery 17    │  1621.55 ms │                       1604.17 ms │     no change │
│ QQuery 18    │  2959.17 ms │                       2949.66 ms │     no change │
│ QQuery 19    │   123.99 ms │                        125.54 ms │     no change │
│ QQuery 20    │  1210.36 ms │                       1188.25 ms │     no change │
│ QQuery 21    │  1411.17 ms │                       1387.61 ms │     no change │
│ QQuery 22    │  2417.57 ms │                       2259.31 ms │ +1.07x faster │
│ QQuery 23    │  7986.84 ms │                       1677.49 ms │ +4.76x faster │
│ QQuery 24    │   499.72 ms │                        600.69 ms │  1.20x slower │
│ QQuery 25    │   423.83 ms │                        398.88 ms │ +1.06x faster │
│ QQuery 26    │   563.47 ms │                        581.21 ms │     no change │
│ QQuery 27    │  1700.84 ms │                       1848.75 ms │  1.09x slower │
│ QQuery 28    │ 13348.87 ms │                      12743.91 ms │     no change │
│ QQuery 29    │   558.46 ms │                        567.18 ms │     no change │
│ QQuery 30    │   811.01 ms │                       1147.45 ms │  1.41x slower │
│ QQuery 31    │   864.91 ms │                       1135.93 ms │  1.31x slower │
│ QQuery 32    │  2577.12 ms │                       2589.12 ms │     no change │
│ QQuery 33    │  3328.06 ms │                       3322.24 ms │     no change │
│ QQuery 34    │  3366.29 ms │                       3314.41 ms │     no change │
│ QQuery 35    │  1310.58 ms │                       1298.05 ms │     no change │
│ QQuery 36    │   158.24 ms │                         70.52 ms │ +2.24x faster │
│ QQuery 37    │   104.92 ms │                         64.29 ms │ +1.63x faster │
│ QQuery 38    │   176.75 ms │                         71.41 ms │ +2.48x faster │
│ QQuery 39    │   257.91 ms │                         69.28 ms │ +3.72x faster │
│ QQuery 40    │    72.88 ms │                         71.59 ms │     no change │
│ QQuery 41    │    84.33 ms │                         67.18 ms │ +1.26x faster │
│ QQuery 42    │    78.78 ms │                         67.68 ms │ +1.16x faster │
└──────────────┴─────────────┴──────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                               ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                               │ 58072.30ms │
│ Total Time (alamb_new_parquet_result_caching)   │ 51585.43ms │
│ Average Time (HEAD)                             │  1350.52ms │
│ Average Time (alamb_new_parquet_result_caching) │  1199.66ms │
│ Queries Faster                                  │          9 │
│ Queries Slower                                  │          7 │
│ Queries with No Change                          │         27 │
│ Queries with Failure                            │          0 │
└─────────────────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_new_parquet_result_caching ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.61 ms │                          2.29 ms │ +1.14x faster │
│ QQuery 1     │    33.98 ms │                         36.41 ms │  1.07x slower │
│ QQuery 2     │    82.02 ms │                         80.00 ms │     no change │
│ QQuery 3     │   100.63 ms │                         98.64 ms │     no change │
│ QQuery 4     │   619.61 ms │                        591.12 ms │     no change │
│ QQuery 5     │   871.49 ms │                        856.27 ms │     no change │
│ QQuery 6     │     2.28 ms │                          2.34 ms │     no change │
│ QQuery 7     │    39.62 ms │                         43.37 ms │  1.09x slower │
│ QQuery 8     │   842.00 ms │                        854.49 ms │     no change │
│ QQuery 9     │  1161.82 ms │                       1168.36 ms │     no change │
│ QQuery 10    │   251.52 ms │                        268.40 ms │  1.07x slower │
│ QQuery 11    │   282.40 ms │                        304.87 ms │  1.08x slower │
│ QQuery 12    │   872.65 ms │                        947.66 ms │  1.09x slower │
│ QQuery 13    │  1279.12 ms │                       1436.53 ms │  1.12x slower │
│ QQuery 14    │   815.67 ms │                       1031.11 ms │  1.26x slower │
│ QQuery 15    │   763.15 ms │                        771.70 ms │     no change │
│ QQuery 16    │  1618.60 ms │                       1611.13 ms │     no change │
│ QQuery 17    │  1630.96 ms │                       1580.43 ms │     no change │
│ QQuery 18    │  3137.91 ms │                       2896.69 ms │ +1.08x faster │
│ QQuery 19    │    83.11 ms │                         85.32 ms │     no change │
│ QQuery 20    │  1168.52 ms │                       1156.76 ms │     no change │
│ QQuery 21    │  1314.27 ms │                       1353.01 ms │     no change │
│ QQuery 22    │  2177.44 ms │                       2244.54 ms │     no change │
│ QQuery 23    │  7456.50 ms │                       1394.31 ms │ +5.35x faster │
│ QQuery 24    │   472.29 ms │                        301.40 ms │ +1.57x faster │
│ QQuery 25    │   394.58 ms │                        380.81 ms │     no change │
│ QQuery 26    │   525.61 ms │                        304.73 ms │ +1.72x faster │
│ QQuery 27    │  1548.88 ms │                       1727.53 ms │  1.12x slower │
│ QQuery 28    │ 13165.50 ms │                      12088.23 ms │ +1.09x faster │
│ QQuery 29    │   536.70 ms │                        538.18 ms │     no change │
│ QQuery 30    │   782.98 ms │                       1246.94 ms │  1.59x slower │
│ QQuery 31    │   818.65 ms │                       1263.46 ms │  1.54x slower │
│ QQuery 32    │  2480.30 ms │                       2514.77 ms │     no change │
│ QQuery 33    │  3295.14 ms │                       3254.71 ms │     no change │
│ QQuery 34    │  3326.28 ms │                       3222.02 ms │     no change │
│ QQuery 35    │  1266.05 ms │                       1260.52 ms │     no change │
│ QQuery 36    │   125.30 ms │                         27.79 ms │ +4.51x faster │
│ QQuery 37    │    50.91 ms │                         26.71 ms │ +1.91x faster │
│ QQuery 38    │   121.79 ms │                         26.63 ms │ +4.57x faster │
│ QQuery 39    │   198.94 ms │                         26.88 ms │ +7.40x faster │
│ QQuery 40    │    40.50 ms │                         27.02 ms │ +1.50x faster │
│ QQuery 41    │    39.98 ms │                         26.24 ms │ +1.52x faster │
│ QQuery 42    │    32.59 ms │                         26.67 ms │ +1.22x faster │
└──────────────┴─────────────┴──────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                               ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                               │ 55830.85ms │
│ Total Time (alamb_new_parquet_result_caching)   │ 49106.97ms │
│ Average Time (HEAD)                             │  1298.39ms │
│ Average Time (alamb_new_parquet_result_caching) │  1142.02ms │
│ Queries Faster                                  │         13 │
│ Queries Slower                                  │         10 │
│ Queries with No Change                          │         20 │
│ Queries with Failure                            │          0 │
└─────────────────────────────────────────────────┴────────────┘

alamb · 2025-06-26T16:38:02Z

🤔 the results are not that far away.

Queries to review


│ QQuery 7     │    39.62 ms │                         43.37 ms │  1.09x slower │
│ QQuery 10    │   251.52 ms │                        268.40 ms │  1.07x slower │
│ QQuery 11    │   282.40 ms │                        304.87 ms │  1.08x slower │
│ QQuery 12    │   872.65 ms │                        947.66 ms │  1.09x slower │
│ QQuery 13    │  1279.12 ms │                       1436.53 ms │  1.12x slower │
│ QQuery 14    │   815.67 ms │                       1031.11 ms │  1.26x slower │
│ QQuery 27    │  1548.88 ms │                       1727.53 ms │  1.12x slower │
│ QQuery 30    │   782.98 ms │                       1246.94 ms │  1.59x slower │
│ QQuery 31    │   818.65 ms │                       1263.46 ms │  1.54x slower │

alamb · 2025-06-26T19:56:36Z

Analysis of Q30

I took a quick look at q30 (which is now so much easier after @pepijnve broke the queries into their own files)

SELECT "SearchEngineID", "ClientIP", COUNT(*) AS c, SUM("IsRefresh"), AVG("ResolutionWidth") FROM hits WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "ClientIP" ORDER BY c DESC LIMIT 10;

Here is the comparison branch

./datafusion-cli-alamb_arrow_55.2.0_upgrade_real -f q30-times-10.sql  | grep Elapsed
Elapsed 0.354 seconds.
Elapsed 0.324 seconds.
Elapsed 0.325 seconds.
Elapsed 0.321 seconds.
Elapsed 0.313 seconds.
Elapsed 0.317 seconds.
Elapsed 0.318 seconds.
Elapsed 0.312 seconds.
Elapsed 0.319 seconds.
Elapsed 0.322 seconds.
Elapsed 0.313 seconds.

Here is this branch (definitely slower)

andrewlamb@Andrews-MacBook-Pro-2:~/Downloads$ ./datafusion-cli-alamb_new_parquet_result_caching -f q30-times-10.sql  | grep Elapsed
Elapsed 0.501 seconds.
Elapsed 0.488 seconds.
Elapsed 0.510 seconds.
Elapsed 0.527 seconds.
Elapsed 0.477 seconds.
Elapsed 0.480 seconds.
Elapsed 0.476 seconds.
Elapsed 0.484 seconds.
Elapsed 0.492 seconds.
Elapsed 0.489 seconds.
Elapsed 0.491 seconds.

Here is the plan

> explain SELECT "SearchEngineID", "ClientIP", COUNT(*) AS c, SUM("IsRefresh"), AVG("ResolutionWidth") FROM hits WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "ClientIP" ORDER BY c DESC LIMIT 10;
+---------------+-------------------------------+
| plan_type     | plan                          |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
|               | │  SortPreservingMergeExec  │ |
|               | │    --------------------   │ |
|               | │      c DESClimit: 10      │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       SortExec(TopK)      │ |
|               | │    --------------------   │ |
|               | │          c@2 DESC         │ |
|               | │                           │ |
|               | │         limit: 10         │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       ProjectionExec      │ |
|               | │    --------------------   │ |
|               | │     ClientIP: ClientIP    │ |
|               | │                           │ |
|               | │      SearchEngineID:      │ |
|               | │       SearchEngineID      │ |
|               | │                           │ |
|               | │ avg(hits.ResolutionWidth):│ |
|               | │ avg(hits.ResolutionWidth) │ |
|               | │                           │ |
|               | │     c: count(Int64(1))    │ |
|               | │                           │ |
|               | │    sum(hits.IsRefresh):   │ |
|               | │    sum(hits.IsRefresh)    │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       AggregateExec       │ |
|               | │    --------------------   │ |
|               | │           aggr:           │ |
|               | │     count(1), sum(hits    │ |
|               | │      .IsRefresh), avg     │ |
|               | │   (hits.ResolutionWidth)  │ |
|               | │                           │ |
|               | │         group_by:         │ |
|               | │  SearchEngineID, ClientIP │ |
|               | │                           │ |
|               | │           mode:           │ |
|               | │      FinalPartitioned     │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │    CoalesceBatchesExec    │ |
|               | │    --------------------   │ |
|               | │     target_batch_size:    │ |
|               | │            8192           │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │      RepartitionExec      │ |
|               | │    --------------------   │ |
|               | │ partition_count(in->out): │ |
|               | │          16 -> 16         │ |
|               | │                           │ |
|               | │    partitioning_scheme:   │ |
|               | │  Hash([SearchEngineID@0,  │ |
|               | │      ClientIP@1], 16)     │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       AggregateExec       │ |
|               | │    --------------------   │ |
|               | │           aggr:           │ |
|               | │     count(1), sum(hits    │ |
|               | │      .IsRefresh), avg     │ |
|               | │   (hits.ResolutionWidth)  │ |
|               | │                           │ |
|               | │         group_by:         │ |
|               | │  SearchEngineID, ClientIP │ |
|               | │                           │ |
|               | │       mode: Partial       │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       DataSourceExec      │ |
|               | │    --------------------   │ |
|               | │         files: 115        │ |
|               | │      format: parquet      │ |
|               | │                           │ |
|               | │         predicate:        │ |
|               | │      SearchPhrase !=      │ |
|               | └───────────────────────────┘ |
|               |                               |
+---------------+-------------------------------+

I poked around using Instruments and I think the breakdown is:

10% doing the final aggregate and TopK result (result down to the RepartitionExec)
90% doing the RepartitionExec and below

Of the 90% left,

10% went to the aggregate
70% went to the scan

Interestingly in this case SearchPhrase does not show up in the selection/projection so the result of SearchPhrase <> '' can't be cached.

So my conclusion here is that the overhead of skip/scanning in the decoder takes longer than decoding the entire column and then applying a filter.

My next plan will be

Verify that the overhead of skipping is actually the cause
Look into why CachedReader is showing up in the first place 🤔

alamb · 2025-06-27T18:51:23Z

Verify that the overhead of skipping is actually the cause

To start with this analysis, I have run clickbench 30 and saved the results of evaluating filters (boolean arrays). There are 325 parquet files corresponding to the 325 row groups in the 100 clickbench files
filters_clickbench_q30.zip

alamb added 3 commits June 26, 2025 05:56

Pin to pre-release arrow version

ed3e6d7

Update plans

c57a626

Update pin to pre-release

f57b002

github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Jun 26, 2025

alamb added 6 commits June 26, 2025 05:58

Change default pushdown_filters setting to true

7f78722

update pushdown_filters

3767661

Update explain_tree.slt

a227663

Update parquet.slt

2487af0

Update parquet_statistics.slt

9375f51

Update repartition_scan.slt

b44ba97

alamb force-pushed the alamb/new_parquet_result_caching branch from 35d9940 to b44ba97 Compare June 26, 2025 10:05

github-actions bot added documentation Improvements or additions to documentation common Related to common crate labels Jun 26, 2025

alamb changed the title ~~Test DataFusion with experimental Parquet Filter Pushdown (try 2)~~ POC: Test DataFusion with experimental Parquet Filter Pushdown (try 2) Jun 26, 2025

alamb mentioned this pull request Jun 26, 2025

WIP: Test DataFusion with experimental Parquet Filter Pushdown #16222

Closed

alamb added 2 commits June 26, 2025 06:13

Update tests

055cfba

update reorder_filters

1ead6af

github-actions bot added the core Core DataFusion crate label Jun 26, 2025

alamb added 2 commits June 26, 2025 06:46

Updates

866c598

fix

0d4f506

alamb mentioned this pull request Jun 26, 2025

restore topk pre-filtering of batches and make sort query fuzzer less sensitive to expected non determinism #16501

Merged

alamb mentioned this pull request Jun 27, 2025

[EPIC] Faster performance for parquet predicate evaluation for non selective filters apache/arrow-rs#7456

Open

8 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

POC: Test DataFusion with experimental Parquet Filter Pushdown (try 2) #16562

POC: Test DataFusion with experimental Parquet Filter Pushdown (try 2) #16562

Uh oh!

alamb commented Jun 26, 2025 •

edited

Loading

Uh oh!

alamb commented Jun 26, 2025

Uh oh!

alamb commented Jun 26, 2025

Uh oh!

alamb commented Jun 26, 2025

Uh oh!

alamb commented Jun 26, 2025

Uh oh!

alamb commented Jun 27, 2025

Uh oh!

Uh oh!

POC: Test DataFusion with experimental Parquet Filter Pushdown (try 2) #16562

Are you sure you want to change the base?

POC: Test DataFusion with experimental Parquet Filter Pushdown (try 2) #16562

Uh oh!

Conversation

alamb commented Jun 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Test Plan

Profiling Anaylsis

Uh oh!

alamb commented Jun 26, 2025

Uh oh!

alamb commented Jun 26, 2025

Uh oh!

alamb commented Jun 26, 2025

Uh oh!

alamb commented Jun 26, 2025

Analysis of Q30

Uh oh!

alamb commented Jun 27, 2025

Uh oh!

Uh oh!

alamb commented Jun 26, 2025 •

edited

Loading