Skip to content

POC: Test DataFusion with experimental Parquet Filter Pushdown (try 2) #16562

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Jun 26, 2025

Which issue does this PR close?

Rationale for this change

I am doing end to end testing of new parquet pushdown techniques

My plan is to use this PR and analysis to guide additional work needed to get filter pushdown on by default

What changes are included in this PR?

  1. Pin to use pushdown in this branch: POC: Parquet predicate results cache arrow-rs#7760
  2. Force filter pushdown to true

Test Plan

  • Change the default filter_pushdown and reorder_filters to true
  • Run clickbench benchmarks (see where we are)
  • Profile those queries where there is a slowdown

Profiling Anaylsis

(in progress)

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Jun 26, 2025
@alamb alamb force-pushed the alamb/new_parquet_result_caching branch from 35d9940 to b44ba97 Compare June 26, 2025 10:05
@github-actions github-actions bot added documentation Improvements or additions to documentation common Related to common crate labels Jun 26, 2025
@alamb alamb changed the title Test DataFusion with experimental Parquet Filter Pushdown (try 2) POC: Test DataFusion with experimental Parquet Filter Pushdown (try 2) Jun 26, 2025
@github-actions github-actions bot added the core Core DataFusion crate label Jun 26, 2025
@alamb
Copy link
Contributor Author

alamb commented Jun 26, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.11.0-1015-gcp #15~24.04.1-Ubuntu SMP Thu Apr 24 20:41:05 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/new_parquet_result_caching (0d4f506) to db13dd9 diff
Benchmarks: clickbench_partitioned clickbench_1
Results will be posted here when complete

@alamb
Copy link
Contributor Author

alamb commented Jun 26, 2025

🤖: Benchmark completed

Details

Comparing HEAD and alamb_new_parquet_result_caching
--------------------
Benchmark clickbench_1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_new_parquet_result_caching ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     0.59 ms │                          0.57 ms │     no change │
│ QQuery 1     │    74.83 ms │                         76.78 ms │     no change │
│ QQuery 2     │   108.75 ms │                        105.56 ms │     no change │
│ QQuery 3     │   130.57 ms │                        131.14 ms │     no change │
│ QQuery 4     │   634.26 ms │                        628.89 ms │     no change │
│ QQuery 5     │   881.36 ms │                        842.59 ms │     no change │
│ QQuery 6     │     0.65 ms │                          0.65 ms │     no change │
│ QQuery 7     │    82.37 ms │                         87.41 ms │  1.06x slower │
│ QQuery 8     │   890.73 ms │                        873.34 ms │     no change │
│ QQuery 9     │  1203.12 ms │                       1189.95 ms │     no change │
│ QQuery 10    │   289.41 ms │                        294.86 ms │     no change │
│ QQuery 11    │   325.92 ms │                        330.25 ms │     no change │
│ QQuery 12    │   882.85 ms │                        892.12 ms │     no change │
│ QQuery 13    │  1269.86 ms │                       1452.48 ms │  1.14x slower │
│ QQuery 14    │   823.95 ms │                        989.67 ms │  1.20x slower │
│ QQuery 15    │   815.75 ms │                        823.76 ms │     no change │
│ QQuery 16    │  1639.70 ms │                       1643.62 ms │     no change │
│ QQuery 17    │  1621.55 ms │                       1604.17 ms │     no change │
│ QQuery 18    │  2959.17 ms │                       2949.66 ms │     no change │
│ QQuery 19    │   123.99 ms │                        125.54 ms │     no change │
│ QQuery 20    │  1210.36 ms │                       1188.25 ms │     no change │
│ QQuery 21    │  1411.17 ms │                       1387.61 ms │     no change │
│ QQuery 22    │  2417.57 ms │                       2259.31 ms │ +1.07x faster │
│ QQuery 23    │  7986.84 ms │                       1677.49 ms │ +4.76x faster │
│ QQuery 24    │   499.72 ms │                        600.69 ms │  1.20x slower │
│ QQuery 25    │   423.83 ms │                        398.88 ms │ +1.06x faster │
│ QQuery 26    │   563.47 ms │                        581.21 ms │     no change │
│ QQuery 27    │  1700.84 ms │                       1848.75 ms │  1.09x slower │
│ QQuery 28    │ 13348.87 ms │                      12743.91 ms │     no change │
│ QQuery 29    │   558.46 ms │                        567.18 ms │     no change │
│ QQuery 30    │   811.01 ms │                       1147.45 ms │  1.41x slower │
│ QQuery 31    │   864.91 ms │                       1135.93 ms │  1.31x slower │
│ QQuery 32    │  2577.12 ms │                       2589.12 ms │     no change │
│ QQuery 33    │  3328.06 ms │                       3322.24 ms │     no change │
│ QQuery 34    │  3366.29 ms │                       3314.41 ms │     no change │
│ QQuery 35    │  1310.58 ms │                       1298.05 ms │     no change │
│ QQuery 36    │   158.24 ms │                         70.52 ms │ +2.24x faster │
│ QQuery 37    │   104.92 ms │                         64.29 ms │ +1.63x faster │
│ QQuery 38    │   176.75 ms │                         71.41 ms │ +2.48x faster │
│ QQuery 39    │   257.91 ms │                         69.28 ms │ +3.72x faster │
│ QQuery 40    │    72.88 ms │                         71.59 ms │     no change │
│ QQuery 41    │    84.33 ms │                         67.18 ms │ +1.26x faster │
│ QQuery 42    │    78.78 ms │                         67.68 ms │ +1.16x faster │
└──────────────┴─────────────┴──────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                               ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                               │ 58072.30ms │
│ Total Time (alamb_new_parquet_result_caching)   │ 51585.43ms │
│ Average Time (HEAD)                             │  1350.52ms │
│ Average Time (alamb_new_parquet_result_caching) │  1199.66ms │
│ Queries Faster                                  │          9 │
│ Queries Slower                                  │          7 │
│ Queries with No Change                          │         27 │
│ Queries with Failure                            │          0 │
└─────────────────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_new_parquet_result_caching ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.61 ms │                          2.29 ms │ +1.14x faster │
│ QQuery 1     │    33.98 ms │                         36.41 ms │  1.07x slower │
│ QQuery 2     │    82.02 ms │                         80.00 ms │     no change │
│ QQuery 3     │   100.63 ms │                         98.64 ms │     no change │
│ QQuery 4     │   619.61 ms │                        591.12 ms │     no change │
│ QQuery 5     │   871.49 ms │                        856.27 ms │     no change │
│ QQuery 6     │     2.28 ms │                          2.34 ms │     no change │
│ QQuery 7     │    39.62 ms │                         43.37 ms │  1.09x slower │
│ QQuery 8     │   842.00 ms │                        854.49 ms │     no change │
│ QQuery 9     │  1161.82 ms │                       1168.36 ms │     no change │
│ QQuery 10    │   251.52 ms │                        268.40 ms │  1.07x slower │
│ QQuery 11    │   282.40 ms │                        304.87 ms │  1.08x slower │
│ QQuery 12    │   872.65 ms │                        947.66 ms │  1.09x slower │
│ QQuery 13    │  1279.12 ms │                       1436.53 ms │  1.12x slower │
│ QQuery 14    │   815.67 ms │                       1031.11 ms │  1.26x slower │
│ QQuery 15    │   763.15 ms │                        771.70 ms │     no change │
│ QQuery 16    │  1618.60 ms │                       1611.13 ms │     no change │
│ QQuery 17    │  1630.96 ms │                       1580.43 ms │     no change │
│ QQuery 18    │  3137.91 ms │                       2896.69 ms │ +1.08x faster │
│ QQuery 19    │    83.11 ms │                         85.32 ms │     no change │
│ QQuery 20    │  1168.52 ms │                       1156.76 ms │     no change │
│ QQuery 21    │  1314.27 ms │                       1353.01 ms │     no change │
│ QQuery 22    │  2177.44 ms │                       2244.54 ms │     no change │
│ QQuery 23    │  7456.50 ms │                       1394.31 ms │ +5.35x faster │
│ QQuery 24    │   472.29 ms │                        301.40 ms │ +1.57x faster │
│ QQuery 25    │   394.58 ms │                        380.81 ms │     no change │
│ QQuery 26    │   525.61 ms │                        304.73 ms │ +1.72x faster │
│ QQuery 27    │  1548.88 ms │                       1727.53 ms │  1.12x slower │
│ QQuery 28    │ 13165.50 ms │                      12088.23 ms │ +1.09x faster │
│ QQuery 29    │   536.70 ms │                        538.18 ms │     no change │
│ QQuery 30    │   782.98 ms │                       1246.94 ms │  1.59x slower │
│ QQuery 31    │   818.65 ms │                       1263.46 ms │  1.54x slower │
│ QQuery 32    │  2480.30 ms │                       2514.77 ms │     no change │
│ QQuery 33    │  3295.14 ms │                       3254.71 ms │     no change │
│ QQuery 34    │  3326.28 ms │                       3222.02 ms │     no change │
│ QQuery 35    │  1266.05 ms │                       1260.52 ms │     no change │
│ QQuery 36    │   125.30 ms │                         27.79 ms │ +4.51x faster │
│ QQuery 37    │    50.91 ms │                         26.71 ms │ +1.91x faster │
│ QQuery 38    │   121.79 ms │                         26.63 ms │ +4.57x faster │
│ QQuery 39    │   198.94 ms │                         26.88 ms │ +7.40x faster │
│ QQuery 40    │    40.50 ms │                         27.02 ms │ +1.50x faster │
│ QQuery 41    │    39.98 ms │                         26.24 ms │ +1.52x faster │
│ QQuery 42    │    32.59 ms │                         26.67 ms │ +1.22x faster │
└──────────────┴─────────────┴──────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                               ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                               │ 55830.85ms │
│ Total Time (alamb_new_parquet_result_caching)   │ 49106.97ms │
│ Average Time (HEAD)                             │  1298.39ms │
│ Average Time (alamb_new_parquet_result_caching) │  1142.02ms │
│ Queries Faster                                  │         13 │
│ Queries Slower                                  │         10 │
│ Queries with No Change                          │         20 │
│ Queries with Failure                            │          0 │
└─────────────────────────────────────────────────┴────────────┘

@alamb
Copy link
Contributor Author

alamb commented Jun 26, 2025

🤔 the results are not that far away.

Queries to review


│ QQuery 7     │    39.62 ms │                         43.37 ms │  1.09x slower │
│ QQuery 10    │   251.52 ms │                        268.40 ms │  1.07x slower │
│ QQuery 11    │   282.40 ms │                        304.87 ms │  1.08x slower │
│ QQuery 12    │   872.65 ms │                        947.66 ms │  1.09x slower │
│ QQuery 13    │  1279.12 ms │                       1436.53 ms │  1.12x slower │
│ QQuery 14    │   815.67 ms │                       1031.11 ms │  1.26x slower │
│ QQuery 27    │  1548.88 ms │                       1727.53 ms │  1.12x slower │
│ QQuery 30    │   782.98 ms │                       1246.94 ms │  1.59x slower │
│ QQuery 31    │   818.65 ms │                       1263.46 ms │  1.54x slower │

@alamb
Copy link
Contributor Author

alamb commented Jun 26, 2025

Analysis of Q30

I took a quick look at q30 (which is now so much easier after @pepijnve broke the queries into their own files)

SELECT "SearchEngineID", "ClientIP", COUNT(*) AS c, SUM("IsRefresh"), AVG("ResolutionWidth") FROM hits WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "ClientIP" ORDER BY c DESC LIMIT 10;

Here is the comparison branch

./datafusion-cli-alamb_arrow_55.2.0_upgrade_real -f q30-times-10.sql  | grep Elapsed
Elapsed 0.354 seconds.
Elapsed 0.324 seconds.
Elapsed 0.325 seconds.
Elapsed 0.321 seconds.
Elapsed 0.313 seconds.
Elapsed 0.317 seconds.
Elapsed 0.318 seconds.
Elapsed 0.312 seconds.
Elapsed 0.319 seconds.
Elapsed 0.322 seconds.
Elapsed 0.313 seconds.

Here is this branch (definitely slower)

andrewlamb@Andrews-MacBook-Pro-2:~/Downloads$ ./datafusion-cli-alamb_new_parquet_result_caching -f q30-times-10.sql  | grep Elapsed
Elapsed 0.501 seconds.
Elapsed 0.488 seconds.
Elapsed 0.510 seconds.
Elapsed 0.527 seconds.
Elapsed 0.477 seconds.
Elapsed 0.480 seconds.
Elapsed 0.476 seconds.
Elapsed 0.484 seconds.
Elapsed 0.492 seconds.
Elapsed 0.489 seconds.
Elapsed 0.491 seconds.

Here is the plan

> explain SELECT "SearchEngineID", "ClientIP", COUNT(*) AS c, SUM("IsRefresh"), AVG("ResolutionWidth") FROM hits WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "ClientIP" ORDER BY c DESC LIMIT 10;
+---------------+-------------------------------+
| plan_type     | plan                          |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
|               | │  SortPreservingMergeExec  │ |
|               | │    --------------------   │ |
|               | │      c DESClimit: 10      │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       SortExec(TopK)      │ |
|               | │    --------------------   │ |
|               | │          c@2 DESC         │ |
|               | │                           │ |
|               | │         limit: 10         │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       ProjectionExec      │ |
|               | │    --------------------   │ |
|               | │     ClientIP: ClientIP    │ |
|               | │                           │ |
|               | │      SearchEngineID:      │ |
|               | │       SearchEngineID      │ |
|               | │                           │ |
|               | │ avg(hits.ResolutionWidth):│ |
|               | │ avg(hits.ResolutionWidth) │ |
|               | │                           │ |
|               | │     c: count(Int64(1))    │ |
|               | │                           │ |
|               | │    sum(hits.IsRefresh):   │ |
|               | │    sum(hits.IsRefresh)    │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       AggregateExec       │ |
|               | │    --------------------   │ |
|               | │           aggr:           │ |
|               | │     count(1), sum(hits    │ |
|               | │      .IsRefresh), avg     │ |
|               | │   (hits.ResolutionWidth)  │ |
|               | │                           │ |
|               | │         group_by:         │ |
|               | │  SearchEngineID, ClientIP │ |
|               | │                           │ |
|               | │           mode:           │ |
|               | │      FinalPartitioned     │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │    CoalesceBatchesExec    │ |
|               | │    --------------------   │ |
|               | │     target_batch_size:    │ |
|               | │            8192           │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │      RepartitionExec      │ |
|               | │    --------------------   │ |
|               | │ partition_count(in->out): │ |
|               | │          16 -> 16         │ |
|               | │                           │ |
|               | │    partitioning_scheme:   │ |
|               | │  Hash([SearchEngineID@0,  │ |
|               | │      ClientIP@1], 16)     │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       AggregateExec       │ |
|               | │    --------------------   │ |
|               | │           aggr:           │ |
|               | │     count(1), sum(hits    │ |
|               | │      .IsRefresh), avg     │ |
|               | │   (hits.ResolutionWidth)  │ |
|               | │                           │ |
|               | │         group_by:         │ |
|               | │  SearchEngineID, ClientIP │ |
|               | │                           │ |
|               | │       mode: Partial       │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       DataSourceExec      │ |
|               | │    --------------------   │ |
|               | │         files: 115        │ |
|               | │      format: parquet      │ |
|               | │                           │ |
|               | │         predicate:        │ |
|               | │      SearchPhrase !=      │ |
|               | └───────────────────────────┘ |
|               |                               |
+---------------+-------------------------------+

I poked around using Instruments and I think the breakdown is:

  • 10% doing the final aggregate and TopK result (result down to the RepartitionExec)
  • 90% doing the RepartitionExec and below

Of the 90% left,

  • 10% went to the aggregate
  • 70% went to the scan

Interestingly in this case SearchPhrase does not show up in the selection/projection so the result of SearchPhrase <> '' can't be cached.

Screenshot 2025-06-26 at 3 47 54 PM

So my conclusion here is that the overhead of skip/scanning in the decoder takes longer than decoding the entire column and then applying a filter.

My next plan will be

  1. Verify that the overhead of skipping is actually the cause
  2. Look into why CachedReader is showing up in the first place 🤔

@alamb
Copy link
Contributor Author

alamb commented Jun 27, 2025

Verify that the overhead of skipping is actually the cause

To start with this analysis, I have run clickbench 30 and saved the results of evaluating filters (boolean arrays). There are 325 parquet files corresponding to the 325 row groups in the 100 clickbench files
filters_clickbench_q30.zip

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
common Related to common crate core Core DataFusion crate documentation Improvements or additions to documentation sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant