Apply binary search filter expressions directly on the block metadata of `Index Scan`s #1619

realHannes · 2024-11-15T10:38:48Z

With this PR, filter expressions that can be evaluated via binary search on a sorted input are evaluated directly on the block metadata of an `IndexScan`. For example, in a query that contains `{ ?s ?p ?o FILTER (?o > 3) }`, only those blocks of the full index scan (sorted by the object) are read from disk that, according to their metadata, might contain values `> 3`.

Currently this mechanism has the following limitations:

1. It can only be applied if the `IndexScan` is the direct child of the `FILTER` clause.
2. It can only be applied to logical expressions (AND/OR/NOT) and to relational expressions (greater than, equal to, etc.) between a variable and a constant. Currently, the constant cannot yet be an IRI or a literal.
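The idea of prefiltering on block metadata can be illustrated with a minimal sketch. It is not QLever's actual implementation: `BlockMetadata` and `prefilterGreaterThan` are hypothetical names, values are plain integers instead of QLever's internal IDs, and the blocks are assumed to be sorted by the filtered column. A binary search over the largest value of each block finds the first block that might contain a value `> bound`; all earlier blocks never have to be read.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified metadata of one index block: the smallest and
// largest value of the sort column that is stored in the block.
struct BlockMetadata {
  std::size_t blockIndex;
  std::int64_t firstValue;
  std::int64_t lastValue;
};

// Return the metadata of all blocks that might contain a value `> bound`.
// Assumption: `blocks` is sorted by the filtered column.
std::vector<BlockMetadata> prefilterGreaterThan(
    const std::vector<BlockMetadata>& blocks, std::int64_t bound) {
  assert(std::is_sorted(blocks.begin(), blocks.end(),
                        [](const BlockMetadata& a, const BlockMetadata& b) {
                          return a.lastValue < b.lastValue;
                        }));
  // Binary search for the first block whose largest value exceeds `bound`.
  // All blocks before it contain only values `<= bound` and can be skipped.
  auto firstRelevant = std::partition_point(
      blocks.begin(), blocks.end(),
      [bound](const BlockMetadata& block) { return block.lastValue <= bound; });
  return std::vector<BlockMetadata>(firstRelevant, blocks.end());
}
```

The first returned block may still contain values `<= bound`, so in this sketch the regular `FILTER` still has to run on the blocks that are read; the prefilter only reduces how many blocks are read from disk.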


joka921

This is very nice and was quite a lot of work.
What is mostly missing are tests that the prefilter is actually applied in the `IndexScan` etc.
A good way (also for manual debugging) is to add the applied prefilters to the `runtimeInfo()` as a detail; then they appear in the UI, and this can also be used for testing.

src/engine/IndexScan.h

src/engine/IndexScan.cpp

src/engine/IndexScan.h

joka921

Some very small nitpicks remain.
We still have to analyze why the query planner takes so long.

src/engine/Filter.h

src/engine/IndexScan.cpp

src/engine/QueryExecutionTree.cpp

src/engine/IndexScan.cpp

joka921

There was a tiny misunderstanding, but this is almost ready to merge.

src/engine/QueryExecutionTree.cpp

src/engine/QueryExecutionTree.h

sparql-conformance · 2024-11-30T18:37:56Z

Conformance check passed ✅

No test result changes.

Details: https://qlever.cs.uni-freiburg.de/sparql-conformance-ui?cur=d5ad1ba8d1d3308f1a4f1c508686af1e12edc86b&prev=82ccc5157f68c83532005c309802e54fcc5e1244

sonarqubecloud · 2024-11-30T19:52:52Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

joka921

Thank you very much.
This is a great milestone for QLever!


With this PR, the prefilter expressions implemented in #1619 also apply to literals and IRIs. For example the following query only extracts the relevant, prefiltered blocks from the `IndexScan`:

```
SELECT * { ?s ?p ?o FILTER (?o >= "hallo" && ?o <= "hello") }
```
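A range filter such as `?o >= "hallo" && ?o <= "hello"` can be pictured as the intersection of two block ranges, each found by binary search. The following is only a sketch under simplifying assumptions: `StringBlock`, `BlockRange`, and the three functions are hypothetical names, and strings are compared byte-wise with `std::string::operator<`, which is not necessarily the ordering that QLever uses for literals and IRIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical metadata: smallest and largest string stored in a block.
struct StringBlock {
  std::string first;
  std::string last;
};

// Half-open range [begin, end) of block indices that remain relevant.
using BlockRange = std::pair<std::size_t, std::size_t>;

// Blocks that might contain a value `>= bound`.
BlockRange blocksGreaterEqual(const std::vector<StringBlock>& blocks,
                              const std::string& bound) {
  // Skip all leading blocks whose largest value is still smaller than `bound`.
  auto it = std::partition_point(
      blocks.begin(), blocks.end(),
      [&bound](const StringBlock& b) { return b.last < bound; });
  return {static_cast<std::size_t>(it - blocks.begin()), blocks.size()};
}

// Blocks that might contain a value `<= bound`.
BlockRange blocksLessEqual(const std::vector<StringBlock>& blocks,
                           const std::string& bound) {
  // Keep all leading blocks whose smallest value does not exceed `bound`.
  auto it = std::partition_point(
      blocks.begin(), blocks.end(),
      [&bound](const StringBlock& b) { return b.first <= bound; });
  return {std::size_t{0}, static_cast<std::size_t>(it - blocks.begin())};
}

// Logical AND of two prefilters: the intersection of their block ranges.
BlockRange intersect(BlockRange a, BlockRange b) {
  std::size_t begin = std::max(a.first, b.first);
  std::size_t end = std::min(a.second, b.second);
  // An empty intersection is normalized to begin == end.
  return {begin, std::max(begin, end)};
}
```

Because the blocks are sorted, each relational condition yields one contiguous range, so an AND of two such conditions reduces to taking the larger start and the smaller end.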

Since #1619, the size estimate for an index scan always involved one or several copies of the block metadata, which incurred a significant query planning cost for most queries. Now, such a copy is only made for an index scan followed by a `FILTER`, and only the metadata of the blocks that remain after the `FILTER` is copied (in which case the two operations are expensive anyway).
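One way to picture this "copy only when a prefilter is applied" behavior is the following sketch. It is an illustration under assumptions, not the code of the PR: `Block` and `ScanSketch` are hypothetical, the full metadata is referenced through a non-owning pointer that the caller must keep alive, and the size estimate is simply the sum of the row counts of the relevant blocks.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical block metadata: number of rows and largest value in the block.
struct Block {
  std::size_t numRows;
  long lastValue;
};

class ScanSketch {
 public:
  // The full metadata is only referenced, never copied. The caller has to
  // keep `allBlocks` alive for as long as this object is used.
  explicit ScanSketch(const std::vector<Block>& allBlocks)
      : allBlocks_(&allBlocks) {}

  // Apply a `> bound` prefilter. Only the remaining blocks are copied.
  void applyGreaterThan(long bound) {
    const std::vector<Block>& current = blocks();
    auto firstRelevant = std::partition_point(
        current.begin(), current.end(),
        [bound](const Block& b) { return b.lastValue <= bound; });
    // The temporary vector is fully built before it replaces the old state.
    prefiltered_ = std::vector<Block>(firstRelevant, current.end());
  }

  // The blocks that are currently relevant for this scan.
  const std::vector<Block>& blocks() const {
    return prefiltered_.has_value() ? *prefiltered_ : *allBlocks_;
  }

  // Size estimate computed without copying any metadata.
  std::size_t sizeEstimate() const {
    std::size_t sum = 0;
    for (const Block& b : blocks()) sum += b.numRows;
    return sum;
  }

  // True only after a prefilter has been applied.
  bool ownsCopy() const { return prefiltered_.has_value(); }

 private:
  const std::vector<Block>* allBlocks_;
  std::optional<std::vector<Block>> prefiltered_;
};
```

A scan without a `FILTER` thus never pays for a copy, and a filtered scan copies only the blocks that survive the prefilter.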

realHannes and others added 30 commits April 25, 2024 19:07

Added conversion str to int

99da434

templated function for toNumeric, add declaration to NaryExpression.h

424d023

str to num for SparqlExpression implemented + added test

0117e82

Merge branch 'ad-freiburg:master' into master

35fd0b1

Update src/engine/sparqlExpressions/StringExpressions.cpp

94356c2

Co-authored-by: Johannes Kalmbach <[email protected]>

Update src/engine/sparqlExpressions/StringExpressions.cpp

decc8ba

Co-authored-by: Johannes Kalmbach <[email protected]>

Update src/engine/sparqlExpressions/StringExpressions.cpp

850152c

Co-authored-by: Johannes Kalmbach <[email protected]>

Update src/engine/sparqlExpressions/StringExpressions.cpp

d650d67

Co-authored-by: Johannes Kalmbach <[email protected]>

using now absl::from_chars() and stripping whitespaces for string to number conv.

46cc697

added new functions to processIriFuntionCall() (for string to number)

7fc5c28

renaming to: toIntExpression and toDoubleExpression for later more general implementation

efb0e24

made format (clang-format-16)

a88537c

Update src/parser/sparqlParser/SparqlQleverVisitor.cpp

ca1e2e0

Co-authored-by: Johannes Kalmbach <[email protected]>

Update src/parser/sparqlParser/SparqlQleverVisitor.cpp

4adc831

Co-authored-by: Johannes Kalmbach <[email protected]>

renaming in NaryExpression.h for accordance with other function, adding correct prefix in Constants.h

d0f0d63

added test coverage for function calls makeIntExpression and make DoubleExpression

a118609

toNumeric has now correct behavior and uses absl::from_chars() and std::from_chars()

062052e

made clang-format for NaryExpressionImpl.h

6d0f42a

Merge branch 'ad-freiburg:master' into master

f90b8e2

Merge branch 'ad-freiburg:master' into master

fb88493

Merge remote-tracking branch 'upstream/master'

b2eb514

Merge branch 'ad-freiburg:master' into master

b165ac1

Merge branch 'master' of https://github.com/realHannes/qlever

7a3dfb2

Merge branch 'ad-freiburg:master' into master

fc0ad3a

Merge branch 'ad-freiburg:master' into master

f3e6086

Merge branch 'ad-freiburg:master' into master

fd4c351

Merge branch 'ad-freiburg:master' into master

220c9bf

Merge branch 'ad-freiburg:master' into master

a81cb8a

Merge branch 'ad-freiburg:master' into master

acc0109

Merge branch 'ad-freiburg:master' into master

cb8e560

realHannes added 2 commits November 21, 2024 15:53

select only Variable for first column index + extend constructor

61a15ce

set multiplicities in IndexScan constructor

6ef031b

joka921 requested changes Nov 21, 2024


realHannes and others added 7 commits November 23, 2024 19:16

changes for review (1)

bef8211

changes for review (2)

80d54df

correction for Codespell

25eb418

add testing

83f2363

Merge branch 'master' into implement-apply-prefilter-expressions

097e206

FilterTest fix + codespell fix

1574860

simplify test in IndexScanTest

9de745a

joka921 requested changes Nov 27, 2024


realHannes and others added 5 commits November 28, 2024 18:35

more changes

0a7672c

test

0ff8689

Merge branch 'master' into implement-apply-prefilter-expressions

bef9326

fix test error

6e54939

adapt code part merge conflict

2751fe1

joka921 reviewed Nov 29, 2024


joka921 requested changes Nov 29, 2024


realHannes and others added 2 commits November 30, 2024 17:57

Merge branch 'master' into implement-apply-prefilter-expressions

a884e39

use std::move + adjust test for this PR

d5ad1ba

joka921 approved these changes Dec 2, 2024


joka921 changed the title ~~Apply PrefilterExpressionIndex in IndexScan~~ Apply binary search filter expressions directly on the block metadata of Index Scans Dec 2, 2024

joka921 merged commit 7680177 into ad-freiburg:master Dec 2, 2024
22 checks passed

realHannes mentioned this pull request Dec 3, 2024

Extend PrefilterExpression for Literal and Iri #1653

Merged

hannahbast mentioned this pull request Dec 12, 2024

Make query planning of index scans fast again #1674

Merged
