Migrate docs from main repo #18
Open
ullingerc wants to merge 8 commits into qlever-dev:master from ullingerc:move-docs
+555
−0
b4da340
migrate docs from main repo
ullingerc 26a9351
minor formatting
Qup42 73a759c
apply feedback
ullingerc 7767491
Merge remote-tracking branch 'origin/master' into move-docs
1129320
Remove outdated `docs/misc.md`
c18cad7
Remove `geof:envelope` from "Materialized Views" section
eb9f3f3
Merge https://github.com/qlever-dev/qlever-docs into move-docs
ullingerc 1567550
add explanation on setup-config
ullingerc
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,309 @@ | ||
| # Path Search in QLever | ||
|
|
||
| The Path Search feature in this SPARQL engine allows users to perform advanced queries | ||
| to find paths between sources and targets in a graph. It supports a variety of configurations, | ||
| including single or multiple source and target nodes, optional edge properties, and | ||
| custom algorithms for path discovery. This feature is accessed using the `SERVICE` keyword | ||
| and the service IRI `<https://qlever.cs.uni-freiburg.de/pathSearch/>`. | ||
|
|
||
| ## Basic Syntax | ||
|
|
||
| The general structure of a Path Search query is as follows: | ||
|
|
||
| ```sparql | ||
| PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/> | ||
|
|
||
| SELECT ?start ?end ?path ?edge WHERE { | ||
| SERVICE pathSearch: { | ||
| _:path pathSearch:algorithm pathSearch:allPaths ; # Specify the algorithm | ||
| pathSearch:source <sourceNode> ; # Specify the source node(s) | ||
| pathSearch:target <targetNode> ; # Specify the target node(s) | ||
| pathSearch:pathColumn ?path ; # Bind the path variable | ||
| pathSearch:edgeColumn ?edge ; # Bind the edge variable | ||
| pathSearch:start ?start ; # Bind the edge start variable | ||
| pathSearch:end ?end ; # Bind the edge end variable | ||
| {SELECT * WHERE { | ||
| ?start <predicate> ?end. # Define the edge pattern | ||
| }} | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| ### Parameters | ||
|
|
||
| `pathSearch:algorithm`: Defines the algorithm used to search paths. Currently, only `pathSearch:allPaths` is supported. | ||
|
|
||
| `pathSearch:source`: Defines the source node(s) of the search. | ||
|
|
||
| `pathSearch:target` (optional): Defines the target node(s) of the search. | ||
|
|
||
| `pathSearch:pathColumn`: Defines the variable for the path. | ||
|
|
||
| `pathSearch:edgeColumn`: Defines the variable for the edge. | ||
|
|
||
| `pathSearch:start`: Defines the variable for the start of the edges. | ||
|
|
||
| `pathSearch:end`: Defines the variable for the end of the edges. | ||
|
|
||
| `pathSearch:edgeProperty` (optional): Specifies properties for the edges in the path. | ||
|
|
||
| `pathSearch:cartesian` (optional): Controls the behaviour of path searches between | ||
| source and target nodes. Expects a boolean. The default is `true`. If set to `true`, the search will compute the paths from each source to **all targets**. If set to `false`, the search will compute the paths from each source to exactly | ||
| **one target**. Sources and targets are paired based on their index (i.e. the paths | ||
| from the first source to the first target are searched, then the second source and | ||
| target, and so on). | ||
|
|
||
| `pathSearch:numPathsPerTarget` (optional): Limits the number of paths that are searched and stored | ||
| per target. Expects an integer. Once that many paths have been found, the search stops and all | ||
| other paths are ignored. Example: if the value is 5, the search enumerates paths until 5 paths | ||
| have been found. | ||
|
|
||
| ??? note "Examples" | ||
|
|
||
| **Single Source and Target** | ||
|
|
||
| The simplest case is searching for paths between a single source and a single target: | ||
|
|
||
| ```sparql | ||
| PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/> | ||
|
|
||
| SELECT ?start ?end ?path ?edge WHERE { | ||
| SERVICE pathSearch: { | ||
| _:path pathSearch:algorithm pathSearch:allPaths ; | ||
| pathSearch:source <source> ; | ||
| pathSearch:target <target> ; | ||
| pathSearch:pathColumn ?path ; | ||
| pathSearch:edgeColumn ?edge ; | ||
| pathSearch:start ?start ; | ||
| pathSearch:end ?end ; | ||
| { | ||
| SELECT * WHERE { | ||
| ?start <predicate> ?end. | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
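|
|
||
| Because `pathSearch:target` is optional, the target can also be left out. The following is a minimal sketch under that assumption; `<source>` and `<predicate>` are placeholders, and this section does not specify which end nodes are reported when no target is given (presumably every node reachable from the source): | ||
|
|
||
| ```sparql | ||
| PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/> | ||
|
|
||
| SELECT ?start ?end ?path ?edge WHERE { | ||
| SERVICE pathSearch: { | ||
| _:path pathSearch:algorithm pathSearch:allPaths ; | ||
| pathSearch:source <source> ; # No pathSearch:target given (assumed to be allowed) | ||
| pathSearch:pathColumn ?path ; | ||
| pathSearch:edgeColumn ?edge ; | ||
| pathSearch:start ?start ; | ||
| pathSearch:end ?end ; | ||
| { | ||
| SELECT * WHERE { | ||
| ?start <predicate> ?end. | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||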
|
|
||
| **Multiple Sources or Targets** | ||
|
|
||
| It is possible to specify a set of sources or targets for the path search. | ||
|
|
||
| ```sparql | ||
| PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/> | ||
|
|
||
| SELECT ?start ?end ?path ?edge WHERE { | ||
| SERVICE pathSearch: { | ||
| _:path pathSearch:algorithm pathSearch:allPaths ; | ||
| pathSearch:source <source1> ; | ||
| pathSearch:source <source2> ; | ||
| pathSearch:target <target1> ; | ||
| pathSearch:target <target2> ; | ||
| pathSearch:pathColumn ?path ; | ||
| pathSearch:edgeColumn ?edge ; | ||
| pathSearch:start ?start ; | ||
| pathSearch:end ?end ; | ||
| { | ||
| SELECT * WHERE { | ||
| ?start <predicate> ?end. | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| This query will search for all paths between all sources and all targets, i.e. | ||
| - (`<source1>`, `<target1>`) | ||
| - (`<source1>`, `<target2>`) | ||
| - (`<source2>`, `<target1>`) | ||
| - (`<source2>`, `<target2>`) | ||
|
|
||
| It is possible to specify whether the sources and targets should be combined according | ||
| to the cartesian product (as seen above) or if they should be matched up pairwise, i.e. | ||
| - (`<source1>`, `<target1>`) | ||
| - (`<source2>`, `<target2>`) | ||
|
|
||
| This can be done with the parameter `pathSearch:cartesian`. This parameter expects a | ||
| boolean. If set to `true`, then the cartesian product is used to match the sources with | ||
| the targets. | ||
| If set to `false`, then the sources and targets are matched pairwise. If left | ||
| unspecified, then the default `true` is used. | ||
|
|
||
| ```sparql | ||
| PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/> | ||
|
|
||
| SELECT ?start ?end ?path ?edge WHERE { | ||
| SERVICE pathSearch: { | ||
| _:path pathSearch:algorithm pathSearch:allPaths ; | ||
| pathSearch:source <source1> ; | ||
| pathSearch:source <source2> ; | ||
| pathSearch:target <target1> ; | ||
| pathSearch:target <target2> ; | ||
| pathSearch:pathColumn ?path ; | ||
| pathSearch:edgeColumn ?edge ; | ||
| pathSearch:start ?start ; | ||
| pathSearch:end ?end ; | ||
| pathSearch:cartesian false; | ||
| { | ||
| SELECT * WHERE { | ||
| ?start <predicate> ?end. | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| **Edge Properties** | ||
|
|
||
| You can also include edge properties in the path search to further refine the results: | ||
|
|
||
| ```sparql | ||
| PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/> | ||
|
|
||
| SELECT ?start ?end ?path ?edge WHERE { | ||
| SERVICE pathSearch: { | ||
| _:path pathSearch:algorithm pathSearch:allPaths ; | ||
| pathSearch:source <source> ; | ||
| pathSearch:target <target> ; | ||
| pathSearch:pathColumn ?path ; | ||
| pathSearch:edgeColumn ?edge ; | ||
| pathSearch:edgeProperty ?middle ; | ||
| pathSearch:start ?start ; | ||
| pathSearch:end ?end ; | ||
| { | ||
| SELECT * WHERE { | ||
| ?start <predicate1> ?middle. | ||
| ?middle <predicate2> ?end. | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| This is especially useful for [N-ary relations](https://www.w3.org/TR/swbp-n-aryRelations/). | ||
| Considering the example above, it is possible to query additional relations of `?middle`: | ||
|
|
||
| ```sparql | ||
| PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/> | ||
|
|
||
| SELECT ?start ?end ?path ?edge WHERE { | ||
| SERVICE pathSearch: { | ||
| _:path pathSearch:algorithm pathSearch:allPaths ; | ||
| pathSearch:source <source> ; | ||
| pathSearch:target <target> ; | ||
| pathSearch:pathColumn ?path ; | ||
| pathSearch:edgeColumn ?edge ; | ||
| pathSearch:edgeProperty ?middle ; | ||
| pathSearch:edgeProperty ?edgeInfo ; | ||
| pathSearch:start ?start ; | ||
| pathSearch:end ?end ; | ||
| { | ||
| SELECT * WHERE { | ||
| ?start <predicate1> ?middle. | ||
| ?middle <predicate2> ?end. | ||
| ?middle <predicate3> ?edgeInfo. | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| This makes it possible to query additional properties of the edge between `?start` and `?end` (such as `?edgeInfo` in the example above). | ||
|
|
||
|
|
||
| **Source or Target as Variables** | ||
|
|
||
| You can also bind the source and/or target dynamically using variables. The examples | ||
| below use `VALUES` clauses, which can be convenient to specify sources and targets. | ||
| However, the source/target variables can also be bound using any regular SPARQL construct. | ||
|
|
||
| **Source Variable** | ||
|
|
||
| ```sparql | ||
| PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/> | ||
|
|
||
| SELECT ?start ?end ?path ?edge WHERE { | ||
| VALUES ?source {<source>} | ||
| SERVICE pathSearch: { | ||
| _:path pathSearch:algorithm pathSearch:allPaths ; | ||
| pathSearch:source ?source ; | ||
| pathSearch:target <target> ; | ||
| pathSearch:pathColumn ?path ; | ||
| pathSearch:edgeColumn ?edge ; | ||
| pathSearch:start ?start ; | ||
| pathSearch:end ?end ; | ||
| { | ||
| SELECT * WHERE { | ||
| ?start <p> ?end. | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| **Target Variable** | ||
|
|
||
| ```sparql | ||
| PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/> | ||
|
|
||
| SELECT ?start ?end ?path ?edge WHERE { | ||
| VALUES ?target {<target>} | ||
| SERVICE pathSearch: { | ||
| _:path pathSearch:algorithm pathSearch:allPaths ; | ||
| pathSearch:source <source> ; | ||
| pathSearch:target ?target ; | ||
| pathSearch:pathColumn ?path ; | ||
| pathSearch:edgeColumn ?edge ; | ||
| pathSearch:start ?start ; | ||
| pathSearch:end ?end ; | ||
| { | ||
| SELECT * WHERE { | ||
| ?start <p> ?end. | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
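|
|
||
| As noted above, the source and target variables do not have to come from a `VALUES` clause. The following sketch binds `?source` with an ordinary triple pattern instead; `<is-a>` and `<SourceClass>` are hypothetical names chosen for illustration, and every node matching the pattern is assumed to act as a source: | ||
|
|
||
| ```sparql | ||
| PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/> | ||
|
|
||
| SELECT ?start ?end ?path ?edge WHERE { | ||
| ?source <is-a> <SourceClass> . # Hypothetical pattern that binds the source node(s) | ||
| SERVICE pathSearch: { | ||
| _:path pathSearch:algorithm pathSearch:allPaths ; | ||
| pathSearch:source ?source ; | ||
| pathSearch:target <target> ; | ||
| pathSearch:pathColumn ?path ; | ||
| pathSearch:edgeColumn ?edge ; | ||
| pathSearch:start ?start ; | ||
| pathSearch:end ?end ; | ||
| { | ||
| SELECT * WHERE { | ||
| ?start <p> ?end. | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||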
|
|
||
| **Limit Number of Paths per Target** | ||
|
|
||
| It is possible to limit how many paths per target are returned. This is especially useful if | ||
| the query uses a lot of memory. In that case, it is possible to query a limited number of | ||
| paths to debug where the problem is. | ||
|
|
||
| For example, the following query will return only one path per source and target pair, | ||
| i.e. one path for `(<source1>, <target1>)`, one path for `(<source1>, <target2>)`, and so on. | ||
|
|
||
| ```sparql | ||
| PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/> | ||
|
|
||
| SELECT ?start ?end ?path ?edge WHERE { | ||
| SERVICE pathSearch: { | ||
| _:path pathSearch:algorithm pathSearch:allPaths ; | ||
| pathSearch:source <source1> ; | ||
| pathSearch:source <source2> ; | ||
| pathSearch:target <target1> ; | ||
| pathSearch:target <target2> ; | ||
| pathSearch:pathColumn ?path ; | ||
| pathSearch:edgeColumn ?edge ; | ||
| pathSearch:start ?start ; | ||
| pathSearch:end ?end ; | ||
| pathSearch:numPathsPerTarget 1; | ||
| { | ||
| SELECT * WHERE { | ||
| ?start <predicate> ?end. | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| ## Error Handling | ||
|
|
||
| The Path Search feature will throw errors in the following scenarios: | ||
|
|
||
| - **Missing Start Parameter**: If the `start` parameter is not specified, an error will be raised. | ||
| - **Multiple Start or End Variables**: If multiple `start` or `end` variables are defined, an error is raised. | ||
| - **Invalid Non-Variable Start/End**: If the `start` or `end` parameter is not bound to a variable, the query will fail. | ||
| - **Unsupported Argument**: Arguments other than those listed (like custom user arguments) will cause an error. | ||
| - **Non-IRI Predicate**: Predicates must be IRIs. If not, an error will occur. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,84 @@ | ||
| # Miscellaneous special features | ||
|
|
||
| ## Internal Triples for SPARQL+Text and SPARQL Autocompletion | ||
|
|
||
| On top of the vanilla SPARQL functionality, QLever allows so-called SPARQL+Text | ||
| queries on a text corpus linked to a knowledge base via entity recognition. For | ||
| example, the following query finds all mentions of astronauts next to the words | ||
| "moon" and "walk*" in the text corpus: | ||
|
|
||
| ```sparql | ||
| SELECT ?a TEXT(?t) SCORE(?t) WHERE { | ||
| ?a <is-a> <Astronaut> . | ||
| ?t ql:contains-entity ?a . | ||
| ?t ql:contains-word "walk* moon" | ||
| } ORDER BY DESC(SCORE(?t)) | ||
| ``` | ||
|
|
||
| Such queries can be simulated in standard SPARQL, but only with poor | ||
| performance; see the CIKM'17 paper. Details about the required input data | ||
| and the SPARQL+Text query syntax and semantics can be found in the | ||
| [text search documentation](text-search.md). | ||
|
|
||
| QLever also supports efficient SPARQL autocompletion. For example, the | ||
| following query yields a list of all predicates associated with people in the | ||
| knowledge base, ordered by the number of people who have that predicate. | ||
|
|
||
| ```sparql | ||
| SELECT ?predicate (COUNT(?predicate) as ?count) WHERE { | ||
| ?x <is-a> <Person> . | ||
| ?x ql:has-predicate ?predicate | ||
| } | ||
| GROUP BY ?predicate | ||
| ORDER BY DESC(?count) | ||
| ``` | ||
|
|
||
| Note that this query could also be processed by a standard SPARQL engine simply | ||
| by replacing the second triple with `?x ?predicate ?object` and adding | ||
| `DISTINCT` inside the `COUNT()`. | ||
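|
|
||
| Spelled out, the standard SPARQL variant described above might look like the following sketch. It assumes that the `DISTINCT` is applied to `?x`, so that each person is counted once per predicate even if that person has several objects for it: | ||
|
|
||
| ```sparql | ||
| SELECT ?predicate (COUNT(DISTINCT ?x) AS ?count) WHERE { | ||
| ?x <is-a> <Person> . | ||
| ?x ?predicate ?object # Replaces the ql:has-predicate triple | ||
| } | ||
| GROUP BY ?predicate | ||
| ORDER BY DESC(?count) | ||
| ``` | ||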
|
|
||
| However, that query will produce a very large intermediate result (all triples | ||
| of all people) with a correspondingly long query time. In contrast, the query | ||
| above takes only about 100 ms on a standard Linux machine (with 16 GB memory) | ||
| and a dataset with 360 million triples and 530 million text records. | ||
|
|
||
| ## Statistics | ||
|
|
||
| You can get statistics for the currently active index in the following way: | ||
|
|
||
| ``` | ||
| <server>:<port>/?cmd=stats | ||
| ``` | ||
|
|
||
| This query will yield a JSON response that features: | ||
|
|
||
| * The name of the KB index | ||
| * The number of triples in the KB index | ||
| * The number of index permutations built (usually 2 or 6) | ||
| * The numbers of distinct subjects, predicates and objects (only available if 6 permutations are built) | ||
| * The name of the text index (if one is present) | ||
| * The number of text records in the text index (if a text index is present) | ||
| * The number of word occurrences/postings in the text index (if a text index is present) | ||
| * The number of entity occurrences/postings in the text index (if a text index is present) | ||
|
|
||
| The name of an index defaults to the name of the input file (and of the wordsfile for the | ||
| text index), but can also be specified manually while building an index. | ||
| For this purpose, IndexbuilderMain takes two optional arguments: `--text-index-name` (`-T`) | ||
| and `--kb-index-name` (`-K`). | ||
|
|
||
| ## Send vs Compute | ||
|
|
||
| Currently, QLever does not compute partial results if there is a `LIMIT` modifier. | ||
|
|
||
| However, strings (for entities and text excerpts) are only resolved for those | ||
| items that will be transmitted. Furthermore, a UI usually only requires | ||
| a limited number of rows at a time. | ||
|
|
||
| While specifying a `LIMIT` is recommended, some experiments may want | ||
| to measure the time to produce the full result. | ||
| Therefore, an additional HTTP parameter `&send=<x>` can be used to send only | ||
| `x` result rows while still computing the readable result for up to `LIMIT` rows. | ||
|
|
||
| **IMPORTANT: Unless you want to measure QLever's performance, using `LIMIT` (+ | ||
| `OFFSET` for sequential loading) is preferred in all applications. `LIMIT` is | ||
| faster and produces the same output as the `send` parameter** |
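|
|
||
| For sequential loading with `LIMIT` and `OFFSET`, a request for the second page of results might look like the following sketch. The `<is-a> <Person>` pattern is reused from the autocompletion example above and the page size of 100 is arbitrary; the `ORDER BY` is added here as an assumption, so that consecutive pages see the rows in a stable order: | ||
|
|
||
| ```sparql | ||
| # Second page of 100 rows; use OFFSET 0 for the first page, OFFSET 200 for the third, and so on | ||
| SELECT ?x WHERE { | ||
| ?x <is-a> <Person> | ||
| } | ||
| ORDER BY ?x | ||
| LIMIT 100 | ||
| OFFSET 100 | ||
| ``` | ||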