qlever-dev · ullingerc · Jan 7, 2026 · Jan 8, 2026 · Jan 9, 2026 · Jan 23, 2026
diff --git a/docs/path-search.md b/docs/path-search.md
@@ -0,0 +1,309 @@
+# Path Search in QLever
+
+The Path Search feature in QLever allows users to find paths between sources and targets
+in a graph. It supports single or multiple source and target nodes, optional edge
+properties, and a selectable path-search algorithm. The feature is accessed using the
+`SERVICE` keyword and the service IRI `<https://qlever.cs.uni-freiburg.de/pathSearch/>`.
+
+## Basic Syntax
+
+The general structure of a Path Search query is as follows:
+
+```sparql
+PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/>
+
+SELECT ?start ?end ?path ?edge WHERE {
+  SERVICE pathSearch: {
+    _:path pathSearch:algorithm pathSearch:allPaths ;  # Specify the algorithm
+           pathSearch:source <sourceNode> ;            # Specify the source node(s)
+           pathSearch:target <targetNode> ;            # Specify the target node(s)
+           pathSearch:pathColumn ?path ;               # Bind the path variable
+           pathSearch:edgeColumn ?edge ;               # Bind the edge variable
+           pathSearch:start ?start ;                   # Bind the edge start variable
+           pathSearch:end ?end ;                       # Bind the edge end variable
+    {SELECT * WHERE {
+        ?start <predicate> ?end.                       # Define the edge pattern
+    }}
+  }
+}
+```
+
+### Parameters
+
+`pathSearch:algorithm`: Defines the algorithm used to search paths. Currently, only `pathSearch:allPaths` is supported.
+
+`pathSearch:source`: Defines the source node(s) of the search.
+
+`pathSearch:target` (optional): Defines the target node(s) of the search.
+
+`pathSearch:pathColumn`: Defines the variable for the path.
+
+`pathSearch:edgeColumn`: Defines the variable for the edge.
+
+`pathSearch:start`: Defines the variable for the start of the edges.
+
+`pathSearch:end`: Defines the variable for the end of the edges.
+
+`pathSearch:edgeProperty` (optional): Specifies properties for the edges in the path.
+
+`pathSearch:cartesian` (optional): Controls how sources and targets are combined.
+  Expects a boolean. The default is `true`. If set to `true`, the search computes the
+  paths from each source to **all targets**. If set to `false`, the search computes the
+  paths from each source to exactly **one target**: sources and targets are paired by
+  their index (i.e. the paths from the first source to the first target are searched,
+  then those from the second source to the second target, and so on).
+
+`pathSearch:numPathsPerTarget` (optional): Limits the number of paths that are searched
+  and stored per target. Expects an integer. Example: if the value is 5, the search
+  enumerates paths until 5 paths have been found; any further paths are ignored.
+
+??? note "Examples"
+
+    **Single Source and Target**
+
+    The simplest case is searching for paths between a single source and a single target:
+
+    ```sparql
+    PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/>
+
+    SELECT ?start ?end ?path ?edge WHERE {
+      SERVICE pathSearch: {
+        _:path pathSearch:algorithm pathSearch:allPaths ;
+              pathSearch:source <source> ;
+              pathSearch:target <target> ;
+              pathSearch:pathColumn ?path ;
+              pathSearch:edgeColumn ?edge ;
+              pathSearch:start ?start ;
+              pathSearch:end ?end ;
+        {
+          SELECT * WHERE {
+            ?start <predicate> ?end.
+          }
+        }
+      }
+    }
+    ```
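+
+    The parameter list marks `pathSearch:target` as optional. As a minimal sketch, a query
+    without a target might look as follows. It presumably returns the paths from `<source>`
+    to every node reachable via the edge pattern, but that behaviour is an assumption here
+    and should be checked against the running engine. `<source>` and `<predicate>` are
+    placeholders, as in the example above.
+
+    ```sparql
+    PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/>
+
+    SELECT ?start ?end ?path ?edge WHERE {
+      SERVICE pathSearch: {
+        _:path pathSearch:algorithm pathSearch:allPaths ;
+              pathSearch:source <source> ;       # no pathSearch:target given (assumed: unrestricted)
+              pathSearch:pathColumn ?path ;
+              pathSearch:edgeColumn ?edge ;
+              pathSearch:start ?start ;
+              pathSearch:end ?end ;
+        {
+          SELECT * WHERE {
+            ?start <predicate> ?end.
+          }
+        }
+      }
+    }
+    ```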
+
+    **Multiple Sources or Targets**
+
+    It is possible to specify a set of sources or targets for the path search.
+
+    ```sparql
+    PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/>
+
+    SELECT ?start ?end ?path ?edge WHERE {
+      SERVICE pathSearch: {
+        _:path pathSearch:algorithm pathSearch:allPaths ;
+              pathSearch:source <source1> ;
+              pathSearch:source <source2> ;
+              pathSearch:target <target1> ;
+              pathSearch:target <target2> ;
+              pathSearch:pathColumn ?path ;
+              pathSearch:edgeColumn ?edge ;
+              pathSearch:start ?start ;
+              pathSearch:end ?end ;
+        {
+          SELECT * WHERE {
+            ?start <predicate> ?end.
+          }
+        }
+      }
+    }
+    ```
+
+    This query will search for paths between all sources and all targets, i.e.
+
+    - (`<source1>`, `<target1>`)
+    - (`<source1>`, `<target2>`)
+    - (`<source2>`, `<target1>`)
+    - (`<source2>`, `<target2>`)
+
+    It is possible to specify whether the sources and targets should be combined according
+    to the cartesian product (as seen above) or matched up pairwise, i.e.
+
+    - (`<source1>`, `<target1>`)
+    - (`<source2>`, `<target2>`)
+
+    This can be done with the parameter `pathSearch:cartesian`, which expects a boolean.
+    If set to `true`, the cartesian product is used to match the sources with the targets.
+    If set to `false`, the sources and targets are matched pairwise. If left unspecified,
+    the default `true` is used.
+
+    ```sparql
+    PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/>
+
+    SELECT ?start ?end ?path ?edge WHERE {
+      SERVICE pathSearch: {
+        _:path pathSearch:algorithm pathSearch:allPaths ;
+              pathSearch:source <source1> ;
+              pathSearch:source <source2> ;
+              pathSearch:target <target1> ;
+              pathSearch:target <target2> ;
+              pathSearch:pathColumn ?path ;
+              pathSearch:edgeColumn ?edge ;
+              pathSearch:start ?start ;
+              pathSearch:end ?end ;
+              pathSearch:cartesian false ;
+        {
+          SELECT * WHERE {
+            ?start <predicate> ?end.
+          }
+        }
+      }
+    }
+    ```
+
+    **Edge Properties**
+
+    You can also include edge properties in the path search to further refine the results:
+
+    ```sparql
+    PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/>
+
+    SELECT ?start ?end ?path ?edge WHERE {
+      SERVICE pathSearch: {
+        _:path pathSearch:algorithm pathSearch:allPaths ;
+              pathSearch:source <source> ;
+              pathSearch:target <target> ;
+              pathSearch:pathColumn ?path ;
+              pathSearch:edgeColumn ?edge ;
+              pathSearch:edgeProperty ?middle ;
+              pathSearch:start ?start ;
+              pathSearch:end ?end ;
+        {
+          SELECT * WHERE {
+            ?start <predicate1> ?middle.
+            ?middle <predicate2> ?end.
+          }
+        }
+      }
+    }
+    ```
+
+    This is especially useful for [N-ary relations](https://www.w3.org/TR/swbp-n-aryRelations/). 
+    Considering the example above, it is possible to query additional relations of `?middle`:
+
+    ```sparql
+    PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/>
+
+    SELECT ?start ?end ?path ?edge WHERE {
+      SERVICE pathSearch: {
+        _:path pathSearch:algorithm pathSearch:allPaths ;
+              pathSearch:source <source> ;
+              pathSearch:target <target> ;
+              pathSearch:pathColumn ?path ;
+              pathSearch:edgeColumn ?edge ;
+              pathSearch:edgeProperty ?middle ;
+              pathSearch:edgeProperty ?edgeInfo ;
+              pathSearch:start ?start ;
+              pathSearch:end ?end ;
+        {
+          SELECT * WHERE {
+            ?start <predicate1> ?middle.
+            ?middle <predicate2> ?end.
+            ?middle <predicate3> ?edgeInfo.
+          }
+        }
+      }
+    }
+    ```
+
+    This makes it possible to query additional properties of the edge between `?start` and `?end` (such as `?edgeInfo` in the example above).
+
+
+    **Source or Target as Variables**
+
+    You can also bind the source and/or target dynamically using variables. The examples
+    below use `VALUES` clauses, which can be convenient to specify sources and targets.
+    However, the source/target variables can also be bound using any regular SPARQL construct.
+
+    **Source Variable**
+
+    ```sparql
+    PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/>
+
+    SELECT ?start ?end ?path ?edge WHERE {
+      VALUES ?source {<source>}
+      SERVICE pathSearch: {
+        _:path pathSearch:algorithm pathSearch:allPaths ;
+              pathSearch:source ?source ;
+              pathSearch:target <target> ;
+              pathSearch:pathColumn ?path ;
+              pathSearch:edgeColumn ?edge ;
+              pathSearch:start ?start ;
+              pathSearch:end ?end ;
+        {
+          SELECT * WHERE {
+            ?start <p> ?end.
+          }
+        }
+      }
+    }
+    ```
+
+    **Target Variable**
+
+    ```sparql
+    PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/>
+
+    SELECT ?start ?end ?path ?edge WHERE {
+      VALUES ?target {<target>}
+      SERVICE pathSearch: {
+        _:path pathSearch:algorithm pathSearch:allPaths ;
+              pathSearch:source <source> ;
+              pathSearch:target ?target ;
+              pathSearch:pathColumn ?path ;
+              pathSearch:edgeColumn ?edge ;
+              pathSearch:start ?start ;
+              pathSearch:end ?end ;
+        {
+          SELECT * WHERE {
+            ?start <p> ?end.
+          }
+        }
+      }
+    }
+    ```
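+
+    As noted above, the source variable can also be bound by a regular SPARQL construct
+    instead of `VALUES`. The following is a sketch under assumptions: the predicate `<is-a>`
+    and the class `<City>` are hypothetical placeholders, and the triple pattern is placed
+    where the `VALUES` clause sits in the examples above. Every binding of `?source`
+    would then act as one source of the search.
+
+    ```sparql
+    PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/>
+
+    SELECT ?start ?end ?path ?edge WHERE {
+      ?source <is-a> <City> .                    # hypothetical pattern that binds the sources
+      SERVICE pathSearch: {
+        _:path pathSearch:algorithm pathSearch:allPaths ;
+              pathSearch:source ?source ;
+              pathSearch:target <target> ;
+              pathSearch:pathColumn ?path ;
+              pathSearch:edgeColumn ?edge ;
+              pathSearch:start ?start ;
+              pathSearch:end ?end ;
+        {
+          SELECT * WHERE {
+            ?start <p> ?end.
+          }
+        }
+      }
+    }
+    ```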
+
+    **Limit Number of Paths per Target**
+
+    It is possible to limit how many paths per target are returned. This is especially useful if
+    the query uses a lot of memory: querying a limited number of paths can help to debug
+    where the problem is.
+
+    For example, the following query returns at most one path per source and target pair,
+    i.e. one path for `(<source1>, <target1>)`, one path for `(<source1>, <target2>)`, and so on.
+
+    ```sparql
+    PREFIX pathSearch: <https://qlever.cs.uni-freiburg.de/pathSearch/>
+
+    SELECT ?start ?end ?path ?edge WHERE {
+      SERVICE pathSearch: {
+        _:path pathSearch:algorithm pathSearch:allPaths ;
+              pathSearch:source <source1> ;
+              pathSearch:source <source2> ;
+              pathSearch:target <target1> ;
+              pathSearch:target <target2> ;
+              pathSearch:pathColumn ?path ;
+              pathSearch:edgeColumn ?edge ;
+              pathSearch:start ?start ;
+              pathSearch:end ?end ;
+              pathSearch:numPathsPerTarget 1 ;
+        {
+          SELECT * WHERE {
+            ?start <predicate> ?end.
+          }
+        }
+      }
+    }
+    ```
+
+## Error Handling
+
+The Path Search feature will throw errors in the following scenarios:
+
+- **Missing Start Parameter**: If the `start` parameter is not specified, an error will be raised.
+- **Multiple Start or End Variables**: If multiple `start` or `end` variables are defined, an error is raised.
+- **Invalid Non-Variable Start/End**: If the `start` or `end` parameter is not bound to a variable, the query will fail.
+- **Unsupported Argument**: Arguments other than those listed (like custom user arguments) will cause an error.
+- **Non-IRI Predicate**: Predicates must be IRIs. If not, an error will occur.
diff --git a/docs/special-features.md b/docs/special-features.md
@@ -0,0 +1,84 @@
+# Miscellaneous special features
+
+## Internal Triples for SPARQL+Text and SPARQL Autocompletion
+
+On top of the vanilla SPARQL functionality, QLever allows so-called SPARQL+Text
+queries on a text corpus linked to a knowledge base via entity recognition.  For
+example, the following query finds all mentions of astronauts next to the words
+"moon" and "walk*" in the text corpus:
+
+```sparql
+SELECT ?a TEXT(?t) SCORE(?t) WHERE {
+    ?a <is-a> <Astronaut> .
+    ?t ql:contains-entity ?a .
+    ?t ql:contains-word "walk* moon"
+} ORDER BY DESC(SCORE(?t))
+```
+
+Such queries can be simulated in standard SPARQL, but only with poor
+performance; see the CIKM'17 paper "QLever: A Query Engine for Efficient
+SPARQL+Text Search".  Details about the required input data
+and the SPARQL+Text query syntax and semantics can be found
+[here](text-search.md).
+
+QLever also supports efficient SPARQL autocompletion.  For example, the
+following query yields a list of all predicates associated with people in the
+knowledge base, ordered by the number of people who have that predicate.
+
+```sparql
+SELECT ?predicate (COUNT(?predicate) as ?count) WHERE {
+    ?x <is-a> <Person> .
+    ?x ql:has-predicate ?predicate
+}
+GROUP BY ?predicate
+ORDER BY DESC(?count)
+```
+
+Note that this query could also be processed by a standard SPARQL engine simply
+by replacing the second triple with `?x ?predicate ?object` and adding
+`DISTINCT` inside the `COUNT()`.
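+
+As a sketch, the standard SPARQL variant described above could be written as follows.
+It assumes that the `DISTINCT` is applied to `?x`, so that each person is counted only
+once per predicate even if that person has several objects for it:
+
+```sparql
+SELECT ?predicate (COUNT(DISTINCT ?x) AS ?count) WHERE {
+    ?x <is-a> <Person> .
+    ?x ?predicate ?object
+}
+GROUP BY ?predicate
+ORDER BY DESC(?count)
+```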
+
+However, that query will produce a very large intermediate result (all triples
+of all people) with a correspondingly long query time.  In contrast, the query
+above takes only about 100 ms on a standard Linux machine (with 16 GB memory)
+and a dataset with 360 million triples and 530 million text records.
+
+## Statistics
+
+You can get statistics for the currently active index in the following way:
+
+```
+<server>:<port>/?cmd=stats
+```
+
+This query will yield a JSON response that features:
+
+* The name of the KB index
+* The number of triples in the KB index
+* The number of index permutations built (usually 2 or 6)
+* The numbers of distinct subjects, predicates and objects (only available if 6 permutations are built)
+* The name of the text index (if one is present)
+* The number of text records in the text index (if a text index is present)
+* The number of word occurrences/postings in the text index (if a text index is present)
+* The number of entity occurrences/postings in the text index (if a text index is present)
+
+The name of an index is the name of the input file (and wordsfile for the
+text index), but can also be specified manually while building an index.
+For this purpose, `IndexBuilderMain` takes two optional arguments:
+`--text-index-name` (`-T`) and `--kb-index-name` (`-K`).
+
+## Send vs Compute
+
+Currently, QLever does not compute partial results if there is a `LIMIT` modifier.
+
+However, strings (for entities and text excerpts) are only resolved for those
+items that will be transmitted.  Furthermore, a UI usually only requires
+a limited number of rows at a time.
+
+While specifying a `LIMIT` is recommended, some experiments may want
+to measure the time to produce the full result.
+Therefore, an additional HTTP parameter `&send=<x>` can be used to send only
+`x` result rows while still computing the readable result for up to `LIMIT` rows.
+
+**IMPORTANT: Unless you want to measure QLever's performance, using `LIMIT` (+
+`OFFSET` for sequential loading) is preferred in all applications. `LIMIT` is
+faster and produces the same output as the `send` parameter.**
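+
+As a sketch of sequential loading with `LIMIT` and `OFFSET`, the autocompletion query
+from the first section could be fetched in pages of 100 rows. The page size is an
+arbitrary choice for illustration. The following query fetches the third page
+(rows 201 to 300):
+
+```sparql
+SELECT ?predicate (COUNT(?predicate) AS ?count) WHERE {
+    ?x <is-a> <Person> .
+    ?x ql:has-predicate ?predicate
+}
+GROUP BY ?predicate
+ORDER BY DESC(?count)
+LIMIT 100
+OFFSET 200
+```
+
+The `ORDER BY` clause helps to keep the pages consistent across requests; without it,
+the row order between two requests is not guaranteed by SPARQL.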