Commit 8f9b13a
Completely refactor the fulltext operations (#1093)
As of this commit, the fulltext index (triggered by `ql:contains-word` and `ql:contains-entity`) uses two basic operations:
1. `TextIndexScanForWord`: For a given word or prefix, return all text records that contain the word (possibly together with the matched word in the case of a prefix, and the score of the match).
2. `TextIndexScanForEntity`: For a given word or prefix, return a superset of all pairs of `(text, entity)` where the entity is contained in the text according to `ql:contains-entity` and the text contains the word. For technical reasons this is a superset: the scan always has to read the complete block from the half-inverted index, and that block might belong to a shorter prefix.
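The contract of the two scans can be illustrated with a toy in-memory model. This is only a sketch and not the code from this commit: the corpus, the function names `scan_for_word` and `scan_for_entity`, the trailing `*` as prefix marker, the occurrence count used as score, and the block size `BLOCK_PREFIX_LEN` are all assumptions made for illustration.

```python
# Toy corpus (hypothetical data): text record id -> words / entities.
WORDS = {
    1: ["astronomer", "moon"],
    2: ["astronaut", "moon"],
    3: ["asteroid", "belt"],
}
ENTITIES = {
    1: ["<Galileo>"],
    2: ["<Armstrong>"],
    3: ["<Ceres>"],
}

# Assumed block granularity: one block per 3-character prefix.
BLOCK_PREFIX_LEN = 3


def matches(word, pattern):
    """A trailing '*' marks a prefix search; otherwise match exactly."""
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern


def scan_for_word(pattern):
    """Return (text_id, matched_word, score) for every matching record.

    The score is a toy value: the number of occurrences of the matched word.
    """
    rows = []
    for text_id, words in sorted(WORDS.items()):
        hits = [w for w in words if matches(w, pattern)]
        for w in sorted(set(hits)):
            rows.append((text_id, w, hits.count(w)))
    return rows


def scan_for_entity(pattern):
    """Return (text_id, entity) pairs for the whole block of the pattern.

    The block is keyed by a short prefix, so the result can contain
    records that do not match the full pattern (a superset).
    """
    block_key = pattern.rstrip("*")[:BLOCK_PREFIX_LEN]
    rows = []
    for text_id, words in sorted(WORDS.items()):
        if any(w.startswith(block_key) for w in words):
            rows.extend((text_id, e) for e in ENTITIES[text_id])
    return rows


word_rows = scan_for_word("astro*")
entity_rows = scan_for_entity("astro*")
```

In this toy model, `scan_for_entity("astro*")` also returns the pair for record 3, whose only matching word is `asteroid`, because that record lives in the same `ast` block. The word scan does not return record 3, so a later join on the text id discards the extra pair.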
The general processing is then as follows:
* For each word or prefix that appears as part of the object of a `ql:contains-word` triple, a `TextIndexScanForWord` is created.
* For each entity or variable that appears as the object of a `ql:contains-entity` triple, a `TextIndexScanForEntity` is created.
* The rest of the query processing is handled by the "ordinary" query planner using the normal operations like JOIN that are also used to process standard SPARQL queries.
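The processing steps above can be sketched as two independent scan results combined by a generic join. The snippet below is a hedged illustration and not the planner's actual code: `hash_join`, the row layout, and the sample rows are hypothetical, and the scan outputs are hard-coded rather than read from an index.

```python
# Query shape being illustrated (as comments only):
#   ?text ql:contains-word "astro*" .
#   ?text ql:contains-entity ?entity .


def hash_join(left, right, key):
    """Generic inner join of two row lists (lists of dicts) on one column.

    Nothing here is specific to fulltext search; the same function could
    join any two tables that share the column `key`.
    """
    buckets = {}
    for row in right:
        buckets.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in buckets.get(l[key], [])]


# Hypothetical output of a word scan for the prefix "astro*".
word_scan = [
    {"text": 1, "word": "astronomer"},
    {"text": 2, "word": "astronaut"},
]

# Hypothetical output of an entity scan: a superset that also contains
# text 3, which does not match the prefix.
entity_scan = [
    {"text": 1, "entity": "<Galileo>"},
    {"text": 2, "entity": "<Armstrong>"},
    {"text": 3, "entity": "<Ceres>"},
]

# The ordinary join removes the superfluous row for text 3.
result = hash_join(word_scan, entity_scan, "text")
```

Because the join knows nothing about the text index, the scans stay simple leaf operations, and the extra rows of the entity scan are filtered out as a side effect of joining on the text column.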
This is much cleaner than the old `TextOperationWith[out]Filter` operations, which combined the functionality of the above scan operations with JOIN operations. The old approach led to a lot of code duplication (the code for a join of two tables was duplicated for the fulltext module). The new approach also makes queries easier to optimize and to reason about, because the runtime information trees become much clearer when the scans and joins are represented separately.
Parent commit: f7c2c32
File tree (33 files changed, +1624 -863 lines changed):
- e2e
- src
- engine
- global
- index
- parser
- data
- sparqlParser
- test
- engine
- util