Add missing internal triples for UPDATEs#2674
Add missing internal triples for UPDATEs#2674RobinTF wants to merge 9 commits intoad-freiburg:masterfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #2674 +/- ##
=======================================
Coverage 91.58% 91.59%
=======================================
Files 480 480
Lines 41357 41368 +11
Branches 5494 5496 +2
=======================================
+ Hits 37877 37889 +12
+ Misses 1901 1900 -1
Partials 1579 1579 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
joka921
left a comment
There was a problem hiding this comment.
A request for minor improvements.
joka921
left a comment
There was a problem hiding this comment.
small suggestions for the caching, let's see how easy they are to implement, but there is a chance to learn something here:)
joka921
left a comment
There was a problem hiding this comment.
Definitely improved, I have some minor suggestions, but the next round should be good to go.
src/index/DeltaTriples.h
Outdated
| ad_utility::util::LRUCache<std::string, Id> languageTagCache_{ | ||
| languageTagCacheSize_}; | ||
|
|
||
| // Cache commonly used predicates between calls. |
There was a problem hiding this comment.
Can be more precise. It caches the IDs of commonly used language tagged predicates like @en@rdfs:label
There was a problem hiding this comment.
That's not what it does though. It just caches the predicates without the language tags, because those might be the ones that are most likely expensive to look up.
| CPP_template(typename Key, typename Func)( | ||
| requires ad_utility::InvocableWithConvertibleReturnType< | ||
| Func, V, const K&>) const V& getOrCompute(const K& key, | ||
| Func, V, const K&>) const V& getOrCompute(Key&& key, |
There was a problem hiding this comment.
Should be Func, V, const Key& in the requires clause for a little more precision.
There was a problem hiding this comment.
No it shouldn't, because we end up passing a const ref of the actual thing to the function. (I ended up with slightly different code than we discussed)
There was a problem hiding this comment.
Agreed, thanks for the explanation.
| } | ||
| auto result = cache_.try_emplace(key, computeFunction(key), keys_.begin()); | ||
| auto result = cache_.try_emplace( | ||
| AD_FWD(key), computeFunction(keys_.front()), keys_.begin()); |
There was a problem hiding this comment.
yes, that does the trick (the double string creation still is not nice, but that was already there before...)
src/index/DeltaTriples.h
Outdated
| languageTagCacheSize_}; | ||
|
|
||
| // Cache commonly used predicates between calls. | ||
| static constexpr size_t predicateCacheSize_ = 50; |
There was a problem hiding this comment.
I think both of the cache sizes are a little small (it is global for the full index, maybe use a few hundreds or thousands?
There was a problem hiding this comment.
For the language tags, the cache size seems reasonable in my opinion, there aren't that many frequently used languages. For the predicates you might have a point. It's heavily dataset dependent though. For Wikidata to get to a predicate that's being used fewer than 1M times, you'd have to cache at least 587 predicates which are used more frequently.
I'll increase it to 100 for now and I'll leave the final decision to @hannahbast
joka921
left a comment
There was a problem hiding this comment.
Thank youy very much, feel free to forward this to @hannahbast
Overview
Conformance check passed ✅No test result changes. |
|



This PR adds code to add the missing
"object"@language ql:langtag <@language>internal triples to the delta triples on insertions. This way all kinds of language filters now work with update. A caveat is that these triples are never removed again, so the memory requirement will simply increase more and more, but the behaviour will never be wrong, since these new triples are always joined with a regular index scan before being used.