Completely refactor the fulltext operations #1093
Merged
Changes from 8 commits
162 commits
a91a811
quick fix so indexToOptionalString works
NickG-1 36bc2fd
Increased readability in IndexImpl.Text.cpp plus WordId shows up in r…
NickG-1 a8483fc
small fix
NickG-1 aef409d
functional added ?completedWord to output
NickG-1 b7c468e
Revert "quick fix so indexToOptionalString works"
NickG-1 8d8d54b
changed gitignore
NickG-1 ddc76ef
Merge branch 'uptodate'
NickG-1 88f2783
reintegrated vocab quick fix
NickG-1 0e2db52
merge fixes
NickG-1 ae113f4
fixed output of completedWord
NickG-1 9afc2e5
formatting
NickG-1 fe4ae9a
Merge branch 'ad-freiburg:master' into master
NickG-1 4f2e57c
Merge branch 'uptodate'
NickG-1 9ccb914
Merge branch 'master' of https://github.com/NickG-1/qlever
NickG-1 371eb30
Merge branch 'ad-freiburg:master' into master
NickG-1 f523dc1
PR review changes
NickG-1 012786d
sonar and formatter
NickG-1 6dd7bcf
renaming and bug fix in aggScoresAndTakeTop...
NickG-1 3d7b9c5
sonar stayle changes
NickG-1 486fef5
small fix
NickG-1 f00eeef
sonar
NickG-1 bf40643
fixed tests
NickG-1 dd8e9e0
formatting
NickG-1 89602c2
Merge branch 'uptodate'
NickG-1 b8f1823
clean up
NickG-1 e2c9ec5
added test-cases
NickG-1 ea86a37
formatting
NickG-1 4a1e3af
Merge branch 'uptodate'
NickG-1 1c2c8b4
adapt to merge
NickG-1 32bc101
Merge branch 'uptodate'
NickG-1 9658545
small fix
NickG-1 519ff34
Merge branch 'ad-freiburg:master' into master
NickG-1 8af4b1c
Merge branch 'ad-freiburg:master' into master
NickG-1 2b1baff
matching word variable now is only show if selected
NickG-1 b50d749
matching word now also works for filtered queries
NickG-1 5acb316
merge
NickG-1 497ee97
Merge branch 'master' into uptodate
NickG-1 e3bec2a
matchingword now also works for filter queries
NickG-1 13f0837
formatting
NickG-1 5d4d68f
updated tests
NickG-1 98accd9
Merge branch 'ad-freiburg:master' into master
NickG-1 e35caae
matching word now works with terms of any size
NickG-1 0e5aad3
Merge branch 'master' of https://github.com/NickG-1/qlever
NickG-1 7cdb730
fixed issue (now result with multiple matchings shows up multiple ti…
NickG-1 7f47f5f
Merge branch 'ad-freiburg:master' into master
NickG-1 25283fb
adapt matchingWord for multiple Variables part 1
NickG-1 4f7b9aa
rewrote kWayIntersect (mult var part 2)
NickG-1 47bbfeb
improved WordEntityPostings struct (mult var Part 3)
NickG-1 902d8ae
other approach to new wep
NickG-1 b2cead7
querries with two terms now work
NickG-1 e5e916e
call fixed size fix
NickG-1 2d5e1f5
restructured wep struct
NickG-1 368d9b9
adapted other aggScore functions to mult terms
NickG-1 42cfb76
bug fix
NickG-1 339937a
updated behaviour of the aggscoresand... funtions
NickG-1 293c555
formatting
NickG-1 e1a4e99
textscore bug fix
NickG-1 c34a54c
Merge branch 'uptodate'
NickG-1 1e77128
Merge branch 'ad-freiburg:master' into master
NickG-1 cd54b9d
bug fix
NickG-1 886bcfe
Merge branch 'uptodate'
NickG-1 f190047
Merge branch 'ad-freiburg:master' into master
NickG-1 46f6cd7
Merge branch 'ad-freiburg:master' into master
NickG-1 2714a69
adapted for zero vars
NickG-1 b718915
fiexed unit tests
NickG-1 757b017
Merge branch 'uptodate'
NickG-1 4a31031
fixed e2e tests
NickG-1 75a3556
bug fix
NickG-1 aa71dd6
bug fix
NickG-1 3583691
bug fix
NickG-1 cd62dba
bug fix
NickG-1 a5c7c90
formatting
NickG-1 98d86b5
Merge branch 'uptodate'
NickG-1 319049b
bug fix
NickG-1 5c049a0
enhanced tests
NickG-1 9f01d87
formatting
NickG-1 6b27ee1
updated stxxl
NickG-1 b148b3f
Merge branch 'ad-freiburg:master' into master
NickG-1 666681f
deleted unused functions
NickG-1 b63fc2e
changed aggScoresMultVar behaviour
NickG-1 9fa37a6
improved test coverage part 1
NickG-1 ab237b5
deletion of unused functions
NickG-1 c361a38
improved test coverage part 2
NickG-1 13152b0
improved test coverage part 3
NickG-1 e861db8
review changes part 1
NickG-1 6cb4256
Merge branch 'ad-freiburg:master' into master
NickG-1 6eb2c73
review changes part 2
NickG-1 b1ac1a2
Merge branch 'master' of https://github.com/NickG-1/qlever
NickG-1 989bc1e
fixed mistake in comment
NickG-1 8018ef4
restructured the tests
NickG-1 0abc594
Merge branch 'ad-freiburg:master' into master
NickG-1 aad5703
Merge branch 'ad-freiburg:master' into master
NickG-1 d882777
review changes part 3
NickG-1 9250669
formatting
NickG-1 285e830
extended e2e tests
NickG-1 ce12a4d
bug fix
NickG-1 16721c9
Merge branch 'ad-freiburg:master' into master
NickG-1 dd1cbc6
Merge branch 'ad-freiburg:master' into master
NickG-1 6b712ff
bug fix part 2
NickG-1 19fa40e
Merge branch 'master' of https://github.com/NickG-1/qlever
NickG-1 2290e03
resolved error
NickG-1 d05db1f
review
NickG-1 2f0690b
Merge branch 'ad-freiburg:master' into master
NickG-1 9fe7629
init wordIndexScan
NickG-1 6c2a590
first running version of WordIndexScan
NickG-1 c1b8f96
Merge branch 'ad-freiburg:master' into master
NickG-1 c25628e
Merge branch 'master' into wordIndexScan
NickG-1 afffee6
formatting and minor fixes
NickG-1 7174707
bug fix and extended e2e tests
NickG-1 f46af52
adapted review changes + first version of wordIndexScanTest
NickG-1 99a46b5
first working version
NickG-1 af3415c
name change
NickG-1 18fb19d
Merge branch 'ad-freiburg:master' into master
NickG-1 122cf4b
Merge branch 'master' into wordIndexScan
NickG-1 1b31da4
Merge branch 'ad-freiburg:master' into wordIndexScan
NickG-1 f535908
Review changes
NickG-1 d660a8a
added feature: ql:contains-entity <fixed entity> works now
NickG-1 8a63f28
formatting
NickG-1 87f00c5
Merge branch 'wordIndexScan' of https://github.com/NickG-1/qlever int…
NickG-1 a878fab
Merge branch 'ad-freiburg:master' into wordIndexScan
NickG-1 c9e4937
updated wordIndexScanTest
NickG-1 2f1a9ce
added feature: outputs scores
NickG-1 502585a
restructured code and changed behaviour slightly
NickG-1 5a8bba4
added tests
NickG-1 0ff8b53
bug fix
NickG-1 056f99b
review changes
NickG-1 cd9b8c2
Merge branch 'ad-freiburg:master' into master
NickG-1 1290b96
Merge branch 'wordIndexScan'
NickG-1 c14f3f3
use idTable directly instead of converting to wep first
NickG-1 af97dca
Merge branch 'uptodate' into wordIndexScan
NickG-1 fc661a4
Merge branch 'ad-freiburg:master' into wordIndexScan
NickG-1 31714cc
bug fix and silencing compiler errors
NickG-1 604b482
review changes
NickG-1 7565783
review changes
NickG-1 5b7bf1d
added missing functions for textoperations
NickG-1 b415672
added queryPlanner tests for texIndex Opterations
NickG-1 365cbf4
Merge branch 'uptodate' into wordIndexScan
NickG-1 fcb2e8d
swapped out deprecated functions
NickG-1 0a5428d
fixed test and bug fix
NickG-1 13f87ea
Merge remote-tracking branch 'origin/master' into wordIndexScan
4af3ca8
first() and last()
2cb45c4
Merge branch 'uptodate' into wordIndexScan
NickG-1 675bad9
added score columns to output and made their names unambiguous
NickG-1 9ac14b4
review changes
NickG-1 bb1b101
Merge branch 'uptodate' into wordIndexScan
NickG-1 06fe14d
review changes
NickG-1 4fbbbe6
review changes
NickG-1 ffd79f3
bug fix and added tests
NickG-1 5ff7aff
sonar and codecov changes
NickG-1 a3998b7
review changes
NickG-1 6a91b48
bug fix
NickG-1 db86119
format
NickG-1 e098b75
codecov and sonar
NickG-1 835956d
bug fix
NickG-1 8a0913a
formatting
NickG-1 f870f3b
Revert "formatting"
NickG-1 4c835a7
formatting
NickG-1 a21da1f
review changes
NickG-1 1e03a7e
Fix newline character
NickG-1 8456b3b
Merge branch 'ad-freiburg:master' into wordIndexScan
NickG-1 00ae623
Merge branch 'wordIndexScan' of https://github.com/NickG-1/qlever int…
NickG-1 c7e5855
sonar
NickG-1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,12 +1,16 @@ | ||
| // Copyright 2023, University of Freiburg, | ||
| // Chair of Algorithms and Data Structures. | ||
| // Author: Nick Göckel <[email protected]> | ||
|
|
||
| #include "engine/TextIndexScanForEntity.h" | ||
|
|
||
| // _____________________________________________________________________________ | ||
| TextIndexScanForEntity::TextIndexScanForEntity( | ||
| QueryExecutionContext* qec, Variable textRecordVar, | ||
| const std::variant<Variable, std::string>& entity, string word) | ||
| std::variant<Variable, std::string> entity, string word) | ||
| : Operation(qec), | ||
| textRecordVar_(std::move(textRecordVar)), | ||
| entity_(VarOrFixedEntity(qec, entity)), | ||
| varOrFixed_(VarOrFixedEntity(qec, std::move(entity))), | ||
|
||
| word_(std::move(word)) {} | ||
|
|
||
| // _____________________________________________________________________________ | ||
|
|
@@ -15,14 +19,21 @@ ResultTable TextIndexScanForEntity::computeResult() { | |
| word_, getExecutionContext()->getAllocator()); | ||
|
|
||
| if (hasFixedEntity()) { | ||
| auto beginErase = std::ranges::remove_if( | ||
| idTable.begin(), idTable.end(), [this](const auto& row) { | ||
| return row[1].getVocabIndex() != entity_.index_; | ||
| }); | ||
| auto beginErase = std::ranges::remove_if(idTable, [this](const auto& row) { | ||
| return row[1].getVocabIndex() != getVocabIndexOfFixedEntity(); | ||
| }); | ||
| idTable.erase(beginErase.begin(), idTable.end()); | ||
| idTable.setColumnSubset(std::vector<ColumnIndex>{0, 2}); | ||
| } | ||
|
|
||
| // Add details to the runtimeInfo. This has no effect on the result. | ||
| if (hasFixedEntity()) { | ||
| runtimeInfo().addDetail("fixed entity: ", getFixedEntity()); | ||
| } else { | ||
| runtimeInfo().addDetail("entity var: ", getEntityVariable().name()); | ||
| } | ||
| runtimeInfo().addDetail("word: ", word_); | ||
|
|
||
| return {std::move(idTable), resultSortedOn(), LocalVocab{}}; | ||
| } | ||
|
|
||
|
|
@@ -36,11 +47,10 @@ VariableToColumnMap TextIndexScanForEntity::computeVariableToColumnMap() const { | |
| }; | ||
| addDefinedVar(textRecordVar_); | ||
| if (hasFixedEntity()) { | ||
| addDefinedVar( | ||
| textRecordVar_.getScoreVariable(entity_.fixedEntity_.value())); | ||
| addDefinedVar(textRecordVar_.getScoreVariable(getFixedEntity())); | ||
| } else { | ||
| addDefinedVar(entity_.entityVar_.value()); | ||
| addDefinedVar(textRecordVar_.getScoreVariable(entity_.entityVar_.value())); | ||
| addDefinedVar(getEntityVariable()); | ||
| addDefinedVar(textRecordVar_.getScoreVariable(getEntityVariable())); | ||
| } | ||
| return vcmap; | ||
| } | ||
|
|
@@ -53,19 +63,24 @@ size_t TextIndexScanForEntity::getResultWidth() const { | |
| // _____________________________________________________________________________ | ||
| size_t TextIndexScanForEntity::getCostEstimate() { | ||
| if (hasFixedEntity()) { | ||
| return 2 * getExecutionContext()->getIndex().getEntitySizeEstimate(word_); | ||
| // We currently have to first materialize and then filter the complete list | ||
| // for the fixed entity | ||
| return 2 * getExecutionContext()->getIndex().getSizeOfTextBlockForEntities( | ||
| word_); | ||
| } else { | ||
| return getExecutionContext()->getIndex().getEntitySizeEstimate(word_); | ||
| return getExecutionContext()->getIndex().getSizeOfTextBlockForEntities( | ||
| word_); | ||
| } | ||
| } | ||
|
|
||
| // _____________________________________________________________________________ | ||
| uint64_t TextIndexScanForEntity::getSizeEstimateBeforeLimit() { | ||
| if (hasFixedEntity()) { | ||
| return uint64_t( | ||
| return static_cast<uint64_t>( | ||
| getExecutionContext()->getIndex().getAverageNofEntityContexts()); | ||
| } else { | ||
| return getExecutionContext()->getIndex().getEntitySizeEstimate(word_); | ||
| return getExecutionContext()->getIndex().getSizeOfTextBlockForEntities( | ||
| word_); | ||
| } | ||
| } | ||
|
|
||
|
|
@@ -89,7 +104,6 @@ string TextIndexScanForEntity::getCacheKeyImpl() const { | |
| std::ostringstream os; | ||
| os << "ENTITY INDEX SCAN FOR WORD: " | ||
| << " with word: \"" << word_ << "\" and fixed-entity: \"" | ||
| << (hasFixedEntity() ? entity_.fixedEntity_.value() : "no fixed-entity") | ||
| << " \""; | ||
| << (hasFixedEntity() ? getFixedEntity() : "no fixed-entity") << " \""; | ||
| return std::move(os).str(); | ||
| } | ||