Add ability to yield in Ivarators, AndIterator, OrIterator and return metrics before yield (#704) #2042

Open
wants to merge 1 commit into
base: integration

billoley
Collaborator

@billoley billoley commented Jul 21, 2023

Add ability to yield in Ivarators, AndIterator, OrIterator and return metrics before yield (#704)

The WaitWindowObserver object tracks the time remaining before a yield and provides convenience methods for manipulating keys. The markers !YIELD_AT_BEGIN and \uffffYIELD_AT_END are used to yield either before or after a given key. They appear most often in the colFam, but sometimes in the colQual if the query is using non-sorted UIDs as an optimization. The ! sorts before alphanumeric characters and the \uffff sorts after them; the rest of each string is there for easy identification.
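
To illustrate the sort order of the two markers, here is a minimal sketch. The class and method names are hypothetical and not part of this PR, and Java String ordering is used as a stand-in for Accumulo's byte-wise key ordering (the two agree for these values).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: YieldMarkerSketch and sortedWithMarkers are hypothetical names.
class YieldMarkerSketch {
    // Marker strings as described in the PR text.
    static final String YIELD_AT_BEGIN = "!YIELD_AT_BEGIN";
    static final String YIELD_AT_END = "\uffffYIELD_AT_END";

    // Adds both markers to a list of alphanumeric values and sorts the result.
    // The begin marker lands first and the end marker lands last.
    static List<String> sortedWithMarkers(List<String> alphanumericValues) {
        List<String> all = new ArrayList<>(alphanumericValues);
        all.add(YIELD_AT_BEGIN);
        all.add(YIELD_AT_END);
        Collections.sort(all);
        return all;
    }
}
```

Because every alphanumeric character falls between '!' and U+FFFF, a key built with the first marker sorts ahead of any real value in that position and a key built with the second sorts behind it.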

The remaining time is tracked in the WaitWindowObserver using a separate thread to minimize the calls to System.currentTimeMillis.
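
A rough sketch of the timer-thread idea follows. It is not the WaitWindowObserver code: the class name, method names, and tick interval are assumptions made for illustration. The point is that the hot path reads a shared counter, and only the background thread calls System.currentTimeMillis.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: a hypothetical stand-in for the time tracking described above.
class WaitWindowTimerSketch implements AutoCloseable {
    private final AtomicLong remainingMillis;
    private final ScheduledExecutorService timer;

    WaitWindowTimerSketch(long windowMillis, long tickMillis) {
        this.remainingMillis = new AtomicLong(windowMillis);
        this.timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "wait-window-timer");
            t.setDaemon(true); // do not keep the JVM alive for the timer
            return t;
        });
        final long start = System.currentTimeMillis();
        // Only this background task reads the system clock.
        timer.scheduleAtFixedRate(
                () -> remainingMillis.set(windowMillis - (System.currentTimeMillis() - start)),
                tickMillis, tickMillis, TimeUnit.MILLISECONDS);
    }

    // Cheap check for iterators: a single atomic read, no clock call.
    boolean waitWindowOverrun() {
        return remainingMillis.get() <= 0;
    }

    long remainingMillis() {
        return remainingMillis.get();
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

The trade-off is resolution: the overrun is noticed up to one tick late, which is acceptable when the window is much longer than the tick.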

When the time expires, a WaitWindowOverrunException is thrown, then caught and propagated (determining the correct yieldKey) at each AndIterator/OrIterator up to the WaitWindowOverseerIterator, which sits under either the SerialIterator or the PipelineIterator but above the rest of the boolean stack of iterators.
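
The catch-and-rethrow pattern can be sketched as below. Everything here is hypothetical: the types are simplified stand-ins (String keys, a one-step union), and the rule "use the lowest key seen so far" is an assumption for illustration, not the actual yieldKey logic of AndIterator or OrIterator.

```java
import java.util.List;

// Sketch only: simplified stand-ins for the iterator stack described above.
class OverrunPropagationSketch {
    // Carries the key at which evaluation should resume after the yield.
    static final class Overrun extends RuntimeException {
        final String yieldKey;
        Overrun(String yieldKey) {
            super("wait window overrun at " + yieldKey);
            this.yieldKey = yieldKey;
        }
    }

    // A source returns its next key, or throws Overrun when time has expired.
    interface Source {
        String next();
    }

    // One step of a union: ask each child for a key and return the lowest.
    // If a child overruns, rethrow with the lowest key seen at this level.
    static final class Union implements Source {
        private final List<Source> children;
        Union(List<Source> children) {
            this.children = children;
        }
        @Override
        public String next() {
            String lowest = null;
            for (Source child : children) {
                try {
                    String key = child.next();
                    if (key != null && (lowest == null || key.compareTo(lowest) < 0)) {
                        lowest = key;
                    }
                } catch (Overrun e) {
                    boolean earlierKeySeen = lowest != null && lowest.compareTo(e.yieldKey) < 0;
                    throw new Overrun(earlierKeySeen ? lowest : e.yieldKey);
                }
            }
            return lowest;
        }
    }

    // Top of the boolean stack: converts the exception into a yield position.
    static String overseerYieldKey(Source root) {
        try {
            root.next();
            return null; // no overrun, nothing to yield on
        } catch (Overrun e) {
            return e.yieldKey;
        }
    }
}
```

Each level adjusts the key before rethrowing so that the position reported at the top does not skip anything a sibling had already found.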

If collectTimingDetails=false, we simply yield when either the SerialIterator or the PipelineIterator detects a WaitWindowOverrun. If collectTimingDetails=true, we return a document with a WAIT_WINDOW_OVERRUN attribute and a TIMING_METADATA that hands off its source, next, yield, etc. metrics to the DocumentTransformerSupport class and then gets ignored. On the next call to the QueryIterator, the hasTop method checks with the WaitWindowObserver and yields.
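
The two paths can be summarized as a small state machine. This is a sketch under assumptions: the class, the enum, the map used as a "document", and the attribute value "true" are all hypothetical; only the attribute names WAIT_WINDOW_OVERRUN and TIMING_METADATA come from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: models the decision described above, not the QueryIterator code.
class OverrunHandlingSketch {
    enum Action { YIELD_NOW, RETURN_MARKER_DOCUMENT }

    private final boolean collectTimingDetails;
    private boolean yieldOnNextHasTop = false;

    OverrunHandlingSketch(boolean collectTimingDetails) {
        this.collectTimingDetails = collectTimingDetails;
    }

    // Called when the serial or pipeline layer detects an overrun.
    Action onWaitWindowOverrun() {
        if (!collectTimingDetails) {
            return Action.YIELD_NOW;
        }
        // Metrics must go out first, so the yield is deferred by one call.
        yieldOnNextHasTop = true;
        return Action.RETURN_MARKER_DOCUMENT;
    }

    // Hypothetical marker document: a plain map standing in for a Document.
    static Map<String, String> markerDocument(String timingMetadata) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("WAIT_WINDOW_OVERRUN", "true"); // value is an assumption
        doc.put("TIMING_METADATA", timingMetadata);
        return doc;
    }

    // Stand-in for the check made in hasTop on the following call.
    boolean shouldYieldInHasTop() {
        return yieldOnNextHasTop;
    }
}
```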

IvaratorFutures are now tracked in the IteratorThreadPoolManager so that when an Ivarator yields before fillSortedSet has completed for all ranges, a post-yield call can reclaim the HDFS-backed sorted set from the IvaratorFuture.
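
A minimal sketch of the tracking idea, assuming futures are looked up by a task name. The registry class, its methods, and the naming scheme are hypothetical, and an in-memory TreeSet stands in for the HDFS-backed sorted set.

```java
import java.util.SortedSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch only: a hypothetical registry, not the IteratorThreadPoolManager API.
class IvaratorFutureRegistrySketch {
    private final ConcurrentMap<String, Future<SortedSet<String>>> futures = new ConcurrentHashMap<>();
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    // The first call starts the fill. A later call with the same name, for
    // example after a yield and re-seek, gets the same future back.
    Future<SortedSet<String>> getOrSubmit(String taskName, Callable<SortedSet<String>> fill) {
        return futures.computeIfAbsent(taskName, name -> pool.submit(fill));
    }

    // Drop the entry once the sorted set is no longer needed.
    void remove(String taskName) {
        futures.remove(taskName);
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

Keeping the future outside the iterator instance is what lets work started before the yield survive the iterator being torn down and rebuilt.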

Included fixes to the PipelineIterator discovered during testing:

  1. When yielding, PipelineIterator should evaluate all possible keys to find the lowest, including the evaluationQueue, results, and yieldKey from WaitWindowOverrun. This could have caused missed data when yielding in the PipelineIterator.

  2. The evaluationQueue must be filled after calling flushCompletedResults in getNext. Otherwise, flushCompletedResults might empty the evaluationQueue with no valid results; on the next call to getNext, results will be empty, and when cacheNextResult is called, the evaluationQueue will be empty and the PipelineIterator declares itself done. This could have caused missed data when yielding in the PipelineIterator.

  3. Pipeline should save any caught Exception so that the PipelineIterator can propagate the Exception when calling getResult. This is particularly important when running a Pipeline in an Executor via the PipelineIterator. Otherwise, the evaluation of that Document just returns null, which is the same as a Document not matching. The SerialIterator runs the Pipeline.run() method inline and was not subject to this bug.
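
Fixes 1 and 3 can be sketched as follows. The names are hypothetical and keys are plain strings; this shows the shape of the fixes, not the PipelineIterator or Pipeline code itself.

```java
import java.util.Collection;
import java.util.TreeSet;
import java.util.concurrent.Callable;

// Sketch only: hypothetical helpers illustrating fixes 1 and 3.
class PipelineFixesSketch {
    // Fix 1: consider every candidate key and yield at the lowest one.
    static String lowestKey(Collection<String> queuedKeys, Collection<String> resultKeys,
                            String overrunYieldKey) {
        TreeSet<String> candidates = new TreeSet<>(queuedKeys);
        candidates.addAll(resultKeys);
        if (overrunYieldKey != null) {
            candidates.add(overrunYieldKey);
        }
        return candidates.isEmpty() ? null : candidates.first();
    }

    // Fix 3: a task run on an executor remembers its failure so the caller
    // can tell "evaluation failed" apart from "document did not match".
    static final class Task implements Runnable {
        private final Callable<String> evaluation;
        private volatile String result;
        private volatile Exception failure;

        Task(Callable<String> evaluation) {
            this.evaluation = evaluation;
        }

        @Override
        public void run() {
            try {
                result = evaluation.call();
            } catch (Exception e) {
                failure = e; // saved here, rethrown from getResult
            }
        }

        // Returns null for a non-match; throws if the evaluation failed.
        String getResult() {
            if (failure != null) {
                throw new IllegalStateException("evaluation failed", failure);
            }
            return result;
        }
    }
}
```

Without the saved exception, both a failed evaluation and a non-matching document would surface as null from getResult, which is the ambiguity described in item 3.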

QueryIteratorIT and extended classes WaitWindowQueryIteratorSerialIT and WaitWindowQueryIteratorPipelineIT use a test harness that more closely simulates Accumulo's LookupTask.

IvaratorYieldingTest now tests Pipeline/Serial, sortedUIDs/unsortedUIDs, and collectTimingDetails true/false for 8 variations instead of the previous 2 (Serial with sortedUIDs/unsortedUIDs).

@billoley billoley force-pushed the feature/issue-704 branch 2 times, most recently from 5be0608 to 1935337 on October 3, 2023 13:28
apmoriarty previously approved these changes Nov 3, 2023