Add ability to yield in Ivarators, AndIterator, OrIterator and return metrics before yield (#704) #2042

billoley · 2023-07-21T20:55:58Z

Add ability to yield in Ivarators, AndIterator, OrIterator and return metrics before yield (#704)

WaitWindowObserver object tracks the remaining time before a yield and has a bunch of convenience methods for manipulating keys. !YIELD_AT_BEGIN and \uffffYIELD_AT_END are used to yield either before a given key or after a given key. They are used most often in the colFam but sometimes in the colQual if the query is using non-sorted UIDs as an optimization. The ! sorts before alphanumeric characters and the \uffff sorts after alphanumerics. The rest of the strings are for easy identification.
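The sort-order claim for the two markers can be checked with a minimal sketch. The class and method names below are hypothetical and are not taken from WaitWindowObserver; the sketch also compares Java strings, whereas Accumulo compares key bytes, although the ordering comes out the same for these values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration class; not part of this PR.
class YieldMarkerSortSketch {
    // Marker strings as given in the description above.
    static final String YIELD_AT_BEGIN = "!YIELD_AT_BEGIN";
    static final String YIELD_AT_END = "\uffffYIELD_AT_END";

    // Returns the values in natural (lexicographic) String order.
    static List<String> sorted(String... values) {
        return new ArrayList<>(new TreeSet<>(Arrays.asList(values)));
    }

    public static void main(String[] args) {
        // '!' (0x21) is below every digit and letter; '\uffff' is above them.
        List<String> ordered = sorted("abc", YIELD_AT_END, "XYZ", YIELD_AT_BEGIN, "123");
        System.out.println(ordered.get(0).equals(YIELD_AT_BEGIN));
        System.out.println(ordered.get(ordered.size() - 1).equals(YIELD_AT_END));
    }
}
```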

The remaining time is tracked in the WaitWindowObserver using a separate thread to minimize the calls to System.currentTimeMillis.
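A minimal sketch of this idea, assuming a fixed tick interval: only the timer thread calls System.currentTimeMillis, and the iterator threads read a cached value. The class name, method names, and tick parameter are hypothetical and do not reflect the actual WaitWindowObserver code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration class; not part of this PR.
class RemainingTimeSketch implements AutoCloseable {
    private final AtomicLong remainingMs;
    private final ScheduledExecutorService timer;

    RemainingTimeSketch(long yieldThresholdMs, long tickMs) {
        final long endTimeMs = System.currentTimeMillis() + yieldThresholdMs;
        remainingMs = new AtomicLong(yieldThresholdMs);
        timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "wait-window-timer");
            t.setDaemon(true); // do not keep the JVM alive for the timer
            return t;
        });
        // The clock is read once per tick, on the timer thread only.
        timer.scheduleAtFixedRate(
                () -> remainingMs.set(endTimeMs - System.currentTimeMillis()),
                tickMs, tickMs, TimeUnit.MILLISECONDS);
    }

    // Cheap check for hot loops: reads the cached value, no clock call.
    boolean waitWindowOverrun() {
        return remainingMs.get() <= 0;
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

The trade-off in this sketch is precision: an overrun is noticed up to one tick late, in exchange for keeping clock reads out of the per-key path.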

When time is expired, a WaitWindowOverrunException is thrown and then caught/propagated (determining the correct yieldKey) at each AndIterator/OrIterator up to the WaitWindowOverseerIterator, which sits under either the SerialIterator or the PipelineIterator but above the rest of the boolean stack of iterators.
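The catch/propagate pattern might look roughly like the sketch below. Everything here is a stand-in: OverrunException, nextFromOr, and overseer are hypothetical names, keys are plain strings, and the rule used to pick the yield key (the lowest of the child's yield key and any key already produced at that level) is an assumption, not the logic in AndIterator or OrIterator.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration class; not part of this PR.
class OverrunPropagationSketch {
    // Stand-in for WaitWindowOverrunException; carries the key to yield at.
    static class OverrunException extends RuntimeException {
        final String yieldKey;
        OverrunException(String yieldKey) {
            super("wait window overrun at " + yieldKey);
            this.yieldKey = yieldKey;
        }
    }

    // OR-like node: returns the lowest child key, or rethrows an overrun
    // with a yield key adjusted for keys this level has already seen.
    static String nextFromOr(List<Supplier<String>> children) {
        String lowest = null;
        OverrunException overrun = null;
        for (Supplier<String> child : children) {
            try {
                String key = child.get();
                if (key != null && (lowest == null || key.compareTo(lowest) < 0)) {
                    lowest = key;
                }
            } catch (OverrunException e) {
                if (overrun == null || e.yieldKey.compareTo(overrun.yieldKey) < 0) {
                    overrun = e;
                }
            }
        }
        if (overrun != null) {
            // Assumed rule: never yield past a key that has not been returned yet.
            boolean pendingIsLower = lowest != null && lowest.compareTo(overrun.yieldKey) < 0;
            throw new OverrunException(pendingIsLower ? lowest : overrun.yieldKey);
        }
        return lowest;
    }

    // Stand-in for the top of the boolean stack: turns the exception into a yield.
    static String overseer(List<Supplier<String>> children) {
        try {
            return "result:" + nextFromOr(children);
        } catch (OverrunException e) {
            return "yield:" + e.yieldKey;
        }
    }
}
```

In this sketch, a child that returned k5 next to a sibling that overran at k7 leads to a yield at k5, so k5 is not skipped when the scan resumes.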

If collectTimingDetails=false, we simply yield when either the SerialIterator or the PipelineIterator detects a WaitWindowOverrun. If collectTimingDetails=true, we return a document with a WAIT_WINDOW_OVERRUN attribute and a TIMING_METADATA that hands off its source, next, yield, etc. metrics to the DocumentTransformerSupport class and is then ignored. On the next call to the QueryIterator, the hasTop method checks with the WaitWindowObserver and yields.

IvaratorFutures are now tracked in IteratorThreadPoolManager so that when an Ivarator yields before fillSortedSet is completed for all ranges, a post-yield call can reclaim the HDFS-backed sorted set from the IvaratorFuture.

Ivarator yielding and timeouts:

Yielding is handled by checks in WaitWindowObserver and uses the property query.iterator.yield.threshold.ms.

IteratorThreadPoolManager has a timer-based check that an Ivarator has not been in fillSortedSet longer than ivaratorCacheScanTimeout. That value is set by the logic property IvaratorCacheScanTimeoutMinutes, which in turn is set by the property query.max.call.time.minutes (default 60 minutes). When an Ivarator suspends and resumes, the start time is maintained so that this check operates against the cumulative time in fillSortedSet for that Ivarator.
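One way to picture the retained start time is the sketch below. The class, its methods, and the string Ivarator id are hypothetical and are not the IteratorThreadPoolManager API; the clock value is passed in so the behavior is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration class; not part of this PR.
class CumulativeScanTimeSketch {
    private final Map<String, Long> startTimes = new ConcurrentHashMap<>();
    private final long scanTimeoutMs;

    CumulativeScanTimeSketch(long scanTimeoutMs) {
        this.scanTimeoutMs = scanTimeoutMs;
    }

    // Called on the first start and on every resume after a yield.
    // Only the first call records a time, so a resume does not reset the clock.
    void startOrResume(String ivaratorId, long nowMs) {
        startTimes.putIfAbsent(ivaratorId, nowMs);
    }

    // What a periodic timer task would evaluate for each tracked Ivarator.
    boolean timedOut(String ivaratorId, long nowMs) {
        Long start = startTimes.get(ivaratorId);
        return start != null && nowMs - start > scanTimeoutMs;
    }

    // Called when fillSortedSet finishes; stops tracking the Ivarator.
    void complete(String ivaratorId) {
        startTimes.remove(ivaratorId);
    }
}
```

In this sketch the elapsed time is measured from the first start, so time spent suspended between a yield and the resume also counts toward the timeout.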

IteratorThreadPoolManager has a timer-based check that an IvaratorRunnable (used in fillSortedSet to fill the set from part of the Range) is not running for longer than tserver.datawave.ivarator.runnableTimeoutMinutes, which is set in Accumulo properties and checked frequently for changes on a timer. The default is also 60 minutes, so it will likely never be invoked, but it can be used as a failsafe. It can also be set lower temporarily to force shutdown of all IvaratorRunnables.

IteratorThreadPoolManager also uses the tserver.datawave.ivarator.runnableTimeoutMinutes property to ensure that IvaratorFuture objects are evicted from the Caffeine cache after 1.1 * that setting. These objects are removed when an Ivarator completes but could be abandoned if an Ivarator is suspended on yield and then not resumed.

Included fixes to the PipelineIterator discovered during testing:

When yielding, PipelineIterator should evaluate all possible keys to find the lowest, including the evaluationQueue, results, and yieldKey from WaitWindowOverrun. This could have caused missed data when yielding in the PipelineIterator.
Must fill the evaluationQueue after calling flushCompletedResults in getNext. Otherwise, flushCompletedResults might empty the evaluationQueue with no valid results; on the next call to getNext, results will be empty, and when cacheNextResult is called, evaluationQueue will be empty and PipelineIterator declares itself done. This could have caused missed data when yielding in the PipelineIterator.
Pipeline should save any caught Exception so that the PipelineIterator can propagate the Exception when calling getResult. This is particularly important when running Pipeline in an Executor via the PipelineIterator. Otherwise, the evaluation of that Document just returns null, which is the same as a Document not matching. The SerialIterator runs the Pipeline.run() method inline and was not subject to this bug.
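The third fix, saving the exception and rethrowing it from getResult, can be sketched as follows. The nested Pipeline class here is a simplified stand-in with a hypothetical constructor and a string result; it is not the real datawave Pipeline.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical illustration class; not part of this PR.
class PipelineExceptionSketch {
    // Simplified stand-in for Pipeline: evaluates one document in run().
    static class Pipeline implements Runnable {
        private final String document;
        private final Function<String, String> evaluator;
        private volatile String result;
        private volatile RuntimeException failure;

        Pipeline(String document, Function<String, String> evaluator) {
            this.document = document;
            this.evaluator = evaluator;
        }

        @Override
        public void run() {
            try {
                result = evaluator.apply(document);
            } catch (RuntimeException e) {
                failure = e; // save it instead of losing it on the executor thread
            }
        }

        // A null return now means "no match"; a failure is rethrown to the caller.
        String getResult() {
            if (failure != null) {
                throw failure;
            }
            return result;
        }
    }

    // Runs the pipeline on an executor thread, then reads the result on the caller's thread.
    static String evaluate(String document, Function<String, String> evaluator) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Pipeline pipeline = new Pipeline(document, evaluator);
            executor.submit(pipeline).get(); // wait for run() to finish
            return pipeline.getResult();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }
}
```

Without the saved failure, a caller of getResult in this sketch could not tell a thrown exception apart from a document that simply did not match, since both would surface as null.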

QueryIteratorIT and extended classes WaitWindowQueryIteratorSerialIT and WaitWindowQueryIteratorPipelineIT use a test harness that more closely simulates Accumulo's LookupTask.

IvaratorYieldingTest now tests Pipeline/Serial, sortedUIDs/unsortedUIDs, and collectTimingDetails true/false for 8 variations, instead of the previous 2 variations (Serial with sortedUIDs/unsortedUIDs).

warehouse/query-core/src/main/java/datawave/query/iterator/waitwindow/WaitWindowObserver.java

.../query-core/src/main/java/datawave/query/iterator/waitwindow/WaitWindowOverseerIterator.java

warehouse/query-core/src/main/java/datawave/core/iterators/IteratorThreadPoolManager.java

warehouse/query-core/src/main/java/datawave/query/exceptions/WaitWindowOverrunException.java

.../query-core/src/main/java/datawave/core/iterators/DatawaveFieldIndexCachingIteratorJexl.java

warehouse/query-core/src/main/java/datawave/query/jexl/functions/KeyAdjudicator.java

warehouse/query-core/src/main/java/datawave/query/iterator/logic/OrIterator.java

warehouse/query-core/src/main/java/datawave/query/iterator/logic/AndIterator.java

apmoriarty

Noted a few preferences, otherwise looks good.

warehouse/query-core/src/main/java/datawave/query/iterator/pipeline/SerialIterator.java

warehouse/query-core/src/main/java/datawave/query/iterator/profile/QuerySpanCollector.java

warehouse/query-core/src/test/java/datawave/query/iterator/QueryIteratorIT.java

ivakegg

I still have to complete this review, but here is a start.

...use/query-core/src/main/java/datawave/core/iterators/DatawaveFieldIndexListIteratorJexl.java

...se/query-core/src/main/java/datawave/core/iterators/DatawaveFieldIndexRangeIteratorJexl.java

...e/query-core/src/main/java/datawave/core/iterators/DatawaveFieldIndexFilterIteratorJexl.java

.../query-core/src/main/java/datawave/core/iterators/DatawaveFieldIndexCachingIteratorJexl.java

warehouse/query-core/src/main/java/datawave/query/attributes/WaitWindowExceededMetadata.java

warehouse/query-core/src/main/java/datawave/query/iterator/waitwindow/WaitWindowObserver.java

ivakegg · 2025-05-15T19:05:50Z

.../query-core/src/main/java/datawave/core/iterators/DatawaveFieldIndexCachingIteratorJexl.java

-                                cq.substring(fieldnameIndex + 1) + '\0' + cf + '\0');
+        Key startKey = r.getStartKey();
+        if (!sortedUIDs) {
+            String cq = WaitWindowObserver.removeMarkers(startKey.getColumnQualifier()).toString();


Can we get a little more inline documentation as to what we are doing when we update the WaitWindowObserver?

I can add more documentation as necessary. However, this line is a static call to strip the YIELD_AT_BEGIN or YIELD_AT_END marker from the colQual (if it exists). The WaitWindowObserver itself is not affected.

billoley force-pushed the feature/issue-704 branch from 64a4062 to 5eb1278 on July 22, 2023 15:52

billoley requested review from jwomeara, ivakegg and keith-ratcliffe July 22, 2023 18:23

billoley force-pushed the feature/issue-704 branch from 5eb1278 to b887203 on July 24, 2023 15:16

ivakegg reviewed Jul 26, 2023

Rexblk81 approved these changes Jul 27, 2023