Skip to content

Spill improvements #25892

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jun 6, 2025
Merged

Spill improvements #25892

merged 9 commits into from
Jun 6, 2025

Conversation

sopel39
Copy link
Member

@sopel39 sopel39 commented May 30, 2025

Description

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## General
* Improve reliability of spilling for aggregation by retaining revocable memory, preventing out‑of‑memory errors.
* Fix correctness issues when unspilling hash aggregation by using correct hash values, avoiding aggregation miscalculations.

sopel39 added 2 commits April 23, 2025 16:22
TestHashAggregationOperator#testHashAggregation() runtime reduced
from 2m45s to 4s.
- SpillableHashAggregationBuilder#buildResult is now spilling asynchronously
- Revocable memory is not converted to user memory when !shouldMergeWithMemory.
  This was causing OOM issues previously.
@cla-bot cla-bot bot added the cla-signed label May 30, 2025
@github-actions github-actions bot added the docs label May 30, 2025
@sopel39
Copy link
Member Author

sopel39 commented May 30, 2025

cc @osscm

@sopel39 sopel39 requested a review from raunaqmorarka May 30, 2025 15:32
@raunaqmorarka raunaqmorarka added the bug Something isn't working label May 30, 2025
Copy link
Member

@raunaqmorarka raunaqmorarka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments on metrics commit, we can take that to separate PR if you want to merge the remaining first

Copy link
Member

@raunaqmorarka raunaqmorarka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// TODO: this should be asynchronous
getFutureValue(spillToDisk());
updateMemory();
return flatten(WorkProcessor.create(() -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of flatten and create here makes this code a little difficult to follow since most of the logic only needs to be called once. Could this be refactored to a transform that runs after the spillToDisk() unblocks?

Copy link
Member Author

@sopel39 sopel39 Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it would make it much simpler. Think of flatten processor as a control-loop that delegates to other actions.

We could maybe add more WorkProcessor constructs, e.g:

WorkProcessor.of()
  .blockingOptionally(() -> {
        checkState(hasPreviousSpillCompletedSuccessfully(), "Previous spill hasn't yet finished");
        updateMemory();
        if (localRevocableMemoryContext.getBytes() > 0) {
            // No spill happened, try to build result from memory
            if (spiller.isEmpty()) {
                // No spill happened, try to build result from memory. Revocable memory needs to be converted to user memory as producing output stage is no longer revocable.
                long currentRevocableBytes = localRevocableMemoryContext.getBytes();
                localRevocableMemoryContext.setBytes(0);
                if (!localUserMemoryContext.trySetBytes(localUserMemoryContext.getBytes() + currentRevocableBytes)) {
                    // TODO: this might fail (even though we have just released memory), but we don't
                    // have a proper way to atomically convert memory reservations
                    localRevocableMemoryContext.setBytes(currentRevocableBytes);
                    // spill since revocable memory could not be converted to user memory immediately
                    return Optional.of(spillToDisk());
                }
            }
            else if (!shouldMergeWithMemory(getSizeInMemoryWhenUnspilling())) {
                return Optional.of(spillToDisk());
            }
        }
        return Optional.empty();
  })
  .flatReturn(() -> {
        checkState(hasPreviousSpillCompletedSuccessfully(), "Previous spill hasn't yet finished");
        // update memory after potential spill from the previous call to buildResult
        updateMemory();
        if (spiller.isEmpty()) {
            return hashAggregationBuilder.buildResult();
        }
        return mergeFromDiskAndMemory();
  })

it's probably easier to read, but I would skip it for now

sopel39 added 7 commits June 5, 2025 14:29
Previously groupByHash.getRawHash was used when spilling
and InterpretedHashGenerator was used when unspillin.
This could lead to correctness issues.

With this fix, raw hash is appended to spilled pages and
used when unspilling, preventing invalid page breaks.
Previously spilled data size was comptued at query level by summarizing
stats from operators. However, operator stats are unavailable while
operator is running. Therefore spilled stats were not real time stats.

This change makes spilled stats as first class citizen and reported
in real time.
This allows to saturate spill disks better in case of multiple spill
locations.
FileSingleStreamSpiller fields can be accessed from multiple threads,
therefore they need to be thread safe.
Spilling metrics for:
- number of spills
- spill wall time
- spilled data
- number of unspills
- unspill wall time
- unspilled data
@sopel39 sopel39 merged commit 7af29d5 into trinodb:master Jun 6, 2025
104 checks passed
@sopel39 sopel39 deleted the ks/fix_spill branch June 6, 2025 09:10
@github-actions github-actions bot added this to the 477 milestone Jun 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cla-signed docs
Development

Successfully merging this pull request may close these issues.

3 participants