@lluwm lluwm commented Nov 14, 2025

Problem Statement

We estimate the size of each pub/sub message and use that estimate to compute the aggregate size of the drainer buffer queue. A limit on the aggregate size controls how much heap the queue may occupy. A recent OOM heap dump analysis showed that this estimate can be badly inaccurate: the buffer held about 400MB, whereas the configured limit was only 10MB. The reason is that we estimate the size of an ImmutablePubSubMessage as the sum of its key, value, and pubSubPosition. In this particular dump, however, the major contributor was PubSubMessageHeaders, at about 16KB, which we do not take into account today.
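A back-of-envelope sketch of how the under-count compounds. The 10MB limit and the ~16KB header size come from the heap dump analysis; the 420-byte per-message estimate and the premise that every queued message carries headers of that size are assumptions chosen only to illustrate the roughly 40x gap, and the class and method names are hypothetical.

```java
// Illustrative model only; not code from this PR.
final class QueueFootprintEstimate {
  /** Heap actually held once the queue's *estimated* size reaches its limit. */
  static long actualBytesAtLimit(long limitBytes, long estimatedPerMessage, long untrackedPerMessage) {
    // The queue admits messages until the sum of their estimates hits the limit.
    long messagesAdmitted = limitBytes / estimatedPerMessage;
    // Each admitted message really occupies its estimate plus the untracked part.
    return messagesAdmitted * (estimatedPerMessage + untrackedPerMessage);
  }

  public static void main(String[] args) {
    long limit = 10L * 1024 * 1024; // 10MB configured limit
    long headers = 16L * 1024;      // ~16KB of headers, not counted by the estimate
    long estimated = 420;           // ASSUMED key + value + position estimate in bytes
    long actual = actualBytesAtLimit(limit, estimated, headers);
    System.out.println(actual / (1024 * 1024) + " MB held against a 10 MB limit");
  }
}
```

Under these assumptions the queue holds about 400MB before the limit triggers, which is consistent with the observed buffer size.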

Solution

This PR includes the header size in the getHeapSize estimation for ImmutablePubSubMessage.
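A minimal sketch of the idea, not the actual change. `HeaderSketch` and `MessageSketch` are hypothetical stand-ins for the real header and message classes, and the estimate counts payload bytes only, ignoring object and array overhead.

```java
// Hypothetical stand-in for a single message header.
final class HeaderSketch {
  final String name;
  final byte[] value;

  HeaderSketch(String name, byte[] value) {
    this.name = name;
    this.value = value;
  }

  // Rough payload size: one byte per name character plus the value bytes.
  int getHeapSize() {
    return name.length() + value.length;
  }
}

// Hypothetical stand-in for an immutable pub/sub message.
final class MessageSketch {
  final byte[] key;
  final byte[] value;
  final int positionHeapSize; // assumed to be supplied by the position object
  final HeaderSketch[] headers;

  MessageSketch(byte[] key, byte[] value, int positionHeapSize, HeaderSketch[] headers) {
    this.key = key;
    this.value = value;
    this.positionHeapSize = positionHeapSize;
    this.headers = headers;
  }

  // Previous style of estimate: key + value + position only.
  int heapSizeWithoutHeaders() {
    return key.length + value.length + positionHeapSize;
  }

  // Estimate with headers included, so byte-bounded queues see them too.
  int getHeapSize() {
    int size = heapSizeWithoutHeaders();
    for (HeaderSketch header : headers) {
      size += header.getHeapSize();
    }
    return size;
  }
}
```

With a 10-byte key, a 100-byte value, a 32-byte position, and one header carrying a 16KB value, the header-free estimate is 142 bytes while the header-aware estimate is over 16KB, so the queue limit is reached after far fewer messages.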

Code changes

  • Added new code behind a config. If so list the config names and their default values in the PR description.
  • Introduced new log lines.
    • Confirmed if logs need to be rate limited to avoid excessive logging.

Concurrency-Specific Checks

Both reviewer and PR author to verify

  • Code has no race conditions or thread safety issues.
  • Proper synchronization mechanisms (e.g., synchronized, RWLock) are used where needed.
  • No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
  • Verified thread-safe collections are used (e.g., ConcurrentHashMap, CopyOnWriteArrayList).
  • Validated proper exception handling in multi-threaded code to avoid silent thread termination.

How was this PR tested?

  • New unit tests added.
  • New integration tests added.
  • Modified or extended existing tests.
  • Verified backward compatibility (if applicable).

Does this PR introduce any user-facing or breaking changes?

  • No. You can skip the rest of this section.
  • Yes. Clearly explain the behavior change and its impact.

Copilot AI review requested due to automatic review settings November 14, 2025 06:57
Copilot finished reviewing on behalf of lluwm November 14, 2025 06:59
Copilot AI left a comment

Pull Request Overview

This PR addresses an inaccurate heap size estimation in ImmutablePubSubMessage that caused buffer size limits to be ineffective, leading to memory issues (e.g., 400MB actual buffer size vs 10MB configured limit).

Key Changes:

  • Modified getHeapSize() method to include PubSubMessageHeaders in the heap size calculation
