Contributor

@hanabi1224 hanabi1224 commented Jul 22, 2025

Summary of changes

Changes introduced in this pull request: size and count metrics for flume channels, exported in Prometheus text format. Sample output:

# HELP network_messages_0_size_bytes Qeueue message size of flume channel network_messages_0 in bytes
# TYPE network_messages_0_size_bytes gauge
# UNIT network_messages_0_size_bytes bytes
network_messages_0_size_bytes 0
# HELP network_messages_0_total_size_bytes Total message size of flume channel network_messages_0 in bytes
# TYPE network_messages_0_total_size_bytes gauge
# UNIT network_messages_0_total_size_bytes bytes
network_messages_0_total_size_bytes 396800
# HELP network_messages_0_len Queued message count of flume channel network_messages_0
# TYPE network_messages_0_len gauge
network_messages_0_len 0
# HELP network_messages_0_total_len Total message count of flume channel network_messages_0
# TYPE network_messages_0_total_len gauge
network_messages_0_total_len 1600
# HELP network_events_1_size_bytes Qeueue message size of flume channel network_events_1 in bytes
# TYPE network_events_1_size_bytes gauge
# UNIT network_events_1_size_bytes bytes
network_events_1_size_bytes 0
# HELP network_events_1_total_size_bytes Total message size of flume channel network_events_1 in bytes
# TYPE network_events_1_total_size_bytes gauge
# UNIT network_events_1_total_size_bytes bytes
network_events_1_total_size_bytes 7894858
# HELP network_events_1_len Queued message count of flume channel network_events_1
# TYPE network_events_1_len gauge
network_events_1_len 0
# HELP network_events_1_total_len Total message count of flume channel network_events_1
# TYPE network_events_1_total_len gauge
network_events_1_total_len 8791

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

  • New Features

    • Introduced size-tracking wrappers for message channels, providing real-time metrics on memory usage and message counts.
    • Added Prometheus metrics integration for network channels, enabling enhanced monitoring and observability.
    • Implemented precise memory size calculation for select network messages and data structures.
  • Refactor

    • Replaced standard message channels with size-tracking variants across networking, message pool, and RPC components.
    • Updated public interfaces to use new size-tracking sender and receiver types for improved resource tracking.
  • Tests

    • Enhanced test setups to utilize size-tracking channels with integrated metrics.

Contributor

coderabbitai bot commented Jul 22, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This change replaces all uses of standard flume channels for network messages and events with custom size-tracking wrappers (SizeTrackingSender and SizeTrackingReceiver) throughout the codebase. It introduces new channel constructors with integrated Prometheus metrics, updates relevant struct fields and function signatures, and adds heap size measurement for key types. Some test and utility modules are updated to use the new channel types and constructors.

Changes

File(s) | Change Summary
src/utils/flume/mod.rs | Introduces SizeTrackingSender, SizeTrackingReceiver, metric-tracking channel constructors, and Prometheus integration.
src/libp2p/service.rs | Refactors to use SizeTrackingSender/Receiver for network channels; adds GetSize impls for network types; updates signatures.
src/chain_sync/chain_follower.rs, src/chain_sync/network_context.rs, src/message_pool/msgpool/msg_pool.rs, src/message_pool/msgpool/mod.rs, src/rpc/mod.rs | Updates struct fields, function signatures, and imports to use size-tracking channel types instead of plain flume channels.
src/message_pool/msgpool/selection.rs, src/message_pool/msgpool/mod.rs (tests), src/rpc/methods/sync.rs (tests), src/tool/offline_server/server.rs, src/tool/subcommands/api_cmd/generate_test_snapshot.rs, src/tool/subcommands/api_cmd/test_snapshot.rs | Updates test modules and tool code to use metric-enabled channel constructors and size-tracking receivers.
src/libp2p/hello/message.rs | Adds GetSize derive to HelloRequest, customizes size calculation for fields.
src/shim/bigint.rs | Switches internal bigint type, adds GetSize implementation for BigInt.
src/utils/get_size/mod.rs | Adds utility function vec_alike_get_size for measuring slice-like container sizes.

Sequence Diagram(s)

sequenceDiagram
    participant Sender as SizeTrackingSender
    participant Receiver as SizeTrackingReceiver
    participant Metrics as PrometheusRegistry

    Sender->>Receiver: send(msg: T)
    Note right of Sender: Increments queued/total counters<br>by msg.get_heap_size()
    Receiver->>Sender: recv()
    Note left of Receiver: Decrements queued counters<br>by msg.get_heap_size()
    Metrics-->>Sender: Collects channel metrics
    Metrics-->>Receiver: Collects channel metrics
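
The flow in the diagram can be sketched as follows. This is an illustrative sketch only, not the code from this PR: it wraps std::sync::mpsc instead of flume so that it needs no external crates, and the names HeapSize, ChannelStats, TrackingSender, TrackingReceiver, and tracking_channel are hypothetical stand-ins for GetSize, SizeTrackingSender, and SizeTrackingReceiver.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{self, Receiver, RecvError, SendError, Sender};
use std::sync::Arc;

/// Hypothetical stand-in for the heap-size trait used in the PR.
pub trait HeapSize {
    fn heap_size(&self) -> usize;
}

impl HeapSize for Vec<u8> {
    fn heap_size(&self) -> usize {
        self.capacity()
    }
}

/// Counters shared by both ends of one channel.
#[derive(Default)]
pub struct ChannelStats {
    pub queued_len: AtomicUsize,
    pub queued_bytes: AtomicUsize,
    pub total_len: AtomicUsize,
    pub total_bytes: AtomicUsize,
}

pub struct TrackingSender<T> {
    inner: Sender<T>,
    stats: Arc<ChannelStats>,
}

pub struct TrackingReceiver<T> {
    inner: Receiver<T>,
    stats: Arc<ChannelStats>,
}

/// Creates an unbounded channel plus a handle to its counters.
pub fn tracking_channel<T: HeapSize>(
) -> (TrackingSender<T>, TrackingReceiver<T>, Arc<ChannelStats>) {
    let (tx, rx) = mpsc::channel();
    let stats = Arc::new(ChannelStats::default());
    (
        TrackingSender { inner: tx, stats: stats.clone() },
        TrackingReceiver { inner: rx, stats: stats.clone() },
        stats,
    )
}

impl<T: HeapSize> TrackingSender<T> {
    pub fn send(&self, msg: T) -> Result<(), SendError<T>> {
        let size = msg.heap_size();
        // Bump the queued counters before sending so that a fast receiver
        // can never subtract from a counter that was not yet incremented.
        self.stats.queued_len.fetch_add(1, Ordering::Relaxed);
        self.stats.queued_bytes.fetch_add(size, Ordering::Relaxed);
        match self.inner.send(msg) {
            Ok(()) => {
                self.stats.total_len.fetch_add(1, Ordering::Relaxed);
                self.stats.total_bytes.fetch_add(size, Ordering::Relaxed);
                Ok(())
            }
            Err(e) => {
                // The message never entered the queue: roll back.
                self.stats.queued_len.fetch_sub(1, Ordering::Relaxed);
                self.stats.queued_bytes.fetch_sub(size, Ordering::Relaxed);
                Err(e)
            }
        }
    }
}

impl<T: HeapSize> TrackingReceiver<T> {
    pub fn recv(&self) -> Result<T, RecvError> {
        let msg = self.inner.recv()?;
        self.stats.queued_len.fetch_sub(1, Ordering::Relaxed);
        self.stats.queued_bytes.fetch_sub(msg.heap_size(), Ordering::Relaxed);
        Ok(msg)
    }
}
```

In this sketch the queued counters return to zero once every message has been received, while the total counters only grow, which matches the shape of the sample metrics in the PR description (queued values of 0 next to non-zero totals).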

Estimated code review effort

4 (~75–90 minutes)

Suggested reviewers

  • elmattic
  • LesnyRumcajs

@hanabi1224 hanabi1224 marked this pull request as ready for review July 23, 2025 14:40
@hanabi1224 hanabi1224 requested a review from a team as a code owner July 23, 2025 14:40
@hanabi1224 hanabi1224 requested review from akaladarshi and elmattic and removed request for a team July 23, 2025 14:40
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/utils/flume/mod.rs (1)

233-237: Fix typo in metric help text

"Qeueue" should be "Queued".

            let size_metric_help = format!(
-                "Qeueue message size of flume channel {}_{} in bytes",
+                "Queued message size of flume channel {}_{} in bytes",
                self.name, self.id
            );
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74d4340 and 56d5a99.

📒 Files selected for processing (15)
  • src/chain_sync/chain_follower.rs (5 hunks)
  • src/chain_sync/network_context.rs (5 hunks)
  • src/libp2p/hello/message.rs (1 hunks)
  • src/libp2p/service.rs (16 hunks)
  • src/message_pool/msgpool/mod.rs (7 hunks)
  • src/message_pool/msgpool/msg_pool.rs (3 hunks)
  • src/message_pool/msgpool/selection.rs (2 hunks)
  • src/rpc/methods/sync.rs (1 hunks)
  • src/rpc/mod.rs (2 hunks)
  • src/shim/bigint.rs (1 hunks)
  • src/tool/offline_server/server.rs (2 hunks)
  • src/tool/subcommands/api_cmd/generate_test_snapshot.rs (2 hunks)
  • src/tool/subcommands/api_cmd/test_snapshot.rs (2 hunks)
  • src/utils/flume/mod.rs (2 hunks)
  • src/utils/get_size/mod.rs (1 hunks)
🧬 Code Graph Analysis (5)
src/libp2p/hello/message.rs (1)
src/utils/get_size/mod.rs (1)
  • vec_alike_get_size (8-13)
src/message_pool/msgpool/selection.rs (1)
src/utils/flume/mod.rs (1)
  • bounded_with_default_metrics_registry (52-57)
src/tool/subcommands/api_cmd/generate_test_snapshot.rs (2)
src/utils/flume/mod.rs (1)
  • bounded_with_default_metrics_registry (52-57)
src/chain_sync/network_context.rs (1)
  • network_send (140-142)
src/message_pool/msgpool/msg_pool.rs (1)
src/libp2p/service.rs (1)
  • network_sender (403-405)
src/utils/flume/mod.rs (3)
src/message_pool/msgpool/msg_pool.rs (3)
  • flume (486-486)
  • new (71-76)
  • new (461-581)
src/metrics/mod.rs (1)
  • default_registry (26-28)
src/utils/cache/lru.rs (1)
  • size_in_bytes (124-132)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Build Ubuntu
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: All lint checks
🔇 Additional comments (26)
src/tool/offline_server/server.rs (1)

92-92: LGTM: Channel replaced with metrics-enabled variant

The replacement of flume::bounded(5) with bounded_with_default_metrics_registry(5, "network_message".into()) correctly integrates Prometheus metrics tracking for the network message channel. This aligns with the PR's objective of adding size and count tracking for flume channels.

src/utils/get_size/mod.rs (1)

8-13: LGTM: Well-designed generic utility function

The vec_alike_get_size function is elegantly implemented with appropriate generic bounds. Using AsRef<[T]> allows it to work with various slice-like types, and std::mem::size_of_val provides accurate byte size calculation.

src/message_pool/msgpool/selection.rs (1)

896-896: LGTM: Test updated to use metrics-enabled channels

The test function correctly adopts the new size-tracking channel creation pattern, replacing flume::bounded(50) with bounded_with_default_metrics_registry(50, "network_messages".into()). This ensures test consistency with the production code changes.

src/shim/bigint.rs (1)

13-17: LGTM: Accurate heap size calculation for BigInt

The GetSize implementation correctly calculates the heap size by determining the number of bits and converting to bytes using div_ceil(8). This provides accurate memory usage tracking for big integer values in the size-tracking system.
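
The bits-to-bytes rounding described here can be illustrated with a small sketch. It is not the code from src/shim/bigint.rs: the helper names bytes_for_bits and bit_len are hypothetical, and the little-endian u32 limb layout is an assumption made only for this example.

```rust
/// Bytes needed to hold `bits` bits, rounding up to a whole byte.
pub fn bytes_for_bits(bits: u64) -> usize {
    bits.div_ceil(8) as usize
}

/// Bit length of a magnitude stored as little-endian u32 limbs
/// (an assumed representation, used only for this illustration).
pub fn bit_len(limbs: &[u32]) -> u64 {
    // Find the most significant non-zero limb, if any.
    match limbs.iter().rposition(|&limb| limb != 0) {
        Some(i) => i as u64 * 32 + (32 - limbs[i].leading_zeros()) as u64,
        None => 0,
    }
}
```

An estimate derived from the bit length is a lower bound on what the allocator actually reserves, since a real big-integer type stores whole limbs and may hold spare capacity.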

src/rpc/methods/sync.rs (1)

178-183: LGTM: Test helper updated to use size-tracking channels

The ctx() function correctly adopts the new size-tracking channel pattern. The return type is properly updated to SizeTrackingReceiver<NetworkMessage> and the channel creation uses bounded_with_default_metrics_registry, maintaining test consistency with production code changes.

src/libp2p/hello/message.rs (4)

6-8: LGTM! Clean imports for size tracking functionality.

The imports are well-organized and the alias vec_alike_get_heap_size clearly indicates the intent of measuring heap size for vector-like structures.


13-13: Excellent addition of size tracking capability.

Deriving GetSize for HelloRequest enables precise heap size measurement, which is crucial for the network message size tracking being implemented across the codebase.


15-16: Proper use of custom size function for vector-like field.

The vec_alike_get_heap_size function is correctly applied to the heaviest_tip_set field, ensuring accurate heap size measurement for the NonEmpty<Cid> collection.


19-20: Smart optimization by excluding fixed-size identifier.

Ignoring the genesis_cid field from size calculations is a reasonable optimization since Cid is a small, fixed-size identifier that doesn't contribute significantly to heap usage.

src/rpc/mod.rs (2)

16-16: Appropriate import for size-tracking functionality.

The addition of SizeTrackingSender import aligns with the systematic upgrade to size-tracking channels throughout the codebase.


480-482: Consistent upgrade to size-tracking sender.

The return type change to SizeTrackingSender<NetworkMessage> is part of the systematic refactor to enable network message size tracking in RPC operations, maintaining the same interface while adding metrics capabilities.

src/message_pool/msgpool/mod.rs (4)

19-19: Clean import addition for size tracking.

The addition of SizeTrackingSender to the existing utils import is appropriate and maintains clean import organization.


337-337: Appropriate test utility import.

The import of bounded_with_default_metrics_registry enables the test code to use the new size-tracking channel constructor consistently with the production code.


57-57: Appropriate parameter type upgrade.

The change from flume::Sender to SizeTrackingSender<NetworkMessage> enables size tracking for message republishing operations while maintaining the same functional interface.


357-357: Consistent test updates for size-tracking channels.

The replacement of flume::bounded with bounded_with_default_metrics_registry in tests ensures consistency between test and production code, maintaining the same capacity while adding size tracking capabilities.

Also applies to: 437-437, 498-498, 591-591

src/tool/subcommands/api_cmd/test_snapshot.rs (3)

5-5: Clean imports for size-tracking functionality.

The imports for SizeTrackingReceiver and bounded_with_default_metrics_registry are appropriate and necessary for the updated channel functionality.


133-133: Appropriate return type upgrade for size tracking.

The change to SizeTrackingReceiver<NetworkMessage> enables size tracking capabilities while maintaining the same functional interface.


136-137: Consistent channel creation with metrics tracking.

The use of bounded_with_default_metrics_registry with capacity 5 and label "network_messages" appropriately adds size tracking while maintaining the existing functionality.

src/tool/subcommands/api_cmd/generate_test_snapshot.rs (3)

6-6: Consistent imports matching refactoring pattern.

The imports for size-tracking functionality are consistent with the pattern established across the codebase, ensuring uniform adoption of the new channel types.


102-102: Consistent return type upgrade across tool subcommands.

The change to SizeTrackingReceiver<NetworkMessage> maintains consistency with similar functions across the codebase while adding size tracking capabilities.


105-106: Systematic completion of size-tracking channel adoption.

The channel creation using bounded_with_default_metrics_registry with consistent parameters completes the systematic upgrade to size-tracking channels across tool subcommands.

src/chain_sync/chain_follower.rs (1)

3-3: LGTM!

The migration from flume::Receiver<NetworkEvent> to SizeTrackingReceiver<NetworkEvent> is correctly implemented across all relevant locations.

Also applies to: 19-19, 26-26, 71-71, 94-94, 140-140

src/chain_sync/network_context.rs (1)

25-25: LGTM!

The migration from flume::Sender<NetworkMessage> to SizeTrackingSender<NetworkMessage> is correctly implemented throughout the module.

Also applies to: 53-53, 123-123, 140-141, 337-337

src/message_pool/msgpool/msg_pool.rs (1)

26-26: LGTM!

The migration to SizeTrackingSender<NetworkMessage> is properly implemented with correct import statements and type updates.

Also applies to: 183-183, 463-463

src/utils/flume/mod.rs (1)

1-314: Well-designed size-tracking wrapper implementation

The implementation provides comprehensive metrics tracking for flume channels with:

  • Thread-safe atomic counters for message counts and sizes
  • Proper Prometheus metrics integration with unique channel identification
  • Preservation of flume's API compatibility
  • Support for both bounded and unbounded channels
  • Correct handling of sync/async operations and streaming
src/libp2p/service.rs (1)

4-16: LGTM! Comprehensive migration to size-tracking channels

All flume channel types have been consistently replaced with their size-tracking variants throughout the service, including:

  • Struct fields
  • Method signatures
  • Channel creation with metrics registry integration
  • Stream creation using the size-tracking receiver

Also applies to: 32-32, 188-191, 241-244, 309-309, 403-404, 408-409

Comment on lines +115 to +123
impl GetSize for NetworkEvent {
    fn get_heap_size(&self) -> usize {
        match self {
            Self::HelloResponseOutbound { request, .. } => request.get_heap_size(),
            _ => 0,
        }
    }
}

Contributor

@coderabbitai coderabbitai bot Jul 23, 2025

💡 Verification agent

🧩 Analysis chain

Consider expanding GetSize implementation for comprehensive metrics

The current implementation only tracks heap size for HelloResponseOutbound events and returns 0 for all other variants. This might lead to incomplete memory usage metrics.

Consider implementing proper size calculation for other message types that contain heap-allocated data (e.g., PubsubMessage, ChainExchangeRequest, etc.) to get more accurate memory usage metrics.

Also applies to: 160-165


🏁 Script executed:

#!/bin/bash
# Check if NetworkEvent and NetworkMessage variants contain heap-allocated fields that should be tracked
ast-grep --pattern 'enum NetworkEvent {
  $$$
}'

ast-grep --pattern 'enum NetworkMessage {
  $$$
}'

Length of output: 103


🏁 Script executed:

#!/bin/bash
# Locate definitions of NetworkEvent and NetworkMessage for inspection
rg -n "enum NetworkEvent" -A10 -B2
rg -n "enum NetworkMessage" -A10 -B2

Length of output: 1424


🏁 Script executed:

#!/bin/bash
# Inspect GetSize implementation for NetworkMessage
rg -n "impl GetSize for NetworkMessage" -A10 -B5 src/libp2p/service.rs

Length of output: 521


Expand GetSize implementations to cover all heap‐backed variants

To ensure accurate memory‐usage metrics, extend both get_heap_size() impls to account for every variant carrying heap data:

• In src/libp2p/service.rs (impl for NetworkEvent, lines 115–123):

  • Add
    Self::PubsubMessage { message } => message.get_heap_size(),
    before the existing HelloResponseOutbound arm.

• In the same file (impl for NetworkMessage, lines 160–165):

  • Replace the unconditional 0 with a match that handles:
    Self::PubsubMessage { message, .. }         => message.get_heap_size(),
    Self::ChainExchangeRequest { request, .. }  => request.get_heap_size(),
    Self::JSONRPCRequest { method }             => method.get_heap_size(),
    _                                           => 0,

Make sure that PubsubMessage, ChainExchangeRequest, and NetRPCMethods each implement GetSize (or provide custom logic) so their heap allocations are correctly tallied.
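
A per-variant size calculation of this kind might look like the sketch below. The Event enum, its variants' payload types, and the HeapSize trait are hypothetical simplifications, not the real NetworkEvent or the get_size2 API.

```rust
/// Hypothetical stand-in for a heap-size trait (not the get_size2 API).
pub trait HeapSize {
    fn heap_size(&self) -> usize;
}

impl HeapSize for String {
    fn heap_size(&self) -> usize {
        self.capacity()
    }
}

impl HeapSize for Vec<u8> {
    fn heap_size(&self) -> usize {
        self.capacity()
    }
}

/// Simplified event type; the payload types are assumptions.
pub enum Event {
    PubsubMessage { message: Vec<u8> },
    HelloResponseOutbound { request: String },
    PeerConnected(u64),
}

impl HeapSize for Event {
    fn heap_size(&self) -> usize {
        // Every variant is listed explicitly instead of using `_ => 0`.
        match self {
            Self::PubsubMessage { message } => message.heap_size(),
            Self::HelloResponseOutbound { request } => request.heap_size(),
            Self::PeerConnected(_) => 0,
        }
    }
}
```

Listing every variant rather than falling back to a wildcard arm is a design choice in this sketch: adding a new variant then fails to compile until its size is accounted for, which guards against the silent undercounting raised in this comment.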

🤖 Prompt for AI Agents
In src/libp2p/service.rs around lines 115 to 123, extend the get_heap_size()
implementation for NetworkEvent by adding a match arm for Self::PubsubMessage {
message } that returns message.get_heap_size(), placing it before the existing
HelloResponseOutbound arm. Additionally, in the same file around lines 160 to
165, update the get_heap_size() implementation for NetworkMessage to replace the
unconditional 0 with a match that handles PubsubMessage, ChainExchangeRequest,
and JSONRPCRequest variants by returning their respective get_heap_size()
values, and defaulting to 0 for others. Ensure that PubsubMessage,
ChainExchangeRequest, and NetRPCMethods implement GetSize or have custom logic
to correctly report their heap sizes.

Contributor Author

Right, GetSize impl is incomplete. Will fix in upcoming commits

Contributor

@hanabi1224, sounds good! Thanks for acknowledging the incomplete GetSize implementations. Looking forward to seeing the comprehensive size tracking in the upcoming commits.

  (\   /)
 ( ._. )
o_("_")("_")
*thoughtful about completeness*

🧠 Learnings used
Learnt from: hanabi1224
PR: ChainSafe/forest#5841
File: src/utils/get_size/mod.rs:10-10
Timestamp: 2025-07-17T15:21:40.753Z
Learning: The get_size2 crate's GetSize trait provides default implementations: get_stack_size() uses std::mem::size_of, get_heap_size() returns 0, and get_size() returns their sum. An empty impl like `impl GetSize for MyType {}` is valid and uses these defaults, making it suitable for simple wrapper types that don't allocate heap memory.
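
The default-method behaviour described in this learning can be mimicked with a local trait. The sketch below does not use the get_size2 crate; SizeOf, PeerCount, and Payload are hypothetical names, and the method names differ from the crate's on purpose.

```rust
/// Local trait that mirrors the described defaults.
pub trait SizeOf: Sized {
    /// Stack footprint of the type itself.
    fn stack_size() -> usize {
        std::mem::size_of::<Self>()
    }

    /// Heap bytes owned by the value; nothing unless overridden.
    fn heap_size(&self) -> usize {
        0
    }

    /// Stack plus heap.
    fn total_size(&self) -> usize {
        Self::stack_size() + self.heap_size()
    }
}

/// A wrapper with no heap allocation: the empty impl is enough.
pub struct PeerCount(pub u64);
impl SizeOf for PeerCount {}

/// A type that owns heap memory overrides only `heap_size`.
pub struct Payload(pub Vec<u8>);
impl SizeOf for Payload {
    fn heap_size(&self) -> usize {
        self.0.capacity()
    }
}
```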

)
}

#[derive(Debug)]
Member

It'd be great to put some docs around this (even with doc examples), especially given its ubiquitous usage across the codebase. Same goes for the other newly introduced SizeTracking* data structures.

I'd also consider putting them (flume, cache and I think there was one more) in a dedicated module, i.e., src/utils/size_tracking/*. This could be done in a separate PR if you decide it makes sense.

Member

tests?

Comment on lines 8 to 14
pub fn vec_alike_get_size<V, T>(slice: &V) -> usize
where
    V: AsRef<[T]>,
{
    std::mem::size_of_val(slice.as_ref())
}

Member

Can you help me understand the correctness of this? This seems to work at most with shallow collections, but something like this will totally not work:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=a8109b6ba152d250b00a46a026ccb91d
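
The concern can be reproduced with a short sketch (this is not the linked playground code). shallow_size has the same signature as the function under review; deep_size_of_strings is a hypothetical helper added only for comparison.

```rust
use std::mem::size_of_val;

/// Same shape as the function under review: measures only the slice itself.
pub fn shallow_size<V, T>(v: &V) -> usize
where
    V: AsRef<[T]>,
{
    size_of_val(v.as_ref())
}

/// Hypothetical deep variant for a slice of `String`s: the slice bytes
/// plus the heap buffer owned by each element.
pub fn deep_size_of_strings(v: &[String]) -> usize {
    size_of_val(v) + v.iter().map(|s| s.capacity()).sum::<usize>()
}
```

For a Vec<String>, shallow_size returns len * size_of::<String>() no matter how long the strings are, because size_of_val on a slice counts only the elements' inline representation. The result is therefore exact only when the element type owns no heap memory of its own.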

@hanabi1224 hanabi1224 marked this pull request as draft July 24, 2025 08:07
@elmattic
Contributor

@hanabi1224 Should we close this draft PR?
