Investigate increased memory usage while syncing with Genesis in 10.5 #1545
Comments
I have to pause this, leaving some plots/analysis here:

Plot of a mainnet sync from Genesis using the default Genesis config

Local sync of the first 1e6 slots

This definitely shows that Genesis uses more memory. One slightly subtle thing is the creation of a "local-only" peer snapshot, see here. Also see the eventlog2html output (with a custom GHC and

Sync of the first 1e6 slots using Praos, but using 30 peers

This variant (suggested by @karknu) is intended to reveal to what extent this memory increase is related just to the number of peers. Things to note:
This seems to imply that the Genesis memory increase is at least partially due to some non-Genesis-related per-peer leak. (The

Possible next steps
This trifecta is confusing me.
Would it be easy to add a "live bytes" line to the second plot? I'm wondering whether the RSS is initially inflated because the GC isn't yet under pressure to conserve memory. (That sounds like silly behavior, but it seems worth checking if it's easy to check.)
Done 👍 The live bytes definitely are rather dispersed for Genesis.
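For anyone who wants to check the "initially inflated RSS" idea numerically rather than by eye, here is a minimal sketch. It assumes you already have paired samples of RSS and live bytes per slot, however they were collected. The sample values, the function names and the split point are all hypothetical and are not measurements from this issue. It compares the mean RSS-to-live-bytes ratio early in the sync with the ratio later on.

```python
from typing import List, Tuple

# One sample: (slot, rss_bytes, live_bytes).
# The tuple layout is an assumption made for this sketch.
Sample = Tuple[int, int, int]


def rss_to_live_ratios(samples: List[Sample]) -> List[Tuple[int, float]]:
    """Return (slot, rss / live) for every sample with non-zero live bytes."""
    return [(slot, rss / live) for slot, rss, live in samples if live > 0]


def inflation_by_phase(samples: List[Sample], split_slot: int) -> Tuple[float, float]:
    """Mean RSS/live ratio before split_slot and from split_slot onwards."""
    ratios = rss_to_live_ratios(samples)
    early = [r for slot, r in ratios if slot < split_slot]
    late = [r for slot, r in ratios if slot >= split_slot]

    def mean(xs: List[float]) -> float:
        # NaN signals "no samples in this phase" instead of raising.
        return sum(xs) / len(xs) if xs else float("nan")

    return mean(early), mean(late)


if __name__ == "__main__":
    # Hypothetical numbers, purely for illustration.
    demo = [
        (100_000, 4_000_000_000, 1_000_000_000),
        (200_000, 4_500_000_000, 1_500_000_000),
        (600_000, 5_000_000_000, 2_500_000_000),
        (900_000, 5_200_000_000, 2_600_000_000),
    ]
    early, late = inflation_by_phase(demo, split_slot=500_000)
    print(f"early RSS/live: {early:.2f}, late RSS/live: {late:.2f}")
```

A noticeably higher early ratio would be consistent with the RSS being inflated relative to live data at the start of the sync. Similar ratios in both phases would point away from that explanation.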
@karknu has performed some experiments with full syncs using 10.5 (which is not yet released, but using ouroboros-consensus-0.27, which is intended for 10.5, with some additional patches), and reports an increase in memory, causing the node to crash with a 28 GB heap limit:

The goal of this ticket is to find out whether there is a new serious memory leak (which we must fix) or whether the memory requirement just grew slightly (e.g., #1288 is expected to increase peak memory usage slightly), which is not a big problem.
We know that 10.4.1 definitely doesn't have a (serious) leak; Nick did a full sync, and the SDET sync tests also support this. We do know that Genesis uses a bit more memory than Praos (the difference goes away after a restart once caught up), which would be nice to fully understand, but it is low priority.