
MPICH high memory footprint #7199

Open
aditya-nishtala opened this issue Nov 7, 2024 · 6 comments
@aditya-nishtala

We ran a simple hello-world MPICH program in which each rank prints its rank id and the hostname it's running on. The program allocates no memory itself; all of the memory allocation comes from whatever MPICH is doing.

We scaled from 32 nodes to 768 nodes and measured how much memory is consumed. The MPICH commit is 204f8cd. This is happening on Aurora.

Memory consumption is equivalent whether using DDR or HBM. The data below was measured on DDR.

| Node count | Max DDR utilization (GB), mpich/opt/develop-git.204f8cd |
|------------|---------------------------------------------------------|
| 32         | 22.53                                                   |
| 64         | 24.58                                                   |
| 128        | 28.16                                                   |
| 256        | 35.33                                                   |
| 512        | 50.18                                                   |
| 768        | 68.10                                                   |
@nsdhaman commented Nov 7, 2024

Note that the above data is with PPN 96. The reported memory-footprint values are in GB per socket. The memory overhead increases linearly with node count and persists through the entire program execution.

@hzhou (Contributor) commented Nov 7, 2024

@aditya-nishtala Could you retry the experiment with a debug build and MPIR_CVAR_DEBUG_SUMMARY=1 enabled, and then post one of the logs? That may help identify whether the memory is allocated by MPICH or by one of its dependency libraries.
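A minimal sketch of the requested reproduction, assuming a debug build of the same commit; the install path, binary name, and rank count are placeholders, not taken from the issue:

```shell
# Hypothetical debug build of MPICH commit 204f8cd (paths are examples).
./configure --prefix=$HOME/mpich-debug --enable-g=all
make -j && make install

# MPIR_CVAR_DEBUG_SUMMARY=1 asks MPICH to print an allocation summary,
# which helps attribute the memory to MPICH vs. its dependencies.
export MPIR_CVAR_DEBUG_SUMMARY=1
$HOME/mpich-debug/bin/mpiexec -n 96 ./hello_world > debug_summary.log 2>&1
```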

@hzhou (Contributor) commented Nov 7, 2024

Taking the differences, the memory increase is roughly linear in the number of nodes, ~55-65 MB/node. @aditya-nishtala what is the PPN (processes per node)?

@nsdhaman commented Nov 7, 2024

This is with PPN 96.

@hzhou (Contributor) commented Nov 7, 2024

Thanks @nsdhaman. So that is roughly 6 KB per connection.
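A back-of-envelope check of the per-connection estimate. This is a hypothetical reconstruction of the arithmetic; the GB figures are the DDR measurements from the table above, and the connection count assumes each added node contributes 96 remote peers per local rank:

```python
# DDR measurements from the table in the issue (node count -> GB).
ddr_gb = {32: 22.53, 64: 24.58, 128: 28.16, 256: 35.33, 512: 50.18, 768: 68.10}
nodes = sorted(ddr_gb)

# Memory added per extra node between successive measurements, in MB/node.
slopes = [(ddr_gb[b] - ddr_gb[a]) * 1024 / (b - a) for a, b in zip(nodes, nodes[1:])]
print([round(s, 1) for s in slopes])  # roughly 55-70 MB per added node

# With PPN = 96, each added node adds 96 remote peers for each of the
# 96 local ranks, i.e. 96 * 96 new address-table entries per node.
ppn = 96
per_conn_kb = (60 * 1024) / (ppn * ppn)  # taking ~60 MB/node as the slope
print(round(per_conn_kb, 1))  # ~6.7 KB per connection
```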

@hzhou (Contributor) commented Nov 7, 2024

Okay, I think the issue is that we allocate too large an address table, sized for all possible connections. If we assume the application will not use multiple VCIs, we can configure with --with-ch4-max-vcis=1; that will cut the memory growth to 1/64 of its current amount.

For a more proper fix, we could change the av table to accommodate multi-VCI/NIC entries dynamically rather than statically. I can probably implement something like that.
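The suggested workaround as a configure sketch; --with-ch4-max-vcis=1 is the option named in the comment above, while the prefix path and source layout are assumed:

```shell
# Hypothetical rebuild limiting CH4 to a single VCI, per the suggestion
# above. This trades multi-VCI capability for a ~64x smaller address table.
./configure --prefix=$HOME/mpich-single-vci --with-ch4-max-vcis=1
make -j && make install
```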

@hzhou hzhou self-assigned this Nov 7, 2024