CH4 performance worse than CH3 in 2-process ping-pong benchmark #7592
AhmeddHanyy
started this conversation in
General
Could you provide your data and setup details? Are you testing intranode latency?
I have been running a simple two-process ping-pong benchmark with MPICH.
Comparing MPICH 4.2.2 built with CH3 against the same version built with CH4, I see that CH3 consistently outperforms CH4 by about 1.5x.
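For anyone who wants to reproduce the comparison, a ping-pong test of this kind can be sketched roughly as follows. This is not the benchmark used for the numbers in this post, only a minimal illustration. The `USE_MPI` macro, the warm-up and iteration counts, and the message-size range are all assumptions made for the sketch. The MPI driver is guarded by the macro so that the two arithmetic helpers can be compiled and checked without an MPI installation.

```c
/* pingpong.c: minimal 2-rank ping-pong sketch (illustrative only).
 * Build:  mpicc -O2 -DUSE_MPI pingpong.c -o pingpong
 * Run:    mpiexec -n 2 ./pingpong
 * USE_MPI is a macro invented for this sketch; without it only the
 * helper functions below are compiled. */
#include <stddef.h>

/* One-way latency in microseconds: half of the average round-trip time. */
double one_way_latency_us(double elapsed_s, int iters)
{
    return elapsed_s / (2.0 * (double)iters) * 1e6;
}

/* Bandwidth in MB/s (1 MB = 1e6 bytes) for one message of msg_bytes.
 * Bytes per microsecond is numerically equal to MB/s. */
double bandwidth_mb_s(size_t msg_bytes, double latency_us)
{
    return latency_us > 0.0 ? (double)msg_bytes / latency_us : 0.0;
}

#ifdef USE_MPI
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define WARMUP 100    /* untimed iterations (arbitrary choice) */
#define ITERS  10000  /* timed iterations (arbitrary choice) */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size != 2) {
        if (rank == 0) fprintf(stderr, "run with exactly 2 processes\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    const int max_bytes = 1 << 20;  /* largest message: 1 MiB */
    char *buf = calloc((size_t)max_bytes, 1);
    if (!buf) MPI_Abort(MPI_COMM_WORLD, 1);

    for (int bytes = 1; bytes <= max_bytes; bytes *= 4) {
        double t0 = 0.0;
        for (int i = 0; i < WARMUP + ITERS; i++) {
            if (i == WARMUP) {          /* start timing after warm-up */
                MPI_Barrier(MPI_COMM_WORLD);
                t0 = MPI_Wtime();
            }
            if (rank == 0) {            /* rank 0 sends first, then waits */
                MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else {                    /* rank 1 echoes the message back */
                MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double elapsed = MPI_Wtime() - t0;
        if (rank == 0) {
            double lat = one_way_latency_us(elapsed, ITERS);
            printf("%8d bytes  %10.2f us  %10.2f MB/s\n",
                   bytes, lat, bandwidth_mb_s((size_t)bytes, lat));
        }
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
#endif
```

Running the same binary source against each build (for example with `mpiexec -n 2 ./pingpong`, and again with `FI_PROVIDER=shm` set for the CH4 build) would give per-message-size latency and bandwidth figures that can be posted alongside the setup details.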
Observations
CH3 (mpich-4.2.2-ch3):
Better latency and throughput in my ping-pong test.
CH4 (mpich-4.2.2 default build, with libfabric):
By default, it selects the sockets provider.
With the sockets provider, performance is worse than CH3.
With FI_PROVIDER=shm (forcing the shared-memory provider), performance improves but is still worse than CH3.
My expectation
From the documentation, I understood that CH4 should perform at least as well as CH3, especially in the two-process shared-memory case.
Question
Is this performance difference expected?