@j-xiong j-xiong commented Nov 26, 2025

Cherry-picked fixes for the multi_ep test.

ft_finalize_ep is called in all tests and guarantees that no peer exits without
confirming that the other peer is done as well. This is normally done with an
in-band send/recv with the FI_TRANSMIT_COMPLETE flag set, which guarantees that the
send completion won't be generated until the message has been completely sent.
This patch adds an out-of-band finalize option that uses the out-of-band socket if
the user requested out-of-band syncs.
Typically an in-band finalize is sufficient, but it can be problematic in certain
scenarios: for example, if the provider doesn't support FI_TRANSMIT_COMPLETE (rare),
or for providers using an unreliable protocol underneath (rxd), where the last
underlying acknowledgement may be dropped. That can result in one side exiting
while the other is still waiting for an acknowledgement retry that will never come.
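
As a rough illustration of the out-of-band idea described above, the sketch below has each peer send one byte over an already-connected socket and then wait for the peer's byte before returning. It is not the fabtests implementation: `oob_finalize` and `demo_oob_finalize` are hypothetical names, and a `socketpair` plus `fork` stands in for the two test processes and their out-of-band socket.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: tell the peer we are done, then block until the
 * peer says the same. Returns 0 on success, -1 on any socket error. */
static int oob_finalize(int sock)
{
	char out = 'F', in = 0;

	assert(sock >= 0);

	/* Announce that this side has reached finalize. */
	if (send(sock, &out, 1, 0) != 1)
		return -1;

	/* Wait for the peer's announcement; 0 means the peer closed early. */
	if (recv(sock, &in, 1, 0) != 1 || in != 'F')
		return -1;

	return 0;
}

/* Hypothetical driver: parent and child act as the two peers. */
int demo_oob_finalize(void)
{
	int sv[2], status = 0, ret;
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: second peer. */
		close(sv[0]);
		_exit(oob_finalize(sv[1]) == 0 ? 0 : 1);
	}

	/* Parent: first peer. */
	close(sv[1]);
	ret = oob_finalize(sv[0]);
	close(sv[0]);

	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (ret != 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	return 0;
}
```

Because the exchange runs over a reliable stream socket, neither side depends on the fabric provider's acknowledgement behavior to know that the other side has finished.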

Signed-off-by: Alexia Ingerson <[email protected]>
(cherry picked from commit c4b8492)

In the RDM case of the multi_ep test, the test calls fi_getinfo again
for each EP in order to override the source address.
The source address was being passed as the node for the new call, but the
FI_SOURCE flag was not set, so the local source address was being
resolved into the dst_addr, which was never used anyway.
This removes the node parameter carrying the source address and lets the
provider set a source address for us.

This also fixes an issue on cleanup: if one of the calls to fi_getinfo fails,
the test would try to clean up its resources and segfault because
the fi is NULL.
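
The cleanup problem described above can be sketched in plain C, without libfabric, as a per-endpoint array where setup stops partway and cleanup has to skip entries that were never populated. All names here (`ep_info`, `ep_info_alloc`, `ep_info_cleanup`, `demo_setup_and_cleanup`) are hypothetical stand-ins and are not taken from the multi_ep test.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_EPS 4

/* Hypothetical stand-in for a per-endpoint info object. */
struct ep_info {
	int index;
};

/* Hypothetical allocator that fails for one chosen index, imitating a
 * per-endpoint lookup that returns an error. */
static struct ep_info *ep_info_alloc(int index, int fail_at)
{
	struct ep_info *info;

	if (index == fail_at)
		return NULL;

	info = calloc(1, sizeof(*info));
	if (info)
		info->index = index;
	return info;
}

/* Release only the entries that were actually allocated.
 * Returns the number of entries released. */
static int ep_info_cleanup(struct ep_info **infos, int count)
{
	int i, freed = 0;

	for (i = 0; i < count; i++) {
		if (!infos[i])
			continue; /* never populated: nothing to release */

		/* Reading a field like this is what would crash on NULL. */
		assert(infos[i]->index == i);
		free(infos[i]);
		infos[i] = NULL;
		freed++;
	}
	return freed;
}

/* Set up NUM_EPS entries, stopping at the first failure (fail_at, or -1
 * for no failure), then clean up. Returns the number released. */
int demo_setup_and_cleanup(int fail_at)
{
	struct ep_info *infos[NUM_EPS] = { NULL };
	int i;

	for (i = 0; i < NUM_EPS; i++) {
		infos[i] = ep_info_alloc(i, fail_at);
		if (!infos[i])
			break; /* leave the remaining slots NULL */
	}
	return ep_info_cleanup(infos, NUM_EPS);
}
```

Zero-initializing the array up front is what makes the NULL check in the cleanup loop reliable when setup fails early.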

Signed-off-by: Alexia Ingerson <[email protected]>
(cherry picked from commit 1a749cb)
@j-xiong j-xiong requested a review from aingerson November 26, 2025 05:14
@j-xiong j-xiong merged commit 1d8866a into ofiwg:v2.3.x Nov 28, 2025
10 checks passed
2 participants