when is PMIx_Fence required? #511

@thomasgillis

Hi all, I am reaching out with a question about the requirements around PMIx_Fence.

Context

For our application, we are looking at the case where processes generate globally unique keys. They are posted to other processes using PMIx_Put.
On the other side, the processes requesting the value associated with the globally unique key have no knowledge of where the key is located, so they rely on PMIX_RANK_UNDEF.

Clarification

For that specific case, the API documentation mentions a few things about the need for PMIx_Fence, but the final answer is not clear to me. Here are the sections I have identified as relating to this use case, with some associated questions:

  • section 5.3 (L1-4), about how PMIx_Get is intended to work:

If the target process has a rank of PMIX_RANK_UNDEF, then this indicates that the key being requested is globally unique and not associated with a specific process. In this case, the server shall hold the request until either the data appears at the server or, if given, the PMIX_TIMEOUT is reached. [...]

It's clear that PMIx_Get will block until the key is available (or the timeout is reached). But it's unclear to me how I can guarantee that the data appears at the server.

  • section 5.1 (L27-34)

[...] However, in some cases, non-reserved keys are provided on a globally unique basis and the retrieving process has no knowledge of the identity of the process posting the key. This is typically found in legacy applications (where the originating process identifier is often embedded in the key itself) and in unstructured applications that lack rank-related behavior. In these cases, the key remains associated with the namespace of the process that posted it, but is retrieved by use of the PMIX_RANK_UNDEF rank. In addition, the keys must be globally exchanged prior to retrieval as there is no way for the host to otherwise locate the source for the information.

Here, I presume that "must be globally exchanged prior to retrieval" refers to the "global" method?

Global, collective exchange of the information prior to retrieval. This is accomplished by executing a barrier operation that includes collection and exchange of the data provided by each process such that each process has access to the full set of data from all participants once the operation has completed. PMIx provides the PMIx_Fence function (or its non-blocking equivalent) for this purpose, accompanied by the PMIX_COLLECT_DATA qualifier.

If so, does it mean that to have the data on the server, we are semantically required to call PMIx_Fence in our case?

Thanks for your help in clarifying the semantics :-)
