@AkiRusProd
Add Version Details to NVSHMEM Version Mismatch Errors

Problem: When the NVSHMEM device and host library versions do not match, the error message gave only a generic warning with no version details, which made compatibility issues hard to diagnose.

Solution: The error message now prints both version numbers in major.minor.patch format when a mismatch is detected:

  • In nvshmemid_hostlib_init_attr function during library initialization
  • In nvshmemi_cuobject_init_common function during CUmodule/CUlibrary initialization

Example of the new error output:

NVSHMEM device library version (3.3.24) does not match with NVSHMEM host library version (3.2.5)
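
For illustration only, the check described above could be sketched roughly as follows. This is not the code from the PR: the type `example_version_t` and the functions `example_version_equal` and `example_check_versions` are hypothetical names invented for this sketch, and the real NVSHMEM functions report the error through the library's own logging rather than a caller-supplied buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a library version triple; not the NVSHMEM type. */
typedef struct {
    int major;
    int minor;
    int patch;
} example_version_t;

/* Returns 1 when all three components are equal, 0 otherwise. */
static int example_version_equal(example_version_t a, example_version_t b) {
    return a.major == b.major && a.minor == b.minor && a.patch == b.patch;
}

/*
 * Compares the device and host versions. On a mismatch, writes a message
 * naming both versions in major.minor.patch form into buf and returns -1.
 * On a match, leaves buf as an empty string and returns 0.
 */
static int example_check_versions(example_version_t device,
                                  example_version_t host,
                                  char *buf, size_t buf_len) {
    assert(buf != NULL && buf_len > 0);
    buf[0] = '\0';
    if (example_version_equal(device, host)) {
        return 0;
    }
    snprintf(buf, buf_len,
             "NVSHMEM device library version (%d.%d.%d) does not match with "
             "NVSHMEM host library version (%d.%d.%d)",
             device.major, device.minor, device.patch,
             host.major, host.minor, host.patch);
    return -1;
}
```

Calling `example_check_versions` with a device version of 3.3.24 and a host version of 3.2.5 fills the buffer with a message in the format shown above, so the reader can see at a glance which side is older.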

Benefits:

  • Accelerates compatibility issue diagnosis
  • Allows precise identification of which version needs updating
  • Simplifies debugging in heterogeneous environments with different library versions
