I built Open MPI with AddressSanitizer enabled and get this error when launching an application:
=================================================================
==473823==ERROR: AddressSanitizer: odr-violation (0x7ffff156e4a0):
[1] size=64 'mca_common_sm_module_t_class' ../../../../../opal/mca/common/sm/common_sm.c:43:1
[2] size=64 'mca_common_sm_module_t_class' ../../../../../opal/mca/common/sm/common_sm.c:43:1
These globals were registered at these points:
[1]:
#0 0x7ffff7762b28 in __asan_register_globals ../../../../libsanitizer/asan/asan_globals.cpp:346
#1 0x7ffff15603f4 in _sub_I_00099_1 (/gpfs/home/jschuchart/opt/ompi-big-datatypes/lib/openmpi/mca_btl_smcuda.so+0x203f4)
#2 0x7ffff7fcc51d in call_init /usr/src/debug/glibc-2.34-168.el9_6.23.x86_64/elf/dl-init.c:70
#3 0x7ffff7fcc51d in call_init /usr/src/debug/glibc-2.34-168.el9_6.23.x86_64/elf/dl-init.c:26
[2]:
#0 0x7ffff7762b28 in __asan_register_globals ../../../../libsanitizer/asan/asan_globals.cpp:346
#1 0x7fffe1c1857f in _sub_I_00099_1 (/gpfs/home/jschuchart/opt/ompi-big-datatypes/lib/libopen-pal.so.0+0x25457f)
#2 0x7ffff7fcc51d in call_init /usr/src/debug/glibc-2.34-168.el9_6.23.x86_64/elf/dl-init.c:70
#3 0x7ffff7fcc51d in call_init /usr/src/debug/glibc-2.34-168.el9_6.23.x86_64/elf/dl-init.c:26
==473823==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0
SUMMARY: AddressSanitizer: odr-violation: global 'mca_common_sm_module_t_class' at ../../../../../opal/mca/common/sm/common_sm.c:43:1
==473823==ABORTING
It seems that libopen-palmca_common_sm_noinst.a (which contains mca_common_sm_module_t_class) is built as a static library and linked into both libopen-pal.so and mca_btl_smcuda.so, so two copies of the same global variable end up loaded into the process.
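For anyone triaging similar reports, here is a minimal sketch of how the two registering objects can be pulled out of the ASan output. It assumes the report layout shown above, where the `#1` frame under each `__asan_register_globals` entry names the shared object that registered the global. The function name `parse_odr_report` and the `SAMPLE` excerpt are only illustrative; they are not part of Open MPI or of the sanitizer tooling.

```python
import os
import re

# "[1] size=64 'symbol' file:line:col" lines name the duplicated global.
SYMBOL_RE = re.compile(r"\[\d+\]\s+size=\d+\s+'(?P<symbol>[^']+)'")
# The "#1" frame is the static initializer of the object that registered it;
# the module path sits in parentheses before the "+0x..." offset.
FRAME_RE = re.compile(
    r"#1\s+0x[0-9a-f]+\s+in\s+\S+\s+\((?P<module>[^()+]+)\+0x[0-9a-f]+\)"
)

# Excerpt copied from the report above (frames #2 and #3 omitted).
SAMPLE = """\
==473823==ERROR: AddressSanitizer: odr-violation (0x7ffff156e4a0):
[1] size=64 'mca_common_sm_module_t_class' ../../../../../opal/mca/common/sm/common_sm.c:43:1
[2] size=64 'mca_common_sm_module_t_class' ../../../../../opal/mca/common/sm/common_sm.c:43:1
These globals were registered at these points:
[1]:
#0 0x7ffff7762b28 in __asan_register_globals ../../../../libsanitizer/asan/asan_globals.cpp:346
#1 0x7ffff15603f4 in _sub_I_00099_1 (/gpfs/home/jschuchart/opt/ompi-big-datatypes/lib/openmpi/mca_btl_smcuda.so+0x203f4)
[2]:
#0 0x7ffff7762b28 in __asan_register_globals ../../../../libsanitizer/asan/asan_globals.cpp:346
#1 0x7fffe1c1857f in _sub_I_00099_1 (/gpfs/home/jschuchart/opt/ompi-big-datatypes/lib/libopen-pal.so.0+0x25457f)
"""


def parse_odr_report(text):
    """Return (symbol, [module basenames]) for one odr-violation report.

    symbol is None if the report does not name exactly one global.
    """
    symbols = {m.group("symbol") for m in SYMBOL_RE.finditer(text)}
    modules = [os.path.basename(m.group("module")) for m in FRAME_RE.finditer(text)]
    symbol = symbols.pop() if len(symbols) == 1 else None
    return symbol, modules
```

Calling `parse_odr_report(SAMPLE)` returns the symbol name together with the two objects that each carry a copy, which matches the reading above that the static library ended up in both.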
In common/sm/Makefile.am I found this comment:
# Note that building this common component statically and linking
# against other dynamic components is *not* supported!
I think that by building smcuda dynamically while linking common_sm statically we are violating that note. Maybe we should force common_sm to be built dynamically whenever mca_btl_smcuda.so is being built?
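Whatever the fix ends up being, one way to check it without rerunning under ASan is to look at which installed objects still define the symbol. The sketch below is an assumption-laden helper, not an Open MPI tool: it assumes GNU `nm` is on `PATH`, that the libraries are not stripped, and the function names (`defined_symbols`, `run_nm`, `objects_defining`) are made up for illustration.

```python
import subprocess


def defined_symbols(nm_posix_output):
    """Parse `nm -P` output ("name type value size" per line) into a set of names."""
    names = set()
    for line in nm_posix_output.splitlines():
        fields = line.split()
        # Type 'U' marks an undefined reference rather than a definition.
        if len(fields) >= 2 and fields[1] != "U":
            names.add(fields[0])
    return names


def run_nm(path):
    """Run nm on one object file; raises CalledProcessError if nm fails."""
    result = subprocess.run(
        ["nm", "-P", "--defined-only", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def objects_defining(symbol, tables):
    """Given {path: set of defined names}, list the paths that define symbol."""
    return sorted(path for path, names in tables.items() if symbol in names)


# Example use (paths are placeholders for an actual install prefix):
#   paths = ["lib/libopen-pal.so.0", "lib/openmpi/mca_btl_smcuda.so"]
#   tables = {p: defined_symbols(run_nm(p)) for p in paths}
#   objects_defining("mca_common_sm_module_t_class", tables)
```

If the symbol is listed as defined in more than one object, the duplicate is still there; a single definer would suggest the static copy is gone from the component.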