Problems running oshmpi with Fujitsu MPI on Fugaku #105

@tonycurtis

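I can build oshmpi, but a test program segfaults inside shmem_init when I run it on a compute node. The gdb session and the resulting backtrace are below. Any ideas?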

(gdb) cont
Continuing.
[New Thread 0x4000025ff010 (LWP 14149)]
[New Thread 0x4000029ff010 (LWP 14150)]

Thread 1 "a.out" received signal SIGSEGV, Segmentation fault.
ompi_mfh_base_real_t_cvar_write () at pcvar_write.c:43
43	pcvar_write.c: No such file or directory.
(gdb) bt
#0  ompi_mfh_base_real_t_cvar_write () at pcvar_write.c:43
#1  ompi_mfh_ptl_t_cvar_write ()
    at ../../../../src/ompi/mca/mfh/ptl/mfh_ptl_call.h:692
#2  PMPI_T_cvar_write ()
    at ../../../../src/ompi/mca/mfh/base/mfh_base_func_defs.h:13523
#3  0x00004000000a62a8 in set_mpit_cvar (cvar_name=<optimized out>,
    val=<optimized out>) at ../oshmpi-git/src/internal/setup_impl.c:698
#4  0x00004000000a6354 in initialize_mpit ()
    at ../oshmpi-git/src/internal/setup_impl.c:708
#5  0x00004000000a65d4 in OSHMPI_initialize_thread (required=<optimized out>,
    provided=<optimized out>) at ../oshmpi-git/src/internal/setup_impl.c:780
#6  0x00004000000b24a0 in shmem_init () at ../oshmpi-git/src/shmem/setup.c:13
#7  0x0000000000400ee4 in main () at hello.c:64
(gdb) q
A debugging session is active.

	Inferior 1 [process 14143] will be detached.

Quit anyway? (y or n) y
Detaching from program: /vol0004/ra010008/XXXXXX/shmem/openshmem-examples/c/a.out, process 14143
[Inferior 1 (process 14143) detached]
[c34-0003c:14143] *** Process received signal ***
[c34-0003c:14143] Signal: Segmentation fault (11)
[c34-0003c:14143] Signal code: Address not mapped (1)
[c34-0003c:14143] Failing at address: 0x1
[c34-0003c:14143] [ 0] linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x40000006066c]
[c34-0003c:14143] [ 1] /opt/FJSVxtclanga/tcsds-1.2.30a/lib64/libmpi.so.0(PMPI_T_cvar_write+0x54)[0x40000023d574]
[c34-0003c:14143] [ 2] /home/ra010008/XXXXXX/opt/oshmpi/git/lib/liboshmpi.so.0(+0x162a8)[0x4000000a62a8]
[c34-0003c:14143] [ 3] /home/ra010008/XXXXXX/opt/oshmpi/git/lib/liboshmpi.so.0(+0x16354)[0x4000000a6354]
[c34-0003c:14143] [ 4] /home/ra010008/XXXXXX/opt/oshmpi/git/lib/liboshmpi.so.0(OSHMPI_initialize_thread+0x270)[0x4000000a65d4]
[c34-0003c:14143] [ 5] /home/ra010008/XXXXXX/opt/oshmpi/git/lib/liboshmpi.so.0(shmem_init+0x24)[0x4000000b24a0]
[c34-0003c:14143] [ 6] ./a.out[0x400ee4]
[c34-0003c:14143] [ 7] /lib64/libc.so.6(__libc_start_main+0xe4)[0x400001030be4]
[c34-0003c:14143] [ 8] ./a.out[0x400dfc]
[c34-0003c:14143] *** End of error message ***
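Frames [2] and [3] of the signal backtrace carry no symbol name, only a library-relative offset and an absolute address. Subtracting one from the other gives the load base of liboshmpi.so.0, which is a quick way to check that these frames are the same ones gdb reports as #3 (set_mpit_cvar) and #4 (initialize_mpit). The following is a minimal sketch of that arithmetic; `frame_load_base` is a hypothetical helper written for this illustration and is not part of oshmpi or of any MPI library.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse one backtrace frame of the form
 *   "<library>(+0xOFFSET)[0xADDRESS]"
 * and compute the library load base as ADDRESS - OFFSET.
 * Returns 0 on success, -1 if the frame does not have that shape
 * (for example a symbol-relative frame such as "(shmem_init+0x24)").
 */
int frame_load_base(const char *frame, uint64_t *offset,
                    uint64_t *addr, uint64_t *base)
{
    /* Locate the anonymous-offset marker "(+0x". */
    const char *p = strstr(frame, "(+0x");
    if (p == NULL)
        return -1;

    /* p + 2 points at "0x..."; base 16 accepts the 0x prefix. */
    char *end = NULL;
    const char *off_str = p + 2;
    *offset = (uint64_t)strtoull(off_str, &end, 16);
    if (end == off_str || end[0] != ')' || end[1] != '[')
        return -1;

    /* The absolute address follows inside the square brackets. */
    const char *addr_str = end + 2;
    *addr = (uint64_t)strtoull(addr_str, &end, 16);
    if (end == addr_str || *end != ']')
        return -1;

    /* An address below the offset cannot belong to this mapping. */
    if (*addr < *offset)
        return -1;

    *base = *addr - *offset;
    return 0;
}
```

Applied to frames [2] and [3] above, both offsets resolve to the same base, 0x400000090000, and the absolute addresses 0x4000000a62a8 and 0x4000000a6354 are the ones gdb prints for frames #3 and #4. Frames [4] and [5] are symbol-relative, so this helper deliberately rejects them.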
