
Need OMPI_MCA_osc=sm,pt2pt when using libyt #86

@cindytsai

Description


At the time I was developing the features that use RMA (remote memory access), all I cared about was making it work on the HPC systems, and I haven't thought much about why we need this parameter for it to run there. We don't need it on a single machine, e.g., my laptop.

TODO

  • What is OMPI_MCA_osc=sm,pt2pt, and why do we need it?
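
As far as I can tell (this is my reading of Open MPI's docs, a note rather than a verified answer): osc is Open MPI's MCA framework for one-sided communication, i.e. the implementation behind RMA windows. Setting OMPI_MCA_osc=sm,pt2pt restricts the framework to the sm (shared memory) and pt2pt (point-to-point emulation) components instead of letting Open MPI auto-select one (such as rdma or ucx). The environment variable form is equivalent to passing the component list on the command line:

mpirun --mca osc sm,pt2pt -np 4 ./example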

Problems

  • Slow and not recommended
    • Though when using this parameter in the strong scaling test, it was still faster than post-processing (just for reference).

When do we need this?

Attaching the same pointer multiple times

When I was testing the particle array using an example like this:

int temp[1] = {myrank};  /* local buffer; its address is handed to libyt */
grids_local[index_local].particle_data[0][3].data_ptr = temp;

I get this error:

[xps:25522] *** An error occurred in MPI_Win_attach
[xps:25522] *** reported by process [3353411585,1]
[xps:25522] *** on win rdma window 3
[xps:25522] *** MPI_ERR_RMA_ATTACH: Could not attach RMA segment
[xps:25522] *** MPI_ERRORS_ARE_FATAL (processes in this win will now abort,
[xps:25522] ***    and potentially your MPI job)
[xps:25513] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[xps:25513] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

This is probably caused by attaching the same data pointer to the window more than once.
But it is strange that it can be fixed by using

OMPI_MCA_osc=sm,pt2pt mpirun -np 4 ./example
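
To check the overlapping-attach theory, here is a minimal sketch I would try (my own reproduction attempt, not libyt code): it attaches the same buffer to a dynamic window twice. MPI-3 forbids attaching overlapping memory regions, so an osc component that checks for this should abort with the same MPI_ERR_RMA_ATTACH as above.

/* Reproduction sketch (an assumption about the failure mode, not libyt code):
 * attach the same buffer to a dynamic window twice. */
#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int myrank;
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    MPI_Win win;
    MPI_Win_create_dynamic(MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    int temp[1] = {myrank};
    MPI_Win_attach(win, temp, sizeof(temp));  /* first attach: fine */
    MPI_Win_attach(win, temp, sizeof(temp));  /* same region again: expected
                                                 to abort with
                                                 MPI_ERR_RMA_ATTACH */

    MPI_Win_detach(win, temp);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}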

When running on Taiwania 3 and Eureka

We need to add:

OMPI_MCA_osc=sm,pt2pt mpirun -np 4 ./example
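
Equivalently, the variable can be exported in the job script before launching (mpirun forwards OMPI_MCA_-prefixed environment variables to the spawned processes, as far as I know):

export OMPI_MCA_osc=sm,pt2pt
mpirun -np 4 ./example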

Otherwise I get an error:

(something related to attaching...)

Labels: parallelism (Parallel computing (ex: MPI, OpenMP) issue), question (Further information is requested)
