mpl/ze: fast_memcpy crash due to mmap in implicit mode #7619

@victor-anisimov

Description

One-sided communication crashes, even with very small window sizes, when running on PVC in implicit scaling mode (gpu_dev_compact.sh). It fails with the following error:

mmap failed fd: 46 size: 969998336
mmap device to host: Invalid argument
Abort(15) on node 1: Fatal error in internal_Get: Other MPI error

The same test works fine when running on PVC in explicit scaling mode (gpu_tile_compact.sh).

The reproducer is a Fortran 90 file that runs on a single node in an interactive session with 6 MPI ranks.

test.F90.txt

run-test.sh
