Libhio file/node on Lustre hangs  #53

@gshipman

Hi folks,

I'm trying to use libhio in file/node mode on a Lustre file system, and it hangs inside a pwrite call (see the stack trace below).
Interestingly, it does not hang when using file/node on GPFS.

0x00002aaab5a9e9d3 in __pwrite64_nocancel () from /lib64/libpthread.so.0

(gdb) bt
#0  0x00002aaab5a9e9d3 in __pwrite64_nocancel () from /lib64/libpthread.so.0
#1  0x00002aaab1dd3ebd in hioi_file_write (file=0xa, ptr=0x7fffffe1c4a0, count=1921280) at hio_internal.c:622
#2  0x00002aaab1ddae2c in builtin_posix_module_element_io_internal (posix_module=0xa, element=0x7fffffe1c4a0, 
    offset=1921280, iovec=0x2aaab5a9e9d3 <__pwrite64_nocancel+10>, count=-20878584, reading=128)
    at builtin-posix_component.c:2123
#3  0x00002aaab1dda7d6 in builtin_posix_module_process_reqs (dataset=0xa, reqs=0x7fffffe1c4a0, req_count=1921280)
    at builtin-posix_component.c:2233
#4  0x00002aaab1de3312 in hio_element_write_strided_nb (element=0xa, request=0x7fffffe1c4a0, offset=1921280, 
    reserved0=46912680618451, ptr=0x3a668ffec16b08, count=46912496271488, size=1921280, stride=0)
    at api/element_write.c:151
#5  0x00002aaab1de31cc in hio_element_write_strided (element=0xa, offset=140737486374048, reserved0=1921280, 
    ptr=0x2aaab5a9e9d3 <__pwrite64_nocancel+10>, count=16438317289663240, size=46912496271488, stride=69)
    at api/element_write.c:99
#6  0x00002aaab1de3191 in hio_element_write (element=0xa, offset=140737486374048, reserved0=1921280, 
    ptr=0x2aaab5a9e9d3 <__pwrite64_nocancel+10>, count=16438317289663240, size=46912496271488)
    at api/element_write.c:85
#7  0x000000000196e912 in hioc_writeat2 (unit=10, serial=-1981280, data=0x1d5100, 
    offset0=0x2aaab5a9e9d3 <__pwrite64_nocancel+10>, buf_bytes=1921280) at hio.c:752
#8  0x000000000196b875 in hio_module::my_hio_file_write2 (
    unit=<error reading variable: Cannot access memory at address 0xa>, serial=.FALSE., 
    numdata=<error reading variable: Cannot access memory at address 0x1d5100>, pos0=3779645987002334536, 
    data_c1=<error reading variable: Cannot access memory at address 0x3a668ffec16b48>, 
    data_i32=<error reading variable: Cannot access memory at address 0x2aaaaaad00c0>, 
    data_i64=<error reading variable: Location address is not set.>, 
    data_r32=<error reading variable: Location address is not set.>, 
    data_r64=<error reading variable: value requires 1921280 bytes, which is more than max-value-size>)
    at module_hio.f90:255
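
One way to narrow this down might be to check whether a plain pwrite of the same size (1921280 bytes, the count shown in frame #1 above) also stalls on the same Lustre mount when libhio is not involved. The following is a minimal sketch under that assumption; `plain_pwrite_test` is a hypothetical helper written for illustration and is not part of libhio, and the fill byte is arbitrary.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of libhio): write nbytes of filler data to
 * `path` starting at offset 0 using plain pwrite(), retrying on short
 * writes and EINTR. Returns the number of bytes written, or -1 on error. */
ssize_t plain_pwrite_test(const char *path, size_t nbytes)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        return -1;
    }

    char *buf = malloc(nbytes);
    if (buf == NULL) {
        close(fd);
        return -1;
    }
    memset(buf, 0x5a, nbytes); /* arbitrary fill pattern */

    size_t done = 0;
    while (done < nbytes) {
        ssize_t rc = pwrite(fd, buf + done, nbytes - done, (off_t) done);
        if (rc < 0 && errno == EINTR) {
            continue; /* interrupted before writing anything: retry */
        }
        if (rc <= 0) {
            free(buf);
            close(fd);
            return -1;
        }
        done += (size_t) rc;
    }

    free(buf);
    if (close(fd) != 0) {
        return -1;
    }
    return (ssize_t) done;
}
```

Calling `plain_pwrite_test` with a path on the Lustre mount and 1921280 as the size, once per process on a node, would show whether the stall reproduces outside libhio. If it does not, the difference is more likely in how the file is opened or shared in file/node mode than in the write call itself.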
