-
Notifications
You must be signed in to change notification settings - Fork 142
Description
Hello,
@phibeck reported this bug to me and I have made a mwe of a segfault. setindex!
on a dataset with an array of a compound datatype appears to be thread unsafe.
This is the mwe
using HDF5
T = ComplexF64
# T = Float64 # OK
sizes = (100, 100, 100)
Threads.@threads for i in 1:Threads.nthreads()
h5open("file$i.hdf5", "w") do h5
d = create_dataset(h5, "rand", T, sizes)
for n in 1:sizes[3]
d[:,:,n] = rand(T, sizes[1:2])
end
end
end
This is the segfault, although the message is different every time
$ julia --project --threads 6 mwe.jl
[400950] signal 7 (128): Bus error
in expression starting at /local/home/lxvm/projects/autobz/hdf5/mwe.jl:6
H5FL_reg_malloc at /home/lxvm/.julia/artifacts/3f844d84068534dcd6606936ab5f28e1120a9bb0/lib/libhdf5.so (unknown line)
H5FL_reg_calloc at /home/lxvm/.julia/artifacts/3f844d84068534dcd6606936ab5f28e1120a9bb0/lib/libhdf5.so (unknown line)
H5CX_push at /home/lxvm/.julia/artifacts/3f844d84068534dcd6606936ab5f28e1120a9bb0/lib/libhdf5.so (unknown line)
H5Sget_simple_extent_type at /home/lxvm/.julia/artifacts/3f844d84068534dcd6606936ab5f28e1120a9bb0/lib/libhdf5.so (unknown line)
h5s_get_simple_extent_type at /home/lxvm/.julia/packages/HDF5/Z859u/src/api/functions.jl:6716
setindex! at /home/lxvm/.julia/packages/HDF5/Z859u/src/datasets.jl:368
unknown function (ip: 0x7f6496319060)
#2 at /local/home/lxvm/projects/autobz/hdf5/mwe.jl:10
#17 at /home/lxvm/.julia/packages/HDF5/Z859u/src/file.jl:101
task_local_storage at ./task.jl:315
#h5open#16 at /home/lxvm/.julia/packages/HDF5/Z859u/src/file.jl:96 [inlined]
h5open at /home/lxvm/.julia/packages/HDF5/Z859u/src/file.jl:94 [inlined]
macro expansion at /local/home/lxvm/projects/autobz/hdf5/mwe.jl:7 [inlined]
#26#threadsfor_fun#1 at ./threadingconstructs.jl:252
#26#threadsfor_fun at ./threadingconstructs.jl:219 [inlined]
#1 at ./threadingconstructs.jl:154
unknown function (ip: 0x7f649630ba9f)
jl_apply at /cache/build/builder-demeter6-6/julialang/julia-master/src/julia.h:2157 [inlined]
start_task at /cache/build/builder-demeter6-6/julialang/julia-master/src/task.c:1202
Allocations: 4069986 (Pool: 4069810; Big: 176); GC: 6
Bus error
In @phibeck's bug report the segfault showed a trace through setindex! -> get_jl_type -> get_mem_compatible_jl_type -> h5t_get_member_name
and I noticed there was no lock on the library call in this function:
Lines 982 to 989 in 1a24872
# Note: The following two functions implement direct ccalls because the binding generator | |
# cannot (yet) do the string wrapping and memory freeing. | |
""" | |
h5t_get_member_name(type_id::hid_t, index::Cuint) -> String | |
See `libhdf5` documentation for [`H5Oopen`](https://portal.hdfgroup.org/display/HDF5/H5T_GET_MEMBER_NAME). | |
""" | |
function h5t_get_member_name(type_id, index) |
It seems that calling setindex!
with an array of a compound datatype, such as ComplexF64
, hits this codepath and consistently segfaults.
Although our code could be rewritten to avoid multi-threaded hdf5 calls, I am hoping to identify and fix the issue.
For reference, I ran this on linux and this is the environment:
julia --project
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.11.1 (2024-10-16)
_/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release
|__/ |
(hdf5) pkg> st -m
Status `/local/home/lxvm/projects/autobz/hdf5/Manifest.toml`
[34da2185] Compat v4.16.0
[f67ccb44] HDF5 v0.17.2
[692b3bcd] JLLWrappers v1.6.1
[3da0fdf6] MPIPreferences v0.1.11
[21216c6a] Preferences v1.4.3
[ae029012] Requires v1.3.0
[0234f1f7] HDF5_jll v1.14.3+3
[e33a78d0] Hwloc_jll v2.11.2+1
[7cb0a576] MPICH_jll v4.2.3+0
[f1f71cc9] MPItrampoline_jll v5.5.1+0
[9237b28f] MicrosoftMPI_jll v10.1.4+2
⌅ [fe0851c0] OpenMPI_jll v4.1.6+0
[458c3c95] OpenSSL_jll v3.0.15+1
[477f73a3] libaec_jll v1.1.2+0
[0dad84c5] ArgTools v1.1.2
[56f22d72] Artifacts v1.11.0
[2a0f44e3] Base64 v1.11.0
[ade2ca70] Dates v1.11.0
[f43a241f] Downloads v1.6.0
[7b1f6079] FileWatching v1.11.0
[4af54fe1] LazyArtifacts v1.11.0
[b27032c2] LibCURL v0.6.4
[76f85450] LibGit2 v1.11.0
[8f399da3] Libdl v1.11.0
[56ddb016] Logging v1.11.0
[d6f4376e] Markdown v1.11.0
[a63ad114] Mmap v1.11.0
[ca575930] NetworkOptions v1.2.0
[44cfe95a] Pkg v1.11.0
[de0858da] Printf v1.11.0
[9a3f8284] Random v1.11.0
[ea8e919c] SHA v0.7.0
[fa267f1f] TOML v1.0.3
[a4e569a6] Tar v1.10.0
[cf7118a7] UUIDs v1.11.0
[4ec0a83e] Unicode v1.11.0
[e66e0078] CompilerSupportLibraries_jll v1.1.1+0
[deac9b47] LibCURL_jll v8.6.0+0
[e37daf67] LibGit2_jll v1.7.2+0
[29816b5a] LibSSH2_jll v1.11.0+1
[c8ffd9c3] MbedTLS_jll v2.28.6+0
[14a3606d] MozillaCACerts_jll v2023.12.12
[83775a58] Zlib_jll v1.2.13+1
[8e850ede] nghttp2_jll v1.59.0+0
[3f19e933] p7zip_jll v17.4.0+2