Multithreaded GroupedDataFrame with PyCall leads to "signal 11 (1): Segmentation fault" #3498

Closed
pfarndt opened this issue Apr 26, 2025 · 5 comments
Labels: bug, ecosystem (Issues in DataFrames.jl ecosystem)

Comments

pfarndt commented Apr 26, 2025

I often get signal 11 segmentation faults while using DataFrames in multithreaded Julia sessions. I managed to create an MWE:

Consider this script:

using DataFrames
using PyCall

import Pkg
Pkg.status()


println("threads: ", Threads.nthreads())

df = DataFrame(col1 = rand(1000), col2 = rand(1:2, 1000))
# col1 random numbers
# col2 random 1 or 2


println("\nv1:")
o = combine(df, :col1  => sum => :sum_col1)
println(o)

println("\nv2:")
o = combine(groupby(df, :col2), :col1 => sum => :sum_col1)
println(o)

# same thing using PyCall

println("\nv3:")
o = combine(df, :col1 => (r -> py"sum"(r)) => :sum_col1)
println(o)

println("\nv4:")
o = combine(groupby(df, :col2), :col1 => (r -> py"sum"(r)) => :sum_col1)
println(o)

This script runs smoothly for non-grouped DataFrames (see "v3") and for low numbers of threads, i.e. <= 4 on my MacBook and <= 13 on my Linux box. However, once I use more threads, the grouped PyCall version ("v4") consistently segfaults, as shown here:

> julia --proj -t 16 script.jl 
(pwd(), Base.active_project(), gethostname()) = ("/home/arndt/tmp/GDF-bug", "/home/arndt/tmp/GDF-bug/Project.toml", "tardis")
Status `~/tmp/GDF-bug/Project.toml`
  [a93c6f00] DataFrames v1.7.0
  [438e738f] PyCall v1.96.4
threads: 16

v1:
1×1 DataFrame
 Row │ sum_col1 
     │ Float64  
─────┼──────────
   1 │  498.327

v2:
2×2 DataFrame
 Row │ col2   sum_col1 
     │ Int64  Float64  
─────┼─────────────────
   1 │     1   259.221
   2 │     2   239.106

v3:
1×1 DataFrame
 Row │ sum_col1 
     │ Float64  
─────┼──────────
   1 │  498.327

v4:

[16274] signal 11 (1): Segmentation fault
in expression starting at /home/arndt/tmp/GDF-bug/script.jl:35
_PyErr_GetRaisedException at /usr/local/src/conda/python-3.12.2/Python/errors.c:490 [inlined]
PyErr_GetRaisedException at /usr/local/src/conda/python-3.12.2/Python/errors.c:499
PyObject_ClearWeakRefs at /usr/local/src/conda/python-3.12.2/Objects/weakrefobject.c:959
array_dealloc at /home/arndt/.julia/conda/3/lib/python3.12/site-packages/numpy/core/_multiarray_umath.cpython-312-x86_64-linux-gnu.so (unknown line)
pydecref_ at /home/arndt/.julia/packages/PyCall/1gn3u/src/PyCall.jl:118
pydecref at /home/arndt/.julia/packages/PyCall/1gn3u/src/PyCall.jl:123
jfptr_pydecref_4569 at /home/arndt/.julia/compiled/v1.11/PyCall/GkzkC_jDbxf.so (unknown line)
run_finalizer at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gc.c:303
jl_gc_run_finalizers_in_list at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gc.c:393
run_finalizers at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gc.c:439
jl_mutex_unlock at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/julia_locks.h:80 [inlined]
jl_generate_fptr_impl at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/jitlayers.cpp:545
jl_compile_method_internal at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gf.c:2536 [inlined]
jl_compile_method_internal at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gf.c:2423
_jl_invoke at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gf.c:2940 [inlined]
ijl_apply_generic at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gf.c:3125
_combine_process_pair at /home/arndt/.julia/packages/DataFrames/kcA9R/src/groupeddataframe/splitapplycombine.jl:630
#781 at /home/arndt/.julia/packages/DataFrames/kcA9R/src/groupeddataframe/splitapplycombine.jl:742
unknown function (ip: 0x7f6592b3c1af)
jl_apply at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
start_task at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/task.c:1202
Allocations: 8373914 (Pool: 8372179; Big: 1735); GC: 9
Segmentation fault (core dumped)
@bkamins bkamins added bug ecosystem Issues in DataFrames.jl ecosystem labels Apr 27, 2025
Member

bkamins commented Apr 27, 2025

It seems to me to be a bug in PyCall.jl, not in DataFrames.jl.
If I understand your example correctly, the Julia sum works OK, but py"sum" crashes. Right?

To be sure, could you also please check the following code:

combine(groupby(df, :col2), :col1 => (r -> sum(r)) => :sum_col1)

(which would test a custom Julia function, as sum is from Julia Base, which might matter here)

(I cannot reproduce the error as I do not have access to a machine with that many threads)

Author

pfarndt commented Apr 27, 2025

Your suggested line

combine(groupby(df, :col2), :col1 => (r -> sum(r)) => :sum_col1)

runs smoothly for any number of threads.

I agree that the involvement of PyCall looks strange. But I still believe that it has something to do with GroupedDataFrame, since the ungrouped command runs.

Thanks for taking care of this issue.

Author

pfarndt commented Apr 27, 2025

This construction also gives a segmentation fault if I run it with many threads (as indicated above):

function mysum(r)
    s = collect(r)
    return py"sum"(s)
end

combine(groupby(df, :col2), :col1 => mysum => :sum_col1)

Author

pfarndt commented Apr 27, 2025

I think in the end it is a PyCall.jl issue.
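
A possible workaround, sketched here as an untested assumption rather than a confirmed fix: the backtrace above shows the crash inside a PyCall finalizer (pydecref) that runs in a task spawned by the grouped combine. DataFrames.jl documents a `threads` keyword argument for `combine`, meant for transformations that are not thread-safe, so passing `threads=false` should keep the Python call in the calling task. Whether that actually avoids the segmentation fault with 16 threads was not checked in this thread.

```julia
using DataFrames
using PyCall

# Same data as in the script above.
df = DataFrame(col1 = rand(1000), col2 = rand(1:2, 1000))

# threads=false asks DataFrames.jl to apply the transformation serially
# instead of spawning tasks for it. That this is enough to avoid the
# crash is an assumption, not something verified here.
o = combine(groupby(df, :col2),
            :col1 => (r -> py"sum"(r)) => :sum_col1;
            threads = false)
println(o)
```

The ungrouped call ("v3") already runs without spawning tasks for the transformation, which would be consistent with it not crashing.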

@pfarndt pfarndt closed this as completed Apr 27, 2025
Member

bkamins commented Apr 27, 2025

Thank you for reporting.
