Open
Description
Reading a Parquet dataset with data = DataFrame(read_parquet(parquet_dir)) throws ReadOnlyMemoryError().
I tried reading the same files with Python libraries, and they are read correctly.
Note: I cannot share the data being used.
parquet file info
for filename in readdir(parquet_dir; join=true)
    print(Parquet.File(filename))
    break
end
gives
Parquet file: ./data/name.parquet/part-00000-2a4r6a85-ff94-4316-b790-ga9ec2aafgg5-c000.snappy.parquet
version: 1
nrows: 5491
created by: parquet-mr version 1.5.0-cdh5.13.3 (build ${buildNumber})
cached: 0 column chunks
Looking at the stack trace:
try
    data = DataFrame(read_parquet(parquet_dir))
catch
    show(stdout, "text/plain", stacktrace())
end
gives
9-element Vector{Base.StackTraces.StackFrame}:
top-level scope at In[31]:5
eval at boot.jl:360 [inlined]
include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String) at loading.jl:1116
softscope_include_string(m::Module, code::String, filename::String) at SoftGlobalScope.jl:65
execute_request(socket::ZMQ.Socket, msg::IJulia.Msg) at execute_request.jl:67
#invokelatest#2 at essentials.jl:708 [inlined]
invokelatest at essentials.jl:706 [inlined]
eventloop(socket::ZMQ.Socket) at eventloop.jl:8
(::IJulia.var"#15#18")() at task.jl:411
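Note that a bare stacktrace() inside catch returns the stack of the code that is currently running (the catch block itself), not the stack at the point where the exception was thrown, which would explain why the frames above only show top-level and IJulia internals. A minimal sketch of capturing the exception's own backtrace with catch_backtrace() follows. The names failing_read and capture_error are hypothetical, and failing_read is only a stand-in for the real DataFrame(read_parquet(parquet_dir)) call:

```julia
# Hypothetical stand-in for the failing call; in the real session this
# would be DataFrame(read_parquet(parquet_dir)).
failing_read() = throw(ReadOnlyMemoryError())

# Run f and return the rendered error plus its backtrace as a String,
# or nothing if f completes without throwing.
function capture_error(f)
    try
        f()
        return nothing
    catch err
        # catch_backtrace() gives the backtrace of the exception being
        # handled, i.e. the frames leading to the throw site.
        return sprint(showerror, err, catch_backtrace())
    end
end

report = capture_error(failing_read)
println(report)
```

Replacing failing_read with a closure around the real call, for example capture_error(() -> DataFrame(read_parquet(parquet_dir))), should show which frame inside Parquet raises the error.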
Using
Parquet v0.8.3
Julia info
Julia Version 1.6.4
Commit 35f0c911f4 (2021-11-19 03:54 UTC)
Platform Info:
OS: macOS (x86_64-apple-darwin19.5.0)
CPU: Apple M1 Max
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-11.0.1 (ORCJIT, westmere)