Drop dependency on JSON3, use JSON instead
#5451
base: master
At least some of the code that was reading data using JSON3 should be switched to |
From the migration guide: The lazy support in JSON.jl is truly lazy and the underlying JSON is only parsed/navigated as explicitly requested. JSON3.jl still fully parsed the JSON into a fairly compact binary representation, avoiding full materialization of objects and arrays.
I'll have a look once the rest of this PR works. |
I have one failure that leaves me a bit puzzled: when trying to load https://github.com/oscar-system/serialization-upgrade-tests/blob/ce778d995ea63dd73307e7a9ecde1bf5b215ba22/version_1_3_0/Vector-86/0.mrdi (yes, only this single one of all the upgrade tests fails...), I get the following stacktrace:
julia> load("0.mrdi")
ERROR: MethodError: no method matching getindex(::String, ::Symbol)
The function `getindex` exists, but no method is defined for this combination of argument types.
Closest candidates are:
getindex(::String, ::UnitRange{Int64})
@ Base strings/string.jl:496
getindex(::String, ::Int64)
@ Base strings/string.jl:460
getindex(::String, ::AbstractUnitRange{<:Integer})
@ Base strings/string.jl:494
...
Stacktrace:
[1] upgrade_1_4_0(s::UpgradeState, dict::Dict{Symbol, Any})
@ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/Upgrades/1.4.0.jl:163
[2] upgrade_1_4_0(s::UpgradeState, dict::JSON.Object{Symbol, Any})
@ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/Upgrades/1.4.0.jl:279
[3] (::Oscar.Serialization.UpgradeScript)(s::UpgradeState, dict::JSON.Object{Symbol, Any})
@ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/Upgrades/main.jl:53
[4] upgrade(format_version::VersionNumber, dict::JSON.Object{Symbol, Any})
@ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/Upgrades/main.jl:273
[5] load(io::IOStream; params::Nothing, type::Nothing, serializer::Oscar.Serialization.JSONSerializer, with_attrs::Bool)
@ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/main.jl:795
[6] load
@ ~/code/julia/Oscar.jl/src/Serialization/main.jl:761 [inlined]
[7] #564
@ ~/code/julia/Oscar.jl/src/Serialization/main.jl:856 [inlined]
[8] open(f::Oscar.Serialization.var"#564#565"{Nothing, Nothing, Oscar.Serialization.JSONSerializer}, args::String; kwargs::@Kwargs{})
@ Base ./io.jl:410
[9] open
@ ./io.jl:407 [inlined]
[10] #load#563
@ ~/code/julia/Oscar.jl/src/Serialization/main.jl:855 [inlined]
[11] load(filename::String)
@ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/main.jl:852
[12] top-level scope
@ REPL[20]:1
@antonydellavecchia @benlorenz if you have any idea where this comes from and why we haven't seen it before, this would be much appreciated. Otherwise, I'll investigate further in a few days.
for some reason |
1f4ecfd to
8d1a301
Compare
I did a small benchmark (17c86c3 is a recent master commit):
julia> K,a = quadratic_field(42);
julia> R,(x,y) = polynomial_ring(K, [:x,:y]) ;
julia> f = (9//10*a + 1//2)*x^9*y^10 + (3*a + 1)*x^8*y^2 + (a + 7//2)*x^5*y^3 + (1//3*a + 7//2)*x^2*y^9 + (2//9*a + 3)*x^2*y^2 + (8//5*a + 7//9)*x*y^3;
julia> save("/tmp/bla", f)
julia> @b load("/tmp/bla")
210.301 μs (898 allocs: 54.695 KiB) #17c86c3
172.651 μs (916 allocs: 38.961 KiB) #lg/drop-JSON3
julia> @b begin Oscar.Serialization.reset_global_serializer_state(); load("/tmp/bla") end
333.322 μs (1326 allocs: 80.406 KiB) #17c86c3
216.521 μs (1108 allocs: 45.383 KiB) #lg/drop-JSON3
julia> save("/tmp/bla", [f for _ in 1:1000])
julia> @b load("/tmp/bla")
168.491 ms (739706 allocs: 39.564 MiB) #17c86c3
142.924 ms (710724 allocs: 27.903 MiB) #lg/drop-JSON3
julia> @b begin Oscar.Serialization.reset_global_serializer_state(); load("/tmp/bla") end
169.655 ms (740134 allocs: 39.631 MiB) #17c86c3
144.615 ms (710916 allocs: 27.905 MiB) #lg/drop-JSON3
julia> save("/tmp/blamaster", [f for _ in 1:100000])
julia> @time load("/tmp/blamaster");
20.340702 seconds (74.00 M allocations: 3.760 GiB, 13.14% gc time) #17c86c3
21.463136 seconds (71.10 M allocations: 2.719 GiB, 28.77% gc time) #lg/drop-JSON3
At least according to this benchmark, the runtime of the eager version now contained in this PR is comparable to the previous one using JSON3.
benlorenz
left a comment
Looks reasonable, it would be good if @antonydellavecchia could also have a look.
I added the extra-long label to let these tests run as well.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5451      +/-   ##
==========================================
+ Coverage   84.05%   84.08%   +0.02%
==========================================
  Files         730      730
  Lines       99189   100279    +1090
==========================================
+ Hits        83375    84319     +944
- Misses      15814    15960     +146
antonydellavecchia
left a comment
One comment, but looks good
  # with a name and a params key
  dict[:_type] isa String && return dict
- if dict[:data] isa Vector{String}
+ if !isempty(dict[:data]) && all(e -> e isa String, dict[:data])
So we ended up losing the type information for some reason here.
Do you know why? I believe it's because we use Dict{Symbol, Any} now.
I am not sure where exactly we are losing the type information, but iirc it already happens in the JSON library call. Having a Vector{Any} here is actually better for type stability on the macro level, as the type does not depend on the content of the JSON file.
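To illustrate the point about Vector{Any} in plain Julia: a type test on the container fails as soon as the array is created with element type Any, even if every element is a String, while a content-based check still succeeds. This is only a minimal sketch using Base Julia; the helper name `is_string_vector` is hypothetical and not part of the PR, and the assumption that the parser hands back a Vector{Any} is taken from the discussion above.

```julia
# The element type belongs to the container, not to what it happens to hold.
typed   = ["a", "b"]      # Vector{String}
untyped = Any["a", "b"]   # Vector{Any}; assumed shape of eagerly parsed JSON arrays

# Hypothetical helper mirroring the content-based check in the reviewed line.
is_string_vector(v) = v isa AbstractVector && !isempty(v) && all(e -> e isa String, v)

typed isa Vector{String}       # true
untyped isa Vector{String}     # false: the container type is Vector{Any}
is_string_vector(typed)        # true
is_string_vector(untyped)      # true: decided by the elements
is_string_vector(Any[])        # false: empty arrays are excluded explicitly
is_string_vector(Any["a", 1])  # false: mixed contents
```

One difference worth keeping in mind: an empty String[] satisfies `isa Vector{String}`, but it does not pass the content-based check because of the `!isempty` guard.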
The job for the extra-long tests has been running for about 2 hours now. Last night the whole extra-long test run took about 1 hour, and that file 2874s:
Edit: So the time for that file almost doubled, and the whole job has now finished after 3 hours, with this very long test file:
Should we try to look into this before merging, or will this be checked when this is improved for lazy parsing?
As a first step, I would try to factor all of the changes out of this PR that are needed for the switch from JSON3 to JSON, but that are already possible to do right now. That way, we split this large PR into two and see which of the two parts influences the runtime by how much. |
2716135 to
5c5164a
Compare
5c5164a to
9a73158
Compare
Next step towards #5438.