Skip to content

Conversation

@lgoettgens
Copy link
Member

Next step towards #5438.

@lgoettgens lgoettgens added the release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes label Oct 14, 2025
@benlorenz
Copy link
Member

benlorenz commented Oct 14, 2025

At least some of the code that was reading data using JSON3 should be switched to JSON.lazy instead of parsing everything to dicts. JSON3 was doing lazy parsing by default.

@lgoettgens
Copy link
Member Author

JSON3 was doing lazy parsing by default.

From the migration guide: The lazy support in JSON.jl is truly lazy and the underlying JSON is only parsed/navigated as explicitly requested. JSON3.jl still fully parsed the JSON into a fairly compact binary representation, avoiding full materialization of objects and arrays.
So it did some kind of in-between thing.

At least some of the code that was reading data using JSON3 should be switched to JSON.lazy instead of parsing everything to dicts.

I'll have a look once the rest of this PR works.

@lgoettgens
Copy link
Member Author

I have one failure that leaves me a but puzzled: When trying to load https://github.com/oscar-system/serialization-upgrade-tests/blob/ce778d995ea63dd73307e7a9ecde1bf5b215ba22/version_1_3_0/Vector-86/0.mrdi (yes, only this single one of all the upgrade tests fails...) I get the following stacktrace:

julia> load("0.mrdi")
ERROR: MethodError: no method matching getindex(::String, ::Symbol)
The function `getindex` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  getindex(::String, ::UnitRange{Int64})
   @ Base strings/string.jl:496
  getindex(::String, ::Int64)
   @ Base strings/string.jl:460
  getindex(::String, ::AbstractUnitRange{<:Integer})
   @ Base strings/string.jl:494
  ...

Stacktrace:
  [1] upgrade_1_4_0(s::UpgradeState, dict::Dict{Symbol, Any})
    @ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/Upgrades/1.4.0.jl:163
  [2] upgrade_1_4_0(s::UpgradeState, dict::JSON.Object{Symbol, Any})
    @ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/Upgrades/1.4.0.jl:279
  [3] (::Oscar.Serialization.UpgradeScript)(s::UpgradeState, dict::JSON.Object{Symbol, Any})
    @ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/Upgrades/main.jl:53
  [4] upgrade(format_version::VersionNumber, dict::JSON.Object{Symbol, Any})
    @ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/Upgrades/main.jl:273
  [5] load(io::IOStream; params::Nothing, type::Nothing, serializer::Oscar.Serialization.JSONSerializer, with_attrs::Bool)
    @ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/main.jl:795
  [6] load
    @ ~/code/julia/Oscar.jl/src/Serialization/main.jl:761 [inlined]
  [7] #564
    @ ~/code/julia/Oscar.jl/src/Serialization/main.jl:856 [inlined]
  [8] open(f::Oscar.Serialization.var"#564#565"{Nothing, Nothing, Oscar.Serialization.JSONSerializer}, args::String; kwargs::@Kwargs{})
    @ Base ./io.jl:410
  [9] open
    @ ./io.jl:407 [inlined]
 [10] #load#563
    @ ~/code/julia/Oscar.jl/src/Serialization/main.jl:855 [inlined]
 [11] load(filename::String)
    @ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/main.jl:852
 [12] top-level scope
    @ REPL[20]:1

@antonydellavecchia @benlorenz if you have any idea where this comes from and why we haven't seen it before, this would be much appreciated. Otherwise, I'll investigate further in a few days.

@antonydellavecchia
Copy link
Collaborator

I have one failure that leaves me a but puzzled: When trying to load https://github.com/oscar-system/serialization-upgrade-tests/blob/ce778d995ea63dd73307e7a9ecde1bf5b215ba22/version_1_3_0/Vector-86/0.mrdi (yes, only this single one of all the upgrade tests fails...) I get the following stacktrace:

julia> load("0.mrdi")
ERROR: MethodError: no method matching getindex(::String, ::Symbol)
The function `getindex` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  getindex(::String, ::UnitRange{Int64})
   @ Base strings/string.jl:496
  getindex(::String, ::Int64)
   @ Base strings/string.jl:460
  getindex(::String, ::AbstractUnitRange{<:Integer})
   @ Base strings/string.jl:494
  ...

Stacktrace:
  [1] upgrade_1_4_0(s::UpgradeState, dict::Dict{Symbol, Any})
    @ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/Upgrades/1.4.0.jl:163
  [2] upgrade_1_4_0(s::UpgradeState, dict::JSON.Object{Symbol, Any})
    @ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/Upgrades/1.4.0.jl:279
  [3] (::Oscar.Serialization.UpgradeScript)(s::UpgradeState, dict::JSON.Object{Symbol, Any})
    @ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/Upgrades/main.jl:53
  [4] upgrade(format_version::VersionNumber, dict::JSON.Object{Symbol, Any})
    @ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/Upgrades/main.jl:273
  [5] load(io::IOStream; params::Nothing, type::Nothing, serializer::Oscar.Serialization.JSONSerializer, with_attrs::Bool)
    @ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/main.jl:795
  [6] load
    @ ~/code/julia/Oscar.jl/src/Serialization/main.jl:761 [inlined]
  [7] #564
    @ ~/code/julia/Oscar.jl/src/Serialization/main.jl:856 [inlined]
  [8] open(f::Oscar.Serialization.var"#564#565"{Nothing, Nothing, Oscar.Serialization.JSONSerializer}, args::String; kwargs::@Kwargs{})
    @ Base ./io.jl:410
  [9] open
    @ ./io.jl:407 [inlined]
 [10] #load#563
    @ ~/code/julia/Oscar.jl/src/Serialization/main.jl:855 [inlined]
 [11] load(filename::String)
    @ Oscar.Serialization ~/code/julia/Oscar.jl/src/Serialization/main.jl:852
 [12] top-level scope
    @ REPL[20]:1

@antonydellavecchia @benlorenz if you have any idea where this comes from and why we haven't seen it before, this would be much appreciated. Otherwise, I'll investigate further in a few days.

for some reason dict[:data] on line 269 in the upgrade script 1.4.0.jl is now an Vector{Any} instead of Vector{String}

@lgoettgens
Copy link
Member Author

At least some of the code that was reading data using JSON3 should be switched to JSON.lazy instead of parsing everything to dicts.

I'll have a look once the rest of this PR works.

I did a small benchmark (17c86c3 is a recent master commit):

julia> K,a = quadratic_field(42);

julia> R,(x,y) = polynomial_ring(K, [:x,:y]) ; 

julia> f = (9//10*a + 1//2)*x^9*y^10 + (3*a + 1)*x^8*y^2 + (a + 7//2)*x^5*y^3 + (1//3*a + 7//2)*x^2*y^9 + (2//9*a + 3)*x^2*y^2 + (8//5*a + 7//9)*x*y^3;

julia> save("/tmp/bla", f)

julia> @b load("/tmp/bla")
210.301 μs (898 allocs: 54.695 KiB)		#17c86c3
172.651 μs (916 allocs: 38.961 KiB)		#lg/drop-JSON3

julia> @b begin Oscar.Serialization.reset_global_serializer_state(); load("/tmp/bla") end
333.322 μs (1326 allocs: 80.406 KiB)		#17c86c3
216.521 μs (1108 allocs: 45.383 KiB)		#lg/drop-JSON3

julia> save("/tmp/bla", [f for _ in 1:1000])

julia> @b load("/tmp/bla")
168.491 ms (739706 allocs: 39.564 MiB)	#17c86c3
142.924 ms (710724 allocs: 27.903 MiB)	#lg/drop-JSON3

julia> @b begin Oscar.Serialization.reset_global_serializer_state(); load("/tmp/bla") end
169.655 ms (740134 allocs: 39.631 MiB)	#17c86c3
144.615 ms (710916 allocs: 27.905 MiB)	#lg/drop-JSON3

julia> save("/tmp/blamaster", [f for _ in 1:100000])

julia> @time load("/tmp/blamaster");
 20.340702 seconds (74.00 M allocations: 3.760 GiB, 13.14% gc time)	#17c86c3
 21.463136 seconds (71.10 M allocations: 2.719 GiB, 28.77% gc time)	#lg/drop-JSON3

At least according to this piece of work, the runtime of the eager version now contained in this PR is comparable to the previous one using JSON3.
Since converting everything to use lazy parsing instead will possibly need a bunch of changes at different places in the deserialization code, I would be happy if we could merge this PR without that and move the lazy parsing to a follow-up.

@lgoettgens lgoettgens marked this pull request as ready for review October 17, 2025 14:35
@benlorenz benlorenz added the extra-long Also run the extra-long tests during CI. label Oct 21, 2025
@benlorenz benlorenz closed this Oct 21, 2025
@benlorenz benlorenz reopened this Oct 21, 2025
Copy link
Member

@benlorenz benlorenz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable, it would be good if @antonydellavecchia could also have a look.

I added the extra-long label to let these tests run as well.

@codecov
Copy link

codecov bot commented Oct 21, 2025

Codecov Report

❌ Patch coverage is 90.00000% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.08%. Comparing base (17c86c3) to head (2716135).
⚠️ Report is 9 commits behind head on master.

Files with missing lines Patch % Lines
src/Serialization/Upgrades/1.4.0.jl 86.95% 9 Missing ⚠️
src/Serialization/Upgrades/0.13.0.jl 80.00% 3 Missing ⚠️
src/Serialization/Upgrades/0.11.3.jl 80.00% 1 Missing ⚠️
src/Serialization/Upgrades/1.1.0.jl 50.00% 1 Missing ⚠️
src/Serialization/polymake.jl 75.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5451      +/-   ##
==========================================
+ Coverage   84.05%   84.08%   +0.02%     
==========================================
  Files         730      730              
  Lines       99189   100279    +1090     
==========================================
+ Hits        83375    84319     +944     
- Misses      15814    15960     +146     
Files with missing lines Coverage Δ
src/Serialization/Combinatorics.jl 100.00% <100.00%> (ø)
src/Serialization/PolyhedralGeometry.jl 93.20% <100.00%> (ø)
src/Serialization/Upgrades/0.12.0.jl 93.10% <100.00%> (+0.24%) ⬆️
src/Serialization/Upgrades/0.12.2.jl 77.63% <100.00%> (ø)
src/Serialization/Upgrades/0.15.0.jl 100.00% <100.00%> (ø)
src/Serialization/Upgrades/1.2.0.jl 100.00% <100.00%> (ø)
src/Serialization/Upgrades/1.3.0.jl 100.00% <100.00%> (ø)
src/Serialization/Upgrades/1.6.0-1.jl 100.00% <100.00%> (ø)
src/Serialization/Upgrades/1.6.0.jl 100.00% <100.00%> (ø)
src/Serialization/Upgrades/main.jl 97.26% <100.00%> (ø)
... and 7 more

... and 10 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Collaborator

@antonydellavecchia antonydellavecchia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment, but looks good

# with a name and a params key
dict[:_type] isa String && return dict
if dict[:data] isa Vector{String}
if !isempty(dict[:data]) && all(e -> e isa String, dict[:data])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we ended up losing the type information for some reason here.

Do you know why? I believe it's because you we use Dict{Symbol, Any} now

Copy link
Member Author

@lgoettgens lgoettgens Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure where exactly we are losing the type information, but iirc it was already in the JSON library call. But having a Vector{Any} here is actually better for type stability on the macro level as the the type does not depend on the content of the json file.

@benlorenz
Copy link
Member

benlorenz commented Oct 21, 2025

The job for the extra-long tests is running for about 2 hours now

      From worker 4:	-> Testing experimental/FTheoryTools/test/paper_tests.jl took: runtime 5274.726 seconds + compilation 282.062 seconds + recompilation 0.0 seconds, 218.856 GiB

last night the whole extra-long tests took about 1 hour, and that file 2874s:

      From worker 4:	-> Testing experimental/FTheoryTools/test/paper_tests.jl took: runtime 2874.218 seconds + compilation 259.867 seconds + recompilation 0.0 seconds, 235.277 GiB

Edit: So the time for that file almost doubled and the whole job now finished after 3 hours, with this very long testfile:

      From worker 3:	-> Testing experimental/FTheoryTools/test/FTM-1511-03209.jl took: runtime 9773.772 seconds + compilation 100.067 seconds + recompilation 0.0 seconds, 243.155 GiB

https://github.com/oscar-system/Oscar.jl/actions/runs/18681344886/job/53262968347?pr=5451#step:7:17923

Should we try to look into this before merging or will this be checked when this is improved for lazy parsing?

@lgoettgens lgoettgens marked this pull request as draft October 24, 2025 11:46
@lgoettgens
Copy link
Member Author

As a first step, I would try to factor all of the changes out of this PR that are needed for the switch from JSON3 to JSON, but that are already possible to do right now. That way, we split this large PR into two and see which of the two parts influences the runtime by how much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

extra-long Also run the extra-long tests during CI. release notes: not needed PRs introducing changes that are wholly irrelevant to the release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants