limit recursion depth to prevent stack overflow in 2-arg promote_type #57517

Open
wants to merge 52 commits into
base: master

nsajko
Contributor

@nsajko nsajko commented Feb 24, 2025

Replace the recursive application of promote_rule with a hardcoded (generated with metaprogramming) sequence of steps, up to a limited depth.

Now that recursion is gone, promote_rule can constant fold more often.

The simplest cases of conflicting promote_rule definitions are detected even before hitting the depth limit, as in #57507. This is purely for better UX: the error message in that case is more precise and the stack trace is shorter.

Fixes #13193

Closes #57507

@nsajko nsajko added the "bugfix" label (This change fixes an existing bug) Feb 24, 2025
@nsajko nsajko force-pushed the prevent_stack_overflow_in_type_promotion branch from 57dbdeb to 69e5d32 Compare February 24, 2025 14:31
@adienes
Member

adienes commented Feb 24, 2025

what is the advantage over #57507 ? this seems to add quite a bit more complexity and includes scary things like hardcoded constants

@nsajko
Contributor Author

nsajko commented Feb 24, 2025

what is the advantage over #57507 ?

This PR prevents a much wider class of stack overflows. In particular, #57507 doesn't fix #13193 (see the test here for an example), while this PR should.

scary things like hardcoded constants

Usually I avoid hardcoded constants as best I can, but in this case there's no way to guarantee (local) termination without hardcoding a cutoff. Another approach would be to stick with recursion but limit the depth; either way, some kind of hardcoded limit seems necessary.

promote_type shouldn't run without bound just because of faulty/conflicting promote_rules. Or, worse, overflow the stack. Or, even worse, cause the compiler itself to run without bound while it's trying to compile the promote_type method.
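
For illustration only, here is a minimal sketch of the "stick with recursion, but limit the depth" idea mentioned above. It is not the code in this PR: `NoRule`, `toy_rule`, `toy_promote`, the wrapper types `A` and `B`, and the limit of 8 are all hypothetical names and values chosen for the example, kept separate from Base so that nothing real is overridden. The remaining depth is carried as a tuple that loses one element per nested step.

```julia
# Hypothetical toy promotion machinery; nothing here is Base's or this PR's code.
struct NoRule end                      # sentinel meaning "no rule defined"
toy_rule(::Type, ::Type) = NoRule      # fallback method

# A well-behaved rule.
toy_rule(::Type{Int}, ::Type{Float64}) = Float64

# Two conflicting rules: each wrapper wants to end up on the outside.
struct A{T} end
struct B{T} end
toy_rule(::Type{A{T}}, ::Type{S}) where {T,S} = A{S}
toy_rule(::Type{B{T}}, ::Type{S}) where {T,S} = B{S}

# `fuel` is the depth budget: a tuple that shrinks by one element per nested call.
function toy_promote(::Type{T}, ::Type{S}, fuel::Tuple = ntuple(_ -> nothing, 8)) where {T,S}
    T === S && return T
    r1 = toy_rule(T, S)
    r2 = toy_rule(S, T)
    r1 === NoRule && r2 === NoRule && return typejoin(T, S)
    r1 === NoRule && return r2
    r2 === NoRule && return r1
    # Both directions produced a candidate; reconcile them one level deeper,
    # unless the depth budget is already spent.
    isempty(fuel) && throw(ArgumentError("toy_promote: depth limit reached"))
    return toy_promote(r1, r2, Base.tail(fuel))
end
```

With these definitions, `toy_promote(Int, Float64)` returns `Float64`, while `toy_promote(A{Int}, B{Int})` throws an `ArgumentError` instead of recursing until the stack overflows.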

@nsajko nsajko force-pushed the prevent_stack_overflow_in_type_promotion branch from 9e1f5dc to 4b7bb02 Compare February 24, 2025 22:00
@nsajko nsajko added the "backport 1.12" label (Change should be backported to release-1.12) Feb 25, 2025
@nsajko
Contributor Author

nsajko commented Feb 25, 2025

This is how the behavior for #13193 looks with this PR, due to the limited recursion depth:

julia> struct SIQuantity{T<:Number} <: Number; end

julia> Base.promote_rule(::Type{SIQuantity{T}}, ::Type{SIQuantity{S}}) where {T, S} = SIQuantity{promote_type(T,S)}

julia> Base.promote_rule(::Type{SIQuantity{T}}, ::Type{S}) where {T, S<:Number} = SIQuantity{promote_type(T,S)}

julia> struct Interval{T<:Number} <: Number; end

julia> Base.promote_rule(::Type{Interval{T}}, ::Type{Interval{S}}) where {T, S} = Interval{promote_type(T,S)}

julia> Base.promote_rule(::Type{Interval{T}}, ::Type{S}) where {T, S<:Number} = Interval{promote_type(T,S)}

julia> promote_type(Interval{Int}, SIQuantity{Int})
ERROR: ArgumentError: `promote_type`: recursion depth limit reached, giving up; check for faulty/conflicting/missing `promote_rule` methods
Stacktrace:
  [1] _promote_type_binary(::Type, ::Type, ::Tuple{})
    @ Base ./promotion.jl:308
  [2] promote_result(::Type, ::Type, ::Type{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{Int64}}}}}}}}}}}}, ::Type{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Int64}}}}}}}}}}}}, l::Tuple{})
    @ Base ./promotion.jl:360
  [3] _promote_type_binary(::Type{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Int64}}}}}}}}}}}, ::Type{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{Int64}}}}}}}}}}}, recursion_depth_limit::Tuple{Nothing})
    @ Base ./promotion.jl:325
  [4] promote_result(::Type, ::Type, ::Type{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Int64}}}}}}}}}}}, ::Type{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{Int64}}}}}}}}}}}, l::Tuple{Nothing})
    @ Base ./promotion.jl:360
  [5] _promote_type_binary(::Type{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{Int64}}}}}}}}}}, ::Type{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Int64}}}}}}}}}}, recursion_depth_limit::Tuple{Nothing, Nothing})
    @ Base ./promotion.jl:325
  [6] promote_result(::Type, ::Type, ::Type{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{Int64}}}}}}}}}}, ::Type{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Int64}}}}}}}}}}, l::Tuple{Nothing, Nothing})
    @ Base ./promotion.jl:360
  [7] _promote_type_binary
    @ ./promotion.jl:325 [inlined]
  [8] promote_result(::Type, ::Type, ::Type{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Int64}}}}}}}}}, ::Type{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{Int64}}}}}}}}}, l::Tuple{Nothing, Nothing, Nothing})
    @ Base ./promotion.jl:360
  [9] _promote_type_binary
    @ ./promotion.jl:325 [inlined]
 [10] promote_result(::Type, ::Type, ::Type{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{Int64}}}}}}}}, ::Type{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Int64}}}}}}}}, l::NTuple{4, Nothing})
    @ Base ./promotion.jl:360
 [11] _promote_type_binary
    @ ./promotion.jl:325 [inlined]
 [12] promote_result(::Type, ::Type, ::Type{Interval{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Int64}}}}}}}, ::Type{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Interval{Int64}}}}}}}, l::NTuple{5, Nothing})
    @ Base ./promotion.jl:360
 [13] _promote_type_binary
    @ ./promotion.jl:325 [inlined]
 [14] promote_result(::Type, ::Type, ::Type{Interval{SIQuantity{Interval{SIQuantity{Interval{Int64}}}}}}, ::Type{SIQuantity{Interval{SIQuantity{Interval{SIQuantity{Int64}}}}}}, l::NTuple{6, Nothing})
    @ Base ./promotion.jl:360
 [15] _promote_type_binary
    @ ./promotion.jl:325 [inlined]
 [16] promote_result(::Type, ::Type, ::Type{Interval{SIQuantity{Interval{SIQuantity{Int64}}}}}, ::Type{SIQuantity{Interval{SIQuantity{Interval{Int64}}}}}, l::NTuple{7, Nothing})
    @ Base ./promotion.jl:360
 [17] _promote_type_binary
    @ ./promotion.jl:325 [inlined]
 [18] promote_result(::Type, ::Type, ::Type{Interval{SIQuantity{Interval{Int64}}}}, ::Type{SIQuantity{Interval{SIQuantity{Int64}}}}, l::NTuple{8, Nothing})
    @ Base ./promotion.jl:360
 [19] _promote_type_binary
    @ ./promotion.jl:325 [inlined]
 [20] promote_result(::Type, ::Type, ::Type{Interval{SIQuantity{Int64}}}, ::Type{SIQuantity{Interval{Int64}}}, l::NTuple{9, Nothing})
    @ Base ./promotion.jl:360
 [21] _promote_type_binary
    @ ./promotion.jl:325 [inlined]
 [22] promote_type(::Type{Interval{Int64}}, ::Type{SIQuantity{Int64}})
    @ Base ./promotion.jl:340
 [23] top-level scope
    @ REPL[7]:1
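
One possible way to act on the error message's hint, sketched under the assumption that a quantity of intervals (`SIQuantity{Interval{...}}`) is the intended result. That choice is an assumption made for this example; it is not stated in #13193 or in this PR. Adding a more specific `promote_rule` for the mixed pair makes both argument orders agree, so promotion settles after one step:

```julia
struct SIQuantity{T<:Number} <: Number end
struct Interval{T<:Number} <: Number end

# The rules from the example above.
Base.promote_rule(::Type{SIQuantity{T}}, ::Type{SIQuantity{S}}) where {T, S} = SIQuantity{promote_type(T, S)}
Base.promote_rule(::Type{SIQuantity{T}}, ::Type{S}) where {T, S<:Number} = SIQuantity{promote_type(T, S)}
Base.promote_rule(::Type{Interval{T}}, ::Type{Interval{S}}) where {T, S} = Interval{promote_type(T, S)}
Base.promote_rule(::Type{Interval{T}}, ::Type{S}) where {T, S<:Number} = Interval{promote_type(T, S)}

# Assumed tie-breaker (not part of the original example): when an Interval
# meets an SIQuantity, the quantity goes on the outside. This method is more
# specific than the generic `S<:Number` rule for Interval, so it wins dispatch.
Base.promote_rule(::Type{Interval{T}}, ::Type{SIQuantity{S}}) where {T, S} =
    SIQuantity{Interval{promote_type(T, S)}}

promote_type(Interval{Int}, SIQuantity{Int})
```

With the extra method, both `promote_type(Interval{Int}, SIQuantity{Int})` and `promote_type(SIQuantity{Int}, Interval{Int})` return `SIQuantity{Interval{Int64}}`, because the two `promote_rule` directions now produce the same type.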

@nsajko nsajko added the "error handling" (Handling of exceptions by Julia or the user) and "error messages" (Better, more actionable error messages) labels Feb 25, 2025
@nsajko nsajko requested review from aviatesk and adienes February 25, 2025 06:48
@adienes
Member

adienes commented Feb 25, 2025

not sure if I'm 100% qualified to review beyond stylistic things but as far as I can tell I like this approach

it might make sense to run the benchmarks? I know this shouldn't change anything but promote_type is quite a highly used function so just in case

@nsajko nsajko added the "minor change" (Marginal behavior change acceptable for a minor release), "needs pkgeval" (Tests for all registered packages should be run with this change) and "needs nanosoldier run" (This PR should have benchmarks run on it) labels Feb 25, 2025
@nsajko nsajko changed the title from "prevent stack overflow in 2-arg promote_type" to "limit recursion depth to prevent stack overflow in 2-arg promote_type" Feb 25, 2025
@nsajko nsajko added the "triage" label (This should be discussed on a triage call) and removed the "backport 1.12" label (Change should be backported to release-1.12) Feb 25, 2025
@nsajko
Contributor Author

nsajko commented Feb 25, 2025

Question for triage: does limiting the recursion depth seem like an acceptable minor change? It's a bugfix, as it prevents stack overflows, but hypothetically some user code could break if it relied on an awkward chain of promote_rules.
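
To make "a chain of promote_rules" concrete, here is a small hypothetical example (all five type names are invented for illustration) in which promotion legitimately needs a second step: the two argument orders yield different intermediate types, and a further rule reconciles them. Chains like this should keep working as long as they are shorter than the depth limit.

```julia
# Hypothetical types, invented for this illustration only.
struct StepA end
struct StepB end
struct Mid1 end
struct Mid2 end
struct Top end

# The two argument orders disagree on the intermediate type...
Base.promote_rule(::Type{StepA}, ::Type{StepB}) = Mid1
Base.promote_rule(::Type{StepB}, ::Type{StepA}) = Mid2
# ...and a second rule reconciles the two intermediates.
Base.promote_rule(::Type{Mid1}, ::Type{Mid2}) = Top

promote_type(StepA, StepB)  # evaluates to Top after two rounds of promote_rule
```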

NB: an earlier version of this PR tried to do away with recursion altogether in favor of a loop, however that resulted in inference regressions.

NB: setting the recursion depth limit higher breaks bootstrap for some reason, so raising it is currently not possible.

@nsajko nsajko force-pushed the prevent_stack_overflow_in_type_promotion branch from 6236d3f to fac1ce7 Compare February 26, 2025 20:24
@nsajko
Contributor Author

nsajko commented Feb 26, 2025

@nanosoldier runtests()

@nsajko nsajko force-pushed the prevent_stack_overflow_in_type_promotion branch from 0bc1c59 to d2e0fb2 Compare February 28, 2025 14:29
@nanosoldier
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

Report summary

❗ Packages that crashed

1 packages crashed only on the current version.

  • An internal error was encountered: 1 packages

12 packages crashed on the previous version too.

✖ Packages that failed

38 packages failed only on the current version.

  • Package fails to precompile: 1 packages
  • Package has test failures: 2 packages
  • Package tests unexpectedly errored: 3 packages
  • Tests became inactive: 2 packages
  • Test duration exceeded the time limit: 27 packages
  • Test log exceeded the size limit: 3 packages

1054 packages failed on the previous version too.

✔ Packages that passed tests

23 packages passed tests only on the current version.

  • Other: 23 packages

5334 packages passed tests on the previous version too.

~ Packages that at least loaded

9 packages successfully loaded only on the current version.

  • Other: 9 packages

2968 packages successfully loaded on the previous version too.

➖ Packages that were skipped altogether

2 packages were skipped only on the current version.

  • Package could not be installed: 2 packages

908 packages were skipped on the previous version too.

@nsajko
Contributor Author

nsajko commented Mar 3, 2025

The pkgeval seems OK, it doesn't seem like any packages hit the recursion depth limit. That said, it might be good to repeat the pkgeval once the bpart-related changes are more polished.

@LilithHafner
Member

PolynomialRings failure looks real (link that will go stale)

@nsajko
Contributor Author

nsajko commented Mar 13, 2025

PolynomialRings is buggy, doesn't have much to do with this PR specifically: tkluck/PolynomialRings.jl#10

EDIT: the current version of this PR doesn't break PolynomialRings.jl any more, even though it's that package's own fault for adding methods to promote_type.

EDIT: the current version of this PR will again break PolynomialRings.jl, unless it gets broken for an unrelated reason by #58720 first. But, again, that's the fault of PolynomialRings.jl.

@nsajko nsajko removed the "triage" label (This should be discussed on a triage call) Jun 4, 2025
nsajko added 26 commits June 13, 2025 22:12

If this is fine with the test suite and PkgEval, we can raise it back
for even greater assurance that there will be no breakage in practice.

Ideally this shouldn't matter, however it's an easy fix for the test
suite of MaxPlus.jl, which expects this ordering of the calls in an
overly-strict `test_throws` test.

Will remove them in a follow-up PR. Just adding them back to make this
PR easier to review.

In preparation for replacing recursion with metaprogramming.

In particular, this enables more calls to be constant folded.

@nsajko nsajko force-pushed the prevent_stack_overflow_in_type_promotion branch from 9b1e379 to 011d7a1 Compare June 13, 2025 20:12
nsajko and others added 3 commits June 13, 2025 22:35
The registered package in question will be broken for an unrelated
reason now anyway by PR JuliaLang#58720, so there's no sense in trying to
protect it any more.

In any case, it doesn't make sense to hold back the language just
to protect a single broken package.
Labels
bugfix: This change fixes an existing bug
error handling: Handling of exceptions by Julia or the user
error messages: Better, more actionable error messages
minor change: Marginal behavior change acceptable for a minor release
needs nanosoldier run: This PR should have benchmarks run on it
Development

Successfully merging this pull request may close these issues.

Infinite recursion when promote_rule depends on order of arguments
6 participants