
Fix lower incomplete gamma functions with x = 0 #1251


Merged
merged 5 commits into from
May 23, 2025

Conversation

cohomology
Contributor

In this case, the errno error handling did not work correctly, as internal functions were accidentally setting it although no overflow happens.

Fixes #1249.
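
The regression test for this fix appears to compare errno before and after the call (the CI log further down reports a failed check of the form `errno == saveErrno`). A minimal sketch of that pattern, using only the standard library, might look as follows. The names `errno_unchanged_by`, `noisy_zero` and `quiet_zero` are hypothetical stand-ins for illustration and are not part of Boost.Math.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical helper: runs f() and reports whether errno kept its prior value.
template <class F>
bool errno_unchanged_by(F f)
{
    const int saved = errno;                 // remember the value before the call
    f();
    const bool unchanged = (errno == saved);
    errno = saved;                           // restore so later checks start clean
    return unchanged;
}

// Stand-in for an internal routine that reports a range error although the
// value it returns is perfectly representable.
inline double noisy_zero() { errno = ERANGE; return 0.0; }

// Stand-in for a routine that returns the same value without touching errno.
inline double quiet_zero() { return 0.0; }
```

With Boost available, the callable would instead wrap the call under test, for example a lower incomplete gamma evaluation at x = 0 under an errno-reporting policy.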


codecov bot commented Apr 2, 2025

Codecov Report

Attention: Patch coverage is 98.55072% with 1 line in your changes missing coverage. Please review.

Project coverage is 93.82%. Comparing base (a5c0625) to head (7503be4).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| test/git_issue_1249.cpp | 98.38% | 1 Missing ⚠️ |
Additional details and impacted files


@@             Coverage Diff             @@
##           develop    #1251      +/-   ##
===========================================
- Coverage    93.83%   93.82%   -0.01%     
===========================================
  Files          657      658       +1     
  Lines        55244    55330      +86     
===========================================
+ Hits         51840    51916      +76     
- Misses        3404     3414      +10     
| Files with missing lines | Coverage Δ |
|---|---|
| include/boost/math/special_functions/gamma.hpp | 92.08% <100.00%> (-0.18%) ⬇️ |
| test/git_issue_1249.cpp | 98.38% <98.38%> (ø) |

... and 7 files with indirect coverage changes



Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@@ -1627,7 +1631,7 @@ BOOST_MATH_GPU_ENABLED T gamma_incomplete_imp_final(T a, T x, bool normalised, b
 #endif
       result = gam - result;
    }
-   if(p_derivative)
+   if(p_derivative && x > 0)
Collaborator


I think we need to let this line execute without the check for x > 0, so that our root finder gets a derivative back: in this particular case, any arbitrarily large value will do... ah, but there should be an else before the *p_derivative /= x;.

Contributor Author


Hi, I adjusted the code. Is it okay now?

I also corrected some incorrect ternary operators in the file that prevented some functions from being used with non-standard (not float/double) T template parameters.
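
The thread does not say what was wrong with the ternaries. One common way a conditional expression breaks for user-defined number types is branches of different types with no implicit conversion between them. The sketch below illustrates that assumed failure mode; `Real` and `clamp_to_nonnegative` are hypothetical names and do not come from gamma.hpp.

```cpp
#include <cassert>

// Hypothetical number type whose conversion from built-in arithmetic types is
// explicit, as is the case for some user-defined number types.
struct Real
{
    double v;
    explicit Real(double d) : v(d) {}
    friend bool operator<(const Real& a, const Real& b) { return a.v < b.v; }
};

// Writing `x < T(0) ? 0 : x` would mix `int` and `T`. For Real that does not
// compile, because neither branch converts implicitly to the other's type.
// Giving both branches type T keeps the expression valid for every T.
template <class T>
T clamp_to_nonnegative(T x)
{
    return x < T(0) ? T(0) : x;
}
```

The same template still works unchanged for `float` and `double`, so the explicit `T(0)` costs nothing for the built-in types.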

@jzmaddock
Collaborator

Other than my one small comment, this all looks good. Many thanks for this!

@jzmaddock
Collaborator

Looks great, thanks, just running CI now...

@cohomology
Contributor Author

Where do these CI errors come from? I don't get them ...

@mborland
Member

Where do these CI errors come from? I don't get them ...

The Cauchy one was failing with the SYCL compiler, and the beta ones are new failures there that are unrelated. The last one is because we need to increment the current version in Boost.Config. I would not worry about any of these failures. I need to address them and make sure we haven't broken CUDA along the way.

@jzmaddock
Collaborator

Ah shucks, sorry, I hadn't spotted that the new beta tests were failing on CUDA. @mborland some of them are trivial failures (need to update expected error rates), but there are a few "gross" errors which might warrant investigation. Or it may be that the inputs are so extreme that there's not much we can do on that platform. If you have CUDA/Sycl set up can I let you investigate?

@mborland
Member

Ah shucks, sorry, I hadn't spotted that the new beta tests were failing on CUDA. @mborland some of them are trivial failures (need to update expected error rates), but there are a few "gross" errors which might warrant investigation. Or it may be that the inputs are so extreme that there's not much we can do on that platform. If you have CUDA/Sycl set up can I let you investigate?

Yes, next week I'll look into these. The Scipy guys also are in a position now to use the CUDA stuff so I'll put some effort into the rest of the library too.

@jzmaddock
Collaborator

@cohomology can you please merge develop into this PR so we kick off a fresh CI run and do a final check before merging? Thanks!

@cohomology
Contributor Author

cohomology commented Apr 28, 2025

I don't understand your changes in gamma.hpp.

The assert:

(x >= 1) || (tools::max_value<T>() * x >= *p_derivative)

is true in my case, because in "tools::max_value<T>() * x >= *p_derivative" the left side is 0 and the right side is 0, too.

So the division by 0: *p_derivative /= x;

gives NaN for the derivative. Is that okay, or should I introduce another if?

@cohomology
Contributor Author

Ok, adjusted to give the result max/2 as before.
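
To make the case analysis concrete, here is a minimal sketch of a guarded division that returns max/2 whenever the quotient would overflow or would be 0/0. It is an illustration under stated assumptions, not the code in gamma.hpp: `divide_or_saturate` is a hypothetical name, `d` is assumed non-negative, and `std::numeric_limits` stands in for `tools::max_value`.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper, not the Boost.Math implementation: divides d by x and
// substitutes max/2 when the quotient would overflow or would be 0/0.
template <class T>
T divide_or_saturate(T d, T x)
{
    const T max_value = (std::numeric_limits<T>::max)();
    if (x == 0)
        return max_value / 2;   // covers 0/0, which would otherwise be NaN
    if (x < 1 && max_value * x < d)
        return max_value / 2;   // d / x would exceed max_value
    return d / x;
}
```

The explicit `x == 0` branch is needed because the second condition alone is false when both `d` and `x` are zero, which is exactly the situation described above. The second branch also catches a denormal `x` with a modest numerator.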

@jzmaddock
Collaborator

I don't understand your changes in gamma.hpp.

You're correct: I know I've only just merged these, but they're old changes from about a year ago that I was halfway through, and it looks like I missed a case.

It can overflow, but only for x denormal or zero.

Test case is for a = 0.01, x = denorm_min.

Feel free to reinstate the old commented out code, and I'll try and figure out a way to get coverage later...

@cohomology
Contributor Author

Ok, done. CI still has to run through, but it was already okay with yesterday's commit, except for the SYCL errors.

@jzmaddock
Collaborator

Thanks @cohomology .

I was about to merge this and then realized that the new test is generating UBSAN failures in cpp_int's internal logic. It would seem to be unrelated to this patch, but I'm wary of introducing a new "known failure".

I've tried and failed to reproduce the issues locally, anyone else? @ckormanyos @mborland ?

For the record the failure is:

testing.capture-output ../../../bin.v2/libs/math/test/git_issue_1249.test/clang-linux-18/debug/x86_64/link-static/threading-multi/visibility-hidden/git_issue_1249.run

====== BEGIN OUTPUT ======

../../../boost/multiprecision/cpp_int/bitwise.hpp:441:48: runtime error: left shift of 16962843832447060 by 50 places cannot be represented in type 'unsigned long long'

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:441:48 

../../../boost/multiprecision/cpp_int/bitwise.hpp:446:48: runtime error: left shift of 11191096789291741588 by 50 places cannot be represented in type 'unsigned long long'

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:446:48 

../../../boost/multiprecision/cpp_int/bitwise.hpp:596:35: runtime error: left shift of 7029918509865828866 by 63 places cannot be represented in type 'unsigned long long'

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:596:35 

/usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/basic_string.h:490:51: runtime error: unsigned integer overflow: 3 - 9 cannot be represented in type 'size_type' (aka 'unsigned long')

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/basic_string.h:490:51 

../../../boost/multiprecision/cpp_int/bitwise.hpp:596:35: runtime error: left shift of 562949953421312 by 54 places cannot be represented in type 'limb_type' (aka 'unsigned long long')

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:596:35 

../../../boost/multiprecision/cpp_int/bitwise.hpp:596:35: runtime error: left shift of 11544872091648 by 60 places cannot be represented in type 'limb_type' (aka 'unsigned long long')

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:596:35 

Running 1 test case...

git_issue_1249.cpp(93): error: in "test_main": check (*__errno_location ()) == saveErrno has failed

../../../boost/multiprecision/cpp_int/bitwise.hpp:596:35: runtime error: left shift of 14355223812243456 by 60 places cannot be represented in type 'limb_type' (aka 'unsigned long long')

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:596:35 

../../../boost/multiprecision/cpp_int/bitwise.hpp:441:48: runtime error: left shift of 900719066480640 by 20 places cannot be represented in type 'limb_type' (aka 'unsigned long long')

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:441:48 

../../../boost/multiprecision/cpp_int/bitwise.hpp:596:35: runtime error: left shift of 33030144 by 61 places cannot be represented in type 'limb_type' (aka 'unsigned long long')

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:596:35 

../../../boost/multiprecision/cpp_int/bitwise.hpp:441:48: runtime error: left shift of 17870283321406128128 by 27 places cannot be represented in type 'limb_type' (aka 'unsigned long long')

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:441:48 

../../../boost/multiprecision/cpp_int/bitwise.hpp:596:35: runtime error: left shift of 18446744073705357312 by 18 places cannot be represented in type 'limb_type' (aka 'unsigned long long')

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:596:35 

../../../boost/multiprecision/cpp_int/bitwise.hpp:446:48: runtime error: left shift of 9223372036854775807 by 41 places cannot be represented in type 'limb_type' (aka 'unsigned long long')

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:446:48 

../../../boost/multiprecision/cpp_int/bitwise.hpp:441:48: runtime error: left shift of 14379386343318 by 60 places cannot be represented in type 'limb_type' (aka 'unsigned long long')

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:441:48 

git_issue_1249.cpp(93): error: in "test_main": check (*__errno_location ()) == saveErrno has failed

../../../boost/multiprecision/cpp_int/bitwise.hpp:446:48: runtime error: left shift of 2953679248783 by 42 places cannot be represented in type 'limb_type' (aka 'unsigned long long')

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:446:48 

../../../boost/multiprecision/cpp_int/bitwise.hpp:441:48: runtime error: left shift of 10335793219406891510 by 13 places cannot be represented in type 'limb_type' (aka 'unsigned long long')

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:441:48 

git_issue_1249.cpp(93): error: in "test_main": check (*__errno_location ()) == saveErrno has failed


*** 3 failures are detected in the test module "Master Test Suite"

@jzmaddock
Collaborator

Update, definitely don't understand this one, we have:

limb_type shift  = static_cast<limb_type>(s % Int::limb_bits);

And this could (should actually) be marked as const, so we must have 0 <= shift < limb_bits. The only thing I can think of is a memory overrun bug, but all the ASAN tests are green and we've checked this stuff pretty carefully...? Then again, UBSAN is always right, given that it's a runtime check!
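
The bound on `shift` follows directly from the modulo. A small sketch of splitting a bit count into whole limbs and leftover bits is shown below; the 64-bit `limb_type`, the free-standing `limb_bits` constant and the `split_shift` helper are assumptions made for illustration, not the cpp_int internals.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

using limb_type = std::uint64_t;                                 // assumed limb width
constexpr std::size_t limb_bits = sizeof(limb_type) * CHAR_BIT;  // 64 on common platforms

struct shift_parts
{
    std::size_t offset;  // number of whole limbs to move
    limb_type   shift;   // remaining bits, always in [0, limb_bits)
};

// Hypothetical helper: decomposes a left shift by s bits.
inline shift_parts split_shift(std::size_t s)
{
    return { s / limb_bits, static_cast<limb_type>(s % limb_bits) };
}
```

Because `s % limb_bits` can never reach `limb_bits`, the shift count itself is always valid for the limb type, whatever the sanitizer is objecting to.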

@ckormanyos
Member

ckormanyos commented May 18, 2025

I made some recent changes to bitwise.hpp in e1cc7ba66d72a444458cdce5d3335fec9518fd5c

I'm not sure if this is real or bogus, but I do see that I made a cast to unsigned long long there, which might have been better as double_limb_type.

Also I notice that these left shifts are exceeding the range of 64-bit. So maybe I messed that one up and these tests reveal that mess-up.

On the other hand, this might be a new phenomenon.

I can investigate this tomorrow in peace. But I'm not sure if this is the right place (in my changes) or if we have discovered a new phenomenon?

Cc: @jzmaddock and @mborland and @cohomology

@ckormanyos
Member

ckormanyos commented May 19, 2025

Hi @jzmaddock in a PR in multiprecision, I have reverted the recent changes in cpp_int/bitwise.hpp that I had made.

As soon as that cycles green (as it is expected to), I will merge it into develop of Multiprecision.

Could we then, at that point in time, repeat these UBSAN tests? And then we might be able to find out if the recent changes in cpp_int/bitwise.hpp are related to the UBSAN detections in this Math PR.

Cc: @mborland and @cohomology

@jzmaddock
Collaborator

Not so easy to re-run the drone tests... ah, maybe closing and re-opening might do it? Also need to wait half a day after multiprecision has been merged for the master project to catch up.

Might be easier... to import the failing test into multiprecision on your own fork/branch and test locally that way? That's what I was thinking of, but have run out of time at least for today. That would need replicating the failing runner on GitHub Actions as well. But it might make it easier to rapidly get to the cause (there's only really one test runner that needs running) without changing develop?

@ckormanyos
Member

Might be easier... to import the failing test into multiprecision on your own fork/branch and test locally that way?

Something like that. I could even spin up a dedicated repo and a simplified CI for this purpose.

  • Are the sanitizer detections restricted to the file git_issue_1249.cpp only?

@ckormanyos
Member

Hmmmm... Thinking out loud. Why don't I see if I can reproduce the UBSAN issues locally?

I will report back.

@jzmaddock
Collaborator

Something like that. I could even spin up a dedicated repo and a simplified CI for this purpose.

Yes exactly

Are the sanitizer detections restricted to the file git_issue_1249.cpp only?

Yes.

Thinking out loud. Why don't I see if I can reproduce the UBSAN issues locally?

Even better if you can, I don't have a suitable Linux machine setup at present.

I did add a multitude of asserts for all those functions and couldn't detect anything with msvc or mingw though.

@jzmaddock jzmaddock closed this May 22, 2025
@jzmaddock jzmaddock reopened this May 22, 2025
@jzmaddock
Collaborator

Sorry for the noise, apparently closing and re-opening doesn't trigger a drone re-build :(

@jzmaddock
Collaborator

I think the one failure is a false positive, and if everyone agrees I'll disable that test under -fsanitize=integer.

I made a cut down test PR here: #1267 which pulls multiprecision from the integration_check_do_not_merge branch.

I added asserts to the affected code like this:

      BOOST_ASSERT(shift < sizeof(pr[0]) * CHAR_BIT);
      pr[rs - 1 - i] = pr[rs - 1 - i - offset] << shift;  // sanitizer still fails here

../../../boost/multiprecision/cpp_int/bitwise.hpp:443:48: runtime error: left shift of 16962843832447060 by 50 places cannot be represented in type 'unsigned long long'

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../../boost/multiprecision/cpp_int/bitwise.hpp:443:48 

And I think we were misreading this: it's complaining that the value stored in pr[rs - 1 - i - offset] cannot be left-shifted without loss of digits... well yes, that's what we want!! This is not UB either, despite what the message says.

It would be possible to suppress the error by masking off the operand to keep just the bits we're keeping post-shift, but that's silly and hamstrings the code unnecessarily.
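
For illustration, the two variants can be compared directly. In this sketch `shift_plain` and `shift_masked` are hypothetical names and a 64-bit operand is assumed. Both functions return the same value for every valid shift count; the masked one only clears, ahead of time, the bits the shift would discard anyway.

```cpp
#include <cassert>
#include <cstdint>

// Plain shift: for unsigned types, bits shifted past the top are discarded.
// Requires shift < 64.
inline std::uint64_t shift_plain(std::uint64_t v, unsigned shift)
{
    return v << shift;
}

// Masked variant: keeps only the low (64 - shift) bits before shifting, so no
// set bit is ever pushed off the top. Requires shift < 64.
inline std::uint64_t shift_masked(std::uint64_t v, unsigned shift)
{
    const std::uint64_t keep = ~std::uint64_t(0) >> shift;
    return (v & keep) << shift;
}
```

The extra mask and AND buy nothing numerically, which is the point made above about hamstringing the code.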

So basically, my conclusion is that we should just not run the unsigned-integer sanitizer on this code, as it's much too picky and we already have a UBSAN run anyway.

@ckormanyos
Member

ckormanyos commented May 22, 2025

Hi John @jzmaddock I've been busy in another office. And I could not actually do anything locally.

That all makes sense. I had also wanted to add that I have been running cpp_int signed and unsigned in fuzzing runs with random keys in nightly builds in another project for well over a year now and never encountered a wrong numerical answer. I run add, sub, mul, div, sqrt and prime for 15 minutes each in those non-Boost nightly runs. So I've got high confidence in cpp_int. This is versus my own int. So there have been a lot of calculations going down.

Cc: @mborland

@mborland
Member

So basically, my conclusion is that we should just not run the unsigned-integer sanitizer on this code, as it's much too picky and we already have a UBSAN run anyway.

I would agree with this since shifting bits off the left side of an unsigned integer is well defined (they are discarded).

@jzmaddock jzmaddock merged commit 036cf85 into boostorg:develop May 23, 2025
84 of 156 checks passed
Development

Successfully merging this pull request may close these issues.

Wrong result for lower incomplete gamma function and x = 0
4 participants