gccrs: Fix 128-bit non-decimal integer literal saturation #4454

Open
nsvke wants to merge 1 commit into Rust-GCC:master from nsvke:fix-non-decimal-128-saturation

nsvke commented Feb 28, 2026

Description

This PR fixes a bug where large non-decimal integer literals (such as 128-bit hex, binary, and octal values) were saturated at the 64-bit limit (LONG_MAX) during the lexing phase.
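
For illustration, the saturation comes from how std::strtol reports overflow: it returns LONG_MAX and sets errno to ERANGE rather than producing a wider value. A minimal sketch, assuming a platform where long is at most 64 bits; the helper name parse_hex_with_strtol is hypothetical and is not taken from the gccrs sources:

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper (not gccrs code): convert a string of hex digits
// the way a strtol-based lexer would.
// When the value does not fit in a long, strtol returns LONG_MAX and
// sets errno to ERANGE, so every too-large literal collapses to the
// same number.
long
parse_hex_with_strtol (const char *digits)
{
  errno = 0;
  return std::strtol (digits, nullptr, 16);
}
```

Passing "10000000000000000" (that is, $2^{64}$ in hex) yields LONG_MAX, which is the behaviour described in the linked issue.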

Key Changes:

  • Replaced the usage of std::strtol in Lexer::parse_non_decimal_int_literal with GNU MP (GMP) for arbitrary-precision base conversion.
  • Corrected the hex dispatcher in Lexer::parse_non_decimal_int_literals to pass the raw string without prepending the "0x" prefix, ensuring compatibility with mpz_set_str.
  • Added a regression test (non_decimal_128_saturation.rs) to verify that values of $2^{64}$ and above are lexed without saturation or truncation.
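
The idea behind the first two changes can be sketched without GMP. The block below is a stand-in, not the patch itself: it accumulates digits into an unsigned __int128 (a GCC/Clang extension) instead of calling mpz_set_str, and the helper name parse_u128 is hypothetical. It also shows why a digits-only converter needs the raw string: the 'x' of a "0x" prefix is not a valid base-16 digit.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the GMP-based conversion (not gccrs code).
// Accumulates the digits of `digits` in `base` into a 128-bit value.
// Returns std::nullopt for an empty string, an invalid digit, or a
// value that does not fit in 128 bits.
std::optional<unsigned __int128>
parse_u128 (const std::string &digits, unsigned base)
{
  assert (base == 2 || base == 8 || base == 16);
  if (digits.empty ())
    return std::nullopt;

  const unsigned __int128 max = ~static_cast<unsigned __int128> (0);
  unsigned __int128 value = 0;
  for (char c : digits)
    {
      unsigned d;
      if (c >= '0' && c <= '9')
        d = c - '0';
      else if (c >= 'a' && c <= 'f')
        d = c - 'a' + 10;
      else if (c >= 'A' && c <= 'F')
        d = c - 'A' + 10;
      else
        return std::nullopt; // e.g. the 'x' of a "0x" prefix

      if (d >= base)
        return std::nullopt; // digit not valid in this base

      // value * base + d <= max  <=>  value <= (max - d) / base
      if (value > (max - d) / base)
        return std::nullopt; // would overflow 128 bits

      value = value * base + d;
    }
  return value;
}
```

With this sketch, parse_u128 ("10000000000000000", 16) yields a value whose upper 64 bits equal 1, while parse_u128 ("0x10", 16) is rejected because of the prefix.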

gcc/rust/ChangeLog:

* lex/rust-lex.cc (Lexer::parse_non_decimal_int_literal): Use GMP for base conversion to support 128-bit literals.
(Lexer::parse_non_decimal_int_literals): Fix hex prefix inconsistency by passing pure string.

gcc/testsuite/ChangeLog:

* rust/execute/non_decimal_128_saturation.rs: New test.

Closes #4453

nsvke force-pushed the fix-non-decimal-128-saturation branch from 62acb63 to 927d24e on February 28, 2026 18:52

  if (current_char == 'x')
    {
      // hex (integer only)
      return parse_non_decimal_int_literal (loc, is_x_digit, str + "x", 16);
Collaborator

I'm not sure what this was doing, since it's appending an x

powerboat9 (Collaborator) left a comment

Just need to fix the endian-dependent behavior in the test
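
The earlier revision of the test is not shown in this thread, so the following is only a general illustration of what endian-dependent access to a 128-bit value looks like, not a reconstruction of the original test. Both helper names are hypothetical. Extracting the high half with a shift gives the same answer on any byte order, whereas reading the first eight bytes from memory gives the low half on little-endian targets and the high half on big-endian ones.

```cpp
#include <cstdint>
#include <cstring>

// High 64 bits obtained arithmetically: independent of byte order.
std::uint64_t
high_by_shift (unsigned __int128 v)
{
  return static_cast<std::uint64_t> (v >> 64);
}

// First 8 bytes as laid out in memory: this is the low half on a
// little-endian target and the high half on a big-endian target.
std::uint64_t
first_word_in_memory (unsigned __int128 v)
{
  std::uint64_t w;
  std::memcpy (&w, &v, sizeof w);
  return w;
}
```

A test that only uses shifts and casts on the value, as in high_by_shift, does not depend on the target's byte order.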

nsvke (Author) commented Feb 28, 2026

Thanks for the feedback! I'll update the test shortly.

This patch replaces std::strtol with GNU MP (GMP) for arbitrary-precision
parsing. Additionally, the hex literal dispatcher was corrected to
avoid prepending "0x" to the internal string, ensuring compatibility
with mpz_set_str.

gcc/rust/ChangeLog:

	* lex/rust-lex.cc (Lexer::parse_non_decimal_int_literal): Use GMP
	for base conversion to support 128-bit literals.
	(Lexer::parse_non_decimal_int_literals): Fix hex prefix inconsistency
	by passing pure string.

gcc/testsuite/ChangeLog:

	* rust/execute/non_decimal_128_saturation.rs: New test.

Signed-off-by: Enes Cevik <nsvke@proton.me>

nsvke force-pushed the fix-non-decimal-128-saturation branch from 927d24e to cd32199 on March 1, 2026 08:14
nsvke requested a review from powerboat9 on March 1, 2026 08:23

    unsafe {
        let hex_val: u128 = 0x10000000000000000_u128;
        if (hex_val >> 64) as u8 != 1 {
            abort();
Member

I would change this to return 1 and check the exit status using DejaGnu, to avoid relying on the extern ABI and to keep the test somewhat minimal.

Successfully merging this pull request may close these issues.

Lexer: Non-decimal integer literals (hex, bin, oct) are saturated at 64-bit limits
