Skip to content

Conversation

@martin-augment
Copy link
Owner

19926: To review by AI

@coderabbitai
Copy link

coderabbitai bot commented Jan 22, 2026

Walkthrough

The changes modify the decimal rounding implementation to include precision-aware validation. The round function is updated to increase decimal precision by 1 after rounding to accommodate potential carry, with validation applied across all decimal types (Decimal32, Decimal64, Decimal128, Decimal256). Dedicated validation helper functions are introduced for each decimal type. The return type logic is extended to map decimal inputs to higher-precision types. Both scalar and columnar execution paths are updated to apply this increased precision and validation. Test expectations are updated to reflect the new precision values, and new test cases are added for carry-over rounding scenarios and overflow handling.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link

Summary of Changes

Hello @martin-augment, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where the ROUND function in DataFusion could lead to overflow truncation when applied to decimal types. The changes ensure that the precision of decimal types is increased during rounding to accommodate potential carry-over, and validation is performed to prevent overflow. Updated SQL logic tests confirm the fix and add a new test case for overflow handling.

Highlights

  • Decimal Precision Handling: The return_type function now increases the precision of decimal types by 1 when rounding to accommodate potential carry-over, preventing overflow truncation. The maximum precision is capped at the maximum allowed value for each decimal type (32, 64, 128, 256).
  • Validation of Decimal Precision: New functions validate_decimal32_precision, validate_decimal64_precision, validate_decimal128_precision, and validate_decimal256_precision are added to ensure that the rounded decimal values fit within the specified precision, returning an ArrowError if overflow occurs.
  • Columnar Rounding: The round_columnar function now uses the increased precision and validates the results of rounding operations on decimal columns to prevent overflow.
  • SQL Logic Tests: Updated SQL logic tests to verify the fix, including a test case that previously returned an incorrect value due to precision overflow and a test case that checks for the expected overflow error when rounding at maximum precision.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@claude
Copy link

claude bot commented Jan 22, 2026

PR Review: Fix ROUND decimal precision overflow

Summary

This PR addresses issue apache#19926 by increasing the precision of decimal values returned from the ROUND function to prevent overflow truncation. The fix handles cases where rounding can cause a carry-over that requires an additional digit (e.g., 999.9 → 1000.0).


✅ Code Quality & Best Practices

Strengths:

  • Clear documentation: The inline comments in return_type() (round.rs:128-131) explain the rationale well and reference PostgreSQL behavior
  • Consistent pattern: The fix is applied uniformly across all four decimal types (Decimal32, Decimal64, Decimal128, Decimal256)
  • Appropriate use of Arrow constants: Uses Arrow's predefined MAX_DECIMAL*_FOR_EACH_PRECISION and MIN_DECIMAL*_FOR_EACH_PRECISION arrays for validation
  • DRY principle: Validation logic is properly extracted into separate functions (validate_decimal32_precision, etc.)

Suggestions:

  1. Code duplication in validation functions: The four validate_decimal*_precision functions (round.rs:386-431) are nearly identical except for their type signatures. Consider using a trait-based approach or macro to reduce duplication if Arrow provides appropriate type bounds.

  2. Missing documentation on validation functions: The validation functions lack doc comments. Consider adding:

    /// Validates that a Decimal32 value fits within the specified precision.
    /// Returns an error if the value exceeds the maximum or minimum for the precision.
    

🐛 Potential Bugs & Issues

Critical - Bounds checking:
The validation functions use array indexing without bounds checking:

let max = MAX_DECIMAL32_FOR_EACH_PRECISION[precision as usize];

While this should be safe because new_precision is clamped to DECIMAL32_MAX_PRECISION (which is 10 for Decimal32), there's a theoretical risk if:

  • The Arrow constants array is shorter than expected
  • There's a mismatch between the max precision constant and array length

Recommendation: Add a debug assertion or explicit bounds check:

debug_assert!(
    (precision as usize) < MAX_DECIMAL32_FOR_EACH_PRECISION.len(),
    "Precision {} exceeds array bounds",
    precision
);

Minor - Potential information loss:
When precision is already at max (e.g., 38 for Decimal128), the PR correctly caps it at the maximum rather than increasing:

let new_precision = (*precision + 1).min(DECIMAL128_MAX_PRECISION);

However, this means values at max precision that would overflow are only caught at runtime via validation. The test case correctly verifies this produces an error (scalar.slt:975-976), which is good defensive behavior.


⚡ Performance Considerations

Positive:

  • Validation is performed inline during the rounding operation, avoiding a second pass through the data
  • The validation logic is simple (two comparisons), so overhead should be minimal

Potential concern:

  • The precision increase happens for ALL decimal rounding operations, even when the value won't overflow (e.g., rounding 1.5 doesn't need extra precision)
  • This means all consumers of the rounded result will see increased precision metadata, which could affect downstream operations

Analysis: This is likely acceptable because:

  1. The precision metadata increase is small (just +1)
  2. It matches PostgreSQL behavior (as noted in comments)
  3. The alternative (dynamic precision calculation) would be more complex and potentially slower

🔒 Security Concerns

No significant security issues identified.

The validation functions properly prevent:

  • Integer overflow (caught by Arrow's checked math operations in round_decimal)
  • Out-of-range decimal values (caught by the new validation logic)

The error messages don't leak sensitive information and are appropriate for debugging.


🧪 Test Coverage

Well covered:

  • ✅ Basic carry-over case: round('999.9'::decimal(4,1)) → 1000 (scalar.slt:964-968)
  • ✅ Negative case: round('-999.9'::decimal(4,1)) → -1000
  • ✅ Max precision overflow: round('9999...9.9'::decimal(38,1)) returns error (scalar.slt:971-972)
  • ✅ Updated existing test expectations to reflect new precision (decimal.slt:785)

Potential gaps:

  1. No explicit tests for Decimal32, Decimal64, or Decimal256 - all new tests use Decimal128
  2. No tests for columnar (array) operations - only scalar values tested in the new test cases
  3. No tests for edge cases like:
    • Values at exactly max precision that don't overflow (e.g., 500.0 at max precision)
    • Rounding with negative decimal_places (e.g., round(999.9, -1))
    • Very large scale values

Recommendation: Add at least one test for columnar operations to ensure the validation logic works correctly in round_columnar code path.


📝 Overall Assessment

Verdict: Approve with minor suggestions

This is a well-implemented fix that:

  • ✅ Addresses the core issue correctly
  • ✅ Follows PostgreSQL behavior (industry standard)
  • ✅ Includes appropriate error handling
  • ✅ Has good test coverage for the main use case

Before merging:

  1. Consider adding bounds checking assertions in validation functions
  2. Consider adding a columnar test case
  3. Consider adding documentation to validation functions

The code is production-ready, and the suggestions are for future maintainability rather than correctness concerns.


Nice work! The fix properly handles decimal precision overflow while maintaining consistency with established database semantics. 🎉

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly addresses the decimal overflow issue in the ROUND function by increasing the precision. The logic seems sound, and the added tests cover the fix and edge cases well. My main feedback is on improving code maintainability by reducing duplication. There are several places where almost identical code is used for Decimal32, Decimal64, Decimal128, and Decimal256. Using macros would make this code much cleaner and easier to maintain.

Comment on lines +384 to +431
/// Validate that an i32 (Decimal32) value fits within the specified precision.
/// Uses Arrow's pre-defined MAX/MIN_DECIMAL32_FOR_EACH_PRECISION constants.
fn validate_decimal32_precision(value: i32, precision: u8) -> Result<i32, ArrowError> {
let max = MAX_DECIMAL32_FOR_EACH_PRECISION[precision as usize];
let min = MIN_DECIMAL32_FOR_EACH_PRECISION[precision as usize];
if value > max || value < min {
return Err(ArrowError::ComputeError(format!(
"Decimal overflow: rounded value exceeds precision {precision}"
)));
}
Ok(value)
}

fn validate_decimal64_precision(value: i64, precision: u8) -> Result<i64, ArrowError> {
let max = MAX_DECIMAL64_FOR_EACH_PRECISION[precision as usize];
let min = MIN_DECIMAL64_FOR_EACH_PRECISION[precision as usize];
if value > max || value < min {
return Err(ArrowError::ComputeError(format!(
"Decimal overflow: rounded value exceeds precision {precision}"
)));
}
Ok(value)
}

fn validate_decimal128_precision(value: i128, precision: u8) -> Result<i128, ArrowError> {
let max = MAX_DECIMAL128_FOR_EACH_PRECISION[precision as usize];
let min = MIN_DECIMAL128_FOR_EACH_PRECISION[precision as usize];
if value > max || value < min {
return Err(ArrowError::ComputeError(format!(
"Decimal overflow: rounded value exceeds precision {precision}"
)));
}
Ok(value)
}

fn validate_decimal256_precision(
value: arrow::datatypes::i256,
precision: u8,
) -> Result<arrow::datatypes::i256, ArrowError> {
let max = MAX_DECIMAL256_FOR_EACH_PRECISION[precision as usize];
let min = MIN_DECIMAL256_FOR_EACH_PRECISION[precision as usize];
if value > max || value < min {
return Err(ArrowError::ComputeError(format!(
"Decimal overflow: rounded value exceeds precision {precision}"
)));
}
Ok(value)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

These validate_decimal..._precision functions are identical apart from the types. This code duplication can be eliminated by using a macro, which would make the code more maintainable.

macro_rules! define_validate_decimal_precision {
    ($name:ident, $ty:ty, $max_const:ident, $min_const:ident) => {
        fn $name(value: $ty, precision: u8) -> Result<$ty, ArrowError> {
            let max = $max_const[precision as usize];
            let min = $min_const[precision as usize];
            if value > max || value < min {
                return Err(ArrowError::ComputeError(format!(
                    "Decimal overflow: rounded value exceeds precision {precision}"
                )));
            }
            Ok(value)
        }
    };
}

define_validate_decimal_precision!(
    validate_decimal32_precision,
    i32,
    MAX_DECIMAL32_FOR_EACH_PRECISION,
    MIN_DECIMAL32_FOR_EACH_PRECISION
);

define_validate_decimal_precision!(
    validate_decimal64_precision,
    i64,
    MAX_DECIMAL64_FOR_EACH_PRECISION,
    MIN_DECIMAL64_FOR_EACH_PRECISION
);

define_validate_decimal_precision!(
    validate_decimal128_precision,
    i128,
    MAX_DECIMAL128_FOR_EACH_PRECISION,
    MIN_DECIMAL128_FOR_EACH_PRECISION
);

define_validate_decimal_precision!(
    validate_decimal256_precision,
    arrow::datatypes::i256,
    MAX_DECIMAL256_FOR_EACH_PRECISION,
    MIN_DECIMAL256_FOR_EACH_PRECISION
);

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:good-to-have; category:bug; feedback:The Gemini AI reviewer is correct! By using a macro the code duplication could be reduced a lot. Prevents higher maintenance cost in the future for all the duplicated code.

@augmentcode
Copy link

augmentcode bot commented Jan 22, 2026

🤖 Augment PR Summary

Summary: This PR adjusts the behavior of the SQL round function for decimal types to avoid precision overflow/truncation when rounding causes a carry into a new digit.

Changes:

  • Updates RoundFunc::return_type to increase decimal precision by 1 (capped at Arrow’s max precision) for Decimal32/64/128/256.
  • Aligns scalar and columnar execution paths with the new precision by constructing results using the increased precision.
  • Adds explicit post-rounding precision validation using Arrow’s per-precision MAX/MIN constants to surface overflows as compute errors.
  • Updates sqllogictest expectations where the output decimal type precision now increases by 1.
  • Adds regression tests for carry-over rounding (e.g. 999.91000.0) and for overflow at max precision.

Technical Notes: The precision bump is intended to match behavior seen in systems like PostgreSQL where rounding can increase digits before the decimal point; values that still exceed the (possibly capped) precision now error rather than silently truncating.

🤖 Was this summary useful? React with 👍 or 👎

Copy link

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. No suggestions at this time.

Comment augment review to trigger a new review at any time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants