Conversation

@kumarUjjawal
Contributor

Which issue does this PR close?

Rationale for this change

`SELECT ROUND('999.9'::DECIMAL(4,1))` incorrectly returned `100.0` instead of `1000`.

When rounding a decimal value causes a carry-over that increases the number of digits (e.g., 999.9 → 1000.0), the result overflows the original precision constraint. The overflow was silently truncated during display, producing incorrect results.
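The carry-over can be reproduced on the raw scaled-integer representation of a decimal (a minimal sketch with illustrative helper names, not DataFusion's actual kernel):

```rust
// Minimal sketch of the carry-over, using the raw scaled-integer
// representation of a decimal (illustrative helpers, not DataFusion's kernel).
// A DECIMAL(p, s) value is stored as an integer scaled by 10^s,
// so 999.9 as DECIMAL(4, 1) is the raw integer 9999.

fn round_scaled(raw: i128, scale: u32, decimal_places: u32) -> i128 {
    // Half-up rounding to `decimal_places` (< scale): drop the extra
    // digits, then scale back so the stored scale is unchanged.
    let factor = 10i128.pow(scale - decimal_places);
    let half = factor / 2;
    ((raw + if raw >= 0 { half } else { -half }) / factor) * factor
}

fn digits(n: i128) -> usize {
    n.abs().to_string().len()
}

fn main() {
    let rounded = round_scaled(9999, 1, 0); // round 999.9 to 0 decimal places
    assert_eq!(rounded, 10000); // i.e. 1000.0 -- now five digits wide
    assert!(digits(rounded) > 4); // no longer fits precision 4
    println!("999.9 rounds to raw {rounded} ({} digits)", digits(rounded));
}
```

The rounded raw value needs five digits, so displaying it under the original `DECIMAL(4,1)` type silently drops the leading digit, producing `100.0`.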

What changes are included in this PR?

  • Increased the precision by 1 for all decimal types (capped at max precision)
  • Added and updated slt tests

Are these changes tested?

  • Added new sqllogictest case for the specific bug scenario
  • Updated 2 existing tests with new expected precision values
  • All existing tests pass

Are there any user-facing changes?

| Aspect | Before | After |
|---|---|---|
| `ROUND('999.9'::DECIMAL(4,1))` | `100.0` (wrong) | `1000` (correct) |
| Return type | `Decimal128(4, 1)` | `Decimal128(5, 1)` |

The return type precision is now increased by 1 to prevent overflow. This matches PostgreSQL behavior where ROUND can increase the number of digits before the decimal point when carry-over occurs.

Users who depend on exact output precision may need to explicitly cast:

```sql
SELECT CAST(ROUND(...) AS DECIMAL(p, s))
```
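The new return-type rule described above can be sketched as follows (the constant and function names here are illustrative, not DataFusion's actual identifiers): bump precision by one to absorb carry-over, capped at the maximum precision of `Decimal128`.

```rust
// Sketch of the assumed return-type rule: increase precision by one to
// absorb a rounding carry-over, but never past Decimal128's maximum of 38.
// Names are illustrative, not DataFusion's actual identifiers.
const DECIMAL128_MAX_PRECISION: u8 = 38;

fn round_return_precision(p: u8) -> u8 {
    p.saturating_add(1).min(DECIMAL128_MAX_PRECISION)
}

fn main() {
    assert_eq!(round_return_precision(4), 5);   // DECIMAL(4,1) -> DECIMAL(5,1)
    assert_eq!(round_return_precision(38), 38); // already at max: capped
    println!("ok");
}
```

The cap matters for the edge case discussed in review: when the input is already at precision 38, the output precision cannot grow, so overflow must be handled another way.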

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Jan 21, 2026
@kumarUjjawal
Contributor Author

Other options considered:

  1. Increase output precision by 1 (PostgreSQL behavior): The `return_type` should increase precision by 1 to accommodate potential carry-over from rounding. For `DECIMAL(p, s)` with `decimal_places=0`, return `DECIMAL(p+1, s)`.

  2. Return error on overflow: Keep the same precision but return an error when the rounded value exceeds the precision (current code uses `mul_checked`, but the overflow happens silently in display).

  3. Dynamic precision based on scale reduction: When reducing scale (e.g., from 1 to 0), calculate the minimum precision needed.

I chose option 1 as the most user-friendly; I would like to hear if there are other suggestions for this fix.

@Jefffrey
Contributor

> Other options considered:
>
> 1. Increase output precision by 1 (PostgreSQL behavior): The `return_type` should increase precision by 1 to accommodate potential carry-over from rounding. For `DECIMAL(p, s)` with `decimal_places=0`, return `DECIMAL(p+1, s)`.
>
> 2. Return error on overflow: Keep the same precision but return an error when the rounded value exceeds the precision (current code uses `mul_checked`, but the overflow happens silently in display).
>
> 3. Dynamic precision based on scale reduction: When reducing scale (e.g., from 1 to 0), calculate the minimum precision needed.
>
> I chose option 1 as the most user-friendly; I would like to hear if there are other suggestions for this fix.

Option 1 sounds good. Option 2 would be too strict; I'm not sure I fully understand what option 3 means here.

One thing to keep in mind is that we still need to consider the edge case where precision is already maxed out. Would this bug occur again?

@kumarUjjawal
Contributor Author

>> Other options considered:
>>
>> 1. Increase output precision by 1 (PostgreSQL behavior): The `return_type` should increase precision by 1 to accommodate potential carry-over from rounding. For `DECIMAL(p, s)` with `decimal_places=0`, return `DECIMAL(p+1, s)`.
>>
>> 2. Return error on overflow: Keep the same precision but return an error when the rounded value exceeds the precision (current code uses `mul_checked`, but the overflow happens silently in display).
>>
>> 3. Dynamic precision based on scale reduction: When reducing scale (e.g., from 1 to 0), calculate the minimum precision needed.
>>
>> I chose option 1 as the most user-friendly; I would like to hear if there are other suggestions for this fix.
>
> Option 1 sounds good. Option 2 would be too strict; I'm not sure I fully understand what option 3 means here.
>
> One thing to keep in mind is that we still need to consider the edge case where precision is already maxed out. Would this bug occur again?

You were right to point that out. I did some more testing and found it was overflowing after 38 digits; I handled that as an error.


```rust
// Validate the result fits within the precision
// The max value for a given precision is 10^precision - 1
let max_value = ten.pow_checked(precision as u32).map_err(|_| {
```
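The idea in the snippet above can be sketched in a self-contained form (illustrative names; the real code uses Arrow's `pow_checked` on a decimal-typed `ten`): the largest raw value representable with precision `p` is 10^p - 1, and 10^39 itself overflows `i128`, which is why `Decimal128` precision tops out at 38.

```rust
// Sketch of the overflow validation (illustrative names, not the actual
// DataFusion code): a value fits DECIMAL(p, s) only if its raw magnitude
// is at most 10^p - 1. `checked_pow` catches 10^39, which overflows i128.

fn check_fits(raw: i128, precision: u32) -> Result<i128, String> {
    let max_value = 10i128
        .checked_pow(precision)
        .ok_or_else(|| format!("10^{precision} overflows i128"))?
        - 1;
    if raw.abs() > max_value {
        Err(format!("value {raw} exceeds precision {precision}"))
    } else {
        Ok(raw)
    }
}

fn main() {
    assert!(check_fits(9999, 4).is_ok());   // 999.9 fits DECIMAL(4, 1)
    assert!(check_fits(10000, 4).is_err()); // carry-over past precision 4
    assert!(check_fits(1, 39).is_err());    // 10^39 cannot be computed in i128
    println!("ok");
}
```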
```
round('1234.5678'::decimal(50,4), 2);
----
Decimal256(50, 4) 1234.57
Decimal256(51, 4) 1234.57
```
@martin-g (Member), Jan 22, 2026:

I was going to ask whether the increase of the precision could be done only when it is really needed, so I went and checked what the other DBs do:

1. Postgres v18

```
postgres=# SELECT pg_typeof(round(999.9::DECIMAL(4,1))), round(999.9::DECIMAL(4,1));
 pg_typeof | round
-----------+-------
 numeric   |  1000
(1 row)
```

(Not helpful)

2. Apache Spark (v4.0.1)

```
spark-sql (default)> SELECT typeof(round(999.9::DECIMAL(4,1))), round(999.9::DECIMAL(4,1));
decimal(4,0)    1000
Time taken: 0.032 seconds, Fetched 1 row(s)
```

Reduced the scale!

3. DuckDB

```
SELECT typeof(round(999.9::DECIMAL(4,1))), round(999.9::DECIMAL(4,1));
┌────────────────────────────────────────────┬────────────────────────────────────┐
│ typeof(round(CAST(999.9 AS DECIMAL(4,1)))) │ round(CAST(999.9 AS DECIMAL(4,1))) │
│                  varchar                   │            decimal(4,0)            │
├────────────────────────────────────────────┼────────────────────────────────────┤
│ DECIMAL(4,0)                               │                1000                │
└────────────────────────────────────────────┴────────────────────────────────────┘
```

Reduced the scale!

@kumarUjjawal (Contributor, Author):

This is really nice, thank you.

@kumarUjjawal (Contributor, Author):

| Case | Input | Result | Type |
|---|---|---|---|
| dp column | 123.4567, dp=1 | 123.5000 | DECIMAL(10,4) |
| carry-over | 999.9999, dp=0 | 1000.0000 | DECIMAL(10,4) |
| constant dp | 999.9 | 1000 | DECIMAL(4,0) |
| scale reduction | 1234.5678, dp=2 | 1234.57 | DECIMAL(50,2) |
| negative dp | 12345.67, dp=-2 | 12300 | DECIMAL(7,0) |
| dp > scale | 123.4, dp=5 | 123.4 | DECIMAL(4,1) |
| negative values | -999.9, dp=0 | -1000 | DECIMAL(4,0) |

```rust
// Get decimal_places from scalar_arguments
// If dp is not a constant scalar, we must keep the original scale because
// we can't determine a single output scale for varying per-row dp values.
let (decimal_places, dp_is_scalar): (i32, bool) =
```
Review comment (Member):

Here `decimal_places` is initialized as `i32` but below it is cast to `i8` with `as i8`, which may lead to problems. A validation is needed that it is not bigger than `i8::MAX`.
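The suggested validation can be sketched with `i8::try_from` (a hypothetical helper name, not the PR's actual code), which fails instead of silently truncating the way `as i8` does:

```rust
// Sketch of the suggested safe narrowing (hypothetical helper): reject
// out-of-range decimal_places instead of truncating with `as i8`.
fn to_i8_checked(decimal_places: i32) -> Result<i8, String> {
    i8::try_from(decimal_places)
        .map_err(|_| format!("decimal_places {decimal_places} out of i8 range"))
}

fn main() {
    assert_eq!(to_i8_checked(2), Ok(2));
    assert!(to_i8_checked(200).is_err());  // `200 as i8` would silently become -56
    assert!(to_i8_checked(-200).is_err());
    println!("ok");
}
```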

```rust
Ok(Arc::new(Field::new(
    self.name(),
    return_type,
    input_field.is_nullable(),
```
Review comment (Member):

This should also take into account the decimal_places arg.

Postgres:

```
postgres=# SELECT pg_typeof(round(999.9::DECIMAL(4,1))), round(999.9::DECIMAL(4,1), NULL);
 pg_typeof | round
-----------+-------
 numeric   |
(1 row)
```

Apache Spark:

```
spark-sql (default)> SELECT typeof(round(999.9::DECIMAL(4,1))), round(999.9::DECIMAL(4,1), NULL);
decimal(4,0)    NULL
Time taken: 0.055 seconds, Fetched 1 row(s)
```

DuckDB:

```
D SELECT typeof(round(999.9::DECIMAL(4,1))), round(999.9::DECIMAL(4,1), NULL);
┌────────────────────────────────────────────┬──────────────────────────────────────────┐
│ typeof(round(CAST(999.9 AS DECIMAL(4,1)))) │ round(CAST(999.9 AS DECIMAL(4,1)), NULL) │
│                  varchar                   │                  int32                   │
├────────────────────────────────────────────┼──────────────────────────────────────────┤
│ DECIMAL(4,0)                               │                   NULL                   │
└────────────────────────────────────────────┴──────────────────────────────────────────┘
```

```rust
if args.scalar_arguments.len() > 1 {
    match args.scalar_arguments[1] {
        Some(ScalarValue::Int32(Some(v))) => (*v, true),
        Some(ScalarValue::Int64(Some(v))) => (*v as i32, true),
```
Review comment (Member):

Better to use a safe cast and return an `Err` if the number is bigger than `i32::MAX` (or even `i8::MAX`).
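For the `Int64` arm, the suggested safe narrowing would look like this (a hypothetical helper; the real code destructures `ScalarValue` directly). Note that `as i32` wraps silently for out-of-range values:

```rust
// Sketch of the suggested safe narrowing for the Int64 arm (hypothetical
// helper name): `i32::try_from` instead of `as i32`, which wraps silently.
fn dp_from_i64(v: i64) -> Result<i32, String> {
    i32::try_from(v).map_err(|_| format!("decimal_places {v} out of i32 range"))
}

fn main() {
    assert_eq!(dp_from_i64(3), Ok(3));
    assert!(dp_from_i64(i64::from(i32::MAX) + 1).is_err());
    // By contrast, an `as` cast wraps: i32::MAX + 1 becomes i32::MIN.
    assert_eq!((i64::from(i32::MAX) + 1) as i32, i32::MIN);
    println!("ok");
}
```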

```rust
decimal_places: Option<ArrayRef>,
) -> Result<ArrayRef, DataFusionError> {
    let number_rows = value.len();
    let return_type = value.data_type().clone();
```
@martin-g (Member), Jan 23, 2026:

This could be wrong for decimals whose scale is reduced. No need to fix/improve it; just a comment acknowledging that is enough.



Development

Successfully merging this pull request may close these issues.

round decimal incorrect result

3 participants