Skip to content

Add RoundMode to round and fix rounding inconsistency (Decimal uses "half_even", float uses "up") #21800

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Julian-J-S opened this issue Mar 17, 2025 · 15 comments · May be fixed by #22248
Open
Labels
enhancement New feature or an improvement of an existing feature

Comments

@Julian-J-S
Copy link
Contributor

Julian-J-S commented Mar 17, 2025

Description

Problem (edited) ⚠

current round expression has no mode and is inconsistent

  • Decimal: uses "half even"
  • Float: uses "half away from zero"

Why

Depending on the industry (financial, pricing, ...) it is critical to use the "correct" rounding mode.
The two most common are

  • half even / bankers rounding
  • half away from zero / up (e.g. excel)

There are multiple others that can be added in the future.

@Julian-J-S Julian-J-S added the enhancement New feature or an improvement of an existing feature label Mar 17, 2025
@orlp
Copy link
Member

orlp commented Mar 17, 2025

In think in general our round implementation needs work, because we don't always consistently round half-to-even:

>>> s = pl.Series([0.5, 1.5, 2.5, 3.5])
>>> s.round()
shape: (4,)
Series: '' [f64]
[
        1.0
        2.0
        3.0
        4.0
]
>>> s.cast(pl.Decimal(scale=1)).round()
shape: (4,)
Series: '' [decimal[*,1]]
[
        0.0
        2.0
        2.0
        4.0
]

I would in theory be ok with adding a mode option to Expr.round and co with the following parameters:

(names got redesigned a bit, see below)

  • "half_even" (default),
  • "half_up", (towards +inf)
  • "half_down", (towards -inf)
  • "half_to_zero" (towards 0)
  • "half_to_inf" (away from 0)
  • "half_stochastic" (up or down at random)

as well as each mode without the half_ such that it always applies rather than as a tiebreaker (where stochastic rounds up with probability frac(x)).

However, the question is how much work it would be to implement all of these 12 rounding modes correctly...

@Julian-J-S
Copy link
Contributor Author

Great idea!

I think we can go one step at a time and don't need all at once.
Maybe start with the 2 most used ones and add others in the future 🤓

@Julian-J-S
Copy link
Contributor Author

@orlp sorry for being a little fast with the implementation. You are right, let uns talk about the details here first 😉

RoundModes & Naming

From what I see in other libraries and languages it seems like the following are common

Half-Versions

Rounds to the nearest value with a defined "tie-breaker"

  • "half_even": clear
  • "half_up": away from zero
  • "half_down" towards zero

Non-half-versions

Rounds in the defined "direction"

  • "up": away from zero
  • "down": towards zero
  • "ceiling": towards +inf
  • "floor": towards -inf

Proposal

Currently "Decimal" uses "half_even" and "float" uses "half_up" (inconsistent)
As a first step I suggest to start small and implement those 2 modes for Decimal & Float.
These are the most common modes and probably cover 99% of use-cases.

If there is time and demand for more we can add that later.

@orlp
Copy link
Member

orlp commented Mar 21, 2025

It seems there isn't a clear consensus, for example Julia uses up/down to indicate ceil/floor(similar to my instinct) and uses to_zero and from_zero. I think it's probably best if we avoid up and down entirely.

@Julian-J-S
Copy link
Contributor Author

okay, I get that.
I think the more concise we want to be, the longer the name. 🤔
But I am actually pretty fine with that because we get auto-complete on those modes 😃

How about:

  • half_to_even: to even
  • half_to_zero: to 0
  • half_away_from_zero / half_to_inf: to +- inf

and

  • away_from_zero / to_inf: "up"
  • to_zero: "down"
  • ceiling
  • floor

@orlp
Copy link
Member

orlp commented Mar 21, 2025

To reiterate, (eventually) I'd want every half_ variant to also have a non-half_ variant and vice versa. I guess rounding with non-standard modes is fairly rare, so I guess having a bit longer names is alright. Thus I'd suggest the following names (and their half_ cousins):

  • to_even, to nearest even (default),
  • to_zero, to 0,
  • away_from_zero, away from 0 to +inf for positive, -inf for negative,
  • ceil, to +inf,
  • floor, to -inf,
  • stochastic, as explained earlier.

Note: not ceiling, we already have pl.Expr.ceil so we should stick with that wording.

@Julian-J-S
Copy link
Contributor Author

Julian-J-S commented Mar 21, 2025

I am unsure if to_even makes sense without the "half" version?

Without the "half" (tie breaker) version we usually have a SINGLE direction (round always to either a higher or lower value).

Problem:
How to round the number "3"?
We would need a tie breaker again because "3" is exactly in the middle of "2" and "4".

One option would be to keep odd numbers with remainder of zero but I would expect to end up with only even numbers.

@orlp
Copy link
Member

orlp commented Mar 21, 2025

@Julian-J-S The number 3 would round to itself because it's already an integer.

One option would be to keep odd numbers with remainder of zero but I would expect to end up with only even numbers.

A bit silly, but would you expect rounding with to_zero to only end up with the zero number? Or would you expect rounding 3 with to_zero to result in 2?

I understand that to_even feels a bit weird because it has 'unstable' points which round differently with arbitrarily small epsilons in either direction, but that's not a fundamental problem.

@Julian-J-S
Copy link
Contributor Author

imo it could be a little confusing. But on the other hand this is a very niche use-case I assume so people probably know what they are doing 😆 .

But okay, I think now I fully understand the idea:

  • if the value is already an int: do nothing
  • otherwise: round is the given "direction"

Somehow when reading to_even I had in mind this would always return an even number.

@orlp
So I would start now with the two most common ones:

  • half_to_even
    • we already use this for Decimal
    • needs implementation for f62 / f32 🔧
  • half_away_from_zero
    • we already use this for f62 / f32
    • needs implementation for Decimal 🔧

I will also set default for both to be half_to_even? 🔧

@eitsupi
Copy link
Contributor

eitsupi commented Mar 22, 2025

Please check #17798

@Julian-J-S Julian-J-S changed the title Add rounding strategy "Half to infinity" (or "Half Up") Add RoundMode to round and fix rounding inconsistency (Decimal uses "half_even", float uses "up") Mar 22, 2025
@Julian-J-S
Copy link
Contributor Author

Please check #17798

@eitsupi what should I check?

The linked issue only mentions to add rounding "half to even" (bankers rounding).
We already have that for the Decimal type.
Unfortunately floats use "half away from zero".
We want to add different modes to let the user decide and also fix the inconsistencies between the types 😉

@eitsupi
Copy link
Contributor

eitsupi commented Mar 22, 2025

So my point is that this issue is essentially a duplication of #17798.

@Julian-J-S
Copy link
Contributor Author

So my point is that this issue is essentially a duplication of #17798.

no, it is not 😉

This here is much more fixing the "round problem" on a higher level. The other one has wrong assumptions and problematic ideas.

The other issue says:

  • we need to add round "half to even" because we do not have it: this is partly incorrect; we have that for Decimal type. We are missing that on float. It is the inconsistent rounding that is the problem
  • we should add a to_even bool parameter to round: this is also not a good idea. There are many round modes; we cannot simplify this to a bool parameter.

@orlp
Copy link
Member

orlp commented Mar 22, 2025

@orlp So I would start now with the two most common ones:

* `half_to_even`
  
  * we already use this for `Decimal` ✅
  * needs implementation for `f62` / `f32` 🔧

* `half_away_from_zero`
  
  * we already use this for `f62` / `f32` ✅
  * needs implementation for `Decimal` 🔧

I will also set default for both to be half_to_even? 🔧

Yes, that would be great :)

@eitsupi
Copy link
Contributor

eitsupi commented Mar 22, 2025

  • we need to add round "half to even" because we do not have it: this is partly incorrect; we have that for Decimal type. We are missing that on float.

Just to point out for the record, I believe round support was added to the Decimal type in Polars after that issue was created.

It is the inconsistent rounding that is the problem

Half up rounding in general is feasible if adding, multiplication and ceiling can be used, and I see no reason why the round function should not do round ties even by default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or an improvement of an existing feature
Projects
None yet
3 participants