Skip to content

Conversation

@kumarUjjawal
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

  • widen the variance UDF signature to allow decimal arguments while
    keeping Float64 outputs, and add unit coverage that feeds decimal
    batches through both population and sample accumulators
  • extend aggregate.slt with decimal variance queries and align the
    expected strings with the Float64 results emitted by the engine

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Nov 26, 2025
Copy link
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the changes are doing what you expect here; I believe decimals are still being converted to float for these calculations. What we are looking to do is to perform the calculations natively on decimals without casting them to floats. The function already supports decimal inputs by casting them to floats.

@kumarUjjawal
Copy link
Contributor Author

I don't think the changes are doing what you expect here; I believe decimals are still being converted to float for these calculations. What we are looking to do is to perform the calculations natively on decimals without casting them to floats. The function already supports decimal inputs by casting them to floats.

Thanks for pointing that. I just checked out the PR #1408 for the guidance and updated the code. I also used LLM for this.(disclosure)

@Jefffrey
Copy link
Contributor

One of the key requirements that we want here is that decimal inputs are not cast to floats, and the return type remains decimal. I think if you check the tests introduced here you'll see that is not the case currently, hence the code for native decimal implementation isn't actually being used.

let decimal_array = builder.finish().with_precision_and_scale(10, 3).unwrap();
let array: ArrayRef = Arc::new(decimal_array);

let mut pop_acc = VarianceAccumulator::try_new(StatsType::Population)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this use DecimalVarianceAccumulator ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it should, I made a mistake, there are a few more I will fix it soon.

pop_acc.update_batch(&pop_input)?;
assert_variance(pop_acc.evaluate()?, 11025.9450285);

let mut sample_acc = VarianceAccumulator::try_new(StatsType::Sample)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too

T: DecimalType + ArrowNumericType + Debug,
T::Native: DecimalNative,
{
fn new(scale: i8, stats_type: StatsType) -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why here the scale is not checked (scale > DECIMAL256_MAX_SCALE) ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

functions Changes to functions implementation sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants