Contributor

@devozerov devozerov commented Nov 29, 2024

Description

Connector objects may contain columns that hold only NaN values. When these values are passed to the DoubleRange constructor, it throws an exception. Depending on the context, this either causes a silent fallback to empty stats (e.g., SHOW STATS gives the user misleading values) or causes exceptions within the optimizer.

This PR fixes the problem by normalizing ranges that contain NaN to an empty range.
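The normalization idea can be illustrated with a minimal sketch. This is not the code from this PR: SimpleDoubleRange and RangeNormalizer are hypothetical stand-ins, and the sketch assumes that an "empty range" is represented as an absent Optional value.

```java
import java.util.Optional;

// Hypothetical stand-in for a range type whose constructor rejects NaN bounds.
final class SimpleDoubleRange {
    final double min;
    final double max;

    SimpleDoubleRange(double min, double max) {
        if (Double.isNaN(min) || Double.isNaN(max)) {
            throw new IllegalArgumentException("range bounds must not be NaN");
        }
        if (min > max) {
            throw new IllegalArgumentException("min must be less than or equal to max");
        }
        this.min = min;
        this.max = max;
    }
}

// Hypothetical helper: callers go through normalize() instead of the constructor.
final class RangeNormalizer {
    private RangeNormalizer() {}

    // Assumption: a range with any NaN bound carries no usable information,
    // so it is mapped to "no range" rather than allowed to reach the constructor.
    static Optional<SimpleDoubleRange> normalize(double min, double max) {
        if (Double.isNaN(min) || Double.isNaN(max)) {
            return Optional.empty();
        }
        return Optional.of(new SimpleDoubleRange(min, max));
    }
}
```

With a guard of this kind, a column whose min and max are both NaN yields an absent range instead of an exception, so the rest of that column's statistics can still be reported.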

Additional context and related issues

This problem was originally observed while analyzing the behavior of BaseIcebergConnectorTest.testPartitionedByDoubleWithNaN. One might think that this is an Iceberg issue that could be fixed around Iceberg's TableStatisticsReader.makeTableStatistics. However, other engines may also legitimately expose NaN in their stats, so it seems better to fix it once at the SPI level.

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## General
* Fix query failures or wrong output in `SHOW STATS` when a connector returns NaN values for table statistics. ({issue}`24315`)

Member

@sopel39 sopel39 left a comment

lgtm % @raunaqmorarka comments

@raunaqmorarka raunaqmorarka merged commit 5b07d8e into trinodb:master Dec 3, 2024
93 checks passed
@github-actions github-actions bot added this to the 467 milestone Dec 3, 2024

Labels

cla-signed iceberg Iceberg connector
