
Widen type promotion for decimals with larger scale in Parquet Read [databricks] #11727

Draft · wants to merge 2 commits into branch-24.12
Conversation

@nartal1 nartal1 commented Nov 16, 2024

This PR contributes to #11433 and #11512.

This PR supports additional type promotion to decimals with larger precision and scale.
As long as the precision increases by at least as much as the scale, the decimal values can be promoted without loss of precision.
A similar change was added in Apache Spark 4.0: apache/spark#44513.
Currently on the CPU, all Spark versions prior to 4.0 throw an exception if the scale of the read schema is not the same as the scale of the schema that was written. In spark-rapids, this fix applies to all supported Spark versions.

We have removed the separate checks for whether a decimal can be read as an int, long, or byte array and consolidated them into a single function, canReadAsDecimal. An integration test was added to verify that the conditions for the type promotions are met.
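
For illustration, here is a minimal Scala sketch of the promotion rule described above, assuming it reduces to comparing scales and integral digits; canPromoteDecimal is a hypothetical name and simplified signature, not the actual canReadAsDecimal in spark-rapids, which also covers the physical Parquet representations:

```scala
// Hypothetical sketch of the widened decimal promotion rule.
// A DECIMAL(p1, s1) can be read as DECIMAL(p2, s2) without loss of
// precision when the scale does not shrink and the integral digits
// (precision minus scale) do not shrink, i.e. the precision increases
// by at least as much as the scale.
def canPromoteDecimal(p1: Int, s1: Int, p2: Int, s2: Int): Boolean =
  s2 >= s1 && (p2 - s2) >= (p1 - s1)

// canPromoteDecimal(5, 2, 7, 4)  // true: precision +2, scale +2
// canPromoteDecimal(5, 2, 6, 4)  // false: integral digits shrink from 3 to 2
```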

@nartal1 nartal1 self-assigned this Nov 16, 2024
@nartal1 nartal1 changed the title Widen type promotion for decimals with larger scale in Parquet Read Widen type promotion for decimals with larger scale in Parquet Read [databricks] Nov 16, 2024
@nartal1 nartal1 marked this pull request as draft November 16, 2024 00:59
nartal1 commented Nov 16, 2024

build

@sameerz sameerz added the bug Something isn't working label Nov 16, 2024