-
Notifications
You must be signed in to change notification settings - Fork 246
feat: Implement array-to-string cast support #2425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2425 +/- ##
=============================================
- Coverage 56.12% 22.14% -33.99%
+ Complexity 976 438 -538
=============================================
Files 119 146 +27
Lines 11743 13626 +1883
Branches 2251 2362 +111
=============================================
- Hits 6591 3017 -3574
- Misses 4012 10050 +6038
+ Partials 1140 559 -581 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
15e8ee2
to
effb775
Compare
2f160fc
to
c104e3f
Compare
f4ff894
to
a2b7590
Compare
This PR is ready for review. Please have a look when you have some time. @andygrove |
} else { | ||
str.clear(); | ||
let value_ref = array.value(row_index); | ||
let native_cast_result = cast_array(value_ref, &Utf8, spark_cast_options).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be more efficient to call cast_array
once on array.values
before the for loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I updated cast_array_to_string
to cast array.values()
once before the loop.
BTW, does Rust/Comet have any tool for running micro‑benchmarks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, does Rust/Comet have any tool for running micro‑benchmarks?
There is a documentation for cargo-bench: https://datafusion.apache.org/comet/contributor-guide/profiling_native_code.html#running-micro-benchmarks-with-cargo-bench
68d9abc
to
56e5316
Compare
spark/src/test/spark-3.5/org/apache/spark/sql/CometToPrettyStringSuite.scala
Outdated
Show resolved
Hide resolved
spark/src/test/spark-4.0/org/apache/spark/sql/CometToPrettyStringSuite.scala
Outdated
Show resolved
Hide resolved
|
||
test("cast ArrayType to StringType") { | ||
val hasIncompatibleType = (dt: DataType) => | ||
if (CometConf.COMET_NATIVE_SCAN_IMPL.get() == "auto") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For cases where the test is run without setting the environment variable COMET_PARQUET_SCAN_IMPL
, such as when developing in IntelliJ IDEA.
val result: DataFrame = Dataset.ofRows(spark, analyzed) | ||
CometCast.isSupported(field.dataType, DataTypes.StringType, Some(spark.sessionState.conf.sessionLocalTimeZone), CometEvalMode.TRY) match { | ||
case _: Compatible => checkSparkAnswerAndOperator(result) | ||
case _: Compatible if CometScanTypeChecker(CometConf.COMET_NATIVE_SCAN_IMPL.get()).isTypeSupported(field.dataType, field.name, ListBuffer.empty) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code style is incorrect, reformat it after #2557
Which issue does this PR close?
Closes #2410 .
Rationale for this change
What changes are included in this PR?
Implement array-to-string cast support
How are these changes tested?
Added UTs