Skip to content

Conversation

kazantsev-maksim
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

How are these changes tested?

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@comphead
Copy link
Contributor

comphead commented Oct 6, 2025

There are some compilation issues

@codecov-commenter
Copy link

codecov-commenter commented Oct 6, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 41.39%. Comparing base (f09f8af) to head (e17e4ae).
⚠️ Report is 577 commits behind head on main.

Additional details and impacted files
@@              Coverage Diff              @@
##               main    #2523       +/-   ##
=============================================
- Coverage     56.12%   41.39%   -14.74%     
- Complexity      976     1068       +92     
=============================================
  Files           119      146       +27     
  Lines         11743    13562     +1819     
  Branches       2251     2358      +107     
=============================================
- Hits           6591     5614      -977     
- Misses         4012     6975     +2963     
+ Partials       1140      973      -167     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@parthchandra parthchandra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seem to recall that we implemented this because the DF version was slower.

@comphead
Copy link
Contributor

comphead commented Oct 7, 2025

This is a good point @parthchandra we can "steal" date_trunc bench from DF and check it out.

@andygrove
Copy link
Member

Current Comet performance for date_trunc isn't great according to CometDatetimeExpressionBenchmark.

Running benchmark: Date Truncate  - YEAR
  Running case: Date Truncate  - YEAR - Spark 
  Stopped after 38 iterations, 2006 ms
  Running case: Date Truncate  - YEAR - Comet
  Stopped after 39 iterations, 2022 ms

OpenJDK 64-Bit Server VM 11.0.28+6-post-Ubuntu-1ubuntu122.04.1 on Linux 6.8.0-84-generic
AMD Ryzen 9 7950X3D 16-Core Processor
Date Truncate  - YEAR:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Date Truncate  - YEAR - Spark                        46             53           6         23.0          43.5       1.0X
Date Truncate  - YEAR - Comet                        46             52           5         22.6          44.2       1.0X

Running benchmark: Date Truncate  - YYYY
  Running case: Date Truncate  - YYYY - Spark 
  Stopped after 46 iterations, 2019 ms
  Running case: Date Truncate  - YYYY - Comet
  Stopped after 42 iterations, 2006 ms

OpenJDK 64-Bit Server VM 11.0.28+6-post-Ubuntu-1ubuntu122.04.1 on Linux 6.8.0-84-generic
AMD Ryzen 9 7950X3D 16-Core Processor
Date Truncate  - YYYY:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Date Truncate  - YYYY - Spark                        39             44           6         26.6          37.6       1.0X
Date Truncate  - YYYY - Comet                        43             48           6         24.4          40.9       0.9X

Running benchmark: Date Truncate  - YY
  Running case: Date Truncate  - YY - Spark 
  Stopped after 51 iterations, 2010 ms
  Running case: Date Truncate  - YY - Comet
  Stopped after 44 iterations, 2040 ms

OpenJDK 64-Bit Server VM 11.0.28+6-post-Ubuntu-1ubuntu122.04.1 on Linux 6.8.0-84-generic
AMD Ryzen 9 7950X3D 16-Core Processor
Date Truncate  - YY:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Date Truncate  - YY - Spark                          37             39           3         28.6          34.9       1.0X
Date Truncate  - YY - Comet                          43             46           6         24.5          40.8       0.9X

I tried running against this PR but got similar failures to the ones we are seeing in CI.

@kazantsev-maksim
Copy link
Contributor Author

The DataFusion implementation of date_trunc has a narrower scope than Spark's, as it only accepts TimestampType, while Spark's version additionally supports StringType and DateType.

Comet does not fully support casting DateType to TimestampType yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants