chore: update to DataFusion 48.0.0
#3520
ACTION NEEDED: delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
 };
 use datafusion::datasource::{listing::PartitionedFile, MemTable, TableProvider, TableType};
 use datafusion::execution::context::{SessionConfig, SessionContext, SessionState, TaskContext};
 use datafusion::execution::runtime_env::RuntimeEnv;
 use datafusion::execution::FunctionRegistry;
 use datafusion::optimizer::simplify_expressions::ExprSimplifier;
-use datafusion::physical_optimizer::pruning::{PruningPredicate, PruningStatistics};
+use datafusion::physical_optimizer::pruning::PruningPredicate;
PruningStatistics is no longer accessible via datafusion::physical_optimizer because it was moved. I think we should fix this with a pub use.
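To illustrate the kind of fix suggested here, the following is a minimal sketch of how a `pub use` re-export keeps an old import path working after an item moves. The module names `common` and `physical_optimizer` and the one-method trait are hypothetical stand-ins written for this example; they are not the real DataFusion module layout or the real `PruningStatistics` definition.

```rust
// Hypothetical stand-in for the module where the trait now lives.
pub mod common {
    pub mod pruning {
        /// Simplified stand-in for the relocated trait (not the real definition).
        pub trait PruningStatistics {
            fn num_containers(&self) -> usize;
        }
    }
}

// Hypothetical stand-in for the module that used to export the trait.
pub mod physical_optimizer {
    pub mod pruning {
        // The re-export: the old path resolves again, pointing at the new home.
        pub use super::super::common::pruning::PruningStatistics;

        /// Stand-in for a type that never moved.
        pub struct PruningPredicate;
    }
}

// Downstream code can keep its original combined import.
use self::physical_optimizer::pruning::{PruningPredicate, PruningStatistics};

/// Hypothetical downstream type implementing the trait through the old path.
pub struct FileStats {
    pub files: usize,
}

impl PruningStatistics for FileStats {
    fn num_containers(&self) -> usize {
        self.files
    }
}

/// Uses the trait via the re-exported path.
pub fn container_count(stats: &dyn PruningStatistics) -> usize {
    stats.num_containers()
}

/// Shows that the unmoved type is still importable from the same path.
pub fn make_predicate() -> PruningPredicate {
    PruningPredicate
}
```

With a re-export like this in place, a downstream crate would not need to split its import into two lines as this PR does.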
 .with_limit(self.limit)
 .with_table_partition_cols(table_partition_cols)
 .build();
+let file_source =
This is all reformatting, since with_schema_adapter_factory now returns an Arc directly.
@@ -1114,8 +1113,11 @@ impl ExecutionPlan for DeltaScan {
         Some(self.metrics.clone_inner())
     }

-    fn statistics(&self) -> DataFusionResult<Statistics> {
-        self.parquet_scan.statistics()
+    fn partition_statistics(
statistics was deprecated in favor of partition_statistics.
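The hunk above is cut off after the new method name, so here is a rough sketch of the delegation pattern it appears to follow: a wrapper node forwards a per-partition statistics request to the scan it wraps. Everything in the block (`Plan`, `Statistics`, `InnerScan`, `WrapperScan`, and the `Option<usize>` partition argument) is a simplified stand-in assumed for illustration, not the real DataFusion trait or types.

```rust
/// Simplified stand-in for a statistics struct (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub struct Statistics {
    pub num_rows: Option<usize>,
}

/// Simplified stand-in for an execution-plan trait (hypothetical).
pub trait Plan {
    /// `None` asks for statistics over the whole plan,
    /// `Some(i)` asks for statistics of partition `i` only.
    fn partition_statistics(&self, partition: Option<usize>) -> Result<Statistics, String>;
}

/// Hypothetical inner scan that knows the row count of each partition.
pub struct InnerScan {
    pub rows_per_partition: Vec<usize>,
}

impl Plan for InnerScan {
    fn partition_statistics(&self, partition: Option<usize>) -> Result<Statistics, String> {
        match partition {
            // Whole-plan statistics: sum over all partitions.
            None => Ok(Statistics {
                num_rows: Some(self.rows_per_partition.iter().sum()),
            }),
            // Single-partition statistics, with a bounds check.
            Some(i) => self
                .rows_per_partition
                .get(i)
                .map(|n| Statistics { num_rows: Some(*n) })
                .ok_or_else(|| format!("no partition {}", i)),
        }
    }
}

/// Hypothetical wrapper node, analogous in spirit to a scan wrapping another scan.
pub struct WrapperScan {
    pub inner: InnerScan,
}

impl Plan for WrapperScan {
    fn partition_statistics(&self, partition: Option<usize>) -> Result<Statistics, String> {
        // Pure delegation: forward the request, including the partition index.
        self.inner.partition_statistics(partition)
    }
}
```

The point of the sketch is only that the wrapper passes the partition argument through unchanged rather than computing anything itself.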
48.0.0
48.0.0
Force-pushed from f79704e to ca487a2.
@ion-elgreco or @rtyler, is there any chance you can kick off the CI tests for this PR, so I can verify that DataFusion 48.0.0 passes the delta-rs tests?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3520      +/-   ##
==========================================
- Coverage   74.24%   74.21%   -0.04%
==========================================
  Files         150      150
  Lines       44733    44758      +25
  Branches    44733    44758      +25
==========================================
+ Hits        33210    33215       +5
- Misses       9376     9389      +13
- Partials     2147     2154       +7
The CI failure is due to my unpinning arrow-rs. I will try to unpin it in DataFusion as well.
Force-pushed from 390478d to 783acbc.
Ok, I think this is ready for another round of tests.
Force-pushed from 783acbc to fe51187.
@@ -311,7 +312,7 @@ fn parse_partitions(batch: RecordBatch, partition_schema: &StructType) -> DeltaR

     insert_field(
         batch,
-        StructArray::try_new(
+        StructArray::try_new_with_length(
I updated a few calls to use StructArray::try_new_with_length, as several of the unit tests were failing without it.
 /// Use the native parser to parse the statement
 // TODO fix for multiple statememnts and keeping parsers in sync
 fn parse_with_native(&mut self) -> Result<Statement, ParserError> {
+    // DataFusion's parser returns DataFusionError as it may also add
It is unfortunate to have to translate back from a DataFusionError to a ParserError, but I think it is unavoidable, as DataFusion may now augment ParserErrors with Span and context information.
To retain the same API, we need to strip off any additional information and return the underlying ParserError.
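As a rough illustration of the unwrapping described in this comment, the sketch below peels context layers off a wrapping error until it reaches the parser error inside. The `EngineError` enum and its variants are a simplified model invented for this example, and `ParserError` here is a local stand-in; neither is the real DataFusion or sqlparser type, and the fallback for non-SQL errors is an assumption about what a caller might reasonably do.

```rust
/// Local stand-in for a parser error type (hypothetical, simplified).
#[derive(Debug, Clone, PartialEq)]
pub enum ParserError {
    ParserError(String),
    TokenizerError(String),
}

/// Simplified model of an engine error that can wrap a parser error
/// and decorate it with extra context (hypothetical).
#[derive(Debug)]
pub enum EngineError {
    /// A parser error with no decoration.
    Sql(ParserError),
    /// Extra context (for example, location information) around another error.
    Context(String, Box<EngineError>),
    /// Anything that did not originate in the parser.
    Other(String),
}

/// Strip any added context and return the underlying parser error,
/// so callers that expect `ParserError` keep the same API.
pub fn into_parser_error(err: EngineError) -> ParserError {
    match err {
        // Already a plain parser error: return it unchanged.
        EngineError::Sql(inner) => inner,
        // Discard the context string and keep unwrapping.
        EngineError::Context(_, inner) => into_parser_error(*inner),
        // Assumed fallback: surface the message as a generic parser error.
        EngineError::Other(msg) => ParserError::ParserError(msg),
    }
}
```

The recursion matters because context can be layered more than once; a single `match` on the outermost variant would miss a parser error nested two levels deep.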
Ok, I resolved some other issues and updated to DataFusion 48 rc2 -- things are looking good for me locally. Will wait for the CI results.
The remaining failures don't look to me like they are related to this PR, so I am claiming success.
Signed-off-by: Andrew Lamb <[email protected]>
Force-pushed from fe51187 to ad59945.
Description
The description of the main changes of your pull request
Related Issue(s)
48.0.0 (June 2025) apache/datafusion#15771
This tests upgrading delta-rs to DataFusion 48. It seems to have gone fine, but I hit a few rough edges that I will list and file fixes for in DataFusion:
Documentation