Implemented enhancements:
- Feature: Vacuum with version retention #3530
- Any way to prune the delta_log or support shallow clones #3565
- Upgrade Arrow version to 55.1.0 #3540
- Add config option to suppress `deltalake_core::writer::stats` warnings about bytes columns #3519
- Remove pyarrow dependency (make opt-in), replace with arro3 for core components #3455
- Don't retry lakefs commit or merge on `412` response (precondition failed) #3429
- Use `object_store` spawnService #3427
- Alter table description #3401
- Remove put if absent options injection #3310
- v1.0 Release tracking issue #3250
- feat: add a table description and name to the Delta Table from Python #3464 (fvaleye)
Fixed bugs:
- Python building broken on main due to maturin issue #3559
- TypeError: write_deltalake() got an unexpected keyword argument 'schema' (deltalake/polars) #3546
- SchemaMismatchError on empty ArrayType field while contains_null=True #3544
- Can't open a delta-table: Unsupported reader features required: DeletionVectors #3543
- Attempting to write a transaction 3 but the underlying table has been updated to 3 #3534
- DeltaOps not recognizing abfss scheme for Azure #3523
- Query execution time difference between `QueryBuilder` and using DataFusion directly. #3517
- bug: timezone not preserved & raise exc on merge operation #3507
- allow_unsafe_rename option stopped working in version 1 #3493
- predicate appears to ignore partition and stats in pruning #3491
- `max_rows_per_file` ignored when writing with rust engine #3490
- delta-rs includes pending versions written by spark #3422
Merged pull requests:
- chore: bump minor version for rust crate #3586 (rtyler)
- refactor!: use delta-kernel Protocol and Metadata actions #3581 (roeap)
- feat: vacuum with version retention #3537 (corwinjoy)
- chore: bump patch versions for another release #3585 (rtyler)
- feat: write `engineInfo` with delta-rs version #3584 (zachschuermann)
- chore: remove the deltalake-sql crate #3582 (rtyler)
- chore: latest clippy #3571 (roeap)
- feat: convert partition filters to kernel predicates #3570 (roeap)
- refactor: move schema code to kernel module #3569 (roeap)
- chore: remove redundant words in comment #3568 (shangchenglumetro)
- docs: ensure create_checkpoint() is visible in the Python API docs #3564 (itamarst)
- chore: upgrade to delta_kernel 0.12.x #3561 (rtyler)
- chore: clean up licenses in python project which are causing build issues #3560 (rtyler)
- chore: update arrow/parquet to 55.2.0 #3558 (alamb)
- fix: use proper DeltaTableState for vacuum commits #3550 (jeromegn)
- fix: version binary search #3549 (aditanase)
- chore: update the minor version to reflect a behavior change #3542 (rtyler)
- chore: pin aws crates #3532 (ion-elgreco)
- chore: set java version to 21 for pyspark 4.0 #3524 (ion-elgreco)
- fix: using state provided in args in merge op #3522 (gtrawinski)
- refactor: remove unnecessary uses of datafusion subcrates #3521 (alamb)
- chore: update to DataFusion `48.0.0` / arrow to 55.2.0 #3520 (alamb)
- feat: make TableConfig accessible #3518 (ion-elgreco)
- fix: remove forced table update from python writer #3515 (ohanf)
- refactor: compute stats schema with kernel types #3514 (roeap)
- feat: add convenience extension for kernel engine types #3510 (roeap)
- refactor: move LazyTableProvider into python crate #3509 (roeap)
- fix: setting wrong schema in table provider for `merge` #3508 (ion-elgreco)
- fix: constraint parsing, roundtripping #3503 (ion-elgreco)
- refactor!: have DeltaTable::version return an Option #3500 (roeap)
- chore!: remove get_earliest_version #3499 (roeap)
- chore: prepare for the next python release #3498 (rtyler)
- ci: improve coverage collection #3497 (roeap)
- chore: update runner #3494 (ion-elgreco)
- docs: update link to df #3489 (rluvaton)
- refactor!: remove and deprecate some python methods #3488 (roeap)
- fix: ensure projecting only columns that exist in new files afte sche… #3487 (alexwilcoxson-rel)
- chore: exclude Invariants from the default writer v2 feature set #3486 (rtyler)
- test: improve storage config testing #3485 (roeap)
- refactor!: get transaction versions for specific applications #3484 (roeap)
- docs: fix bullet list formatting in dagster docs #3483 (avriiil)
- fix: set casting safe param to False #3481 (ion-elgreco)
- chore: update kernel to 0.11 #3480 (roeap)
- chore: update migration docs #3479 (ion-elgreco)
- chore: remove unused stats_parsed field #3475 (roeap)
- refactor: remove protocol error #3473 (roeap)
- chore: more typos #3471 (roeap)
- chore: remove unused time_utils #3470 (roeap)
- chore: set correct markers #3469 (ion-elgreco)
- fix: schema conversion, add conversion test cases #3468 (ion-elgreco)
- feat: write checkpoints with kernel #3466 (roeap)
- fix: correct spelling errors found by CI spell checker #3465 (fvaleye)
- chore: update kernel #3462 (roeap)
- fix: use more accurate log path parsing #3461 (roeap)
- refactor: remove pyarrow dependency #3459 (ion-elgreco)
- chore: mark more tests which require datafusion #3458 (rtyler)
- ci: add spellchecker to pr tests #3457 (roeap)
- refactor: use full paths in log processing #3456 (roeap)
- chore: ensuring default builds work without datafusion #3453 (rtyler)
- refactor: use LogStore in Snapshot / LogSegment APIs #3452 (roeap)
- test: avoid circular dependency with core/test crates #3450 (roeap)
- feat: expose kernel Engine on LogStore #3446 (roeap)
- refactor: more specific factory parameter names #3445 (roeap)
- docs: add 1.0.0 migration guide #3443 (ion-elgreco)
- chore: minor table module refactors #3442 (rtyler)
- chore: remove unused code and deps #3441 (roeap)
- chore: experiment with using sccache in GitHub Actions #3437 (rtyler)
- feat: optimize datafusion predicate pushdown and partition pruning #3436 (rtyler)
- chore: prepare py-1.0 release #3435 (ion-elgreco)
- chore: make codecov more vigorously enforced to help ensure quality #3434 (rtyler)
- chore: rely on the testing during coverage generation to speed up tests #3431 (rtyler)
- chore: bump crate versions which are due for release #3430 (rtyler)
- chore(deps): bump foyer to v0.17.2 to prevent from wrong result #3428 (MrCroxx)
- feat: spawn io with spawn service #3426 (ion-elgreco)
- fix: ignore temp log entries #3423 (corwinjoy)
- fix: build Unity Catalog crate without DataFusion #3420 (linhr)
- feat: added a check for gc code to run #3419 (JustinRush80)
- chore: include license file in deltalake-derive crate #3417 (ankane)
- fix: drop column update #3416 (ion-elgreco)
- chore: missed a version bump for core #3415 (rtyler)
- chore: bringing dat integration testing in ahead of kernel replay #3411 (rtyler)
- chore: reduce scope of feature flags and compilation requirements for subcrates #3409 (rtyler)
- chore: commit the contents of the 0.26.0 release #3408 (rtyler)
- chore: bump versions of rust crates for another release party #3406 (rtyler)
- fix: the default target size should be 100MB #3404 (HiromuHota)
- chore: update delta_kernel to 0.10.0 #3403 (zachschuermann)
- refactor: make "cloud" feature in object_store optional #3398 (zeevm)
- chore: put a couple symbols behind the right feature gate #3393 (rtyler)
- fix: clippy warnings #3390 (alamb)
- feat: derive macro for config implementations #3389 (roeap)
- feat!: update storage configuration system #3383 (roeap)
- refactor!: move storage module into logstore #3382 (roeap)
- chore: move proofs into dedicated folder #3381 (roeap)
- refactor: move transaction module to kernel #3380 (roeap)
- chore: clippy #3379 (roeap)
- feat: upgrade to DataFusion 47.0.0 #3378 (alamb)
- fix: if field contains space in constraint expression, checks will fail #3374 (Nordalf)
- fix: parse unconventional logs #3373 (roeap)
- feat: introduce VacuumMode::Full for cleaning up orphaned files #3368 (rtyler)
- chore: fix some minor build warnings #3366 (rtyler)
- chore: remove cdf feature #3365 (ion-elgreco)
- docs: add example how to authenticate using Azure CLI for Azure ADSL integration #3357 (DanielBertocci)
- fix: parse snapshot #3355 (ion-elgreco)
- docs: update merge-tables.md with "Optimizing Merge Performance" section #3351 (ldacey)
- fix: use field physical name when resolving partition columns #3349 (zeevm)
- feat: during LakeFS file operations, skip merge when 0 changes #3346 (smeyerre)
- refactor(python): improve typing, linting #3344 (ion-elgreco)
- docs: update dataFusion integration example #3343 (riziles)
- perf: use lazy sync reader #3338 (ion-elgreco)
- feat(api): add rustls and native-tls features #3335 (zeevm)
- refactor: add 'cloud' feature to 'core' to enable 'cloud' on 'object_store' only when needed #3332 (zeevm)
- chore: improve io error msg #3328 (ion-elgreco)
- chore: remove pyarrow upper #3325 (ion-elgreco)
- fix: block_in_place to allow nested tasks #3324 (ion-elgreco)
- fix: check for all known valid delta files in is_deltatable #3318 (umartin)
- fix: added restored metadata as action to the next committed version #3303 (Nordalf)
- fix: correct Python docs for incremental compaction on OPTIMIZE #3301 (roykim98)