feat: streamed execution, writer refactor, simplified generated columns and schema evolution #3229

Merged
5 commits merged into delta-io:main from feat/lazy-stream-v3-with-cdf on Feb 19, 2025

Conversation

ion-elgreco
Collaborator

@ion-elgreco ion-elgreco commented Feb 18, 2025

Description

This PR combines several branches I've been working on with one of @rtyler's branches that adds streamed execution.

The changes that this brings are as follows:

  • Writer refactored into its own module
  • Schema evolution casting is now done through the DataFrame API, which simplifies it considerably
  • Generated columns (GC) code paths moved into the writer module, so that merge and write share the same GC logic
  • Streamed execution is now always enabled in the Python bindings' Rust writer
  • Resolved all pyo3 deprecations
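
For anyone skimming this later, here is a minimal sketch of the idea behind streamed execution in a writer. It is not the delta-rs implementation: `generate_batches`, `streamed_write`, and `target_rows_per_file` are hypothetical names, and a batch is modelled as a plain list of dicts rather than an Arrow record batch. The point it illustrates is that the writer pulls one batch at a time from a lazy source and flushes a file as soon as enough rows are buffered, instead of materializing the whole input first.

```python
from typing import Any, Dict, Iterable, Iterator, List

# Hypothetical stand-in for a record batch: a list of row dicts.
Batch = List[Dict[str, Any]]


def generate_batches(num_batches: int, rows_per_batch: int) -> Iterator[Batch]:
    """Lazily yield batches; nothing is built until the consumer asks for it."""
    for b in range(num_batches):
        yield [{"id": b * rows_per_batch + i} for i in range(rows_per_batch)]


def streamed_write(batches: Iterable[Batch], target_rows_per_file: int) -> List[int]:
    """Consume batches one at a time and 'flush' a file whenever the buffered
    row count reaches target_rows_per_file.

    Returns the row count of each flushed file. Only counts are tracked here;
    a real writer would encode and upload the buffered rows at each flush.
    """
    if target_rows_per_file <= 0:
        raise ValueError("target_rows_per_file must be positive")

    files: List[int] = []
    buffered = 0
    for batch in batches:
        buffered += len(batch)
        # Flush as many full files as the buffer currently allows.
        while buffered >= target_rows_per_file:
            files.append(target_rows_per_file)
            buffered -= target_rows_per_file
    # Flush whatever is left once the stream is exhausted.
    if buffered:
        files.append(buffered)
    return files


# 5 batches of 4 rows each, written into files of at most 6 rows.
file_sizes = streamed_write(generate_batches(5, 4), target_rows_per_file=6)
print(file_sizes)
```

Because the source is a generator, at most one batch plus the partially filled buffer is live at any time, which is the property that makes the streamed path attractive for large writes.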

Supersedes #3196 and #3151.

codecov bot commented Feb 18, 2025

Codecov Report

Attention: Patch coverage is 71.07438% with 210 lines in your changes missing coverage. Please review.

Project coverage is 72.16%. Comparing base (0473d7f) to head (39010c4).
Report is 5 commits behind head on main.

Files with missing lines                               Patch %  Lines
python/src/lib.rs                                      0.00%    39 Missing ⚠️
crates/core/src/operations/write/execution.rs          89.23%   4 Missing and 34 partials ⚠️
python/src/writer.rs                                   0.00%    36 Missing ⚠️
...tes/core/src/operations/write/generated_columns.rs  31.25%   30 Missing and 3 partials ⚠️
crates/core/src/operations/write/mod.rs                86.70%   7 Missing and 14 partials ⚠️
python/src/schema.rs                                   0.00%    16 Missing ⚠️
.../core/src/operations/write/schema_evolution/mod.rs  68.88%   13 Missing and 1 partial ⚠️
python/src/filesystem.rs                               0.00%    5 Missing ⚠️
crates/core/src/operations/write/metrics.rs            75.00%   3 Missing and 1 partial ⚠️
python/src/utils.rs                                    0.00%    4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3229      +/-   ##
==========================================
- Coverage   72.16%   72.16%   -0.01%     
==========================================
  Files         138      143       +5     
  Lines       45874    45692     -182     
  Branches    45874    45692     -182     
==========================================
- Hits        33107    32973     -134     
+ Misses      10676    10642      -34     
+ Partials     2091     2077      -14     


@rtyler rtyler self-assigned this Feb 18, 2025
@ion-elgreco ion-elgreco force-pushed the feat/lazy-stream-v3-with-cdf branch 4 times, most recently from 4778a13 to dc2090e Compare February 18, 2025 16:09
Member

@rtyler rtyler left a comment

Thanks for taking this stuff on, @ion-elgreco. Let's smash on this some more from main and cut a 0.25 soon.

@ion-elgreco ion-elgreco added this pull request to the merge queue Feb 19, 2025
Merged via the queue into delta-io:main with commit 69a94e9 Feb 19, 2025
30 checks passed
@ion-elgreco ion-elgreco deleted the feat/lazy-stream-v3-with-cdf branch February 19, 2025 14:27
@thomasfrederikhoeck
Contributor

Sick!!

Labels
binding/python (Issues for the Python package)
binding/rust (Issues for the Rust crate)
3 participants