feat: streamed execution, writer refactor, simplified generated columns and schema evolution #3223

Closed
wants to merge 4 commits into from

ion-elgreco (Collaborator) commented Feb 16, 2025

Description

This PR combines several branches I've been working on with one of @rtyler's branches, which adds streamed execution.

The changes that this brings are as follows:

  • Writer refactored into a module
  • Schema evolution casting is now done through the DataFrame API, which simplifies it considerably
  • Generated columns code paths moved into the writer module, so that merge and write share the same generated columns (GC) logic
  • Streamed execution is always enabled in the Python Rust writer, except when CDF is enabled (I will handle that in a follow-up to keep this PR from growing too large)
  • Resolved all pyo3 deprecations
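For reviewers less familiar with the term, here is a minimal sketch of what "streamed execution" means in general. It is not the code in this PR: `Batch`, `produce_batches`, `write_collected`, and `write_streamed` are hypothetical names, and plain Python lists stand in for Arrow record batches. The point is only that a streamed writer consumes one batch at a time instead of materializing the whole input first.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical stand-in for an Arrow record batch: a list of row dicts.
Batch = List[dict]


def produce_batches(n_batches: int, rows_per_batch: int) -> Iterator[Batch]:
    """Lazily yield batches one at a time, as a streamed plan would."""
    for b in range(n_batches):
        yield [{"id": b * rows_per_batch + i} for i in range(rows_per_batch)]


def write_collected(batches: Iterable[Batch]) -> Tuple[int, int]:
    """Materialize every batch before writing.

    Returns (rows_written, peak_batches_held_in_memory).
    """
    held = list(batches)  # the whole input is in memory at this point
    rows = sum(len(batch) for batch in held)
    return rows, len(held)


def write_streamed(batches: Iterable[Batch]) -> Tuple[int, int]:
    """Consume batches one by one, holding at most one at a time.

    Returns (rows_written, peak_batches_held_in_memory).
    """
    rows = 0
    peak = 0
    for batch in batches:
        peak = max(peak, 1)  # only the current batch is alive
        rows += len(batch)   # a real writer would flush the batch to storage here
    return rows, peak
```

Both functions write the same number of rows; the difference is the peak number of batches held at once, which is why streaming matters for inputs larger than memory.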

Supersedes #3196 #3151

github-actions bot added the binding/python (Issues for the Python package) and binding/rust (Issues for the Rust crate) labels Feb 16, 2025

codecov bot commented Feb 16, 2025

Codecov Report

Attention: Patch coverage is 67.55070% with 208 lines in your changes missing coverage. Please review.

Project coverage is 72.25%. Comparing base (2f1a5ac) to head (4c0d728).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| python/src/lib.rs | 0.00% | 51 Missing ⚠️ |
| python/src/writer.rs | 0.00% | 36 Missing ⚠️ |
| ...tes/core/src/operations/write/generated_columns.rs | 31.25% | 30 Missing and 3 partials ⚠️ |
| crates/core/src/operations/write/execution.rs | 90.87% | 3 Missing and 23 partials ⚠️ |
| crates/core/src/operations/write/mod.rs | 85.27% | 7 Missing and 12 partials ⚠️ |
| python/src/schema.rs | 0.00% | 16 Missing ⚠️ |
| .../core/src/operations/write/schema_evolution/mod.rs | 68.88% | 13 Missing and 1 partial ⚠️ |
| python/src/filesystem.rs | 0.00% | 5 Missing ⚠️ |
| crates/core/src/operations/write/metrics.rs | 75.00% | 3 Missing and 1 partial ⚠️ |
| python/src/utils.rs | 0.00% | 4 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3223      +/-   ##
==========================================
- Coverage   72.31%   72.25%   -0.07%     
==========================================
  Files         138      144       +6     
  Lines       45398    45339      -59     
  Branches    45398    45339      -59     
==========================================
- Hits        32831    32761      -70     
- Misses      10489    10494       +5     
- Partials     2078     2084       +6     


ion-elgreco (Collaborator, Author) commented

Superseded by #3229, which also adds streamed execution when CDF is enabled.

auto-merge was automatically disabled February 18, 2025 12:23

Pull request was closed
