refactor!: update ScanData to struct with new FilteredEngineData type #768

Open
wants to merge 8 commits into
base: main

sebastiantia
Collaborator

@sebastiantia sebastiantia commented Mar 26, 2025

What changes are proposed in this pull request?

  1. Updated ScanData from a typed tuple to a struct. ScanData is now a struct with fields:
  • filtered_data: A FilteredEngineData instance.
  • transforms: A vector of optional transformations to be applied to the rows in filtered_data.

  2. Introduced the FilteredEngineData type, which couples EngineData with a selection vector indicating which rows to process. This type is returned from the scan_data API and the incoming checkpoint API.

  3. Updated the visit_scan_files parameters to accept ScanData, which avoids destructuring at call sites.

  4. Made the corresponding FFI changes so that visit_scan_files accepts a ScanData parameter.
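
To make the shape described above concrete, here is a minimal sketch of the new `ScanData` struct. It is not the kernel's actual code: the `EngineData` trait below is a cut-down stand-in, and `Transform`, `ToyBatch`, `selected_row_count`, and `example_scan_data` are hypothetical names invented for illustration. `FilteredEngineData` is written as the tuple alias quoted in the review comments.

```rust
// Hypothetical stand-in for the kernel's `EngineData` trait so the sketch compiles alone.
pub trait EngineData {
    fn len(&self) -> usize;
}

// Placeholder for a per-row transformation; the real type is not shown in this PR text.
#[derive(Debug, Clone, PartialEq)]
pub struct Transform(pub String);

/// Data plus a selection vector, where `true` marks rows to include in results.
pub type FilteredEngineData = (Box<dyn EngineData>, Vec<bool>);

/// Struct form of the former three-element tuple.
pub struct ScanData {
    pub filtered_data: FilteredEngineData,
    pub transforms: Vec<Option<Transform>>,
}

// Toy batch that only knows its row count (assumption for the example).
struct ToyBatch {
    rows: usize,
}

impl EngineData for ToyBatch {
    fn len(&self) -> usize {
        self.rows
    }
}

/// Counts the rows whose selection-vector entry is `true`.
pub fn selected_row_count(scan_data: &ScanData) -> usize {
    let (_data, sel_vec) = &scan_data.filtered_data;
    sel_vec.iter().filter(|&&keep| keep).count()
}

/// Builds a three-row example in which the middle row is filtered out.
pub fn example_scan_data() -> ScanData {
    ScanData {
        filtered_data: (Box::new(ToyBatch { rows: 3 }), vec![true, false, true]),
        transforms: vec![None, None, Some(Transform("add partition column".to_string()))],
    }
}
```

Callers that previously destructured a three-element tuple would instead read `scan_data.filtered_data` and `scan_data.transforms` by name.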

How was this change tested?

All current tests pass.


codecov bot commented Mar 26, 2025

Codecov Report

Attention: Patch coverage is 60.81081% with 29 lines in your changes missing coverage. Please review.

Project coverage is 84.38%. Comparing base (7e5476a) to head (48c2bdc).
Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ffi/src/scan.rs | 0.00% | 23 Missing ⚠️ |
| kernel/src/scan/mod.rs | 85.18% | 0 Missing and 4 partials ⚠️ |
| kernel/src/scan/log_replay.rs | 92.85% | 0 Missing and 1 partial ⚠️ |
| kernel/src/scan/state.rs | 90.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #768      +/-   ##
==========================================
+ Coverage   84.35%   84.38%   +0.03%     
==========================================
  Files          81       81              
  Lines       19233    19249      +16     
  Branches    19233    19249      +16     
==========================================
+ Hits        16224    16244      +20     
+ Misses       2205     2201       -4     
  Partials      804      804              

@sebastiantia sebastiantia added the breaking-change Change that will require a version bump label Mar 26, 2025
@sebastiantia sebastiantia changed the title from "refactor!: make ScanData struct with new FilteredEngineData type" to "refactor!: update ScanData to struct with new FilteredEngineData type" Mar 26, 2025
@sebastiantia sebastiantia marked this pull request as ready for review March 26, 2025 20:09
  .map(|res| {
-     let (data, vec, transforms) = res?;
+     let scan_data = res?;
+     let (data, sel_vec) = scan_data.filtered_data;
      let scan_files = vec![];
      state::visit_scan_files(
Collaborator

Should we have a visit_scan_data_files or similar that just takes the ScanData? Then we don't have to do this decomposition all over the place

Collaborator Author

How about updating visit_scan_files to just take ScanData?

Collaborator

discussed a little offline: i'm kinda partial to a ScanData.visit(callback, context)?

Collaborator Author

Implemented @zachschuermann's approach
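
For readers following the thread, a `ScanData.visit(callback, context)` method could look roughly like the sketch below. This is an illustration under stated assumptions, not the signature that landed in the PR: the row type `FileRow`, the three-argument `ScanCallback`, and the helper names are hypothetical, and the data is a plain `Vec` rather than real engine data.

```rust
/// Hypothetical, simplified scan-file row (the real rows carry more fields).
pub struct FileRow {
    pub path: String,
    pub size: i64,
}

/// Simplified filtered data: rows plus a selection vector of the same length.
pub struct FilteredEngineData {
    pub data: Vec<FileRow>,
    pub selection_vector: Vec<bool>,
}

pub struct ScanData {
    pub filtered_data: FilteredEngineData,
}

/// Hypothetical callback shape: invoked once per selected row.
pub type ScanCallback<T> = fn(context: &mut T, path: &str, size: i64);

impl ScanData {
    /// Calls `callback` for every row whose selection entry is `true`,
    /// threading `context` through and returning it at the end.
    pub fn visit<T>(&self, callback: ScanCallback<T>, mut context: T) -> T {
        let rows = self.filtered_data.data.iter();
        let selected = self.filtered_data.selection_vector.iter();
        for (row, &keep) in rows.zip(selected) {
            if keep {
                callback(&mut context, &row.path, row.size);
            }
        }
        context
    }
}

// Example callback: collect the paths of the selected files.
fn collect_path(paths: &mut Vec<String>, path: &str, _size: i64) {
    paths.push(path.to_string());
}

/// Visits a three-row batch in which the middle row is filtered out.
pub fn example_visit() -> Vec<String> {
    let scan_data = ScanData {
        filtered_data: FilteredEngineData {
            data: vec![
                FileRow { path: "a.parquet".to_string(), size: 10 },
                FileRow { path: "b.parquet".to_string(), size: 20 },
                FileRow { path: "c.parquet".to_string(), size: 30 },
            ],
            selection_vector: vec![true, false, true],
        },
    };
    scan_data.visit::<Vec<String>>(collect_path, Vec::new())
}
```

The point of the method form is that call sites never decompose `ScanData` themselves; the selection vector is applied in one place.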

@sebastiantia sebastiantia requested a review from nicklan March 26, 2025 22:57
Collaborator

@zachschuermann zachschuermann left a comment

few comments. and we need to ensure all the breaking changes are clearly spelled out in PR description

///
/// `Box<dyn EngineData>` - The underlying data
/// `Vec<bool>` - Selection vector where `true` marks rows to include in results
pub type FilteredEngineData = (Box<dyn EngineData>, Vec<bool>);
Collaborator

what about just making this a struct?

Collaborator Author

👍
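
A rough sketch of the struct form suggested in this thread follows. The field names `data` and `selection_vector`, the `selected_indices` helper, and the toy `EngineData` stand-in are assumptions for illustration and may differ from what the PR ended up with.

```rust
// Hypothetical stand-in for the kernel's `EngineData` trait.
pub trait EngineData {
    fn len(&self) -> usize;
}

/// Struct replacement for `(Box<dyn EngineData>, Vec<bool>)`.
pub struct FilteredEngineData {
    /// The underlying data.
    pub data: Box<dyn EngineData>,
    /// Selection vector where `true` marks rows to include in results.
    pub selection_vector: Vec<bool>,
}

impl FilteredEngineData {
    /// Hypothetical helper: indices of the rows marked `true`.
    pub fn selected_indices(&self) -> Vec<usize> {
        self.selection_vector
            .iter()
            .enumerate()
            .filter_map(|(i, keep)| if *keep { Some(i) } else { None })
            .collect()
    }
}

// Toy batch that only knows its row count (assumption for the example).
struct ToyBatch {
    rows: usize,
}

impl EngineData for ToyBatch {
    fn len(&self) -> usize {
        self.rows
    }
}

/// Builds a three-row example in which the middle row is filtered out.
pub fn example_filtered() -> FilteredEngineData {
    FilteredEngineData {
        data: Box::new(ToyBatch { rows: 3 }),
        selection_vector: vec![true, false, true],
    }
}
```

Compared with the tuple alias, named fields document themselves at call sites, and helper methods have a natural home on the type.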

Labels
breaking-change Change that will require a version bump
3 participants