Map access for expressions #352

Open. Wants to merge 16 commits into base: main
@hntd187 (Collaborator) commented Sep 23, 2024

These changes add support for general map access, access to specific keys, and binary expressions against maps. These expressions have some null semantics which need to be discussed and agreed upon.

I removed the HashMap<_, Option<String>> on partition values since Nick had already done something for it. This is a draft for now; I'll add more context for discussion.
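
To make the null-semantics question above concrete, here is a minimal sketch of the three states a map lookup can produce when values are nullable. It uses plain `std` collections rather than the kernel's Arrow types, and the names `Lookup`, `lookup`, and `eq_literal` are hypothetical. The SQL-style choice of treating both a missing key and a null value as "unknown" is an assumption for illustration, not the semantics this PR settled on.

```rust
use std::collections::HashMap;

/// Hypothetical result of looking up a key in a map with nullable values.
#[derive(Debug, PartialEq)]
enum Lookup<'a> {
    /// The key is not in the map at all.
    Missing,
    /// The key is present but maps to a null value.
    Null,
    /// The key is present with a non-null value.
    Value(&'a str),
}

/// Distinguishes "no such key" from "key present, value null".
fn lookup<'a>(map: &'a HashMap<String, Option<String>>, key: &str) -> Lookup<'a> {
    match map.get(key) {
        None => Lookup::Missing,
        Some(None) => Lookup::Null,
        Some(Some(v)) => Lookup::Value(v.as_str()),
    }
}

/// Assumed SQL-style comparison: missing and null both yield an unknown (`None`) result.
fn eq_literal(map: &HashMap<String, Option<String>>, key: &str, literal: &str) -> Option<bool> {
    match lookup(map, key) {
        Lookup::Value(v) => Some(v == literal),
        Lookup::Missing | Lookup::Null => None,
    }
}

fn main() {
    let mut m = HashMap::new();
    m.insert("a".to_string(), Some("1".to_string()));
    m.insert("b".to_string(), None);

    for key in ["a", "b", "c"] {
        println!("{key}: {:?} / eq '1': {:?}", lookup(&m, key), eq_literal(&m, key, "1"));
    }
}
```

Whether a missing key should behave like a null value, return false, or raise an error is exactly the kind of decision the description leaves open.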


codecov bot commented Sep 23, 2024

Codecov Report

Attention: Patch coverage is 62.71186% with 88 lines in your changes missing coverage. Please review.

Project coverage is 77.65%. Comparing base (313272e) to head (ba16e68).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| kernel/src/engine/arrow_expression.rs | 67.31% | 41 Missing and 26 partials ⚠️ |
| kernel/src/expressions/mod.rs | 37.50% | 10 Missing ⚠️ |
| kernel/src/engine/parquet_stats_skipping.rs | 11.11% | 7 Missing and 1 partial ⚠️ |
| kernel/src/engine/arrow_data.rs | 0.00% | 1 Missing ⚠️ |
| kernel/src/engine/parquet_row_group_skipping.rs | 0.00% | 1 Missing ⚠️ |
| kernel/src/scan/data_skipping.rs | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #352      +/-   ##
==========================================
- Coverage   77.99%   77.65%   -0.34%     
==========================================
  Files          49       49              
  Lines       10328    10535     +207     
  Branches    10328    10535     +207     
==========================================
+ Hits         8055     8181     +126     
- Misses       1821     1876      +55     
- Partials      452      478      +26     


}

fn nullable() -> bool {
    V::nullable()
Collaborator

This doesn't look right? The nullability of a list (or map) should be independent of whether the list elements (mapped values) are nullable? I.e., Option<Vec<T>> and Vec<Option<T>> (or Option<HashMap<K, V>> and HashMap<K, Option<V>>) should be orthogonal?

Collaborator Author

I ended up removing this based on Nick's feedback
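
The orthogonality raised in the review comment above can be sketched with two independent flags, one for the container and one for its elements. This is an illustration only: the traits `Nullable` and `ElementNullable`, the struct `ListNullability`, and the function `describe` are hypothetical and are not the kernel's actual schema traits.

```rust
/// Hypothetical description of a list type's nullability as two independent flags.
#[derive(Debug, PartialEq)]
struct ListNullability {
    list_nullable: bool,
    element_nullable: bool,
}

/// Hypothetical trait: is a value of this type itself allowed to be null?
trait Nullable {
    fn nullable() -> bool {
        false
    }
}

impl Nullable for i32 {}

// Only `Option` makes a value nullable.
impl<T> Nullable for Option<T> {
    fn nullable() -> bool {
        true
    }
}

// A `Vec` is non-nullable no matter what its elements are.
impl<T> Nullable for Vec<T> {}

/// Hypothetical trait: are the elements of this list type allowed to be null?
trait ElementNullable {
    fn element_nullable() -> bool;
}

impl<T: Nullable> ElementNullable for Vec<T> {
    fn element_nullable() -> bool {
        T::nullable()
    }
}

// Wrapping the list in `Option` does not change element nullability.
impl<L: ElementNullable> ElementNullable for Option<L> {
    fn element_nullable() -> bool {
        L::element_nullable()
    }
}

fn describe<L: Nullable + ElementNullable>() -> ListNullability {
    ListNullability {
        list_nullable: L::nullable(),
        element_nullable: L::element_nullable(),
    }
}

fn main() {
    println!("{:?}", describe::<Vec<i32>>());
    println!("{:?}", describe::<Option<Vec<i32>>>());
    println!("{:?}", describe::<Vec<Option<i32>>>());
    println!("{:?}", describe::<Option<Vec<Option<i32>>>>());
}
```

All four combinations are reachable, which is what a single `nullable()` that forwards to `V::nullable()` cannot express.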

kernel/src/engine/arrow_expression.rs (outdated)
let values = map_struct.column(1).as_string::<i32>();
for (key, value) in keys.iter().zip(values.iter()) {
    if let (Some(key), value) = (key, value) {
        ret.insert(key.into(), value.map(Into::into));
Collaborator

It looks like this is the only line that differs between the two methods; is there a way to factor it out with a lambda arg to capture the value transformation?

Collaborator Author

This was also removed because it wasn't necessary, and partitionValues went back to being HashMap<String, String>.
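
Although the code was removed, the refactoring suggested in this thread (one shared loop, with the value transformation passed in as a closure) might look roughly like the sketch below. It works on plain iterators of `Option<&str>` instead of Arrow string arrays, and `collect_map` and its `convert` parameter are hypothetical names. The assumption that the second method simply skipped null values is also mine; the thread does not show it.

```rust
use std::collections::HashMap;

/// Builds a map from parallel key/value columns, skipping rows whose key is null.
/// `convert` is the only part that differs between callers: it turns a raw,
/// possibly-null value into the stored value, or returns `None` to skip the entry.
fn collect_map<'a, V, F>(
    keys: impl Iterator<Item = Option<&'a str>>,
    values: impl Iterator<Item = Option<&'a str>>,
    convert: F,
) -> HashMap<String, V>
where
    F: Fn(Option<&'a str>) -> Option<V>,
{
    let mut ret = HashMap::new();
    for (key, value) in keys.zip(values) {
        if let Some(key) = key {
            if let Some(converted) = convert(value) {
                ret.insert(key.to_string(), converted);
            }
        }
    }
    ret
}

fn main() {
    let keys = [Some("a"), None, Some("c")];
    let values = [Some("1"), Some("2"), None];

    // Variant 1: keep null values as `None` (HashMap<String, Option<String>>).
    let nullable: HashMap<String, Option<String>> = collect_map(
        keys.iter().copied(),
        values.iter().copied(),
        |v| Some(v.map(String::from)),
    );

    // Variant 2 (assumed): drop entries whose value is null (HashMap<String, String>).
    let non_null: HashMap<String, String> = collect_map(
        keys.iter().copied(),
        values.iter().copied(),
        |v| v.map(String::from),
    );

    println!("{nullable:?}");
    println!("{non_null:?}");
}
```

Passing the transformation as a closure keeps the null-key handling in one place, so the two map shapes cannot drift apart.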

@github-actions bot added the breaking-change label (Change that will require a version bump) Oct 15, 2024
@hntd187 marked this pull request as ready for review October 16, 2024 14:12
@@ -318,27 +363,21 @@ fn evaluate_expression(
        .map(wrap_comparison_result)
        .map_err(Error::generic_err)
}
(BinaryOperation { op, left, right }, _)
    if matches!(**left, MapAccess { .. }) && matches!(**right, Literal(_)) =>
Collaborator

Why is it important that we handle BinaryOperation with left: MapAccess, and right: Literal? What about the other way around with left: Literal, right: MapAccess?

@scovich (Collaborator), Nov 8, 2024

Conceptually, we're doing map['key']; 'key'[map] doesn't really make sense?

Collaborator Author

Well, the left side is map['key'] by itself, so if you knew the key you'd write map['key'] = 1 or whatever. Once the map access on the left side is evaluated, the comparison essentially boils down to a literal vs. literal match.
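
The guard discussed in this thread can be illustrated with a toy expression type. The sketch below is not the kernel's `Expression` enum: `Expr`, `Row`, `resolve`, and `evaluate` are hypothetical, only equality is modeled, and the second match arm (swapping a literal-on-the-left comparison into the map-access-on-the-left form) is one possible answer to the reviewer's question, not something this PR is known to implement.

```rust
use std::collections::HashMap;

/// Toy expression type (hypothetical; much smaller than the kernel's).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(String),
    /// `column[key]` on a named map column.
    MapAccess { column: String, key: String },
    /// Equality between two sub-expressions.
    Equals { left: Box<Expr>, right: Box<Expr> },
}
use Expr::{Equals, Literal, MapAccess};

/// A row is a set of named map columns.
type Row = HashMap<String, HashMap<String, String>>;

/// Resolves an operand to a value, or `None` when the column or key is absent.
fn resolve(expr: &Expr, row: &Row) -> Option<String> {
    match expr {
        Literal(s) => Some(s.clone()),
        MapAccess { column, key } => row.get(column)?.get(key).cloned(),
        Equals { .. } => None, // nested comparisons are out of scope for this sketch
    }
}

fn evaluate(expr: &Expr, row: &Row) -> Option<bool> {
    match expr {
        // Same guard shape as in the diff: map access on the left, literal on the right.
        Equals { left, right }
            if matches!(**left, MapAccess { .. }) && matches!(**right, Literal(_)) =>
        {
            // After the map access resolves, this is a value vs. literal comparison.
            Some(resolve(left, row)? == resolve(right, row)?)
        }
        // Assumed extension: equality is symmetric, so swap the operands and retry.
        Equals { left, right }
            if matches!(**left, Literal(_)) && matches!(**right, MapAccess { .. }) =>
        {
            evaluate(&Equals { left: right.clone(), right: left.clone() }, row)
        }
        _ => None,
    }
}

fn main() {
    let mut partition_values = HashMap::new();
    partition_values.insert("year".to_string(), "2024".to_string());
    let mut row = Row::new();
    row.insert("partitionValues".to_string(), partition_values);

    let access = MapAccess { column: "partitionValues".into(), key: "year".into() };
    let expr = Equals { left: Box::new(access), right: Box::new(Literal("2024".into())) };
    println!("{:?}", evaluate(&expr, &row));
}
```

Swapping operands is only safe for symmetric operators; an ordered comparison such as `<` would also need the operator flipped.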
