Skip to content

Commit

Permalink
Add visitor for converting kernel expressions to engine expressions (#…
Browse files Browse the repository at this point in the history
…363)

## Rationale for the change
Engines may need to apply expressions that originate in the kernel. This
can include filter predicates such as filters for protocol and metadata
actions in a log file. Another example is logical to physical
transformations that the engine must apply, for instance to materialize
partition columns. This PR introduces a visitor framework for kernel
expressions to be converted to engine expressions so that engines can
apply their own evaluators, predicate pushdowns, and optimizations on
these expressions.

Closes: #358 
## What changes are included in this PR?
5 major changes are made in this PR:
1) A new module `ffi/src/expressions/kernel.rs` is added. This module is
responsible for converting kernel `Expression`s to the engine's
representation of expressions through an ffi-compatible visitor. The
visitor supports binary operations, scalars, struct expressions,
variadic and unary operators, and columns. All the engine has to do is
implement an `EngineExpressionVisitor` to support the conversion.
2) The `EnginePredicate` to kernel `Expression`visitor code is moved to
`ffi/src/expressions/engine.rs`. This is done to keep all ffi expression
code under the `ffi/src/expressions` module.
3) This PR also allows engines to get `SharedExpression`s, which are
handles to the kernel's `Expression` type. These are given back to the
kernel to specify which expression the engine would like to visit.
4) A new ffi example is added to showcase how the
`EngineExpressionVisitor` can be used to construct expressions. This is
a C implementation that can be found in
`ffi/examples/visit-expression/`. The example only prints an expression
that it receives from the kernel and asserts that it matches an expected
output.
5) A new testing feature flag `test-ffi` is added to the `ffi` crate.
This feature flag is used to expose testing functions found in
`ffi/src/test_ffi`. This module is currently used to return a kernel
expression `get_testing_kernel_expression` used in the example c
implementation.
<!-- Mention all the new apis for the visitor. How they work (ex:
Schema) -->
## Are these changes tested?
Two tests are added:
- I introduced a new test to validate the expression visitor functions
correctly and converts data as expected. This simply prints out the
expression tree structure and compares it to an expected result. The
kernel expression can be found in `ffi/src/test_ffi.rs`, and the
expected output is in `ffi/tests/test-expression-visitor/expected.txt`
- There is also a test that checks for memory leaks in the expression
visitor using valgrind.

## Are there any user-facing changes?
This PR introduces a public facing API for constructing expressions and
visiting them. It also adds `delta_kernel_derive` as a public export of
`kernel` crate. This PR does not break existing APIs.
  • Loading branch information
OussamaSaoudi-db authored Oct 28, 2024
1 parent 80f08db commit 5e5c299
Showing 18 changed files with 1,279 additions and 8 deletions.
13 changes: 11 additions & 2 deletions .github/workflows/build.yml
Original file line number Diff line number Diff line change
@@ -86,6 +86,7 @@ jobs:
sudo apt update
sudo apt install -y -V libarrow-dev # For C++
sudo apt install -y -V libarrow-glib-dev # For GLib (C)
sudo apt install -y -V valgrind # For memory leak test
elif [ "$RUNNER_OS" == "macOS" ]; then
brew install apache-arrow
brew install apache-arrow-glib
@@ -108,16 +109,24 @@ jobs:
cargo build
popd
pushd ffi
cargo b --features default-engine,sync-engine
cargo b --features default-engine,sync-engine,test-ffi
popd
- name: build and run test
- name: build and run read-table test
run: |
pushd ffi/examples/read-table
mkdir build
pushd build
cmake ..
make
make test
- name: build and run visit-expression test
run: |
pushd ffi/examples/visit-expression
mkdir build
pushd build
cmake ..
make
make test
coverage:
runs-on: ubuntu-latest
1 change: 1 addition & 0 deletions ffi/Cargo.toml
Original file line number Diff line number Diff line change
@@ -47,3 +47,4 @@ default-engine = [
]
sync-engine = ["delta_kernel/sync-engine"]
developer-visibility = []
test-ffi = []
33 changes: 33 additions & 0 deletions ffi/examples/visit-expression/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
cmake_minimum_required(VERSION 3.12)
project(visit_expressions)

add_executable(visit_expression visit_expression.c)
target_compile_definitions(visit_expression PUBLIC DEFINE_DEFAULT_ENGINE)
target_include_directories(visit_expression PUBLIC "${CMAKE_CURRENT_SOURCE_DIR}/../../../target/ffi-headers")
target_link_directories(visit_expression PUBLIC "${CMAKE_CURRENT_SOURCE_DIR}/../../../target/debug")
target_link_libraries(visit_expression PUBLIC delta_kernel_ffi)
target_compile_options(visit_expression PUBLIC)

target_compile_options(visit_expression PRIVATE -Wall -Wextra -Wpedantic -Werror -Wno-strict-prototypes)

# Get info on the OS and platform
if (${CMAKE_SYSTEM_NAME} MATCHES "Darwin")
set(MACOSX TRUE)
endif()
if(UNIX AND NOT APPLE)
set(LINUX TRUE)
endif()

# Add the kernel expresion -> engine expression test
include(CTest)
set(ExprTestRunner "../../../tests/test-expression-visitor/run_test.sh")
set(ExprExpectedPath "../../../tests/test-expression-visitor/expected.txt")
add_test(NAME test_expression_visitor COMMAND ${ExprTestRunner} ${ExprExpectedPath})
if(LINUX)
add_test(NAME test_expression_visitor_leaks COMMAND valgrind
--error-exitcode=1
--tool=memcheck
--leak-check=full
--errors-for-leak-kinds=definite
--show-leak-kinds=definite ./visit_expression)
endif()
Loading

0 comments on commit 5e5c299

Please sign in to comment.