test: Port cdf tests from delta-spark to kernel #611

Merged
merged 12 commits into from
Jan 15, 2025
address issues
OussamaSaoudi-db committed Dec 20, 2024
commit 67a859622eebcf2a497e508801caf43226990dcf
7 changes: 3 additions & 4 deletions kernel/tests/cdf.rs
@@ -329,8 +329,9 @@ fn simple_cdf_version_ranges() -> DeltaResult<()> {
#[test]
fn update_operations() -> DeltaResult<()> {
let batches = read_cdf_for_table("cdf-table-update-ops", 0, 2, None)?;
- // Note: `update_pre` and `update_post` are technically not part of the delta spec, but are
- // part of the tests used in delta
+ // Note: `update_pre` and `update_post` are technically not part of the delta spec, and instead
+ // should be `update_preimage` and `update_postimage` respectively. However, the tests in
+ // delta-spark use the post and pre.
Comment on lines +332 to +334
Collaborator

How are we observing `update_pre` and `update_post` here, then? Aren't we reading the CDF and then filling in our own `update_preimage`, etc.?

Collaborator Author

All update `_change_type` values come directly from a cdc file. We don't insert or modify them. In this test, delta-spark wrote `update_pre` and `update_post` directly into the cdc file.
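
To illustrate the pass-through described above, here is a minimal sketch. It is not the kernel's actual code: `ScanFile` and `change_type` are hypothetical names made up for this example, and the `insert`/`delete` labels for add and remove files are an assumption based on how the Delta protocol describes change data. The point is only that a value read from a cdc file is returned untouched, so a non-spec label such as `update_pre` surfaces as-is.

```rust
/// Hypothetical stand-in for the kinds of files a CDF read may encounter.
/// This is NOT a delta_kernel type; it exists only for illustration.
#[derive(Debug, Clone, PartialEq)]
enum ScanFile {
    /// Rows from an `add` action's data file (no `_change_type` column on disk).
    Add,
    /// Rows from a `remove` action's data file (no `_change_type` column on disk).
    Remove,
    /// Rows from a cdc file, which already carries a `_change_type` value.
    Cdc { change_type: String },
}

/// Returns the `_change_type` reported for rows of the given file.
/// Assumption: add/remove rows get a generated label, while cdc rows
/// keep whatever label the writer stored, valid per the spec or not.
fn change_type(file: &ScanFile) -> String {
    match file {
        ScanFile::Add => "insert".to_string(),
        ScanFile::Remove => "delete".to_string(),
        // Passed through unmodified: no normalization to `update_preimage`.
        ScanFile::Cdc { change_type } => change_type.clone(),
    }
}
```

Under these assumptions, `change_type(&ScanFile::Cdc { change_type: "update_pre".to_string() })` yields `update_pre` rather than `update_preimage`, which matches what the test table shows.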

let mut expected = vec![
"+----+--------------+-----------------+",
"| id | _change_type | _commit_version |",
@@ -365,8 +366,6 @@ fn update_operations() -> DeltaResult<()> {
#[test]
fn false_data_change_is_ignored() -> DeltaResult<()> {
let batches = read_cdf_for_table("cdf-table-data-change", 0, 1, None)?;
- // Note: `update_pre` and `update_post` are technically not part of the delta spec, but are
- // part of the tests used in delta
let mut expected = vec![
"+----+--------------+-----------------+",
"| id | _change_type | _commit_version |",
2 changes: 0 additions & 2 deletions kernel/tests/common/mod.rs
@@ -46,11 +46,9 @@ pub(crate) fn load_test_data(
test_name: &str,
) -> Result<tempfile::TempDir, Box<dyn std::error::Error>> {
let path = format!("{test_parent_dir}/{test_name}.tar.zst");
- println!("Path: {path}");
let tar = zstd::Decoder::new(std::fs::File::open(path)?)?;
let mut archive = tar::Archive::new(tar);
let temp_dir = tempfile::tempdir()?;
- println!("dir : {:?}", temp_dir.path());
archive.unpack(temp_dir.path())?;
Ok(temp_dir)
}