feat: add `Table::checkpoint()` API #797

sebastiantia · 2025-04-02T20:47:41Z

What changes are proposed in this pull request?

Please ignore the checkpoint/log_replay module; it is introduced in the stacked PR #774.

This PR implements the checkpoint API, enabling table checkpoints that provide compact summaries of table state for faster recovery without full log replay.

Key Features

Adds a Table::checkpoint() method that automatically selects the appropriate checkpoint type based on table feature support:
- Tables with v2Checkpoints feature → Creates a Single-file Classic-named V2 Checkpoint
- Tables without v2Checkpoints feature → Creates a Single-file Classic-named V1 Checkpoint
Implements the CheckpointWriter.
- Handles the generation of checkpoint data & path on call to .checkpoint_data()
- Handles the finalization of the checkpointing process by writing the _last_checkpoint file on call to .finalize().
- Note: we require the engine to write the entire checkpoint file to storage before calling .finalize(), otherwise the table may be corrupted.
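
The selection rule in the first bullet can be sketched as a small pure function. This is only an illustration under assumed names: `CheckpointKind`, `select_checkpoint_kind`, and `includes_checkpoint_metadata` are hypothetical and are not taken from the kernel code in this PR.

```rust
/// Hypothetical enum naming the two checkpoint flavours this PR can produce.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CheckpointKind {
    /// Single-file, classic-named V1 checkpoint.
    ClassicV1,
    /// Single-file, classic-named V2 checkpoint.
    ClassicV2,
}

/// Picks the checkpoint flavour from whether the table supports the
/// v2 checkpoint feature (assumed to be resolved by the caller).
fn select_checkpoint_kind(supports_v2_checkpoints: bool) -> CheckpointKind {
    if supports_v2_checkpoints {
        CheckpointKind::ClassicV2
    } else {
        CheckpointKind::ClassicV1
    }
}

/// Assumption for illustration: only V2 checkpoints carry a
/// checkpoint-metadata action alongside the regular actions.
fn includes_checkpoint_metadata(kind: CheckpointKind) -> bool {
    matches!(kind, CheckpointKind::ClassicV2)
}
```

Keeping the decision in one function means the caller never chooses the checkpoint type by hand, which matches the "automatically selects" wording above.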

API Overview

The checkpoint workflow for the engine consists of:

1. Creating a CheckpointWriter via Table::checkpoint(engine, version)
2. Retrieving checkpoint data with CheckpointWriter::checkpoint_data()
3. Writing the data to storage (implementation-specific)
4. Finalizing the checkpoint by calling CheckpointWriter::finalize()
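
A minimal sketch of this workflow, using stand-in types rather than the kernel's real API. Everything here (`CheckpointWriter` as defined below, `CheckpointData`, `InMemoryStore`, `run_checkpoint`, and the exact signatures) is hypothetical; only the order of the steps and the rule that the data file must exist before `finalize()` come from the description above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the checkpoint writer; not the kernel type.
struct CheckpointWriter {
    version: u64,
}

/// Hypothetical pairing of a destination path with serialized checkpoint bytes.
struct CheckpointData {
    path: String,
    bytes: Vec<u8>,
}

/// Toy in-memory store standing in for the engine's storage layer.
#[derive(Default)]
struct InMemoryStore {
    files: HashMap<String, Vec<u8>>,
}

impl CheckpointWriter {
    fn new(version: u64) -> Self {
        Self { version }
    }

    /// Step 2: produce the checkpoint data and its classic-named path.
    fn checkpoint_data(&self) -> CheckpointData {
        CheckpointData {
            path: format!("_delta_log/{:020}.checkpoint.parquet", self.version),
            bytes: b"checkpoint actions".to_vec(),
        }
    }

    /// Step 4: write the `_last_checkpoint` hint, but only if the
    /// checkpoint file itself is already in storage.
    fn finalize(self, store: &mut InMemoryStore, data_path: &str) -> Result<(), String> {
        if !store.files.contains_key(data_path) {
            return Err(format!("checkpoint file {data_path} was not written"));
        }
        let hint = format!("{{\"version\":{}}}", self.version);
        store
            .files
            .insert("_delta_log/_last_checkpoint".to_string(), hint.into_bytes());
        Ok(())
    }
}

/// Runs steps 1 to 4 in order and returns the checkpoint path.
fn run_checkpoint(store: &mut InMemoryStore, version: u64) -> Result<String, String> {
    let writer = CheckpointWriter::new(version); // step 1
    let data = writer.checkpoint_data(); // step 2
    store.files.insert(data.path.clone(), data.bytes); // step 3
    writer.finalize(store, &data.path)?; // step 4
    Ok(data.path)
}
```

In this toy version `finalize()` can check storage itself; in the PR as described, the engine is responsible for completing the write before calling it.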

This PR is stacked on #774.

Please only review these commits.

How was this change tested?

Unit tests in checkpoint/mod.rs
test_deleted_file_retention_timestamp - tests file retention timestamp calculations
test_create_checkpoint_metadata_batch_when_v2_checkpoints_is_supported
test_create_checkpoint_metadata_batch_when_v2_checkpoints_not_supported
test_create_last_checkpoint_metadata
test_create_last_checkpoint_metadata_with_invalid_batch

'Integration tests' in checkpoint/tests.rs
test_v1_checkpoint_latest_version_by_default: table that does not support v2Checkpoint, no checkpoint version specified
test_v1_checkpoint_specific_version: table that does not support v2Checkpoint, checkpointing at a specific version
test_v2_checkpoint_supported_table: table that supports v2Checkpoint & no version is specified
test_checkpoint_error_handling_invalid_version: checkpoint with a version that does not exist in the log

zachschuermann

few comments to start

kernel/src/checkpoint/mod.rs

zachschuermann · 2025-04-09T02:36:05Z

kernel/src/checkpoint/mod.rs

+        &mut self,
+        engine: &dyn Engine,
+    ) -> DeltaResult<SingleFileCheckpointData> {
+        if self.data_consumed {


we can use types to manage this 'state transition' if needed

sebastiantia · 2025-04-09

Taking a step back, do we want to prevent the engine from calling .checkpoint_data() multiple times to retrieve multiple copies of the checkpoint data & path in the first place? If they really wanted to, they could always just call table.checkpoint() again, so I don't see a strong reason to.
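
For reference, the "use types to manage this state transition" idea could look roughly like the typestate sketch below, which would replace the runtime `data_consumed` flag with a compile-time guarantee. The names `PendingWriter` and `DataTaken` are hypothetical and are not part of this PR.

```rust
/// Hypothetical writer state: checkpoint data has not been handed out yet.
struct PendingWriter {
    version: u64,
}

/// Hypothetical writer state: data was handed out, only finalization remains.
struct DataTaken {
    version: u64,
}

impl PendingWriter {
    /// Takes `self` by value, so calling this twice on the same writer
    /// is a compile error rather than a runtime check.
    fn checkpoint_data(self) -> (Vec<u8>, DataTaken) {
        let data = format!("actions@{}", self.version).into_bytes();
        (data, DataTaken { version: self.version })
    }
}

impl DataTaken {
    /// Only reachable after `checkpoint_data()`; returns the checkpointed version.
    fn finalize(self) -> u64 {
        self.version
    }
}
```

The trade-off raised in the reply still applies: if retrieving the data more than once is harmless, neither the flag nor the extra types are needed.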

kernel/src/checkpoint/mod.rs

sebastiantia · 2025-04-11T18:11:27Z

kernel/src/checkpoint/log_replay.rs

This PR is stacked on #774; do not review this file (it was added in the stacked PR).

sebastiantia · 2025-04-11T19:16:16Z

I can separate out the finalize() API to a separate PR if reviewers think this PR is too bloated, lmk!

OussamaSaoudi · 2025-04-12T20:11:10Z

kernel/src/checkpoint/mod.rs

+//!
+//! ## Example: Writing a classic-named V1/V2 checkpoint (depending on `v2Checkpoints` feature support)
+//!
+//! TODO(seb): unignore example


One thing that I like to do is to use FIXME for stuff I want to fix this PR, and TODO for stuff that I wanna fix in a subsequent issue/PR.

OussamaSaoudi · 2025-04-12T20:14:43Z

kernel/src/checkpoint/mod.rs

+//! - Single-file UUID-named V2 checkpoints (using `n.checkpoint.u.{json/parquet}` naming) are to be
+//!   implemented in the future. The current implementation only supports classic-named V2 checkpoints.
+//! - Multi-file V2 checkpoints are not supported yet. The API is designed to be extensible for future
+//!   multi-file support, but the current implementation only supports single-file checkpoints.
+//!


I'd like to see github issues and TODOs associated with these so that they're tracked. Something like:

TODO(433): Support Single-file

@zachschuermann This is a pattern I've seen in other repositories, and I'd advocate for this to be the way we track todos.

zachschuermann · 2025-04-13

nice, great idea! let's try to use that issue numbering, i agree that's nice

regarding issues/etc.: how about we update the original 'checkpoint support' issue with some clearly documented bits on what was done and what is future work (we can always make the future work into sub-issues, etc.)

i think the original issues were #736 and #499. how about we consolidate? (feel free to just close one or both with comments on whatever we select, or on a new one we make)

OussamaSaoudi · 2025-04-12T20:15:48Z

kernel/src/checkpoint/mod.rs

+//! let checkpoint_data = writer.checkpoint_data()?;
+//!
+//! // Write checkpoint data to storage (implementation-specific)
+//! let metadata = your_storage_implementation.write_checkpoint(


It is not clear to me what your_storage_implementation is. For code docs like this, I'd like for them to be something that actually compiles

OussamaSaoudi · 2025-04-12T20:26:46Z

kernel/src/checkpoint/mod.rs

+    }
+
+    // The current checkpoint API only supports single-file checkpoints.
+    let parts: i64 = 1; // Coerce the type to `i64`` to match the expected schema.


Suggested change

-    let parts: i64 = 1; // Coerce the type to `i64`` to match the expected schema.
+    let parts = 1i64;

OussamaSaoudi · 2025-04-12T20:28:05Z

kernel/src/checkpoint/mod.rs

+/// - `sizeInBytes` (i64, optional): Size of checkpoint file in bytes
+/// - `numOfAddFiles` (i64, optional): Number of Add actions
+///
+/// Note: The fields `checkpointSchema` and `checksum` are not yet included in this


Could you link the todos here?

TODO(xxx): Add checkpointSchema field to _last_checkpoint
TODO(xxx): Add checksum field to _last_checkpoint

OussamaSaoudi · 2025-04-12T20:29:34Z

kernel/src/checkpoint/mod.rs

+    let last_checkpoint_schema = Arc::new(StructType::new([
+        StructField::not_null("version", DataType::LONG),
+        StructField::not_null("size", DataType::LONG),
+        StructField::nullable("parts", DataType::LONG),
+        StructField::nullable("sizeInBytes", DataType::LONG),
+        StructField::nullable("numOfAddFiles", DataType::LONG),
+    ]));


Declare this somewhere with a static lazy lock, then clone the arc.
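
A sketch of what "static lazy lock, then clone the arc" could look like with `std::sync::LazyLock`. The `Schema` struct below is a hypothetical stand-in for the kernel's `StructType`, with each field reduced to a name and a nullable flag; only the field names and nullability are taken from the quoted diff.

```rust
use std::sync::{Arc, LazyLock};

/// Hypothetical stand-in for the kernel schema type: (field name, nullable).
#[derive(Debug, PartialEq)]
struct Schema {
    fields: Vec<(&'static str, bool)>,
}

/// Built once on first access; later accesses reuse the same allocation.
static LAST_CHECKPOINT_SCHEMA: LazyLock<Arc<Schema>> = LazyLock::new(|| {
    Arc::new(Schema {
        fields: vec![
            ("version", false),
            ("size", false),
            ("parts", true),
            ("sizeInBytes", true),
            ("numOfAddFiles", true),
        ],
    })
});

/// Hands out a cheap reference-counted handle instead of rebuilding the schema.
fn last_checkpoint_schema() -> Arc<Schema> {
    Arc::clone(&*LAST_CHECKPOINT_SCHEMA)
}
```

Each call only bumps a reference count, so the schema is constructed a single time no matter how many checkpoints are written.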

OussamaSaoudi · 2025-04-12T20:42:18Z

kernel/src/checkpoint/mod.rs

+            self.total_actions_counter.load(Ordering::Relaxed),
+            self.add_actions_counter.load(Ordering::Relaxed),


I would be wary of using Ordering::Relaxed without being absolutely certain things won't go wrong.

Check out the Rustonomicon.

I'd advise keeping all the counter accesses/updates as SeqCst. We're not doing many loads/stores, especially compared to the huge amount of data we'd be processing, so it's not a big performance hit.
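
A small sketch of shared counters that use `Ordering::SeqCst` throughout, as suggested. `ActionCounters` and `count_in_parallel` are hypothetical names; the real counters in this PR live on the writer. The example shows the shape of the code and does not demonstrate a case where `Relaxed` would actually go wrong.

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical counters shared between the data producer and the writer.
#[derive(Default)]
struct ActionCounters {
    total_actions: AtomicI64,
    add_actions: AtomicI64,
}

impl ActionCounters {
    /// Records one processed batch using sequentially consistent updates.
    fn record_batch(&self, total: i64, adds: i64) {
        self.total_actions.fetch_add(total, Ordering::SeqCst);
        self.add_actions.fetch_add(adds, Ordering::SeqCst);
    }

    /// Reads both counters using sequentially consistent loads.
    fn snapshot(&self) -> (i64, i64) {
        (
            self.total_actions.load(Ordering::SeqCst),
            self.add_actions.load(Ordering::SeqCst),
        )
    }
}

/// Spawns `workers` threads that each record `batches_per_worker` batches
/// of 10 actions (4 of them adds), then returns the final totals.
fn count_in_parallel(workers: usize, batches_per_worker: usize) -> (i64, i64) {
    let counters = Arc::new(ActionCounters::default());
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counters = Arc::clone(&counters);
            thread::spawn(move || {
                for _ in 0..batches_per_worker {
                    counters.record_batch(10, 4);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    counters.snapshot()
}
```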

OussamaSaoudi · 2025-04-12T20:45:37Z

kernel/src/checkpoint/mod.rs

+        let schema = Arc::new(StructType::new([StructField::not_null(
+            CHECKPOINT_METADATA_NAME,
+            DataType::struct_type([StructField::not_null("version", DataType::LONG)]),
+        )]));


This is another opportunity for static LazyLock + clone.

OussamaSaoudi · 2025-04-12T20:52:24Z

kernel/src/checkpoint/mod.rs

+    let now_ms: i64 = now_duration
+        .as_millis()
+        .try_into()
+        .map_err(|_| Error::generic("Current timestamp exceeds i64 millisecond range"))?;


Please create an error type for the checkpoint writer. The use of Error::generic in our code makes it harder to test these error cases, and it makes debugging really challenging.

Suppose someone runs a kernel connector and just gets back "Current timestamp exceeds i64 millisecond range". Where do they even begin to look for the bug?
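
One possible shape for such an error type, sketched with the standard library only. `CheckpointError`, its `TimestampOutOfRange` variant, and `duration_to_millis` are hypothetical names for illustration; the PR may well structure this differently (for example as a variant on the existing kernel `Error`).

```rust
use std::fmt;
use std::time::Duration;

/// Hypothetical error type dedicated to the checkpoint writer.
#[derive(Debug, PartialEq)]
enum CheckpointError {
    /// The current time in milliseconds does not fit in an `i64`.
    TimestampOutOfRange { millis: u128 },
}

impl fmt::Display for CheckpointError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CheckpointError::TimestampOutOfRange { millis } => write!(
                f,
                "checkpoint writer: timestamp {millis} ms exceeds the i64 millisecond range"
            ),
        }
    }
}

impl std::error::Error for CheckpointError {}

/// Converts a duration since the epoch to `i64` milliseconds,
/// returning a typed error instead of a generic string.
fn duration_to_millis(now_duration: Duration) -> Result<i64, CheckpointError> {
    let millis = now_duration.as_millis();
    i64::try_from(millis).map_err(|_| CheckpointError::TimestampOutOfRange { millis })
}
```

A typed variant lets tests match on the exact failure and tells a connector author which component raised it.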

sebastiantia added 30 commits March 12, 2025 10:16

introduce visitors

435302e

remove pub

e500a10

assert! instead of assert_eq with bool

19733cd

log replay for checkpoints

87c9f31

rename & some clean up

db5ccd0

remove new path for now

42c08c1

merge non file action visitor tests

f91baeb

mvp for refactor

9fdfba7

these github action checks clog my screen

d420fd1

base file actions struct

9e0e048

combine visitors

303444b

fmt

5dbc924

remove old code

b793961

move FileActionKey

508976f

Merge branch 'main' into checkpoint-visitors

bccaa17

merge

a23d7cb

fix whitespace

0160ef1

remove old code

aae7046

refactor more

f574370

refactor

a618833

more docs

7da74b2

invert is_log_batch logic

220a216

docs

9d86911

docs

e5b0e32

docs and imports

a5393dc

improve mod doc

a23c651

improve doc

d712d18

docs'

e564ae1

docs

b14ff19

update

a52d484

sebastiantia added 3 commits April 8, 2025 10:38

remove file

aed3ab6

Merge branch 'checkpoint-visitors' into checkpoint-replay

c92ea56

docs

fcb289d

sebastiantia marked this pull request as ready for review April 8, 2025 20:10

zachschuermann reviewed Apr 9, 2025

sebastiantia added 13 commits April 9, 2025 09:36

docs

4d2029e

docs

e0d81ab

review

b4e28ee

partial review

544c42a

arc atomic

e8d1239

arc

e9de5bc

.finalize() with tests

99d31a7

merge

0b609d5

docs

ab0a373

test coverage

9a9697a

doc

c7630a3

merge

011ec3f

fix merge

c58074b

sebastiantia commented Apr 11, 2025

docs

78fab5f

sebastiantia requested review from zachschuermann, nicklan, scovich and OussamaSaoudi April 11, 2025 19:13

sebastiantia removed the breaking-change Change that will require a version bump label Apr 11, 2025

build & doc fixes

64c720d

fmt

7c90c33

OussamaSaoudi reviewed Apr 12, 2025
