Skip to content

Conversation

@samansmink
Copy link
Collaborator

Opening PR to get some feedback on the FFI design here.

This is a first stab at updating the FFIs for snapshot to allow metadata reuse. I've tested this by using it in DuckDB's delta extension and are seeing a quite significant performance increase with this across high latency network, even for tiny tables!

My main question for the reviewer now is that of how to design the FFI, particularly related to breaking the existing api.

Right now we have snapshot ffi surface that can create snapshots using various params:

  • specific version
  • log tail (for catalog managed)
  • old snapshot (for metadata reuse)

Previously, we had:

fn snapshot(
    path: KernelStringSlice,
    engine: Handle<SharedExternEngine>,
)

fn snapshot_at_version(
    path: KernelStringSlice,
    engine: Handle<SharedExternEngine>,
    version: Version,
)

In this PR i would like to propose the breaking change

fn snapshot(
    path: OptionalValue<KernelStringSlice>,
    engine: Handle<SharedExternEngine>,
    old_snapshot: OptionalValue<Handle<SharedSnapshot>>,
    version: OptionalValue<Version>,
)
fn snapshot_with_log_tail(
    path: OptionalValue<KernelStringSlice>,
    engine: Handle<SharedExternEngine>,
    old_snapshot: OptionalValue<Handle<SharedSnapshot>>,
    version: OptionalValue<Version>,
    log_tail: OptionalValue<log_path::LogPathArray>,
)

We can also consider making a non-breaking change by adding a new function, but I feel that just clutters things and having 1 is just a little nicer, even though invoking it is now a little uglier.

Would love some feedback here!

@samansmink samansmink changed the title Snapshot reuse snapshot FFI rework Jan 21, 2026
@samansmink samansmink changed the title snapshot FFI rework [DRAFT] feat: snapshot FFI rework Jan 21, 2026
@samansmink samansmink requested a review from nicklan January 21, 2026 14:11
@github-actions github-actions bot added the breaking-change Change that require a major version bump label Jan 21, 2026
Copy link
Member

@nicklan nicklan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool yeah, this approach looks good for now. we'll need to also update the example c code to pass NULL for the old snapshot, but otherwise this looks great. thanks!

/// Parameters:
/// - `path`: Table path (required when `old_snapshot` is not provided, ignored otherwise)
/// - `engine`: The engine handle
/// - `old_snapshot`: Pass an existing snapshot for optimized creation (avoids re-reading the entire log)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
/// - `old_snapshot`: Pass an existing snapshot for optimized creation (avoids re-reading the entire log)
/// - `old_snapshot`: Optionally pass an existing snapshot for optimized creation (avoids re-reading the entire log)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking-change Change that require a major version bump

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants