Replies: 2 comments
-
First of all, I really like having an articulated policy for this 👍. One major question for me would be: will we, in a kernelized world, still try to maintain a version that does not need datafusion? If so, aligning our versioning with datafusion might be problematic, as it becomes somewhat meaningless... Then again, the difference in features between what we can do without datafusion and what kernel can do with the default engine might not be significant. So I would be open to always requiring datafusion in 1.0. Just a thought dump though 🙂
-
Mostly as a consumer, it would not bother me either way. We pull in multiple arrow and datafusion crates, and update the triad of dependencies when deltalake releases. For simpler use cases, +1 to pushing for using the re-exported symbols. As for the minor version tracking the datafusion version: the trade-off is tying your minor-level releases to a df upgrade as well, so it's only really semver when datafusion upgrades. If folks are pushed to use the re-exported symbols, then it doesn't really matter to them. Maybe for the advanced users some version matrix is sufficient? (That is easier at-a-glance than parsing the Cargo.toml 😄)
-
I was thinking about how we might handle semantic versioning with a really-soon-now-I-promise `deltalake` 1.0 Rust crate release.

As you may know, we have fairly rapid major version dependencies changing underneath us from both the `datafusion` and `arrow` crates. Technically `object_store` can have, and has recently had, some "major" (0.x) API changes. I personally don't want to run like a hamster on a release treadmill where arrow releases a major version change, then datafusion releases a major version change, and then delta-rs needs a major version change.

The theory behind this treadmill is that since we expose `RecordBatch` and other datafusion/arrow symbols in our APIs, it would be semantically breaking for us not to rev major versions when those dependencies change.

I believe `deltalake` 1.0 should not have major version changes due to upgrades of arrow or datafusion.

I have written before about our policy on re-exporting some symbols which are quite helpful for downstream users of the crate(s). I believe this pattern of symbol re-export can help insulate our users from major semver churn due to underlying changes in arrow and datafusion. Meaning we should strongly encourage users to use our re-exported symbols, which will always be the "right" `RecordBatch`, for example.

The only potential hiccup is when a downstream consumer is pulling arrow in directly because they are negotiating the dependency graphs between `deltalake` and another crate which requires arrow. For those users, I think we can commit to clearly documenting the minor versions where we would see these changes, and encourage them to pin their dependencies to `1.0.x` or `1.1.x`.
If we wanted to go that extra mile, I might suggest that our minor versions track datafusion, and then we release under patch releases, i.e.:

- `1.47.0`: first release with df 47
- `1.47.1`: patch release with our changes
- `1.47.2`: etc.
- `1.48.0`: first release with df 48
- `2.49.0`: major breaking API changes in `deltalake` (and df 49)

(and so on.)
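Under a scheme like this, a downstream consumer who pulls arrow in directly could pin to a single df series with a tilde requirement (the version numbers here are illustrative, matching the hypothetical examples above):

```toml
[dependencies]
# Stay on the df-47 series: a "~1.47" requirement accepts 1.47.x patch
# releases but never an automatic jump to 1.48 (i.e. df 48).
deltalake = "~1.47"
```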
I'm curious what everybody thinks; I'm hoping not to 🏃♂️ too much after our upstream dependencies 😄