Skip to content

feat(transaction): Add retry logic to transaction #1484

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 14, 2025

Conversation

CTTY
Copy link
Contributor

@CTTY CTTY commented Jul 1, 2025

Which issue does this PR close?

What changes are included in this PR?

  • Added retry logic using backon

Are these changes tested?

Added unit tests

@CTTY CTTY marked this pull request as draft July 1, 2025 18:43
Copy link
Contributor

@dentiny dentiny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious how do we plan to test the retry? i.e. is it possible to use mock catalog and inject retriable / non-retriable error?

@CTTY
Copy link
Contributor Author

CTTY commented Jul 1, 2025

curious how do we plan to test the retry? i.e. is it possible to use mock catalog and inject retriable / non-retriable error?

Hi @dentiny , this is a good question:) I'm thinking of using a mock catalog, but haven't figured out exactly how

@dentiny
Copy link
Contributor

dentiny commented Jul 1, 2025

Hi @dentiny , this is a good question:) I'm thinking of using a mock catalog, but haven't figured out exactly how

I use mockall on traits.
Just for reference:)

.expect("Invalid value for commit.retry.num-retries"),
)
.with_factor(2.0)
.build()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really related to this change, but something I noticed while writing the PR:

  1. There is not a centralized properties class, which makes properties hard to locate in the code base: Some properties are defined in table_metadata.rs, and some are defined in the classes that use them: example
  2. No orthogonal way to fetch and parse property value, which makes error handling inconsistent. For example, this LOC will silently fallback to the default value if property parsing failed

For 1, we can have a dedicated file like table_metadata.rs to hold all hardcoded property keys, or we can have a property struct to define properties we support like what datafusion does

For 2, something like iceberg-java's PropertyUtil can be very useful

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of 1. I agree that we could have a module like table_properties to contain all the properties, and have sth to parse properties in a type safe way like following:

pub struct TableProperties {
     #[key="commit.num.retries", default="1"]
     commitRetries: i32
}

We could track this in following up issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created the issue to track the task of centralizing properties: #1505

@CTTY CTTY marked this pull request as ready for review July 3, 2025 04:36
@CTTY
Copy link
Contributor Author

CTTY commented Jul 3, 2025

cc @liurenjie1024 @Xuanwo this PR is ready for review, PTAL, thanks!

Copy link
Contributor

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CTTY for this pr, generally LGTM! Just some nits.

.expect("Invalid value for commit.retry.num-retries"),
)
.with_factor(2.0)
.build()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of 1. I agree that we could have a module like table_properties to contain all the properties, and have sth to parse properties in a type safe way like following:

pub struct TableProperties {
     #[key="commit.num.retries", default="1"]
     commitRetries: i32
}

We could track this in following up issue.

@CTTY CTTY requested review from liurenjie1024 and dentiny July 14, 2025 05:08
Copy link
Contributor

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CTTY for this pr, LGTM!

@liurenjie1024 liurenjie1024 merged commit e47b268 into apache:main Jul 14, 2025
18 checks passed
@CTTY CTTY deleted the ctty/retry-backon branch July 14, 2025 17:03
@@ -82,6 +83,7 @@ itertools = "0.13"
linkedbytes = "0.1.8"
metainfo = "0.7.14"
mimalloc = "0.1.46"
mockall = "0.13.1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add it to dev dependency? Since it's only used in tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I saw you created a PR already: #1507

Thanks for catching this!

liurenjie1024 pushed a commit that referenced this pull request Jul 15, 2025
## What changes are included in this PR?

`mockall` dependency is introduced in
#1484, which is used for unit
test on retry.
But it's only used in unit tests, but not production code; so it should
be placed under dev section, and completely disabled in production code
path.
Current implementation unnecessarily increases crate size for production
usage.

## Are these changes tested?

N/A, a no-op change, a passed compilation should be enough
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add retry in Transaction::commit method
3 participants