
Support more reliable async task retry to guarantee eventual execution (1/2) – Metastore Layer #1523


Open. Wants to merge 2 commits into main.

Conversation

@danielhumanmod (Contributor) commented May 4, 2025

Fix #774

Context

Polaris uses async tasks to perform operations such as table and manifest file cleanup. These tasks are executed asynchronously in a separate thread within the same JVM, and retries are handled inline within the task execution. However, this mechanism does not guarantee eventual execution in the following cases:

  • The task fails repeatedly and hits the maximum retry limit.
  • The service crashes or shuts down before retrying.
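
To make the failure mode concrete, here is a minimal illustrative sketch of an inline, in-memory retry loop of the kind described above; the class name and MAX_RETRIES limit are assumptions, not the actual Polaris task executor:

```java
import java.util.concurrent.CompletableFuture;

// Illustration only: this is not the actual Polaris task executor, and MAX_RETRIES
// is a made-up limit. It shows why in-memory, inline retries cannot guarantee
// eventual execution: once the retries are exhausted or the JVM dies, the task is lost.
public class InlineRetrySketch {
  private static final int MAX_RETRIES = 3;

  public void submit(Runnable task) {
    CompletableFuture.runAsync(() -> {
      for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
        try {
          task.run();
          return; // success
        } catch (RuntimeException e) {
          // Retry state lives only in this thread; a crash here loses the task.
        }
      }
      // All retries exhausted: nothing is persisted, so the task is never re-run.
    });
  }
}
```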

Implementation Plan

Stage 1 (this PR):
Introduce per-task transactional leasing in the metastore layer via loadTasks(...). This enables fine-grained compensation by allowing tasks to be leased and updated one at a time, avoiding the all-or-nothing semantics of bulk operations (which is also mentioned in an existing TODO). This is important for retry scenarios, where we want to isolate failures and ensure that tasks are retried independently without affecting each other.
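
A minimal sketch of what per-task leasing could look like, assuming a metastore that can lease a single task in its own transaction; TaskRecord, listCandidateTaskIds, and leaseTaskInOwnTransaction are hypothetical names, not the actual Polaris API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Sketch only: TaskRecord, listCandidateTaskIds, and leaseTaskInOwnTransaction are
// hypothetical names, not the actual Polaris metastore API.
public class PerTaskLeasingSketch {

  record TaskRecord(long id, String payload) {}

  interface Metastore {
    // Candidate tasks visible to this executor, up to `limit`.
    List<Long> listCandidateTaskIds(int limit);

    // Conditional update performed in its own transaction: returns the task only
    // if the lease was acquired (i.e. no other executor currently holds it).
    Optional<TaskRecord> leaseTaskInOwnTransaction(long taskId, String executorId);
  }

  public List<TaskRecord> loadTasks(Metastore ms, String executorId, int limit) {
    List<TaskRecord> leased = new ArrayList<>();
    for (long taskId : ms.listCandidateTaskIds(limit)) {
      try {
        // Each task is leased in its own transaction, so a conflict or failure on
        // one task does not roll back or block the others.
        ms.leaseTaskInOwnTransaction(taskId, executorId).ifPresent(leased::add);
      } catch (RuntimeException e) {
        // Isolated failure: skip this task; it stays available for a later attempt.
      }
    }
    return leased;
  }
}
```

Because each lease is its own transaction, a conflict on one task leaves the rest of the batch untouched, which is the fine-grained behavior this stage is after.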

Stage 2 (#1585):
Persist failed tasks and introduce a retry mechanism triggered during Polaris startup and via periodic background checks.
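
A rough sketch of how the Stage 2 recovery could be wired, assuming a hypothetical requeue routine that re-queues persisted tasks whose lease has expired; none of these names or intervals come from the follow-up PR:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch only: the recovery routine and the 5-minute interval are assumptions,
// not taken from the follow-up PR (#1585).
public class TaskRecoverySketch {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  public void start(Runnable requeueExpiredTasks) {
    // Run once at service startup to pick up tasks orphaned by a crash...
    requeueExpiredTasks.run();
    // ...then periodically, to catch tasks whose lease expired after repeated failures.
    scheduler.scheduleAtFixedRate(requeueExpiredTasks, 5, 5, TimeUnit.MINUTES);
  }
}
```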

@collado-mike (Contributor):

Introduce per-task transactional leasing in the metastore layer via loadTasks(...). This enables fine-grained compensation by allowing tasks to be leased and updated one at a time, avoiding the all-or-nothing semantics of bulk operations (which is also mentioned in an existing TODO). This is important for retry scenarios, where we want to isolate failures and ensure that tasks are retried independently without affecting each other.

I don't understand how this PR enables isolation of task failures. This PR only reads the tasks from the metastore one at a time, so the only failure would be in loading the task. In a transactional database, the UPDATE ... WHERE statement would only update the task state when the task is not currently leased by another client, so I don't see how one or a few tasks would fail to be leased while the others succeed.

The PR description sounds like it intends to tackle task execution failure - is that right? If so, loading the tasks from the database isn't going to solve that problem.
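
For reference, a minimal illustration of the conditional UPDATE ... WHERE lease described in the comment above, assuming a JDBC-backed metastore; the tasks table, its columns, and the lease duration are hypothetical:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch only: the `tasks` table, its columns, and the 1-minute lease are assumptions.
public class ConditionalLeaseSketch {

  // Returns true only if this executor won the lease. If another executor holds an
  // unexpired lease, the WHERE clause matches zero rows and the update is a no-op.
  public boolean tryLease(Connection conn, long taskId, String executorId, long nowMillis)
      throws SQLException {
    String sql =
        "UPDATE tasks SET executor_id = ?, lease_expires_at = ? "
            + "WHERE id = ? AND (executor_id IS NULL OR lease_expires_at < ?)";
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
      ps.setString(1, executorId);
      ps.setLong(2, nowMillis + 60_000L); // hypothetical 1-minute lease window
      ps.setLong(3, taskId);
      ps.setLong(4, nowMillis);
      return ps.executeUpdate() == 1;
    }
  }
}
```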

@eric-maynard (Contributor):

The PR description sounds like it intends to tackle task execution failure - is that right? If so, loading the tasks from the database isn't going to solve that problem.

I think it could, just very lazily, right @collado-mike? The next time the service restarts, we could retry any orphaned tasks.

@danielhumanmod (Contributor, Author) commented May 8, 2025

Introduce per-task transactional leasing in the metastore layer via loadTasks(...). This enables fine-grained compensation by allowing tasks to be leased and updated one at a time, avoiding the all-or-nothing semantics of bulk operations (which is also mentioned in an existing TODO). This is important for retry scenarios, where we want to isolate failures and ensure that tasks are retried independently without affecting each other.

I don't understand how this PR enables isolation of task failures. This PR only reads the tasks from the metastore one at a time, so the only failure would be in loading the task. In a transactional database, the UPDATE ... WHERE statement would only update the task state when the task is not currently leased by another client, so I don't see how one or a few tasks would fail to be leased while the others succeed.

The PR description sounds like it intends to tackle task execution failure - is that right? If so, loading the tasks from the database isn't going to solve that problem.

Sorry for the confusion. We actually have a second PR for this feature; I split it into two parts to make the review easier :)

This is the PR for the second phase: #1585

Regarding this PR’s changes in the metastore, the goal is to allow each task entity to be read and leased individually. This ensures that if an exception occurs while reading or leasing one task, it won’t affect the others. This improvement was also noted in the TODO comment of the previous implementation. It’s not strictly required, but more of a “nice-to-have” for isolating failures.

@adnanhemani (Collaborator) left a comment:


A couple of things here: now that pagination has been merged, this PR will need further revision to rebase properly, imo.

I'm also in agreement with @collado-mike here, but to your response, I don't agree that this should be how we solve this overall. I'm not sure I see high value in picking only one task at a time to solve the problem we have with retrying tasks. Instead, I'd advocate for leaning heavier on the definition of limit: if, after we query the relevant number of tasks using ms.listEntitiesInCurrentTxn, we find that some task has been modified between our querying it and our attempt to commit our properties to it, we should just filter it out of the resultant set and let the caller receive all the other tasks that were not impacted. If there are no tasks left at the end of this filtering, then that would be the right place to throw the exception. Sure, the function may return fewer than the "limit" amount of tasks, but I don't see a guarantee of needing that.

I know I've probably not researched this as deeply as you so WDYT?
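
For illustration, a minimal sketch of the filtering approach suggested above; the Txn interface and tryCommitLease are hypothetical stand-ins, not the actual Polaris metastore API around ms.listEntitiesInCurrentTxn:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: the Txn interface and tryCommitLease are stand-ins, not the actual
// Polaris metastore API around listEntitiesInCurrentTxn.
public class FilteringLeaseSketch {

  interface Txn {
    List<Long> listTaskIds(int limit);

    // Returns false if the task was modified (e.g. leased) between the query and
    // our attempt to commit the lease properties.
    boolean tryCommitLease(long taskId, String executorId);
  }

  public List<Long> leaseAvailableTasks(Txn txn, String executorId, int limit) {
    List<Long> leased = new ArrayList<>();
    for (long taskId : txn.listTaskIds(limit)) {
      if (txn.tryCommitLease(taskId, executorId)) {
        leased.add(taskId);
      }
      // Concurrently modified tasks are silently filtered out of the result.
    }
    if (leased.isEmpty()) {
      // Only when every candidate was taken by someone else do we surface an error;
      // the caller may also get fewer than `limit` tasks, which is acceptable.
      throw new IllegalStateException("No leasable tasks available");
    }
    return leased;
  }
}
```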

Successfully merging this pull request may close: Task handling is incomplete