kvaps commented Dec 11, 2025

This is an alternative proposed solution to the issue described in #9447

Problem

Tasks in Prepared phase were not included in periodic enqueue (only Accepted phase was included). This meant they only received reconcile calls through watch events, which could cause them to get stuck for long periods when waiting for available slots.

Solution

This change adds the Prepared phase to the periodic enqueue predicate for both the DataUpload and DataDownload controllers, so tasks in that phase get regular reconcile calls (every minute via preparingMonitorFrequency). Each reconcile call lets a Prepared task check for available slots and proceed when one becomes available.

Changes

  • Added Prepared phase check to periodic enqueue predicate in DataUploadReconciler
  • Added Prepared phase check to periodic enqueue predicate in DataDownloadReconciler
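
For illustration only, the predicate change can be sketched as below. This is a minimal, self-contained model and not the actual Velero code: the `Phase` type, its constants, and `periodicEnqueuePredicate` are hypothetical stand-ins, and the one-minute value simply mirrors the interval named in the description.

```go
package main

import (
	"fmt"
	"time"
)

// Phase is a hypothetical stand-in for the DataUpload/DataDownload phase type.
type Phase string

const (
	PhaseAccepted   Phase = "Accepted"
	PhasePrepared   Phase = "Prepared"
	PhaseInProgress Phase = "InProgress"
)

// preparingMonitorFrequency mirrors the one-minute interval named in the
// PR description; the real constant lives in the controller package.
const preparingMonitorFrequency = time.Minute

// periodicEnqueuePredicate reports whether a task in phase p should be
// picked up by the periodic enqueue source (hypothetical helper).
func periodicEnqueuePredicate(p Phase) bool {
	// Before the change only PhaseAccepted matched; the change adds PhasePrepared.
	return p == PhaseAccepted || p == PhasePrepared
}

func main() {
	for _, p := range []Phase{PhaseAccepted, PhasePrepared, PhaseInProgress} {
		fmt.Printf("%s: enqueued every %s? %t\n",
			p, preparingMonitorFrequency, periodicEnqueuePredicate(p))
	}
}
```

The sketch keeps the phase check in one small function so the before/after difference is a single condition; how the real reconcilers wire the predicate into their periodic source is not shown here.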

Alternative Approach

This is a simpler alternative to the slot reservation mechanism proposed in PR #9447. While PR #9447 focuses on optimizing the reconcile logic itself, this PR ensures tasks get regular reconcile calls in the first place.

Signed-off-by: Andrei Kvapil <[email protected]>
@Lyndon-Li
Contributor

Not periodically enqueuing Prepared tasks is the design expectation. Prepared tasks should still be handled appropriately, because they are enqueued on the following occasions:

  1. When a task transitions to the Prepared state
  2. When the Prepare processing of the reconciler is blocked and wants to retry

That is to say, the problem you mentioned is not a problem in the general case; it is more likely a bug in an edge case.
Therefore, we need to find the root cause before making any code change.

Please open an issue with the log from when the problem happened (all tasks stuck in the Prepared phase); from the log we may be able to find the root cause.

@kvaps
Author

kvaps commented Dec 12, 2025

Sure, done #9453

blackpiglet changed the title from "Include Prepared phase tasks in periodic enqueue to prevent stalling" to "[AI-generated]Include Prepared phase tasks in periodic enqueue to prevent stalling" on Dec 15, 2025