
Only DU/DD accepted by the restarted node should be cancelled #8534

Closed
@Lyndon-Li

Description


At present, when a node-agent pod restarts, all DU/DD in the Accepted phase are cancelled.
We may be able to improve this, because as of #8498 the accepting node is recorded in the DUCR/DDCR. Details:

  1. If the node-agent pod restarts because of node-agent itself, only the DU/DD accepted by the restarted node-agent pod need to be cancelled

  2. If the node-agent pod restarts because the node restarted, some other DU/DD in the Accepted phase may have created a backupPod/restorePod on the restarted node. Once the node restarts, we need to investigate what happens to those pods:

     - If they fail and never recover, we need to cancel those DU/DD
     - If they fail and then recover, we should leave the DU/DD alone and not cancel them
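
To make the two cases concrete, here is a rough sketch of the decision in Go. This is not the actual controller code: the `Transfer` type, its field names (`AcceptedNode`, `PodNode`, `PodRecovered`), and the `shouldCancel` helper are all hypothetical stand-ins for whatever the DUCR/DDCR and the pod status really expose. It also assumes we can tell a node restart apart from a plain node-agent restart, and that a DU/DD accepted by the restarted node is cancelled in both cases.

```go
package main

import "fmt"

// Phase is a simplified stand-in for the DU/DD phase; only Accepted matters here.
type Phase string

const (
	PhaseAccepted   Phase = "Accepted"
	PhaseInProgress Phase = "InProgress"
)

// Transfer is a hypothetical, trimmed-down view of a DU or DD.
// None of these field names are taken from the real CRDs.
type Transfer struct {
	Name         string
	Phase        Phase
	AcceptedNode string // node whose node-agent accepted the DU/DD (assumed field)
	PodNode      string // node hosting the backupPod/restorePod, "" if none yet
	PodRecovered bool   // whether that pod came back after the node restart
}

// shouldCancel reports whether a DU/DD should be cancelled after the
// node-agent pod on restartedNode comes back. nodeRestarted is true when
// the whole node restarted (case 2) rather than only node-agent (case 1).
func shouldCancel(t Transfer, restartedNode string, nodeRestarted bool) bool {
	// Only DU/DD still in the Accepted phase are candidates.
	if t.Phase != PhaseAccepted {
		return false
	}
	// Case 1: accepted by the restarted node-agent, so cancel.
	if t.AcceptedNode == restartedNode {
		return true
	}
	// Case 2: accepted elsewhere, but its pod was placed on the restarted node.
	// Cancel only if the pod failed and never recovered.
	if nodeRestarted && t.PodNode == restartedNode {
		return !t.PodRecovered
	}
	return false
}

func main() {
	transfers := []Transfer{
		{Name: "du-1", Phase: PhaseAccepted, AcceptedNode: "node-a"},
		{Name: "du-2", Phase: PhaseAccepted, AcceptedNode: "node-b", PodNode: "node-a", PodRecovered: true},
		{Name: "dd-1", Phase: PhaseAccepted, AcceptedNode: "node-b", PodNode: "node-a", PodRecovered: false},
		{Name: "dd-2", Phase: PhaseInProgress, AcceptedNode: "node-a"},
	}
	for _, t := range transfers {
		fmt.Printf("%s cancel=%v\n", t.Name, shouldCancel(t, "node-a", true))
	}
}
```

With `node-a` as the restarted node, only `du-1` (accepted by `node-a`) and `dd-1` (pod on `node-a` that never recovered) are cancelled. How to detect "never recovers" in practice (a timeout, pod status, etc.) is exactly the part that still needs investigation.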
    
