Description
What happened:
A pv gets stuck in Terminating because of a race condition: csi-attacher removes its finalizer while csi-provisioner tries to remove another finalizer.
- symptom
pv is stuck in Terminating when the pvc is deleted with the HonorPVReclaimPolicy feature gate enabled.
- process
csi-attacher removes its finalizer (e.g. external-attacher/disk-csi-azure-com) when the pv is detached. Later, when the pvc is deleted, csi-provisioner tries to remove the external-provisioner.volume.kubernetes.io/finalizer finalizer. Because the provisioner works from the pv object in its cache, which is stale after the attacher's update, the finalizer removal fails on every attempt until the maximum of 6 retries is exceeded, and the pv is left in Terminating state forever (the underlying storage is already deleted before the finalizer removal fails).
csi-attacher-disk E0510 10:18:09.499513 1 csi_handler.go:701] Failed to remove finalizer from PV "pvc-b1c64ae1-6310-4a6c-aa44-12c80c9981a0": PersistentVolume "pvc-b1c64ae1-6310-4a6c-aa44-12c80c9981a0" is invalid: metadata.finalizers: Forbidden: no new finalizers can be added if the object is being deleted, found new finalizers []string{"kubernetes.io/pv-protection"}
csi-attacher-disk I0510 10:18:09.510077 1 csi_handler.go:706] Removed finalizer from PV "pvc-b1c64ae1-6310-4a6c-aa44-12c80c9981a0"
csi-provisioner-disk I0510 10:18:09.466810 1 controller.go:1517] delete "pvc-b1c64ae1-6310-4a6c-aa44-12c80c9981a0": volume deleted
csi-azuredisk-controller I0510 10:18:09.466386 1 azure_managedDiskController.go:325] azureDisk - deleted a managed disk: /subscriptions/xxx/resourceGroups/icto-1019_npi-lcm-cn-lcm-npi-cluster-01-nodes/providers/Microsoft.Compute/disks/pvc-b1c64ae1-6310-4a6c-aa44-12c80c9981a0
csi-provisioner-disk I0510 10:18:09.489676 1 controller.go:1554] delete "pvc-b1c64ae1-6310-4a6c-aa44-12c80c9981a0": failed to remove finalizer for persistentvolume: Operation cannot be fulfilled on persistentvolumes "pvc-b1c64ae1-6310-4a6c-aa44-12c80c9981a0": the object has been modified; please apply your changes to the latest version and try again
csi-provisioner-disk W0510 10:18:09.489714 1 controller.go:989] Retrying syncing volume "pvc-b1c64ae1-6310-4a6c-aa44-12c80c9981a0", failure 6
csi-provisioner-disk E0510 10:18:09.489752 1 controller.go:1007] error syncing volume "pvc-b1c64ae1-6310-4a6c-aa44-12c80c9981a0": Operation cannot be fulfilled on persistentvolumes "pvc-b1c64ae1-6310-4a6c-aa44-12c80c9981a0": the object has been modified; please apply your changes to the latest version and try again
- workaround
remove all finalizers from the pv and then delete pv manually
kubectl patch pv NAME -p '{"metadata":{"finalizers":null}}'
/kind bug
cc @jsafrane
What you expected to happen:
How to reproduce it:
Anything else we need to know?:
Environment:
- Driver version: v4.0.0
- Kubernetes version (use kubectl version): 1.27
- OS (e.g. from /etc/os-release):
- Kernel (e.g. uname -a):
- Install tools:
- Others: