Description
Environment: Ubuntu 22.04.5 and HPE CSI 2.52.
We are running a patching process that cordons and drains each node, patches it, reboots it, then uncordons it.
On exactly one host in the cluster (which one appears to be random), we see the following error:
MountVolume.MountDevice failed for volume "pvc-fd5b1ac4-aff8-445b-b36b-38a288793c63" : rpc error: code = Internal desc = Failed to stage volume pvc-fd5b1ac4-aff8-445b-b36b-38a288793c63, err: rpc error: code = Internal desc = Error creating device for volume pvc-fd5b1ac4-aff8-445b-b36b-38a288793c63, err: device not found with serial 60002ac000000000000002bc000274c7 or target
We see the VLUNs from the previous host run being removed, but VLUN registration for the new host simply does not happen.
Moving the broken pods to a different host fixes the issue, as the VLUNs are created for that host.
Moving the pods back to the first host then works correctly (the VLUNs are removed/recreated as expected).
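
For anyone trying to reproduce this, here is a minimal sketch of how one might check on the affected node whether any block device for the serial in the error is visible at all. It assumes that the udev link names under /dev/disk/by-id embed the serial (for example as scsi-3<serial> or wwn-0x<serial>), which we have not confirmed for this array. The function name find_devices_by_serial is hypothetical and not part of HPE CSI.

```python
import os


def find_devices_by_serial(serial, by_id_dir="/dev/disk/by-id"):
    """Return the sorted link names in by_id_dir that contain the serial.

    Assumption: udev by-id link names embed the volume serial. Returns an
    empty list if the directory does not exist or nothing matches.
    """
    serial = serial.lower()
    try:
        names = os.listdir(by_id_dir)
    except FileNotFoundError:
        return []
    return sorted(name for name in names if serial in name.lower())


if __name__ == "__main__":
    # Serial taken from the error message above.
    serial = "60002ac000000000000002bc000274c7"
    matches = find_devices_by_serial(serial)
    if matches:
        for name in matches:
            print("found:", name)
    else:
        print("no device link contains serial", serial)
```

An empty result on the failing node alongside a non-empty result on a working node would be consistent with the VLUN never having been exported to the failing host, though it would not prove it.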