In OpenShift env (installed in azure), velero is not able to backup PVCs those were created using file.csi.azure.com provisioner while it works for PVCs those were created using disk.csi.azure.com provisioner #8468

Open
adityagu0910 opened this issue Nov 29, 2024 · 17 comments
Assignees
Labels
Area/Cloud/Azure Area/CSI Related to Container Storage Interface support env/openshift

Comments

@adityagu0910

adityagu0910 commented Nov 29, 2024

What steps did you take and what happened:

A regular daily backup of the whole OpenShift cluster is running, but it cannot back up PVCs that were created using the file.csi.azure.com provisioner, while it works for PVCs that were created using the disk.csi.azure.com provisioner.
We see the warnings below in the output of the backup describe command.

Warnings:
  Velero:
  Cluster: resource: /persistentvolumes name: /pvc-1111 message: /No volume ID returned by volume snapshotter for persistent volume
           resource: /persistentvolumes name: /pvc-2222 message: /No volume ID returned by volume snapshotter for persistent volume
  Namespaces:

What did you expect to happen:
It should back up all the PVCs in the cluster, including those created using the file.csi.azure.com provisioner.

The following information will help us better understand what's going on:

Please see the attached output of velero backup describe (without details):
daily-backup-schedule-20241129180420.txt

Anything else you would like to add:

Velero was installed using the OpenShift OADP operator.
The version is shown below:

Client:
Version: v1.14.1-OADP
Git commit: -
Server:
Version: v1.14.1-OADP

@Lyndon-Li
Contributor

No volume ID returned by volume snapshotter for persistent volume

It looks like CSI snapshot is not enabled and Velero is using the native snapshotter. Follow this doc to enable CSI snapshots: https://velero.io/docs/v1.15/csi/.
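
For readers following along, here is a minimal sketch of the cluster-side piece that doc describes: a VolumeSnapshotClass for the Azure Files driver carrying the label Velero's CSI support uses to pick a snapshot class. The class name is a placeholder, and `deletionPolicy: Retain` is an assumption on my part (a commonly used setting so the snapshot survives deletion of the VolumeSnapshot object). With OADP, the `csi` plugin would presumably also need to be enabled in the DataProtectionApplication, which is not shown here.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  # Placeholder name, not taken from this thread.
  name: azurefile-csi-snapclass
  labels:
    # Label Velero uses to select the class for a given CSI driver.
    velero.io/csi-volumesnapshot-class: "true"
# Must match the provisioner of the PVCs being backed up.
driver: file.csi.azure.com
# Assumption: keep the storage snapshot if the VolumeSnapshot is deleted.
deletionPolicy: Retain
```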

@Lyndon-Li
Contributor

Lyndon-Li commented Dec 2, 2024

Moreover, the Azure File CSI driver doesn't support snapshot restore, so I am afraid you cannot restore the data even if the backup succeeds.
fs-backup is an alternative, but note that fs-backup does not produce a consistent backup, so it may not be suitable if you have strict data-consistency requirements.
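
As a rough illustration of the fs-backup route, the sketch below opts every pod volume in a backup into file-system backup through the Backup spec. The backup name and the `my-app` namespace are hypothetical; `openshift-adp` is the namespace used elsewhere in this thread. It assumes the node agent is already enabled in the Velero/OADP installation.

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: fs-backup-example        # hypothetical name
  namespace: openshift-adp       # namespace where Velero runs in this thread
spec:
  includedNamespaces:
    - my-app                     # hypothetical application namespace
  # Back up pod volumes with fs-backup instead of volume snapshots.
  defaultVolumesToFsBackup: true
```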

@adityagu0910 adityagu0910 changed the title In OpenShift env (installed in azure), velero is not able to backup PVCs those were created using file.csi.azure.com provisioner while it works for PVCs those were created using blob.csi.azure.com provisioner In OpenShift env (installed in azure), velero is not able to backup PVCs those were created using file.csi.azure.com provisioner while it works for PVCs those were created using disk.csi.azure.com provisioner Dec 2, 2024
@adityagu0910
Author

@Lyndon-Li Thanks for your reply.
I see that native snapshot works for Azure Disk but not for Azure Files. Could this be an issue with my Azure file share configuration, or does Velero support native snapshots in Azure for Azure disks only?

@kaovilai
Member

kaovilai commented Dec 2, 2024

I would like to note that as of OpenShift 4.16, the Azure File CSI driver does not support snapshotting.

The tech preview snapshot functionality appears in 4.17+.

@Lyndon-Li Lyndon-Li assigned Lyndon-Li and kaovilai and unassigned Lyndon-Li Dec 2, 2024
@kaovilai
Member

kaovilai commented Dec 2, 2024

@Lyndon-Li FYI, I saw in the Azure File CSI snapshot example that restore is now supported in v1.30.2.

@Lyndon-Li
Contributor

does Velero support native snapshots in Azure for Azure disks only?

Yes, Velero native snapshot doesn't support Azure Files.

@adityagu0910
Author

No volume ID returned by volume snapshotter for persistent volume

It looks like CSI snapshot is not enabled and Velero is using the native snapshotter. Follow this doc to enable CSI snapshots: https://velero.io/docs/v1.15/csi/.

With CSI snapshots, the snapshots are created in the same Azure resource group where our OpenShift env is running. Is it possible to store snapshots in another Azure resource group, the one where we store our backups (in the backupStorageLocation container)?

@kaovilai
Member

kaovilai commented Dec 4, 2024

Snapshots belong to disks. Disks belong to a cluster.
It is generally expected that snapshots reside where the disks they came from reside. I imagine Velero itself would not allow that configurability when triggering snapshots via the CSI interface, which is not cloud specific.

@kaovilai
Member

kaovilai commented Dec 4, 2024

You should be able to move them if your cloud provider allows it. You'll have to move them back to where the CSI driver expects them if you want Velero to be able to trigger and create the disk where the env lives.

@adityagu0910
Author

Snapshots belong to disks. Disks belong to a cluster. It is generally expected that snapshots reside where the disks they came from reside. I imagine Velero itself would not allow that configurability when triggering snapshots via the CSI interface, which is not cloud specific.

@kaovilai I think it could be useful if Velero provided an option to configure the snapshot location (similar to how backup locations are configured) for the CSI interface. Ideally, Velero should also support restoring snapshots directly from this configured location.

In many setups, backups are stored in a dedicated backup resource group shared across multiple applications, including OpenShift. Enabling such functionality would align with the common practice of centralizing backups and recovery for multiple applications.

@kaovilai
Member

kaovilai commented Dec 4, 2024

@adityagu0910 I don't think the CSI spec itself has defined a generic way to move snapshots across regions. I would raise that with the CSI spec repo.

@sseago
Collaborator

sseago commented Dec 4, 2024

Also note that if you want to store the PVC data in the backup storage location, you can also do that by using datamover (setting --snapshot-move-data on the backup). When this is used, after creating the CSI snapshot, velero copies the data to the backup storage location using kopia and deletes the local CSI snapshot.
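
A minimal sketch of the same setting expressed on a Backup resource rather than the CLI flag, which may be more convenient when backups are created as custom resources under OADP. The backup name and the `my-app` namespace are hypothetical; it assumes CSI snapshot support and the node agent are already enabled.

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: datamover-example        # hypothetical name
  namespace: openshift-adp       # namespace where Velero runs in this thread
spec:
  includedNamespaces:
    - my-app                     # hypothetical application namespace
  # Equivalent of --snapshot-move-data: after the CSI snapshot is taken,
  # its data is copied to the backup storage location.
  snapshotMoveData: true
```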

@adityagu0910
Author

@sseago Yes, I am trying to use the OADP Data Mover (Velero built-in data mover) with CSI snapshots and am facing an issue with moving Azure file share snapshots.
The data mover pod is not able to mount the volume created by the data mover for the snapshot transfer. This happens only for Azure file share PVCs, not for Azure Disk PVCs.

I am getting the error below about a secret not being found. I believe this secret should be created automatically, as it is when we dynamically provision a PVC using the Azure file share storage class in any namespace.

Warning FailedMount 111s (x9 over 3m59s) kubelet MountVolume.MountDevice failed for volume "pvc-test1" : rpc error: code = InvalidArgument desc = GetAccountInfo(OCPRESOURCEGROUPNAME#FILESHARESTORAGEACCOUNT#pvc-test1###openshift-adp) failed with error: could not get secret(azure-storage-account-FILESHARESTORAGEACCOUNT-secret): secrets "azure-storage-account-FILESHARESTORAGEACCOUNT-secret" not found

@adityagu0910
Author

After creating the secret (azure-storage-account-FILESHARESTORAGEACCOUNT-secret) manually in the openshift-adp namespace, backup/restore works fine for both CSI snapshot types (Azure File and Azure Disk).

It could be a bug in the CSI Data Mover that it is not able to use the secret from the Azure File CSI controller.

@kaovilai
Member

kaovilai commented Dec 6, 2024

After creating the secret (azure-storage-account-FILESHARESTORAGEACCOUNT-secret) manually

Do you have the doc where you found how to do that?

@kaovilai kaovilai added env/openshift Area/CSI Related to Container Storage Interface support labels Dec 6, 2024
@draghuram
Contributor

draghuram commented Dec 8, 2024

I would like to add a few details regarding Azure Files PVC backups as an additional data point.

The Azure Files CSI driver has supported CSI snapshots for a long time, but until recently it was not possible to create a PVC from such a snapshot. So CloudCasa implemented custom code to integrate with Azure portal and automate backups. See https://docs.cloudcasa.io/help/reference-pv-types.html#ref-pv-types-azure-file for some info in this regard.

And as @kaovilai pointed out, restore-from-snapshot functionality has been added in recent versions of the Azure Files CSI driver. So we now allow our users to select a backup method (either CSI driver or CloudCasa).

https://docs.cloudcasa.io/help/relnotes-10-2024.html#selection-of-backup-method-for-azure-files-pvcs

However, we did see a few issues with the CSI driver's implementation of restore, so a few users switched back to the CloudCasa method after trying the CSI driver. Of course, the issues may be fixed in the future, but I wanted to mention our experience in this regard.

Please feel free to reach out to me if you need more details.

@adityagu0910
Author

After creating the secret (azure-storage-account-FILESHARESTORAGEACCOUNT-secret) manually

Do you have the doc where you found how to do that?

I copied the secret from the kube-system namespace (where all our CSI drivers are installed) to the openshift-adp namespace.

kubectl get secret azure-storage-account-<FILESHARESTORAGEACCOUNTNAME>-secret --namespace kube-system -o yaml | kubectl apply --namespace openshift-adp -f -
I have asked the OpenShift team to check why this secret is not created automatically, as it is whenever I dynamically provision a PVC in any namespace from the Azure file storage class.
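
For anyone who prefers a declarative form of this workaround, below is a rough sketch of the secret as a manifest in the openshift-adp namespace. The two key names are the ones the upstream Azure File CSI driver documents for storage account secrets; treat them as an assumption and compare against the secret in kube-system on your cluster. The values in angle brackets are placeholders. Note also that YAML exported with `kubectl get -o yaml` carries `metadata.namespace: kube-system` and server-set fields such as `resourceVersion` and `uid`, which may need to be stripped before applying it to a different namespace.

```yaml
apiVersion: v1
kind: Secret
metadata:
  # Name the driver looks for, per the FailedMount error above.
  name: azure-storage-account-<FILESHARESTORAGEACCOUNTNAME>-secret
  # Namespace where the data mover pod runs.
  namespace: openshift-adp
type: Opaque
stringData:
  # Assumed key names; verify against the existing secret in kube-system.
  azurestorageaccountname: <FILESHARESTORAGEACCOUNTNAME>
  azurestorageaccountkey: <STORAGE_ACCOUNT_KEY>
```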
