
Velero backup deletion is not deleting objects in kopia repository #8768

Open
mergwyn opened this issue Mar 8, 2025 · 9 comments

mergwyn commented Mar 8, 2025

What steps did you take and what happened:

I've been running Velero and Kopia for some time now, and have noticed that there seems to be an increasing number of snapshots in the Kopia repository.

If I look at the Velero backups, my oldest backup is from Feb 9th 2025; however, my oldest Kopia snapshot is from April 7th 2024!

It is strange that, whilst there are many old snapshots, recent ones appear to be cleaned up as expected.

I've looked, and it seems like Kopia maintenance is being run.

What did you expect to happen:

Deleting a Velero backup should also remove the corresponding snapshots from the Kopia repository.

The following information will help us better understand what's going on:

If you are using velero v1.7.0+:
Please use velero debug --backup <backupname> --restore <restorename> to generate the support bundle and attach it to this issue; for more options, please refer to velero debug --help

If you are using earlier versions:
Please provide the output of the following commands (Pasting long output into a GitHub gist or other pastebin is fine.)

  • kubectl logs deployment/velero -n velero
  • velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
  • velero backup logs <backupname>
  • velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
  • velero restore logs <restorename>

As this issue is not to do with individual backups, I have not included this information. Hopefully, this is more relevant:

Output of velero backup list:

velero_backup_get.txt

Output of command: kopia repo status

kopia_repo_status.txt

Output of command: kopia maintenance info --json

kopia_maintenance_info_--json.txt

Output of command: kopia snapshot list --all

kopia_snapshot_list_--all.txt

Output of command: kopia content stats

kopia_content_stats.txt

Output of command: kopia blob stats

kopia_blob_stats.txt

Output of command: kopia index list --json

kopia_index_list_--json.txt

Output of command: kopia content list --deleted-only

kopia_content_list_--deleted-only.txt

Output of command: kopia index epoch list

kopia_index_epoch_list.txt

Anything else you would like to add:

Environment:

  • Velero version (use velero version):
Client:
	Version: v1.15.0
	Git commit: 1d4f1475975b5107ec35f4d19ff17f7d1fcb3edf
Server:
	Version: v1.15.2
  • Velero features (use velero client config get features):
    features: <NOT SET>
  • Kubernetes version (use kubectl version):
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.0", GitCommit:"b46a3f887ca979b1a5d14fd39cb1af43e7e5d12d", GitTreeState:"clean", BuildDate:"2022-12-08T19:58:30Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"32", GitVersion:"v1.32.1+k3s1", GitCommit:"6a322f122729e0e668ca67fd9f0e993541bdce49", GitTreeState:"clean", BuildDate:"2025-01-28T18:27:08Z", GoVersion:"go1.23.4", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.26) and server (1.32) exceeds the supported minor version skew of +/-1
  • Kubernetes installer & version:
k3s version v1.32.1+k3s1 (6a322f12)
go version go1.23.4
  • Cloud provider or hardware configuration:
    I use a combination of local MinIO and IDrive e2 S3
minio version RELEASE.2024-08-17T01-24-54Z (commit-id=72cff79c8a7cc59bccb591995e3c3ed6aa2f4cd5)
Runtime: go1.22.6 linux/amd64
License: GNU AGPLv3 - https://www.gnu.org/licenses/agpl-3.0.html
Copyright: 2015-2024 MinIO, Inc.

Vote on this issue!

This is an invitation to the Velero community to vote on issues; you can see the project's top-voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"
Lyndon-Li self-assigned this Mar 10, 2025

Lyndon-Li commented Mar 10, 2025

For the old snapshots in 2024, could you confirm the questions below:

  • Were they created by the current Velero installation? (An upgrade is fine, but an uninstall-reinstall is relevant.)
  • If not, when you uninstalled and reinstalled Velero, did you reuse an existing backup repository (e.g., you used the same bucket location and didn't delete objects under the kopia prefix in that location)?
  • Have you manually deleted any podVolumeBackup CR?

Please also collect the list of podVolumeBackup CRs existing in the cluster.
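
For reference, something along these lines should dump the PVBs together with the repo snapshot each one tracks (status.phase and status.snapshotID as PVB status fields are assumptions based on the podVolumeBackup CRD; adjust to your version):

  # List PVBs with phase and the repo snapshot ID each one references
  kubectl -n velero get podvolumebackups \
    -o custom-columns='NAME:.metadata.name,PHASE:.status.phase,SNAPSHOT:.status.snapshotID'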


mergwyn commented Mar 10, 2025

Hi, many thanks for the fast response.

  1. I've only upgraded the host command, but I have reinstalled Velero into the cluster (I changed the namespace)
  2. I don't believe that I cleaned the repository, as I wanted to keep the backup history
  3. No, I have not deleted any podVolumeBackup

Output of command: kubectl -n velero get podVolumeBackup

kubectl_-n_velero_get_podVolumeBackup.txt

Lyndon-Li commented:

I've only upgraded the host command

What do you mean by "host command"?

Lyndon-Li commented:

I have reinstalled velero into the cluster (I changed the namespace)

When did you reinstall Velero? Do all the leaked repo snapshots belong to PVBs of the previous namespace?
If so, when were the backups for those PVBs deleted? In the current namespace or previous namespace?
And have you ever seen the PVBs from the previous namespace successfully synced to the new namespace?
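
To make the last question actionable, a listing like the one below groups the PVBs by the backup they belong to (velero.io/backup-name as the label Velero sets on PVBs is an assumption; check the labels on an actual PVB first):

  # Show each PVB with its owning backup and creation time
  kubectl -n velero get podvolumebackups \
    --sort-by=.metadata.creationTimestamp \
    -o custom-columns='NAME:.metadata.name,BACKUP:.metadata.labels.velero\.io/backup-name,CREATED:.metadata.creationTimestamp'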


mergwyn commented Mar 11, 2025

I changed the namespace to ‘backup’ on Oct 2 2024 and changed it back on Jan 14 2025.

There are two backups in velero currently from Jan 1.

velero-monthly-minio-20250101000058    Completed         0        0          2025-01-01 00:26:27 +0000 GMT   21d       default            <none>
velero-monthly-idrive-20250101000058   Completed         0        0          2025-01-01 00:21:55 +0000 GMT   21d       idrive             <none>

So some of the backups for the previous namespace were synced to the new namespace.

All of the ‘orphan’ kopia snapshots belong to the earlier installs, either with the ‘velero’ or ‘backup’ namespace.

Unfortunately, I can’t remember whether the Velero backups from those earlier installs were deleted manually or by Velero.

By ‘host command’, I just meant the velero CLI that I run on the host to talk to the k8s instance.


Lyndon-Li commented Mar 11, 2025

Right now, it is hard to backtrack the PVBs and repo snapshots.
Switching Velero installation namespaces may make the situation complex and lead to this problem, since it is possible that some PVBs were out of sync at the time of backup deletion, and repo snapshot deletion relies on the existence of the PVBs.
However, from the current info, I don't see any clue as to how the problem happened.

For now, I suggest you monitor the environment for some more time WITHOUT changing the installation namespace, to see if the problem still reproduces.
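
While monitoring, a rough cross-check along these lines can list repo snapshots that no PVB references any more, i.e., the orphan candidates (it assumes the kopia CLI is connected to the same repository and that a PVB's status.snapshotID holds the kopia manifest ID; verify both against your setup):

  # Snapshot IDs Velero still tracks via PVBs
  kubectl -n velero get podvolumebackups \
    -o jsonpath='{range .items[*]}{.status.snapshotID}{"\n"}{end}' | sort -u > pvb-ids.txt
  # Snapshot manifest IDs present in the kopia repository
  kopia snapshot list --all --json | jq -r '.[].id' | sort -u > repo-ids.txt
  # IDs in the repo but not referenced by any PVB
  comm -13 pvb-ids.txt repo-ids.txt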

Lyndon-Li commented:

In the future, we plan an enhancement that removes the reliance on the existence of PVBs for repo snapshot manipulations (see #8763), which may help avoid this kind of problem.


mergwyn commented Mar 11, 2025

Thanks.

Is it safe to use kopia to delete the orphan snapshots, or do I need to clear down the repo? I do have a set of cloud backups in IDrive that does not seem to have this issue, so losing the local MinIO repo would not be the end of the world.


Lyndon-Li commented Mar 12, 2025

Is it safe to use kopia to delete the orphan snapshots or do I need to clear down the repo?

Yes, you can delete the orphan repo snapshots through the kopia command. Just be reminded that the corresponding backup data will be lost once you delete the snapshots.
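
For anyone following along, a minimal sketch of the manual cleanup (flag names per the kopia CLI at the time of writing; verify with kopia snapshot delete --help before running):

  # Delete one orphan snapshot by manifest ID; kopia requires --delete to confirm
  kopia snapshot delete <manifest-id> --delete
  # Reclaim the space afterwards with a full maintenance run
  kopia maintenance run --full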
