Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the design of cleaning artifacts generated during CSI B/R #8676

Merged
merged 1 commit into from
Feb 14, 2025

Conversation

blackpiglet
Copy link
Contributor

@blackpiglet blackpiglet commented Feb 11, 2025

Thank you for contributing to Velero!

Please add a summary of your change

Does your change fix a particular issue?

Fixes #(issue)

This is the design to resolve issue #7979.

Please indicate you've done the following:

@github-actions github-actions bot added the Area/Design Design Documents label Feb 11, 2025
@blackpiglet blackpiglet added kind/changelog-not-required PR does not require a user changelog. Often for docs, website, or build changes and removed Area/Design Design Documents labels Feb 11, 2025
@github-actions github-actions bot added the Area/Design Design Documents label Feb 11, 2025
Copy link

codecov bot commented Feb 11, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 59.43%. Comparing base (5d9a4e8) to head (a6ae21e).
Report is 16 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8676      +/-   ##
==========================================
+ Coverage   59.39%   59.43%   +0.03%     
==========================================
  Files         370      370              
  Lines       39988    40031      +43     
==========================================
+ Hits        23752    23791      +39     
- Misses      14744    14747       +3     
- Partials     1492     1493       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

design/clean_artifacts_in_csi_flow.md Outdated Show resolved Hide resolved
design/clean_artifacts_in_csi_flow.md Outdated Show resolved Hide resolved
design/clean_artifacts_in_csi_flow.md Outdated Show resolved Hide resolved
design/clean_artifacts_in_csi_flow.md Outdated Show resolved Hide resolved
design/clean_artifacts_in_csi_flow.md Show resolved Hide resolved
@blackpiglet blackpiglet force-pushed the 7979_design branch 3 times, most recently from 0b0a727 to b72f660 Compare February 11, 2025 09:53
design/clean_artifacts_in_csi_flow.md Outdated Show resolved Hide resolved
design/clean_artifacts_in_csi_flow.md Outdated Show resolved Hide resolved
design/clean_artifacts_in_csi_flow.md Outdated Show resolved Hide resolved
design/clean_artifacts_in_csi_flow.md Show resolved Hide resolved
Comment on lines +126 to +127
if boolptr.IsSetToFalse(input.Restore.Spec.RestorePVs) {
p.log.Infof("Restore did not request for PVs to be restored %s/%s",
input.Restore.Namespace, input.Restore.Name)
return &velero.RestoreItemActionExecuteOutput{SkipRestore: true}, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about scenarios where cluster admin backup entire cluster expecting VSC to be in backup and be restored?
Will admin be able to restore VSCs with RestorePVs: false?

Copy link
Contributor Author

@blackpiglet blackpiglet Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, yes. The check will return early in the action, but the VSC will still be created in the cluster.

Second, please check this design's Non Goals first bulletin:
There were some discussions about whether Velero backup should include VSs and VSCs not generated in during the backup. By far, the conclusion is not including them is a better option. Although that is a useful enhancement, that is not included in this design.

IMO, Velero backups don't need to include the VolumeSnapshotContents not created by the running backup.
The reason is the VolumeSnapshots and VolumeSnapshotContents only take effect when they are created.

The results of VS and VSC creation are either snapshots in the storage provider or the volumes created from the snapshot specified in the VSC.

If they are included in the backup, the restore will recreate them. The recreation may not be intended.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you can create VSC without resulting in new snapshot if there's already an existing snapshot id in the cloud provider

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check will return early in the action, but the VSC will still be created in the cluster.

but the return adds SkipRestore here? Just want to confirm intended behavior. I'm ok either way.

		return &velero.RestoreItemActionExecuteOutput{SkipRestore: true}, nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you can create VSC without resulting in new snapshot if there's already an existing snapshot id in the cloud provider

The static VolumeSnapshot will not trigger dynamically creating a snapshot, but the dynamic VolumeSnapshot will create a new snapshot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the return adds SkipRestore here? Just want to confirm intended behavior. I'm ok either way.

Sorry, in this case, the actual result is skipping restore of the VS and the VSC, but that is the existing behavior. It's not changed in this design.

```

### Backup Sync
csi-volumesnapshotclasses.json, csi-volumesnapshotcontents.json, and csi-volumesnapshots.json are CSI-related metadata files in the BSL for each backup.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please ensure these files are not removed altogether from velero, we rely on these files for record keeping of snapshots taken.
Just putting this to make sure the design intent is not to remove these files and only called in doc to explain the need for them in delete/ restore flow

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Deleting the existing metadata files is considered a breaking change.
Even if this design proposes to delete them, we should follow the deprecation policy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think deprecation process is needed, b/c it's not APIs. but for v1.16 we'll keep it as is and open an issue to remove it in future release.

Copy link
Contributor Author

@blackpiglet blackpiglet Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anshulahuja98
Apart from whether the deprecation policy is relevant or not, it's OK to not delete the metadata files in v1.16.
Because we also have the Backup's VolumeInfo metadata, I think the downstream consumer, such as Azure, can use that to replace the CSI VSC and VS metadata files.
https://github.com/vmware-tanzu/velero/blob/main/design/Implemented/pv_backup_info.md

What's your opinion on making a change to adopt the backup VolumeInfo in the Azure future releases?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that seems reasonable, we'll work on migrating to use VolumeInfo instead of CSI JSONs.
We can target removal of files in 1.17 / 1.17+

@blackpiglet blackpiglet force-pushed the 7979_design branch 4 times, most recently from 7413507 to d948e76 Compare February 13, 2025 07:04
Lyndon-Li
Lyndon-Li previously approved these changes Feb 13, 2025
@blackpiglet
Copy link
Contributor Author

@kaovilai @anshulahuja98
PTAL.

@anshulahuja98
Copy link
Collaborator

Replied to the pending comment, feel free to resolve after reading.
Overall while we go for impl PR, please make sure back compat for restore / Delete Backup does not break in any way.

@kaovilai kaovilai merged commit 9235fe1 into vmware-tanzu:main Feb 14, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area/Design Design Documents kind/changelog-not-required PR does not require a user changelog. Often for docs, website, or build changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants