From 6c8ba0136ca97ba9ae1e13764b8270f34a448b3a Mon Sep 17 00:00:00 2001
From: sp98
Date: Mon, 29 Jan 2024 15:54:34 +0530
Subject: [PATCH 01/15] doc: add support for using azure kms

Users on Microsoft Azure can make use of the Azure Key Vault service
rather than relying on any third-party service for KMS.

Signed-off-by: sp98
---
 .../Advanced/key-management-system.md         | 55 +++++++++++++++----
 1 file changed, 45 insertions(+), 10 deletions(-)

diff --git a/Documentation/Storage-Configuration/Advanced/key-management-system.md b/Documentation/Storage-Configuration/Advanced/key-management-system.md
index 9877c91ce79d..a5640694623c 100644
--- a/Documentation/Storage-Configuration/Advanced/key-management-system.md
+++ b/Documentation/Storage-Configuration/Advanced/key-management-system.md
@@ -22,16 +22,18 @@ The `security` section contains settings related to encryption of the cluster.
 
 Supported KMS providers:
 
-- [Vault](#vault)
-  - [Authentication methods](#authentication-methods)
-    - [Token-based authentication](#token-based-authentication)
-    - [Kubernetes-based authentication](#kubernetes-based-authentication)
-  - [General Vault configuration](#general-vault-configuration)
-  - [TLS configuration](#tls-configuration)
-- [IBM Key Protect](#ibm-key-protect)
-  - [Configuration](#configuration)
-- [Key Management Interoperability Protocol](#key-management-interoperability-protocol)
-  - [Configuration](#configuration-1)
+* [Vault](#vault)
+    * [Authentication methods](#authentication-methods)
+        * [Token-based authentication](#token-based-authentication)
+        * [Kubernetes-based authentication](#kubernetes-based-authentication)
+    * [General Vault configuration](#general-vault-configuration)
+    * [TLS configuration](#tls-configuration)
+* [IBM Key Protect](#ibm-key-protect)
+    * [Configuration](#configuration)
+* [Key Management Interoperability Protocol](#key-management-interoperability-protocol)
+    * [Configuration](#configuration-1)
+* [Azure Key Vault](#azure-key-vault)
+    * [Client Authentication](#client-authentication)
 
 ## Vault
 
@@ -334,3 +336,36 @@ security:
     # name of the k8s secret containing the credentials.
     tokenSecretName: kmip-credentials
 ```
+
+## Azure Key Vault
+
+Rook supports storing OSD encryption keys in [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/quick-create-portal).
+
+### Client Authentication
+
+Different methods are available in Azure to authenticate a client. Rook supports the Azure-recommended method: authentication with a service principal and a certificate. Refer to the following Azure documentation to set up the key vault and authenticate to it via a service principal and certificate:
+
+* [Create Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/quick-create-portal)
+    * `AZURE_VAULT_URL` can be retrieved at this step
+
+* [Create Service Principal](https://learn.microsoft.com/en-us/entra/identity-platform/howto-create-service-principal)
+    * `AZURE_CLIENT_ID` and `AZURE_TENANT_ID` can be obtained after creating the service principal
+    * Ensure that the service principal is authenticated with a certificate and not with a client secret.
+
+* [Set Azure Key Vault RBAC](https://learn.microsoft.com/en-us/azure/key-vault/general/rbac-guide?tabs=azure-cli#enable-azure-rbac-permissions-on-key-vault)
+    * Ensure that the role assigned to the key vault is able to create, retrieve, and delete secrets in the key vault.
+
+Provide the following KMS connection details in order to connect with Azure Key Vault.
+
+```yaml
+security:
+  kms:
+    connectionDetails:
+      KMS_PROVIDER: azure-kv
+      AZURE_VAULT_URL: https://<vault-name>.vault.azure.net
+      AZURE_CLIENT_ID: Application ID of an Azure service principal
+      AZURE_TENANT_ID: ID of the application's Microsoft Entra tenant
+      AZURE_CERT_SECRET_NAME: <name of the k8s secret containing the certificate>
+```
+
+* `AZURE_CERT_SECRET_NAME` should hold the name of the k8s secret. The secret data should be the base64-encoded certificate along with the private key (without password protection).

From f7a9d8ff7b63150dd88f374a196a1764f69c0b70 Mon Sep 17 00:00:00 2001
From: parth-gr
Date: Thu, 14 Dec 2023 18:30:11 +0530
Subject: [PATCH 02/15] core: added rook-ceph-default service account

When a private docker registry is used and an image pull secret is
specified in the chart, the pods with the default service account fail
to pull the image due to authentication issues.

Added a rook-ceph-default service account and modified the pod
specifications by adding the serviceAccountName.

Closes: https://github.com/rook/rook/issues/12786
Closes: https://github.com/rook/rook/issues/6673

Co-authored-by: Tareq Sharafy
Signed-off-by: parth-gr
(cherry picked from commit 737fb099feafa01489b233cb64f889c61b3b6016)
Signed-off-by: parth-gr
---
 .../Prerequisites/authenticated-registry.md        | 13 ++++---------
 PendingReleaseNotes.md                             |  1 +
 build/csv/csv-gen.sh                               |  2 +-
 .../library/templates/_cluster-serviceaccount.tpl  | 11 +++++++++++
 .../templates/securityContextConstraints.yaml      |  1 +
 deploy/examples/common-second-cluster.yaml         | 12 ++++++++++++
 deploy/examples/common.yaml                        | 12 ++++++++++++
 pkg/apis/ceph.rook.io/v1/scc.go                    |  2 +-
 pkg/operator/ceph/cluster/cleanup.go               |  7 ++++---
 pkg/operator/ceph/cluster/mon/spec.go              |  7 ++++---
 pkg/operator/ceph/cluster/mon/spec_test.go         |  1 +
 pkg/operator/ceph/cluster/nodedaemon/crash.go      | 11 ++++++-----
 pkg/operator/ceph/cluster/nodedaemon/exporter.go   |  1 +
 .../ceph/cluster/nodedaemon/exporter_test.go       |  1 +
 pkg/operator/ceph/cluster/nodedaemon/pruner.go     |  7 ++++---
 pkg/operator/ceph/cluster/rbd/spec.go              |  9 +++++----
 pkg/operator/ceph/cluster/rbd/spec_test.go         |  3 ++-
 pkg/operator/ceph/file/mds/spec.go                 |  9 +++++----
 pkg/operator/ceph/file/mds/spec_test.go            |  2 +-
 pkg/operator/ceph/file/mirror/spec.go              |  9 +++++----
 pkg/operator/ceph/file/mirror/spec_test.go         |  2 ++
 pkg/operator/ceph/nfs/spec.go                      |  3 ++-
 pkg/operator/ceph/nfs/spec_test.go                 |  2 ++
 pkg/operator/k8sutil/cmdreporter/cmdreporter.go    |  3 ++-
 pkg/operator/k8sutil/k8sutil.go                    |  3 ++-
 tests/framework/installer/ceph_settings.go         |  1 +
 26 files changed, 93 insertions(+), 42 deletions(-)

diff --git a/Documentation/Getting-Started/Prerequisites/authenticated-registry.md b/Documentation/Getting-Started/Prerequisites/authenticated-registry.md
index 503f9234d5ae..e9f346dadcdc 100644
--- a/Documentation/Getting-Started/Prerequisites/authenticated-registry.md
+++ b/Documentation/Getting-Started/Prerequisites/authenticated-registry.md
@@ -3,9 +3,7 @@ title: Authenticated Container Registries
 ---
 
 If you want to use an image from authenticated docker registry (e.g. for image cache/mirror), you'll need to
-add an `imagePullSecret` to all relevant service accounts. This way all pods created by the operator (for service account:
-`rook-ceph-system`) or all new pods in the namespace (for service account: `default`) will have the `imagePullSecret` added
-to their spec.
+add an `imagePullSecret` to all relevant service accounts. See the next section for the required service accounts.
The whole process is described in the [official kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#add-imagepullsecrets-to-a-service-account).
 
@@ -29,25 +27,22 @@ imagePullSecrets:
 The service accounts are:
 
 * `rook-ceph-system` (namespace: `rook-ceph`): Will affect all pods created by the rook operator in the `rook-ceph` namespace.
-* `default` (namespace: `rook-ceph`): Will affect most pods in the `rook-ceph` namespace.
+* `rook-ceph-default` (namespace: `rook-ceph`): Will affect most pods in the `rook-ceph` namespace.
 * `rook-ceph-mgr` (namespace: `rook-ceph`): Will affect the MGR pods in the `rook-ceph` namespace.
 * `rook-ceph-osd` (namespace: `rook-ceph`): Will affect the OSD pods in the `rook-ceph` namespace.
 * `rook-ceph-rgw` (namespace: `rook-ceph`): Will affect the RGW pods in the `rook-ceph` namespace.
 
-You can do it either via e.g. `kubectl -n <namespace> edit serviceaccount default` or by modifying the [`operator.yaml`](https://github.com/rook/rook/blob/master/deploy/examples/operator.yaml)
-and [`cluster.yaml`](https://github.com/rook/rook/blob/master/deploy/examples/cluster.yaml) before deploying them.
-
 Since it's the same procedure for all service accounts, here is just one example:
 
 ```console
-kubectl -n rook-ceph edit serviceaccount default
+kubectl -n rook-ceph edit serviceaccount rook-ceph-default
 ```
 
 ```yaml hl_lines="9-10"
 apiVersion: v1
 kind: ServiceAccount
 metadata:
-  name: default
+  name: rook-ceph-default
   namespace: rook-ceph
 secrets:
 - name: default-token-12345

diff --git a/PendingReleaseNotes.md b/PendingReleaseNotes.md
index 3c0f7a94bdfe..76426906d391 100644
--- a/PendingReleaseNotes.md
+++ b/PendingReleaseNotes.md
@@ -9,3 +9,4 @@ read affinity setting in cephCluster CR (CSIDriverOptions section) in [PR](https
 ## Features
 
 - Kubernetes versions **v1.24** through **v1.29** are supported.
+- Ceph daemon pods using the `default` service account now use a new `rook-ceph-default` service account.
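[Editor's note: as an aside for clusters that are already running, the pull secret can also be attached to the new service account non-interactively. A minimal sketch, assuming a registry secret named `my-registry-secret` (the placeholder name used in the example manifests) already exists in the cluster namespace; note that a patch of this form replaces any `imagePullSecrets` list already present:]

```console
kubectl -n rook-ceph patch serviceaccount rook-ceph-default \
  -p '{"imagePullSecrets": [{"name": "my-registry-secret"}]}'
```

The same command applies to the other service accounts listed in the documentation above (`rook-ceph-system`, `rook-ceph-mgr`, `rook-ceph-osd`, `rook-ceph-rgw`).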
diff --git a/build/csv/csv-gen.sh b/build/csv/csv-gen.sh index e55d3f448563..cab017e01dc4 100755 --- a/build/csv/csv-gen.sh +++ b/build/csv/csv-gen.sh @@ -23,7 +23,7 @@ ASSEMBLE_FILE_OCP="../../deploy/olm/assemble/metadata-ocp.yaml" ############# function generate_csv() { - kubectl kustomize ../../deploy/examples/ | "$operator_sdk" generate bundle --package="rook-ceph" --output-dir="../../build/csv/ceph/$PLATFORM" --extra-service-accounts=rook-ceph-system,rook-csi-rbd-provisioner-sa,rook-csi-rbd-plugin-sa,rook-csi-cephfs-provisioner-sa,rook-csi-nfs-provisioner-sa,rook-csi-nfs-plugin-sa,rook-csi-cephfs-plugin-sa,rook-ceph-system,rook-ceph-rgw,rook-ceph-purge-osd,rook-ceph-osd,rook-ceph-mgr,rook-ceph-cmd-reporter + kubectl kustomize ../../deploy/examples/ | "$operator_sdk" generate bundle --package="rook-ceph" --output-dir="../../build/csv/ceph/$PLATFORM" --extra-service-accounts=rook-ceph-default,rook-csi-rbd-provisioner-sa,rook-csi-rbd-plugin-sa,rook-csi-cephfs-provisioner-sa,rook-csi-nfs-provisioner-sa,rook-csi-nfs-plugin-sa,rook-csi-cephfs-plugin-sa,rook-ceph-system,rook-ceph-rgw,rook-ceph-purge-osd,rook-ceph-osd,rook-ceph-mgr,rook-ceph-cmd-reporter # cleanup to get the expected state before merging the real data from assembles "${YQ_CMD_DELETE[@]}" "$CSV_FILE_NAME" 'spec.icon[*]' diff --git a/deploy/charts/library/templates/_cluster-serviceaccount.tpl b/deploy/charts/library/templates/_cluster-serviceaccount.tpl index fcc9932f3871..c6709f370972 100644 --- a/deploy/charts/library/templates/_cluster-serviceaccount.tpl +++ b/deploy/charts/library/templates/_cluster-serviceaccount.tpl @@ -57,4 +57,15 @@ metadata: storage-backend: ceph {{- include "library.rook-ceph.labels" . | nindent 4 }} {{ include "library.imagePullSecrets" . }} +--- +# Service account for other components +apiVersion: v1 +kind: ServiceAccount +metadata: + name: rook-ceph-default + namespace: {{ .Release.Namespace }} # namespace:cluster + labels: + operator: rook + storage-backend: ceph +{{ include "library.imagePullSecrets" . 
}} {{ end }} diff --git a/deploy/charts/rook-ceph-cluster/templates/securityContextConstraints.yaml b/deploy/charts/rook-ceph-cluster/templates/securityContextConstraints.yaml index 893350a9b205..f79bcef07f79 100644 --- a/deploy/charts/rook-ceph-cluster/templates/securityContextConstraints.yaml +++ b/deploy/charts/rook-ceph-cluster/templates/securityContextConstraints.yaml @@ -41,4 +41,5 @@ users: - system:serviceaccount:{{ .Release.Namespace }}:rook-ceph-mgr - system:serviceaccount:{{ .Release.Namespace }}:rook-ceph-osd - system:serviceaccount:{{ .Release.Namespace }}:rook-ceph-rgw + - system:serviceaccount:{{ .Release.Namespace }}:rook-ceph-default {{- end }} diff --git a/deploy/examples/common-second-cluster.yaml b/deploy/examples/common-second-cluster.yaml index a7b7ff72d01a..c19a618c2d55 100644 --- a/deploy/examples/common-second-cluster.yaml +++ b/deploy/examples/common-second-cluster.yaml @@ -224,6 +224,18 @@ metadata: name: rook-ceph-mgr namespace: rook-ceph-secondary # namespace:cluster --- +# Service account for other components +apiVersion: v1 +kind: ServiceAccount +metadata: + name: rook-ceph-default + namespace: rook-ceph-secondary # namespace:cluster + labels: + operator: rook + storage-backend: ceph +# imagePullSecrets: +# - name: my-registry-secret +--- apiVersion: v1 kind: ServiceAccount metadata: diff --git a/deploy/examples/common.yaml b/deploy/examples/common.yaml index c344860c1b04..5495bf631ab3 100644 --- a/deploy/examples/common.yaml +++ b/deploy/examples/common.yaml @@ -1154,6 +1154,18 @@ metadata: # imagePullSecrets: # - name: my-registry-secret --- +# Service account for other components +apiVersion: v1 +kind: ServiceAccount +metadata: + name: rook-ceph-default + namespace: rook-ceph # namespace:cluster + labels: + operator: rook + storage-backend: ceph +# imagePullSecrets: +# - name: my-registry-secret +--- # Service account for Ceph mgrs apiVersion: v1 kind: ServiceAccount diff --git a/pkg/apis/ceph.rook.io/v1/scc.go b/pkg/apis/ceph.rook.io/v1/scc.go index 8db6efb7454b..8a76c1566903 100644 --- a/pkg/apis/ceph.rook.io/v1/scc.go +++ b/pkg/apis/ceph.rook.io/v1/scc.go @@ -69,7 +69,7 @@ func NewSecurityContextConstraints(name string, namespaces ...string) *secv1.Sec for _, ns := range namespaces { users = append(users, []string{ fmt.Sprintf("system:serviceaccount:%s:rook-ceph-system", ns), - fmt.Sprintf("system:serviceaccount:%s:default", ns), + fmt.Sprintf("system:serviceaccount:%s:rook-ceph-default", ns), fmt.Sprintf("system:serviceaccount:%s:rook-ceph-mgr", ns), fmt.Sprintf("system:serviceaccount:%s:rook-ceph-osd", ns), fmt.Sprintf("system:serviceaccount:%s:rook-ceph-rgw", ns), diff --git a/pkg/operator/ceph/cluster/cleanup.go b/pkg/operator/ceph/cluster/cleanup.go index 23ece72b1eda..6c87353b09c4 100644 --- a/pkg/operator/ceph/cluster/cleanup.go +++ b/pkg/operator/ceph/cluster/cleanup.go @@ -158,9 +158,10 @@ func (c *ClusterController) cleanUpJobTemplateSpec(cluster *cephv1.CephCluster, Containers: []v1.Container{ c.cleanUpJobContainer(cluster, monSecret, clusterFSID), }, - Volumes: volumes, - RestartPolicy: v1.RestartPolicyOnFailure, - PriorityClassName: cephv1.GetCleanupPriorityClassName(cluster.Spec.PriorityClassNames), + Volumes: volumes, + RestartPolicy: v1.RestartPolicyOnFailure, + PriorityClassName: cephv1.GetCleanupPriorityClassName(cluster.Spec.PriorityClassNames), + ServiceAccountName: k8sutil.DefaultServiceAccount, }, } diff --git a/pkg/operator/ceph/cluster/mon/spec.go b/pkg/operator/ceph/cluster/mon/spec.go index af8db024e8bb..69ea83d8888c 100644 --- 
a/pkg/operator/ceph/cluster/mon/spec.go +++ b/pkg/operator/ceph/cluster/mon/spec.go @@ -186,9 +186,10 @@ func (c *Cluster) makeMonPod(monConfig *monConfig, canary bool) (*corev1.Pod, er RestartPolicy: corev1.RestartPolicyAlways, // we decide later whether to use a PVC volume or host volumes for mons, so only populate // the base volumes at this point. - Volumes: controller.DaemonVolumesBase(monConfig.DataPathMap, keyringStoreName, c.spec.DataDirHostPath), - HostNetwork: monConfig.UseHostNetwork, - PriorityClassName: cephv1.GetMonPriorityClassName(c.spec.PriorityClassNames), + Volumes: controller.DaemonVolumesBase(monConfig.DataPathMap, keyringStoreName, c.spec.DataDirHostPath), + HostNetwork: monConfig.UseHostNetwork, + PriorityClassName: cephv1.GetMonPriorityClassName(c.spec.PriorityClassNames), + ServiceAccountName: k8sutil.DefaultServiceAccount, } // If the log collector is enabled we add the side-car container diff --git a/pkg/operator/ceph/cluster/mon/spec_test.go b/pkg/operator/ceph/cluster/mon/spec_test.go index 3c5d0b43280f..d336654a313d 100644 --- a/pkg/operator/ceph/cluster/mon/spec_test.go +++ b/pkg/operator/ceph/cluster/mon/spec_test.go @@ -72,6 +72,7 @@ func testPodSpec(t *testing.T, monID string, pvc bool) { d, err := c.makeDeployment(monConfig, false) assert.NoError(t, err) assert.NotNil(t, d) + assert.Equal(t, k8sutil.DefaultServiceAccount, d.Spec.Template.Spec.ServiceAccountName) if pvc { d.Spec.Template.Spec.Volumes = append( diff --git a/pkg/operator/ceph/cluster/nodedaemon/crash.go b/pkg/operator/ceph/cluster/nodedaemon/crash.go index c6b9e7e09412..97abb34a3f9d 100644 --- a/pkg/operator/ceph/cluster/nodedaemon/crash.go +++ b/pkg/operator/ceph/cluster/nodedaemon/crash.go @@ -116,11 +116,12 @@ func (r *ReconcileNode) createOrUpdateCephCrash(node corev1.Node, tolerations [] Containers: []corev1.Container{ getCrashDaemonContainer(cephCluster, *cephVersion), }, - Tolerations: tolerations, - RestartPolicy: corev1.RestartPolicyAlways, - HostNetwork: cephCluster.Spec.Network.IsHost(), - Volumes: volumes, - PriorityClassName: cephv1.GetCrashCollectorPriorityClassName(cephCluster.Spec.PriorityClassNames), + Tolerations: tolerations, + RestartPolicy: corev1.RestartPolicyAlways, + HostNetwork: cephCluster.Spec.Network.IsHost(), + Volumes: volumes, + PriorityClassName: cephv1.GetCrashCollectorPriorityClassName(cephCluster.Spec.PriorityClassNames), + ServiceAccountName: k8sutil.DefaultServiceAccount, }, } diff --git a/pkg/operator/ceph/cluster/nodedaemon/exporter.go b/pkg/operator/ceph/cluster/nodedaemon/exporter.go index f0a196384eb6..ca66fb540ce5 100644 --- a/pkg/operator/ceph/cluster/nodedaemon/exporter.go +++ b/pkg/operator/ceph/cluster/nodedaemon/exporter.go @@ -143,6 +143,7 @@ func (r *ReconcileNode) createOrUpdateCephExporter(node corev1.Node, tolerations Volumes: volumes, PriorityClassName: cephv1.GetCephExporterPriorityClassName(cephCluster.Spec.PriorityClassNames), TerminationGracePeriodSeconds: &terminationGracePeriodSeconds, + ServiceAccountName: k8sutil.DefaultServiceAccount, }, } cephv1.GetCephExporterAnnotations(cephCluster.Spec.Annotations).ApplyToObjectMeta(&deploy.Spec.Template.ObjectMeta) diff --git a/pkg/operator/ceph/cluster/nodedaemon/exporter_test.go b/pkg/operator/ceph/cluster/nodedaemon/exporter_test.go index fa3e635a0d35..6a72b1776bf9 100644 --- a/pkg/operator/ceph/cluster/nodedaemon/exporter_test.go +++ b/pkg/operator/ceph/cluster/nodedaemon/exporter_test.go @@ -103,6 +103,7 @@ func TestCreateOrUpdateCephExporter(t *testing.T) { assert.Equal(t, tolerations, 
podSpec.Spec.Tolerations) assert.Equal(t, false, podSpec.Spec.HostNetwork) assert.Equal(t, "", podSpec.Spec.PriorityClassName) + assert.Equal(t, k8sutil.DefaultServiceAccount, podSpec.Spec.ServiceAccountName) assertCephExporterArgs(t, podSpec.Spec.Containers[0].Args, cephCluster.Spec.Network.DualStack || cephCluster.Spec.Network.IPFamily == "IPv6") diff --git a/pkg/operator/ceph/cluster/nodedaemon/pruner.go b/pkg/operator/ceph/cluster/nodedaemon/pruner.go index 25e19d22845a..bb8e3966bf92 100644 --- a/pkg/operator/ceph/cluster/nodedaemon/pruner.go +++ b/pkg/operator/ceph/cluster/nodedaemon/pruner.go @@ -107,9 +107,10 @@ func (r *ReconcileNode) createOrUpdateCephCron(cephCluster cephv1.CephCluster, c Containers: []corev1.Container{ getCrashPruneContainer(cephCluster, *cephVersion), }, - RestartPolicy: corev1.RestartPolicyNever, - HostNetwork: cephCluster.Spec.Network.IsHost(), - Volumes: volumes, + RestartPolicy: corev1.RestartPolicyNever, + HostNetwork: cephCluster.Spec.Network.IsHost(), + Volumes: volumes, + ServiceAccountName: k8sutil.DefaultServiceAccount, }, } diff --git a/pkg/operator/ceph/cluster/rbd/spec.go b/pkg/operator/ceph/cluster/rbd/spec.go index 35ba7bae831f..2b846eae826d 100644 --- a/pkg/operator/ceph/cluster/rbd/spec.go +++ b/pkg/operator/ceph/cluster/rbd/spec.go @@ -39,10 +39,11 @@ func (r *ReconcileCephRBDMirror) makeDeployment(daemonConfig *daemonConfig, rbdM Containers: []v1.Container{ r.makeMirroringDaemonContainer(daemonConfig, rbdMirror), }, - RestartPolicy: v1.RestartPolicyAlways, - Volumes: controller.DaemonVolumes(daemonConfig.DataPathMap, daemonConfig.ResourceName, r.cephClusterSpec.DataDirHostPath), - HostNetwork: r.cephClusterSpec.Network.IsHost(), - PriorityClassName: rbdMirror.Spec.PriorityClassName, + RestartPolicy: v1.RestartPolicyAlways, + Volumes: controller.DaemonVolumes(daemonConfig.DataPathMap, daemonConfig.ResourceName, r.cephClusterSpec.DataDirHostPath), + HostNetwork: r.cephClusterSpec.Network.IsHost(), + PriorityClassName: rbdMirror.Spec.PriorityClassName, + ServiceAccountName: k8sutil.DefaultServiceAccount, }, } diff --git a/pkg/operator/ceph/cluster/rbd/spec_test.go b/pkg/operator/ceph/cluster/rbd/spec_test.go index 981f8538a613..d03596645f52 100644 --- a/pkg/operator/ceph/cluster/rbd/spec_test.go +++ b/pkg/operator/ceph/cluster/rbd/spec_test.go @@ -23,9 +23,9 @@ import ( "github.com/rook/rook/pkg/client/clientset/versioned/scheme" cephclient "github.com/rook/rook/pkg/daemon/ceph/client" "github.com/rook/rook/pkg/operator/ceph/config" - "github.com/rook/rook/pkg/operator/ceph/test" cephver "github.com/rook/rook/pkg/operator/ceph/version" + "github.com/rook/rook/pkg/operator/k8sutil" "github.com/stretchr/testify/assert" v1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/resource" @@ -91,6 +91,7 @@ func TestPodSpec(t *testing.T) { assert.Equal(t, 5, len(d.Spec.Template.Spec.Volumes)) assert.Equal(t, 1, len(d.Spec.Template.Spec.Volumes[0].Projected.Sources)) assert.Equal(t, 5, len(d.Spec.Template.Spec.Containers[0].VolumeMounts)) + assert.Equal(t, k8sutil.DefaultServiceAccount, d.Spec.Template.Spec.ServiceAccountName) // Deployment should have Ceph labels test.AssertLabelsContainCephRequirements(t, d.ObjectMeta.Labels, diff --git a/pkg/operator/ceph/file/mds/spec.go b/pkg/operator/ceph/file/mds/spec.go index 83d38dab2843..426957a2409c 100644 --- a/pkg/operator/ceph/file/mds/spec.go +++ b/pkg/operator/ceph/file/mds/spec.go @@ -61,10 +61,11 @@ func (c *Cluster) makeDeployment(mdsConfig *mdsConfig, fsNamespacedname types.Na Containers: []v1.Container{ 
mdsContainer, }, - RestartPolicy: v1.RestartPolicyAlways, - Volumes: controller.DaemonVolumes(mdsConfig.DataPathMap, mdsConfig.ResourceName, c.clusterSpec.DataDirHostPath), - HostNetwork: c.clusterSpec.Network.IsHost(), - PriorityClassName: c.fs.Spec.MetadataServer.PriorityClassName, + RestartPolicy: v1.RestartPolicyAlways, + Volumes: controller.DaemonVolumes(mdsConfig.DataPathMap, mdsConfig.ResourceName, c.clusterSpec.DataDirHostPath), + HostNetwork: c.clusterSpec.Network.IsHost(), + PriorityClassName: c.fs.Spec.MetadataServer.PriorityClassName, + ServiceAccountName: k8sutil.DefaultServiceAccount, }, } diff --git a/pkg/operator/ceph/file/mds/spec_test.go b/pkg/operator/ceph/file/mds/spec_test.go index 18445edc14b7..803c3b6c019e 100644 --- a/pkg/operator/ceph/file/mds/spec_test.go +++ b/pkg/operator/ceph/file/mds/spec_test.go @@ -28,7 +28,6 @@ import ( "github.com/rook/rook/pkg/clusterd" cephclient "github.com/rook/rook/pkg/daemon/ceph/client" cephver "github.com/rook/rook/pkg/operator/ceph/version" - testop "github.com/rook/rook/pkg/operator/test" "github.com/stretchr/testify/assert" apps "k8s.io/api/apps/v1" @@ -104,6 +103,7 @@ func TestPodSpecs(t *testing.T) { assert.NotNil(t, d) assert.Equal(t, v1.RestartPolicyAlways, d.Spec.Template.Spec.RestartPolicy) + assert.Equal(t, k8sutil.DefaultServiceAccount, d.Spec.Template.Spec.ServiceAccountName) // Deployment should have Ceph labels test.AssertLabelsContainCephRequirements(t, d.ObjectMeta.Labels, diff --git a/pkg/operator/ceph/file/mirror/spec.go b/pkg/operator/ceph/file/mirror/spec.go index 4b197e039956..8e9153b5bc28 100644 --- a/pkg/operator/ceph/file/mirror/spec.go +++ b/pkg/operator/ceph/file/mirror/spec.go @@ -42,10 +42,11 @@ func (r *ReconcileFilesystemMirror) makeDeployment(daemonConfig *daemonConfig, f Containers: []v1.Container{ r.makeFsMirroringDaemonContainer(daemonConfig, fsMirror), }, - RestartPolicy: v1.RestartPolicyAlways, - Volumes: controller.DaemonVolumes(daemonConfig.DataPathMap, daemonConfig.ResourceName, r.cephClusterSpec.DataDirHostPath), - HostNetwork: r.cephClusterSpec.Network.IsHost(), - PriorityClassName: fsMirror.Spec.PriorityClassName, + RestartPolicy: v1.RestartPolicyAlways, + Volumes: controller.DaemonVolumes(daemonConfig.DataPathMap, daemonConfig.ResourceName, r.cephClusterSpec.DataDirHostPath), + HostNetwork: r.cephClusterSpec.Network.IsHost(), + PriorityClassName: fsMirror.Spec.PriorityClassName, + ServiceAccountName: k8sutil.DefaultServiceAccount, }, } diff --git a/pkg/operator/ceph/file/mirror/spec_test.go b/pkg/operator/ceph/file/mirror/spec_test.go index 0bf8cc1dd0e1..256705fab3c9 100644 --- a/pkg/operator/ceph/file/mirror/spec_test.go +++ b/pkg/operator/ceph/file/mirror/spec_test.go @@ -25,6 +25,7 @@ import ( "github.com/rook/rook/pkg/operator/ceph/config" "github.com/rook/rook/pkg/operator/ceph/test" cephver "github.com/rook/rook/pkg/operator/ceph/version" + "github.com/rook/rook/pkg/operator/k8sutil" "github.com/stretchr/testify/assert" v1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/resource" @@ -88,6 +89,7 @@ func TestPodSpec(t *testing.T) { assert.Equal(t, 5, len(d.Spec.Template.Spec.Volumes)) assert.Equal(t, 1, len(d.Spec.Template.Spec.Volumes[0].Projected.Sources)) assert.Equal(t, 5, len(d.Spec.Template.Spec.Containers[0].VolumeMounts)) + assert.Equal(t, k8sutil.DefaultServiceAccount, d.Spec.Template.Spec.ServiceAccountName) // Deployment should have Ceph labels test.AssertLabelsContainCephRequirements(t, d.ObjectMeta.Labels, diff --git a/pkg/operator/ceph/nfs/spec.go 
b/pkg/operator/ceph/nfs/spec.go index 4c4bcbf45e8d..10edf9399f6c 100644 --- a/pkg/operator/ceph/nfs/spec.go +++ b/pkg/operator/ceph/nfs/spec.go @@ -148,7 +148,8 @@ func (r *ReconcileCephNFS) makeDeployment(nfs *cephv1.CephNFS, cfg daemonConfig) // for kerberos, nfs-ganesha uses the hostname via getaddrinfo() and uses that when // connecting to the krb server. give all ganesha servers the same hostname so they can all // use the same krb credentials to auth - Hostname: fmt.Sprintf("%s-%s", nfs.Namespace, nfs.Name), + Hostname: fmt.Sprintf("%s-%s", nfs.Namespace, nfs.Name), + ServiceAccountName: k8sutil.DefaultServiceAccount, } // Replace default unreachable node toleration k8sutil.AddUnreachableNodeToleration(&podSpec) diff --git a/pkg/operator/ceph/nfs/spec_test.go b/pkg/operator/ceph/nfs/spec_test.go index 548faeb830fd..870321581a39 100644 --- a/pkg/operator/ceph/nfs/spec_test.go +++ b/pkg/operator/ceph/nfs/spec_test.go @@ -26,6 +26,7 @@ import ( cephclient "github.com/rook/rook/pkg/daemon/ceph/client" "github.com/rook/rook/pkg/operator/ceph/config" cephver "github.com/rook/rook/pkg/operator/ceph/version" + "github.com/rook/rook/pkg/operator/k8sutil" optest "github.com/rook/rook/pkg/operator/test" exectest "github.com/rook/rook/pkg/util/exec/test" "github.com/stretchr/testify/assert" @@ -145,6 +146,7 @@ func TestDeploymentSpec(t *testing.T) { }, ) assert.Equal(t, "my-priority-class", d.Spec.Template.Spec.PriorityClassName) + assert.Equal(t, k8sutil.DefaultServiceAccount, d.Spec.Template.Spec.ServiceAccountName) }) t.Run("with sssd sidecar", func(t *testing.T) { diff --git a/pkg/operator/k8sutil/cmdreporter/cmdreporter.go b/pkg/operator/k8sutil/cmdreporter/cmdreporter.go index 11aa47f11f6f..affc87558f9e 100644 --- a/pkg/operator/k8sutil/cmdreporter/cmdreporter.go +++ b/pkg/operator/k8sutil/cmdreporter/cmdreporter.go @@ -300,7 +300,8 @@ func (cr *cmdReporterCfg) initJobSpec() (*batch.Job, error) { Containers: []v1.Container{ *cmdReporterContainer, }, - RestartPolicy: v1.RestartPolicyOnFailure, + RestartPolicy: v1.RestartPolicyOnFailure, + ServiceAccountName: k8sutil.DefaultServiceAccount, } copyBinsVol, _ := copyBinariesVolAndMount() podSpec.Volumes = []v1.Volume{copyBinsVol} diff --git a/pkg/operator/k8sutil/k8sutil.go b/pkg/operator/k8sutil/k8sutil.go index 32b8fbbbd8e2..e816f97b980e 100644 --- a/pkg/operator/k8sutil/k8sutil.go +++ b/pkg/operator/k8sutil/k8sutil.go @@ -54,10 +54,11 @@ const ( PodNamespaceEnvVar = "POD_NAMESPACE" // NodeNameEnvVar is the env variable for getting the node via downward api NodeNameEnvVar = "NODE_NAME" - // RookVersionLabelKey is the key used for reporting the Rook version which last created or // modified a resource. RookVersionLabelKey = "rook-version" + // DefaultServiceAccount is a service-account used for components that do not specify a dedicated service-account. 
+ DefaultServiceAccount = "rook-ceph-default" ) // GetK8SVersion gets the version of the running K8S cluster diff --git a/tests/framework/installer/ceph_settings.go b/tests/framework/installer/ceph_settings.go index 41d1d01dcb76..3fb0e7cb1501 100644 --- a/tests/framework/installer/ceph_settings.go +++ b/tests/framework/installer/ceph_settings.go @@ -99,6 +99,7 @@ func replaceNamespaces(name, manifest, operatorNamespace, clusterNamespace strin // SCC namespaces for operator and Ceph daemons manifest = strings.ReplaceAll(manifest, "rook-ceph:rook-ceph-system # serviceaccount:namespace:operator", operatorNamespace+":rook-ceph-system") + manifest = strings.ReplaceAll(manifest, "rook-ceph:rook-ceph-default # serviceaccount:namespace:cluster", clusterNamespace+":rook-ceph-default") manifest = strings.ReplaceAll(manifest, "rook-ceph:rook-ceph-mgr # serviceaccount:namespace:cluster", clusterNamespace+":rook-ceph-mgr") manifest = strings.ReplaceAll(manifest, "rook-ceph:rook-ceph-osd # serviceaccount:namespace:cluster", clusterNamespace+":rook-ceph-osd") manifest = strings.ReplaceAll(manifest, "rook-ceph:rook-ceph-rgw # serviceaccount:namespace:cluster", clusterNamespace+":rook-ceph-rgw") From 10dea459ec32f21b190dec0e3bcc6a9bce3db30b Mon Sep 17 00:00:00 2001 From: Praveen M Date: Thu, 22 Feb 2024 14:58:12 +0530 Subject: [PATCH 03/15] doc: pending release notes for update netNamespaceFilePath PR Signed-off-by: Praveen M --- PendingReleaseNotes.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/PendingReleaseNotes.md b/PendingReleaseNotes.md index 3c0f7a94bdfe..6fdb6afdb181 100644 --- a/PendingReleaseNotes.md +++ b/PendingReleaseNotes.md @@ -5,7 +5,9 @@ - The removal of `CSI_ENABLE_READ_AFFINITY` option and its replacement with per-cluster read affinity setting in cephCluster CR (CSIDriverOptions section) in [PR](https://github.com/rook/rook/pull/13665) - Allow setting the Ceph `application` on a pool - +- updating `netNamespaceFilePath` for all clusterIDs in rook-ceph-csi-config configMap in [PR](https://github.com/rook/rook/pull/13613) + - Issue: The netNamespaceFilePath isn't updated in the CSI config map for all the clusterIDs when `CSI_ENABLE_HOST_NETWORK` is set to false in `operator.yaml` + - Impact: This results in the unintended network configurations, with pods using the host networking instead of pod networking. ## Features - Kubernetes versions **v1.24** through **v1.29** are supported. From ea700bcf9a5449784f15adb66c7e118438185c22 Mon Sep 17 00:00:00 2001 From: karthik-us Date: Thu, 29 Feb 2024 11:22:12 +0530 Subject: [PATCH 04/15] doc: fix broken links Fixing the broken links in the docs. Signed-off-by: karthik-us --- Documentation/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/README.md b/Documentation/README.md index ede5c852315c..55f9e3b288ed 100644 --- a/Documentation/README.md +++ b/Documentation/README.md @@ -18,11 +18,11 @@ Rook is hosted by the [Cloud Native Computing Foundation](https://cncf.io) (CNCF ## Quick Start Guide Starting Ceph in your cluster is as simple as a few `kubectl` commands. -See our [Quickstart](quickstart.md) guide to get started with the Ceph operator! +See our [Quickstart](https://github.com/rook/rook/tree/master/Documentation/Getting-Started/quickstart.md) guide to get started with the Ceph operator! 
 ## Designs
 
-[Ceph](https://docs.ceph.com/en/latest/) is a highly scalable distributed storage solution for block storage, object storage, and shared filesystems with years of production deployments. See the [Ceph overview](storage-architecture.md).
+[Ceph](https://docs.ceph.com/en/latest/) is a highly scalable distributed storage solution for block storage, object storage, and shared filesystems with years of production deployments. See the [Ceph overview](https://github.com/rook/rook/tree/master/Documentation/Getting-Started/storage-architecture.md).
 
 For detailed design documentation, see also the [design docs](https://github.com/rook/rook/tree/master/design).

From 532e865ba2a4be55fab0c8c3feef2e71b5f38aee Mon Sep 17 00:00:00 2001
From: karthik-us
Date: Fri, 1 Mar 2024 00:10:39 +0530
Subject: [PATCH 05/15] Revert "doc: fix broken links"

This reverts commit ea700bcf9a5449784f15adb66c7e118438185c22.

Signed-off-by: karthik-us
---
 Documentation/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/README.md b/Documentation/README.md
index 55f9e3b288ed..ede5c852315c 100644
--- a/Documentation/README.md
+++ b/Documentation/README.md
@@ -18,11 +18,11 @@ Rook is hosted by the [Cloud Native Computing Foundation](https://cncf.io) (CNCF
 ## Quick Start Guide
 
 Starting Ceph in your cluster is as simple as a few `kubectl` commands.
-See our [Quickstart](https://github.com/rook/rook/tree/master/Documentation/Getting-Started/quickstart.md) guide to get started with the Ceph operator!
+See our [Quickstart](quickstart.md) guide to get started with the Ceph operator!
 
 ## Designs
 
-[Ceph](https://docs.ceph.com/en/latest/) is a highly scalable distributed storage solution for block storage, object storage, and shared filesystems with years of production deployments. See the [Ceph overview](https://github.com/rook/rook/tree/master/Documentation/Getting-Started/storage-architecture.md).
+[Ceph](https://docs.ceph.com/en/latest/) is a highly scalable distributed storage solution for block storage, object storage, and shared filesystems with years of production deployments. See the [Ceph overview](storage-architecture.md).
 
 For detailed design documentation, see also the [design docs](https://github.com/rook/rook/tree/master/design).

From d12478beb353c586cccdb9a00c799457e562fe51 Mon Sep 17 00:00:00 2001
From: Scott Miller
Date: Mon, 29 Jan 2024 14:18:49 -0500
Subject: [PATCH 06/15] build: add ability to stash docker build context

This commit gives builders the necessary tooling to save off a docker
build context for use with other tools that don't follow the same
command format as $DOCKERCMD.
Signed-off-by: Scott Miller
---
 images/ceph/Makefile | 35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/images/ceph/Makefile b/images/ceph/Makefile
index 495fc037f87e..e93dab7d9fc8 100755
--- a/images/ceph/Makefile
+++ b/images/ceph/Makefile
@@ -31,7 +31,9 @@ YQv3_VERSION = 3.4.1
 GOHOST := GOOS=$(GOHOSTOS) GOARCH=$(GOHOSTARCH) go
 MANIFESTS_DIR=../../deploy/examples
 
-TEMP := $(shell mktemp -d)
+ifeq ($(BUILD_CONTEXT_DIR),)
+BUILD_CONTEXT_DIR := $(shell mktemp -d)
+endif
 
 # Note: as of version 1.3 of operator-sdk, the url format changed to:
 # ${OPERATOR_SDK_DL_URL}/operator-sdk_${OS}_${ARCH}
@@ -68,33 +70,36 @@ export OPERATOR_SDK YQv3
 
 do.build:
 	@echo === container build $(CEPH_IMAGE)
-	@cp Dockerfile $(TEMP)
-	@cp toolbox.sh $(TEMP)
-	@cp set-ceph-debug-level $(TEMP)
-	@cp $(OUTPUT_DIR)/bin/linux_$(GOARCH)/rook $(TEMP)
-	@cp -r $(MANIFESTS_DIR)/monitoring $(TEMP)/ceph-monitoring
-	@mkdir -p $(TEMP)/rook-external/test-data
-	@cp $(MANIFESTS_DIR)/create-external-cluster-resources.* $(TEMP)/rook-external/
-	@cp ../../tests/ceph-status-out $(TEMP)/rook-external/test-data/
+	@mkdir -p $(BUILD_CONTEXT_DIR)
+	@cp Dockerfile $(BUILD_CONTEXT_DIR)
+	@cp toolbox.sh $(BUILD_CONTEXT_DIR)
+	@cp set-ceph-debug-level $(BUILD_CONTEXT_DIR)
+	@cp $(OUTPUT_DIR)/bin/linux_$(GOARCH)/rook $(BUILD_CONTEXT_DIR)
+	@cp -r $(MANIFESTS_DIR)/monitoring $(BUILD_CONTEXT_DIR)/ceph-monitoring
+	@mkdir -p $(BUILD_CONTEXT_DIR)/rook-external/test-data
+	@cp $(MANIFESTS_DIR)/create-external-cluster-resources.* $(BUILD_CONTEXT_DIR)/rook-external/
+	@cp ../../tests/ceph-status-out $(BUILD_CONTEXT_DIR)/rook-external/test-data/
 ifeq ($(INCLUDE_CSV_TEMPLATES),true)
 	@$(MAKE) csv
-	@cp -r ../../build/csv $(TEMP)/ceph-csv-templates
-	@rm $(TEMP)/ceph-csv-templates/csv-gen.sh
+	@cp -r ../../build/csv $(BUILD_CONTEXT_DIR)/ceph-csv-templates
+	@rm $(BUILD_CONTEXT_DIR)/ceph-csv-templates/csv-gen.sh
 	@$(MAKE) csv-clean
 else
-	mkdir $(TEMP)/ceph-csv-templates
+	mkdir $(BUILD_CONTEXT_DIR)/ceph-csv-templates
endif
-	@cd $(TEMP) && $(SED_IN_PLACE) 's|BASEIMAGE|$(BASEIMAGE)|g' Dockerfile
+	@cd $(BUILD_CONTEXT_DIR) && $(SED_IN_PLACE) 's|BASEIMAGE|$(BASEIMAGE)|g' Dockerfile
 	@if [ -z "$(BUILD_CONTAINER_IMAGE)" ]; then\
 		$(DOCKERCMD) build $(BUILD_ARGS) \
 		--build-arg S5CMD_VERSION=$(S5CMD_VERSION) \
 		--build-arg S5CMD_ARCH=$(S5CMD_ARCH) \
 		-t $(CEPH_IMAGE) \
-		$(TEMP);\
+		$(BUILD_CONTEXT_DIR);\
+	fi
+	@if [ -z "$(SAVE_BUILD_CONTEXT_DIR)" ]; then\
+		rm -fr $(BUILD_CONTEXT_DIR);\
 	fi
-	@rm -fr $(TEMP)
 
 # call this before building multiple arches in parallel to prevent parallel build processes from
 # conflicting

From 77bfcd46e77e9b4f5be0a3edf7cec0e568d243cd Mon Sep 17 00:00:00 2001
From: Madhu Rajanna
Date: Wed, 28 Feb 2024 14:04:13 +0100
Subject: [PATCH 07/15] csi: add rbac required for vgs

Added the RBAC rules required for the volumegroupsnapshot feature.
Signed-off-by: Madhu Rajanna --- .../rook-ceph/templates/clusterrole.yaml | 26 ++++++++++++++++--- deploy/examples/common.yaml | 26 ++++++++++++++++--- 2 files changed, 44 insertions(+), 8 deletions(-) diff --git a/deploy/charts/rook-ceph/templates/clusterrole.yaml b/deploy/charts/rook-ceph/templates/clusterrole.yaml index 12c2ad02e105..e99f0c0c10f7 100644 --- a/deploy/charts/rook-ceph/templates/clusterrole.yaml +++ b/deploy/charts/rook-ceph/templates/clusterrole.yaml @@ -500,16 +500,25 @@ rules: verbs: ["patch"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshots"] - verbs: ["get", "list"] + verbs: ["get", "list", "watch", "update", "patch", "create"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshotclasses"] verbs: ["get", "list", "watch"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshotcontents"] - verbs: ["get", "list", "watch", "patch", "update"] + verbs: ["get", "list", "watch", "patch", "update", "create"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshotcontents/status"] verbs: ["update", "patch"] + - apiGroups: ["groupsnapshot.storage.k8s.io"] + resources: ["volumegroupsnapshotclasses"] + verbs: ["get", "list", "watch"] + - apiGroups: ["groupsnapshot.storage.k8s.io"] + resources: ["volumegroupsnapshotcontents"] + verbs: ["get", "list", "watch", "update", "patch"] + - apiGroups: ["groupsnapshot.storage.k8s.io"] + resources: ["volumegroupsnapshotcontents/status"] + verbs: ["update", "patch"] --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1 @@ -579,16 +588,25 @@ rules: verbs: ["patch"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshots"] - verbs: ["get", "list", "watch"] + verbs: ["get", "list", "watch", "update", "patch", "create"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshotclasses"] verbs: ["get", "list", "watch"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshotcontents"] - verbs: ["get", "list", "watch", "patch", "update"] + verbs: ["get", "list", "watch", "patch", "update", "create"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshotcontents/status"] verbs: ["update", "patch"] + - apiGroups: ["groupsnapshot.storage.k8s.io"] + resources: ["volumegroupsnapshotclasses"] + verbs: ["get", "list", "watch"] + - apiGroups: ["groupsnapshot.storage.k8s.io"] + resources: ["volumegroupsnapshotcontents"] + verbs: ["get", "list", "watch", "update", "patch"] + - apiGroups: ["groupsnapshot.storage.k8s.io"] + resources: ["volumegroupsnapshotcontents/status"] + verbs: ["update", "patch"] - apiGroups: [""] resources: ["configmaps"] verbs: ["get"] diff --git a/deploy/examples/common.yaml b/deploy/examples/common.yaml index 5495bf631ab3..ed523e8cb051 100644 --- a/deploy/examples/common.yaml +++ b/deploy/examples/common.yaml @@ -54,16 +54,25 @@ rules: verbs: ["patch"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshots"] - verbs: ["get", "list"] + verbs: ["get", "list", "watch", "update", "patch", "create"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshotclasses"] verbs: ["get", "list", "watch"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshotcontents"] - verbs: ["get", "list", "watch", "patch", "update"] + verbs: ["get", "list", "watch", "patch", "update", "create"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshotcontents/status"] verbs: ["update", "patch"] + - apiGroups: ["groupsnapshot.storage.k8s.io"] + resources: ["volumegroupsnapshotclasses"] + 
    verbs: ["get", "list", "watch"]
+  - apiGroups: ["groupsnapshot.storage.k8s.io"]
+    resources: ["volumegroupsnapshotcontents"]
+    verbs: ["get", "list", "watch", "update", "patch"]
+  - apiGroups: ["groupsnapshot.storage.k8s.io"]
+    resources: ["volumegroupsnapshotcontents/status"]
+    verbs: ["update", "patch"]
   - apiGroups: [""]
     resources: ["configmaps"]
     verbs: ["get"]

From 0d5bd70194bd9edd0d0c7c11718fac1ec9673e7c Mon Sep 17 00:00:00 2001
From: Madhu Rajanna
Date: Wed, 28 Feb 2024 14:12:00 +0100
Subject: [PATCH 08/15] csi: install vgs CRD in tests

Update the snapshot controller to v7.0.1 and install the new
VolumeGroup CRDs.

Signed-off-by: Madhu Rajanna
---
 tests/framework/utils/snapshot.go | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/tests/framework/utils/snapshot.go b/tests/framework/utils/snapshot.go
index 1a20d6144876..254c1b6c77c0 100644
--- a/tests/framework/utils/snapshot.go
+++ b/tests/framework/utils/snapshot.go
@@ -27,14 +27,18 @@ import (
 const (
 	// snapshotterVersion from which the snapshotcontroller and CRD will be
 	// installed
-	snapshotterVersion = "v5.0.1"
+	snapshotterVersion = "v7.0.1"
 	repoURL            = "https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter"
 	rbacPath           = "deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml"
 	controllerPath     = "deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml"
-	// snapshot CRD path
+	// snapshot CRD path
 	snapshotClassCRDPath          = "client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml"
 	volumeSnapshotContentsCRDPath = "client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml"
 	volumeSnapshotCRDPath         = "client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml"
+	// volumegroupsnapshot CRD path
+	volumeGroupSnapshotClassCRDPath    = "client/config/crd/groupsnapshot.storage.k8s.io_volumegroupsnapshotclasses.yaml"
+	volumeGroupSnapshotContentsCRDPath = "client/config/crd/groupsnapshot.storage.k8s.io_volumegroupsnapshotcontents.yaml"
+	volumeGroupSnapshotCRDPath         = "client/config/crd/groupsnapshot.storage.k8s.io_volumegroupsnapshots.yaml"
 )

 // CheckSnapshotISReadyToUse checks snapshot is ready to use
@@ -143,6 +147,24 @@ func (k8sh *K8sHelper) snapshotCRD(action string) error {
 	if err != nil {
 		return err
 	}
+
+	vgsClassCRD := fmt.Sprintf("%s/%s/%s", repoURL, snapshotterVersion, volumeGroupSnapshotClassCRDPath)
+	_, err = k8sh.Kubectl(args(vgsClassCRD)...)
+	if err != nil {
+		return err
+	}
+
+	vgsContentsCRD := fmt.Sprintf("%s/%s/%s", repoURL, snapshotterVersion, volumeGroupSnapshotContentsCRDPath)
+	_, err = k8sh.Kubectl(args(vgsContentsCRD)...)
+	if err != nil {
+		return err
+	}
+
+	vgsCRD := fmt.Sprintf("%s/%s/%s", repoURL, snapshotterVersion, volumeGroupSnapshotCRDPath)
+	_, err = k8sh.Kubectl(args(vgsCRD)...)
+	if err != nil {
+		return err
+	}
+
 	return nil
 }

From 2611e924a2ad474fe9a54761729eefb30c199f6a Mon Sep 17 00:00:00 2001
From: Madhu Rajanna
Date: Wed, 28 Feb 2024 14:43:47 +0100
Subject: [PATCH 09/15] csi: provide option to configure VGS

The volumegroupsnapshot feature will be enabled by default if the
required CRDs are present; if not, it is disabled. Users also have an
option to disable it if they don't require this feature.

Signed-off-by: Madhu Rajanna
---
 Documentation/Helm-Charts/operator-chart.md         |  1 +
 PendingReleaseNotes.md                              |  1 +
 deploy/charts/rook-ceph/templates/configmap.yaml    |  1 +
 deploy/charts/rook-ceph/values.yaml                 |  3 +++
 deploy/examples/operator-openshift.yaml             |  4 ++++
 deploy/examples/operator.yaml                       |  3 +++
 pkg/operator/ceph/csi/csi.go                        | 13 +++++++++++++
 pkg/operator/ceph/csi/spec.go                       |  2 ++
 .../cephfs/csi-cephfsplugin-provisioner-dep.yaml    |  3 +++
 .../template/rbd/csi-rbdplugin-provisioner-dep.yaml |  3 +++
 10 files changed, 34 insertions(+)

diff --git a/Documentation/Helm-Charts/operator-chart.md b/Documentation/Helm-Charts/operator-chart.md
index d0b7d43af8f5..88b3aff03e9b 100644
--- a/Documentation/Helm-Charts/operator-chart.md
+++ b/Documentation/Helm-Charts/operator-chart.md
@@ -91,6 +91,7 @@ The following table lists the configurable parameters of the rook-operator chart
 | `csi.enablePluginSelinuxHostMount` | Enable Host mount for `/etc/selinux` directory for Ceph CSI nodeplugins | `false` |
 | `csi.enableRBDSnapshotter` | Enable Snapshotter in RBD provisioner pod | `true` |
 | `csi.enableRbdDriver` | Enable Ceph CSI RBD driver | `true` |
+| `csi.enableVolumeGroupSnapshot` | Enable the volume group snapshot feature. This feature is enabled by default as long as the necessary CRDs are available in the cluster. | `true` |
 | `csi.forceCephFSKernelClient` | Enable Ceph Kernel clients on kernel < 4.17. If your kernel does not support quotas for CephFS you may want to disable this setting. However, this will cause an issue during upgrades with the FUSE client. See the [upgrade guide](https://rook.io/docs/rook/v1.2/ceph-upgrade.html) | `true` |
 | `csi.grpcTimeoutInSeconds` | Set GRPC timeout for csi containers (in seconds). It should be >= 120. If this value is not set or is invalid, it defaults to 150 | `150` |
 | `csi.imagePullPolicy` | Image pull policy | `"IfNotPresent"` |
diff --git a/PendingReleaseNotes.md b/PendingReleaseNotes.md
index b04354c0beab..bf11c3fb0e90 100644
--- a/PendingReleaseNotes.md
+++ b/PendingReleaseNotes.md
@@ -12,3 +12,4 @@ read affinity setting in cephCluster CR (CSIDriverOptions section) in [PR](https
 
 - Kubernetes versions **v1.24** through **v1.29** are supported.
 - Ceph daemon pods using the `default` service account now use a new `rook-ceph-default` service account.
+- Support for the VolumeGroupSnapshot feature has been added to the RBD and CephFS CSI drivers.
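[Editor's note: to illustrate what this option enables end to end, the following is a minimal sketch of how a volume group snapshot might be requested once the feature is active. It assumes the `groupsnapshot.storage.k8s.io/v1alpha1` API from external-snapshotter v7.0.1, a CephFS filesystem named `myfs`, and hypothetical class/snapshot names; a real class would also need the same snapshotter secret parameters that a `VolumeSnapshotClass` uses:]

```yaml
apiVersion: groupsnapshot.storage.k8s.io/v1alpha1
kind: VolumeGroupSnapshotClass
metadata:
  # hypothetical name
  name: csi-cephfs-groupsnapclass
# assumes the operator runs in the default rook-ceph namespace
driver: rook-ceph.cephfs.csi.ceph.com
deletionPolicy: Delete
parameters:
  clusterID: rook-ceph
  fsName: myfs
---
apiVersion: groupsnapshot.storage.k8s.io/v1alpha1
kind: VolumeGroupSnapshot
metadata:
  # hypothetical name
  name: cephfs-groupsnapshot
spec:
  volumeGroupSnapshotClassName: csi-cephfs-groupsnapclass
  source:
    selector:
      matchLabels:
        # all PVCs carrying this label are snapshotted together
        group: my-app
```

With `CSI_ENABLE_VOLUME_GROUP_SNAPSHOT: "false"`, the provisioner sidecar is started with `--enable-volume-group-snapshots=false` and such requests are not served.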
diff --git a/deploy/charts/rook-ceph/templates/configmap.yaml b/deploy/charts/rook-ceph/templates/configmap.yaml index d6af5cc798ad..4ce7b75dc278 100644 --- a/deploy/charts/rook-ceph/templates/configmap.yaml +++ b/deploy/charts/rook-ceph/templates/configmap.yaml @@ -25,6 +25,7 @@ data: CSI_ENABLE_OMAP_GENERATOR: {{ .Values.csi.enableOMAPGenerator | quote }} CSI_ENABLE_HOST_NETWORK: {{ .Values.csi.enableCSIHostNetwork | quote }} CSI_ENABLE_METADATA: {{ .Values.csi.enableMetadata | quote }} + CSI_ENABLE_VOLUME_GROUP_SNAPSHOT: {{ .Values.csi.enableVolumeGroupSnapshot | quote }} {{- if .Values.csi.csiDriverNamePrefix }} CSI_DRIVER_NAME_PREFIX: {{ .Values.csi.csiDriverNamePrefix | quote }} {{- end }} diff --git a/deploy/charts/rook-ceph/values.yaml b/deploy/charts/rook-ceph/values.yaml index 0781cf3b6fe1..de890690f8b0 100644 --- a/deploy/charts/rook-ceph/values.yaml +++ b/deploy/charts/rook-ceph/values.yaml @@ -96,6 +96,9 @@ csi: # -- Enable Ceph CSI PVC encryption support enableCSIEncryption: false + # -- Enable volume group snapshot feature. This feature is + # enabled by default as long as the necessary CRDs are available in the cluster. + enableVolumeGroupSnapshot: true # -- PriorityClassName to be set on csi driver plugin pods pluginPriorityClassName: system-node-critical diff --git a/deploy/examples/operator-openshift.yaml b/deploy/examples/operator-openshift.yaml index 3d6f48049b4a..306ce45a2c3b 100644 --- a/deploy/examples/operator-openshift.yaml +++ b/deploy/examples/operator-openshift.yaml @@ -552,6 +552,10 @@ data: # The GCSI RPC timeout value (in seconds). It should be >= 120. If this variable is not set or is an invalid value, it's default to 150. CSI_GRPC_TIMEOUT_SECONDS: "150" + # set to false to disable volume group snapshot feature. This feature is + # enabled by default as long as the necessary CRDs are available in the cluster. + CSI_ENABLE_VOLUME_GROUP_SNAPSHOT: "true" + # Enable topology based provisioning. CSI_ENABLE_TOPOLOGY: "false" # Domain labels define which node labels to use as domains diff --git a/deploy/examples/operator.yaml b/deploy/examples/operator.yaml index b5f16fee87aa..97d550328a9c 100644 --- a/deploy/examples/operator.yaml +++ b/deploy/examples/operator.yaml @@ -85,6 +85,9 @@ data: # set to false to disable deployment of snapshotter container in RBD provisioner pod. CSI_ENABLE_RBD_SNAPSHOTTER: "true" + # set to false to disable volume group snapshot feature. This feature is + # enabled by default as long as the necessary CRDs are available in the cluster. + CSI_ENABLE_VOLUME_GROUP_SNAPSHOT: "true" # Enable cephfs kernel driver instead of ceph-fuse. # If you disable the kernel client, your application may be disrupted during upgrade. # See the upgrade guide: https://rook.io/docs/rook/latest/ceph-upgrade.html diff --git a/pkg/operator/ceph/csi/csi.go b/pkg/operator/ceph/csi/csi.go index 26d34d4a052e..65520405d9f3 100644 --- a/pkg/operator/ceph/csi/csi.go +++ b/pkg/operator/ceph/csi/csi.go @@ -17,6 +17,7 @@ limitations under the License. 
 package csi
 
 import (
+	"context"
 	"strconv"
 	"strings"
 	"time"
@@ -24,6 +25,7 @@ import (
 	"github.com/rook/rook/pkg/operator/k8sutil"
 
 	"github.com/pkg/errors"
+	kerrors "k8s.io/apimachinery/pkg/api/errors"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"k8s.io/apimachinery/pkg/version"
 )
@@ -317,5 +319,16 @@ func (r *ReconcileCSI) setParams(ver *version.Info) error {
 
 	CSIParam.DriverNamePrefix = k8sutil.GetValue(r.opConfig.Parameters, "CSI_DRIVER_NAME_PREFIX", r.opConfig.OperatorNamespace)
 
+	_, err = r.context.ApiExtensionsClient.ApiextensionsV1().CustomResourceDefinitions().Get(context.TODO(), "volumegroupsnapshotclasses.groupsnapshot.storage.k8s.io", metav1.GetOptions{})
+	if err != nil && !kerrors.IsNotFound(err) {
+		return errors.Wrapf(err, "failed to get volumegroupsnapshotclasses.groupsnapshot.storage.k8s.io CRD")
+	}
+	CSIParam.VolumeGroupSnapshotSupported = (err == nil)
+
+	CSIParam.EnableVolumeGroupSnapshot = true
+	if strings.EqualFold(k8sutil.GetValue(r.opConfig.Parameters, "CSI_ENABLE_VOLUME_GROUP_SNAPSHOT", "true"), "false") {
+		CSIParam.EnableVolumeGroupSnapshot = false
+	}
+
 	return nil
 }
diff --git a/pkg/operator/ceph/csi/spec.go b/pkg/operator/ceph/csi/spec.go
index 24fb4224432c..a150c81f7794 100644
--- a/pkg/operator/ceph/csi/spec.go
+++ b/pkg/operator/ceph/csi/spec.go
@@ -82,6 +82,8 @@ type Param struct {
 	CephFSAttachRequired          bool
 	RBDAttachRequired             bool
 	NFSAttachRequired             bool
+	VolumeGroupSnapshotSupported  bool
+	EnableVolumeGroupSnapshot     bool
 	LogLevel                      uint8
 	SidecarLogLevel               uint8
 	CephFSLivenessMetricsPort     uint16
diff --git a/pkg/operator/ceph/csi/template/cephfs/csi-cephfsplugin-provisioner-dep.yaml b/pkg/operator/ceph/csi/template/cephfs/csi-cephfsplugin-provisioner-dep.yaml
index d280f11f9f08..65780bca6fad 100644
--- a/pkg/operator/ceph/csi/template/cephfs/csi-cephfsplugin-provisioner-dep.yaml
+++ b/pkg/operator/ceph/csi/template/cephfs/csi-cephfsplugin-provisioner-dep.yaml
@@ -55,6 +55,9 @@ spec:
             - "--leader-election-renew-deadline={{ .LeaderElectionRenewDeadline }}"
             - "--leader-election-retry-period={{ .LeaderElectionRetryPeriod }}"
             - "--extra-create-metadata=true"
+            {{ if .VolumeGroupSnapshotSupported }}
+            - "--enable-volume-group-snapshots={{ .EnableVolumeGroupSnapshot }}"
+            {{ end }}
           env:
             - name: ADDRESS
               value: unix:///csi/csi-provisioner.sock
diff --git a/pkg/operator/ceph/csi/template/rbd/csi-rbdplugin-provisioner-dep.yaml b/pkg/operator/ceph/csi/template/rbd/csi-rbdplugin-provisioner-dep.yaml
index a564063f139e..05dc8bfc443c 100644
--- a/pkg/operator/ceph/csi/template/rbd/csi-rbdplugin-provisioner-dep.yaml
+++ b/pkg/operator/ceph/csi/template/rbd/csi-rbdplugin-provisioner-dep.yaml
@@ -102,6 +102,9 @@ spec:
             - "--leader-election-renew-deadline={{ .LeaderElectionRenewDeadline }}"
             - "--leader-election-retry-period={{ .LeaderElectionRetryPeriod }}"
             - "--extra-create-metadata=true"
+            {{ if .VolumeGroupSnapshotSupported }}
+            - "--enable-volume-group-snapshots={{ .EnableVolumeGroupSnapshot }}"
+            {{ end }}
           env:
             - name: ADDRESS
               value: unix:///csi/csi-provisioner.sock

From b47bd35ddc7afc16288ce39cc2a947dff345f51c Mon Sep 17 00:00:00 2001
From: Madhu Rajanna
Date: Wed, 28 Feb 2024 15:17:09 +0100
Subject: [PATCH 10/15] csi: use different variable name for replicas

`r` is the variable name for the CSI reconciler, and the same name was
used for a local variable as well. Changing it avoids confusion and
variable shadowing.
Signed-off-by: Madhu Rajanna --- pkg/operator/ceph/csi/csi.go | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/pkg/operator/ceph/csi/csi.go b/pkg/operator/ceph/csi/csi.go index 65520405d9f3..36690285e0bf 100644 --- a/pkg/operator/ceph/csi/csi.go +++ b/pkg/operator/ceph/csi/csi.go @@ -273,12 +273,12 @@ func (r *ReconcileCSI) setParams(ver *version.Info) error { if len(nodes.Items) == 1 { CSIParam.ProvisionerReplicas = 1 } else { - replicas := k8sutil.GetValue(r.opConfig.Parameters, "CSI_PROVISIONER_REPLICAS", "2") - r, err := strconv.ParseInt(replicas, 10, 32) + replicaStr := k8sutil.GetValue(r.opConfig.Parameters, "CSI_PROVISIONER_REPLICAS", "2") + replicas, err := strconv.ParseInt(replicaStr, 10, 32) if err != nil { logger.Errorf("failed to parse CSI_PROVISIONER_REPLICAS. Defaulting to %d. %v", defaultProvisionerReplicas, err) } else { - CSIParam.ProvisionerReplicas = int32(r) + CSIParam.ProvisionerReplicas = int32(replicas) } } } else { From 43fa57fa57d4e8cc2dbbbf26b0bba56e3e9095c7 Mon Sep 17 00:00:00 2001 From: Madhu Rajanna Date: Wed, 28 Feb 2024 15:09:33 +0100 Subject: [PATCH 11/15] csi: update sidecars to latest release updating all the csi sidecars to the latest release. Signed-off-by: Madhu Rajanna --- Documentation/Helm-Charts/operator-chart.md | 10 +++++----- .../Storage-Configuration/Ceph-CSI/custom-images.md | 10 +++++----- deploy/charts/rook-ceph/values.yaml | 10 +++++----- deploy/examples/images.txt | 10 +++++----- deploy/examples/operator-openshift.yaml | 10 +++++----- deploy/examples/operator.yaml | 10 +++++----- pkg/operator/ceph/csi/spec.go | 10 +++++----- 7 files changed, 35 insertions(+), 35 deletions(-) diff --git a/Documentation/Helm-Charts/operator-chart.md b/Documentation/Helm-Charts/operator-chart.md index d0b7d43af8f5..09a0781bf4d4 100644 --- a/Documentation/Helm-Charts/operator-chart.md +++ b/Documentation/Helm-Charts/operator-chart.md @@ -53,7 +53,7 @@ The following table lists the configurable parameters of the rook-operator chart | `containerSecurityContext` | Set the container security context for the operator | `{"capabilities":{"drop":["ALL"]},"runAsGroup":2016,"runAsNonRoot":true,"runAsUser":2016}` | | `crds.enabled` | Whether the helm chart should create and update the CRDs. If false, the CRDs must be managed independently with deploy/examples/crds.yaml. **WARNING** Only set during first deployment. If later disabled the cluster may be DESTROYED. If the CRDs are deleted in this case, see [the disaster recovery guide](https://rook.io/docs/rook/latest/Troubleshooting/disaster-recovery/#restoring-crds-after-deletion) to restore them. | `true` | | `csi.allowUnsupportedVersion` | Allow starting an unsupported ceph-csi image | `false` | -| `csi.attacher.image` | Kubernetes CSI Attacher image | `registry.k8s.io/sig-storage/csi-attacher:v4.4.2` | +| `csi.attacher.image` | Kubernetes CSI Attacher image | `registry.k8s.io/sig-storage/csi-attacher:v4.5.0` | | `csi.cephFSAttachRequired` | Whether to skip any attach operation altogether for CephFS PVCs. See more details [here](https://kubernetes-csi.github.io/docs/skip-attach.html#skip-attach-with-csi-driver-object). If cephFSAttachRequired is set to false it skips the volume attachments and makes the creation of pods using the CephFS PVC fast. **WARNING** It's highly discouraged to use this for CephFS RWO volumes. Refer to this [issue](https://github.com/kubernetes/kubernetes/issues/103305) for more details. 
| `true` | | `csi.cephFSFSGroupPolicy` | Policy for modifying a volume's ownership or permissions when the CephFS PVC is being mounted. supported values are documented at https://kubernetes-csi.github.io/docs/support-fsgroup.html | `"File"` | | `csi.cephFSKernelMountOptions` | Set CephFS Kernel mount options to use https://docs.ceph.com/en/latest/man/8/mount.ceph/#options. Set to "ms_mode=secure" when connections.encrypted is enabled in CephCluster CR | `nil` | @@ -104,7 +104,7 @@ The following table lists the configurable parameters of the rook-operator chart | `csi.pluginNodeAffinity` | The node labels for affinity of the CephCSI RBD plugin DaemonSet [^1] | `nil` | | `csi.pluginPriorityClassName` | PriorityClassName to be set on csi driver plugin pods | `"system-node-critical"` | | `csi.pluginTolerations` | Array of tolerations in YAML format which will be added to CephCSI plugin DaemonSet | `nil` | -| `csi.provisioner.image` | Kubernetes CSI provisioner image | `registry.k8s.io/sig-storage/csi-provisioner:v3.6.3` | +| `csi.provisioner.image` | Kubernetes CSI provisioner image | `registry.k8s.io/sig-storage/csi-provisioner:v4.0.0` | | `csi.provisionerNodeAffinity` | The node labels for affinity of the CSI provisioner deployment [^1] | `nil` | | `csi.provisionerPriorityClassName` | PriorityClassName to be set on csi driver provisioner pods | `"system-cluster-critical"` | | `csi.provisionerReplicas` | Set replicas for csi provisioner deployment | `2` | @@ -115,14 +115,14 @@ The following table lists the configurable parameters of the rook-operator chart | `csi.rbdPluginUpdateStrategy` | CSI RBD plugin daemonset update strategy, supported values are OnDelete and RollingUpdate | `RollingUpdate` | | `csi.rbdPluginUpdateStrategyMaxUnavailable` | A maxUnavailable parameter of CSI RBD plugin daemonset update strategy. | `1` | | `csi.rbdPodLabels` | Labels to add to the CSI RBD Deployments and DaemonSets Pods | `nil` | -| `csi.registrar.image` | Kubernetes CSI registrar image | `registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.9.1` | -| `csi.resizer.image` | Kubernetes CSI resizer image | `registry.k8s.io/sig-storage/csi-resizer:v1.9.2` | +| `csi.registrar.image` | Kubernetes CSI registrar image | `registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0` | +| `csi.resizer.image` | Kubernetes CSI resizer image | `registry.k8s.io/sig-storage/csi-resizer:v1.10.0` | | `csi.serviceMonitor.enabled` | Enable ServiceMonitor for Ceph CSI drivers | `false` | | `csi.serviceMonitor.interval` | Service monitor scrape interval | `"5s"` | | `csi.serviceMonitor.labels` | ServiceMonitor additional labels | `{}` | | `csi.serviceMonitor.namespace` | Use a different namespace for the ServiceMonitor | `nil` | | `csi.sidecarLogLevel` | Set logging level for Kubernetes-csi sidecar containers. Supported values from 0 to 5. 0 for general useful logs (the default), 5 for trace level verbosity. 
| `0` | -| `csi.snapshotter.image` | Kubernetes CSI snapshotter image | `registry.k8s.io/sig-storage/csi-snapshotter:v6.3.2` | +| `csi.snapshotter.image` | Kubernetes CSI snapshotter image | `registry.k8s.io/sig-storage/csi-snapshotter:v7.0.1` | | `csi.topology.domainLabels` | domainLabels define which node labels to use as domains for CSI nodeplugins to advertise their domains | `nil` | | `csi.topology.enabled` | Enable topology based provisioning | `false` | | `currentNamespaceOnly` | Whether the operator should watch cluster CRD in its own namespace or not | `false` | diff --git a/Documentation/Storage-Configuration/Ceph-CSI/custom-images.md b/Documentation/Storage-Configuration/Ceph-CSI/custom-images.md index 703f85b453dd..b63ddb0732bd 100644 --- a/Documentation/Storage-Configuration/Ceph-CSI/custom-images.md +++ b/Documentation/Storage-Configuration/Ceph-CSI/custom-images.md @@ -19,11 +19,11 @@ The default upstream images are included below, which you can change to your des ```yaml ROOK_CSI_CEPH_IMAGE: "quay.io/cephcsi/cephcsi:v3.10.2" -ROOK_CSI_REGISTRAR_IMAGE: "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.9.1" -ROOK_CSI_PROVISIONER_IMAGE: "registry.k8s.io/sig-storage/csi-provisioner:v3.6.3" -ROOK_CSI_ATTACHER_IMAGE: "registry.k8s.io/sig-storage/csi-attacher:v4.4.2" -ROOK_CSI_RESIZER_IMAGE: "registry.k8s.io/sig-storage/csi-resizer:v1.9.2" -ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.k8s.io/sig-storage/csi-snapshotter:v6.3.2" +ROOK_CSI_REGISTRAR_IMAGE: "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0" +ROOK_CSI_PROVISIONER_IMAGE: "registry.k8s.io/sig-storage/csi-provisioner:v4.0.0" +ROOK_CSI_ATTACHER_IMAGE: "registry.k8s.io/sig-storage/csi-attacher:v4.5.0" +ROOK_CSI_RESIZER_IMAGE: "registry.k8s.io/sig-storage/csi-resizer:v1.10.0" +ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.k8s.io/sig-storage/csi-snapshotter:v7.0.1" ROOK_CSIADDONS_IMAGE: "quay.io/csiaddons/k8s-sidecar:v0.8.0" ``` diff --git a/deploy/charts/rook-ceph/values.yaml b/deploy/charts/rook-ceph/values.yaml index 0781cf3b6fe1..98da16c1ad54 100644 --- a/deploy/charts/rook-ceph/values.yaml +++ b/deploy/charts/rook-ceph/values.yaml @@ -474,27 +474,27 @@ csi: registrar: # -- Kubernetes CSI registrar image - # @default -- `registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.9.1` + # @default -- `registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0` image: provisioner: # -- Kubernetes CSI provisioner image - # @default -- `registry.k8s.io/sig-storage/csi-provisioner:v3.6.3` + # @default -- `registry.k8s.io/sig-storage/csi-provisioner:v4.0.0` image: snapshotter: # -- Kubernetes CSI snapshotter image - # @default -- `registry.k8s.io/sig-storage/csi-snapshotter:v6.3.2` + # @default -- `registry.k8s.io/sig-storage/csi-snapshotter:v7.0.1` image: attacher: # -- Kubernetes CSI Attacher image - # @default -- `registry.k8s.io/sig-storage/csi-attacher:v4.4.2` + # @default -- `registry.k8s.io/sig-storage/csi-attacher:v4.5.0` image: resizer: # -- Kubernetes CSI resizer image - # @default -- `registry.k8s.io/sig-storage/csi-resizer:v1.9.2` + # @default -- `registry.k8s.io/sig-storage/csi-resizer:v1.10.0` image: # -- Image pull policy diff --git a/deploy/examples/images.txt b/deploy/examples/images.txt index 741f75b738c1..03353b01dcb8 100644 --- a/deploy/examples/images.txt +++ b/deploy/examples/images.txt @@ -3,9 +3,9 @@ quay.io/ceph/cosi:v0.1.1 quay.io/cephcsi/cephcsi:v3.10.2 quay.io/csiaddons/k8s-sidecar:v0.8.0 - registry.k8s.io/sig-storage/csi-attacher:v4.4.2 - 
registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.9.1 - registry.k8s.io/sig-storage/csi-provisioner:v3.6.3 - registry.k8s.io/sig-storage/csi-resizer:v1.9.2 - registry.k8s.io/sig-storage/csi-snapshotter:v6.3.2 + registry.k8s.io/sig-storage/csi-attacher:v4.5.0 + registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0 + registry.k8s.io/sig-storage/csi-provisioner:v4.0.0 + registry.k8s.io/sig-storage/csi-resizer:v1.10.0 + registry.k8s.io/sig-storage/csi-snapshotter:v7.0.1 rook/ceph:master diff --git a/deploy/examples/operator-openshift.yaml b/deploy/examples/operator-openshift.yaml index 3d6f48049b4a..37f0c8f3a1c3 100644 --- a/deploy/examples/operator-openshift.yaml +++ b/deploy/examples/operator-openshift.yaml @@ -191,11 +191,11 @@ data: # of the CSI driver to something other than what is officially supported, change # these images to the desired release of the CSI driver. # ROOK_CSI_CEPH_IMAGE: "quay.io/cephcsi/cephcsi:v3.10.2" - # ROOK_CSI_REGISTRAR_IMAGE: "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.9.1" - # ROOK_CSI_RESIZER_IMAGE: "registry.k8s.io/sig-storage/csi-resizer:v1.9.2" - # ROOK_CSI_PROVISIONER_IMAGE: "registry.k8s.io/sig-storage/csi-provisioner:v3.6.3" - # ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.k8s.io/sig-storage/csi-snapshotter:v6.3.2" - # ROOK_CSI_ATTACHER_IMAGE: "registry.k8s.io/sig-storage/csi-attacher:v4.4.2" + # ROOK_CSI_REGISTRAR_IMAGE: "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0" + # ROOK_CSI_RESIZER_IMAGE: "registry.k8s.io/sig-storage/csi-resizer:v1.10.0" + # ROOK_CSI_PROVISIONER_IMAGE: "registry.k8s.io/sig-storage/csi-provisioner:v4.0.0" + # ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.k8s.io/sig-storage/csi-snapshotter:v7.0.1" + # ROOK_CSI_ATTACHER_IMAGE: "registry.k8s.io/sig-storage/csi-attacher:v4.5.0" # (Optional) set user created priorityclassName for csi plugin pods. CSI_PLUGIN_PRIORITY_CLASSNAME: "system-node-critical" diff --git a/deploy/examples/operator.yaml b/deploy/examples/operator.yaml index b5f16fee87aa..e50bbda866df 100644 --- a/deploy/examples/operator.yaml +++ b/deploy/examples/operator.yaml @@ -113,11 +113,11 @@ data: # of the CSI driver to something other than what is officially supported, change # these images to the desired release of the CSI driver. # ROOK_CSI_CEPH_IMAGE: "quay.io/cephcsi/cephcsi:v3.10.2" - # ROOK_CSI_REGISTRAR_IMAGE: "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.9.1" - # ROOK_CSI_RESIZER_IMAGE: "registry.k8s.io/sig-storage/csi-resizer:v1.9.2" - # ROOK_CSI_PROVISIONER_IMAGE: "registry.k8s.io/sig-storage/csi-provisioner:v3.6.3" - # ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.k8s.io/sig-storage/csi-snapshotter:v6.3.2" - # ROOK_CSI_ATTACHER_IMAGE: "registry.k8s.io/sig-storage/csi-attacher:v4.4.2" + # ROOK_CSI_REGISTRAR_IMAGE: "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0" + # ROOK_CSI_RESIZER_IMAGE: "registry.k8s.io/sig-storage/csi-resizer:v1.10.0" + # ROOK_CSI_PROVISIONER_IMAGE: "registry.k8s.io/sig-storage/csi-provisioner:v4.0.0" + # ROOK_CSI_SNAPSHOTTER_IMAGE: "registry.k8s.io/sig-storage/csi-snapshotter:v7.0.1" + # ROOK_CSI_ATTACHER_IMAGE: "registry.k8s.io/sig-storage/csi-attacher:v4.5.0" # To indicate the image pull policy to be applied to all the containers in the csi driver pods. 
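# Supported values are the standard Kubernetes image pull policies: Always, Never, or IfNotPresent.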
# ROOK_CSI_IMAGE_PULL_POLICY: "IfNotPresent" diff --git a/pkg/operator/ceph/csi/spec.go b/pkg/operator/ceph/csi/spec.go index 24fb4224432c..b72530601b65 100644 --- a/pkg/operator/ceph/csi/spec.go +++ b/pkg/operator/ceph/csi/spec.go @@ -137,11 +137,11 @@ var ( var ( // image names DefaultCSIPluginImage = "quay.io/cephcsi/cephcsi:v3.10.2" - DefaultRegistrarImage = "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.9.1" - DefaultProvisionerImage = "registry.k8s.io/sig-storage/csi-provisioner:v3.6.3" - DefaultAttacherImage = "registry.k8s.io/sig-storage/csi-attacher:v4.4.2" - DefaultSnapshotterImage = "registry.k8s.io/sig-storage/csi-snapshotter:v6.3.2" - DefaultResizerImage = "registry.k8s.io/sig-storage/csi-resizer:v1.9.2" + DefaultRegistrarImage = "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.10.0" + DefaultProvisionerImage = "registry.k8s.io/sig-storage/csi-provisioner:v4.0.0" + DefaultAttacherImage = "registry.k8s.io/sig-storage/csi-attacher:v4.5.0" + DefaultSnapshotterImage = "registry.k8s.io/sig-storage/csi-snapshotter:v7.0.1" + DefaultResizerImage = "registry.k8s.io/sig-storage/csi-resizer:v1.10.0" DefaultCSIAddonsImage = "quay.io/csiaddons/k8s-sidecar:v0.8.0" // image pull policy From 330f604cca1384b763bce629626b75fce6f73bd4 Mon Sep 17 00:00:00 2001 From: Rakshith R Date: Thu, 29 Feb 2024 17:45:58 +0530 Subject: [PATCH 12/15] csi: update CSIDriverOption params during saving cluster config During cluster creation, the csi config map was first filled with mon IPs but without CSIDriverOptions. This commit makes sure the CSIDriverOptions are added at the beginning, when the entry is first created. Signed-off-by: Rakshith R --- pkg/operator/ceph/csi/cluster_config.go | 32 ++++++++++++------ pkg/operator/ceph/csi/cluster_config_test.go | 34 ++++++++++++++++++++ 2 files changed, 56 insertions(+), 10 deletions(-) diff --git a/pkg/operator/ceph/csi/cluster_config.go b/pkg/operator/ceph/csi/cluster_config.go index 7775b0e9c8e7..6b9a9ee59d2d 100644 --- a/pkg/operator/ceph/csi/cluster_config.go +++ b/pkg/operator/ceph/csi/cluster_config.go @@ -170,6 +170,9 @@ func updateCsiClusterConfig(curr, clusterKey string, newCsiClusterConfigEntry *C // update default clusterID's entry if clusterKey == centry.Namespace { centry.Monitors = newCsiClusterConfigEntry.Monitors + centry.ReadAffinity = newCsiClusterConfigEntry.ReadAffinity + centry.CephFS.KernelMountOptions = newCsiClusterConfigEntry.CephFS.KernelMountOptions + centry.CephFS.FuseMountOptions = newCsiClusterConfigEntry.CephFS.FuseMountOptions cc[i] = centry } } @@ -183,12 +186,19 @@ func updateCsiClusterConfig(curr, clusterKey string, newCsiClusterConfigEntry *C break } centry.Monitors = newCsiClusterConfigEntry.Monitors + // update subvolumegroup and cephfs netNamespaceFilePath only when either is specified + // while always updating kernel and fuse mount options. if newCsiClusterConfigEntry.CephFS.SubvolumeGroup != "" || newCsiClusterConfigEntry.CephFS.NetNamespaceFilePath != "" { centry.CephFS = newCsiClusterConfigEntry.CephFS + } else { + centry.CephFS.KernelMountOptions = newCsiClusterConfigEntry.CephFS.KernelMountOptions + centry.CephFS.FuseMountOptions = newCsiClusterConfigEntry.CephFS.FuseMountOptions } + // update nfs netNamespaceFilePath only when specified. if newCsiClusterConfigEntry.NFS.NetNamespaceFilePath != "" { centry.NFS = newCsiClusterConfigEntry.NFS } + // update radosNamespace and rbd netNamespaceFilePath only when either is specified.
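+ // (RadosNamespace identifies an RBD rados namespace within the cluster; netNamespaceFilePath points the CSI driver at a non-default network namespace, for example when multus is in use.)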
if newCsiClusterConfigEntry.RBD.RadosNamespace != "" || newCsiClusterConfigEntry.RBD.NetNamespaceFilePath != "" { centry.RBD = newCsiClusterConfigEntry.RBD } @@ -207,16 +217,9 @@ func updateCsiClusterConfig(curr, clusterKey string, newCsiClusterConfigEntry *C centry.ClusterID = clusterKey centry.Namespace = newCsiClusterConfigEntry.Namespace centry.Monitors = newCsiClusterConfigEntry.Monitors - if newCsiClusterConfigEntry.RBD.RadosNamespace != "" || newCsiClusterConfigEntry.RBD.NetNamespaceFilePath != "" { - centry.RBD = newCsiClusterConfigEntry.RBD - } - // Add a condition not to fill with empty values - if newCsiClusterConfigEntry.CephFS.SubvolumeGroup != "" || newCsiClusterConfigEntry.CephFS.NetNamespaceFilePath != "" { - centry.CephFS = newCsiClusterConfigEntry.CephFS - } - if newCsiClusterConfigEntry.NFS.NetNamespaceFilePath != "" { - centry.NFS = newCsiClusterConfigEntry.NFS - } + centry.RBD = newCsiClusterConfigEntry.RBD + centry.CephFS = newCsiClusterConfigEntry.CephFS + centry.NFS = newCsiClusterConfigEntry.NFS if len(newCsiClusterConfigEntry.ReadAffinity.CrushLocationLabels) != 0 { centry.ReadAffinity = newCsiClusterConfigEntry.ReadAffinity } @@ -273,6 +276,15 @@ func SaveClusterConfig(clientset kubernetes.Interface, clusterNamespace string, } logger.Debugf("using %q for csi configmap namespace", csiNamespace) + if newCsiClusterConfigEntry != nil { + // set CSIDriverOptions + newCsiClusterConfigEntry.ReadAffinity.Enabled = clusterInfo.CSIDriverSpec.ReadAffinity.Enabled + newCsiClusterConfigEntry.ReadAffinity.CrushLocationLabels = clusterInfo.CSIDriverSpec.ReadAffinity.CrushLocationLabels + + newCsiClusterConfigEntry.CephFS.KernelMountOptions = clusterInfo.CSIDriverSpec.CephFS.KernelMountOptions + newCsiClusterConfigEntry.CephFS.FuseMountOptions = clusterInfo.CSIDriverSpec.CephFS.FuseMountOptions + } + configMutex.Lock() defer configMutex.Unlock() diff --git a/pkg/operator/ceph/csi/cluster_config_test.go b/pkg/operator/ceph/csi/cluster_config_test.go index 9a87c39fc91d..3698e0fb1b0d 100644 --- a/pkg/operator/ceph/csi/cluster_config_test.go +++ b/pkg/operator/ceph/csi/cluster_config_test.go @@ -74,6 +74,16 @@ func TestUpdateCsiClusterConfig(t *testing.T) { }, }, } + csiClusterConfigEntryMountOptions := CSIClusterConfigEntry{ + Namespace: "rook-ceph-1", + ClusterInfo: cephcsi.ClusterInfo{ + Monitors: []string{"1.2.3.4:5000"}, + CephFS: cephcsi.CephFS{ + KernelMountOptions: "ms_mode=crc", + FuseMountOptions: "debug", + }, + }, + } csiClusterConfigEntry2 := CSIClusterConfigEntry{ Namespace: "rook-ceph-2", ClusterInfo: cephcsi.ClusterInfo{ @@ -123,6 +133,17 @@ func TestUpdateCsiClusterConfig(t *testing.T) { assert.Equal(t, 2, len(cc[0].Monitors)) }) + t.Run("add mount options to the current cluster", func(t *testing.T) { + configWithMountOptions, err := updateCsiClusterConfig(s, "rook-ceph-1", &csiClusterConfigEntryMountOptions) + assert.NoError(t, err) + cc, err := parseCsiClusterConfig(configWithMountOptions) + assert.NoError(t, err) + assert.Equal(t, 1, len(cc)) + assert.Equal(t, "rook-ceph-1", cc[0].ClusterID) + assert.Equal(t, csiClusterConfigEntryMountOptions.CephFS.KernelMountOptions, cc[0].CephFS.KernelMountOptions) + assert.Equal(t, csiClusterConfigEntryMountOptions.CephFS.FuseMountOptions, cc[0].CephFS.FuseMountOptions) + }) + t.Run("add a 2nd cluster with 3 mons", func(t *testing.T) { s, err = updateCsiClusterConfig(s, "beta", &csiClusterConfigEntry2) assert.NoError(t, err) @@ -178,6 +199,19 @@ func TestUpdateCsiClusterConfig(t *testing.T) { assert.Equal(t, "my-group", 
cc[2].CephFS.SubvolumeGroup) }) + t.Run("update mount options in presence of subvolumegroup", func(t *testing.T) { + sMntOptionUpdate, err := updateCsiClusterConfig(s, "baba", &csiClusterConfigEntryMountOptions) + assert.NoError(t, err) + cc, err := parseCsiClusterConfig(sMntOptionUpdate) + assert.NoError(t, err) + assert.Equal(t, 3, len(cc)) + assert.Equal(t, "baba", cc[2].ClusterID) + assert.Equal(t, "my-group", cc[2].CephFS.SubvolumeGroup) + assert.Equal(t, csiClusterConfigEntryMountOptions.CephFS.KernelMountOptions, cc[2].CephFS.KernelMountOptions) + assert.Equal(t, csiClusterConfigEntryMountOptions.CephFS.FuseMountOptions, cc[2].CephFS.FuseMountOptions) + + }) + t.Run("add a 4th mon to the 3rd cluster and subvolumegroup is preserved", func(t *testing.T) { csiClusterConfigEntry3.Monitors = append(csiClusterConfigEntry3.Monitors, "10.11.12.13:5000") s, err = updateCsiClusterConfig(s, "baba", &csiClusterConfigEntry3) From 117bc76f20c6e2a7610bf57572bca367a81639b6 Mon Sep 17 00:00:00 2001 From: parth-gr Date: Mon, 4 Mar 2024 17:08:41 +0530 Subject: [PATCH 13/15] external: enable the use of only v2 mon port Currently the script requires both the v2 and v1 ports to be present in order to enable the v2 port, but that is not a necessary condition. Remove the check and enable the v2 port when only v2 is present, so that a v2-only setup can be configured successfully. part-of: https://github.com/rook/rook/issues/13827 Signed-off-by: parth-gr --- deploy/examples/create-external-cluster-resources.py | 8 -------- 1 file changed, 8 deletions(-) diff --git a/deploy/examples/create-external-cluster-resources.py b/deploy/examples/create-external-cluster-resources.py index 5ffef28d3183..61039c9eb1bd 100644 --- a/deploy/examples/create-external-cluster-resources.py +++ b/deploy/examples/create-external-cluster-resources.py @@ -699,14 +699,6 @@ def get_ceph_external_mon_data(self): q_leader_details = q_leader_matching_list[0] # get the address vector of the quorum-leader q_leader_addrvec = q_leader_details.get("public_addrs", {}).get("addrvec", []) - # if the quorum-leader has only one address in the address-vector - # and it is of type 'v2' (ie; with :3300), - # raise an exception to make user aware that - # they have to enable 'v1' (ie; with :6789) type as well - if len(q_leader_addrvec) == 1 and q_leader_addrvec[0]["type"] == "v2": - raise ExecutionFailureException( - "Only 'v2' address type is enabled, user should also enable 'v1' type as well" - ) ip_addr = str(q_leader_details["public_addr"].split("/")[0]) if self._arg_parser.v2_port_enable: From 3cdb79a73da48838ec6391f39069b3c866a9d056 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon, 4 Mar 2024 12:22:14 +0000 Subject: [PATCH 14/15] build(deps): bump azure/setup-helm from 3 to 4 Bumps [azure/setup-helm](https://github.com/azure/setup-helm) from 3 to 4. - [Release notes](https://github.com/azure/setup-helm/releases) - [Changelog](https://github.com/Azure/setup-helm/blob/main/CHANGELOG.md) - [Commits](https://github.com/azure/setup-helm/compare/v3...v4) --- updated-dependencies: - dependency-name: azure/setup-helm dependency-type: direct:production update-type: version-update:semver-major ...
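For reference, a minimal sketch of a job step using the bumped action; the version pin matches the workflows below, and the verification step is a hypothetical addition for illustration, not part of this patch:

```yaml
# Sketch only: mirrors the pinned Helm version used by the workflows below;
# the "Verify Helm version" step is hypothetical and not part of this patch.
- name: Set up Helm
  uses: azure/setup-helm@v4
  with:
    version: v3.6.2

- name: Verify Helm version
  run: helm version --short
```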
Signed-off-by: dependabot[bot] --- .github/workflows/build.yml | 2 +- .github/workflows/helm-lint.yaml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml index fe4301cef870..f10d02689f7b 100644 --- a/.github/workflows/build.yml +++ b/.github/workflows/build.yml @@ -30,7 +30,7 @@ jobs: go-version: "1.21" - name: Set up Helm - uses: azure/setup-helm@v3 + uses: azure/setup-helm@v4 with: version: v3.6.2 diff --git a/.github/workflows/helm-lint.yaml b/.github/workflows/helm-lint.yaml index b0dc6b644485..b07c06b219bb 100644 --- a/.github/workflows/helm-lint.yaml +++ b/.github/workflows/helm-lint.yaml @@ -31,7 +31,7 @@ jobs: fetch-depth: 0 - name: Set up Helm - uses: azure/setup-helm@v3 + uses: azure/setup-helm@v4 with: version: v3.6.2 From 072884f4512e5be28c6c929e46faa4a8e1f4c7ae Mon Sep 17 00:00:00 2001 From: Blaine Gardner Date: Mon, 4 Mar 2024 11:08:48 -0700 Subject: [PATCH 15/15] build: use 'baseos' as repo for iproute install The rook/ceph Dockerfile uses dnf to ensure iproute (containing the 'ip' CLI tool) is installed in the Rook image for Multus usage. This comes from the 'baseos' repo, but if any other repos are temporarily unavailable, it can cause the container build to fail. Use the '--repo baseos' flag to help prevent these kinds of failures. Additionally, this will speed up the build slightly since it does not attempt to load any unnecessary repos. This change may make the container build slightly fragile in the future if CentOS changes the name of its baseos repo, or if the Ceph image switches to a non-CentOS base image. Signed-off-by: Blaine Gardner --- images/ceph/Dockerfile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/images/ceph/Dockerfile b/images/ceph/Dockerfile index 26f6dbb052be..268926856e95 100644 --- a/images/ceph/Dockerfile +++ b/images/ceph/Dockerfile @@ -20,7 +20,7 @@ ARG S5CMD_VERSION ARG S5CMD_ARCH # install 'ip' tool for Multus -RUN dnf install -y --setopt=install_weak_deps=False iproute && dnf clean all +RUN dnf install -y --repo baseos --setopt=install_weak_deps=False iproute && dnf clean all # Install the s5cmd package to interact with s3 gateway RUN curl --fail -sSL -o /s5cmd.tar.gz https://github.com/peak/s5cmd/releases/download/v${S5CMD_VERSION}/s5cmd_${S5CMD_VERSION}_${S5CMD_ARCH}.tar.gz && \