From a1f4eb1cdbad46a2cd28b6a628e972c1fd9e8d84 Mon Sep 17 00:00:00 2001 From: carlory Date: Thu, 15 Jun 2023 03:05:35 +0800 Subject: [PATCH] update spec doc for groupcontroller service Co-authored-by: jdef <2348332+jdef@users.noreply.github.com> --- spec.md | 161 ++++++++++++++++++++++++++++++++++---------------------- 1 file changed, 97 insertions(+), 64 deletions(-) diff --git a/spec.md b/spec.md index f16e8a47..27bed630 100644 --- a/spec.md +++ b/spec.md @@ -80,85 +80,97 @@ A CO SHOULD be equipped to handle both centralized and headless plugins, as well Several of these possibilities are illustrated in the following figures. ``` - CO "Master" Host -+-------------------------------------------+ -| | -| +------------+ +------------+ | -| | CO | gRPC | Controller | | -| | +-----------> Plugin | | -| +------------+ +------------+ | -| | -+-------------------------------------------+ - - CO "Node" Host(s) -+-------------------------------------------+ -| | -| +------------+ +------------+ | -| | CO | gRPC | Node | | -| | +-----------> Plugin | | -| +------------+ +------------+ | -| | -+-------------------------------------------+ + CO "Master" Host ++-------------------------------------------------+ +| | +| +------------+ +------------------+ | +| | CO | gRPC | Controller | | +| | +--+--------> Plugin | | +| +------------+ | +------------------+ | +| | | +| | +------------------+ | +| | | Group Controller | | +| +--------> Plugin | | +| +------------------+ | +| | ++-------------------------------------------------+ + + CO "Node" Host(s) ++-------------------------------------------------+ +| | +| +------------+ +------------------+ | +| | CO | gRPC | Node | | +| | +-----------> Plugin | | +| +------------+ +------------------+ | +| | ++-------------------------------------------------+ Figure 1: The Plugin runs on all nodes in the cluster: a centralized -Controller Plugin is available on the CO master host and the Node -Plugin is available on all of the CO Nodes. 
+Controller Plugin and Group Controller Plugin (optional) are available +on the CO master host and the Node Plugin is available on all of the +CO Nodes. ``` ``` - CO "Node" Host(s) -+-------------------------------------------+ -| | -| +------------+ +------------+ | -| | CO | gRPC | Controller | | -| | +--+--------> Plugin | | -| +------------+ | +------------+ | -| | | -| | | -| | +------------+ | -| | | Node | | -| +--------> Plugin | | -| +------------+ | -| | -+-------------------------------------------+ + CO "Node" Host(s) ++-------------------------------------------------+ +| | +| +------------------+ | +| | Node | | +| +--------> Plugin | | +| | +------------------+ | +| | | +| +------------+ | +------------------+ | +| | CO | |gRPC | Controller | | +| | +-----------> Plugin | | +| +------------+ | +------------------+ | +| | | +| | +------------------+ | +| | | Group Controller | | +| +--------> Plugin | | +| +------------------+ | +| | ++-------------------------------------------------+ Figure 2: Headless Plugin deployment, only the CO Node hosts run -Plugins. Separate, split-component Plugins supply the Controller -Service and the Node Service respectively. +Plugins. Separate, split-component Plugins supply the Node Service, +the Controller Service and the Group Controller Service (optional) +respectively. 
``` ``` - CO "Node" Host(s) -+-------------------------------------------+ -| | -| +------------+ +------------+ | -| | CO | gRPC | Controller | | -| | +-----------> Node | | -| +------------+ | Plugin | | -| +------------+ | -| | -+-------------------------------------------+ + CO "Node" Host(s) ++-------------------------------------------------+ +| | +| +------------------+ | +| +------------+ | Node | | +| | CO | gRPC | Controller | | +| | +-----------> Group Controller | | +| +------------+ | Plugin | | +| +------------------+ | +| | ++-------------------------------------------------+ Figure 3: Headless Plugin deployment, only the CO Node hosts run -Plugins. A unified Plugin component supplies both the Controller -Service and Node Service. +Plugins. A unified Plugin component supplies both the Node Service, +the Controller Service and the Group Controller Service (optional). ``` ``` - CO "Node" Host(s) -+-------------------------------------------+ -| | -| +------------+ +------------+ | -| | CO | gRPC | Node | | -| | +-----------> Plugin | | -| +------------+ +------------+ | -| | -+-------------------------------------------+ + CO "Node" Host(s) ++-------------------------------------------------+ +| | +| +------------+ +------------------+ | +| | CO | gRPC | Node | | +| | +-----------> Plugin | | +| +------------+ +------------------+ | +| | ++-------------------------------------------------+ Figure 4: Headless Plugin deployment, only the CO Node hosts run Plugins. A Node-only Plugin component supplies only the Node Service. -Its GetPluginCapabilities RPC does not report the CONTROLLER_SERVICE -capability. +Its GetPluginCapabilities RPC does not report either the CONTROLLER_SERVICE +capability or the GROUP_CONTROLLER_SERVICE capability. ``` ### Volume Lifecycle @@ -268,7 +280,12 @@ Each SP MUST provide: * **Node Plugin**: A gRPC endpoint serving CSI RPCs that MUST be run on the Node whereupon an SP-provisioned volume will be published. 
 * **Controller Plugin**: A gRPC endpoint serving CSI RPCs that MAY be run anywhere.
-* In some circumstances a single gRPC endpoint MAY serve all CSI RPCs (see Figure 3 in [Architecture](#architecture)).
+
+Each SP MAY provide:
+
+- **Group Controller Plugin**: A gRPC endpoint serving CSI RPCs that MAY be run anywhere.
+
+In some circumstances a single gRPC endpoint MAY serve all CSI RPCs (see Figure 3 in [Architecture](#architecture)).
 
 ```protobuf
 syntax = "proto3";
@@ -322,10 +339,11 @@ extend google.protobuf.ServiceOptions {
 }
 ```
 
-There are three sets of RPCs:
+There are four sets of RPCs:
 
-* **Identity Service**: Both the Node Plugin and the Controller Plugin MUST implement this sets of RPCs.
+* **Identity Service**: Every Controller Plugin, Group Controller Plugin and Node Plugin MUST implement this sets of RPCs.
 * **Controller Service**: The Controller Plugin MUST implement this sets of RPCs.
+* **Group Controller Service**: The Group Controller Plugin MUST implement this sets of RPCs.
 * **Node Service**: The Node Plugin MUST implement this sets of RPCs.
 
 ```protobuf
@@ -3066,6 +3084,21 @@ The CO MUST implement the specified error recovery behavior when it encounters t
 | Snapshot list mismatch | 3 INVALID_ARGUMENT | Besides the general cases, this code SHOULD also be used to indicate when plugin supporting CREATE_DELETE_GET_VOLUME_GROUP_SNAPSHOT detects a mismatch in the `snapshot_ids`. | If a mismatch is detected in the `snapshot_ids`, caller SHOULD use different `snapshot_ids`. |
 | Volume group snapshot does not exist | 5 NOT_FOUND | Indicates that a volume group snapshot corresponding to the specified `group_snapshot_id` does not exist. | Caller MUST verify that the `group_snapshot_id` is correct and that the volume group snapshot is accessible and has not been deleted before retrying with exponential back off. |
 
+#### RPC Interactions
+
+##### `CreateVolumeGroupSnapshot`, `DeleteVolumeGroupSnapshot`, `GetVolumeGroupSnapshot`
+
+The plugin-generated `group_snapshot_id` is a REQUIRED field for both the `DeleteVolumeGroupSnapshot` RPC and the `GetVolumeGroupSnapshot` RPC, as opposed to the CO-generated snapshot `name` that is REQUIRED for the `CreateVolumeGroupSnapshot` RPC.
+
+A `CreateVolumeGroupSnapshot` operation SHOULD return with a `group_snapshot_id` when the group snapshot is cut successfully. If a `CreateVolumeGroupSnapshot` operation times out before the group snapshot is cut, leaving the CO without an ID with which to reference a group snapshot, and the CO also decides that it no longer needs/wants the group snapshot in question then the CO MAY choose one of the following paths:
+
+1. Retry the `CreateVolumeGroupSnapshot` RPC to possibly obtain a group snapshot ID that may be used to execute a `DeleteVolumeGroupSnapshot` RPC; upon success execute `DeleteVolumeGroupSnapshot`. If the `CreateVolumeGroupSnapshot`
+RPC returns a server-side gRPC error, it means that SP do clean up and make sure no snapshots are leaked.
+
+2. The CO takes no further action regarding the timed out RPC, a group snapshot is possibly leaked and the operator/user is expected to clean up. But this way isn't considered as a good practice.
+
+For plugins that support snapshot post processing such as uploading, a `GetVolumeGroupSnapshot` operation SHALL return current information of the group snapshot with the given `group_snapshot_id`. When processing is complete, the `ready_to_use` parameter of the group snapshot from `GetVolumeGroupSnapshot` SHALL become `true`.
+
 ## Protocol
 
 ### Connectivity
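
Review note (not part of the patch): the first recovery path in the added `RPC Interactions` text, retrying `CreateVolumeGroupSnapshot` by name to learn the `group_snapshot_id` and then calling `DeleteVolumeGroupSnapshot`, might look roughly like the sketch below on the CO side. Everything named here is an assumption for illustration: `groupSnapshotClient` is a hypothetical trimmed-down interface rather than the generated CSI client, `errTimeout` stands in for a client-side deadline expiry, and `fakePlugin` is a toy in-memory plugin.

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout is a hypothetical stand-in for a client-side timeout; a real CO
// would inspect the gRPC status of the failed call instead.
var errTimeout = errors.New("timeout")

// groupSnapshotClient is a hypothetical, simplified view of the two RPCs used
// here. It is NOT the generated CSI GroupController client.
type groupSnapshotClient interface {
	CreateVolumeGroupSnapshot(name string) (groupSnapshotID string, err error)
	DeleteVolumeGroupSnapshot(groupSnapshotID string) error
}

// abandonGroupSnapshot retries the create call with the same CO-generated name
// until it yields an ID, then deletes the group snapshot by that ID.
func abandonGroupSnapshot(c groupSnapshotClient, name string, maxAttempts int) error {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		id, err := c.CreateVolumeGroupSnapshot(name)
		switch {
		case err == nil:
			// An ID is now known, so the unwanted group snapshot can be deleted.
			return c.DeleteVolumeGroupSnapshot(id)
		case errors.Is(err, errTimeout):
			// Still no ID to reference; try again.
			continue
		default:
			// Server-side error: the patch text says the SP cleans up in this
			// case, so this sketch assumes there is nothing left to delete.
			return nil
		}
	}
	return fmt.Errorf("no group_snapshot_id for %q after %d attempts; a group snapshot may be leaked", name, maxAttempts)
}

// fakePlugin is a toy plugin that times out a fixed number of times and then
// succeeds, returning the same ID for the same name.
type fakePlugin struct {
	timeoutsLeft int
	snapshots    map[string]string // name -> group_snapshot_id
}

func (f *fakePlugin) CreateVolumeGroupSnapshot(name string) (string, error) {
	if f.timeoutsLeft > 0 {
		f.timeoutsLeft--
		return "", errTimeout
	}
	id := "gs-" + name
	f.snapshots[name] = id
	return id, nil
}

func (f *fakePlugin) DeleteVolumeGroupSnapshot(id string) error {
	for name, known := range f.snapshots {
		if known == id {
			delete(f.snapshots, name)
		}
	}
	return nil
}

func main() {
	p := &fakePlugin{timeoutsLeft: 2, snapshots: map[string]string{}}
	err := abandonGroupSnapshot(p, "nightly", 5)
	fmt.Println("error:", err, "snapshots remaining:", len(p.snapshots))
}
```

The sketch relies on the create call being safe to repeat with the same `name`; when the attempt budget runs out without an ID, the CO ends up on the second path from the patch text, where a group snapshot is possibly leaked and cleanup falls to the operator.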