This way the cluster would work consistently because there is a single ClusterVersion across the whole cluster.
167
168
169
+
It may not be necessary for some changes if the changes do not affect user-facing APIs or data consistency.
170
+
We would make version-based switching optional and best effort, at the discretion of the developers and reviewers.
171
+
172
+
However, we should make sure the feature is well tested for mixed-version scenarios in the robustness tests.
173
+
168
174
## Design Details
169
175
170
176
At a high level, a cluster feature gate would need:
@@ -175,6 +181,38 @@ On high level, a cluster feature gate would need:
175
181
1. [client APIs](#client-apis-changes) to query if a feature is enabled for the whole cluster.
176
182
1. a way to [remove a feature gate](#feature-removal) when it is no longer useful or has graduated.
177
183
184
+
### Set the Basis
185
+
186
+
Before we proceed to the design details, we need to think through several questions.
187
+
188
+
1. Do the cluster features need to be tied to the cluster version?
189
+
190
+
Imagine the following scenario:
191
+
* every server in the cluster is on 3.7, and has a new Alpha feature enabled, which would write a new field to the WAL.
192
+
* downgrade is enabled, so cluster version is downgraded to 3.6.
193
+
When the cluster version is downgraded to 3.6, the flags of each member are not changed. But we should still disable the Alpha feature, so that any new data written would be compatible after the downgrade.
194
+
195
+
So similar to how cluster version determines the capability, cluster features should also be tied to the cluster version.
196
+
197
+
In addition, when the cluster version is 3.N-1, a server can run with either a 3.N or a 3.N-1 binary version, with the same values of cluster feature gates in both cases.
198
+
The feature implementation for Alpha and Beta features might change between different binary versions.
199
+
See [Feature Implementation Change Risks](#feature-implementation-change-risks) about how we can mitigate the risks in this scenario.
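
The downgrade scenario above can be sketched as follows. This is only an illustration of tying a feature's effective value to the cluster version, not the proposed implementation: `Version`, `FeatureSpec`, `EnabledAt`, and the feature name `NewWALField` are hypothetical names introduced for this example.

```go
package main

import "fmt"

// Version is a hypothetical major.minor pair used only in this sketch.
type Version struct{ Major, Minor int }

// Less reports whether v is older than o.
func (v Version) Less(o Version) bool {
	if v.Major != o.Major {
		return v.Major < o.Major
	}
	return v.Minor < o.Minor
}

// FeatureSpec is a hypothetical record of the first version that knows a feature.
type FeatureSpec struct {
	Name  string
	Since Version
}

// EnabledAt reports whether a feature may be on at the given cluster version.
// The member's flag alone is not enough: the cluster version must also be
// at least the version that introduced the feature.
func EnabledAt(spec FeatureSpec, flagOn bool, cluster Version) bool {
	return flagOn && !cluster.Less(spec.Since)
}

func main() {
	newWALField := FeatureSpec{Name: "NewWALField", Since: Version{3, 7}}

	// Every member has the flag on and the cluster version is 3.7.
	fmt.Println(EnabledAt(newWALField, true, Version{3, 7}))

	// The cluster version is downgraded to 3.6; the flags are unchanged,
	// but the feature must be treated as off.
	fmt.Println(EnabledAt(newWALField, true, Version{3, 6}))
}
```

With this shape, the member flags stay untouched during a downgrade, and the effective value changes only because the cluster version did.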
200
+
201
+
2. Do we need a leader to set the final values of features?
202
+
203
+
Instead of relying on the leader to send a raft request to set the cluster feature values, individual members decide the feature values by reconciling the proposed values locally after receiving them from each member. This approach has the benefit of skipping a raft step, but has the risk of split brains if there is a need to change the reconciliation logic in a patch release to fix a bug.
204
+
205
+
3. What happens before the raft request that sets the cluster feature values is sent?
206
+
207
+
When a new cluster starts, the cluster starts to accept requests once a leader is elected. At this point, the cluster version might be `nil` or set to the `MinClusterVersion = "3.0.0"`. The leader would not have decided the values of cluster feature gates yet. What should be the values for the cluster features during that time?
208
+
* if we set the cluster features to `nil` or tie them to the `MinClusterVersion`, every cluster feature would be off at that moment. This is fine because a cluster feature should be considered off unless all the active members agree on it.
209
+
* if we set the cluster feature values according to the local settings (the values the server would propose), we run the risk that a feature is enabled in some members and disabled in others. Members with a feature disabled might not be able to process data written when the feature is enabled.
210
+
For data consistency reasons, it is probably better to disable all features at startup. We also need consistent default value handling for non-bool cluster params.
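
The "off unless all active members agree" rule can be sketched as below. This is a minimal illustration under assumptions, not the proposed reconciliation logic: `Reconcile`, the member IDs, and the feature names `FeatureX` and `FeatureY` are hypothetical, and only bool features are covered.

```go
package main

import "fmt"

// Reconcile returns the cluster-wide value of each feature.
// proposals maps member ID -> feature name -> proposed value. A member with
// no entry has not reported yet, which keeps every feature off.
func Reconcile(members []string, proposals map[string]map[string]bool, features []string) map[string]bool {
	result := make(map[string]bool, len(features))
	for _, f := range features {
		// A feature is on only if there is at least one member and
		// every member has reported and proposed it as enabled.
		enabled := len(members) > 0
		for _, m := range members {
			p, reported := proposals[m]
			if !reported || !p[f] {
				enabled = false
				break
			}
		}
		result[f] = enabled
	}
	return result
}

func main() {
	members := []string{"a", "b", "c"}
	features := []string{"FeatureX", "FeatureY"}

	// At startup only member "a" has reported: everything stays off.
	startup := map[string]map[string]bool{
		"a": {"FeatureX": true, "FeatureY": true},
	}
	fmt.Println(Reconcile(members, startup, features))

	// All members have reported; "c" disagrees on FeatureY.
	settled := map[string]map[string]bool{
		"a": {"FeatureX": true, "FeatureY": true},
		"b": {"FeatureX": true, "FeatureY": true},
		"c": {"FeatureX": true, "FeatureY": false},
	}
	fmt.Println(Reconcile(members, settled, features))
}
```

Treating a missing proposal the same as "disabled" means no member can observe a feature as on before the whole cluster has agreed, at the cost of features being briefly unavailable after startup.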
211
+
212
+
For simplicity of the design, we would only consider the following cluster configuration cases:
213
+
* when a new cluster starts, all cluster members have the same major.minor version, and have the same feature configurations.
214
+
* a cluster can be upgraded, downgraded, and updated in rolling sequence. For a limited time, the cluster can have mixed versions and mixed configurations. Eventually all cluster members will have the same major.minor version, and have the same feature configurations.
215
+
178
216
### Register New Feature Gates
179
217
180
218
A feature can be registered as a server-level feature or a cluster-level feature, but not both.
@@ -226,13 +264,6 @@ To guarantee consistent value of if a feature is enabled in the whole cluster, t
226
264
227
265
1. Each member applies the updates to their `ClusterParams`, and saves the results in the `cluster` bucket in the backend.
228
266
229
-
A few other alternatives we have evaluated:
230
-
1. Is it better to initialize the `ClusterParams` with nil or cluster version defaults when we have not received the updates of `proposed_cluster_params` from all members?
231
-
In either case, there would be a state change from the initial state to the final state. If we choose to use the version defaults, even though different members might have different default values of the cluster parameters, the change would still be smaller than if the initial state is nil, because defaults rarely change between patch versions, and most users would run with parameters close to the default values.
232
-
233
-
1. Should individual members decide the `ClusterParams` by reconciling `proposed_cluster_params` locally instead of relying on a leader to determine the final values and send a raft request to set the final `ClusterParams`?
234
-
If we allow individual members to decide the final `ClusterParams`, the logic to reconcile the `proposed_cluster_params` of all members to a common cluster setting, `UpdateClusterParamsIfNeeded`, has to be the same across all patch versions. On the other hand, if we use a single leader to make the final decision, we would have the flexibility to change the implementation of `UpdateClusterParamsIfNeeded` in patch versions without risking split brain.