feat: Add support for vertical scaling (storage scaling) [KO-448] [KO-459] #409
Conversation
force-pushed from a7d5344 to 276e920
force-pushed from ab1947c to c6552d9
	oldSts = nil
}

newSts, err := r.getSTS(newRack)
Is there any issue if we fetch the new STS before the old STS above (line 1958)? If we do so, the oldSts == nil logic would read in continuation with the oldSts code.
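For illustration, a minimal sketch of the ordering this comment proposes; the surrounding variables (oldRack, newRack) and the error handling are assumptions for the sketch, not the PR's actual code:

	// Fetch the new STS first (the reordering suggested above)...
	newSts, err := r.getSTS(newRack)
	if err != nil {
		return err
	}

	// ...so the old-STS retrieval and its nil handling read as one block.
	oldSts, err := r.getSTS(oldRack)
	if err != nil {
		if !errors.IsNotFound(err) {
			return err
		}
		// Old STS is absent; fall through with oldSts == nil.
		oldSts = nil
	}

	if oldSts == nil {
		// The oldSts == nil branch now follows the oldSts code directly.
	}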
api/v1/aerospikecluster_types.go
Outdated
// Revision is a version identifier for this rack's specification, used to trigger controlled migrations
// when rack configuration changes require new StatefulSets. Change this field when making changes
// that cannot be applied in-place, such as storage updates that require pod recreation.
// The revision is appended to the rack ID for Kubernetes resource naming (e.g., <cluster>-<rackID>-<revision>).
Suggested change:
-// The revision is appended to the rack ID for Kubernetes resource naming (e.g., <cluster>-<rackID>-<revision>).
+// The revision is appended to the rack ID for Kubernetes resource naming (e.g., <cluster-name>-<rackID>-<revision>).
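As an aside, a hedged sketch of how that naming scheme could be composed; the helper name and the empty-revision handling are assumptions for illustration only, not the operator's actual code:

package main

import "fmt"

// stsNameForRack is a hypothetical helper showing the naming scheme from the
// doc comment: <cluster-name>-<rackID>-<revision>, with the revision segment
// omitted when the rack has no revision set (existing racks stay unchanged).
func stsNameForRack(clusterName string, rackID int, revision string) string {
	if revision == "" {
		return fmt.Sprintf("%s-%d", clusterName, rackID)
	}
	return fmt.Sprintf("%s-%d-%s", clusterName, rackID, revision)
}

func main() {
	fmt.Println(stsNameForRack("aerocluster", 1, "v2")) // prints: aerocluster-1-v2
}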
 func (r *SingleClusterReconciler) waitForAllSTSToBeReady(ignorablePodNames sets.Set[string]) error {
-	r.Log.Info("Waiting for cluster to be ready")
+	r.Log.Info("Waiting for cluster STS to be ready")
Suggested change:
-	r.Log.Info("Waiting for cluster STS to be ready")
+	r.Log.Info("Waiting for all cluster STSs to be ready")
-	allRackIDs := sets.NewInt()
+	allRackIdentifier := sets.NewString()
Can we rename the allRackIdentifier variable to allRackIdentifiers (plural form)?
pkg/utils/pod.go
Outdated
prefix := clusterName + "-"

rackAndPodIndexPart := strings.TrimPrefix(podName, prefix)
// parts contain only the rack-id, rack-revision (optional) pod-index.
Suggested change:
-// parts contain only the rack-id, rack-revision (optional) pod-index.
+// parts contain only the rack-id, rack-revision (optional), and pod-index.
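A hedged sketch of the parsing this comment documents; the function name and return shape are assumptions for illustration, not the code in pkg/utils/pod.go:

package main

import (
	"fmt"
	"strings"
)

// parsePodName strips the cluster-name prefix first (the cluster name itself
// may contain '-'), then splits the remainder into the rack-id, the optional
// rack-revision, and the pod-index.
func parsePodName(podName, clusterName string) []string {
	prefix := clusterName + "-"
	rackAndPodIndexPart := strings.TrimPrefix(podName, prefix)
	// parts contain only the rack-id, rack-revision (optional), and pod-index.
	return strings.Split(rackAndPodIndexPart, "-")
}

func main() {
	fmt.Println(parsePodName("aerocluster-1-v2-0", "aerocluster")) // [1 v2 0]
	fmt.Println(parsePodName("aerocluster-1-0", "aerocluster"))    // [1 0]
}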
test/cluster/rack_revision_test.go
Outdated
)

It(
	"Should successfully migrate rack to new revision with storage update", func() {
Can we have a test that updates the storage on a rack with no revision initially, moving it to some revision? This would test vertical scaling on existing clusters.
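A rough Ginkgo fragment for the test this comment asks for, meant to sit inside the existing suite; the helpers (getCluster, updateCluster) and the Revision field path are hypothetical placeholders inferred from the doc comment, not necessarily the suite's actual utilities:

It(
	"Should migrate a rack with no initial revision to a revision on storage update", func() {
		// Fetch a cluster whose rack was deployed with no revision set.
		aeroCluster, err := getCluster(k8sClient, ctx, clusterNamespacedName)
		Expect(err).ToNot(HaveOccurred())

		// Apply a storage update together with a first-time revision;
		// this exercises vertical scaling on an existing cluster.
		aeroCluster.Spec.RackConfig.Racks[0].Revision = "v1" // field name assumed
		err = updateCluster(k8sClient, ctx, aeroCluster)
		Expect(err).ToNot(HaveOccurred())
	},
)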
This PR introduces the Rack Revision feature, which enables safe, cost-effective storage updates for Aerospike clusters by implementing a gradual pod-by-pod migration strategy instead of the old add-a-new-rack approach. The old approach required rack-aware clients to change their configuration whenever a storage update was rolled out by adding a new rack.
Before: Storage updates required bringing up an entirely new rack before deleting the old one, leading to:
🔴 Double infrastructure cost during migration
🔴 Rack-id changes
🔴 Client configuration changes for rack-aware clients
After: Gradual pod-by-pod migration with:
✅ Gradual rollout of new changes without changing the rack-id
✅ No client-side changes
RackRevision Field
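For context, a hedged Go sketch of how the new field might sit on the rack spec, reconstructed from the doc comment quoted earlier; the exact type name, field placement, and JSON tag are assumptions:

// Abridged rack spec; only the new field is shown in full.
type Rack struct {
	// ID identifies the rack. (other fields omitted)
	ID int `json:"id"`

	// Revision is a version identifier for this rack's specification, used to trigger
	// controlled migrations when rack configuration changes require new StatefulSets.
	// Change this field when making changes that cannot be applied in-place, such as
	// storage updates that require pod recreation. The revision is appended to the
	// rack ID for Kubernetes resource naming (e.g., <cluster-name>-<rackID>-<revision>).
	Revision string `json:"revision,omitempty"`
}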