@abhishekdwivedi3060 abhishekdwivedi3060 commented Sep 17, 2025

This PR introduces the Rack Revision feature, which enables safe, cost-effective storage updates for Aerospike clusters by migrating pods gradually, one at a time, instead of using the old approach of adding a new rack. The old approach required rack-aware clients to change their configuration on every storage update, because each update was rolled out by adding a new rack.

Before: Storage updates required bringing up an entirely new rack before deleting the old one, leading to:
🔴 Double infrastructure cost during migration
🔴 Rack-id changes
🔴 Client configuration changes for rack-aware clients

After: Gradual pod-by-pod migration with:
✅ Gradual rollout of new changes without changing the rack-id
✅ No client-side changes

RackRevision Field

spec:
  rackConfig:
    racks:
    - id: 1
      revision: "v2"  # New field for versioned storage updates
      storage:
        # Updated storage configuration

@abhishekdwivedi3060 force-pushed the feature/KO-448-support-vertical-scaling branch from a7d5344 to 276e920 on September 17, 2025 19:01
@abhishekdwivedi3060 changed the title from "[WIP] feat: Add support for vertical scaling (storage scaling) [KO-448]" to "feat: Add support for vertical scaling (storage scaling) [KO-448]" on Sep 22, 2025
@abhishekdwivedi3060 force-pushed the feature/KO-448-support-vertical-scaling branch from ab1947c to c6552d9 on September 24, 2025 05:59
oldSts = nil
}

newSts, err := r.getSTS(newRack)

Is there any issue if we get the new STS before the old STS, above line 1958?
If we do so, the `oldSts == nil` logic will directly follow the old-STS code.

// Revision is a version identifier for this rack's specification, used to trigger controlled migrations
// when rack configuration changes require new StatefulSets. Change this field when making changes
// that cannot be applied in-place, such as storage updates that require pod recreation.
// The revision is appended to the rack ID for Kubernetes resource naming (e.g., <cluster>-<rackID>-<revision>).

Suggested change
// The revision is appended to the rack ID for Kubernetes resource naming (e.g., <cluster>-<rackID>-<revision>).
// The revision is appended to the rack ID for Kubernetes resource naming (e.g., <cluster-name>-<rackID>-<revision>).
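For illustration, the naming convention described in this comment could be sketched as below. `rackResourceName` is a hypothetical helper, not the operator's actual function, and omitting the suffix when no revision is set is an assumption that this thread does not confirm.

```go
package main

import (
	"fmt"
	"strconv"
)

// rackResourceName builds a name of the form <cluster-name>-<rackID>-<revision>.
// Hypothetical helper for illustration only.
// Assumption: when revision is empty, the suffix is omitted entirely.
func rackResourceName(clusterName string, rackID int, revision string) string {
	name := clusterName + "-" + strconv.Itoa(rackID)
	if revision != "" {
		name += "-" + revision
	}

	return name
}

func main() {
	fmt.Println(rackResourceName("aerocluster", 1, ""))   // rack without a revision
	fmt.Println(rackResourceName("aerocluster", 1, "v2")) // rack with revision "v2"
}
```

Under this assumption an existing rack keeps its current name until a revision is set, which would be what allows a rack with no revision to migrate to one.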


func (r *SingleClusterReconciler) waitForAllSTSToBeReady(ignorablePodNames sets.Set[string]) error {
r.Log.Info("Waiting for cluster to be ready")
r.Log.Info("Waiting for cluster STS to be ready")

Suggested change
r.Log.Info("Waiting for cluster STS to be ready")
r.Log.Info("Waiting for all cluster STSs to be ready")

r.Log.Info("Waiting for cluster STS to be ready")

allRackIDs := sets.NewInt()
allRackIdentifier := sets.NewString()

Can we rename the `allRackIdentifier` variable to `allRackIdentifiers` (plural form)?

pkg/utils/pod.go Outdated
prefix := clusterName + "-"

rackAndPodIndexPart := strings.TrimPrefix(podName, prefix)
// parts contain only the rack-id, rack-revision (optional) pod-index.

Suggested change
// parts contain only the rack-id, rack-revision (optional) pod-index.
// parts contain only the rack-id, rack-revision (optional), and pod-index.

@abhishekdwivedi3060 changed the title from "feat: Add support for vertical scaling (storage scaling) [KO-448]" to "feat: Add support for vertical scaling (storage scaling) [KO-448] [KO-459]" on Oct 17, 2025
jwalantmodi05 previously approved these changes Nov 11, 2025
)

It(
"Should successfully migrate rack to new revision with storage update", func() {

Can we have a test that updates the storage on a rack that initially has no revision, moving it to some revision? This would test vertical scaling on existing clusters.

tanmayja previously approved these changes Nov 12, 2025
@abhishekdwivedi3060 abhishekdwivedi3060 merged commit 21e0ab1 into master Nov 13, 2025
10 checks passed
@abhishekdwivedi3060 abhishekdwivedi3060 deleted the feature/KO-448-support-vertical-scaling branch November 13, 2025 05:46
4 participants