@elderapo commented Oct 16, 2025

For the sake of simplicity, I will refer to spec.replicas as instances. Renaming replicas to instances is not part of this PR (so there are no breaking changes requiring a possibly complex operator upgrade for users).


Scope:

  • Allow scaling instances down.
    • When scaling down a StatefulSet it is not possible to customize which pods get deleted first; the highest ordinals are always removed first (for example, with 5 instances, scaling down to 3 deletes pods 4 and 3, keeping 0, 1, and 2). To ensure the primary is not accidentally deleted during scale-down, it must not be in the set of pods scheduled for deletion. If it is, the instance count update is rejected with the following error message (see the validation sketch after this list):
    The MySQLCluster "mysql-database-cluster" is invalid: spec.replicas: Forbidden: scale-down 2->1 would delete the current primary pod moco-mysql-database-cluster-1. Perform a switchover so the primary's ordinal is < 1, then retry.
    • When scaling down from multiple instances to a single one (just the primary), semi-sync replication gets disabled on it, because there are no longer any replicas to ACK events from the primary.
  • Sequentially add instances to the cluster on scale-up
    • Initially, I thought it would be nice to create new instances one by one, let them clone, sync up, and join the cluster. However, it appears that moco heavily relies on all pod instances being deployed right after scale-up.
  • Prevent stalling writes during scale-up.
    • Freshly added replicas do not get taken into account for primary semi-sync ack calculation until they successfully bootstrap for the first time.
  • Allow even instance counts
    • Allowing an even instance count does not increase fault tolerance compared to the previous odd size, but it gives the cluster operator more control.
    • Safety is maintained by requiring a true majority for semi-sync replication: ceil((instances - 1) / 2) replicas must ACK before the primary commits (see the sketch after this list).
  • Prevent stalling writes during scale-down.
    • On scale-down there is a brief write stall (~5 seconds, from what I've observed). Ideally this would be prevented and the primary's ACK configuration updated immediately.
    • This might not be easily achievable because reconciliation only seems to run once the actual live pod count matches the desired count; the primary's configuration is only updated after the terminating pods are gone.
  • Update docs to document these changes.
  • Update CHANGELOG.md
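
As a minimal sketch of the scale-down guard referenced above (this is not the actual MOCO webhook code; the function name, signature, and the way the primary's ordinal is obtained are assumptions for illustration), the check only needs the old and new replica counts and the current primary's ordinal:

```go
package main

import "fmt"

// validateScaleDown sketches the admission check: StatefulSets always delete
// the highest ordinals first, so any pod with ordinal >= newReplicas will be
// removed. If that set contains the current primary, the update is rejected.
func validateScaleDown(clusterName string, oldReplicas, newReplicas, primaryIndex int32) error {
	if newReplicas >= oldReplicas {
		return nil // not a scale-down, nothing to check
	}
	if primaryIndex >= newReplicas {
		return fmt.Errorf(
			"scale-down %d->%d would delete the current primary pod moco-%s-%d; "+
				"perform a switchover so the primary's ordinal is < %d, then retry",
			oldReplicas, newReplicas, clusterName, primaryIndex, newReplicas)
	}
	return nil
}

func main() {
	// Mirrors the example from the error message above: scaling 2 -> 1 while the primary is at ordinal 1.
	if err := validateScaleDown("mysql-database-cluster", 2, 1, 1); err != nil {
		fmt.Println(err)
	}
}
```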
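And a small sketch of the majority rule used for semi-sync ACKs (the function name and the standalone program are illustrative assumptions; the point is just the ceil((instances - 1) / 2) arithmetic):

```go
package main

import "fmt"

// requiredAckCount returns how many replicas must ACK a transaction before
// the primary commits: ceil((instances - 1) / 2), a true majority of the
// cluster excluding the primary. With a single instance there are no
// replicas, so semi-sync replication is disabled entirely.
func requiredAckCount(instances int) int {
	if instances <= 1 {
		return 0
	}
	// For positive instances, ceil((instances-1)/2) == instances/2 in integer division.
	return instances / 2
}

func main() {
	for _, n := range []int{1, 2, 3, 4, 5, 6} {
		fmt.Printf("instances=%d -> required ACKs=%d\n", n, requiredAckCount(n))
	}
	// instances=1 -> 0 (semi-sync off), 2 -> 1, 3 -> 1, 4 -> 2, 5 -> 2, 6 -> 3
}
```

Note that an even count adds one more replica to keep in sync without raising the ACK requirement beyond the next odd size, which is why it gives the operator more flexibility without improving fault tolerance.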
