Ensure that cluster upgrade in HA mode is not disruptive and document the shortcomings 

Currently, we do not test the availability impact of our upgrade process on the cluster. This means that, **ideally**, a user performing an upgrade should first migrate all production traffic to another cluster to make sure the provided services are not disturbed.

As we support in-place upgrades, we should ensure (test and document) that production traffic is not affected when the cluster is running in an HA setup (3 or 5 controller nodes). This should include testing things like:
- Kubernetes API reading and writing
- End components availability (Dex, Gangway, etc.)
- End application availability (httpbin can be used for testing)
- MetalLB
- Contour
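
The checks listed above could be driven by a small probe that runs for the duration of an upgrade and records failed requests. The following is a minimal sketch, not an existing test in this project: `ProbeReport`, `http_check`, and `run_probe` are hypothetical names, and the endpoint URL in the usage note is a placeholder.

```python
import time
import urllib.request
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ProbeReport:
    """Summary of one probe run (hypothetical helper type)."""
    total: int = 0
    failures: int = 0
    longest_outage: int = 0  # longest run of consecutive failed checks
    errors: List[str] = field(default_factory=list)

    @property
    def availability(self) -> float:
        # Fraction of checks that succeeded; 1.0 if nothing was checked.
        if self.total == 0:
            return 1.0
        return (self.total - self.failures) / self.total


def http_check(url: str, timeout: float = 2.0) -> Callable[[], None]:
    """Build a check that fails if the URL cannot be fetched."""
    def check() -> None:
        # urlopen raises on connection errors, timeouts and HTTP 4xx/5xx.
        with urllib.request.urlopen(url, timeout=timeout):
            pass
    return check


def run_probe(
    check: Callable[[], None],
    attempts: int,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> ProbeReport:
    """Call `check` repeatedly and count failures and outage length."""
    report = ProbeReport()
    current_outage = 0
    for i in range(attempts):
        report.total += 1
        try:
            check()
            current_outage = 0
        except Exception as exc:  # any failure counts as unavailability
            report.failures += 1
            current_outage += 1
            report.longest_outage = max(report.longest_outage, current_outage)
            report.errors.append(f"attempt {i}: {exc}")
        if i < attempts - 1:
            sleep(interval)
    return report
```

One probe per endpoint (for example an httpbin URL exposed through Contour and MetalLB) could be started before the upgrade, e.g. `run_probe(http_check("https://httpbin.example.com/get"), attempts=600)`, and the upgrade would be considered non-disruptive for that endpoint if `failures` stays at zero. Kubernetes API reads and writes would need a different `check` callable, which is not shown here.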

We should make sure that all components are configured in HA mode, so that services remain operational at all times when nodes get drained, etc. If this is not possible for some component (e.g. Prometheus, which uses single read-write storage without application-level replication), we should document that.
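
As a first pass at finding components that are not configured for HA, one could scan the Deployments in the cluster for replica counts below two. The sketch below assumes its input is the parsed output of `kubectl get deployments --all-namespaces -o json`; the function name `single_replica_workloads` is hypothetical. More than one replica is necessary but not sufficient for surviving a node drain (scheduling spread and disruption budgets also matter), so this only narrows down what needs documenting.

```python
from typing import Any, Dict, List


def single_replica_workloads(
    listing: Dict[str, Any], min_replicas: int = 2
) -> List[str]:
    """Return "namespace/name" for Deployments below `min_replicas`.

    `listing` is expected to be the parsed JSON list object printed by
    `kubectl get deployments --all-namespaces -o json`.
    """
    flagged = []
    for item in listing.get("items", []):
        meta = item.get("metadata", {})
        # Kubernetes defaults spec.replicas to 1 when it is not set.
        replicas = item.get("spec", {}).get("replicas", 1)
        if replicas < min_replicas:
            namespace = meta.get("namespace", "default")
            name = meta.get("name", "?")
            flagged.append(f"{namespace}/{name}")
    return sorted(flagged)
```

The result could be fed from `json.load(sys.stdin)` in a small script, and each flagged workload either reconfigured for HA or listed in the documentation as a known shortcoming.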

See also #485

Ensure that cluster upgrade in HA mode is not disruptive and document the shortcomings #1213

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Ensure that cluster upgrade in HA mode is not disruptive and document the shortcomings #1213

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions