This repository was archived by the owner on Jun 29, 2022. It is now read-only.

Ensure that cluster upgrade in HA mode is not disruptive and document the shortcomings  #1213

Open
@invidian

Currently, we do not test the availability impact of our upgrade process on the cluster. This means that, to be safe, a user performing an upgrade should first migrate all production traffic to another cluster, to make sure the provided services are not disturbed.

As we support in-place upgrades, we should ensure (test and document) that when a cluster is running in an HA setup (3 or 5 controller nodes), production traffic is not affected. This should include testing things like:

  • Kubernetes API reading and writing
  • End components availability (Dex, Gangway, etc.)
  • End application availability (httpbin can be used for testing)
  • MetalLB
  • Contour
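As a starting point for such tests, here is a minimal sketch of an availability probe that could run against each of the endpoints above while an upgrade is in progress. It is not part of the existing test suite; the names `ProbeResult`, `http_check` and `probe` are hypothetical, and it assumes a plain HTTP(S) health endpoint is a good enough signal for the component being checked.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    attempts: int
    failures: int
    longest_outage: int  # longest run of consecutive failed checks

    @property
    def availability(self) -> float:
        """Fraction of checks that succeeded (0.0 if nothing was checked)."""
        if self.attempts == 0:
            return 0.0
        return 1.0 - self.failures / self.attempts


def http_check(url: str, timeout: float = 2.0) -> Callable[[], bool]:
    """Return a check that succeeds when `url` answers with a 2xx status."""
    def check() -> bool:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return 200 <= resp.status < 300
        except Exception:
            # Assumption: any error (timeout, refused connection,
            # HTTP 4xx/5xx) counts as the service being unavailable.
            return False
    return check


def probe(
    check: Callable[[], bool],
    attempts: int,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> ProbeResult:
    """Call `check` `attempts` times, `interval` seconds apart."""
    failures = longest = current = 0
    for i in range(attempts):
        if check():
            current = 0
        else:
            failures += 1
            current += 1
            longest = max(longest, current)
        if i < attempts - 1:
            sleep(interval)
    return ProbeResult(attempts, failures, longest)
```

For example, `probe(http_check(url), attempts=600)` with `url` pointing at an httpbin instance exposed through Contour would sample the application once per second for roughly ten minutes. The upgrade test could then require `failures == 0`, or at least put a bound on `longest_outage`.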

We should make sure that all components are configured in HA mode, so that when nodes get drained etc., services remain operational at all times. If this is not possible for some reason (e.g. for Prometheus, because of single read-write storage without application-level replication), we should document that.
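The drain requirement can be stated as a simple check: a component stays up during a single node drain only if enough replicas remain on the other nodes. The sketch below illustrates that reasoning. The function name and the `min_available` parameter are hypothetical, and it deliberately ignores rescheduling time and readiness delays.

```python
from collections import Counter
from typing import Sequence


def survives_single_node_drain(
    replica_nodes: Sequence[str], min_available: int = 1
) -> bool:
    """Return True if draining any one node leaves >= min_available replicas.

    `replica_nodes` lists the node name hosting each ready replica.
    """
    if not replica_nodes:
        return False
    per_node = Counter(replica_nodes)
    # Worst case: the node hosting the most replicas is the one drained.
    worst_loss = max(per_node.values())
    return len(replica_nodes) - worst_loss >= min_available
```

A single-replica component such as the Prometheus case mentioned above gives `survives_single_node_drain(["node-1"]) == False`. Two replicas only pass if they run on different nodes.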

See also #485
