User Request: Support for pod state disruptions #352

@nikos912000

Is your feature request related to a problem? Please describe.
Pod state failures (e.g. graceful/non-graceful deletion) are common disruptions in the Chaos Engineering community.

The reasoning behind pod failures is that Kubernetes pods are ephemeral resources: they get destroyed, restarted, and recreated.

This happens in many cases:

  • When deploying a new version of an application
  • When the liveness probe of any container running inside the pod fails
  • As a consequence of draining a node
  • When the autoscaler updates the number of replicas of a deployment

Pod state disruptions can expose a number of reliability concerns including:

  • Long-living pods and all the issues that may arise from them
  • Cold start issues
  • Scalability issues (e.g. autoscaling misconfigurations)
  • Inconsistent/unknown startup times
  • Uneven traffic distribution across pods
  • Non-graceful shutdown
  • Issues related to Java's DNS cache TTL leading to terminated pods still receiving requests
  • Cascading failures

We also wrote a blog post on the issues we found when using Kube Monkey.

Describe the solution you'd like
Pod deletions can be executed in many different ways. The easiest is through the Kubernetes client, which supports both graceful and non-graceful deletions through its gracePeriodSeconds parameter. This is how tools like Kube Monkey and our internal controller execute that disruption.
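
To make the gracePeriodSeconds option concrete, here is a minimal sketch of the pod DELETE call against the Kubernetes REST API. It only builds the request and does not send it. The function name, the API server URL and the pod name are placeholders for illustration, and authentication is left out entirely.

```python
import json
import urllib.request
from typing import Optional


def build_pod_delete_request(
    api_server: str,
    namespace: str,
    pod: str,
    grace_period_seconds: Optional[int] = None,
) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for a single pod.

    Hypothetical helper for illustration; a real controller would use a
    Kubernetes client library and add credentials.
    """
    options = {"kind": "DeleteOptions", "apiVersion": "v1"}
    if grace_period_seconds is not None:
        # 0 asks for immediate (non-graceful) deletion. Leaving the field
        # unset lets the pod's own termination grace period apply.
        options["gracePeriodSeconds"] = grace_period_seconds

    url = f"{api_server}/api/v1/namespaces/{namespace}/pods/{pod}"
    return urllib.request.Request(
        url,
        data=json.dumps(options).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="DELETE",
    )


# Non-graceful deletion: grace period forced to zero.
forced = build_pod_delete_request("https://k8s.example:6443", "default", "web-0", 0)
# Graceful deletion: rely on the pod's default grace period.
graceful = build_pod_delete_request("https://k8s.example:6443", "default", "web-0")
```

The only difference between the two disruptions in this sketch is whether gracePeriodSeconds is present in the DeleteOptions body.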

The other option would be to do this at the container level, which provides more granularity. This is how Pumba executes these disruptions.

A few more implementation details/ideas:

  • The level will always be pod for these disruptions.
  • In the CRD this might get a bit confusing, but one option is to introduce a podFailure field, similar to nodeFailure, with options (e.g. graceful/non-graceful deletion).
  • There is already a containers field which would allow targeting containers.
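
As a rough illustration of the ideas above, the sketch below validates a disruption spec carrying a podFailure entry. Everything about the shape is an assumption made for discussion: the `graceful` option name, the default level, and the `validate_pod_failure` helper do not exist in the project today.

```python
from typing import Any, Dict, List


def validate_pod_failure(spec: Dict[str, Any]) -> List[str]:
    """Return validation errors for a hypothetical podFailure disruption."""
    errors: List[str] = []
    pod_failure = spec.get("podFailure")
    if pod_failure is None:
        return errors  # nothing to validate

    # The level is always pod for these disruptions.
    if spec.get("level", "pod") != "pod":
        errors.append("podFailure requires level: pod")

    # Hypothetical option: True for graceful, False for non-graceful deletion.
    if not isinstance(pod_failure.get("graceful", True), bool):
        errors.append("podFailure.graceful must be a boolean")

    # The existing containers field could narrow the target to containers.
    containers = spec.get("containers", [])
    if not all(isinstance(name, str) for name in containers):
        errors.append("containers must be a list of container names")

    return errors


ok_spec = {"level": "pod", "podFailure": {"graceful": False}, "containers": ["app"]}
bad_spec = {"level": "node", "podFailure": {"graceful": "no"}}
```

Here `validate_pod_failure(ok_spec)` returns an empty list, while `bad_spec` is rejected for both its level and its `graceful` value.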
