[CHAOSPLT-1364] Fix cloud disruption hosts only injected in first chaos pod #1012

Merged
aymericDD merged 1 commit into main from aymeric.daurelle/CHAOSPLT-1364/fix
Jan 7, 2026


@aymericDD aymericDD commented Dec 22, 2025

What does this PR do?

  • Adds new functionality
  • Alters existing functionality
  • Fixes a bug
  • Improves documentation or testing

Please briefly describe your changes as well as the motivation behind them:
Bug Fix: Cloud network disruptions (e.g., AWS S3) injected --hosts arguments only into the first chaos pod. When multiple targets were selected, subsequent
chaos pods were created without any hosts, so the disruption was applied to only some of the targets.

Root Cause: The r.Client.Status().Update(ctx, instance) call in createChaosPods (line 709) overwrites the local instance object with the API server's response.
Since UpdateHostsOnCloudDisruption modifies instance.Spec.Network.Hosts in memory only (not persisted to etcd), the hosts were cleared after the first pod creation.

Solution: Use DeepCopy() before calling Status().Update() to preserve in-memory spec changes. This follows Kubernetes controller-runtime best practices
(cluster-api #1259, controller-runtime #2850).

BTW: It also fixes the flaky test.


Code Quality Checklist

  • The documentation is up to date.
  • My code is sufficiently commented and passes continuous integration checks.
  • I have signed my commit (see Contributing Docs).

Testing

  • I leveraged continuous integration testing
    • by depending on existing unit tests or end-to-end tests.
    • by adding new unit tests or end-to-end tests.
  • I manually tested the following steps:
    • Reproduction: Created a cloud disruption targeting AWS S3 with 2 target replicas. Before the fix, the first chaos pod had 361 --hosts arguments and the second had 0.
    • Verification: After the fix, both chaos pods receive all 361 cloud service IP ranges.
    • E2E Test: Ran ginkgo --focus "should create a cloud disruption but apply a host disruption with the list of cloud managed service ip ranges" - passes
      successfully.
    • locally.
    • as a canary deployment to a cluster.

@aymericDD aymericDD changed the title from "fix(network): preserve hosts in cloud disruptions" to "[CHAOSPLT-1364] Fix cloud disruption hosts only injected in first chaos pod" Dec 22, 2025
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-1364/fix branch from 623468b to 68d4ef6 Compare December 22, 2025 20:06
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-259/feat branch 2 times, most recently from 766b11f to 28c876d Compare December 22, 2025 20:27
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-1364/fix branch from 68d4ef6 to 205f48c Compare December 22, 2025 20:27
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-259/feat branch from 28c876d to 6f36370 Compare December 22, 2025 20:42
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-1364/fix branch from 205f48c to dea3090 Compare December 22, 2025 20:42
@aymericDD aymericDD marked this pull request as ready for review December 22, 2025 20:43
@aymericDD aymericDD requested a review from a team December 22, 2025 20:43
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-1364/fix branch from dea3090 to d3de07c Compare December 24, 2025 09:56
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-259/feat branch from 6f36370 to cb4f887 Compare December 24, 2025 09:56
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-1364/fix branch from d3de07c to bedf48a Compare December 24, 2025 10:01
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-259/feat branch 2 times, most recently from cdae105 to 742fcec Compare December 29, 2025 11:02
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-1364/fix branch from bedf48a to bf4a502 Compare December 29, 2025 11:02
@aymericDD aymericDD requested a review from a team December 29, 2025 14:37
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-259/feat branch from 742fcec to f6beea5 Compare December 31, 2025 09:55
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-1364/fix branch from bf4a502 to 5736463 Compare December 31, 2025 09:55
Base automatically changed from aymeric.daurelle/CHAOSPLT-259/feat to main January 7, 2026 10:01
Use DeepCopy before Status().Update() to prevent
in-memory spec changes from being lost. Without this,
cloud disruption hosts were cleared after the first
chaos pod creation, causing subsequent pods to have
no hosts injected.

Jira: CHAOSPLT-1364
@aymericDD aymericDD force-pushed the aymeric.daurelle/CHAOSPLT-1364/fix branch from 5736463 to 06dcf17 Compare January 7, 2026 10:02
@aymericDD aymericDD merged commit cfcf279 into main Jan 7, 2026
20 checks passed
@aymericDD aymericDD deleted the aymeric.daurelle/CHAOSPLT-1364/fix branch January 7, 2026 12:22