User Request: Release Dynamic Targeting behind a feature flag in controller #575

Open
@baluishere

Note: While chaos-controller is open to the public and we consider all suggestions for improvement, we prioritize feature development that is immediately applicable to chaos engineering initiatives within Datadog. We encourage users to contribute ideas to the repository directly in the form of pull requests!

Is your feature request related to a problem? Please describe.

After upgrading the controller from version 6.1.0 to 7.2.0, which has dynamic targeting enabled by default, we observed that the controller enters a restart loop (back-off restart) while running experiments (error attached at the bottom). However, when we turn off dynamic targeting by adding staticTargeting: true to the disruption definition, the controller works as expected.
We are still evaluating the dynamic targeting feature. Because it is enabled by default in new versions, there is a risk of teams running it without knowing its complete impact.
It would be great if dynamic targeting could be released behind a controller config option, so that we can block it until teams are confident and/or we iron out the issues it causes in our cluster. This would also let us use newer controller versions while we sort out dynamic targeting.

Describe the solution you'd like
Release dynamic targeting behind a feature flag: a setting in configmap.yaml that enables or disables dynamic targeting, so we can control this feature from the controller side.
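To make the request concrete, here is a minimal sketch of how such a flag could gate the behavior. Everything in it is an assumption for illustration: `ControllerConfig`, `DynamicTargetingEnabled`, `DisruptionSpec`, and `UseDynamicTargeting` are hypothetical names, not the controller's actual types or config keys. The idea is that the controller-wide flag acts as a hard off switch, and the per-disruption static targeting setting only matters when the flag is on.

```go
package main

import "fmt"

// ControllerConfig is a hypothetical slice of the controller configuration.
// DynamicTargetingEnabled stands in for the requested feature flag.
type ControllerConfig struct {
	DynamicTargetingEnabled bool
}

// DisruptionSpec is a hypothetical stand-in holding only the field that
// matters here: the per-disruption opt-out of dynamic targeting.
type DisruptionSpec struct {
	StaticTargeting bool
}

// UseDynamicTargeting reports whether dynamic targeting should run for a
// disruption. The controller-wide flag wins: when it is off, every
// disruption is treated as statically targeted.
func UseDynamicTargeting(cfg ControllerConfig, spec DisruptionSpec) bool {
	if !cfg.DynamicTargetingEnabled {
		return false
	}
	return !spec.StaticTargeting
}

func main() {
	// Flag off: the disruption does not opt out, but the controller blocks it.
	blocked := UseDynamicTargeting(
		ControllerConfig{DynamicTargetingEnabled: false},
		DisruptionSpec{StaticTargeting: false},
	)
	fmt.Println(blocked) // prints "false"
}
```

With a gate like this, cluster operators could keep dynamic targeting off for everyone and turn it on later without asking each team to edit its disruption definitions.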

Describe alternatives you've considered
We are open to any other ideas that would let us enable or disable dynamic targeting from the controller.

**Errors seen when running experiments with dynamic targeting**

{"level":"info","ts":1660818040268.9556,"caller":"chaos-controller/main.go:267","message":"loading configuration file","config":"/etc/chaos-controller/config.yaml"}
I0818 10:20:41.321192       1 request.go:665] Waited for 1.031103301s due to client-side throttling, not priority and fairness, request: GET:https://172.20.0.1:443/apis/pkg.crossplane.io/v1?timeout=32s
{"level":"info","ts":1660818045975.2166,"caller":"eventbroadcaster/notifiersink.go:40","message":"notifier noop enabled"}
{"level":"info","ts":1660818045978.2427,"caller":"chaos-controller/main.go:424","message":"restarting chaos-controller"}
I0818 10:20:45.978442       1 leaderelection.go:248] attempting to acquire leader lease chaos-engineering-framework/75ec2fa4.datadoghq.com...
I0818 10:21:02.864515       1 leaderelection.go:258] successfully acquired lease chaos-engineering-framework/75ec2fa4.datadoghq.com
I0818 10:21:04.017267       1 request.go:665] Waited for 1.046628643s due to client-side throttling, not priority and fairness, request: GET:https://172.20.0.1:443/apis/athena.aws.crossplane.io/v1alpha1?timeout=32s
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1444dba]

goroutine 691 [running]:
github.com/DataDog/chaos-controller/controllers.(*DisruptionReconciler).manageInstanceSelectorCache(0xc000824000, 0xc0004e0240)
	/go/src/github.com/gsQ9JVMR/0/DataDog/chaos-controller/controllers/cache_handler.go:514 +0x63a
github.com/DataDog/chaos-controller/controllers.(*DisruptionReconciler).Reconcile(0xc000824000, {0x1b730d8?, 0xc0010c1a70?}, {{{0xc000a4de60?, 0x1b?}, {0xc000a4de20?, 0x20?}}})
	/go/src/github.com/gsQ9JVMR/0/DataDog/chaos-controller/controllers/disruption_controller.go:124 +0x4c5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0xc00082d040, {0x1b730d8, 0xc0010c19b0}, {{{0xc000a4de60?, 0x17c79c0?}, {0xc000a4de20?, 0xc000716d40?}}})
	/go/src/github.com/gsQ9JVMR/0/DataDog/chaos-controller/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:114 +0x222
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00082d040, {0x1b73030, 0xc000698580}, {0x175aa00?, 0xc00064d940?})
	/go/src/github.com/gsQ9JVMR/0/DataDog/chaos-controller/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:311 +0x2e9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00082d040, {0x1b73030, 0xc000698580})
	/go/src/github.com/gsQ9JVMR/0/DataDog/chaos-controller/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	/go/src/github.com/gsQ9JVMR/0/DataDog/chaos-controller/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	/go/src/github.com/gsQ9JVMR/0/DataDog/chaos-controller/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:223 +0x309
