User Request: allow targeting subset of destination service #843
Comments
Hey, just to clarify: you would like to drop all packets, but only for a subset of the hosts behind the hostname you provide to the disruption, right? In other words, your use case would be "I want to drop 100% of packets going to 50% of the hosts behind the provided hostname", which is different from "I want to drop 50% of packets going to the provided hostname".
Yup; in my concrete case I probably want to
Ok, I think there's a simple way to implement such a feature: resolve the given hostname and pick x% of the returned IPs in the injector component. @ptnapoleon wdyt?
Is it literally just x% of returned IPs, or is there any other filtering you want to do on those hosts? Do you need the same x% of IPs to be picked across all injectors, or is it fine if they're all just picking a random x%? Do you need this to work for the
It would probably be better if it's the same IPs across all the selected pods, but that's not a hard requirement. I'm thinking it would be relatively easy to do with a consistent hash? It doesn't have to be the same across runs, so maybe throw some Disruption-specific value in there. By the way, this would also work well for testing resilience to e.g. a single Aurora database reader being unavailable; there's a single hostname for the reader endpoint, with the configured number of reader instances behind it, so you can't target a disruption at a single instance by using just the hostname. It's another case where it's valuable to test that applications retry connections, for example.
Sounds good, and it should be easy to integrate within the host filters that do the resolution of hostnames: https://github.com/DataDog/chaos-controller/blob/main/injector/network_disruption.go#L1177. Passing a percentage of resolved IPs to keep would probably be enough, and it is a valuable feature.
I'll open a ticket for this internally for us to track.
Are there any updates on the status of this?
Is your feature request related to a problem? Please describe.
A typical scenario I want to test is how service A responds to its direct dependency, service B, being partially unavailable. I basically want to verify that A has proper timeouts and retries in place to gracefully handle e.g. a single B pod being overloaded or in a bad state.
Describe the solution you'd like
I see that it's possible to scope a network disruption to just a list of specific IP addresses with the network.hosts field. However, I do not know the IP addresses of the B pods at the time of writing the Disruption. I would like to instead be able to provide a count of the destination service's pods that should be in scope for the disruption, with a percentage allowed. This would be dynamically translated to a list of IPs.
Describe alternatives you've considered
I can create a disruption on B instead of A, and set the count as I wish. However, that causes a disruption to all clients of B, whereas I want to limit the scope to A, which is the subject under test. We do not have dedicated environments for this, so limiting the impact of disruptions is key to staying popular with my colleagues :D