Feature Request: A/B Testing flow modification to accommodate incompatible versions #1752

Open
SuphalAthlur opened this issue Jan 8, 2025 · 0 comments

Describe the feature

Currently, once the canary analysis phase succeeds, the A/B testing strategy proceeds as follows:
Route all traffic to primary -> Copy the new template spec to the primary pods -> Scale down the canary deployment

Issue: While the update is being rolled out, the primary pods briefly serve two versions at once. Some applications cannot tolerate this, and there should be a way to avoid it.

During canary analysis, this can be mitigated with header-based matching (spec.analysis.match) when routing traffic to primary or canary. By reading the user's identity or session cookie from the request headers, we can ensure that a given session (or a user, or a set of users) is always served the same version. No such mechanism exists once the canary succeeds and the rolling update is underway on the primary deployment, which at that point is serving all the live traffic.

Proposed solution

Proposed flow:
Route all traffic to canary (which has just passed analysis) -> Copy the template spec to the primary pods -> Route all traffic to primary -> Scale down the canary deployment
This is similar to the blue-green strategy.

How it solves the issue: At no point would a deployment that is serving live traffic host pods running different application versions. This alternative flow could be offered as a configurable option, so the existing behavior stays available to Flagger users.
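
To make the difference between the two flows concrete, here is a minimal Go sketch that models each flow as a list of steps and flags any step where live traffic reaches the primary while its pods are mid-rollout. The `Step` type and the function names are hypothetical and exist only for illustration; this is not Flagger's code, just a toy model of the ordering argument above.

```go
package main

import "fmt"

// Step is a hypothetical, simplified model of one promotion step.
type Step struct {
	Name           string
	LiveTarget     string // which deployment receives live traffic: "primary" or "canary"
	PrimaryRolling bool   // true while primary pods are being replaced (old and new versions coexist)
}

// currentFlow mirrors the existing order described in this issue.
func currentFlow() []Step {
	return []Step{
		{"route all traffic to primary", "primary", false},
		{"copy canary template spec to primary", "primary", true},
		{"scale down canary", "primary", false},
	}
}

// proposedFlow mirrors the order proposed in this issue.
func proposedFlow() []Step {
	return []Step{
		{"route all traffic to canary", "canary", false},
		{"copy canary template spec to primary", "canary", true},
		{"route all traffic to primary", "primary", false},
		{"scale down canary", "primary", false},
	}
}

// MixedVersionSteps returns the names of steps during which live traffic
// is sent to the primary while it is running two versions.
func MixedVersionSteps(flow []Step) []string {
	var out []string
	for _, s := range flow {
		if s.LiveTarget == "primary" && s.PrimaryRolling {
			out = append(out, s.Name)
		}
	}
	return out
}

func main() {
	fmt.Println("current: ", MixedVersionSteps(currentFlow()))
	fmt.Println("proposed:", MixedVersionSteps(proposedFlow()))
}
```

In this model the current flow has one offending step (the rollout on the primary), while the proposed flow has none, because the canary carries the live traffic until the primary has finished updating.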

Any alternatives you've considered?

If Istio is the service mesh provider, its sticky-session support might help: https://istio.io/latest/docs/reference/config/networking/destination-rule/#LoadBalancerSettings-ConsistentHashLB
But this comes with several caveats:

  • If the host that a session is pinned to goes down, Istio would pick another host at random, without regard for the application version the old host was serving
  • This is because Istio is unaware of application version information; it routes purely on destination host identity
  • Such a strict constraint may not be required: the use case described here is satisfied as long as requests are routed to any host serving that version, not necessarily the same host each time
  • This could make load balancing trickier and impose unnecessary constraints. Flagger has the potential to provide a much more elegant solution
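
The gap between host stickiness and version stickiness in the caveats above can be illustrated with a small Go sketch. It is a toy model, not Istio's load-balancing algorithm: the `Host` type, the pod names, and both picker functions are hypothetical, and the version-unaware fallback is simplified to "first healthy host" rather than a random choice so the example stays deterministic.

```go
package main

import "fmt"

// Host is a hypothetical model of a backend pod.
type Host struct {
	Name    string
	Version string
	Healthy bool
}

// pickAnyHealthy models a version-unaware fallback: it returns the first
// healthy host and ignores which application version that host serves.
func pickAnyHealthy(hosts []Host) (Host, bool) {
	for _, h := range hosts {
		if h.Healthy {
			return h, true
		}
	}
	return Host{}, false
}

// pickSameVersion models the looser constraint from this issue: any healthy
// host serving the wanted version is acceptable, not one specific host.
func pickSameVersion(hosts []Host, version string) (Host, bool) {
	for _, h := range hosts {
		if h.Healthy && h.Version == version {
			return h, true
		}
	}
	return Host{}, false
}

// exampleHosts returns a pool in which the session's original v2 host is down.
func exampleHosts() []Host {
	return []Host{
		{Name: "pod-a", Version: "v1", Healthy: true},
		{Name: "pod-b", Version: "v2", Healthy: false}, // the session was pinned here
		{Name: "pod-c", Version: "v2", Healthy: true},
	}
}

func main() {
	hosts := exampleHosts()
	fallback, _ := pickAnyHealthy(hosts)
	sameVersion, _ := pickSameVersion(hosts, "v2")
	fmt.Println("version-unaware fallback:", fallback.Name, fallback.Version)
	fmt.Println("version-aware fallback:  ", sameVersion.Name, sameVersion.Version)
}
```

In this pool, the version-unaware picker moves a v2 session onto a v1 pod, while the version-aware picker keeps it on v2 by choosing a different host that serves the same version.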