Description
Background information:
- The original problems with the old algorithm;
- The fix to the claim/join algorithm;
- The PR to fix the leave algorithm;
- The original rack awareness PR.
When joining a node, the following algorithms are attempted:
1 - A basic attempt to satisfy wants (vnodes required by the joining node) by asking node-by-node which vnodes can be passed on without breaking target_n_val (the claim_v2 algorithm).
2 - If Step 1 is unsuccessful, then attempt to stripe all vnodes across all nodes (the sequential_claim algorithm).
3 - If Step 2 creates tail violations (i.e. if 0 < RingSize rem NodeCount < TargetNVal), resolve through the solve_tail_violations algorithm.
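The tail-violation condition in Step 3 can be illustrated with a minimal Python sketch. This is not the riak_core implementation (which is written in Erlang); the function names `stripe`, `tail_violation_expected` and `violations` are hypothetical, and the striping is assumed to be a plain round-robin assignment with target_n_val checked over every run of consecutive vnodes, wrapping at the end of the ring.

```python
def stripe(ring_size, nodes):
    """Assign vnode i to nodes[i % len(nodes)] (assumed round-robin striping)."""
    return [nodes[i % len(nodes)] for i in range(ring_size)]


def tail_violation_expected(ring_size, node_count, target_n_val):
    """The condition quoted in Step 3: 0 < RingSize rem NodeCount < TargetNVal."""
    return 0 < ring_size % node_count < target_n_val


def violations(ring, target_n_val):
    """Start indices of windows of target_n_val consecutive vnodes
    (wrapping around the ring) in which some node appears more than once."""
    n = len(ring)
    bad = []
    for i in range(n):
        window = [ring[(i + k) % n] for k in range(target_n_val)]
        if len(set(window)) < len(window):
            bad.append(i)
    return bad


nodes = ["a", "b", "c", "d", "e"]

# 16 rem 5 = 1, which lies strictly between 0 and 4, so the tail breaks.
print(tail_violation_expected(16, 5, 4), violations(stripe(16, nodes), 4))

# 19 rem 5 = 4, which is not below 4, so plain striping is already safe.
print(tail_violation_expected(19, 5, 4), violations(stripe(19, nodes), 4))
```

With a ring size of 16, the last vnode and the first vnode both land on node a, so the three windows that span the wrap-around point (starting at indices 13, 14 and 15) each contain a repeated node. With a ring size of 19 the tail is long enough that no window repeats a node.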
When leaving a node, the following algorithms are attempted:
1 - A basic attempt to perform a simple_transfer (vnodes are passed in turn to nodes that would not break target_n_val).
2 - Use sequential_claim as in join.
3 - Use the solve_tail_violations extension to sequential_claim, as in join.
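The idea behind Step 1 of the leave process can be sketched as follows. This is a simplified Python illustration, not the riak_core code: the function name and signature are hypothetical, and the rule for choosing among eligible nodes (first in sorted order) is an assumption made only to keep the example short. The real algorithm may weigh candidates differently, for example to keep ownership balanced.

```python
def simple_transfer_sketch(ring, leaving, target_n_val):
    """Hand each vnode owned by `leaving` to a remaining node that does not
    already own a vnode within target_n_val - 1 positions on either side.
    Returns the new ring, or None if some vnode has no eligible recipient
    (the case in which a fallback to a full re-stripe would be needed)."""
    n = len(ring)
    new_ring = list(ring)
    remaining = sorted(set(ring) - {leaving})
    for i, owner in enumerate(ring):
        if owner != leaving:
            continue
        # Owners of neighbouring vnodes, including any already reassigned.
        nearby = {
            new_ring[(i + d) % n]
            for d in range(-(target_n_val - 1), target_n_val)
            if d != 0
        }
        candidates = [node for node in remaining if node not in nearby]
        if not candidates:
            return None
        new_ring[i] = candidates[0]  # assumed tie-break: first candidate
    return new_ring


six = ["a", "b", "c", "d", "e", "f"] * 2   # 12 vnodes striped over 6 nodes
five = ["a", "b", "c", "d", "e"] * 2       # 10 vnodes striped over 5 nodes

print(simple_transfer_sketch(six, "f", 3))   # succeeds: both vnodes go to c
print(simple_transfer_sketch(five, "e", 3))  # None: every other node is too close
```

In the second case each vnode owned by the leaving node has all four remaining nodes within two positions, so no transfer can preserve a target_n_val of 3 and the sketch reports failure.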
Ideally, in both cases Step 1 should succeed, as Step 2 will inevitably lead to a full cluster reorganisation (and hence a large volume of transfers).
As part of #967 location awareness was added to the sequential_claim algorithm (Step 2).
This issue documents an ongoing investigation into these three problems:
- Under what conditions does the sequential_claim algorithm (both with and without the need for the solve_tail_violations algorithm) provide a location-safe cluster;
- Can the claim_v2 (Step 1 for joins) and simple_transfer (Step 1 for leave) algorithms be extended to be location aware;
- Can the claim_v2 and simple_transfer algorithms be extended to reduce the scenarios in which cluster changes fall back to sequential_claim.
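One way to explore the first question is to test candidate rings against an explicit definition of location safety. The Python sketch below assumes that a ring is location safe when every run of `window` consecutive vnodes (wrapping around the ring) sits on nodes in distinct locations. That definition, the `window` parameter, and the function name are assumptions made for illustration and are not taken from the riak_core source.

```python
def location_violations(ring, node_location, window):
    """Start indices of windows of `window` consecutive vnodes (wrapping)
    whose owning nodes do not all sit in distinct locations.
    `node_location` maps each node name to its location label."""
    n = len(ring)
    locations = [node_location[node] for node in ring]
    return [
        i
        for i in range(n)
        if len({locations[(i + k) % n] for k in range(window)}) < window
    ]


ring = ["a", "b", "c", "d"] * 2

# Neighbouring nodes share a location: alternate windows are unsafe.
paired = {"a": "loc1", "b": "loc1", "c": "loc2", "d": "loc2"}
print(location_violations(ring, paired, 2))

# Locations alternate around the ring: no window of two repeats a location.
alternating = {"a": "loc1", "b": "loc2", "c": "loc1", "d": "loc2"}
print(location_violations(ring, alternating, 2))
```

The same striped ring is safe or unsafe depending only on how nodes are mapped to locations, which is why the order in which nodes are presented to a striping algorithm matters for this question.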