
@fulmicoton

No description provided.

-node_delta
-.max_version
-.map(|max_version| DeltaOpRef::SetMaxVersion { max_version })
+if node_delta.key_values.is_empty() && node_delta.max_version > 0 {

This does not change much. I just removed the `Option` around `max_version`, as it is error-prone.

version: crate::Version,
deleted: bool,
) {
assert_ne!(version, 0, "0 version for a kv is forbidden");

This is just a utility for tests. We don't panic here.

}

fn process_delta(&mut self, delta: Delta) {
self.maybe_trigger_catchup_callback(&delta);

That's one important part. We no longer have ad hoc logic to detect whether a reset happened.

Instead, we return whether a reset was applied.
The ad hoc logic in `maybe_trigger_catchup_callback` was too loose before.
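
To illustrate the idea, here is a minimal sketch in which applying a delta reports whether a reset took place, and the catch-up callback fires only in that case. Everything here is a hypothetical simplification: `ClusterState`, `Node`, the `reset` flag, and the `apply_delta` signature are assumptions and are not taken from the chitchat code.

```rust
/// Hypothetical, heavily simplified cluster state.
struct ClusterState {
    max_version: u64,
}

impl ClusterState {
    /// Applies a delta and reports whether it reset the state.
    /// `reset` is a stand-in flag; the real code derives this from the delta.
    #[must_use]
    fn apply_delta(&mut self, new_max_version: u64, reset: bool) -> bool {
        self.max_version = new_max_version;
        reset
    }
}

/// Hypothetical node wrapper holding the state and a catch-up callback.
struct Node<F: FnMut()> {
    state: ClusterState,
    catchup_callback: F,
}

impl<F: FnMut()> Node<F> {
    /// The callback is driven by the return value of `apply_delta`,
    /// not by a separate heuristic that inspects the delta.
    fn process_delta(&mut self, new_max_version: u64, reset: bool) {
        let reset_applied = self.state.apply_delta(new_max_version, reset);
        if reset_applied {
            (self.catchup_callback)();
        }
    }
}
```

With this shape, a regular delta updates the state without invoking the callback, while a delta that triggers a reset invokes it exactly once.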

/// forcing the garbage collected node's recreation to the regular chitchat
/// heartbeat protocol.
-pub fn reset_node_state(
+pub fn reset_node_state_if_update(

That's the second fix. Now we do not early exit if the state received via gRPC does not represent progress.

Generally, a chitchat node state must have an ever-increasing tuple:
`(last_gc.max(max_version), max_version)`
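
As a rough illustration of that invariant, the sketch below compares two states by that tuple. Rust tuples compare lexicographically, so the first component dominates and `max_version` breaks ties. The `Versions` struct and the `is_progress` helper are hypothetical names, and treating "ever increasing" as strictly increasing is an assumption.

```rust
/// Hypothetical stand-in for the two version counters of a node state.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Versions {
    last_gc: u64,
    max_version: u64,
}

impl Versions {
    /// The key from the comment above: (last_gc.max(max_version), max_version).
    fn progress_key(&self) -> (u64, u64) {
        (self.last_gc.max(self.max_version), self.max_version)
    }
}

/// Returns true if `candidate` is strictly ahead of `current`.
/// Tuples are compared lexicographically, left to right.
fn is_progress(current: Versions, candidate: Versions) -> bool {
    candidate.progress_key() > current.progress_key()
}
```

For example, with a current state of `last_gc = 5, max_version = 10`, a candidate with `max_version = 12` counts as progress, while a candidate with `last_gc = 3, max_version = 8` does not.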

//
// Now we are trying to reset our state via grpc gossip, but the new state is already
// out of date.
warn!(

This is the new condition.

Before, when we gossiped with lagging nodes, we would update our state and trigger gossip in the same round.

// Assess whether the delta can be applied or not.
#[must_use]
-fn prepare_apply_delta(&mut self, node_delta: &NodeDelta) -> bool {
+fn check_delta_status(&self, node_delta: &NodeDelta) -> DeltaStatus {

I just isolated the immutable logic that considers whether we should accept a delta or not.

I also tried to make it simpler.
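
A minimal sketch of what such a read-only check could look like follows. Only the method name, the `&self` receiver, the `#[must_use]` attribute, and the `DeltaStatus` return type come from the diff above. The enum variants, the struct fields (`from_version`, `is_reset`), and the decision rules are assumptions made for illustration and are not the actual chitchat logic.

```rust
/// Hypothetical outcomes; the real `DeltaStatus` variants are not shown in this thread.
#[derive(Debug, PartialEq, Eq)]
enum DeltaStatus {
    /// The delta extends the current state and can be applied directly.
    Apply,
    /// The local state must be wiped before the delta is applied.
    ApplyAfterReset,
    /// The delta is stale or leaves a gap, so it is ignored.
    Skip,
}

/// Hypothetical, simplified delta: it carries versions in (from_version, max_version].
struct NodeDelta {
    from_version: u64,
    max_version: u64,
    is_reset: bool,
}

/// Hypothetical, simplified node state.
struct NodeState {
    max_version: u64,
}

impl NodeState {
    /// Taking `&self` guarantees that deciding has no side effects.
    #[must_use]
    fn check_delta_status(&self, delta: &NodeDelta) -> DeltaStatus {
        // A delta that does not move us forward is never applied.
        if delta.max_version <= self.max_version {
            return DeltaStatus::Skip;
        }
        if delta.is_reset {
            return DeltaStatus::ApplyAfterReset;
        }
        // A non-reset delta must start at or before what we already know.
        if delta.from_version <= self.max_version {
            DeltaStatus::Apply
        } else {
            DeltaStatus::Skip
        }
    }
}
```

Keeping the decision in a `&self` method lets the mutating step match on the returned status, which also makes it straightforward to report whether a reset was applied.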

@fulmicoton-dd force-pushed the paul.masurel/only-call-catchup-on-reset branch 2 times, most recently from 1d74f1e to 1e62244, on December 19, 2025 13:04
},
}

impl std::fmt::Display for NodeStatePredicate {

Just a nicer display for the simulator unit test.

@fulmicoton requested a review from guilload on December 19, 2025 13:43
It also prevents applying a gRPC gossip when the new state
is detected as not being an update.

It also adds a bunch of defensive asserts.

This changes the condition for the catchup callback to be called.

Now it only happens if the delta provided really triggered a catchup.

blop

Added monotonic invariant. Simplified delta acceptance logic
@fulmicoton-dd force-pushed the paul.masurel/only-call-catchup-on-reset branch from 1e62244 to adca52c on December 19, 2025 13:57