feature(new_dc): Add new node with zero node dc #9230
Conversation
@aleksbykov, can you check the unit test errors and resolve the conflicts?
user_prefix: 'add-new-dc'
n_db_zero_token_nodes: '0 0 0'
zero_token_instance_type_db: 'i4i.large'
I remember we wanted to verify that zero-token nodes can be run on weak instances. I don't know if someone did it. If not, I think we could do an experiment here: try to run this new test with the weakest possible instance for the zero-token node and see what happens. If the test passes, we can change the instance to the weakest one in this test. If it fails, then we should investigate it.
Zero-token nodes should work on basically any machine if their only job is participating in the Raft group 0, which is the case in this test. Zero-token nodes could also be used for something else like being proxy nodes for client requests or Alternator load balancers (these are the probable future use cases). In these use cases, the low storage still shouldn't be an issue, but weak CPU could become an issue depending on the traffic.
Note that I know almost nothing about stuff like instances, so take this comment with a pinch of salt.
A lower instance type was checked in other tests. It could also be configured here.
If it was checked in other tests, then we don't have to do it here, but we can. Your choice.
Add new feature tests with Arbiter DC and zero nodes

The test checks that in a symmetric multi-DC configuration (DC1: 3, DC2: 3) Raft quorum is lost if all nodes in either DC die (stopped, killed, etc.), and that after adding a new arbiter DC containing only a zero-token node, Raft quorum is preserved when one DC dies, which allows the failed DC to be restored with topology operations.

With the limited voters feature, Raft voters are spread across the cluster and the configuration can be asymmetric: for example, DC1 may have 2 voters while DC2 has 3. In that case, if DC1 goes down, the Raft quorum is still present, so the test kills the DC with the largest number of voters to actually lose quorum.

The test stops Scylla on the nodes with the SIGSTOP signal; this is required because the nodes must be stopped very fast, so Raft cannot reassign the voters to the alive nodes.
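For illustration, a minimal sketch of how such a fast stop could be driven from SCT, assuming the usual node.remoter.sudo API; the helper names and the exact pkill invocation are illustrative, not the PR's actual code:

```python
# Hypothetical sketch: freeze Scylla on all nodes of one DC at once with
# SIGSTOP, so Raft has no time to move voters to the surviving nodes.
from concurrent.futures import ThreadPoolExecutor


def freeze_scylla(node):
    # SIGSTOP suspends the scylla process immediately, with no clean shutdown
    node.remoter.sudo("pkill --signal SIGSTOP scylla", ignore_status=True)


def freeze_dc(nodes):
    # stop every node of the DC in parallel so quorum is lost "atomically"
    with ThreadPoolExecutor(max_workers=len(nodes)) as executor:
        list(executor.map(freeze_scylla, nodes))
```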
81a4bec
to
8c5a2ba
Compare
@patjed41, can you take a look again? I fixed the code according to your comments.
Pull Request Overview
This PR introduces support for adding a new DC with zero nodes to preserve raft quorum in multidc clusters and adjusts related configurations, tests, and Jenkins pipelines. Key changes include:
- Adding a new configuration parameter (ignore_dead_nodes_for_replace) in YAML files.
- Implementing and raising a new ReadBarrierErrorException in raft utilities (see the sketch after this list).
- Extending the test suite and associated Jenkins pipelines to verify quorum preservation with arbiter and zero-node DC scenarios.
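For context, the new ReadBarrierErrorException mentioned above is plausibly just a thin exception type; a minimal sketch (the actual definition in sdcm/exceptions.py may differ):

```python
# Sketch only; the PR's actual class in sdcm/exceptions.py may differ.
class ReadBarrierErrorException(Exception):
    """Raised when triggering a read barrier via the REST API fails."""
```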
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
File | Description
---|---
unit_tests/test_scylla_yaml.py | Added new YAML field for dead node handling.
test-cases/features/restore-quorum-with-zero-nodes.yaml | Introduced test configuration for restoring quorum with zero-node DC.
sdcm/utils/raft/__init__.py | Updated to raise a new exception on read barrier failure.
sdcm/provision/scylla_yaml/scylla_yaml.py | Added configuration option for ignoring dead nodes.
sdcm/exceptions.py | Introduced new ReadBarrierErrorException class.
sdcm/cluster.py | Enhanced cluster node functions and introduced host-id mapping.
raft_quorum_with_arbiterdc_test.py | Added tests to validate quorum loss/restoration with arbiter and zero-node scenarios.
jenkins-pipelines/oss/features/*.jenkinsfile | New pipelines for quorum preservation tests.
configurations/zerotoken_nodes/multidc_config_with_empty_arbiter_dc.yaml | New configuration file for multidc with an arbiter DC.
Comments suppressed due to low confidence (1)
raft_quorum_with_arbiterdc_test.py:88
- [nitpick] Consider renaming 'arbitor_dc_node' to 'arbiter_dc_node' for consistent and clear variable naming.
arbitor_dc_node = self.add_zero_node_to_dc(dc_idx=arbiter_dcx)
LOGGER.error("Trigger read-barrier via rest api failed %s", exc)
raise ReadBarrierErrorException("Read Barrier call failed %s", exc)
Consider formatting the error message properly (e.g., using an f-string) to ensure the exception details are correctly included in the message.
- LOGGER.error("Trigger read-barrier via rest api failed %s", exc)
- raise ReadBarrierErrorException("Read Barrier call failed %s", exc)
+ LOGGER.error(f"Trigger read-barrier via rest api failed: {exc}")
+ raise ReadBarrierErrorException(f"Read Barrier call failed: {exc}")
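A possible refinement on top of this suggestion (a sketch, reusing the names from the diff): keep lazy %-formatting for the log call and chain the original exception so its traceback is preserved:

```python
# Sketch: lazy %-formatting for logging, f-string plus chaining for the exception
LOGGER.error("Trigger read-barrier via rest api failed: %s", exc)
raise ReadBarrierErrorException(f"Read Barrier call failed: {exc}") from exc
```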
for node in data_nodes_per_region[dead_region]:
    node.run_nodetool(sub_cmd=f"rebuild -- {region_dc_mapping[alive_region]}", publish_event=True)

InfoEvent("Reepair data on new nodes").publish()
Correct the spelling of 'Reepair' to 'Repair' in the log message.
InfoEvent("Reepair data on new nodes").publish() | |
InfoEvent("Repair data on new nodes").publish() |
InfoEvent("Assert that quorum is lost").publish()
verification_node = random.choice(alive_nodes)
InfoEvent("Validate raft quorom is lost").publish()
Correct the spelling of 'quorom' to 'quorum' in the log message.
InfoEvent("Validate raft quorom is lost").publish() | |
InfoEvent("Validate raft quorum is lost").publish() |
class TestClusterQuorum(LongevityTest):
    """Test for procedure:
    TODO: link to 'Preventing Quorum Loss in Symmetrical Multi-DC Clusters'
dead_nodes, alive_nodes = self.db_cluster.data_nodes[:
    lost_quorum_num], self.db_cluster.data_nodes[lost_quorum_num:]
Indentation doesn't look right
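One way to sidestep the awkward wrapping, as a sketch that reuses the names from the diff above (same behaviour, just split into separate statements):

```python
# Sketch: fetch the node list once, then split it into two readable assignments
data_nodes = self.db_cluster.data_nodes
dead_nodes = data_nodes[:lost_quorum_num]
alive_nodes = data_nodes[lost_quorum_num:]
```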
# Stop half of nodes in dc and check that raft quorum is lost
num_of_data_nodes = self.params.get("n_db_nodes")
lost_quorum_num = num_of_data_nodes // 2 if num_of_data_nodes % 2 == 0 else (num_of_data_nodes // 2) + 1
With limited voters, stopping half of the nodes is not always enough.
The number of voters is min(5, number of nodes), and we plan to ensure that the number of voters is always odd in the future (scylladb/scylladb#23266). In particular, a cluster of 4 nodes will have 3 voters. So, if you kill only 2 nodes, you won't lose the majority if one of the killed nodes is a non-voter.
I'm not sure if this test makes much sense after adding the limited voters feature. We can keep it, but we should adjust it in the following way (a rough sketch follows below):
- find voters,
- stop half of the voters (it's important to stop them all at once, so no new nodes become voters in the meantime),
- check that we lost majority,
- restart stopped nodes,
- add a zero-token node,
- kill the same number of voters,
- check that we didn't lose majority.
Also note that this will work only if the initial number of data nodes is 2 or 4. If the number of voters is greater than 5 or odd, adding a zero-token node won't increase the number of voters (once we fix scylladb/scylladb#23266). It's also possible to make this test work if the number of data nodes is 1 or 3, but then we need 2 zero-token nodes. So, we need to specify which cases we support here and add proper assertions.
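A rough Python sketch of that adjusted flow; every method used on the cluster object below is a hypothetical placeholder, not SCT's real API:

```python
def check_quorum_with_zero_token_node(cluster):
    """Sketch of the adjusted scenario; all cluster methods are hypothetical."""
    voters = cluster.find_group0_voters()
    # stop half of the voters, rounded up, all at once so Raft cannot
    # promote non-voters in the meantime
    stopped = voters[:(len(voters) + 1) // 2]
    cluster.stop_nodes_in_parallel(stopped)
    assert not cluster.has_group0_majority(), "majority should be lost"

    cluster.start_nodes(stopped)        # bring the stopped voters back
    cluster.add_zero_token_node()       # should add one more voter (up to 5)

    new_voters = cluster.find_group0_voters()
    cluster.stop_nodes_in_parallel(new_voters[:len(stopped)])
    assert cluster.has_group0_majority(), "majority should now survive"
```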
# Add new dc with zero node only
InfoEvent("Add new zero node").publish()
zero_token_node = self.add_zero_node_to_dc()
The comment says "Add new dc", but it looks like the code adds a node to the same DC (which is fine).
@@ -3380,12 +3381,22 @@ def get_ip_to_node_map(self) -> dict[str, BaseNode]:
        """returns {ip: node} map for all nodes in cluster to get node by ip"""
        return {ip: node for node in self.nodes for ip in node.get_all_ip_addresses()}

    def get_hostid_to_node_map(self) -> dict[str, BaseNode]:
This function seems unused
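For reference, the hunk above truncates the body of the new function; a plausible implementation mirroring get_ip_to_node_map would look like the sketch below (assuming every node exposes a host_id attribute; the PR's actual body is not shown here):

```python
def get_hostid_to_node_map(self) -> dict[str, BaseNode]:
    """returns {host_id: node} map for all nodes in cluster to get node by host id"""
    # Sketch only: assumes node.host_id exists; the real body may differ.
    return {node.host_id: node for node in self.nodes}
```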
Since test_raft_quorum_saved_after_add_arbiter_dc and test_raft_quorum_saved_after_add_zeronode_to_single_dc implement unique scenarios/tests in SCT, I would like to see more docstrings, such as:
- a scenario overview, with a description of each step and its expected result (validation),
- docstrings for all features in the raft_quorum_with_arbiterdc_test.py file,
- a step number and expected result (validation) in each InfoEvent that matches the docstring from item 1 (see the sketch after the quoted docstring below).
Test checks that in a multi-DC symmetric configuration
(DC1: 3, DC2: 3) we lose Raft quorum if all nodes in any DC
die (stopped, killed, etc.), and that adding a new DC with only
a zero-token node allows us to preserve Raft quorum and restore
the failed DC with topology operations.
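As a sketch of the requested structure, a docstring like the one below could be paired with matching step numbers in the InfoEvent messages; the step wording is illustrative, not the test's actual flow:

```python
def test_raft_quorum_saved_after_add_arbiter_dc(self):
    """Verify that an arbiter DC with a single zero-token node preserves Raft quorum.

    Steps (illustrative):
    1. Start a symmetric multi-DC cluster (DC1: 3 nodes, DC2: 3 nodes).
       Expected: cluster is healthy and group 0 has a majority of voters.
    2. Stop (SIGSTOP) all nodes of the DC holding the larger number of voters.
       Expected: group 0 majority is lost and topology operations fail.
    3. Restart the stopped nodes and add an arbiter DC with one zero-token node.
       Expected: the cluster recovers and the new node joins group 0.
    4. Stop the same DC again.
       Expected: group 0 majority is preserved and the failed DC can be
       rebuilt with topology operations.
    """
```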
Testing

PR pre-checks (self review)
- backport labels

Reminders
- sdcm/sct_config.py
- unit-test/ folder