Joining a node after committing a plan - transfers freeze & cluster state is stuck #996

@martinsumner

To replicate:

  • Start six nodes
  • Join three nodes to the first, but not nodes 5 and 6
  • Plan and commit the cluster changes
  • Attempt to join nodes 5 and 6 (i.e. without planning, just riak admin cluster join)
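The steps above can be sketched as a small script. This is only an illustration, not the exact commands used in the report. It assumes a devrel-style layout with nodes under dev/dev1 to dev/dev6, that `riak start` is how each node is started, and that `FIRST_NODE` holds the Erlang node name of the first node (the default value here is a guess; set it to match your environment). The helper names `run`, `riak_cmd` and `repro` are hypothetical. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Assumptions (not taken from the report):
#   - nodes live under dev/dev1 .. dev/dev6
#   - FIRST_NODE is the node name of the first node (default is a guess)
#   - DRY_RUN=1 (the default) prints commands instead of running them
DRY_RUN="${DRY_RUN:-1}"
FIRST_NODE="${FIRST_NODE:-dev1@127.0.0.1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# riak_cmd <node index> <args...>: call that node's riak script.
riak_cmd() {
    idx="$1"
    shift
    run "dev/dev$idx/riak/bin/riak" "$@"
}

repro() {
    # Start six nodes.
    for n in 1 2 3 4 5 6; do riak_cmd "$n" start; done
    # Join three nodes to the first, but not nodes 5 and 6.
    for n in 2 3 4; do riak_cmd "$n" admin cluster join "$FIRST_NODE"; done
    # Plan and commit the cluster changes.
    riak_cmd 4 admin cluster plan
    riak_cmd 4 admin cluster commit
    # Join nodes 5 and 6 while transfers are in progress, without planning.
    for n in 5 6; do riak_cmd "$n" admin cluster join "$FIRST_NODE"; done
    # Inspect the resulting state.
    riak_cmd 4 admin cluster status
    riak_cmd 4 admin transfers
}

repro
```

Running it as is shows the command sequence for review before pointing it at a real test cluster.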

Transfers stop at the point the additional nodes join, and the cluster remains stuck in that state:

$ dev/dev4/riak/bin/riak admin cluster plan
=============================== Staged Changes ================================
Action         Details(s)
-------------------------------------------------------------------------------
join           '[email protected]'
join           '[email protected]'
join           '[email protected]'
-------------------------------------------------------------------------------


NOTE: Applying these changes will result in 1 cluster transition

###############################################################################
                         After cluster transition 1/1
###############################################################################

================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid     100.0%     25.0%    [email protected]
valid       0.0%     25.0%    [email protected]
valid       0.0%     25.0%    [email protected]
valid       0.0%     25.0%    [email protected]
-------------------------------------------------------------------------------
Valid:4 / Leaving:0 / Exiting:0 / Joining:0 / Down:0

Transfers resulting from cluster changes: 48
  16 transfers from '[email protected]' to '[email protected]'
  16 transfers from '[email protected]' to '[email protected]'
  16 transfers from '[email protected]' to '[email protected]'

$ dev/dev4/riak/bin/riak admin cluster commit
Cluster changes committed
$ dev/dev4/riak/bin/riak admin cluster status
---- Cluster Status ----
Ring ready: true

+--------------------+------+-------+-----+-------+
|        node        |status| avail |ring |pending|
+--------------------+------+-------+-----+-------+
| (C) [email protected] |valid |  up   |100.0|  25.0 |
|     [email protected] |valid |  up   |  0.0|  25.0 |
|     [email protected] |valid |  up   |  0.0|  25.0 |
|     [email protected] |valid |  up   |  0.0|  25.0 |
+--------------------+------+-------+-----+-------+

Key: (C) = Claimant; availability marked with '!' is unexpected
$ dev/dev5/riak/bin/riak admin cluster join [email protected]
Success: staged join request for '[email protected]' to '[email protected]'
$ dev/dev6/riak/bin/riak admin cluster join [email protected]
Success: staged join request for '[email protected]' to '[email protected]'
$ dev/dev4/riak/bin/riak admin cluster status
---- Cluster Status ----
Ring ready: true

+--------------------+-------+-------+-----+-------+
|        node        |status | avail |ring |pending|
+--------------------+-------+-------+-----+-------+
|     [email protected] |joining|  up   |  0.0|   0.0 |
|     [email protected] |joining|  up   |  0.0|   0.0 |
| (C) [email protected] | valid |  up   | 71.9|  25.0 |
|     [email protected] | valid |  up   |  9.4|  25.0 |
|     [email protected] | valid |  up   | 10.9|  25.0 |
|     [email protected] | valid |  up   |  7.8|  25.0 |
+--------------------+-------+-------+-----+-------+

Key: (C) = Claimant; availability marked with '!' is unexpected
$ dev/dev4/riak/bin/riak admin cluster status
---- Cluster Status ----
Ring ready: false

+--------------------+-------+-------+-----+-------+
|        node        |status | avail |ring |pending|
+--------------------+-------+-------+-----+-------+
|     [email protected] |joining|  up   |  0.0|   0.0 |
|     [email protected] |joining|  up   |  0.0|   0.0 |
| (C) [email protected] | valid |  up   | 71.9|  25.0 |
|     [email protected] | valid |  up   |  9.4|  25.0 |
|     [email protected] | valid |  up   | 10.9|  25.0 |
|     [email protected] | valid |  up   |  7.8|  25.0 |
+--------------------+-------+-------+-----+-------+

Key: (C) = Claimant; availability marked with '!' is unexpected


$ dev/dev4/riak/bin/riak admin cluster status
---- Cluster Status ----
Ring ready: true

+--------------------+-------+-------+-----+-------+
|        node        |status | avail |ring |pending|
+--------------------+-------+-------+-----+-------+
|     [email protected] |joining|  up   |  0.0|   0.0 |
|     [email protected] |joining|  up   |  0.0|   0.0 |
| (C) [email protected] | valid |  up   | 71.9|  25.0 |
|     [email protected] | valid |  up   |  9.4|  25.0 |
|     [email protected] | valid |  up   | 10.9|  25.0 |
|     [email protected] | valid |  up   |  7.8|  25.0 |
+--------------------+-------+-------+-----+-------+

Key: (C) = Claimant; availability marked with '!' is unexpected
$ dev/dev4/riak/bin/riak admin cluster status
---- Cluster Status ----
Ring ready: true

+--------------------+-------+-------+-----+-------+
|        node        |status | avail |ring |pending|
+--------------------+-------+-------+-----+-------+
|     [email protected] |joining|  up   |  0.0|   0.0 |
|     [email protected] |joining|  up   |  0.0|   0.0 |
| (C) [email protected] | valid |  up   | 71.9|  25.0 |
|     [email protected] | valid |  up   |  9.4|  25.0 |
|     [email protected] | valid |  up   | 10.9|  25.0 |
|     [email protected] | valid |  up   |  7.8|  25.0 |
+--------------------+-------+-------+-----+-------+

Key: (C) = Claimant; availability marked with '!' is unexpected
$ dev/dev4/riak/bin/riak admin transfers
'[email protected]' waiting to handoff 30 partitions
'[email protected]' waiting to handoff 30 partitions
'[email protected]' waiting to handoff 27 partitions
'[email protected]' waiting to handoff 30 partitions
'[email protected]' waiting to handoff 11 partitions

Active Transfers:


$ dev/dev4/riak/bin/riak admin transfers
'[email protected]' waiting to handoff 30 partitions
'[email protected]' waiting to handoff 30 partitions
'[email protected]' waiting to handoff 27 partitions
'[email protected]' waiting to handoff 30 partitions
'[email protected]' waiting to handoff 11 partitions

Active Transfers:
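
One way to confirm that handoffs are stalled rather than slow is to total the pending counts from two `riak admin transfers` snapshots and compare them. The following is a rough sketch that relies only on the "waiting to handoff N partitions" line format shown above; the function name `pending_handoffs` and the 60-second interval are arbitrary choices for illustration.

```shell
# Sum the "waiting to handoff N partitions" counts read from stdin.
# Relies on the line format shown in the output above; prints 0 if no
# such lines are present.
pending_handoffs() {
    awk '/waiting to handoff/ {
             for (i = 1; i < NF; i++)
                 if ($i == "handoff") total += $(i + 1)
         }
         END { print total + 0 }'
}

# Example usage (not run here): compare two snapshots taken a minute apart.
#   before="$(dev/dev4/riak/bin/riak admin transfers | pending_handoffs)"
#   sleep 60
#   after="$(dev/dev4/riak/bin/riak admin transfers | pending_handoffs)"
#   [ "$before" -eq "$after" ] && echo "no handoff progress in 60s"
```

An unchanged total with nothing listed under Active Transfers matches the stuck state described in this report.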

