Removing and then (re-)adding a member may crash #528

@mkuratczyk

Describe the bug

I had several quorum queues under load and was repeatedly removing and then re-adding followers (using shrink/grow). The shrink/grow always targeted rabbit-3, so the leader was not disturbed and another follower was running at all times. Occasionally, the re-added member would crash with:

     {ra_mt,commit,1,[{file,"src/ra_mt.erl"},{line,158}]},
     {ra_log,wal_write_batch,2,[{file,"src/ra_log.erl"},{line,1220}]},
     {ra_server,handle_follower,2,[{file,"src/ra_server.erl"},{line,1258}]},
     {ra_server_proc,handle_follower,2,
                     [{file,"src/ra_server_proc.erl"},{line,1188}]},
     {ra_server_proc,follower,3,[{file,"src/ra_server_proc.erl"},{line,845}]},
     {gen_statem,loop_state_callback,11,[{file,"gen_statem.erl"},{line,3735}]},
     {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,329}]}]
** Time-outs: {1,[{{timeout,tick},tick_timeout}]}

Full logs (grep for <0.3686.0> to get the most interesting bit):
logs.tgz

Reproduction steps

I had a quorum queue under load with large messages and a loop like this:

rabbitmq-queues -n rabbit-3 shrink rabbit-3@localhost
sleep 10
rabbitmq-queues -n rabbit-3 grow rabbit-3@localhost all
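
For anyone trying to reproduce this, the steps above could be wrapped in a loop roughly like the sketch below. The iteration count and the `QUEUES_CMD`, `NODE`, `TARGET`, `ITERATIONS` and `PAUSE` variables are assumptions added for illustration; only the two `rabbitmq-queues` invocations and the 10 second pause come from the report.

```shell
#!/bin/sh
# Sketch of the shrink/grow loop from the report.
# Hypothetical knobs (not from the report): QUEUES_CMD, ITERATIONS, PAUSE.
NODE="${NODE:-rabbit-3}"
TARGET="${TARGET:-rabbit-3@localhost}"
ITERATIONS="${ITERATIONS:-100}"
PAUSE="${PAUSE:-10}"
QUEUES_CMD="${QUEUES_CMD:-rabbitmq-queues}"

shrink_grow_loop() {
  i=1
  while [ "$i" -le "$ITERATIONS" ]; do
    # Remove the quorum queue members hosted on the target node.
    "$QUEUES_CMD" -n "$NODE" shrink "$TARGET" || return 1
    sleep "$PAUSE"
    # Re-add a member on the target node for all matching queues.
    "$QUEUES_CMD" -n "$NODE" grow "$TARGET" all || return 1
    i=$((i + 1))
  done
}
```

Calling `shrink_grow_loop` while the queues are under load with large messages should exercise the same path. The crash was only occasional, so several iterations may be needed.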

Expected behavior

No crash.

Additional context

No response
