[Questions] RabbitMQ broker/node connection draining instantly closes all connections without grace period. #14574
Community Support Policy
RabbitMQ version used: 4.0.9
Erlang version used: 27.3.x
Operating system (distribution) used: debian-12-r1
How is RabbitMQ deployed? Community Docker image
rabbitmq-diagnostics status output: See https://www.rabbitmq.com/docs/cli to learn how to use rabbitmq-diagnostics
Logs from node 1 (with sensitive values edited out): See https://www.rabbitmq.com/docs/logging to learn how to collect logs
Logs from node 2 (if applicable, with sensitive values edited out): See https://www.rabbitmq.com/docs/logging to learn how to collect logs
Logs from node 3 (if applicable, with sensitive values edited out): See https://www.rabbitmq.com/docs/logging to learn how to collect logs
rabbitmq.conf: See https://www.rabbitmq.com/docs/configure#config-location to learn how to find rabbitmq.conf file location
Steps to deploy RabbitMQ cluster: not applicable
Steps to reproduce the behavior in question:
advanced.config: See https://www.rabbitmq.com/docs/configure#config-location to learn how to find advanced.config file location
Application code
(I'm using the Java client 5.25.0)
These are the logs from the custom Java application/client side.
Kubernetes deployment file:
# Relevant parts of K8S deployment that demonstrate how RabbitMQ is deployed
# PASTE YAML HERE, BETWEEN BACKTICKS

What problem are you trying to solve?
My assumption when it comes to connection draining and in general when upgrading server nodes in a clustered scenario is the following:
Looking at the logs from the server and at what I can see in our application, Step 3 (waiting for the clients to reconnect) basically does not exist. Once the consumer has been cancelled, the connection is forcefully closed with a TCP RST almost instantly. My questions:
Replies: 1 comment 1 reply
@oskar-wicht "connection draining" is not a term used anywhere in our docs. I assume you mean Maintenance mode.

None of the protocols RabbitMQ supports has a provision for such "shutdown notifications". A connection to a node can fail at any moment, which is why explicit confirmations for both consumers and (less commonly) publishers exist in every protocol. All outstanding deliveries are automatically requeued.

If a node is stopped, the assumption is that a client reconnects to another node. So what such a "shutdown advisory" would really add is less than obvious to me.

In any case, you are welcome to go ahead and try to contribute a solution to
What you will find out, as our team did many years ago, is that your client will need local on-disk storage, which is not always an option in this day and age (many applications get something like 50 MiB of disk space, if not less). So such an "accumulating publisher" will easily run out of memory or disk space, and may not even have a reasonable amount of disk space to begin with. You can develop a specialized library with certain assumptions about the deployment environment (it will be a fair amount of effort), but not a general solution in a client library for everyone to use.

That said, if someone wants to investigate what a solution might look like even in just one client (out of dozens for AMQP 0-9-1 and AMQP 1.0 alone), they are welcome to do it. Asking "should there be…" is not how change happens in open source software.
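To make the "accumulating publisher" trade-off concrete, here is a minimal sketch in plain Java of a buffer that holds messages while the connection is down. It is not part of the RabbitMQ Java client: the class name `BoundedPublishBuffer`, its methods, and the byte budget are all hypothetical and chosen only for illustration. The point it shows is that once the budget is used up, the application still has to decide what to do with every further message (drop, block, or fail).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not an API of any RabbitMQ client library.
public class BoundedPublishBuffer {
    private final Deque<byte[]> pending = new ArrayDeque<>();
    private final long maxBytes;   // assumed memory budget for buffered messages
    private long usedBytes = 0;

    public BoundedPublishBuffer(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Tries to buffer a message while the broker is unreachable.
    // Returns false when the budget would be exceeded; the caller must then
    // drop the message, block, or surface an error.
    public synchronized boolean offer(byte[] message) {
        if (usedBytes + message.length > maxBytes) {
            return false;
        }
        pending.addLast(message);
        usedBytes += message.length;
        return true;
    }

    // Removes the oldest buffered message (to republish after reconnecting),
    // or returns null if nothing is buffered.
    public synchronized byte[] poll() {
        byte[] message = pending.pollFirst();
        if (message != null) {
            usedBytes -= message.length;
        }
        return message;
    }

    public synchronized int size() {
        return pending.size();
    }

    public synchronized long usedBytes() {
        return usedBytes;
    }
}
```

With a budget of 10 bytes, a first 6-byte message is accepted and a second one is rejected. A disk-backed variant would hit the same wall, only with a different limit, which is the problem described above.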