Introduce broadcast_pool_size option to allow safe pool size migration #197
Beautiful work. I am happy with everything here, we only need docs. Perhaps we can even convert those diagrams into mermaid diagrams? (and perhaps AI can automate that).
Thanks for taking a look!
@josevalim Here's my attempt to add AI-aided docs with graphs :D
💚 💙 💜 💛 ❤️
👋
We need to increase `Phoenix.PubSub`'s `pool_size`; however, we don't see a way to do this safely (i.e., without losing messages during deployment).

For example, if we change the `pool_size: 1` option to `pool_size: 2`, we will have nodes with both settings running in the cluster during the deployment. Then, if a message is broadcast from a node running `pool_size: 2`, it can be sent to shard number 2 via `pg`. If a node running `pool_size: 1` receives it, it won't be delivered to its subscribers.

This draft PR attempts to address this issue by introducing a new option, `broadcast_pool_size` (that defaults to `pool_size` if unset). When set, the pool of shards used for broadcasting messages can be smaller than the pool used for receiving messages and forwarding them to local clients.

The pool size change can then be deployed safely in the following two-step process (assuming we're already running our application with `pool_size: 1`):

1. We deploy a new version with `pool_size: 2` and `broadcast_pool_size: 1`. The new version has two shards participating in `pg`, but still broadcasts messages using only one shard. This way, no messages broadcast from node two will be lost.
2. We deploy a new version with `pool_size: 2`. The new version has two shards that can receive and broadcast messages. The version deployed in step 1 can receive messages broadcast by the new version.

When the deployment from step 2 is complete, all nodes are running pools with the new size.

If we need to decrease the pool size, we follow the same process but in the reverse order.
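The reasoning behind both directions can be checked with a toy model. This is a simplified sketch, not Phoenix.PubSub's implementation: it assumes a sender may pick any shard below its `broadcast_pool_size`, and that a receiver only delivers messages arriving on shards below its own `pool_size`. The module `PoolMigrationModel` and its functions are hypothetical.

```elixir
defmodule PoolMigrationModel do
  # Toy model only. Shards are numbered from 0 here, so "shard number 2"
  # in the description corresponds to shard index 1.

  # A node listens on `pool_size` shards and broadcasts on
  # `broadcast_pool_size` shards (defaulting to pool_size when unset).
  def build(pool_size, broadcast_pool_size \\ nil) do
    %{pool_size: pool_size, broadcast_pool_size: broadcast_pool_size || pool_size}
  end

  # A message sent on `shard` reaches subscribers only if the receiver
  # listens on that shard.
  def delivered?(receiver, shard), do: shard < receiver.pool_size

  # True when every shard the sender may pick is listened to by the receiver.
  def safe_pair?(sender, receiver) do
    Enum.all?(0..(sender.broadcast_pool_size - 1), &delivered?(receiver, &1))
  end

  # A mixed cluster is safe when every sender/receiver pair is safe.
  def safe_cluster?(nodes) do
    Enum.all?(for s <- nodes, r <- nodes, do: safe_pair?(s, r))
  end
end

old = PoolMigrationModel.build(1)       # pool_size: 1
step1 = PoolMigrationModel.build(2, 1)  # pool_size: 2, broadcast_pool_size: 1
new = PoolMigrationModel.build(2)       # pool_size: 2

IO.inspect(PoolMigrationModel.safe_cluster?([old, new]), label: "pool_size 1 mixed with 2")
IO.inspect(PoolMigrationModel.safe_cluster?([old, step1]), label: "pool_size 1 mixed with step 1")
IO.inspect(PoolMigrationModel.safe_cluster?([step1, new]), label: "step 1 mixed with pool_size 2")
```

Under these assumptions only the first check returns `false`. The same two safe mixes cover the decrease: first go from `pool_size: 2` to `pool_size: 2` with `broadcast_pool_size: 1`, then to `pool_size: 1`.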
I'm opening the PR to discuss this mechanism; we can work on the exact naming of the parameters and documenting the above process in the documentation when the approach is validated.