Skip to content

Separate Registry pool size configuration #198

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

studzien
Copy link
Contributor

Under our workload (a few thousand topics, single digit to a few thousand subscribers each, a few thousand broadcasts per second from a single node in the cluster), we have started to experience instability of the PG2Worker processes (arrival rate > processing rate => message queue growing indefinitely).

We initially approached it by increasing the pool size, but this didn't help much.
After some investigation, we found that introducing more parallelism in processing incoming messages didn't help much because the PG2 worker processes did more work than before.

That's because Registry for :duplicate keys is sharded by pids, not by key (topic). So, to get all the subscribers, we need to go through all the partitions and get the topic subscribers from each of them.

Moreover, since the dispatch code is executed in the context of the worker process, the serialization cache is valid for just a single dispatch call.
Even if we fastlane serialization, we might execute it up to the number of partitions.

This PR allows configuring the Registry pool size to a value different from the PubSub pool size.
Under our workload, I got the best results with removing the Registry pool entirely, i.e., setting registry_pool_size: 1.

I'm happy to describe the new option in the documentation, but I'm wondering if there was a particular reason why the :duplicate registries are sharded by pids, not by keys. Perhaps that was done to remove contention on the ETS table for topics with many subscribers? Depending on the use case, I can see that picking a sharding method could be a Registry configuration option.

@SteffenDE SteffenDE requested a review from josevalim June 20, 2025 16:38
@josevalim
Copy link
Member

Perhaps that was done to remove contention on the ETS table for topics with many subscribers? Depending on the use case, I can see that picking a sharding method could be a Registry configuration option.

I believe it was to deal with the cases of very uneven distributions (also known as the Justin Bieber effect), where one topic could have million of entries more than others.

I think we can go ahead with this, perhaps call the option registry_size? Also, do you believe the previous PR is still relevant/necessary?

@studzien
Copy link
Contributor Author

I believe it was to deal with the cases of very uneven distributions (also known as the Justin Bieber effect), where one topic could have million of entries more than others.

Got it, I think it makes perfect sense when the Registry dispatch is executed in parallel (so we should get a shorter delivery time when we have a lot of entries).

I think we can go ahead with this, perhaps call the option registry_size? Also, do you believe the previous PR is still relevant/necessary?

Cool, will rename.
Yeah, the previous PR is orthogonal to this. We still want to increase the pool size without losing messages in the cluster. And the Registry pool size doesn't really matter when delivering messages in the cluster since it's local (and there's no mapping between the PubSub shard and the Registry shard).

@studzien
Copy link
Contributor Author

Hi again @josevalim.
I just renamed the option, added another test, and described it in the docs. Please feel free to adjust to what you prefer.
I'm not happy with how the new tests look (as they rely on implementation details), but I could not figure out how to make them better.

Do you think it would be reasonable to add an option to Registry for :duplicate keys, such as :partition_by = :keys | :pids (defaulting to :pids in order not to break existing assumptions) so that this can be tuned on the registry side as well?

@josevalim
Copy link
Member

Do you think it would be reasonable to add an option to Registry for :duplicate keys, such as :partition_by = :keys | :pids (defaulting to :pids in order not to break existing assumptions) so that this can be tuned on the registry side as well?

At first I don't see an issue with that, we can give it a try. :)

@josevalim josevalim merged commit fff23f8 into phoenixframework:main Jun 21, 2025
2 checks passed
@josevalim
Copy link
Member

💚 💙 💜 💛 ❤️

@studzien
Copy link
Contributor Author

At first I don't see an issue with that, we can give it a try. :)

Cool, I can prepare something the week after next.
I should also be able to see if there are any benefits, on topics with up to a few dozen thousand subscribers, though, not millions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants