PersistClient worker_map allows for hotspots and possibly out of order data #33

Open
mitchell-es opened this issue Oct 29, 2014 · 0 comments

In persist.py:MultiWorkerQueue:get_worker(), only the oidset_name and device_name are used to determine which persist worker gets a task. This results in hotspots if an oidset has an especially large number of instances.

In this specific case, a circuit testing configuration on a router uses a large number of epipes, creating a hotspot for one particular persist worker, which cannot keep up. Since the work unit is essentially oidset:device, there isn't an obvious way to break up an especially large work unit to spread the load better.
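
To illustrate the hotspot, here is a minimal sketch. It is not the code from persist.py: the function names, oidset names, device names, and instance counts are all hypothetical, and it only assumes what is stated above, namely that the routing key is oidset:device.

```python
from collections import Counter


def worker_key(oidset_name, device_name):
    # Assumption: the routing key is just "oidset:device", as described above.
    return f"{oidset_name}:{device_name}"


def load_per_key(tasks):
    """Sum the instance counts that land on each routing key."""
    load = Counter()
    for oidset_name, device_name, instance_count in tasks:
        load[worker_key(oidset_name, device_name)] += instance_count
    return load


# Hypothetical workload: one oidset on one router has far more instances.
tasks = [
    ("oidset-small", "router-a", 40),
    ("oidset-small", "router-b", 40),
    ("oidset-epipe", "router-a", 5000),
]

load = load_per_key(tasks)
hottest_key, hottest_load = load.most_common(1)[0]
```

Because every instance of a given oidset:device pair shares one key, the 5000 instances here cannot be split across workers regardless of how keys are mapped to workers.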

There may also be a latent bug, since the current code divides the work among the queues in a relatively non-deterministic way. If espolld were restarted without fully draining the queues in memcache (or restarting memcache), individual work units would almost certainly end up in different queues after the restart, possibly resulting in out-of-order writes.
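
One possible way to make the assignment survive a restart is to derive the worker index from a stable hash of the key. The sketch below is illustrative only. `first_seen_assign` is an assumed model of an order-dependent mapping, not a copy of the current code, and `stable_worker` is a hypothetical replacement.

```python
import zlib


def first_seen_assign(keys, num_workers):
    # Assumed model: each new key gets the next worker in arrival order,
    # so the result depends on the order in which keys show up.
    mapping = {}
    for key in keys:
        if key not in mapping:
            mapping[key] = len(mapping) % num_workers
    return mapping


def stable_worker(oidset_name, device_name, num_workers):
    # CRC32 of the key is the same in every process and after every restart.
    key = f"{oidset_name}:{device_name}".encode("utf-8")
    return zlib.crc32(key) % num_workers


keys = ["oidset-a:router-1", "oidset-b:router-1", "oidset-c:router-2"]
before = first_seen_assign(keys, 3)
after = first_seen_assign(list(reversed(keys)), 3)  # different arrival order

w1 = stable_worker("oidset-a", "router-1", 3)
w2 = stable_worker("oidset-a", "router-1", 3)
```

With the order-dependent model, the same key can map to a different worker once the arrival order changes, whereas the hashed index does not. Note that this only addresses the restart problem: it does nothing for the hotspot, and changing the number of workers would still move keys between queues.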
