In persist.py:MultiWorkerQueue:get_worker(), only the oidset_name and device_name are used to determine which persist worker receives a task. This results in hotspots if an oidset has an especially large number of instances.
In this specific case, a circuit-testing configuration on a router uses a large number of epipes, creating a hotspot for one particular persist worker, which cannot keep up. Since the work unit is essentially oidset:device, there isn't an obvious way to break up an especially large work unit to spread the load more evenly.
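One possible way to break up an oversized oidset:device unit would be to include the instance name (e.g. the epipe) in the hash key, so a hot oidset:device fans out across workers while each individual instance still maps to exactly one worker, preserving per-instance write ordering. The function signature and names below are hypothetical, not the current get_worker() code:

```python
import zlib

def get_worker(oidset_name, device_name, instance, num_workers):
    # Hypothetical sketch: hash oidset:device:instance rather than just
    # oidset:device. crc32 is stable across processes, and every datum
    # for a given instance still lands on the same worker, so writes for
    # that instance remain ordered.
    key = "%s:%s:%s" % (oidset_name, device_name, instance)
    return zlib.crc32(key.encode("utf-8")) % num_workers
```

With a hundred epipe instances on one device, this spreads the load across the worker pool instead of pinning it all to a single queue.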
There may also be a potential bug lying in wait, since the current code divides the work amongst queues in a relatively non-deterministic way. If espolld is restarted without entirely draining the queues in memcache (or without restarting memcache), individual work units will almost certainly end up in different queues after the restart, resulting in possible out-of-order writes.
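The restart hazard could be avoided by assigning work units to queues with an explicit, stable hash rather than anything process-dependent (Python's built-in hash(), for instance, varies between processes when hash randomization is enabled). A minimal sketch, assuming a hypothetical stable_worker() helper rather than the existing code:

```python
import hashlib

def stable_worker(key, num_workers):
    # Hypothetical sketch: md5 of the work-unit key gives the same
    # worker index for the same key in every process, so a work unit
    # returns to the same queue after an espolld restart even if
    # memcache was not drained.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_workers
```

As long as the worker count is unchanged across restarts, each oidset:device unit lands back on the queue it occupied before, eliminating that source of out-of-order writes.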