Reuse domains rather than spawning them for each parallel test #457
base: main
Conversation
I removed the dependency on Domainslib in favour of Stdlib mutexes. @jmid mentioned in a discussion that he wasn’t fond of the user having to wrap the test runner in a call to
@jmid also argued that spawning a pair of Domains for each pair of parallel command sequences makes it possible to test them from a clean state. While it’s true that this means an initially empty minor heap, I can’t think of tests where an initially empty minor heap is a better test context.
Force-pushed from f3b69f2 to 30e65c2
(With some advice and help from @fabbing to switch to mutexes!)
Thanks Olivier!
Strange, it should indeed spawn strictly fewer domains.
Normally, no, the
Thanks again for this! 🙏
Well spotted, thanks! There seem to be a lot of failures still; I’ll have a look at it in my spare time at some point…
I think I figured this one out too: the hint was “The opam install workflows will still fail, due to
There are a few, e.g., tests of
I think we misunderstood each other, however: for end-users, and assuming the runtime is bug-free, Domain reuse may be less of a worry.
I see, it makes sense to me too, then, to make domain re-use an option. |
Given that domains are costly to spawn and join, and parallel tests in multicoretests do it many thousands of times, I wanted to look at it more closely. A quick perf profile on `src/array/lin_tests.ml` shows that more than 50 % of the time is spent in `Domain.spawn`. I wanted to try spawning domains only at the start of the test sequence and reusing them. Then I realised that this is exactly what Domainslib provides with domain pools!

This proof-of-concept PR simply replaces `Domain.spawn` with `Domainslib.Task.async` and `Domain.join` with `Domainslib.Task.await`. The approach requires setting up a domain pool and passing it to the tests. I have edited `src/array/{lin,stm}_tests.ml` to reflect this.

The results are rather encouraging: `src/array/lin_tests.ml` is cut down from about 4.5 s to 0.3 s! (With extremely far-off statistical outliers due to bad luck in interleaving search.) However, this is a favourable case, as the test’s commands are very cheap to run. The same comparison on my extensive Dynarray STM parallel test gives 24 s vs 12 s for the version with domain reuse, so merely a 2x speedup.

If depending on Domainslib is an issue, the mechanism can be reimplemented using atomics, mutexes and condition variables easily enough. (A reimplementation would likely squeeze some more performance out by not requiring a work-stealing queue.)
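To illustrate the shape of the replacement, here is a minimal, hypothetical sketch of reusing a Domainslib pool across many repetitions, in the spirit of the PR (the two `async` closures stand in for the pair of parallel command sequences; they are not the multicoretests code):

```ocaml
(* Sketch: set up a pool once, then reuse its domains with async/await
   instead of calling Domain.spawn/Domain.join on every repetition. *)
let () =
  (* Two extra domains, mirroring a pair of parallel command sequences. *)
  let pool = Domainslib.Task.setup_pool ~num_domains:2 () in
  Domainslib.Task.run pool (fun () ->
    for _ = 1 to 1000 do
      (* Hypothetical stand-ins for the two parallel command sequences. *)
      let p1 = Domainslib.Task.async pool (fun () -> 1 + 1) in
      let p2 = Domainslib.Task.async pool (fun () -> 2 + 2) in
      assert (Domainslib.Task.await pool p1 = 2);
      assert (Domainslib.Task.await pool p2 = 4)
    done);
  Domainslib.Task.teardown_pool pool
```

Each iteration reuses the pool’s already-running domains, so the per-repetition cost is a task enqueue rather than a full domain spawn and join.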
It looks like about 20 % of the time is now taken by `Sys.cpu_relax`, so further improvement may be possible still.
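A mutex-and-condition-variable reimplementation of domain reuse could look like the sketch below (a hypothetical `Worker` module, not multicoretests code): one long-lived domain blocks on a condition variable between jobs instead of spinning, which is the kind of design that would avoid busy-waiting in `Sys.cpu_relax`.

```ocaml
(* Sketch of a reusable worker domain built from Stdlib Mutex/Condition
   (OCaml 5). The worker sleeps between jobs rather than spinning. *)
module Worker = struct
  type t = {
    mutex : Mutex.t;
    has_job : Condition.t;              (* signalled when a job arrives *)
    job_done : Condition.t;             (* signalled when a job finishes *)
    mutable job : (unit -> unit) option;
    mutable finished : bool;
    mutable stop : bool;
    mutable dom : unit Domain.t option;
  }

  let rec loop w =
    Mutex.lock w.mutex;
    while Option.is_none w.job && not w.stop do
      Condition.wait w.has_job w.mutex
    done;
    match w.job with
    | Some f ->
        w.job <- None;
        Mutex.unlock w.mutex;
        f ();                           (* run the job outside the lock *)
        Mutex.lock w.mutex;
        w.finished <- true;
        Condition.signal w.job_done;
        Mutex.unlock w.mutex;
        loop w
    | None ->
        Mutex.unlock w.mutex            (* stop was requested *)

  let create () =
    let w = { mutex = Mutex.create (); has_job = Condition.create ();
              job_done = Condition.create (); job = None;
              finished = false; stop = false; dom = None } in
    w.dom <- Some (Domain.spawn (fun () -> loop w));
    w

  (* Submit a job and block until the worker has run it. *)
  let run w f =
    Mutex.lock w.mutex;
    w.finished <- false;
    w.job <- Some f;
    Condition.signal w.has_job;
    while not w.finished do Condition.wait w.job_done w.mutex done;
    Mutex.unlock w.mutex

  let shutdown w =
    Mutex.lock w.mutex;
    w.stop <- true;
    Condition.signal w.has_job;
    Mutex.unlock w.mutex;
    Option.iter Domain.join w.dom
end

let () =
  let w = Worker.create () in
  let total = ref 0 in
  (* The same domain runs every job, amortising the single spawn. *)
  for i = 1 to 1000 do Worker.run w (fun () -> total := !total + i) done;
  Worker.shutdown w;
  assert (!total = 500500)
```

The single `Domain.spawn` in `create` is amortised over all jobs, and `Condition.wait` parks the worker between jobs rather than spinning on `Sys.cpu_relax`.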