-
Notifications
You must be signed in to change notification settings - Fork 17
Description
Since Bluefin is adding concurrency primitives #34, it might be worth considering adding parallel primitives also.
I can implement this, but I'd rather wait for concurrency API to stabilize. In the meantime, my prototype is here.
The idea is to create primitives for programming with threads, that guarantee by construction, that programs written with them are deterministic (or pure, or from now on: parallel). This API would be to concurrent API, what runPureEff is to runEff. That is, sacrificing power for safety.
Main source of inspiration is monad-par which has many examples and very simple API.
The main motivation for this is that Bluefin is an effect system, which programs are supposed to be structured around. Dropping to monad-par for some computations and losing access to every effect is not desirable. Deterministic parallelism by construction, with access to runPureEff goodness, seems attractive. That being said, monad-par doesn't look very popular those days. I'll ask on Haskell Discourse about reasons why (link).
How does it to work?
We will rely on an ability to shield our parallel code from other effects in scope. This can be done as follows (I hope there is no workaround for this).
runNewEff :: (forall freshEs. Eff freshEs r) -> Eff es r
runNewEff _ = undefined
f = runPureEff do
evalState "" \s -> do
put s "you shall not pass"
runNewEff do
get s -- compilation errorThis is important, as we can't distinguish effects and can't trust them to be pure. The very same idea extends to shielding threads from sharing effects.
With our shields up, it is very easy to recreate spawn from monad-par.
spawn
:: (e :> es)
=> Scope sE e
-> Eff forkEs r
-> Eff es (Thread r sE es)(I'm still unsure about get put and IVar abstractions. put has a very nasty restriction that you can use it on given IVar only once, or it throws. I don't know what power do they give over spawn.)
This is sound, our main thread is pure, and our threads are pure, and they share no state and cannot communicate other than by their results. In some sense, our threads are just lazy values!
With just this one simple trick, we can have monad-par in Bluefin!
But I believe we can do more.
Streaming/Channels/Generators?. (idea from here)
If our threads are just pure lazy values of type a, then they can also be pure lazy values of type [a]. So instead of waiting for them to compute the whole list, we can allow them to publish results as they go. Then other threads can access those values as they are created, instead of waiting for the whole thing. This is sound as long as every access to such result is a separate stream, from the very beginning.
This is doable, but more complicated. Roughly, we can accumulate results in a list, and coordinate with the consumer thread's index of the last inserted value.
While streaming is cool, it usually involves IO. This is also fake streaming, as we have to keep the whole list in memory, in case some other thread wishes to access it too. But there might be problems that are easier to express this way. This has to be investigated. Plus, I didn't try to integrate it with Coroutine effect, would be great if it can be done.
Checked exceptions/early return/jump.
Now, this is a banger. The idea is mine (but I'm sure somebody did it already, if it is doable), so I hope I won't feel embarrassed about it, if it doesn't pan out. The problem with exceptions is that they easily create race conditions.
Let's look at this code, it is a pure function.
validateList :: [Int] -> Either String [Int]
validateList xs = runPureEff do
try \e -> do
when (1 `elem` xs) do throw e "1"
when (2 `elem` xs) do throw e "2"
when (3 `elem` xs) do throw e "3"
return xsBut what about this?
validateListPar :: [Int] -> Either String [Int]
validateListPar xs = runPureEff do
try \(e :: Exception String e) -> do
runParallelEx e \par ex -> do
t2 <- pureForkEx par ex \p1 ex1 -> do
when (2 `elem` xs) do pureThrow p1 ex1 "2"
t3 <- pureForkEx par ex \p1 ex1 -> do
when (3 `elem` xs) do pureThrow p1 ex1 "3"
when (1 `elem` xs) do pureThrow par ex "1"
pureGet par t2
pureGet par t3
return xsWith Bluefin exceptions as they are, this is bad. We don't know which thread will throw first.
The idea is as follows. There is a special thread in our parallel computation, let's call it main. Main is special in two ways, every thread originates from it, and it is the thread that returns the final value. All we have to do is to track when a value from a thread that has thrown an exception is requested by main! validateListPar [3,2] will always return Left "2". It is a bit more complicated when threads create other threads, but a simple state machine can do it. I'll have to give it a go in TLA+, but it looks dauntingly doable.
Exceptions in pure code.
monad-par docs don't mention exceptions. It might be a bit of a taboo in pure parallel community :) Those exceptions will not be deterministic, unless we apply checked exceptions trick. But that might be not worth the effort.
Schedulers.
monad-par offers a few schedulers. As parallelism is all about performance, it might be important. I know nothing about schedulers, nor I'm planing to implement any. My plan for this is to cross fingers that GHC runtime will do (and benchmark).
API considerations.
I'll leave those for now. Other that guaranteing that only special effects can cross threads boundaries and slapping NFData here and there, I'm not that concerned.