Adding parallel primitives

Since Bluefin is adding concurrency primitives #34, it might be worth considering adding parallel primitives also.

I can implement this, but I'd rather wait for concurrency API to stabilize. In the meantime, my prototype is [here](https://github.com/jeukshi/bluefin-ki).

The idea is to create primitives for programming with threads, that guarantee by construction, that programs written with them are deterministic (or pure, or from now on: parallel). This API would be to concurrent API, what `runPureEff` is to `runEff`. That is, sacrificing power for safety.

Main source of inspiration is [monad-par](https://hackage.haskell.org/package/monad-par) which has many examples and very simple API. 

The main motivation for this is that Bluefin is an effect system, which programs are supposed to be structured around. Dropping to `monad-par` for some computations and losing access to every effect is not desirable. Deterministic parallelism by construction, with access to `runPureEff` goodness, seems attractive. That being said, `monad-par` doesn't look very popular those days. I'll ask on Haskell Discourse about reasons why ([link](https://discourse.haskell.org/t/thoughts-on-monad-par/11233)).

How does it to work?

We will rely on an ability to shield our parallel code from other effects in scope. This can be done as follows (I hope there is no workaround for this).

```haskell
runNewEff :: (forall freshEs. Eff freshEs r) -> Eff es r
runNewEff _ = undefined

f = runPureEff do
    evalState "" \s -> do
        put s "you shall not pass"
        runNewEff do
            get s -- compilation error
```

This is important, as we can't distinguish effects and can't trust them to be pure. The very same idea extends to shielding threads from sharing effects.

With our shields up, it is very easy to recreate `spawn` from `monad-par`.

```haskell
spawn
    :: (e :> es)
    => Scope sE e
    -> Eff forkEs r
    -> Eff es (Thread r sE es)
```

(I'm still unsure about `get` `put` and `IVar` abstractions. `put` has a very nasty restriction that you can use it on given `IVar` only once, or it throws. I don't know what power do they give over `spawn`.)

This is sound, our main thread is pure, and our threads are pure, and they share no state and cannot communicate other than by their results. In some sense, our threads are just lazy values!

With just this one simple trick, we can have `monad-par` in Bluefin!

But I believe we can do more.

Streaming/Channels/Generators?. ([idea from here](https://github.com/simonmar/monad-par/blob/c712b4f09bdf9ed7e08959962beb55e8f1877e32/abstract-par/Control/Monad/Par/Class.hs#L141))

If our threads are just pure lazy values of type `a`, then they can also be pure lazy values of type `[a]`. So instead of waiting for them to compute the whole list, we can allow them to publish results as they go. Then other threads can access those values as they are created, instead of waiting for the whole thing. This is sound as long as every access to such result is a separate stream, from the very beginning.

This is doable, but more complicated. Roughly, we can accumulate results in a list, and coordinate with the consumer thread's index of the last inserted value.

While streaming is cool, it usually involves IO. This is also fake streaming, as we have to keep the whole list in memory, in case some other thread wishes to access it too. But there might be problems that are easier to express this way. This has to be investigated. Plus, I didn't try to integrate it with Coroutine effect, would be great if it can be done.

Checked exceptions/early return/jump.

Now, this is a banger. The idea is mine (but I'm sure somebody did it already, if it is doable), so I hope I won't feel embarrassed about it, if it doesn't pan out. The problem with exceptions is that they easily create race conditions.

Let's look at this code, it is a pure function.

```haskell
validateList :: [Int] -> Either String [Int]
validateList xs = runPureEff do
    try \e -> do
        when (1 `elem` xs) do throw e "1"
        when (2 `elem` xs) do throw e "2"
        when (3 `elem` xs) do throw e "3"
        return xs
```

But what about this?

```haskell
validateListPar :: [Int] -> Either String [Int]
validateListPar xs = runPureEff do
    try \(e :: Exception String e) -> do
        runParallelEx e \par ex -> do
            t2 <- pureForkEx par ex \p1 ex1 -> do
                when (2 `elem` xs) do pureThrow p1 ex1 "2"
            t3 <- pureForkEx par ex \p1 ex1 -> do
                when (3 `elem` xs) do pureThrow p1 ex1 "3"
            when (1 `elem` xs) do pureThrow par ex "1"
            pureGet par t2
            pureGet par t3
            return xs
```

With Bluefin exceptions as they are, this is bad. We don't know which thread will throw first.

The idea is as follows. There is a special thread in our parallel computation, let's call it main. Main is special in two ways, every thread originates from it, and it is the thread that returns the final value. All we have to do is to track when a value from a thread that has thrown an exception is requested by main! `validateListPar` [3,2] will _always_ return `Left "2"`. It is a bit more complicated when threads create other threads, but a simple state machine can do it. I'll have to give it a go in TLA+, but it looks dauntingly doable.

Exceptions in pure code.

`monad-par` docs don't mention exceptions. It might be a bit of a taboo in pure parallel community :) Those exceptions will not be deterministic, unless we apply checked exceptions trick. But that might be not worth the effort.

Schedulers.

`monad-par` offers a few schedulers. As parallelism is all about performance, it might be important. I know nothing about schedulers, nor I'm planing to implement any. My plan for this is to cross fingers that GHC runtime will do (and benchmark).

API considerations.

I'll leave those for now. Other that guaranteing that only special effects can cross threads boundaries and slapping `NFData` here and there, I'm not that concerned.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Adding parallel primitives #37

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Adding parallel primitives #37

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions