
Releases: louthy/language-ext

Refining the Maybe.MonadIO concept

07 May 18:09
Pre-release

A previous idea, to split the MonadIO trait into two traits (Traits.MonadIO and Maybe.MonadIO), has allowed monad-transformers to pass IO functionality down the transformer-chain, even if the outer layers of the transformer-chain aren't 'IO capable'.

This works as long as the inner monad in the transformer-chain is the IO<A> monad.

There are two distinct types of functionality in the MonadIO trait:

  • IO lifting functionality (via MonadIO.LiftIO)
  • IO unlifting functionality (via MonadIO.ToIO and MonadIO.MapIO)

Problem no.1

It is almost always possible to implement LiftIO, but it is often impossible to implement ToIO (the minimum required unlifting implementation) without breaking composition laws.

Much of the 'IO functionality for free' of MonadIO comes from leveraging ToIO (for example, Repeat, Fork, Local, Await, Bracket, etc.) -- and so if ToIO isn't available and has a default implementation that throws an exception, then Repeat, Fork, Local, Await, Bracket, etc. will also all throw.

This feels wrong to me.

Problem no.2

Because of the implementation hierarchy:

Maybe.MonadIO<M>
      ↓
   Monad<M>
      ↓
  MonadIO<M>

Methods like LiftIO and ToIO, which have default-implementations (that throw) in Maybe.MonadIO<M>, don't have their overridden implementations enforced when someone implements MonadIO<M>. We can just leave LiftIO and ToIO on their defaults, which means inheriting from MonadIO<M> has no implementation guarantees.

Solution

  1. Split MonadIO (and Maybe.MonadIO) into distinct traits:
    • MonadIO and Maybe.MonadIO for lifting functionality (LiftIO)
    • MonadUnliftIO and Maybe.MonadUnliftIO for unlifting functionality (ToIO and MapIO)
    • The thinking here is that when unlifting can't be supported (in types like StateT and OptionT) then we only implement MonadIO
    • but in types where unlifting can be supported we implement both MonadIO and MonadUnliftIO.
  2. In MonadIO and MonadUnliftIO (the non-Maybe versions) we make abstract the methods that previously had default virtual (exception throwing) implementations.
    • That means anyone stating their type supports IO must implement it!
  3. Make all methods in Maybe.MonadIO and Maybe.MonadUnliftIO have the *Maybe suffix (so LiftIOMaybe, ToIOMaybe, etc.)
    • The thinking here is that for monad-transformer 'IO passing' we can still call the Maybe variants, but in the code it's declarative, we can see it might not work.
    • Then in MonadIO and MonadUnliftIO (the non-Maybe versions) we can override LiftIOMaybe, ToIOMaybe, and MapIOMaybe and get them to invoke the bespoke LiftIO, ToIO, and MapIO from MonadIO and MonadUnliftIO.
    • That means all default functionality Repeat, Fork, Local, Await, Bracket, gets routed to the bespoke IO functionality for the type.
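
The routing described in the steps above can be sketched in miniature. This is only an illustration under stated assumptions: IMaybeLift, ILift, NoLift, CanLift, and LiftDemo are hypothetical stand-ins (with a string result instead of K<M, A>), not the language-ext traits, and the code needs C# 11 static abstract/virtual interface members.

```csharp
using System;

// Hypothetical stand-in for a 'Maybe' trait: the default implementation throws,
// and the Maybe suffix tells the caller that it might.
public interface IMaybeLift<M> where M : IMaybeLift<M>
{
    static virtual string LiftMaybe(int value) =>
        throw new NotSupportedException($"{typeof(M).Name} can't lift");
}

// Hypothetical stand-in for the non-Maybe trait: Lift is abstract, so any type
// claiming support must implement it.
public interface ILift<M> : IMaybeLift<M> where M : ILift<M>
{
    static abstract string Lift(int value);

    // Route the Maybe variant to the bespoke implementation.
    static string IMaybeLift<M>.LiftMaybe(int value) => M.Lift(value);
}

// Only opts into the Maybe trait, so it keeps the throwing default.
public sealed class NoLift : IMaybeLift<NoLift> { }

// Opts into the full trait, so the compiler forces an implementation of Lift.
public sealed class CanLift : ILift<CanLift>
{
    public static string Lift(int value) => $"lifted {value}";
}

public static class LiftDemo
{
    // Generic code that only knows about the Maybe trait.
    public static string TryLift<M>(int value) where M : IMaybeLift<M> =>
        M.LiftMaybe(value);
}
```

In this sketch LiftDemo.TryLift<CanLift>(1) reaches CanLift.Lift through the override, while LiftDemo.TryLift<NoLift>(1) still throws, which is the behaviour the Maybe suffix is meant to advertise.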

The implementation hierarchy now looks like this:

   Maybe.MonadIO<M>
         ↓
Maybe.MonadUnliftIO<M>
         ↓
      Monad<M>
         ↓
     MonadIO<M>
         ↓
  MonadUnliftIO<M>

This should (if I've got it right) lead to more type-safe implementations, fewer exceptional errors for IO functionality not implemented, and a slightly clearer implementation path. It's more elegant because we override implementations in MonadIO and MonadUnliftIO, not the Maybe versions. So, it feels more 'intentional'.

For example, this will work, because ReaderT implements MonadUnliftIO and so supports both lifting and unlifting:

    ReaderT<E, IO, A> mx;

    var my = mx.ForkIO();    // compiles

Whereas this won't compile, because StateT can only support lifting (by implementing MonadIO):

    StateT<S, IO, A> mx;

    var my = mx.ForkIO();    // type-constraint error

If you try to implement MonadUnliftIO for StateT you quickly run into the fact that StateT (when run) yields a tuple, which isn't compatible with the singleton value needed for ToIO. The only way to make it work is to drop the yielded state, which breaks composition rules.

Previously, this wasn't visible to the user because it was hidden in default implementations that threw exceptions.
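
To make the dropped-state problem concrete, here is a small sketch that models a state computation as a plain delegate. StateSketch, State<A>, Bind, and ThroughLossyUnlift are hypothetical names for illustration only; this is not how StateT is implemented in language-ext.

```csharp
using System;

public static class StateSketch
{
    // A state computation: takes a state, returns a value and the next state.
    public delegate (A Value, int State) State<A>(int state);

    // Increments the state and returns the new value.
    public static readonly State<int> Increment = s => (s + 1, s + 1);

    // Ordinary monadic bind: threads the updated state into the next step.
    public static State<B> Bind<A, B>(State<A> ma, Func<A, State<B>> f) =>
        s =>
        {
            var (a, s1) = ma(s);
            return f(a)(s1);
        };

    // A lossy 'unlift': squeeze the computation through a single-value effect.
    public static State<A> ThroughLossyUnlift<A>(State<A> ma) =>
        s =>
        {
            // The effect can only hand back one value, so the new state is discarded...
            Func<A> io = () => ma(s).Value;
            // ...and all we can do afterwards is carry on with the old state.
            return (io(), s);
        };
}
```

Binding Increment twice from state 0 finishes with state 2, whereas binding two ThroughLossyUnlift(Increment) steps finishes with state 0: the updates made inside the unlifted computations are lost.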

@micmarsh @hermanda19 if you are able to cast a critical eye on this and let me know what you think, that would be super helpful?

I ended up trying a number of different approaches and my eyes have glazed over somewhat, so treat this release with some caution. I think it's good, but critique and secondary eyes would be helpful! That goes for anyone else interested too.

Thanks in advance 👍

IObservable support in Source and SourceT

30 Apr 21:20

IObservable can now be lifted into Source and SourceT types (via Source.lift, SourceT.lift, and SourceT.liftM).

Source and SourceT now support lifting of the following types:

  • IObservable
  • IEnumerable
  • IAsyncEnumerable
  • System.Threading.Channels.Channel

And, because both Source and SourceT can be converted to Producer and ProducerT (via ToProducer and ToProducerT), all of the above types can therefore also be used in Pipes.

More general support for foldables coming soon

LanguageExt.Streaming + MonadIO + Deriving

23 Apr 20:07

Features:

  • New streaming library
    • Transducers are back
    • Closed streams
      • Pipes
    • Open streams
      • Source
      • SourceT
      • Sink
      • SinkT
      • Conduit
      • ConduitT
    • Open to closed streams
  • Deprecated Pipes library
  • MonadIO
  • Deriving
  • Bug fixes

New streaming library

A seemingly innocuous bug in the StreamT type opened up a rabbit hole of problems that needed a fundamental rewrite to fix. In the process more and more thoughts came to my mind about bringing the streaming functionality under one roof. So, now, there's a new language-ext library LanguageExt.Streaming and the LanguageExt.Pipes library has been deprecated.

This is the structure of the Streaming library:

(image: diagram of the Streaming library structure)

Transducers are back

Transducers were going to be the big feature of v5 before I worked out the new trait-system. They were going to be too much effort to bring in + all of the traits, but now with the new streaming functionality they are hella useful again. So, I've re-added Transducer and a new TransducerM (which can work with lifted types). Right now the functionality is relatively limited, but you can extend the set of transducers as much as you like by deriving new types from Transducer and TransducerM.

Documentation

The API documentation has some introductory information on the streaming functionality. It's a little light at the moment because I wanted to get the release done, but it's still useful to look at:


The Streaming library of language-ext is all about compositional streams. There are two key types of streaming
functionality: closed-streams and open-streams...

Closed streams

Closed streams are facilitated by the Pipes system. The types in the Pipes system are compositional monad-transformers
that 'fuse' together to produce an EffectT<M, A>. This effect is a closed system, meaning that there is no way
(from the API) to directly interact with the effect from the outside: it can be executed and will return a result if it terminates.

The pipeline components are:

  • ProducerT<OUT, M, A>
  • PipeT<IN, OUT, M, A>
  • ConsumerT<IN, M, A>

These are the components that fuse together (using the | operator) to make an EffectT<M, A>. The
types are monad-transformers that support lifting monads with the MonadIO trait only (which constrains M). This
makes sense, otherwise the closed-system would have no effect other than heating up the CPU.

There are also more specialised versions of the above that only support the lifting of the Eff<RT, A> effect-monad:

  • Producer<RT, OUT, A>
  • Pipe<RT, IN, OUT, A>
  • Consumer<RT, IN, A>

They all fuse together into an Effect<RT, A>

Pipes are especially useful if you want to build reusable streaming components that you can glue together ad infinitum.
Pipes are, arguably, less useful for day-to-day stream processing, like handling events, but your mileage may vary.

More details on the Pipes page.

Open streams

Open streams are closer to what most C# devs have used classically. They are like events or IObservable streams.
They yield values and (under certain circumstances) accept inputs.

  • Source and SourceT yield values synchronously or asynchronously depending on their construction. Can support multiple readers.
  • Sink and SinkT receive values and propagate them through the channel they're attached to. Can support multiple writers.
  • Conduit and ConduitT provide an input transducer (acts like a Sink), an internal buffer, and an output transducer (acts like a Source). Supports multiple writers and one reader. But can yield a Source/SourceT that allows for multiple readers.

I'm calling these 'open streams' because we can Post values to a Sink/SinkT and we can Reduce values yielded by
Source/SourceT. So, they are 'open' for public manipulation, unlike Pipes which fuse the public access away.

Source

Source<A> is the 'classic stream': you can lift any of the following types into it: System.Threading.Channels.Channel<A>,
IEnumerable<A>, IAsyncEnumerable<A>, or singleton values. To process a stream, you need to use one of the Reduce
or ReduceAsync variants. These take Reducer delegates as arguments. They are essentially a fold over the stream of
values, which results in an aggregated state once the stream has completed. These reducers can be seen to play a similar
role to Subscribe in IObservable streams, but are more principled because they return a value (which we can leverage
to carry state for the duration of the stream).

Source also supports some built-in reducers:

  • Last - aggregates no state, simply returns the last item yielded
  • Iter - this forces evaluation of the stream, aggregating no state, and ignoring all yielded values.
  • Collect - adds all yielded values to a Seq<A>, which is then returned upon stream completion.
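
The idea that a reducer is a fold that returns state can be sketched with plain IEnumerable. The Reducer delegate and the ReducerSketch class below are simplified, hypothetical shapes for illustration; the real language-ext reducer signatures may well differ.

```csharp
using System;
using System.Collections.Generic;

public static class ReducerSketch
{
    // Hypothetical reducer shape: fold one yielded value into the running state.
    public delegate S Reducer<A, S>(S state, A value);

    // Runs the reducer over every value and returns the final aggregated state.
    public static S Reduce<A, S>(IEnumerable<A> stream, S state, Reducer<A, S> reducer)
    {
        foreach (var value in stream) state = reducer(state, value);
        return state;
    }

    // Collect-style reducer: accumulate every yielded value.
    public static List<A> Collect<A>(IEnumerable<A> stream) =>
        Reduce(stream, new List<A>(), (acc, x) => { acc.Add(x); return acc; });

    // Last-style reducer: keep no history, just the most recent value
    // (the seed is returned when the stream is empty).
    public static A Last<A>(IEnumerable<A> stream, A seed) =>
        Reduce(stream, seed, (_, x) => x);
}
```

Unlike a Subscribe callback, each call returns the state that the next call receives, so something like ReducerSketch.Reduce(new[] { 1, 2, 3 }, 0, (s, x) => s + x) carries a running total without any mutable variable outside the fold.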

SourceT

SourceT<M, A> is the classic-stream embellished - it turns the stream into a monad-transformer that can
lift any MonadIO-enabled monad (M), allowing side effects to be embedded into the stream in a principled way.

So, for example, to use the IO<A> monad with SourceT, simply use: SourceT<IO, A>. Then you can use one of the
following static methods on the SourceT type to lift IO<A> effects into a stream:

  • SourceT.liftM(IO<A> effect) creates a singleton-stream
  • SourceT.foreverM(IO<A> effect) creates an infinite stream, repeating the same effect over and over
  • SourceT.liftM(Channel<IO<A>> channel) lifts a System.Threading.Channels.Channel of effects
  • SourceT.liftM(IEnumerable<IO<A>> effects) lifts an IEnumerable of effects
  • SourceT.liftM(IAsyncEnumerable<IO<A>> effects) lifts an IAsyncEnumerable of effects

Obviously, when lifting non-IO monads, the types above change.

SourceT also supports the same built-in convenience reducers as Source (Last, Iter, Collect).

Sink

Sink<A> provides a way to accept many input values. The values are buffered until consumed. The sink can be
thought of as a System.Threading.Channels.Channel (which is the buffer that collects the values) that happens to
manipulate the values being posted to the buffer just before they are stored.

This manipulation is possible because the Sink is a CoFunctor (contravariant functor). This is the dual of Functor:
we can think of Functor.Map as converting a value from A -> B. Whereas CoFunctor.Comap converts from B -> A.

So, to manipulate values coming into the Sink, use Comap. It will give you a new Sink with the manipulation 'built-in'.
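
As a rough illustration of why Comap runs 'backwards', here is a minimal contravariant sink built on an Action<A>. MiniSink is a hypothetical type for this sketch and has none of the buffering of the real Sink.

```csharp
using System;

// Hypothetical miniature sink: something that accepts values of type A.
public sealed class MiniSink<A>
{
    readonly Action<A> post;

    public MiniSink(Action<A> post) => this.post = post;

    public void Post(A value) => post(value);

    // Comap: given a function B -> A, build a sink that accepts B values by
    // converting each one to an A before it is posted to the original sink.
    public MiniSink<B> Comap<B>(Func<B, A> f) =>
        new MiniSink<B>(b => post(f(b)));
}
```

So a MiniSink<int> comapped with (string s) => s.Length becomes a MiniSink<string>: the function goes from the new input type to the old one, which is the reverse of the direction Map takes on a Functor.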

SinkT

SinkT<M, A> provides a way to accept many input values. The values are buffered until consumed. The sink can
be thought of as a System.Threading.Channels.Channel (which is the buffer that collects the values) that happens to
manipulate the values being posted to the buffer just before they are stored.

This manipulation is possible because the SinkT is a CoFunctor (contravariant functor). This is the dual of Functor:
we can think of Functor.Map as converting a value from A -> B. Whereas CoFunctor.Comap converts from B -> A.

So, to manipulate values coming into the SinkT, use Comap. It will give you a new SinkT with the manipulation 'built-in'.

SinkT is also a transformer that lifts types of K<M, A>.

Conduit

Conduit<A, B> can be pictured as so:

+----------------------------------------------------------------+
|                                                                |
|  A --> Transducer --> X --> Buffer --> X --> Transducer --> B  |
|                                                                |
+----------------------------------------------------------------+
  • A value of A is posted to the Conduit (via Post)
  • It flows through an input Transducer, mapping the A value to X (an internal type you can't see)
  • The X value is then stored in the conduit's internal buffer (a System.Threading.Channels.Channel)
  • Any invocation of Reduce will force the consumption of the values in the buffer
  • Flowing each value X through the output Transducer

So the input and output transducers allow for pre and post-processing of values as they flow through the conduit.
Conduit is a CoFunctor, call Comap to manipulate the pre-processing transducer. Conduit is also a Functor, call
Map to manipulate the post-processing transducer. There are other non-trait, but common behaviours, like FoldWhile,
Filter, Skip, Take, etc.

Conduit supports access to a Sink and a Source for more advanced processing.

ConduitT

ConduitT<M, A, B> can be pictured as so:

+------------------------------------------------------------------------------------------+
|                                                                                          |
|  K<M, A> --> TransducerM --> K<M, X> --> Buffer --> K<M, X> --> TransducerM --> K<M, B>  |
|                                                                                          |
+------------------------------------------------------------------------------------------+
  • A value of K<M, A> is posted to the Conduit (via Post)
  • It flows through an input TransducerM, mapping the K<M, A> value to K<M, X> (an internal type you can't see)
  • The K<M, X> value is then stored in the conduit's internal buffer (a `System.Th...

IO 'acquired resource tidy up' bug-fix

07 Apr 17:57
Pre-release

This issue highlighted an acquired resource tidy-up issue that needed tracking down...

The IO monad has an internal state-machine. It tries to run that synchronously until it finds an asynchronous operation. If it encounters an asynchronous operation then it switches to a state-machine that uses the async/await machinery. The benefit of this is that we have no async/await overhead if there's no asynchronicity and only use it when we need it.

But... the initial synchronous state-machine used a try/finally block that was used to tidy up the internally allocated EnvIO (and therefore any acquired resources). This is problematic when switching from sync -> async as the try/finally isn't then sequenced correctly.

It could have been worked-around by manually providing an EnvIO to Run or RunAsync.
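
The general shape of this kind of bug can be reproduced in a few lines of ordinary C#. This is an analogy only, not the IO monad's internals: Resource and TidyUpSketch are hypothetical names, and the sketch just shows a synchronous try/finally releasing a resource before the asynchronous work that needs it has finished.

```csharp
using System;
using System.Threading.Tasks;

public sealed class Resource : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose() => Disposed = true;

    // Asynchronous work that checks the resource only after the await resumes.
    public async Task<string> UseAsync()
    {
        await Task.Delay(20);
        return Disposed ? "used after dispose" : "ok";
    }
}

public static class TidyUpSketch
{
    // Synchronous try/finally: the finally runs as soon as the Task is returned,
    // which is before the asynchronous work has completed.
    public static Task<string> SyncFinally(Resource r)
    {
        try { return r.UseAsync(); }
        finally { r.Dispose(); }
    }

    // Awaiting inside the try sequences the finally after the asynchronous work.
    public static async Task<string> AsyncFinally(Resource r)
    {
        try { return await r.UseAsync(); }
        finally { r.Dispose(); }
    }
}
```

SyncFinally disposes the resource while UseAsync is still waiting, so the continuation sees a disposed resource; AsyncFinally only disposes after the work completes.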

That was a slightly awkward one to track down. Should be fixed now!

Pipes refactor and Cofunctors

14 Feb 20:40
Pre-release

LanguageExt Pipes Background

Part of the v5 refresh was to migrate the Pipes functionality to be a proper monad-transformer (in v4 it's a transformer too, but it can only lift Eff<RT, A>, rather than the more general K<M, A> where M : Monad<M>). I completed the generalisation work a while back, but it had some problems:

For any users of pipes it was going to be a big upheaval

Obviously, v5 is a big change, but where possible I want the migrations to be quite mechanical - it wasn't going to be. That doesn't mean I shouldn't 'go for it', but I'm trying to make sure that every bit of pain a user has to go through to move from v4 to v5 is strongly justified and will lead to a better experience once migrated.

It was inconsistently named

The core type Proxy, and the derived types: Producer, Consumer, Pipe, etc. don't follow the monad-transformer naming convention of having a T suffix. Really, if they're going to be generalised for any monad then they should be called ProducerT, ConsumerT, PipeT, ...

Pipes is hard to use

This is not a new problem with v5. I made Pipes into a 1-for-1 clone of the Haskell Pipes library. Even in Haskell they can be quite hard to use as you chase alignment of generics. The desire for pipes to support: producers, pipes, clients, servers, and more seems (in hindsight) to be too greedy.

Hard to retrofit

The generalisation process wasn't working well in some areas. The Producer.merge was blocking and fixing it with the original code was challenging to say the least.

LanguageExt Pipes Refresh

So, I decided to take a step back. Instead of trying to make an exact clone of the Haskell version, I thought I'd build it from scratch in a way that's more 'csharpy', consistent, and simpler. In particular I looked at the techniques I used to refactor the IO monad (to support recursion, asynchrony, etc.) and brought them into a new Pipes implementation.

I also decided to drop support for Client, Server, Request, Response, and all of the other stuff that I suspect nobody used because they were too hard.

That means:

  • There's no need for an underlying Proxy<A1, A, B1, B, M, R> interface. This was only needed to support all flavours of client, server, producer, consumer, etc.
  • The base-type of all pipes related types is: PipeT<IN, OUT, M, R>
    • This is clearly easier to understand
    • A ProducerT<OUT, M, R> is simply a pipe with the input set to Unit:
      • PipeT<Unit, OUT, M, R>
    • A ConsumerT<IN, M, R> is simply a pipe with the output set to Void:
      • PipeT<IN, Void, M, R>
    • An EffectT<M, R> is simply a pipe with the input set to Unit and the output set to Void. This enclosed effect is the result of fusing producers, pipes, and consumers together:
      • PipeT<Unit, Void, M, R>

Those four types: ProducerT, PipeT, ConsumerT, and EffectT are the new, simplified, and fully generalised version of pipes.

Now that the generalised implementation follows the naming convention of having a T suffix for transformers, we can use the original names Producer, Pipe, Consumer, and Effect to provide a more specialised version that only works with Eff<RT, A> (like the original pipes).

So,

The good thing about this refactor is that there really is only one implementation of the pipes functionality and it all sits in the PipesT.DSL.cs . This focused DSL is much easier to manage than before - it was implemented in a similar way before, but it's now just much easier for a C# dev to consume. I have put a real effort into making the interfaces, modules, preludes, etc. consistent for all types.

Pipes concurrency

Concurrency wasn't front-and-centre in the original implementation. In some senses it was 'bolted on'. You got concurrency from the lifted Eff type and from the Producer.merge function, but that was it.

Now pipes has first-class support for concurrency:

  • Support for IEnumerable and IAsyncEnumerable with ProducerT.yieldAll, Producer.yieldAll, PipeT.yieldAll, and Pipe.yieldAll.
  • Unlike the original, the core DSL supports the lifting of tasks
    • Which means direct support from: PipeT.liftT, PipeT.liftM, Pipe.liftT, Pipe.liftM, ProducerT.liftT, ProducerT.liftM, Producer.liftT, Producer.liftM, ConsumerT.liftT, ConsumerT.liftM, Consumer.liftT, Consumer.liftM, EffectT.liftT, EffectT.liftM, Effect.liftT, and Effect.liftM!

Mailbox, Inbox, and Outbox

Inspired by the original Pipes.Concurrency library, I implemented Mailbox, Inbox, and Outbox. It's not a clone of the original, just inspired by. A Mailbox consists of an Inbox and an Outbox. The inbox receives values posted to it. The outbox yields values posted to the inbox upon request.

Backing the Mailbox is a System.Threading.Channels.Channel. You can create a Mailbox like so:

var mailbox = Mailbox.spawn<string>();

A mailbox is simply a record with an Inbox and Outbox:

public record Mailbox<A, B>(Inbox<A> Inbox, Outbox<B> Outbox)

You can Post to the Mailbox and you can Read from the Mailbox. But, even more critically, you can call:

  • mailbox.ToConsumer<M>() - to get a consumer of values being posted into the Inbox
  • mailbox.ToProducer<M>() - to get a producer of values being yielded into the Outbox

A good example of why this is useful is the new Producer.merge function:

public static ProducerT<OUT, M, Unit> merge<OUT, M>(Seq<ProducerT<OUT, M, Unit>> producers) where M : Monad<M> =>
    from mailbox in Pure(Mailbox.spawn<OUT>())
    from forks   in forkEffects(producers, mailbox)
    from _       in mailbox.ToProducerT<M>()
    from x       in forks.Traverse(f => f.Cancel).As()
    select unit;

static K<M, Seq<ForkIO<Unit>>> forkEffects<M, OUT>(
    Seq<ProducerT<OUT, M, Unit>> producers,
    Mailbox<OUT, OUT> mailbox)
    where M : Monad<M> =>
    producers.Map(p => (p | mailbox.ToConsumerT<M>()).Run())
             .Traverse(ma => ma.ForkIO());

The merge function gets a collection of producers. What we want is for those to run concurrently so we can receive the values as they happen. Then we want to produce a single merged stream of values.

This creates the merged stream Mailbox:

    from mailbox in Pure(Mailbox.spawn<OUT>())

In forkEffects we process each producer p and pipe its values to mailbox.ToConsumerT:

p | mailbox.ToConsumerT<M>()

So, we get a ConsumerT for the merged-stream's Mailbox. It consumes every value from p, fusing into an EffectT. We then Run() that EffectT which gives us the underlying M monad:

(p | mailbox.ToConsumerT<M>()).Run()

We do this for every ProducerT, which means the merged-values Mailbox gets every value yielded from upstream.

producers.Map(p => (p | mailbox.ToConsumerT<M>()).Run())

Finally, we ForkIO each EffectT so that it can run in parallel.

.Traverse(ma => ma.ForkIO())

Back to the merge function, we then access the other side of the mailbox by asking for the Outbox producer, using ToProducerT:

from _ in mailbox.ToProducerT<M>()

This will then yield all of the merged values downstream (whilst there are values to yield). Once complete, we tidy up the forks:

forks.Traverse(f => f.Cancel).As()
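
The same fork-into-a-shared-channel idea can be sketched without Pipes, using only System.Threading.Channels and tasks. MergeSketch is a hypothetical helper for illustration: it closes the channel once every fork has finished instead of cancelling forks as merge does, and it collects into a list rather than yielding downstream.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class MergeSketch
{
    public static async Task<List<A>> Merge<A>(IEnumerable<IEnumerable<A>> producers)
    {
        // The shared buffer that plays the role of the merged-stream mailbox.
        var mailbox = Channel.CreateUnbounded<A>();

        // Fork one task per producer; each writes its values into the mailbox.
        var forks = producers
            .Select(p => Task.Run(async () =>
            {
                foreach (var x in p) await mailbox.Writer.WriteAsync(x);
            }))
            .ToArray();

        // Once every fork has finished, mark the mailbox as complete so that
        // the reading loop below can terminate.
        var closer = Task.WhenAll(forks).ContinueWith(_ => mailbox.Writer.Complete());

        // Read the merged values from the other side of the mailbox.
        var merged = new List<A>();
        await foreach (var x in mailbox.Reader.ReadAllAsync()) merged.Add(x);

        await closer;
        return merged;
    }
}
```

The order of the merged values depends on how the forks are scheduled, so only the set of values is predictable, not their interleaving.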

Cofunctor

Mailbox is pretty powerful in its own right and doesn't need pipes to function. This is a quick example of a loop that reads every value posted to a Mailbox and writes it to the console:

static IO<Unit> consumeAll(Mailbox<string, string> mailbox) =>
        from x in mailbox.Read()
        from _ in IO.lift(() => Console.WriteLine(x))
        from r in consumeAll(mailbox)
        select r;

Mailbox<A, B> has two type parameters: A represents the values coming in and B represents the values being yielded.

    A -> B

Values of type A are posted to Mailbox.Inbox and values of type B are yielded from Mailbox.Outbox.

If you call mailbox.Map<C>((B b) => ...) on Mailbox then you could imagine Mailbox being represented like this:

    A -> B -> C

The result is a Mailbox<A, C>, but internally there's a mapping of the values as they flow through.

Subsequent calls to Map<D>, and the like, would continue to transform the value being yielded from the Mailbox.Outbox:

    A -> B -> C -> D

But what if we wa...


Minor updates and fixes

01 Feb 14:52
Pre-release

This is a small release:

New:

  • Add awaitAll and awaitAny that work with ForkIO - #1440

Bug fix:

  • Arr.Apply returns an almost empty array with the first element equal to the last expected element - #1442

Iterator: a safe IEnumerator

25 Dec 20:08
Pre-release

Language-ext gained Iterable a few months back, which is a functional wrapper for IEnumerable. We now have Iterator, which is a more functional wrapper for IEnumerator.

IEnumerator is particularly problematic due to its mutable nature. It makes it impossible to share or leverage safely within other immutable types.

For any type where you could previously call GetEnumerator(), it is now possible to call GetIterator().

Iterator is pattern-matchable, so you can use the standard FP sequence processing technique:

public static A Sum<A>(this Iterator<A> self) where A : INumber<A> =>
    self switch
    {
        Iterator<A>.Nil                 => A.Zero,
        Iterator<A>.Cons(var x, var xs) => x + xs.Sum()
    };

Or, bog standard imperative processing:

for(var iter = Naturals.GetIterator(); !iter.IsEmpty; iter = iter.Tail)
{
    Console.WriteLine(iter.Head);
}

You need to be a little careful when processing large lists or infinite streams. Iterator<A> uses Iterator<A>.Cons and Iterator<A>.Nil types to describe a linked-list of values. That linked-list requires an allocated object per item. That is not really a problem for most of us who want correctness over outright performance; it is a small overhead. But the other side-effect of this is that if you hold a reference to the head item of a sequence and you're processing an infinite sequence, then those temporary objects won't be freed by the GC, causing a space leak.
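
A rough sketch of how such a cons-list view over an IEnumerator could be built is shown below. MiniIterator is a hypothetical miniature for illustration, not the language-ext implementation; it memoises each tail with Lazy<T> so the mutable enumerator is advanced at most once per cell.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature of an immutable, shareable view over an IEnumerator.
public abstract class MiniIterator<A>
{
    // End of the sequence.
    public sealed class Nil : MiniIterator<A> { }

    // One allocated cell per item: the head value plus a lazily computed tail.
    public sealed class Cons : MiniIterator<A>
    {
        readonly Lazy<MiniIterator<A>> tail;

        public A Head { get; }
        public MiniIterator<A> Tail => tail.Value;

        internal Cons(A head, IEnumerator<A> rest)
        {
            Head = head;
            // The enumerator moves forward the first time Tail is read;
            // after that the cached cell is returned.
            tail = new Lazy<MiniIterator<A>>(() => From(rest));
        }
    }

    public bool IsEmpty => this is Nil;

    // Reads one item from the enumerator and wraps it in a cell.
    public static MiniIterator<A> From(IEnumerator<A> enumerator) =>
        enumerator.MoveNext()
            ? new Cons(enumerator.Current, enumerator)
            : new Nil();
}
```

Because every cell caches its tail, two walks from the same first cell see the same values, which is what makes sharing safe. It is also why holding on to the first cell keeps every cell reached so far alive.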

This will cause a space-leak:

var first = Naturals.GetIterator();
for(var iter = first; !iter.IsEmpty; iter = iter.Tail)
{
    Console.WriteLine(iter.Head);
}

first references the first Iterator<A>.Cons and every subsequent item via the Tail.

This (below) is OK because the iter reference keeps being overwritten, which means nothing is holding on to the Head item in the sequence:

for(var iter = Naturals.GetIterator(); !iter.IsEmpty; iter = iter.Tail)
{
    Console.WriteLine(iter.Head);
}

This type is probably more useful for me when implementing the various core types of language-ext, but I can't be the only person who's struggled with IEnumerator and its horrendous design.

A good example of where I am personally already seeing the benefits is IO<A>.RetryUntil.

This is the original version:

public IO<A> RepeatUntil(
    Schedule schedule,
    Func<A, bool> predicate) =>
    LiftAsync(async env =>
              {
                  if (env.Token.IsCancellationRequested) throw new TaskCanceledException();
                  var token = env.Token;
                  var lenv  = env.LocalResources;
                  try
                  {
                      var result = await RunAsync(lenv);

                      // free any resources acquired during a repeat
                      await lenv.Resources.ReleaseAll().RunAsync(env);

                      if (predicate(result)) return result;

                      foreach (var delay in schedule.Run())
                      {
                          await Task.Delay((TimeSpan)delay, token);
                          result = await RunAsync(lenv);

                          // free any resources acquired during a repeat
                          await lenv.Resources.ReleaseAll().RunAsync(env);

                          if (predicate(result)) return result;
                      }

                      return result;
                  }
                  finally
                  {
                      // free any resources acquired during a repeat
                      await lenv.Resources.ReleaseAll().RunAsync(env);
                  }
              });      
    

Notice the foreach in there and the manual running of the item to retry with RunAsync. This has to go all imperative because there previously was no way to safely get the IEnumerator of Schedule.Run() and pass it around.

This is what RetryUntil looks like now:

public IO<A> RetryUntil(Schedule schedule, Func<Error, bool> predicate)
{
    return go(schedule.PrependZero.Run().GetIterator(), Errors.None);

    IO<A> go(Iterator<Duration> iter, Error error) =>
        iter switch
        {
            Iterator<Duration>.Nil =>
                IO.fail<A>(error),

            Iterator<Duration>.Cons(var head, var tail) =>
                IO.yieldFor(head)
                  .Bind(_ => BracketFail()
                               .Catch(e => predicate(e)
                                               ? IO.fail<A>(e)
                                               : go(tail, e)))
        };
}

Entirely functional, no imperative anything, and even (potentially) infinitely recursive depending on the Schedule. There's also no manual running of the IO monad with RunAsync, which means we benefit from all of the DSL work on optimising away the async/await machinery.

Future:

  • Potentially use Iterator in StreamT
  • Potentially use Iterator in Pipes
  • Potentially create IteratorT (although this would likely just be StreamT, so maybe a renaming)

IO refactor continued

23 Dec 21:41
Pre-release

IO<A>

The work has continued following on from the last IO refactor release. The previous release was less about optimisation and more about correctness, this release is all about making the async/await state-machines disappear for synchronous operations.

  • The core Run and RunAsync methods have been updated to never await. If at any point an asynchronous DSL entry is encountered then processing is deferred to RunAsyncInternal (which does use await). Because RunAsync uses ValueTask, it's possible to run synchronous processes with next to zero overhead and still resolve to a fully asynchronous expression when one is encountered.
  • The DSL types have all been updated too, to try to run synchronously, if possible, and if not defer to asynchronous versions.
  • DSL state-machine support for resource tracking. It automatically disposes resources on exception-throw
  • DSL support for three folding types: IOFold, IOFoldWhile, IOFoldUntil (see 'Folding')
  • DSL support for Final<F> (see Final<F>)
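
The 'never await until you have to' approach from the first bullet can be sketched with a toy interpreter. Step, SyncStep, AsyncStep, and MiniInterpreter are hypothetical names for this illustration and bear no relation to the real IO DSL types; only the split between a non-async entry point and an async internal method mirrors the description above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical two-case DSL: a step is either synchronous or asynchronous.
public abstract record Step;
public sealed record SyncStep(Func<int, int> Run) : Step;
public sealed record AsyncStep(Func<int, Task<int>> Run) : Step;

public static class MiniInterpreter
{
    // Not an async method: no state-machine is created on the synchronous path.
    public static ValueTask<int> RunAsync(IReadOnlyList<Step> steps, int seed)
    {
        var acc = seed;
        for (var i = 0; i < steps.Count; i++)
        {
            if (steps[i] is SyncStep sync)
                acc = sync.Run(acc);
            else
                // First asynchronous step: hand the rest over to the awaiting version.
                return RunAsyncInternal(steps, i, acc);
        }
        // Everything was synchronous, so wrap the result without allocating a Task.
        return new ValueTask<int>(acc);
    }

    // The only place that uses the async/await machinery.
    static async ValueTask<int> RunAsyncInternal(IReadOnlyList<Step> steps, int index, int acc)
    {
        for (var i = index; i < steps.Count; i++)
        {
            acc = steps[i] switch
            {
                SyncStep sync   => sync.Run(acc),
                AsyncStep later => await later.Run(acc),
                _               => throw new InvalidOperationException("unknown step")
            };
        }
        return acc;
    }
}
```

With only SyncStep entries the returned ValueTask is already complete when RunAsync returns; a single AsyncStep anywhere in the list moves the remainder onto the awaiting path.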

TODO

  • DSL support for Repeat* and Retry* - then all core capabilities can run synchronously if they are composed entirely of synchronous components.

Folding

The standard FoldWhile and FoldUntil behaviour has changed for IO (and will change for all FoldWhile and FoldUntil eventually): it dawned on me that it was a bit of a waste that FoldWhile was equivalent to FoldUntil but with a not on the predicate.

So, the change in behaviour is:

  • The FoldUntil predicate test (and potential return) is run after the fold-delegate has been run (so it gets the current-value + the fold-delegate updated-state).
    • This was the previous behaviour
  • The FoldWhile predicate test (and potential return) is run before the fold-delegate is run (so it gets the current-value + the current-state).

The benefit of this approach is that you can stop a fold-operation running if the state is already 'bad' with FoldWhile, whereas with FoldUntil you can exit once the fold-operation makes the state 'bad'. The difference is subtle, but it does give additional options.
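
The difference in where the predicate runs can be shown with two small folds over IEnumerable. FoldSketch and the (state, value) predicate shape are assumptions made for this sketch, not the language-ext signatures.

```csharp
using System;
using System.Collections.Generic;

public static class FoldSketch
{
    // FoldWhile: the predicate is tested before the fold-delegate runs,
    // so it sees the current value and the current state.
    public static S FoldWhile<A, S>(
        IEnumerable<A> items, S state, Func<S, A, S> f, Func<S, A, bool> predicate)
    {
        foreach (var x in items)
        {
            if (!predicate(state, x)) return state;
            state = f(state, x);
        }
        return state;
    }

    // FoldUntil: the predicate is tested after the fold-delegate has run,
    // so it sees the current value and the updated state.
    public static S FoldUntil<A, S>(
        IEnumerable<A> items, S state, Func<S, A, S> f, Func<S, A, bool> predicate)
    {
        foreach (var x in items)
        {
            state = f(state, x);
            if (predicate(state, x)) return state;
        }
        return state;
    }
}
```

Summing 1, 2, -5, 4 with FoldWhile and the predicate 'value is non-negative' stops before -5 is applied and returns 3. FoldUntil with the predicate 'value is negative' applies -5 first and returns -2.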

Final<F> trait

Final<F> is a new trait-type to support try / finally behaviour. This has been implemented for IO for now. This will expand out to other types later. You can see from the updated implementation of Bracket how this works:

public IO<C> Bracket<B, C>(Func<A, IO<C>> Use, Func<Error, IO<C>> Catch, Func<A, IO<B>> Fin) =>
    Bind(x => Use(x).Catch(Catch).Finally(Fin(x)));

It's still early for this type, but I expect to provide @finally methods that work a bit like the @catch methods.

Unit tests for IO

One of the things that's holding up the full-release of v5 (other than the outstanding bugs in Pipes) is the lack of unit-tests for all of the new functionality. So, I've experimented using Rider's AI assistant to help me write the unit-tests. It's fair to say that it's not too smart, but at least it wrote a lot of the boilerplate. So, once I'd fixed up the errors, it was quite useful.

It's debatable whether it was much quicker or not. But, I haven't really spent any time with AI assistants, so I guess it might just be my inexperience of prompting them. I think it's worth pursuing to see if it can help me get through the unit-tests that are needed for v5.

IO refactor

19 Dec 17:39
Pre-release

IO

I have refactored the IO<A> monad, which is used to underpin all side-effects in language-ext. The API surface is unchanged, but the inner workings have been substantially refactored. Instead of the four concrete implementations of the abstract IO<A> type (IOPure<A>, IOFail<A>, IOSync<A>, and IOAsync<A>), there is now a 'DSL' of operations deriving from IO<A> which are interpreted in the Run and RunAsync methods (it now works like a Free monad).

The DSL is also extensible, so if you have some behaviours you'd like to embed into the IO interpreter then you can derive from one of these four types.

The benefits of the refactored approach are:

  • Fixes this issue
  • Should perform better (although not fully tested)
  • Extensible DSL
  • Can support infinite recursion

The last one is a big win. Previously, infinite recursion only worked in certain scenarios. It should work for all scenarios now.
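
As a rough picture of why an interpreted DSL can recurse without growing the call stack, here is a tiny untyped sketch. MiniIO, Pure, Lift, BindOp, Interpreter and Demo are all hypothetical names invented for this illustration; the real IO<A> DSL is typed and far richer. The point is only that the interpreter is a loop: each bind continuation is invoked from the loop rather than from inside the previous call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mini-DSL for illustration; these are not language-ext types.
public abstract record MiniIO
{
    public MiniIO Bind(Func<object, MiniIO> next) => new BindOp(this, next);
}
public sealed record Pure(object Value) : MiniIO;
public sealed record Lift(Func<object> Effect) : MiniIO;
public sealed record BindOp(MiniIO Source, Func<object, MiniIO> Next) : MiniIO;

public static class Interpreter
{
    // Interprets the DSL in a loop, keeping pending continuations on an
    // explicit heap-allocated stack instead of the call stack.
    public static object Run(MiniIO op)
    {
        var continuations = new Stack<Func<object, MiniIO>>();
        while (true)
        {
            switch (op)
            {
                case Pure p:
                    if (continuations.Count == 0) return p.Value;
                    op = continuations.Pop()(p.Value); // resume the next step
                    break;
                case Lift l:
                    op = new Pure(l.Effect());         // run the side-effect
                    break;
                case BindOp b:
                    continuations.Push(b.Next);        // remember what follows
                    op = b.Source;
                    break;
                default:
                    throw new NotSupportedException();
            }
        }
    }
}

public static class Demo
{
    // Recursive definition: each step binds to a recursive call of itself.
    public static MiniIO CountTo(int n, int limit) =>
        n >= limit
            ? new Pure(n)
            : new Lift(() => n).Bind(_ => CountTo(n + 1, limit));
}
```

Calling Interpreter.Run(Demo.CountTo(0, 1000000)) completes without a stack overflow, because the recursive call to CountTo only builds the next node and returns to the loop.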

Take a look at this infinite-loop sample:

static IO<Unit> infinite(int value) =>
    from _ in writeLine($"{value}")
    from r in infinite(value + 1)
    select r;

The second from expression recursively calls infinite which would usually blow the stack. However, now the stack will not blow and this example would run forever*.

*All LINQ expressions require a trailing select ..., that is technically something that needs to be invoked after each recursive call to infinite. Therefore, with the above implementation, we get a space-leak (memory is consumed for each loop).

To avoid that when using recursion in LINQ, you can use the tail function:

static IO<Unit> infinite(int value) =>
    from _ in writeLine($"{value}")
    from r in tail(infinite(value + 1))
    select r;

For other monads and monad-transformers that lift the IO monad into their stacks, you can use the tailIO function. This should bring infinite recursion to all types that use the IO monad (like Eff for example).

What tail does is say "We're not going to run the select at all, we'll just return the result of infinite". That means we don't have to keep track of any continuations in memory. It also means you should never do extra processing in the select, just return the r as-is and everything will work: infinite recursion without space leaks.

tail is needed because the SelectMany used by LINQ has the final Func<A, B, C> argument to invoke after the Func<A, IO<B>> monad-bind function (which is the recursive one). The Func<A, B, C> is the trailing select and is always needed. It would be good if C# supported a SelectMany that is more like a regular monadic-bind and recognised the pattern of no additional processing in select, but we have to put up with the hand we're dealt.
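
To see that trailing projection in isolation, the sketch below uses a trivial, hypothetical Box<A> type (nothing to do with IO<A>) and shows a two-from query next to the method call the C# compiler translates it into. The ProjectCalls counter is only there to show that the Func<A, B, C> is invoked after every bind.

```csharp
using System;

// Hypothetical container used only to demonstrate the LINQ translation.
public sealed class Box<A>
{
    public A Value { get; }
    public Box(A value) => Value = value;
}

public static class BoxExtensions
{
    // Counts how often the trailing 'select' projection is invoked.
    public static int ProjectCalls;

    public static Box<C> SelectMany<A, B, C>(
        this Box<A> ma, Func<A, Box<B>> bind, Func<A, B, C> project)
    {
        var a = ma.Value;
        var b = bind(a).Value;            // the monadic bind
        ProjectCalls++;
        return new Box<C>(project(a, b)); // the trailing select, always run
    }
}

public static class QueryDemo
{
    // Query syntax with two 'from' clauses and a 'select'.
    public static Box<int> Query() =>
        from x in new Box<int>(1)
        from y in new Box<int>(x + 1)
        select y;

    // The equivalent method-call form that the query is translated into.
    public static Box<int> Desugared() =>
        new Box<int>(1).SelectMany(x => new Box<int>(x + 1), (x, y) => y);
}
```

Both forms produce the same result, and in both the projection runs once per call even though it only returns y unchanged.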

Not doing work after a tail-call is a limitation of tail-recursion in every language that supports it. So, I'm OK being explicit about it with LINQ. Just be careful to not do any additional processing or changing of types in the select.

Note, if you don't use LINQ and instead use a regular monad-bind operation, then we don't need the tail call at all:

static IO<Unit> infinite(int value) =>
    writeLine($"{value}")
       .Bind(_ => infinite(value + 1));

That will run without blowing the stack and without space-leaks. Below is a chart of memory-usage after 670 million iterations:

image

What's nice about looking at the memory-graph is that not only is it flat in terms of total-usage (around 26 MB), it only ever uses the Gen 0 heap. This is something I've always said about functional-programming: we may generate a lot of temporary objects (lambdas and the like), but they rarely live long enough to cause memory pressure in higher generations of the heap. Even though this is a very simple sample and you wouldn't expect that much pressure, the benefit of most of your memory usage being in Gen 0 is that you're likely using memory addresses already cached by the CPU, so the churn of objects is less problematic than is often posited.

StreamT made experimental

I'm not sure how I'm going to fix this issue, so until I have a good idea, StreamT will be marked as [Experimental]. To use it, add this to the top of a file:

#pragma warning disable LX_StreamT

Conclusion

This was a pretty large change, so if you're using the beta in production code, please be wary of this release. And, if you spot any issues, please let me know.

XML documentation updates

17 Dec 20:39
Pre-release

One of the most idiotic things Microsoft ever did was to use XML as a 'comment' documentation-format when the language it is documenting is full of <, >, and & characters. I'd love to know who it was that thought this was a good idea.

They should be shunned forever!

Anyway, more seriously, I have ignored the 'well formed' XML documentation warnings, forever. I did so because I consider the readability of comments in code to be the most important factor. So, if I needed a < or a > and it looked OK in the source, then that was good enough for me.

I even built my own documentation generator that understood these characters and knew how to ignore them if they weren't known tags (something Microsoft should have done by now!).

I refused to make something like Either<L, R> turn into Either&lt;L, R&gt;.

Anyway, there are places where this is problematic:

  • Inline documentation tooltips in IDEs
  • The new 'formatted' documentation of Rider (maybe VS too, but I haven't used it in ages).

The thing is, I still think the source-code is the most important place for readability, so I ignored requests like this until I could think of a better solution. Now I have one: use alternative unicode characters that are close enough to <, >, and & to read naturally in source and in documentation, but that are not XML delimiters:

Original | Replacement | Unicode | Example (before) | Example (after)
< | 〈 | U+3008 | Either<Error, A> | Either〈Error, A〉
> | 〉 | U+3009 | Either<Error, A> | Either〈Error, A〉
& | ＆ | U+FF06 | x && y | x ＆＆ y

The spacing isn't perfect and in certain situations they look a touch funky, but still legible, so I'm happy to go with this as it makes the documentation work everywhere without compromise. It's slightly more effort for me, but I'd rather do this than compromise the inline comments.

Over 120 source files have been updated with the new characters. There are now no XML documentation errors (and I've removed the NoWarn that suppressed XML documentation errors).

There may well be other unicode characters that are better, but at least now it's a simple search & replace if I ever decide to change them.
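
As an illustration of how simple that search & replace is, here is a minimal sketch. DocChars and ToDocSafe are hypothetical names (this is not the tooling used for the release), and the sketch assumes the text passed in is comment prose that contains no real XML documentation tags such as <summary>, since those would be replaced too.

```csharp
// Hypothetical helper; assumes the input contains no genuine XML doc tags.
public static class DocChars
{
    // Swaps the three XML-significant characters for the look-alike
    // characters listed in the table above.
    public static string ToDocSafe(string text) =>
        text.Replace('<', '\u3008')   // 〈
            .Replace('>', '\u3009')   // 〉
            .Replace('&', '\uFF06');  // ＆
}
```

For example, DocChars.ToDocSafe("Either<Error, A>") yields "Either〈Error, A〉".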