Refining the Maybe.MonadIO concept
A previous idea to split the `MonadIO` trait into two traits - `Traits.MonadIO` and `Maybe.MonadIO` - has allowed monad-transformers to pass IO functionality down the transformer-chain, even if the outer layers of the transformer-chain aren't 'IO capable'. This works as long as the inner monad in the transformer-chain is the `IO<A>` monad.
There are two distinct types of functionality in the `MonadIO` trait:

- IO lifting functionality (via `MonadIO.LiftIO`)
- IO unlifting functionality (via `MonadIO.ToIO` and `MonadIO.MapIO`)
Problem no.1
It is almost always possible to implement `LiftIO`, but it is often impossible to implement `ToIO` (the minimum required unlifting implementation) without breaking composition laws.

Much of the 'IO functionality for free' of `MonadIO` comes from leveraging `ToIO` (for example, `Repeat`, `Fork`, `Local`, `Await`, `Bracket`, etc.) -- and so if `ToIO` isn't available and has a default implementation that throws an exception, then `Repeat`, `Fork`, `Local`, `Await`, `Bracket`, etc. will also all throw.

This feels wrong to me.
Problem no.2
Because of the implementation hierarchy:
Maybe.MonadIO<M>
↓
Monad<M>
↓
MonadIO<M>
Methods like `LiftIO` and `ToIO`, which have default implementations (that throw) in `Maybe.MonadIO<M>`, don't have their overridden implementations enforced when someone implements `MonadIO<M>`. We can just leave `LiftIO` and `ToIO` on their defaults, which means inheriting from `MonadIO<M>` has no implementation guarantees.
Solution
- Split `MonadIO` (and `Maybe.MonadIO`) into distinct traits:
  - `MonadIO` and `Maybe.MonadIO` for lifting functionality (`LiftIO`)
  - `MonadUnliftIO` and `Maybe.MonadUnliftIO` for unlifting functionality (`ToIO` and `MapIO`)
  - The thinking here is that when unlifting can't be supported (in types like `StateT` and `OptionT`) we only implement `MonadIO` - but in types where unlifting can be supported we implement both `MonadIO` and `MonadUnliftIO`.
- In `MonadIO` and `MonadUnliftIO` (the non-Maybe versions) we make `abstract` the methods that previously had default `virtual` (exception-throwing) implementations.
  - That means anyone stating their type supports IO must implement it!
- Make all methods in `Maybe.MonadIO` and `Maybe.MonadUnliftIO` have the `*Maybe` suffix (so `LiftIOMaybe`, `ToIOMaybe`, etc.)
  - The thinking here is that for monad-transformer 'IO passing' we can still call the `Maybe` variants, but in the code it's declarative: we can see it might not work.
  - Then in `MonadIO` and `MonadUnliftIO` (the non-Maybe versions) we can override `LiftIOMaybe`, `ToIOMaybe`, and `MapIOMaybe` and get them to invoke the bespoke `LiftIO`, `ToIO`, and `MapIO` from `MonadIO` and `MonadUnliftIO`.
  - That means all of the default functionality (`Repeat`, `Fork`, `Local`, `Await`, `Bracket`, ...) gets routed to the bespoke IO functionality for the type.
The implementation hierarchy now looks like this:
Maybe.MonadIO<M>
↓
Maybe.MonadUnliftIO<M>
↓
Monad<M>
↓
MonadIO<M>
↓
MonadUnliftIO<M>
This should (if I've got it right) lead to more type-safe implementations, fewer exceptional errors for IO functionality that isn't implemented, and a slightly clearer implementation path. It's more elegant because we override implementations in `MonadIO` and `MonadUnliftIO`, not the `Maybe` versions. So, it feels more 'intentional'.
For example, this will work, because `ReaderT` supports both lifting and unlifting (it implements `MonadUnliftIO`):
ReaderT<E, IO, A> mx;
var my = mx.ForkIO(); // compiles
Whereas this won't compile, because `StateT` can only support lifting (by implementing `MonadIO`):
StateT<S, IO, A> mx;
var my = mx.ForkIO(); // type-constraint error
If you tried to implement `MonadUnliftIO` for `StateT`, you'd quickly run into the fact that `StateT` (when run) yields a tuple, which isn't compatible with the singleton value needed for `ToIO`. The only way to make it work is to drop the yielded state, which breaks composition rules.

Previously, this wasn't visible to the user because it was hidden in default implementations that threw exceptions.
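To make the shape of the problem concrete, here's a minimal, self-contained sketch. It uses hypothetical delegate types (`StateRun`, `IoRun`) as stand-ins, not the library's actual `StateT`/`ToIO` signatures:

```csharp
using System;

// Minimal sketch of the problem (hypothetical delegates, not the library API):
// running a state computation yields a value AND an updated state, but an unlift
// to IO needs a computation that yields just the value -- so the state has to go.
public static class StateUnliftSketch
{
    // Stand-in for StateT<S, IO, A>: run it with a state, get back (value, new state).
    public delegate (A Value, S State) StateRun<S, A>(S state);

    // The shape that ToIO needs: a computation yielding a single A.
    public delegate A IoRun<A>();

    // The only way to build an IoRun<A> from a StateRun<S, A> is to pick an initial
    // state and discard the final one -- which is exactly what breaks the composition laws.
    public static IoRun<A> Unlift<S, A>(StateRun<S, A> ma, S initial) =>
        () => ma(initial).Value;   // final state discarded here

    public static void Main()
    {
        StateRun<int, string> tick = s => ($"seen {s}", s + 1);
        var io = Unlift(tick, 0);
        Console.WriteLine(io());   // "seen 0" -- the updated state (1) is lost
    }
}
```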
@micmarsh @hermanda19 if you are able to cast a critical eye on this and let me know what you think, that would be super helpful?
I ended up trying a number of different approaches and my eyes have glazed over somewhat, so treat this release with some caution. I think it's good, but critique and secondary eyes would be helpful! That goes for anyone else interested too.
Thanks in advance 👍
IObservable support in Source and SourceT
`IObservable` streams can now be lifted into the `Source` and `SourceT` types (via `Source.lift`, `SourceT.lift`, and `SourceT.liftM`).

`Source` and `SourceT` now support lifting of the following types:
- `IObservable`
- `IEnumerable`
- `IAsyncEnumerable`
- `System.Threading.Channels.Channel`
And, because both `Source` and `SourceT` can be converted to `Producer` and `ProducerT` (via `ToProducer` and `ToProducerT`), all of the above types can therefore also be used in Pipes.
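For a rough idea of what lifting an `IObservable` into a stream involves, here's a self-contained concept sketch that adapts an `IObservable` into a `System.Threading.Channels`-backed reader. It's illustrative only: the library's actual `Source.lift`/`SourceT.lift` implementations and signatures will differ.

```csharp
using System;
using System.Threading.Channels;

// Concept sketch (not the library internals): adapt an IObservable into a channel-backed
// stream -- roughly the kind of lifting that Source.lift / SourceT.lift perform.
public static class ObservableLiftSketch
{
    sealed class ChannelObserver<A> : IObserver<A>
    {
        readonly ChannelWriter<A> writer;
        public ChannelObserver(ChannelWriter<A> writer) => this.writer = writer;
        public void OnNext(A value)      => writer.TryWrite(value);
        public void OnError(Exception e) => writer.Complete(e);
        public void OnCompleted()        => writer.Complete();
    }

    public static ChannelReader<A> ToChannel<A>(this IObservable<A> observable)
    {
        var channel = Channel.CreateUnbounded<A>();
        observable.Subscribe(new ChannelObserver<A>(channel.Writer));
        return channel.Reader;
    }
}
```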
More general support for foldables coming soon
LanguageExt.Streaming + MonadIO + Deriving
Features:
- New streaming library
  - Transducers are back
  - Closed streams
    - Pipes
  - Open streams
    - `Source`
    - `SourceT`
    - `Sink`
    - `SinkT`
    - `Conduit`
    - `ConduitT`
  - Open to closed streams
  - Deprecated Pipes library
- MonadIO
- Deriving
- Bug fixes
New streaming library
A seemingly innocuous bug in the `StreamT` type opened up a rabbit hole of problems that needed a fundamental rewrite to fix. In the process, more and more thoughts came to mind about bringing the streaming functionality under one roof. So, now, there's a new language-ext library, `LanguageExt.Streaming`, and the `LanguageExt.Pipes` library has been deprecated.
Transducers are back
Transducers were going to be the big feature of `v5` before I worked out the new trait-system. They were going to be too much effort to bring in on top of all of the traits, but now, with the new streaming functionality, they are hella useful again. So, I've re-added `Transducer` and a new `TransducerM` (which can work with lifted types). Right now the functionality is relatively limited, but you can extend the set of transducers as much as you like by deriving new types from `Transducer` and `TransducerM`.
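If transducers are new to you, the core idea is that a transducer transforms a reducer, so mapping/filtering steps compose without building intermediate collections. A minimal sketch in plain C# (not the library's `Transducer`/`TransducerM` API):

```csharp
using System;
using System.Linq;

// A reducer folds a value into a state.
public delegate S Reducer<S, A>(S state, A value);

public static class TransducerSketch
{
    // A 'map' transducer: turns a reducer that accepts B into one that accepts A,
    // by applying A -> B before handing the value on.
    public static Func<Reducer<S, B>, Reducer<S, A>> Map<S, A, B>(Func<A, B> f) =>
        next => (state, value) => next(state, f(value));

    public static void Main()
    {
        Reducer<int, int> sum     = (s, x) => s + x;
        Reducer<int, int> doubled = Map<int, int, int>(x => x * 2)(sum);

        var total = new[] { 1, 2, 3 }.Aggregate(0, (s, x) => doubled(s, x));
        Console.WriteLine(total);   // 12
    }
}
```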
Documentation
The API documentation has some introductory information on the streaming functionality. It's a little light at the moment because I wanted to get the release done, but it's still useful to look at:
The `Streaming` library of language-ext is all about compositional streams. There are two key types of streaming functionality: closed-streams and open-streams...
Closed streams
Closed streams are facilitated by the `Pipes` system. The types in the `Pipes` system are compositional monad-transformers that 'fuse' together to produce an `EffectT<M, A>`. This effect is a closed system, meaning that there is no way (from the API) to directly interact with the effect from the outside: it can be executed and will return a result if it terminates.

The pipeline components are:

- `ProducerT<OUT, M, A>`
- `PipeT<IN, OUT, M, A>`
- `ConsumerT<IN, M, A>`

These are the components that fuse together (using the `|` operator) to make an `EffectT<M, A>`. The types are monad-transformers that only support lifting monads with the `MonadIO` trait (which constrains `M`). This makes sense, otherwise the closed-system would have no effect other than heating up the CPU.

There are also more specialised versions of the above that only support the lifting of the `Eff<RT, A>` effect-monad:

- `Producer<RT, OUT, A>`
- `Pipe<RT, IN, OUT, A>`
- `Consumer<RT, IN, A>`

They all fuse together into an `Effect<RT, A>`.

Pipes are especially useful if you want to build reusable streaming components that you can glue together ad infinitum. Pipes are, arguably, less useful for day-to-day stream processing, like handling events, but your mileage may vary.

More details on the Pipes page.
Open streams
Open streams are closer to what most C# devs have used classically. They are like events or `IObservable` streams: they yield values and (under certain circumstances) accept inputs.

- `Source` and `SourceT` yield values synchronously or asynchronously depending on their construction. Can support multiple readers.
- `Sink` and `SinkT` receive values and propagate them through the channel they're attached to. Can support multiple writers.
- `Conduit` and `ConduitT` provide an input transducer (acts like a `Sink`), an internal buffer, and an output transducer (acts like a `Source`). Supports multiple writers and one reader, but can yield a `Source`/`SourceT` that allows for multiple readers.

I'm calling these 'open streams' because we can `Post` values to a `Sink`/`SinkT` and we can `Reduce` values yielded by a `Source`/`SourceT`. So, they are 'open' for public manipulation, unlike `Pipes`, which fuse the public access away.
Source
`Source<A>` is the 'classic stream': you can lift any of the following types into it: `System.Threading.Channels.Channel<A>`, `IEnumerable<A>`, `IAsyncEnumerable<A>`, or singleton values. To process a stream, you need to use one of the `Reduce` or `ReduceAsync` variants. These take `Reducer` delegates as arguments. They are essentially a fold over the stream of values, which results in an aggregated state once the stream has completed. These reducers can be seen to play a similar role to `Subscribe` in `IObservable` streams, but are more principled because they return a value (which we can leverage to carry state for the duration of the stream).

`Source` also supports some built-in reducers:

- `Last` - aggregates no state, simply returns the last item yielded
- `Iter` - forces evaluation of the stream, aggregating no state and ignoring all yielded values
- `Collect` - adds all yielded values to a `Seq<A>`, which is then returned upon stream completion
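Conceptually, reducer-driven consumption is just a fold over the stream, something like this plain C# sketch (not the library's `Reducer`/`Reduce` API):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Concept sketch: a reducer folds each yielded value into a state, and the
// 'subscription' completes with the aggregated state when the stream ends.
public static class ReduceSketch
{
    public static async Task<S> ReduceAsync<S, A>(
        IAsyncEnumerable<A> stream,
        S state,
        Func<S, A, S> reducer)
    {
        await foreach (var item in stream)
            state = reducer(state, item);
        return state;
    }
}
```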
SourceT
`SourceT<M, A>` is the classic stream embellished - it turns the stream into a monad-transformer that can lift any `MonadIO`-enabled monad (`M`), allowing side-effects to be embedded into the stream in a principled way.

So, for example, to use the `IO<A>` monad with `SourceT`, simply use `SourceT<IO, A>`. Then you can use one of the following `static` methods on the `SourceT` type to lift `IO<A>` effects into a stream:

- `SourceT.liftM(IO<A> effect)` creates a singleton-stream
- `SourceT.foreverM(IO<A> effect)` creates an infinite stream, repeating the same effect over and over
- `SourceT.liftM(Channel<IO<A>> channel)` lifts a `System.Threading.Channels.Channel` of effects
- `SourceT.liftM(IEnumerable<IO<A>> effects)` lifts an `IEnumerable` of effects
- `SourceT.liftM(IAsyncEnumerable<IO<A>> effects)` lifts an `IAsyncEnumerable` of effects

Obviously, when lifting non-`IO` monads, the types above change.

`SourceT` also supports the same built-in convenience reducers as `Source` (`Last`, `Iter`, `Collect`).
Sink
`Sink<A>` provides a way to accept many input values. The values are buffered until consumed. The sink can be thought of as a `System.Threading.Channels.Channel` (which is the buffer that collects the values) that happens to manipulate the values being posted to the buffer just before they are stored.

This manipulation is possible because the `Sink` is a `CoFunctor` (contravariant functor). This is the dual of `Functor`: we can think of `Functor.Map` as converting a value from `A -> B`, whereas `CoFunctor.Comap` converts from `B -> A`.

So, to manipulate values coming into the `Sink`, use `Comap`. It will give you a new `Sink` with the manipulation 'built-in'.
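Here's a tiny, self-contained illustration of that idea (not the library's `Sink` API): a contravariant 'sink' whose `Comap` transforms values before they are posted:

```csharp
using System;

// Concept sketch: Comap is the dual of Map -- it pre-processes values on the way in.
public sealed class SimpleSink<A>
{
    readonly Action<A> post;
    public SimpleSink(Action<A> post) => this.post = post;

    public void Post(A value) => post(value);

    // Comap: given a function B -> A, turn a sink of A into a sink of B.
    public SimpleSink<B> Comap<B>(Func<B, A> f) => new(b => post(f(b)));
}

public static class SinkDemo
{
    public static void Main()
    {
        // A sink of strings becomes a sink of ints by pre-converting int -> string.
        var stringSink = new SimpleSink<string>(Console.WriteLine);
        var intSink    = stringSink.Comap<int>(i => $"value: {i}");
        intSink.Post(42);   // prints "value: 42"
    }
}
```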
SinkT
`SinkT<M, A>` provides a way to accept many input values. The values are buffered until consumed. The sink can be thought of as a `System.Threading.Channels.Channel` (which is the buffer that collects the values) that happens to manipulate the values being posted to the buffer just before they are stored.

This manipulation is possible because the `SinkT` is a `CoFunctor` (contravariant functor). This is the dual of `Functor`: we can think of `Functor.Map` as converting a value from `A -> B`, whereas `CoFunctor.Comap` converts from `B -> A`.

So, to manipulate values coming into the `SinkT`, use `Comap`. It will give you a new `SinkT` with the manipulation 'built-in'.

`SinkT` is also a transformer that lifts types of `K<M, A>`.
Conduit
`Conduit<A, B>` can be pictured as so:
+----------------------------------------------------------------+
| |
| A --> Transducer --> X --> Buffer --> X --> Transducer --> B |
| |
+----------------------------------------------------------------+
- A value of `A` is posted to the `Conduit` (via `Post`)
- It flows through an input `Transducer`, mapping the `A` value to `X` (an internal type you can't see)
- The `X` value is then stored in the conduit's internal buffer (a `System.Threading.Channels.Channel`)
- Any invocation of `Reduce` will force the consumption of the values in the buffer, flowing each value `X` through the output `Transducer`

So the input and output transducers allow for pre- and post-processing of values as they flow through the conduit.

`Conduit` is a `CoFunctor`: call `Comap` to manipulate the pre-processing transducer. `Conduit` is also a `Functor`: call `Map` to manipulate the post-processing transducer. There are other non-trait, but common, behaviours, like `FoldWhile`, `Filter`, `Skip`, `Take`, etc.

`Conduit` supports access to a `Sink` and a `Source` for more advanced processing.
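As a rough mental model (not the library's implementation), a conduit is a channel with one transformation applied on the way in and another on the way out:

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Concept sketch: Post runs the input step, values sit in a channel buffer, and
// reading runs the output step as values are consumed.
public sealed class SimpleConduit<A, X, B>
{
    readonly Channel<X> buffer = Channel.CreateUnbounded<X>();
    readonly Func<A, X> inStep;    // plays the role of the input transducer
    readonly Func<X, B> outStep;   // plays the role of the output transducer

    public SimpleConduit(Func<A, X> inStep, Func<X, B> outStep)
    {
        this.inStep  = inStep;
        this.outStep = outStep;
    }

    public ValueTask Post(A value) => buffer.Writer.WriteAsync(inStep(value));

    public async ValueTask<B> Read() => outStep(await buffer.Reader.ReadAsync());
}
```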
ConduitT
`ConduitT<M, A, B>` can be pictured as so:
+------------------------------------------------------------------------------------------+
| |
| K<M, A> --> TransducerM --> K<M, X> --> Buffer --> K<M, X> --> TransducerM --> K<M, B> |
| |
+------------------------------------------------------------------------------------------+
- A value of `K<M, A>` is posted to the `ConduitT` (via `Post`)
- It flows through an input `TransducerM`, mapping the `K<M, A>` value to `K<M, X>` (an internal type you can't see)
- The `K<M, X>` value is then stored in the conduit's internal buffer (a `System.Threading.Channels.Channel`) ...
IO 'acquired resource tidy up' bug-fix
This issue highlighted an acquired resource tidy-up issue that needed tracking down...
The `IO` monad has an internal state-machine. It tries to run that synchronously until it finds an asynchronous operation. If it encounters an asynchronous operation then it switches to a state-machine that uses the `async`/`await` machinery. The benefit of this is that we have no `async`/`await` overhead if there's no asynchronicity, and only use it when we need it.
But... the initial synchronous state-machine used a `try`/`finally` block to tidy up the internally allocated `EnvIO` (and therefore any acquired resources). This is problematic when switching from sync to async, as the `try`/`finally` isn't then sequenced correctly.
It could have been worked around by manually providing an `EnvIO` to `Run` or `RunAsync`.
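As a concept sketch of this class of bug (plain C#, not the actual `IO` internals): if clean-up sits in a `finally` around code that returns a not-yet-completed task, the clean-up runs before the asynchronous tail has finished with the resource:

```csharp
using System;
using System.Threading.Tasks;

class Resource : IDisposable
{
    bool disposed;
    public int Use() => disposed ? throw new ObjectDisposedException(nameof(Resource)) : 42;
    public void Dispose() => disposed = true;
}

static class TidyUpSketch
{
    static Task<int> RunBroken()
    {
        var resource = new Resource();
        try
        {
            // ...synchronous steps would run here...
            return ContinueAsync(resource);   // the async tail is returned, not awaited
        }
        finally
        {
            resource.Dispose();               // runs as soon as the Task is returned -- too early
        }
    }

    static async Task<int> ContinueAsync(Resource resource)
    {
        await Task.Delay(10);
        return resource.Use();                // ObjectDisposedException: resource already gone
    }

    static async Task Main() =>
        Console.WriteLine(await RunBroken()); // throws instead of printing 42
}
```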
That was a slightly awkward one to track down. Should be fixed now!
Pipes refactor and Cofunctors
LanguageExt Pipes Background
Part of the `v5` refresh was to migrate the `Pipes` functionality to be a proper monad-transformer (in `v4` it's a transformer too, but it can only lift `Eff<RT, A>`, rather than the more general `K<M, A> where M : Monad<M>`). I completed the generalisation work a while back, but it had some problems:
For any users of pipes it was going to be a big upheaval
Obviously, `v5` is a big change, but where possible I want the migrations to be quite mechanical - this one wasn't going to be. That doesn't mean I shouldn't 'go for it', but I'm trying to make sure that every bit of pain a user has to go through to move from `v4` to `v5` is strongly justified and will lead to a better experience once migrated.
It was inconsistently named
The core type `Proxy`, and the derived types `Producer`, `Consumer`, `Pipe`, etc., don't follow the monad-transformer naming convention of having a `T` suffix. Really, if they're going to be generalised for any monad then they should be called `ProducerT`, `ConsumerT`, `PipeT`, ...
Pipes is hard to use
This is not a new problem with `v5`. I made Pipes into a 1-for-1 clone of the Haskell Pipes library. Even in Haskell they can be quite hard to use as you chase alignment of generics. The desire for pipes to support producers, pipes, clients, servers, and more seems (in hindsight) to be too greedy.
Hard to retrofit
The generalisation process wasn't working well in some areas. `Producer.merge` was blocking, and fixing it with the original code was challenging to say the least.
LanguageExt Pipes Refresh
So, I decided to take a step back. Instead of trying to make an exact clone of the Haskell version, I thought I'd build it from scratch in a way that's more 'csharpy', consistent, and simpler. In particular, I looked at the techniques I used to refactor the `IO` monad (to support recursion, asynchrony, etc.) and brought them into a new Pipes implementation.
I also decided to drop support for `Client`, `Server`, `Request`, `Response`, and all of the other stuff that I suspect nobody used because they were too hard.
That means:
- There's no need for an underlying `Proxy<A1, A, B1, B, M, R>` interface. This was only needed to support all flavours of client, server, producer, consumer, etc.
- The base-type of all pipes-related types is `PipeT<IN, OUT, M, R>` - this is clearly easier to understand.
- A `ProducerT<OUT, M, R>` is simply a pipe with the input set to `Unit`: `PipeT<Unit, OUT, M, R>`
- A `ConsumerT<IN, M, R>` is simply a pipe with the output set to `Void`: `PipeT<IN, Void, M, R>`
- An `EffectT<M, R>` is simply a pipe with the input set to `Unit` and the output set to `Void`. This enclosed effect is the result of fusing producer, pipe, and consumers together: `PipeT<Unit, Void, M, R>`
Those four types - `ProducerT`, `PipeT`, `ConsumerT`, and `EffectT` - are the new, simplified, fully generalised version of pipes.
Now that the generalised implementation follows the naming convention of having a `T` suffix for transformers, we can use the original names `Producer`, `Pipe`, `Consumer`, and `Effect` to provide more specialised versions that only work with `Eff<RT, A>` (like the original pipes).
So:

- `Producer<RT, OUT, R>` is (internally) a `ProducerT<OUT, Eff<RT>, R>`
- `Pipe<RT, IN, OUT, R>` is (internally) a `PipeT<IN, OUT, Eff<RT>, R>`
- `Consumer<RT, IN, R>` is (internally) a `ConsumerT<IN, Eff<RT>, R>`
- `Effect<RT, R>` is (internally) an `EffectT<Eff<RT>, R>`
The good thing about this refactor is that there really is only one implementation of the pipes functionality and it all sits in `PipesT.DSL.cs`. This focused DSL is much easier to manage than before - it was implemented in a similar way before, but it's now just much easier for a C# dev to consume. I have put a real effort into making the interfaces, modules, preludes, etc. consistent for all types.
Pipes concurrency
Concurrency wasn't front-and-centre in the original implementation. In some senses it was 'bolted on'. You got concurrency from the lifted `Eff` type and from the `Producer.merge` function, but that was it.
Now pipes has first-class support for concurrency:
- Support for `IEnumerable` and `IAsyncEnumerable` with `ProducerT.yieldAll`, `Producer.yieldAll`, `PipeT.yieldAll`, and `Pipe.yieldAll`.
- Unlike the original, the core DSL supports the lifting of tasks.
  - Which means direct support from: `PipeT.liftT`, `PipeT.liftM`, `Pipe.liftT`, `Pipe.liftM`, `ProducerT.liftT`, `ProducerT.liftM`, `Producer.liftT`, `Producer.liftM`, `ConsumerT.liftT`, `ConsumerT.liftM`, `Consumer.liftT`, `Consumer.liftM`, `EffectT.liftT`, `EffectT.liftM`, `Effect.liftT`, and `Effect.liftM`!
Mailbox, Inbox, and Outbox
Inspired by the original `Pipes.Concurrency` library, I implemented `Mailbox`, `Inbox`, and `Outbox`. It's not a clone of the original, just inspired by. A `Mailbox` consists of an `Inbox` and an `Outbox`. The inbox receives values posted to it. The outbox yields values posted to the inbox upon request.

Backing the `Mailbox` is a `System.Threading.Channels.Channel`. You can create a `Mailbox` like so:
var mailbox = Mailbox.spawn<string>();
A mailbox is simply a `record` with an `Inbox` and an `Outbox`:
public record Mailbox<A, B>(Inbox<A> Inbox, Outbox<B> Outbox)
You can `Post` to the `Mailbox` and you can `Read` from the `Mailbox`. But, even more critically, you can call:
- `mailbox.ToConsumer<M>()` - to get a consumer of values being posted into the `Inbox`
- `mailbox.ToProducer<M>()` - to get a producer of values being yielded into the `Outbox`
A good example of why this is useful is the new `Producer.merge` function:
public static ProducerT<OUT, M, Unit> merge<OUT, M>(Seq<ProducerT<OUT, M, Unit>> producers) where M : Monad<M> =>
from mailbox in Pure(Mailbox.spawn<OUT>())
from forks in forkEffects(producers, mailbox)
from _ in mailbox.ToProducerT<M>()
from x in forks.Traverse(f => f.Cancel).As()
select unit;
static K<M, Seq<ForkIO<Unit>>> forkEffects<M, OUT>(
Seq<ProducerT<OUT, M, Unit>> producers,
Mailbox<OUT, OUT> mailbox)
where M : Monad<M> =>
producers.Map(p => (p | mailbox.ToConsumerT<M>()).Run())
.Traverse(ma => ma.ForkIO());
The `merge` function gets a collection of producers. What we want is for those to run concurrently so we can receive the values as they happen. Then we want to produce a single merged stream of values.
This creates the merged-stream `Mailbox`:
from mailbox in Pure(Mailbox.spawn<OUT>())
In `forkEffects` we process each producer `p` and pipe its values to `mailbox.ToConsumerT`:
p | mailbox.ToConsumerT<M>()
So, we get a `ConsumerT` for the merged-stream's `Mailbox`. It consumes every value from `p`, fusing into an `EffectT`. We then `Run()` that `EffectT`, which gives us the underlying `M` monad:
(p | mailbox.ToConsumerT<M>()).Run()
We do this for every `ProducerT`, which means the merged-values `Mailbox` gets every value yielded from upstream.
producers.Map(p => (p | mailbox.ToConsumerT<M>()).Run())
Finally, we `ForkIO` each `EffectT` so that it can run in parallel.
.Traverse(ma => ma.ForkIO())
Back to the `merge` function: we then access the other side of the mailbox by asking for the `Outbox` producer, using `ToProducerT`:
from _ in mailbox.ToProducerT<M>()
This will then yield all of the merged values downstream (whilst there are values to yield). Once complete, we tidy up the forks:
forks.Traverse(f => f.Cancel).As()
Cofunctor
`Mailbox` is pretty powerful in its own right and doesn't need pipes to function. This is a quick example of a loop that reads every value posted to a `Mailbox` and writes it to the console:
static IO<Unit> consumeAll(Mailbox<string, string> mailbox) =>
from x in mailbox.Read()
from _ in IO.lift(() => Console.WriteLine(x))
from r in consumeAll(mailbox)
select r;
`Mailbox<A, B>` has two type parameters: `A` represents the values coming in and `B` represents the values being yielded.
A -> B
Values of type `A` are posted to `Mailbox.Inbox` and values of type `B` are yielded from `Mailbox.Outbox`.
If you call `mailbox.Map<C>((B b) => ...)` on `Mailbox` then you could imagine `Mailbox` being represented like this:
A -> B -> C
The result is a `Mailbox<A, C>`, but internally there's a mapping of the values as they flow through.
Subsequent calls to `Map<D>`, and the like, would continue to transform the value being yielded from the `Mailbox.Outbox`:
A -> B -> C -> D
But what if we wa...
Minor updates and fixes
Iterator: a safe IEnumerator
Language-ext gained `Iterable` a few months back, which is a functional wrapper for `IEnumerable`. We now have `Iterator`, which is a more functional wrapper for `IEnumerator`.

`IEnumerator` is particularly problematic due to its mutable nature. It makes it impossible to share or leverage safely within other immutable types.
For any type where you could previously call `GetEnumerator()`, it is now possible to call `GetIterator()`.
`Iterator` is pattern-matchable, so you can use the standard FP sequence-processing technique:
public static A Sum<A>(this Iterator<A> self) where A : INumber<A> =>
self switch
{
Iterator<A>.Nil => A.Zero,
Iterator<A>.Cons(var x, var xs) => x + xs.Sum()
};
Or, bog standard imperative processing:
for(var iter = Naturals.GetIterator(); !iter.IsEmpty; iter = iter.Tail)
{
Console.WriteLine(iter.Head);
}
You need to be a little careful when processing large lists or infinite streams. `Iterator<A>` uses `Iterator<A>.Cons` and `Iterator<A>.Nil` types to describe a linked-list of values. That linked-list requires an allocated object per item. That is not really a problem for most of us who want correctness over outright performance - it is a small overhead. But the other side-effect of this is that if you hold a reference to the head item of a sequence and you're processing an infinite sequence, then those temporary objects won't be freed by the GC, causing a space leak.
This will cause a space-leak:
var first = Naturals.GetIterator();
for(var iter = first; !iter.IsEmpty; iter = iter.Tail)
{
Console.WriteLine(iter.Head);
}
`first` references the first `Iterator<A>.Cons` and every subsequent item via the `Tail`.
This (below) is OK because the `iter` reference keeps being overwritten, which means nothing is holding on to the `Head` item in the sequence:
for(var iter = Naturals.GetIterator(); !iter.IsEmpty; iter = iter.Tail)
{
Console.WriteLine(iter.Head);
}
This type is probably more useful for me when implementing the various core types of language-ext, but I can't be the only person who's struggled with `IEnumerator` and its horrendous design.
A good example of where I am personally already seeing the benefits is `IO<A>.RetryUntil`.
This is the original version:
public IO<A> RepeatUntil(
Schedule schedule,
Func<A, bool> predicate) =>
LiftAsync(async env =>
{
if (env.Token.IsCancellationRequested) throw new TaskCanceledException();
var token = env.Token;
var lenv = env.LocalResources;
try
{
var result = await RunAsync(lenv);
// free any resources acquired during a repeat
await lenv.Resources.ReleaseAll().RunAsync(env);
if (predicate(result)) return result;
foreach (var delay in schedule.Run())
{
await Task.Delay((TimeSpan)delay, token);
result = await RunAsync(lenv);
// free any resources acquired during a repeat
await lenv.Resources.ReleaseAll().RunAsync(env);
if (predicate(result)) return result;
}
return result;
}
finally
{
// free any resources acquired during a repeat
await lenv.Resources.ReleaseAll().RunAsync(env);
}
});
Notice the `foreach` in there and the manual running of the item to retry with `RunAsync`. This has to go all imperative because there previously was no way to safely get the `IEnumerator` of `Schedule.Run()` and pass it around.
This is what `RetryUntil` looks like now:
public IO<A> RetryUntil(Schedule schedule, Func<Error, bool> predicate)
{
return go(schedule.PrependZero.Run().GetIterator(), Errors.None);
IO<A> go(Iterator<Duration> iter, Error error) =>
iter switch
{
Iterator<Duration>.Nil =>
IO.fail<A>(error),
Iterator<Duration>.Cons(var head, var tail) =>
IO.yieldFor(head)
.Bind(_ => BracketFail()
.Catch(e => predicate(e)
? IO.fail<A>(e)
: go(tail, e)))
};
}
Entirely functional, no imperative anything, and even (potentially) infinitely recursive depending on the `Schedule`. There's also no manual running of the `IO` monad with `RunAsync`, which means we benefit from all of the DSL work on optimising away the `async`/`await` machinery.
Future:
- Potentially use `Iterator` in `StreamT`
- Potentially use `Iterator` in Pipes
- Potentially create `IteratorT` (although this would likely just be `StreamT`, so maybe a renaming)
IO refactor continued
IO<A>
The work has continued, following on from the last `IO` refactor release. The previous release was less about optimisation and more about correctness; this release is all about making the `async`/`await` state-machines disappear for synchronous operations.
- The core `Run` and `RunAsync` methods have been updated to never `await`. If at any point an asynchronous DSL entry is encountered then processing is deferred to `RunAsyncInternal` (which does use `await`). Because `RunAsync` uses `ValueTask`, it's possible to run synchronous processes with next to zero overhead and still resolve to a fully asynchronous expression when one is encountered.
- The DSL types have all been updated too, to try to run synchronously if possible, and if not, to defer to asynchronous versions.
- DSL state-machine support for resource tracking. It automatically disposes resources on exception-throw.
- DSL support for three folding types: `IOFold`, `IOFoldWhile`, `IOFoldUntil` (see 'Folding')
- DSL support for `Final<F>` (see the `Final<F>` trait)
TODO
- DSL support for `Repeat*` and `Retry*` - then all core capabilities can run synchronously if they are composed entirely of synchronous components.
Folding
The standard `FoldWhile` and `FoldUntil` behaviour has changed for `IO` (and will eventually change for all `FoldWhile` and `FoldUntil` operations): it dawned on me that it was a bit of a waste that `FoldWhile` was equivalent to `FoldUntil` but with a `not` on the predicate.
So, the change in behaviour is:
- The `FoldUntil` predicate test (and potential return) is run after the fold-delegate has been run (so it gets the current value + the fold-delegate's updated state).
  - This was the previous behaviour.
- The `FoldWhile` predicate test (and potential return) is run before the fold-delegate is run (so it gets the current value + the current state).
The benefit of this approach is that you can stop a fold-operation running if the state is already 'bad' with `FoldWhile`, whereas with `FoldUntil` you can exit once the fold-operation makes the state 'bad'. The difference is subtle, but it does give additional options.
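In plain C# terms (a sketch of the semantics, not the `IO` implementation), the two behaviours look like this:

```csharp
using System;
using System.Collections.Generic;

public static class FoldSketch
{
    // FoldWhile: the predicate sees the CURRENT state and value, BEFORE folding.
    public static S FoldWhile<S, A>(IEnumerable<A> items, S state,
                                    Func<S, A, S> fold, Func<S, A, bool> pred)
    {
        foreach (var x in items)
        {
            if (!pred(state, x)) return state;   // stop if the state is already 'bad'
            state = fold(state, x);
        }
        return state;
    }

    // FoldUntil: the predicate sees the UPDATED state and value, AFTER folding.
    public static S FoldUntil<S, A>(IEnumerable<A> items, S state,
                                    Func<S, A, S> fold, Func<S, A, bool> pred)
    {
        foreach (var x in items)
        {
            state = fold(state, x);
            if (pred(state, x)) return state;    // stop once the fold has made the state 'bad'
        }
        return state;
    }
}
```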
Final<F> trait
`Final<F>` is a new trait-type to support `try`/`finally` behaviour. This has been implemented for `IO` for now and will expand out to other types later. You can see how this works from the updated implementation of `Bracket`:
public IO<C> Bracket<B, C>(Func<A, IO<C>> Use, Func<Error, IO<C>> Catch, Func<A, IO<B>> Fin) =>
Bind(x => Use(x).Catch(Catch).Finally(Fin(x)));
It's still early for this type, but I expect to provide `@finally` methods that work a bit like the `@catch` methods.
Unit tests for IO
One of the things that's holding up the full release of `v5` (other than the outstanding bugs in Pipes) is the lack of unit-tests for all of the new functionality. So, I've experimented with using Rider's AI assistant to help me write the unit-tests. It's fair to say that it's not too smart, but at least it wrote a lot of the boilerplate. So, once I'd fixed up the errors, it was quite useful.

It's debatable whether it was much quicker or not. But I haven't really spent any time with AI assistants, so I guess it might just be my inexperience in prompting them. I think it's worth pursuing to see if it can help me get through the unit-tests that are needed for `v5`.
IO refactor
IO
I have refactored the `IO<A>` monad, which is used to underpin all side-effects in language-ext. The API surface is unchanged, but the inner workings have been substantially refactored. Instead of the four concrete implementations of the abstract `IO<A>` type (`IOPure<A>`, `IOFail<A>`, `IOSync<A>`, and `IOAsync<A>`), there is now a 'DSL' of operations deriving from `IO<A>` which are interpreted in the `Run` and `RunAsync` methods (it now works like a Free monad).
The DSL is also extensible, so if you have some behaviours you'd like to embed into the IO interpreter then you can derive from one of these four types.
The benefits of the refactored approach are:
- Fixes this issue
- Should be more performant (although not fully tested)
- Extensible DSL
- Can support infinite recursion
The last one is a big win. Previously, infinite recursion only worked in certain scenarios. It should work for all scenarios now.
Take a look at this infinite-loop sample:
static IO<Unit> infinite(int value) =>
from _ in writeLine($"{value}")
from r in infinite(value + 1)
select r;
The second `from` expression recursively calls `infinite`, which would usually blow the stack. However, now the stack will not blow and this example will run forever*.
*All LINQ expressions require a trailing `select ...`; that is technically something that needs to be invoked after each recursive call to `infinite`. Therefore, with the above implementation, we get a space-leak (memory is consumed for each loop).
To avoid that when using recursion in LINQ, you can use the `tail` function:
static IO<Unit> infinite(int value) =>
from _ in writeLine($"{value}")
from r in tail(infinite(value + 1))
select r;
For other monads and monad-transformers that lift the `IO` monad into their stacks, you can use the `tailIO` function. This should bring infinite recursion to all types that use the `IO` monad (like `Eff`, for example).
What `tail` does is say: "We're not going to run the `select` at all, we'll just return the result of `infinite`". That means we don't have to keep track of any continuations in memory. It also means you should never do extra processing in the `select` - just return the `r` as-is and everything will work: infinite recursion without space leaks.
`tail` is needed because the `SelectMany` used by LINQ has the final `Func<A, B, C>` argument to invoke after the `Func<A, IO<B>>` monad-bind function (which is the recursive one). The `Func<A, B, C>` is the trailing `select` and is always needed. It would be good if C# supported a `SelectMany` that is more like a regular monadic bind and recognised the pattern of no additional processing in the `select`, but we have to put up with the hand we're dealt.
Not doing work after a tail-call is a limitation of tail-recursion in every language that supports it. So, I'm OK with being explicit about it in LINQ. Just be careful not to do any additional processing or changing of types in the `select`.
Note, if you don't use LINQ and instead use a regular monad-bind operation, then we don't need the `tail` call at all:
static IO<Unit> infinite(int value) =>
writeLine($"{value}")
.Bind(_ => infinite(value + 1));
That will run without blowing the stack and without space-leaks. Below is a chart of memory-usage after 670 million iterations:
What's nice about looking at the memory-graph is that, not only is it flat in terms of total usage (around 26MB), it only ever uses the Gen 0 heap. This is something I've always said about functional programming: we may generate a lot of temporary objects (lambdas and the like), but they rarely live long enough to cause memory pressure in the higher generations of the heap. Even though this is a very simple sample and you wouldn't expect that much pressure, the benefit of most of your memory usage being in Gen 0 is that you're likely using memory addresses already cached by the CPU -- so the churn of objects is less problematic than is often posited.
StreamT made experimental
I'm not sure how I'm going to fix this issue, so until I have a good idea, `StreamT` will be marked as `[Experimental]`. To use it, add this to the top of a file:
#pragma warning disable LX_StreamT
Conclusion
This was a pretty large change, so if you're using the beta in production code, please be wary of this release. And, if you spot any issues, please let me know.
XML documentation updates
One of the most idiotic things Microsoft ever did was to use XML as a 'comment' documentation-format when the language it is documenting is full of `<`, `>`, and `&` characters. I'd love to know who it was that thought this was a good idea.
They should be shunned forever!
Anyway, more seriously, I have ignored the 'well formed' XML documentation warnings forever. I did so because I consider the readability of comments in code to be the most important factor. So, if I needed a `<` or a `>` and it looked OK in the source, then that was good enough for me.
I even built my own documentation generator that understood these characters and knew how to ignore them if they weren't known tags (something Microsoft should have done by now!).
I refused to make something like `Either<L, R>` turn into `Either&lt;L, R&gt;`.
Anyway, there are places where this is problematic:
- Inline documentation tooltips in IDEs
- The new 'formatted' documentation of Rider (maybe VS too, but I haven't used it in ages).
The thing is, I still think the source-code is the most important place for readability, so requests like this I ignored until I could think of a better solution. Now I have a better solution, which is to use alternative unicode characters that are close enough to `<`, `>`, and `&` that they read naturally in source and in documentation, but are also not XML delimiters:
| Original | Replacement | Unicode | Example (before) | Example (after) |
|---|---|---|---|---|
| `<` | `〈` | U+3008 | `Either<Error, A>` | `Either〈Error, A〉` |
| `>` | `〉` | U+3009 | `Either<Error, A>` | `Either〈Error, A〉` |
| `&` | `＆` | U+FF06 | `x && y` | `x ＆＆ y` |
The spacing isn't perfect and in certain situations they look a touch funky, but still legible, so I'm happy to go with this as it makes the documentation work everywhere without compromise. It's slightly more effort for me, but I'd rather do this than compromise the inline comments.
Over 120 source files have been updated with the new characters. There are now no XML documentation errors (and I've removed the `NoWarn` that suppressed XML documentation errors).
There may well be other unicode characters that are better, but at least now it's a simple search & replace if I ever decide to change them.