log :: 2025‐03
Tip

Simulation Testing, Deterministic Runtime, Stake Distribution, Ledger Validation, DReps
These logs detail new simulation testing progress (finalizing demos, forging a deterministic runtime, bridging with Amaru), improved ledger validation and data handling (DReps, multi-relations, reduced duplication), and refinements in telemetry/tracing.
They also describe efforts to remove hardcoded epoch/slot logic, integrate partial chain data for faster tests, and unify concurrency patterns.
Overall, this work aligns the system more closely with Haskell node implementations and sets the stage for more robust end-to-end testing.
This week has mostly been wrapping things up for the demo with AB tomorrow. Fixing small things, improving the UI, trying to find a small bug to inject to show what the test output looks like in case of a failure, and working on slides.
I also started planning a bit for what to do after the demo. I feel like if we want to take simulation testing seriously we need to start taking determinism seriously, and right now we don't have a plan for how to achieve determinism. We are building upon a foundation that simply isn't good enough for what we are saying we want to achieve. The longer we keep doing this the more expensive it will be to fix down the road.
Therefore I'm working on documenting the Maelstrom distributed system DSL and how to implement a deterministic runtime for it. Here's what I believe to be true:
- The DSL is expressive enough for the job, based on the fact that the Maelstrom DSL can be used to implement mini-Datalog and Raft (these are examples from the Maelstrom repo) and the fact that Kyle Kingsbury uses Maelstrom to teach distributed systems classes;
- The approach works, based on my prior experience in writing a simulator.
What I don't know (yet):
- Is the DSL too flexible for what we need? Or in other words: can we get away with implementing a subset of the DSL? I don't understand the Amaru project well enough to tell at this point;
- I know how to do it if everything is single-threaded, but for performance we'll want to introduce parallelism. I have some ideas about how to introduce parallelism without breaking determinism for some subsets of the DSL, but I can't tell if this will be flexible enough or performant enough.
Next week, after the demo, when I've hopefully managed to write something coherent about the topic, I'll share it and try to get a conversation going.
Simulation tests do not work with real network data, as we want to be able to produce "interesting" synthetic chains and blocks to test the behaviour of the node. But we still want to validate headers and blocks correctly, and of course detect invalid headers and blocks. The `slot-arithmetic` crate existed for that purpose but was so far unused, so what I did was:
- Add a `Testnet` item to `NetworkName`, so that one can run amaru (and the simulator) without historical data. The `Testnet` is parameterised by a network magic number.
- Also added the capability to load an `Era` from JSON; this could be useful for testing or when we want to run amaru on other networks. For example, we could run amaru in a testnet along with cardano-nodes and bootstrap it with their era history.
- Convert the JSON era history for preprod into a static `EraHistory` object that can be used everywhere.
- Ensure that one can convert from a `NetworkName` to an `EraHistory`.
  - This is only implemented for `Preprod` and `Testnet`s, with the latter being hardcoded to a single era with 1000 epochs.
- Moved various hardcoded functions from `amaru-kernel` to `slot-arithmetic`, into the `EraHistory` where they really belong.
- Threaded an `EraHistory` across the various layers, creating it from the `NetworkName` passed to `main` as an argument and then passing it to the various layers that need it, most notably:
  - `ledger/state.rs`, which can then pass it on to other substructures
  - `rocksdb` and the consensus store, which need it to compute nonces correctly
- I hit a small snag with the examples that were not compiled locally and failed on CI. It's nice there's a `make all` command that ensures everything compiles correctly.
- I found the (ab)use of `u64` values to represent slots, epochs, or time a bit confusing, so I tried to implement a newtype pattern in Rust, but quickly gave up as it gave rise to way too much hand-written code (a minimal sketch of what this could look like follows this list).
- I tested running moskstraumen against the new `simulator` binary and it worked fine: no need to hardcode values for slot conversions!
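For illustration, here is a minimal sketch (hypothetical names, not the actual amaru types) of the kind of newtype wrappers mentioned above, and of the hand-written boilerplate each operation ends up requiring:

// Hypothetical newtype wrappers around u64; not the actual amaru types.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Slot(pub u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Epoch(pub u64);

impl Slot {
    // Every arithmetic operation has to be spelled out (or derived via extra
    // crates), which is the boilerplate that made the approach unattractive.
    pub fn elapsed_since(self, other: Slot) -> u64 {
        self.0.saturating_sub(other.0)
    }
}

impl From<u64> for Slot {
    fn from(n: u64) -> Self {
        Slot(n)
    }
}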
I spent some time working on clarifying the tracing implementation and making sure the overall span traces make sense and are useful.
The first step was to favour smaller functions and leverage the `instrument` attribute: manual `span` creation is too error prone and significantly obfuscates the code.
Another step was to remove `parent` from spans, as they should be inherited. This removes the need to pass spans around, cleaning up the API.
The one thing needed to enable this is to make sure parent spans are kept when crossing threads. gasket handles this internally, so without modifying gasket this has to be done manually on the amaru side when crossing stages.
Ultimately, the TL;DR when adding new traces (illustrated by the sketch after this list) is:
- only use `instrument` with `skip_all`
- explicitly add `fields`
- make sure `fields` are JSON-compliant strings
- no explicit `target`
- new fields can be added dynamically to the current span (created by the closest `instrument`), but they have to be listed in `instrument`
- events can be added via the `trace!` (and family) macros
- do not abuse `instrument`, as it clogs the system and makes it harder to understand the flow
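A minimal sketch of these guidelines, assuming the `tracing` crate and a hypothetical `validate_header` function (the names and fields are illustrative, not the actual amaru code):

use tracing::{instrument, trace, Span};

// Instrument with skip_all and explicit fields only; no manual span creation.
#[instrument(skip_all, fields(header.slot = %slot, header.hash = tracing::field::Empty))]
fn validate_header(slot: u64, raw_header: &[u8]) {
    // Fields declared in `instrument` (even as Empty) can be recorded later
    // on the current span.
    Span::current().record("header.hash", "d34db33f");

    // Events are added with the trace!/debug!/... macros rather than spans.
    trace!(size = raw_header.len(), "validating header");
}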
A few important things for you to note:
- There's now a preparation step, ahead of the validation. This step is used to figure out what data are going to be needed by the validation context. At this point, since the rules only require information about inputs (for key witness validations), we only construct a context for those.
- The preparation of the validation context is entirely decoupled from the actual pre-fetching of the underlying data. Right now, we define two kinds of Preparation/Validation contexts:
  - A fake one, suitable for testing, which requires the fixture data to be provided upfront. Used for example here or here.
  - A simple one, currently used during normal operation. Calls to `require` will mark data as needed, and I've introduced a step for pre-fetching the underlying data and constructing a valid simple validation context. For now, the `simple` one is extremely basic and implements just what's needed to get the integration tests to pass. A next step (from my end) will be to fully port the current state management to this new approach, very likely just re-using what we have now.
- Then, the validation code now successfully uses the provided context to (a) resolve inputs from it and (b) produce outputs which may be needed by further validations. At this point, we do not consume inputs, so double-spending is effectively possible :). We shall do this at the end of every transaction, of course.
- With these last changes, we're able to run the integration tests and actually validate signatures on PreProd as part of the ledger validations 🎉! Note that this wasn't the case before, since we would simply ignore any such validation of any unknown input... Generally speaking, the code in the rules crucially misses tests. At the very least now, the integration tests will catch a few things, but they aren't sufficient -- they should remain only a last rampart. I've added an example of a validation yielding a `MissingVKeyWitness` and, moving forward, I'd expect more of those for each possible validation failure -- and this, before we introduce any new rules (that'll give us some time in the meantime to finish merging the state & validations!).
- I've also reworked some errors, to make the rules a bit more composable. The idea is to structure errors in the same way we structure rules: instead of having "low-level rules" yield a plain `TransactionRuleViolation`, which would rapidly grow quite large, they can yield an error that's more restricted to the thing they're validating. So far, I've done it in two places (three if we count the unification of bootstrap/vkey witnesses verification):
  - `InvalidVKeyWitnesses` now takes an array of `InvalidVKeyWitness` errors, which are produced by verify_ed25519_signature.
  - `InvalidOutputs` now takes an array of `InvalidOutput` errors, which are produced by single-output validations. We only have one for now, regarding the min ada value.

  Remark that I am no longer cloning the underlying data as part of the error, but instead prefer reporting errors using an index/position of the corresponding element within the transaction. To avoid repeating this over and over, I've introduced a little helper structure that makes the construction of such errors straightforward.
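To illustrate the shape this gives to errors, here is a rough sketch; the helper name (`WithPosition`) and the exact variants are hypothetical, and only meant to show the idea of nesting restricted errors and reporting positions instead of cloning the underlying data:

// Hypothetical illustration of nesting restricted errors per rule.
pub enum TransactionRuleViolation {
    InvalidVKeyWitnesses { invalid_witnesses: Vec<WithPosition<InvalidVKeyWitness>> },
    InvalidOutputs { invalid_outputs: Vec<WithPosition<InvalidOutput>> },
    // ... other high-level rule violations
}

pub enum InvalidVKeyWitness {
    InvalidSignature,
}

pub enum InvalidOutput {
    OutputTooSmall { minimum_required_value: u64 },
}

// Small helper so errors point at the element's position within the
// transaction rather than carrying a clone of the data itself.
pub struct WithPosition<E> {
    pub position: usize,
    pub element: E,
}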
- Supported AB with various small fixes to get the simulation testing demo working.
Introduced myself to RK and tried to get a DST discussion going by sending him some stuff I've written.
Had a kick-off meeting regarding Antithesis, where AB reminded me of the fact there is an example in their docs (I had forgotten about this).
Otherwise I've started documenting two things: workloads (and faults), and the Maelstrom DSL and how to implement a deterministic runtime for it. While trying to explain how to implement the runtime, I realised that my current implementation is too complicated to explain easily, so I started thinking about how it can be simplified. I think this will be important when discussing with RK how to make the consensus node runtime deterministic.
- Made some progress towards being able to test consensus through moskstraumen walking a block tree and simulating interactions with upstream peers (and checking the SUT selects the right chain along the way).
- Ditched gasket, which does not add value at this stage and makes the code more complicated than it should be: the `run_simulator` is just an `async` function that loops over lines read from `stdin`, passes the relevant messages to the `Consensus` for forwarding or rolling back, and then outputs the resulting `BlockValidatedEvent` to `stdout` using the same format as the input.
- I wasted some time struggling with I/O and reading lines: I had created two `BufReader` structures reading `stdin()` in different functions, one to read the init message which gives the list of upstream nodes' "addresses", and one for looping over other input messages. Turns out this is not a good idea, as it seems the first one swallows all the input there, or locks it, so the second one does not get a chance to read anything.
- Ultimately created a `MessageReader` trait with 2 implementations in order to test the reading logic: one implementation depending on `stdin` and the other one simply reading a list of `String`s (a sketch of what this could look like follows this list). This is a bit of a mess and I suspect there already exist plenty of tools for this that I am not aware of...
- Once the input part was sorted out, I hit issues validating the header when computing nonces: nonce computation assumes things about the current slot and epoch (e.g. that it's past the Byron and Shelley hard forks, which is hardcoded to the `preprod` value).
  - TODO: we should have network-specific values for all those things, or at least the possibility to swap the "real" network with a test one.
  - Note I could have just mocked the `ChainStore` trait which provides nonces computation; I might do it later down the road, as it removes the dependency on the underlying `RocksDB` storage while adding another layer of mocks.
  - I ended up manually hardcoding the epoch value and `import-nonces` with all zeros, and defaulting to all-zeros if the header has no parent, thus denoting the first header after genesis.
- Next steps to complete the whole journey are:
  - implement remaining methods from `FakeStakeDistribution`, similar to the existing mock
  - mock `block_fetch`. Ideally, this should be part of the protocol interacting with the tester, but for now let's keep it simple
  - handle the `ValidateBlockEvent` emitted by the consensus once it has a chain candidate and has downloaded the block. This last one is a bit tricky and annoying because the events do not hold the block header, only the body, and we need the header when forwarding. I was pondering whether I should read the header from the `ChainStore` with the point, or if I should more simply add the header to the `Validate` event.
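A minimal sketch of what such a `MessageReader` trait could look like (hypothetical method names; the actual amaru code may differ), with one implementation backed by `stdin` and one replaying canned messages for tests:

use std::io::{self, BufRead};

pub trait MessageReader {
    /// Return the next input line, or None when the input is exhausted.
    fn next_message(&mut self) -> Option<String>;
}

/// "Real" implementation, reading lines from stdin.
pub struct StdinReader {
    lines: io::Lines<io::StdinLock<'static>>,
}

impl StdinReader {
    pub fn new() -> Self {
        StdinReader { lines: io::stdin().lock().lines() }
    }
}

impl MessageReader for StdinReader {
    fn next_message(&mut self) -> Option<String> {
        self.lines.next().and_then(|line| line.ok())
    }
}

/// Test implementation, replaying a fixed list of messages.
pub struct CannedReader {
    messages: std::vec::IntoIter<String>,
}

impl CannedReader {
    pub fn new(messages: Vec<String>) -> Self {
        CannedReader { messages: messages.into_iter() }
    }
}

impl MessageReader for CannedReader {
    fn next_message(&mut self) -> Option<String> {
        self.messages.next()
    }
}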
We took a first bite at implementing governance, starting with the tracking of delegate representatives (a.k.a. DReps). We initially thought this would be relatively easy, but it had its share of problems.
Like a stake pool, a DRep can have many delegators, and a delegator can have at most one DRep. Fundamentally, we have something like:
erDiagram
Account }o--|| DRep : "delegate vote"
Account }o--|| Stake-Pool : "delegate consensus"
Hence, the first intuition is to store the DRep relationship next to the Pool relationship within the Account entity. So far so good. However, our storage model is deferred: updates to apply to the persistent store appear first as deltas in a volatile queue. For a one-to-many entity relationship like this, we use:
- For new relations: key:value map, where the key represents an Account's stake credential and the value is an optional entity.
- For account removals: a key set, containing the stake credentials of the accounts being removed.
This works well when tracking a single relation, but we ran into issues as soon as we added a second relation. The main problem came from our internal representation of those relations:
pub struct DiffRelation<K: Ord, L, R> {
pub registered: BTreeMap<K, Relation<L, R>>,
pub unregistered: BTreeSet<K>,
}
pub struct Relation<L, R> {
pub left: Option<L>,
pub right: Option<R>,
}
By holding an `Option` for the `left` and `right` relations, we cannot distinguish between an absence of change and the need to remove a relation. For example, if we register or re-register a new entity and then introduce a left-relation, we end up with a relation holding `Some(L)` & `None`. In this case, I need to set or reset the `right` relation to `None`, since the entry is either new or re-registered (thus invalidating any previous relation).
However, I cannot distinguish it from the case where I simply introduce a left-relation on an existing entity, which also ends up yielding a pair `Some(L)` and `None`. Here though, `None` does not indicate a reset of the right relation, but rather an absence of action. To solve this ambiguity, we introduced the following:
pub enum Resettable<A> {
Set(A),
Reset,
Unchanged,
}
pub struct Relation<L, R> {
pub left: Resettable<L>,
pub right: Resettable<R>,
}
This correctly allows us to represent the first example as `Set(L)` & `Reset`, while the second example becomes `Set(L)` & `Unchanged`, removing the ambiguity. We could also have introduced separate maps for the left and right relations, so that the absence of an entry in a map would indicate the absence of changes, but this `Resettable` structure somewhat simplifies the implementation and testing underneath.
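As an illustration, folding a `Resettable` delta into the currently stored relation makes the three cases explicit (the `apply` helper below is hypothetical, not necessarily the actual amaru code):

impl<A> Resettable<A> {
    // Hypothetical helper: fold a delta into the currently stored relation.
    pub fn apply(self, current: Option<A>) -> Option<A> {
        match self {
            Resettable::Set(new) => Some(new), // establish or replace the relation
            Resettable::Reset => None,         // explicitly remove the relation
            Resettable::Unchanged => current,  // leave whatever was there before
        }
    }
}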
A great part of testing the ledger state revolves around comparing epoch snapshots with similar snapshots produced by a Haskell node; the Haskell implementation acts as an oracle for us. However, since we use different data formats and different serialisation strategies, we need to produce conformance snapshots that sit somewhere in-between both implementations. So far, we have three kinds of snapshots:
- Stake distribution snapshots: for account balances, stake pool balances & parameters as well as account <-> stake pool relations.
- Rewards summary snapshots: for intermediate rewards calculations (e.g. pools efficiency, total rewards, available rewards, ...)
- DReps snapshots: for relations accounts <-> dreps, and (eventually) drep states.
The last kind of snapshot was introduced recently to ensure conformance with the Haskell node. To produce it, we initially used information coming from the local-state-query protocol's `GetDRepState` query. This query returns, for each DRep, a set of delegators. However, after some headache, we realized that the delegators returned by this query were incoherent: the same delegator would sometimes be present in several DRep entries. After some digging, we found IntersectMBO/cardano-ledger#4772.
The issue and its fix make the problem clear: the DRep delegators weren't properly cleaned up in the early days of the Haskell implementation, effectively causing a memory leak. The Haskell node, however, also maintains a mapping in the other direction, from account to DRep, which it uses internally as a source of truth, so the overall state of the system wasn't compromised. Thus, we resorted to using another local-state query, `GetFilteredVoteDelegatees`, which yields the correct mapping from accounts to DReps.
I was able to do a first run of the `simulator` driven by Moskstraumen 🚀
However, it seems the executable is crashing and the output is not very clear:
% RUST_BACKTRACE=1 cabal run blackbox-test -- /Users/arnaud/projects/amaru/amaru/./target/debug/simulator amaru 1 --peer-address=127.0.0.1:3000 --stake-distribution-file data/stake.json
thread 'thread 'mainmain' panicked at ' panicked at simulation/amaru-sim/src/bin/amaru-sim/simulator.rssimulation/amaru-sim/src/bin/amaru-sim/simulator.rs::99:29:
unable to open chain store at ./chain.dbthread '
main' panicked at simulation/amaru-sim/src/bin/amaru-sim/simulator.rs:thread 'main99' panicked at :29:
unable to open chain store at ./chain.db
99simulation/amaru-sim/src/bin/amaru-sim/simulator.rs::29:
unable to open chain store at ./chain.db
stack backtrace:
99:29:
unable to open chain store at ./chain.db
stack backtrace:
stack backtrace:
stack backtrace:
0: _rust_begin_unwind
1: core::panicking::panic_fmt
2: simulator::simulator::bootstrap
3: tokio::runtime::park::CachedParkThread::block_on
4: tokio::runtime::context::runtime::enter_runtime
5: tokio::runtime::runtime::Runtime::block_on
6: simulator::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
0: _rust_begin_unwind
1: core::panicking::panic_fmt
2: simulator::simulator::bootstrap
3: tokio::runtime::park::CachedParkThread::block_on
4: tokio::runtime::context::runtime::enter_runtime
5: tokio::runtime::runtime::Runtime::block_on
6: simulator::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
0: _rust_begin_unwind
1: core::panicking::panic_fmt
2: simulator::simulator::bootstrap
3: tokio::runtime::park::CachedParkThread::block_on
4: tokio::runtime::context::runtime::enter_runtime
5: tokio::runtime::runtime::Runtime::block_on
6: simulator::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
0: _rust_begin_unwind
1: core::panicking::panic_fmt
2: simulator::simulator::bootstrap
3: tokio::runtime::park::CachedParkThread::block_on
4: tokio::runtime::context::runtime::enter_runtime
5: tokio::runtime::runtime::Runtime::block_on
6: simulator::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
blackbox-test: fd:5: Data.ByteString.hGetLine: end of file
{"timestamp":"2025-03-11T17:43:07.629613Z","level":"INFO","fields":{"message":"stage bootstrap ok"},"target":"gasket::runtime","span":{"stage":"pull","name":"stage"},"spans":[{"stage":"pull","name":"stage"}]}
{"timestamp":"2025-03-11T17:43:07.629641Z","level":"INFO","fields":{"message":"switching stage phase","prev_phase":"Bootstrap","next_phase":"Working"},"target":"gasket::runtime","span":{"stage":"pull","name":"stage"},"spans":[{"stage":"pull","name":"stage"}]}
...
It does not crash when running standalone and is able to open the DB, so I'm unsure what's going on here. Perhaps something with process control, or the way tokio is initialized? But that seems weird... Next step: try to just wrap the executable from the Haskell side and send it canned messages. There's probably a need for some integration tests on the simulator side.
I also bit the bullet and implemented a change in `chain_selector` to allow having `Genesis` or `Origin` as the `tip` of the chain selection, which will mostly be the case, at least initially, for simulations. This rippled all over the chain selection process but did not entail drastic refactorings, and led me to some nice refactoring of the `find_best_chain` function, which was the only place where handling `Genesis` was slightly annoying.
Interestingly, the Haskell node has a dedicated type `WithOrigin`, isomorphic to `Maybe`, that's used everywhere to wrap types that don't exist at `Genesis`. Perhaps something we should consider?
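For reference, a Rust counterpart would be a tiny enum isomorphic to `Option`; this is only a sketch of the idea, not an existing amaru type:

// Sketch: a Rust counterpart to Haskell's WithOrigin, isomorphic to Option.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WithOrigin<T> {
    Origin,
    At(T),
}

impl<T> WithOrigin<T> {
    pub fn to_option(self) -> Option<T> {
        match self {
            WithOrigin::Origin => None,
            WithOrigin::At(t) => Some(t),
        }
    }
}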
Fake stake distribution [#140]
In order to be able to test the consensus pipeline, whether inside a simulated environment (with a Moskstraumen/Maelstrom driver connected to stdin/stdout) or with jepsen (with synthetic workloads representing various possible scenarios), we need to be able to provide a test double for:
- the `HasStakeDistribution` interface, which is used to validate headers
- the ledger stage, which receives blocks to validate
- the block fetch stage, which calls upstream server(s) to retrieve the header's body once it's been validated.
Today I implemented the stake distribution part: the simulator takes a path to a stake distribution file containing data about the known stake pools, generated from the Haskell code that's also used to generate the block tree we'd be testing the consensus with. Took me some time to get up to speed again and reconcile the work-in-progress branch I had, but I finally nailed it down. I got hit by a mistake I made on the Haskell side: the `PoolId` was not serialised correctly, I was actually serialising the VRF cold key hash. This became obvious when I tried to look up the pool on the Rust side from the `PoolId`: the `PoolId` is a 28-byte long hash, whereas the VRF key hash is 32 bytes! I know I should have written a proper roundtrip property: always implement FromJSON/ToJSON together to ensure consistency of your representation.
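The same lesson applies on the Rust side; here is a hedged sketch of the kind of JSON roundtrip test the lesson points at, using serde_json and a hypothetical stand-in type (not the real stake distribution types):

#[cfg(test)]
mod tests {
    use serde::{Deserialize, Serialize};

    // Hypothetical stand-in for a stake distribution entry; the real types differ.
    #[derive(Debug, PartialEq, Serialize, Deserialize)]
    struct PoolEntry {
        pool_id: String,
        stake: u64,
    }

    #[test]
    fn pool_entry_json_roundtrip() {
        let entry = PoolEntry { pool_id: "pool1abc".to_string(), stake: 42 };
        let json = serde_json::to_string(&entry).expect("serialization failed");
        let decoded: PoolEntry = serde_json::from_str(&json).expect("deserialization failed");
        // Serializing and parsing must agree on the representation.
        assert_eq!(entry, decoded);
    }
}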
Next steps:
- fix the chain selector creation process, in order to be able to start from `Genesis`, which is not currently possible as we expect a concrete `Header` resolved from a `Point`
- move the block fetching part out of the way, into its own stage, in order to be able to fake it too.
- connect to SA's tester 🚀
We've spent some time this week setting up an integration environment for end-to-end tests. The main motivation was to ensure we could automate the snapshot checks that are the primary validation mechanism for the ledger state. The challenge is that those snapshot tests require running Amaru against the PreProd network for a few epochs, and then running the tests using the snapshots produced while syncing with the network.
There's no out-of-the-box way to run Amaru for a fixed time or until a specific event, but thanks to JSON traces being readily available, we can instrument Amaru to do just that with a couple of lines of bash.
AMARU_TRACE="amaru=debug" cargo run -- --with-json-traces daemon --peer-address=$AMARU_PEER_ADDRESS --network=preprod | while read line; do
EVENT=$(echo $line | jq -r '.fields.message' 2>/dev/null)
SPAN=$(echo $line | jq -r '.spans[0].name' 2>/dev/null)
if [ "$EVENT" == "exit" ] && [ "$SPAN" == "snapshot" ]; then
EPOCH=$(echo $line | jq -r '.spans[0].epoch' 2>/dev/null)
if [ "$EPOCH" == "$TARGET_EPOCH" ]; then
echo "Target epoch reached, stopping the process."
pkill -INT -P $$
break
fi
fi
done
The second challenge we faced was to get a reliable and fast-enough connection to a (Haskell) cardano-node to synchronize from. While there are public peers available, they are heavily rate-limited, and synchronizing the dozen or so epochs required by the test scenario would make the test too slow.
So instead, we've decided to synchronize from a local node running in a container. For this to work, however, the node needs to be somewhat synchronized already. This being a recurring problem for any client application wanting to perform e2e tests, we've re-used an existing solution to synchronize a node in a nightly workflow:
uses: CardanoSolutions/[email protected]
with:
db-dir: ${{ runner.temp }}/db-${{ matrix.network }}
network: ${{ matrix.network }}
version: ${{ matrix.ogmios_version }}_${{ matrix.cardano_node_version }}
synchronization-level: ${{ inputs.synchronization-level || 1 }}
To make all of this work, we've used a few tricks:
- We've used Github caches to avoid having to re-synchronize the node on each test run. Instead, we have a cron job running twice a day that refreshes a cache. The cache is then used by tests in a read-only fashion:

id: cache
uses: actions/cache@v4
with:
  path: ${{ runner.temp }}/db-${{ matrix.network }}
  key: cardano-node-ogmios-${{ matrix.network }}
  restore-keys: |
    cardano-node-ogmios-${{ matrix.network }}
- Github has some restrictions regarding cache accesses, but caches can be re-used across workflows. The cache's restore-key must obviously match on the consumer workflow, but so must the path into which the cache is downloaded: somehow, the output path is part of the cache invalidation strategy.
- Running a containerized (Haskell) cardano-node is relatively easy once one knows what the right options are and where to find the configuration files:

docker pull ghcr.io/intersectmbo/cardano-node:${{ matrix.cardano_node_version }}
make HASKELL_NODE_CONFIG_DIR=cardano-node-config NETWORK=${{ matrix.network }} download-haskell-config
docker run -d --name cardano-node \
  -v ${{ runner.temp }}/db-${{ matrix.network }}:/db \
  -v ${{ runner.temp }}/ipc:/ipc \
  -v ./cardano-node-config:/config \
  -v ./cardano-node-config:/genesis \
  -p 3001:3001 \
  ghcr.io/intersectmbo/cardano-node:${{ matrix.cardano_node_version }} run \
    --config /config/config.json \
    --database-path /db \
    --socket-path /ipc/node.socket \
    --topology /config/topology.json
- There are limits to both the maximum cache storage available per repository (10GB) and the Github runner hardware, in particular the available filesystem space (14GB). On PreProd, the node database already weighs around 11GB, leaving little space for (1) growth and (2) the rest of the build (e.g. cargo dependencies). Plus, it creates needlessly large caches that are also slow(er) to download and uncompress. Since Amaru doesn't do a full chain-sync, we can actually operate from a partial immutable database by removing immutable chunks prior to a certain point (roughly everything before chunk 3150). This saves us about 8GB:

name: prune node db
shell: bash
working-directory: ${{ runner.temp }}/db-${{ matrix.network }}
run: |
  rm -f immutable/00*.chunk immutable/01*.chunk immutable/02*.chunk immutable/030*.chunk
As the README shows, bootstrapping Amaru is becoming more and more complicated. To ease the setup, we've captured each of the bootstrapping steps in a Makefile, which comes in handy to define the Github workflow. A nice "help" is shown by default on `make`:
❯ make
Targets:
bootstrap: Bootstrap the node
dev: Compile and run for development with default options
download-haskell-config: Download Cardano Haskell configuration for $NETWORK
import-headers: Import headers from $AMARU_PEER_ADDRESS for demo
import-nonces: Import PreProd nonces for demo
import-snapshots: Import PreProd snapshots for demo
snapshots: Download snapshots
Configuration:
AMARU_PEER_ADDRESS ?= 127.0.0.1:3000
HASKELL_NODE_CONFIG_DIR ?= cardano-node-config
HASKELL_NODE_CONFIG_SOURCE := https://book.world.dev.cardano.org/environments
NETWORK ?= preprod
The last piece of the puzzle was to get the snapshot tests runnable in parallel, since this is what `cargo test` typically does. The issue is that each test depends on 2 on-disk snapshots, and may thus open concurrent, conflicting connections to the same store, causing the tests to fail arbitrarily. So instead, we now maintain a "pool" of re-usable read-only connections in a thread-safe manner, which ensures that tests can be invoked in any order, with non-conflicting ones even running in parallel:
pub static CONNECTIONS: LazyLock<Mutex<BTreeMap<Epoch, Arc<RocksDB>>>> =
LazyLock::new(|| Mutex::new(BTreeMap::new()));
fn db(epoch: Epoch) -> Arc<impl Snapshot + Send + Sync> {
let mut connections = CONNECTIONS.lock().unwrap();
connections
.entry(epoch)
.or_insert_with(|| {
Arc::new(
RocksDB::for_epoch_with(&LEDGER_DB, epoch).unwrap_or_else(|_| {
panic!("Failed to open ledger snapshot for epoch {}", epoch)
}),
)
})
.clone()
}
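A snapshot test can then simply grab its (shared, read-only) connection from that pool; a hypothetical example, assuming `Epoch` can be converted from a `u64`:

#[test]
fn stake_distribution_snapshot_epoch_168() {
    // Going through the shared pool means two tests touching the same epoch
    // reuse a single RocksDB handle instead of opening conflicting connections.
    let epoch: Epoch = 168u64.into();
    let snapshot = db(epoch);
    // ... query `snapshot` and compare against the expected conformance snapshot
}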
I started working on connecting the simulation testing work with Amaru.
So far, here is what I've got: given AB's prefabricated chain.json block tree, I can produce a sequence of sync protocol calls (Fwd and Bwd requests, where Bwd requests are issued when we reach dead ends in the tree and there's a possibility to backtrack). These calls are not actually made against the Amaru node yet, since that functionality isn't working yet.
Initially my plan was to do this in Clojure using Maelstrom, but given that the outcome of last week's discussion is that there's no "normal" workload to speak of at the moment, I opted for doing it in Haskell and using the simulation testing prototype (Moskstraumen) instead, since this gives us more flexibility to create "special" workloads if necessary.
There's some glue code still needed to connect the tests and the Amaru node, but all the pieces are there. My plan is to start with the Amaru echo example, since that passes the Maelstrom tests and therefore it should also be able to pass the Moskstraumen tests. That will prove that all the glue works, and then whenever the Amaru sync protocol works we can trivially add that to the test suite.
I also discussed future steps with AB, which include generalising to multiple clients walking the block tree, partitions that can be simulated by clients that stop sending messages, and correctness criteria in terms of slots. But this is all still a bit abstract.