log :: 2025‐06
It took a couple of minutes of discussion with RK to find the source of the "messages lost" issue when connecting nodes: the Actor library we use, acto, has bounded-size mailboxes for each spawned actor. When a mailbox is full, new incoming messages are dropped: they become dead letters and are never processed. While this might be consistent with the usual Actor semantics, it's quite inconvenient for us.
There are a variety of options to deal with the issue, which amounts to handling back-pressure:
- keep track of sent messages and require an `Ack` from the actor. This complicates the machinery, but is in accord with actors' look and feel
- provide a `try_send` method and retry sending the message if the mailbox is full
- increase the size of the mailbox!
We went for the last option as a stop-gap approach, looking forward to implementing the new p2p-based network approach discussed in the previous days.
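For reference, here is a rough sketch of what the `try_send`-and-retry option could look like, using a bounded tokio mpsc channel as a stand-in for acto's mailbox (this is not acto's API; channel size, backoff and message type are made up):

```rust
use std::time::Duration;
use tokio::sync::mpsc::{self, error::TrySendError};

#[tokio::main] // requires tokio with the "full" feature set
async fn main() {
    // A bounded "mailbox" of size 8, standing in for an actor's inbox.
    let (tx, mut rx) = mpsc::channel::<String>(8);

    // Sender side: retry with a small backoff instead of silently dropping
    // the message when the mailbox is full.
    let producer = tokio::spawn(async move {
        for i in 0..100 {
            let mut msg = format!("message {i}");
            loop {
                match tx.try_send(msg) {
                    Ok(()) => break,
                    Err(TrySendError::Full(returned)) => {
                        msg = returned; // keep the message and retry later
                        tokio::time::sleep(Duration::from_millis(10)).await;
                    }
                    Err(TrySendError::Closed(_)) => return,
                }
            }
        }
    });

    // "Actor" side: drain the mailbox until the sender is done.
    while let Some(msg) = rx.recv().await {
        println!("processing {msg}");
    }
    let _ = producer.await;
}
```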
- Collecting interesting metrics for consensus was relatively straightforward:
  - define a `ConsensusMetrics` structure containing a few `Gauge`s and `Counter`s of interest
  - initialise the object at startup when Telemetry is active
  - pass an `Option<ConsensusMetrics>` to all stages so that they can report their own metrics. For stages which are simple wrappers around domain services, the metrics are passed to the latter so that it's the responsibility of the "business logic" to set and update metrics
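For illustration, here is a minimal sketch of the shape of such a structure. The field names are made up and the gauges/counters are backed by plain atomics rather than the actual telemetry handles used in the node:

```rust
use std::sync::{
    atomic::{AtomicU64, Ordering},
    Arc,
};

// Hypothetical sketch: the real ConsensusMetrics would hold the telemetry
// crate's Counter/Gauge handles; atomics stand in for them here.
#[derive(Clone, Default)]
pub struct ConsensusMetrics {
    headers_validated: Arc<AtomicU64>, // Counter: total headers that passed validation
    chain_tip_slot: Arc<AtomicU64>,    // Gauge: slot number of the current tip
}

impl ConsensusMetrics {
    pub fn record_validated_header(&self, slot: u64) {
        self.headers_validated.fetch_add(1, Ordering::Relaxed);
        self.chain_tip_slot.store(slot, Ordering::Relaxed);
    }
}

// Each stage (or the domain service it wraps) receives an Option<ConsensusMetrics>,
// so reporting is a no-op when Telemetry is disabled.
pub fn report_validated_header(metrics: &Option<ConsensusMetrics>, slot: u64) {
    if let Some(m) = metrics {
        m.record_validated_header(slot);
    }
}
```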
I have spent time since the beginning of the week polishing the "Amaru as a (primitive) relay" demo for next Friday, which led me to implement a few minor features and investigate a couple of issues.
- Simulation tests were failing on my machine (#278), and only on my machine. After spending some time comparing the `pure-stage` traces as discussed previously, it became clear the simulation engine was sometimes not sending messages in the right order, which led to chain validation failures. SA independently investigated the issue himself and this effort led to PRs #293 and #294:
  - Generated input messages are sorted by (planned) arrival time, which was an `Instant::now()`; on a fast machine this could lead to 2 messages having the same timestamp and therefore to non-deterministic sorting.
  - The issue was fixed by introducing a delay for each message diffusion which follows an exponential distribution (a rough sketch of such a delay generator is shown after this list).
- In practice, `ChainSync` message arrivals follow 2 modes: when catching up, they arrive as a constant stream of messages and therefore at the speed of the network, whereas when at tip, they arrive at a random interval depending on block creation time and block diffusion. The former follows an exponential distribution with parameter λ = 1/20 corresponding to the expected block creation frequency, whereas the latter depends on a propagation model which could be ΔQ-based.
- I initially thought the validation failure observed in simulation testing and in the 2-node setup could be caused by a race condition between child validation and parent storage, but this wasn't the case. It nevertheless led me to reorganise the pipeline to store the received headers before validating them (see #295). This is currently done inefficiently and somewhat unsafely, as we probably want to avoid storing headers multiple times and to detect known invalid headers early.
- I started trying amaru-doctor to investigate the state of each of the nodes. Weirdly, the upstream node doesn't appear to have stored the header it has sent, nor any other. Or rather, I could not find it using amaru-doctor, and I am unsure whether this is a problem with the tool or with the DB.
- I then resorted to running the 2 nodes, dumping their logs as JSON:

  ```sh
  AMARU_SERVICE_NAME=amaru-node1 AMARU_OTLP_SPAN_URL=http://clermont:4317 AMARU_OTLP_METRIC_URL=http://clermont:4318/v1/metrics cargo run -- --with-json-traces daemon --network preprod --ledger-dir node1/ledger.db --chain-dir node1/chain.db --peer-address clermont:3001 --listen-address localhost:3000 > node1.log
  ```
  - Note that sending the logs to a file is annoying right now because the output is mangled with ANSI escape characters. `ripgrep` colorizes the output and adds line numbers only in the terminal, which is quite useful.
  - Noticed that `stage.ledger` does not trace the header hash and slot, so it's hard to correlate those traces; I added some fields to the instrumentation there.
  - Ultimately, we could observe a gap in `forward_chain` between the faulty child's operation handling and the previous operations:

    ```json
    {"timestamp":"2025-06-24T08:28:24.030110Z","level":"DEBUG","fields":{"message":"got op Forward { header: (2674874, (70264698, c3baea7ec628ae2da91bbae6e11c724b2103b2fe9ae741c063c0e78285c7e5cc)), tip: (2674874, (70264698, c3baea7ec628ae2da91bbae6e11c724b2103b2fe9ae741c063c0e78285c7e5cc)) }"},"target":"amaru::stages::consensus::forward_chain::client_protocol"}
    {"timestamp":"2025-06-24T08:28:24.030134Z","level":"DEBUG","fields":{"message":"got op Forward { header: (2677966, (70349466, ce251a72e20cd6202f217f45e0542ac5c92446901dd040ccb019f126441ea12f)), tip: (2677966, (70349466, ce251a72e20cd6202f217f45e0542ac5c92446901dd040ccb019f126441ea12f)) }"},"target":"amaru::stages::consensus::forward_chain::client_protocol"}
    ```

    Between slots 70264698 and 70349466 that's 84768 slots, and more than 3100 headers that do not appear in the `sending op` traces. Listing a few hashes between those 2 bounds:
    - 57d7445b2f2b18dea80664111d4ddb56c78bcb3d7486b1a1a43ca8ae160c8445
    - a9a9043106a384b2dc49f89b3eaed75097d446a1fffd98876c36edebb46224d1
    - 9bb187043f99e501c72fa99b37d64522da33d9313f16e4288c3294958b195c17

    and grepping the logs shows that they never get handled in the `client_protocol` actor.
- I commented on the issue and asked for RK's help.
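As an aside, here is a rough sketch of drawing per-message diffusion delays from an exponential distribution, as mentioned above (using `rand`/`rand_distr`; the λ = 1/20 rate mirrors the expected block interval, and the actual parameters used by the generator may differ):

```rust
use rand::{rngs::StdRng, SeedableRng};
use rand_distr::{Distribution, Exp};
use std::time::Duration;

fn main() {
    // Seedable RNG so the generated schedule stays reproducible for a given seed.
    let mut rng = StdRng::seed_from_u64(42);

    // Rate λ = 1/20: one event every 20 seconds on average.
    let exp = Exp::new(1.0 / 20.0).unwrap();

    // Draw a positive delay for each generated message so that two messages
    // (almost) never end up with the exact same planned arrival time.
    let delays: Vec<Duration> = (0..5)
        .map(|_| Duration::from_secs_f64(exp.sample(&mut rng)))
        .collect();

    println!("{delays:?}");
}
```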
This experience led me to the realisation that our metrics story for the consensus and stages pipeline wasn't great, so I started implementing a `ConsensusMetrics` structure and threading it through the pipeline to report various metrics, following the guidelines from EDR#7.
Discussing integration of @scarmuega's work on peer-to-peer networking into amaru:
- Pallas primitives could be used in another context -> generic interfaces to the network
- scope of the p2p pallas work?
- Network v2 provides a strict boundary between actual IO in the network and state
  - "business logic" communicates with state through commands and events
  - there's one piece of state for each pair of mini-protocol and peer
  - note that the address is part of the state for each peer
- The central concept is the Behaviour trait, which exposes a limited set of functions invoked by the manager (see the sketch after this list). Network2 strives to provide opinionated standard behaviours in `standard`
  - for example, the `InitiatorCommand` groups several mini-protocols in a single interface
  - behaviours are composable
  - the standard behaviour takes care of the p2p peer selection logic
  - how much distance should there be between the standard behaviour and an amaru-specific behaviour? How much of the standard behaviour can we reuse?
  - `schedule_io`/`apply_io` are called by the manager to signal incoming events or requests. Interestingly, the sent message is also passed back to `schedule_io`
- To deal with the initiator/responder distinction, there could be 2 different behaviours: 1 initiator and 1 responder
  - But note that the Haskell node will try to upgrade a single stream connection to full duplex depending on how the connection is initiated (initiator-only vs. initiator-and-responder)
  - Hence it makes sense to have a single behaviour for responder/initiator but with a composable approach
- what about backpressure?
  - could be handled by effects or by bounding the queue
- What about disconnects?
  - when getting a `PeerDisconnected` while a message is in flight, we have to tear down the state (and resend the message)
- next steps:
  - show @scarmuega around the Amaru code
  - implement the initiator side of Amaru networking (e.g. the `fetch_header` stage) using the new library
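To make the discussion concrete, here is a purely hypothetical sketch of what such a Behaviour trait could look like; the names and types are invented for illustration and do not reflect the actual pallas network v2 API:

```rust
use std::time::Instant;

// Stand-in types for illustration only.
pub struct PeerId(pub u64);

pub enum InitiatorCommand {
    RequestNextHeader(PeerId),
    FetchBlock(PeerId, Vec<u8>),
}

pub enum IoEvent {
    BytesReceived(PeerId, Vec<u8>),
    // The original outbound message is handed back so the behaviour can
    // update its per-peer state (or resend it after a disconnect).
    SendCompleted(PeerId, Vec<u8>),
    PeerDisconnected(PeerId),
}

pub enum IoAction {
    Send(PeerId, Vec<u8>),
    Dial(String),
}

pub trait Behaviour {
    /// Business logic asks for something to happen (e.g. fetch a header).
    fn handle_command(&mut self, cmd: InitiatorCommand);

    /// The manager signals that some IO happened or completed.
    fn apply_io(&mut self, event: IoEvent);

    /// The manager asks which IO should be scheduled next.
    fn schedule_io(&mut self, now: Instant) -> Vec<IoAction>;
}
```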
Spent the morning with RK, SA and AB troubleshooting a simulation test failure surfacing only on a specific machine (now reported as issue #278).
- First tried to reliably reproduce the observed failure (on macOS)
- This led us to add the ability to pass a `seed` value to the test generator. We need to be able to set that on the command line but it's currently only settable in code
- The failure is not deterministic even when setting the seed, so this should come from the underlying scheduling. We then wanted to compare the trace of execution of a successful and of a failed test.
  > [!WARNING]
  > "trace" is heavily overloaded in that context and we should find a way to distinguish between the various usages of the word:
  > - the trace of messages the simulation tester uses for simulation testing the SUT, and which is dumped upon failure
  > - the "trace" of log messages that's printed during test execution and comes from the `tracing` system
  > - the trace from the `TraceBuffer` that collects all inputs and outputs in the stage-graph runner
- We then wanted to record and compare the stage-graph execution traces, so we added to the simulation the code to capture the trace for all instances and dump those to a file (see the sketch after this list):
  - the `TraceBuffer` is shared through an `Arc<Mutex<TraceBuffer>>` reference and cloned for each node instance we launch
  - output is recorded in a timestamped file in raw CBOR format
- Once we got our hands on both a success and a failure dump, we realised our tooling was inadequate to easily compare those traces:
  - raw CBOR is useless so we need to transform it. We first used rq to generate JSON out of CBOR:

    ```sh
    rq -c -J < simulation/amaru-sim/success-1750320738.trace | jq -c
    ```
  - this works fine but rq is not maintained anymore and some of the representations are not great (e.g. bytestrings become arrays of ints)
  - moreover, the content of the messages is very noisy, e.g. with full bytes of headers, decoded bodies, etc.
  - we ended up using cbor-bin, which provides some filtering capabilities and led to much simpler traces (see the issue linked above)
- While a bit frustrating, the session proved fruitful in pinpointing issues with test execution and in learning how to leverage the simulator's and stage-graph's capabilities, but also in highlighting how useful it can be to explore various execution paths and identify bugs
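Here is a small sketch of the sharing mechanism mentioned above; `TraceBuffer` is a stand-in type here, not the actual pure-stage one:

```rust
use std::fs::File;
use std::io::Write;
use std::sync::{Arc, Mutex};
use std::time::{SystemTime, UNIX_EPOCH};

// Stand-in for the pure-stage TraceBuffer: the real one records every stage
// input/output; this one just accumulates raw CBOR entries.
#[derive(Default)]
struct TraceBuffer {
    entries: Vec<Vec<u8>>,
}

fn main() -> std::io::Result<()> {
    // One shared buffer, cloned (by reference) into each node instance we launch.
    let trace = Arc::new(Mutex::new(TraceBuffer::default()));

    for _node in 0..3 {
        let trace = Arc::clone(&trace);
        // Each node would hold its clone and push entries as it runs.
        trace.lock().unwrap().entries.push(vec![0x80]); // dummy entry: empty CBOR array
    }

    // On completion (or failure) dump the whole buffer to a timestamped file.
    let stamp = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs();
    let mut out = File::create(format!("simulation-{stamp}.trace"))?;
    for entry in &trace.lock().unwrap().entries {
        out.write_all(entry)?;
    }
    Ok(())
}
```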
This is a summary of what SA and RK concluded so far about what a good API should be.
Context: since the pure-stage library allows for modelling several connected stages that might share memory, one cannot simply assume that the processing of one message is a discrete event that produces output messages instantly (which is a simplification the current Haskell simulator makes). So the simulator needs to orchestrate not only the network, but also the thread scheduler, so to speak.
Here's a sketch of the algorithm for the simulator:
1. Ask all nodes when the next interesting point in time is;
2. Pop the world heap to figure out which message to deliver next;
3. If there are no interesting times from step 1, or they all happen after the next message is supposed to be delivered, then advance the time to the arrival time of the message on all nodes and deliver the popped message.
   Otherwise put all the interesting times into the world heap and recurse. (We need to make it so that if we ask for the next interesting points in time in step 1 again, the nodes return an empty list?)
So the API should be something like:

```
handle : ArrivalTime -> Message -> NextArrivalTime -> Vec<Message>
nextInteresting : Time
advanceTime : Time -> Vec<Message>
```

where `NextArrivalTime` is when the next message (after the one we just popped) arrives, the idea being that the node we are currently stepping is free to run all its yield points until that time (since nothing interesting is happening in the world until then anyway).
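Translated into Rust, that API could look roughly like the following; all names and types here are stand-ins, not the actual pure-stage interface (in particular `next_interesting` returns an `Option` to cover the "empty list" case mentioned above):

```rust
use std::time::Duration;

/// Simulated time since the start of the test run.
type Time = Duration;

#[derive(Clone, Debug)]
struct Message {
    destination: String,
    payload: Vec<u8>,
}

trait SimNode {
    /// Deliver `msg` at `arrival`; the node may run all its yield points up to
    /// `next_arrival`, returning every message it emitted in that window.
    fn handle(&mut self, arrival: Time, msg: Message, next_arrival: Time) -> Vec<Message>;

    /// The next point in simulated time at which this node wants to be woken
    /// (e.g. a timer firing), if any.
    fn next_interesting(&self) -> Option<Time>;

    /// Advance this node's clock to `now`, returning messages produced by
    /// timers or other internal work that became due.
    fn advance_time(&mut self, now: Time) -> Vec<Message>;
}
```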
At last we were able to have 2 Amaru nodes connected, with one node using the other node as its (only) peer:
```mermaid
graph LR;
    amaru1 --> amaru2;
    amaru2 --> cardano-node;
```
This mostly involved implementing the `block_fetch` server in the `forward_chain` stage of the consensus.
There is a Draft PR in progress which requires a bit of polishing before merge:
- add some tests for the `block_fetch` logic
- document the use case in the README
We can make `bootstrap` in order to query all the data needed for an Amaru node to start syncing on `preprod`, which means:
- download the gzipped CBOR-encoded ledger state, aka snapshots, from some external resource
- update the nonces for the snapshots' tip
- download enough headers (at least the tip) to get started syncing from another node
I have packaged all those steps into a single command that takes a network identifier and various optional parameters and generates a node's database into a single directory. The goal is to provide a unified interface that does not expose the details of the bootstrapping process to the user, as ultimately we would like to:
- support bootstrapping an Amaru node from its own generated snapshots
- support bootstrapping from Mithril snapshots
- easily create testnets
The process was relatively straightforward: I only had to refactor a bit the `import_xxx.run` functions to avoid using `Args` structures, which in the end might not be that smart a move. I only had issues implementing the decompression for the `import_headers` function, because of identical names from different modules or crates being available in the environment:
- the `async-compression` crate is used to wrap compression libraries into async IO functions, which originally were supported directly in those libraries (e.g. flate2) but whose support was removed a while ago
- this library supports 2 different async IO ecosystems, `futures-io` or `tokio`, both of which expose an `AsyncBufRead` trait, enabled through feature flags
- I of course initially activated the wrong feature flag, which led me to spend a couple of hours scratching my head over type errors. I was able to sort out the issue thanks to RK's help!
The introduction of `reqwest` led to a dependency on another library with an unsupported license:
```
% cargo deny check -- license
error[rejected]: failed to satisfy license requirements
  ┌─ registry+https://github.com/rust-lang/crates.io-index#webpki-roots@0.25.4:4:12
  │
4 │ license = "MPL-2.0"
  │            ━━━━━━━
  │            │
  │            license expression retrieved via Cargo.toml `license`
  │            rejected: license is not explicitly allowed
  │
  ├ MPL-2.0 - Mozilla Public License 2.0:
  ├   - OSI approved
  ├   - FSF Free/Libre
  ├   - Copyleft
  ├ webpki-roots v0.25.4
  └── reqwest v0.11.27
      └── amaru v0.1.0
          └── amaru-sim v0.1.0
licenses FAILED
```
This was solved simply by adding the license to the `.cargo/deny.toml` file.
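For reference, the change amounts to something along these lines (the other entries in the allow list are made up here; only the `MPL-2.0` addition matters):

```toml
# .cargo/deny.toml (excerpt): explicitly allow MPL-2.0 so webpki-roots passes the check
[licenses]
allow = [
    "Apache-2.0",
    "MIT",
    "MPL-2.0",
]
```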
Been having some discussions about observability and metrics with AB and RK, thought I'd jot down some thoughts about observability that are also useful for testing.
If we store the arrival time of when an item got enqueued to the inbox of a stage, we can calculate the latency (the time of being latent) by taking the current time when the stage dequeues it from the inbox and subtracting the arrival time.
We can then process the item and take the current time again after that; the difference is the service time.
The response time is latency + service time.
These three times can be stored in a histogram that isn't sampling, for example this. It's important that the histogram isn't sampling, because then we might lose the tails of the distributions. Prometheus samples and requires you to run a separate process to scrape the metrics, which is a pain when testing locally.
Other stuff to store per stage: queue length, total processed items, idle time.
We can then use Little's law from queuing theory to calculate, for example, the throughput or utilisation.
Having all these statistics inside data structures per stage basically gives us a built-in profiler. We can then expose the stats in different ways, e.g. let Prometheus sample them. Or build a visualiser similar to what they recently built for Clojure's async flow which pings each stage and asks for the stats and visualises them in real time.
(Also if we add traces that can be replayed to put the system back to a state when an error happened, it will probably be useful to have OS-level metrics from that time. Because the error might depend on something that happened in the environment that the trace cannot reproduce.)
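To make the idea concrete, here is a minimal sketch of an instrumented inbox; the types are made up (not the pure-stage ones) and a real implementation would feed the durations into a non-sampling histogram rather than keeping running totals:

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

// An item is timestamped when enqueued so latency can be measured on dequeue.
struct Timed<T> {
    item: T,
    enqueued_at: Instant,
}

#[derive(Default)]
struct StageStats {
    processed: u64,
    total_latency: Duration, // time spent waiting in the inbox
    total_service: Duration, // time spent being processed
}

struct Inbox<T> {
    queue: VecDeque<Timed<T>>,
    stats: StageStats,
}

impl<T> Inbox<T> {
    fn enqueue(&mut self, item: T) {
        self.queue.push_back(Timed { item, enqueued_at: Instant::now() });
    }

    /// Dequeue one item, run the stage body on it, and record latency and
    /// service time; response time is simply the sum of the two.
    fn process<R>(&mut self, body: impl FnOnce(T) -> R) -> Option<R> {
        let timed = self.queue.pop_front()?;
        let start = Instant::now();
        self.stats.total_latency += start - timed.enqueued_at;
        let result = body(timed.item);
        self.stats.total_service += start.elapsed();
        self.stats.processed += 1;
        Some(result)
    }
}
```

From stats like these, Little's law (L = λ·W) then lets us derive, for example, the average number of in-flight items from the arrival rate and the mean response time.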
Finished chain sync message generator that uses a pre-generated block tree in Rust. Also paired with AB and RK to port the consensus pipeline, I think once that's done we can start doing basic simulation testing fully in Rust.
In parallel I've been trying to simplify and debug a Haskell prototype of a simulator that can do network-level fault injection. My hope is to try out some ideas there, so that by the time the Rust simulator catches up we can move over the ideas that work with minimal effort.
Discussion on stages PR
- The PR improves the previous way to define and link stages through various ergonomics tweaks, most notably in the way initial state is handled
  - We can now define breakpoints, which are just predicates over messages sent to a stage defining when it should stop
  - We can also override external effects, also using a function that can intercept specific messages and handle them
- `ExternalEffect` came out of the desire to be able to use async/await code outside of stages, e.g. to call into other effectful code, while still keeping track of side effects
  - In practice, when we want to ensure determinism in tests we can either resort to swapping a specific effect or inject an alternative implementation
- We decided to merge it and continue refactoring the amaru-sim code to use it, in order to fully validate that this is the way to go for the "full monty". The amaru-sim code only implements 3 stages but has to deal with storage
- We also discussed the main next step in this line of work, namely recording effects
- Recording effects will be very useful to troubleshoot production issues in a simulated environment: we could just replay the recorded trace in order to examine the state of the system
- In simulations, it's enough to record only inputs as everything is deterministic, so outputs naturally derive from inputs (e.g. command sourcing)
- Messages need to be serialised to be recorded, and obviously this has to be very efficient. Current best idea is to store them as CBOR bytes in a ring buffer, and persist them only on demand or in testing
- We discussed format and how to best handle serialisation for trace recording and human consumption. serde being the de facto standard in the Rust world and providing a unified approach to serialisation allowing for different consumers, we agreed it would be great to leverage it as much as possible.
- We could reuse the same serde `Serialize`/`Deserialize` traits to generate both CBOR and JSON, the latter being more amenable to human consumption, or possibly even other formats (see the sketch below).
- One of the shortcomings of minicbor is the need to write our own `Encoder`/`Decoder` implementations, which is quite annoying. It seems there exist options to reuse serde deriving macros, which would remove quite a lot of boilerplate.
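As a small illustration of that idea, the same derived traits can feed both a CBOR and a JSON encoder. `Message` is a made-up type, and `ciborium` is used here as one serde-compatible CBOR implementation (not necessarily the one we would pick):

```rust
use serde::{Deserialize, Serialize};

// serde derives give us both encodings "for free" (requires the "derive" feature).
#[derive(Serialize, Deserialize, Debug, PartialEq)]
struct Message {
    slot: u64,
    header_hash: String,
}

fn main() {
    let msg = Message { slot: 70264698, header_hash: "c3baea7e".into() };

    // Compact CBOR bytes, e.g. for the ring buffer / trace recording...
    let mut cbor = Vec::new();
    ciborium::ser::into_writer(&msg, &mut cbor).unwrap();

    // ...and JSON from the very same derives, for human consumption.
    let json = serde_json::to_string_pretty(&msg).unwrap();
    println!("{} bytes of CBOR\n{json}", cbor.len());

    // Round-trip back from CBOR to check nothing is lost.
    let decoded: Message = ciborium::de::from_reader(cbor.as_slice()).unwrap();
    assert_eq!(decoded, msg);
}
```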