Contributor

@rkuhn rkuhn commented Dec 18, 2025

No individual tests yet; they will be added once the new stages are complete.

Summary by CodeRabbit

  • New Features

    • In-memory mempool, full transaction-submission and block-fetch protocols, handshake and keep‑alive protocols, peer manager, and a new Tokio-based network connector.
  • Improvements

    • Unified tip/point/slot/height handling across components; centralized protocol deserializers; richer scheduling, supervision and tracing; mempool/store effects exposed for integration.
  • Bug Fixes

    • Clearer header-decode errors, improved fetch/timeouts, and updated tests/mocks to align with protocol and mempool changes.

Contributor

coderabbitai bot commented Dec 18, 2025

Important

Review skipped

Review was skipped as selected files did not have any reviewable changes.

💤 Files selected but had no reviewable changes (1)
  • crates/pure-stage/tests/functional.rs

Walkthrough

Adds mempool access to consensus, replaces HeaderTip/Point/slot/hash shapes with Tip/BlockHeight/Slot/HeaderHash, introduces amaru-protocols (chainsync/blockfetch/tx-submission/handshake/keepalive/connection/manager), rewires networking to TokioConnections, and overhauls pure-stage scheduling/simulation (ScheduleId/ScheduleIds, schedule/cancel, contramap, wire-stage).

Changes

Cohort / File(s) Summary
Consensus effects & mempool
crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
Add ConsensusOps::mempool() and ConsensusEffects::mempool() returning Arc<dyn TxSubmissionMempool<Tx>>; wire MemoryPool adapter; update MockConsensusOps with mock_mempool; adjust tests to use Tip.
Network effects & ForwardEvent
crates/amaru-consensus/src/consensus/effects/network_effects.rs, crates/amaru-consensus/src/consensus/effects/mod.rs
Remove fetch_block/disconnect from NetworkOps; ForwardEvent::Backward now holds Tip; drop ChainSyncEffect/FetchBlockEffect/DisconnectEffect exports; update call sites.
Consensus stages & wiring
crates/amaru-consensus/src/consensus/stages/*
Replace HeaderTip with Tip; thread ManagerMessage into fetch paths; remove track_peers; many stage signatures/state shapes changed (receive_header, fetch_block, forward_chain, pull, select_chain).
Kernel types: Point / Tip / BlockHeight / Slot
crates/amaru-kernel/src/protocol_messages/*, crates/amaru-kernel/src/is_header/mod.rs, crates/amaru-kernel/src/protocol_messages/point.rs, .../tip.rs, block_height.rs
Point::Specific becomes (Slot, HeaderHash) and derives Copy; add BlockHeight and Tip(Point, BlockHeight) with CBOR encode/decode; IsHeader uses Slot/BlockHeight; BlockHeader::tip() added.
amaru-protocols (new crate)
crates/amaru-protocols/src/**
New protocols crate: chainsync, blockfetch, tx_submission, handshake, keepalive, connection, manager, mempool_effects, mux, store_effects; adds miniprotocol specs, messages, deserializers, register helpers and many public types/functions.
Mempool crate & strategies
crates/amaru-mempool/src/*
Remove legacy top-level Mempool trait; export strategies::*; add InMemoryMempool and updated DummyMempool implementing TxSubmissionMempool/Mempool; add tests and helpers (SizedMempool).
MemoryPool adapter (protocol effects)
crates/amaru-protocols/src/mempool_effects.rs
New MemoryPool<T> adapter implementing TxSubmissionMempool and CanValidateTransactions, delegating operations to pure-stage external effects.
Network refactor & connections
crates/amaru-network/src/connection.rs, crates/amaru-network/src/lib.rs, crates/amaru-ouroboros-traits/src/connection.rs
Remove acto-based actor & NetworkResource; add TokioConnections and ConnectionProvider/ConnectionId/ToSocketAddrs; implement connect/send/recv/close with timeouts and per-connection management.
pure-stage runtime & scheduling
crates/pure-stage/src/**
Major API/interpreter changes: introduce ScheduleId/ScheduleIds, DropGuard, DeserializerGuard; extend StageEffect/Effect with Schedule/Cancel/Contramap/WireStage; StageRef gains extra; remove CallId/CallRef in favour of schedule-based flow; add resume/simulation rewrites and scheduled runnables.
Simulation, examples & app wiring
simulation/amaru-sim/**, examples/shared/src/lib.rs, crates/amaru/src/**
spawn_node/replay signatures updated; ChainSyncEvent now includes Tip; build_stage_graph accepts Tip + ManagerMessage; network wiring moves to TokioConnections; mempool wired into consensus; NetworkMagic::for_testing added.
Store adapter
crates/amaru-protocols/src/store_effects.rs
New Store<T> exposing ReadOnlyChainStore/ChainStore via pure-stage external effects and external_sync.
Tx-submission subsystem
crates/amaru-protocols/src/tx_submission/**
Full tx_submission subsystem added: messages, initiator/responder, outcome/errors, responder params, tests, assertions, sized_mempool and system-test scaffolding; many new public APIs.
Stagegraph / types / simulation API shifts
crates/pure-stage/src/stagegraph.rs, crates/pure-stage/src/types.rs, crates/pure-stage/src/simulation/**
Remove CallId/CallRef; StageGraph::stage now returns StageBuildRef<..., Box<dyn Any + Send>>; add scheduling APIs and ScheduledRunnables; update TryInStage signature and many resume handlers; add tombstone/supervision fields.
Manifests & workspace
many Cargo.toml
Move many dependencies to workspace mode; add amaru-protocols and amaru-mempool workspace entries; bump pallas crates to 0.34.0; adjust dev-deps.
Misc utilities & errors
assorted files
Add NetworkMagic::for_testing & FromStr; custom NonEmptyBytes Deserialize; VersionTable::query; simplify ConsensusError::CannotDecodeHeader; small Debug/derive adjustments.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Consensus as Consensus
    participant MemoryPool as MemoryPool
    participant Store as Store
    Note left of Client: tx submission flow
    Client->>Consensus: submit_tx(tx)
    Consensus->>MemoryPool: mempool().insert(tx, origin)
    MemoryPool-->>Consensus: (TxId, SeqNo) or Err
    Consensus->>Store: external_sync(ValidateTransaction(tx))
    Store-->>Consensus: validation result (ok / err)
    Consensus-->>Client: ack / reject
sequenceDiagram
    participant Stage as Stage
    participant Effects as Effects
    participant Scheduler as Scheduler
    participant Target as Target
    Note left of Stage: scheduled call/send flow (pure-stage)
    Stage->>Effects: call(target, timeout, msg_fn)
    Effects->>Scheduler: register schedule -> ScheduleId
    Note right of Scheduler: time passes
    Scheduler->>Effects: trigger ScheduleId
    Effects->>Stage: resume_call_internal(ScheduleId) -> response or timeout
    Stage->>Target: deliver response or handle timeout

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Poem

A little mempool jig beneath the Tip-lit sky,
Tokio keeps tempo while stages spin high.
Schedulers hum, effects march in time,
Protocols wink — chainsync, txs, and a cuppa rhyme.
Ship it, mate — CI lights the sign.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 64.16% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'add chainsync impl based on pure-stage' accurately describes the main contribution—a chain synchronization implementation using the pure-stage framework. It's concise and specific enough for developers scanning history to understand the core change.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (13)
crates/pure-stage/src/tokio.rs (1)

442-458: Solid implementation – follows the established pattern like a speedrunner hitting their marks.

The logic mirrors TokioBuilder::contramap nicely, which is exactly what you'd want for runtime parity. Just one wee observation:

The debug log at line 447 shows {original} -> {new_name}, but new_name is the user-provided prefix, while the actual registered name is name (generated at line 448 with the counter suffix). If you're ever tracing through logs trying to find your adapter, the names won't quite match up – like expecting to find "Master Sword" but it's catalogued as "Master Sword-42".

🔎 Optional: Log the generated name for clarity
             StageEffect::Contramap {
                 original,
                 new_name,
                 transform,
             } => {
-                tracing::debug!("contramap {original} -> {new_name}");
                 let name = stage_name(&mut inner.stage_counter.lock(), new_name.as_str());
+                tracing::debug!("contramap {original} -> {name}");
                 inner.senders.lock().insert(
                     name.clone(),
                     StageOrAdapter::Adapter(Adapter {
crates/amaru-ouroboros/src/praos/nonce.rs (1)

55-55: This binding's a bit like ordering avocado toast when you've already got smashed avo at home, yeah?

Since tip is now just an alias for slot (and Slot implements Copy), this binding doesn't add any new information. You could eliminate it and use slot directly throughout—would make the code tighter.

That said, if the parameter name in the API is literally "tip" (line 56, 58), then keeping this binding does improve readability by matching the domain language. Your call, legend!

🔎 Optional: Eliminate the redundant binding
 let slot = header.slot();
-let tip = slot;
-let epoch = era_history.slot_to_epoch(tip, tip)?;
+let epoch = era_history.slot_to_epoch(slot, slot)?;

-let next_epoch_first_slot = era_history.next_epoch_first_slot(epoch, &tip)?;
+let next_epoch_first_slot = era_history.next_epoch_first_slot(epoch, &slot)?;
crates/pure-stage/src/trace_buffer.rs (1)

142-143: Dead set, mate – the contramap support lands flawlessly.

The new ContramapResponse variant slots into the Display match with the same pattern as its siblings – no rough edges. And the serialization story's already sorted: when responses hit the trace buffer via push_resume(), they go through TraceEntryRef which directly serializes the full StageResponse enum. The to_json() impl and serde derives already handle all variants including ContramapResponse, so there's no breadcrumbs left behind when replaying traces.

crates/amaru-network/src/protocol.rs (1)

25-53: Outcome struct with builder pattern—lovely design!

The shift from an enum to a struct with optional fields gives you more flexibility: you can have a send, a result, both, or neither. Having the builder methods send() and result() consume self and return Self is textbook idiomatic Rust, enabling nice chaining like outcome().send(msg).result(done).

One small thought: since Outcome with all None fields is a valid state, you might consider implementing Default for it:

🔎 Optional Default implementation
+impl<S, D> Default for Outcome<S, D> {
+    fn default() -> Self {
+        Self {
+            send: None,
+            result: None,
+        }
+    }
+}

This would let callers use Outcome::default() as an alternative to outcome(), which some folks find more discoverable. But hey, the standalone outcome() function works just as well—it's a matter of taste, like choosing between controller and keyboard.
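To make the shape discussed above concrete, here is a minimal, self-contained sketch. Only the field names send and result, the builder methods, and the outcome() function come from the review; the derives and everything else are assumptions for illustration and may differ from the real crates/amaru-network/src/protocol.rs.

```rust
/// Sketch of a protocol step outcome: an optional message to send and an
/// optional result. The derives are an assumption made for this example.
#[derive(Debug, PartialEq)]
pub struct Outcome<S, D> {
    pub send: Option<S>,
    pub result: Option<D>,
}

// Written by hand rather than derived: `#[derive(Default)]` would add
// `S: Default` and `D: Default` bounds that are not needed here.
impl<S, D> Default for Outcome<S, D> {
    fn default() -> Self {
        Self {
            send: None,
            result: None,
        }
    }
}

impl<S, D> Outcome<S, D> {
    /// Consumes the outcome and returns it with a message to send.
    pub fn send(mut self, msg: S) -> Self {
        self.send = Some(msg);
        self
    }

    /// Consumes the outcome and returns it with a final result.
    pub fn result(mut self, done: D) -> Self {
        self.result = Some(done);
        self
    }
}

/// Free-function constructor, equivalent to `Outcome::default()`.
pub fn outcome<S, D>() -> Outcome<S, D> {
    Outcome::default()
}
```

With this in place, outcome().send("ping").result(7) and Outcome::default() both work without requiring the message or result types to implement Default themselves.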

crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1)

140-146: Consistent Tip construction pattern.

Using Tip(Point::Origin, 0.into()) here aligns with the tuple-style construction, though I notice line 79 uses Tip::new(Point::Origin, 0.into()). Both work, but mixing styles could trip up future readers like a plot twist they didn't see coming.

Consider sticking to one construction pattern throughout for consistency:

🔎 Suggested change
         let best_tip = best_intersection
             .and_then(|(_, h)| store.load_header(&h))
             .map(|h| h.as_tip())
-            .unwrap_or(Tip(Point::Origin, 0.into()));
+            .unwrap_or(Tip::new(Point::Origin, 0.into()));
crates/amaru-network/src/handshake/tests.rs (1)

197-198: Tidy refactor using the named function!

Using handshake::handler_transform instead of an inline closure is cleaner – like extracting a reusable function in your game code instead of copy-pasting the same logic everywhere.

One thing though, mate – the other test test_against_node (lines 98-103) still uses an inline closure for the same transformation. Might be worth updating that for consistency, unless you're saving it for a future PR?

🔎 Optional: Align both tests to use the named function
     let handshake_bytes =
-        network.contramap(
-            handshake,
-            "handshake_bytes",
-            |msg: HandlerMessage| match msg {
-                HandlerMessage::FromNetwork(bytes) => {
-                    handshake::HandshakeMessage::FromNetwork(bytes)
-                }
-                HandlerMessage::Registered(_) => handshake::HandshakeMessage::Registered,
-            },
-        );
+        network.contramap(handshake, "handshake_bytes", handshake::handler_transform);
crates/amaru/src/stages/consensus/forward_chain/mod.rs (1)

23-31: Solid bridge function between the internal and pallas Tip types!

This to_pallas_tip helper is like a universal adapter in your inventory – converts the kernel's Tip to the pallas network format cleanly. The implementation is straightforward and correct.

One small nitpick for readability, mate – if Tip has named accessor methods like tip.point() and tip.block_height(), using those might be clearer than tip.0 and tip.1. But tuple field access works fine here too!

🔎 Optional: Use named accessors if available

Based on the relevant snippets showing tip.point() exists (in crates/amaru-kernel/src/protocol_messages/tip.rs:29-31):

 pub fn to_pallas_tip(
     tip: amaru_kernel::protocol_messages::tip::Tip,
 ) -> pallas_network::miniprotocols::chainsync::Tip {
     pallas_network::miniprotocols::chainsync::Tip(
-        amaru_network::point::to_network_point(tip.0),
-        tip.1.as_u64(),
+        amaru_network::point::to_network_point(tip.point()),
+        tip.block_height().as_u64(),
     )
 }
crates/amaru-network/src/connection.rs (2)

82-90: Consider handling unexpected state/message combos more gracefully, legend.

The unimplemented!("{x:?}") will panic if an unexpected state/message combo rolls in – bit like hitting an invisible wall in Dark Souls. While this might be fine during development (fail fast and all that), in production you might want to either log and terminate gracefully or at least document that this is intentional for catching programming errors.

If this is meant to be a fail-fast mechanism for internal bugs (similar to the pattern used in mux.rs based on learnings), a quick comment explaining the intent would help future devs understand why a panic is acceptable here.


92-141: Nice wiring of the handshake flow, cobber!

The do_initialize function cleanly sets up the muxer and handshake stages using the effects system. The contramap usage for transforming handshake results is particularly elegant – like a well-choreographed combo in a fighting game.

The magic number 5760 for handshake buffer appears in two places (lines 104 and 135). Consider extracting this to a named constant for clarity.

🔎 Optional: Extract magic number to a constant
+const HANDSHAKE_MAX_BUFFER: usize = 5760;
+
 async fn do_initialize(
     Params {
         conn_id,
         role,
         magic,
     }: Params,
     eff: Effects<ConnectionMessage>,
 ) -> State {
     let muxer = eff.stage("mux", mux::stage).await;
     let muxer = eff
         .wire_up(
             muxer,
-            mux::State::new(conn_id, &[(PROTO_HANDSHAKE.erase(), 5760)]),
+            mux::State::new(conn_id, &[(PROTO_HANDSHAKE.erase(), HANDSHAKE_MAX_BUFFER)]),
         )
         .await;
     // ... later ...
     eff.send(
         &muxer,
         MuxMessage::Register {
             protocol: PROTO_HANDSHAKE.erase(),
             frame: mux::Frame::OneCborItem,
             handler,
-            max_buffer: 5760,
+            max_buffer: HANDSHAKE_MAX_BUFFER,
         },
     )
     .await;
crates/amaru-consensus/src/consensus/effects/network_effects.rs (1)

44-48: Trait signature updated for the new Tip type!

Fair dinkum, the send_backward_event now speaks the new Tip dialect. Might consider renaming header_tip to just tip for clarity since the type changed, but it's a minor nitpick – like arguing whether it's "tomato" or "tomahto".

crates/amaru-network/src/keepalive/mod.rs (1)

42-83: Initiator keepalive flow is solid, mate!

The ping-pong logic is like a good rhythm game – receive response, validate cookie, queue up next listen, advance cookie, wait a tick, send the next ping. Cookie validation (lines 59-62) is crucial for security – if the cookie doesn't match, we bail. Fair dinkum!

One small observation: the tracing::debug! on line 81 happens after the await on line 80. This means the log shows "sending" after the send completes. Might want to flip the order if you want the log to appear before the actual send, but it's a minor thing.
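The cookie check described above could be sketched roughly as follows. This is not the code from keepalive/mod.rs: the names CookieMismatch and check_and_advance are hypothetical, and both the u16 cookie width and the wrapping increment are assumptions made for the example.

```rust
/// Hypothetical error for a keepalive response whose cookie does not match
/// the one that was sent.
#[derive(Debug, PartialEq)]
pub struct CookieMismatch {
    pub expected: u16,
    pub received: u16,
}

/// Validates the echoed cookie and returns the cookie for the next ping.
/// Assumption: the cookie advances by one and wraps around at u16::MAX.
pub fn check_and_advance(expected: u16, received: u16) -> Result<u16, CookieMismatch> {
    if received != expected {
        // Bail out: the peer echoed something we did not send.
        return Err(CookieMismatch { expected, received });
    }
    Ok(expected.wrapping_add(1))
}
```

Keeping the check in a small pure function like this would also make it easy to unit test separately from the stage wiring.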

crates/amaru/src/stages/consensus/forward_chain/tcp_forward_chain_server.rs (1)

85-85: Minor: Unnecessary .clone() on a Copy type.

Since pallas_network::miniprotocols::chainsync::Tip is Copy, the .clone() call here is redundant — like bringing a water bottle to a pool party. Not a big deal, just a wee bit of housekeeping.

🔎 Suggested fix:
-        let our_tip = Arc::new(Mutex::new(our_tip.clone()));
+        let our_tip = Arc::new(Mutex::new(our_tip));
crates/amaru-network/src/chainsync/mod.rs (1)

85-88: FIXME needs addressing before this goes live.

The empty vec![] for intersect points means the initiator won't actually find a valid intersection with the server. It's like trying to play co-op without sending your friend the invite code. This'll need proper implementation with the client's known chain points.

Would you like me to open an issue to track implementing the proper intersect points population, mate?


codecov bot commented Dec 18, 2025

Codecov Report

❌ Patch coverage is 62.32888% with 2339 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/amaru-protocols/src/store_effects.rs 0.00% 225 Missing ⚠️
crates/amaru-protocols/src/mempool_effects.rs 0.00% 185 Missing ⚠️
crates/amaru-protocols/src/connection.rs 0.00% 154 Missing ⚠️
crates/amaru-protocols/src/chainsync/responder.rs 46.69% 129 Missing ⚠️
crates/amaru-protocols/src/manager.rs 0.00% 108 Missing ⚠️
crates/amaru-protocols/src/chainsync/initiator.rs 57.01% 95 Missing ⚠️
...rates/amaru-protocols/src/protocol/miniprotocol.rs 30.30% 92 Missing ⚠️
crates/pure-stage/src/simulation/running/mod.rs 79.51% 85 Missing ⚠️
crates/amaru-protocols/src/blockfetch/initiator.rs 43.24% 63 Missing ⚠️
...tes/amaru-protocols/src/tx_submission/initiator.rs 87.64% 63 Missing ⚠️
... and 65 more
Files with missing lines Coverage Δ
...ru-consensus/src/consensus/effects/base_effects.rs 66.66% <100.00%> (+14.03%) ⬆️
...-consensus/src/consensus/effects/ledger_effects.rs 100.00% <100.00%> (ø)
...consensus/src/consensus/effects/network_effects.rs 90.56% <100.00%> (+34.79%) ⬆️
...u-consensus/src/consensus/effects/store_effects.rs 64.31% <100.00%> (ø)
crates/amaru-consensus/src/consensus/errors.rs 54.90% <ø> (ø)
.../consensus/headers_tree/data_generation/actions.rs 83.01% <100.00%> (ø)
...us/headers_tree/data_generation/data_generation.rs 97.64% <100.00%> (+0.04%) ⬆️
...ru-consensus/src/consensus/stages/forward_chain.rs 84.78% <100.00%> (ø)
...aru-consensus/src/consensus/stages/select_chain.rs 79.27% <ø> (ø)
...-consensus/src/consensus/stages/validate_header.rs 54.90% <100.00%> (ø)
... and 104 more

... and 20 files with indirect coverage changes

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/pure-stage/src/simulation/resume.rs (1)

283-288: Crikey, the parameter names could be a tad clearer, mate!

The function signature looks solid as a rock, but orig and name are giving me some cryptic vibes. Since they map to original and new_name in the StageEffect::Contramap variant (line 294), why not go full clarity mode and call them original_name and new_name instead? Makes it easier for the next dev who stumbles through here to grok what's what without playing detective.

🔎 Proposed parameter naming improvement
 pub fn resume_contramap_internal(
     data: &mut StageData,
     run: &mut dyn FnMut(Name, StageResponse),
-    orig: Name,
-    name: Name,
+    original_name: Name,
+    new_name: Name,
 ) -> anyhow::Result<Box<dyn Fn(Box<dyn SendData>) -> Box<dyn SendData> + Send + 'static>> {

And update the validation accordingly:

-    if !matches!(waiting_for, StageEffect::Contramap { original, new_name, .. } if original == &orig && name.as_str().starts_with(new_name.as_str()))
+    if !matches!(waiting_for, StageEffect::Contramap { original, new_name: effect_new_name, .. } if original == &original_name && new_name.as_str().starts_with(effect_new_name.as_str()))
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f410beb and 1362b1b.

⛔ Files ignored due to path filters (1)
  • examples/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • crates/amaru-network/src/chainsync/initiator.rs (1 hunks)
  • crates/amaru-network/src/chainsync/messages.rs (1 hunks)
  • crates/amaru-network/src/chainsync/mod.rs (1 hunks)
  • crates/amaru-network/src/chainsync/responder.rs (1 hunks)
  • crates/amaru-network/src/handshake/tests.rs (3 hunks)
  • crates/amaru-network/src/keepalive/mod.rs (1 hunks)
  • crates/pure-stage/src/simulation/resume.rs (1 hunks)
  • examples/shared/src/lib.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/amaru-network/src/chainsync/mod.rs
🧰 Additional context used
🧠 Learnings (14)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.
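A rough illustration of the Origin-versus-Specific bootstrapping check from the learning above. The Point type here is a simplified stand-in (a raw u64 slot and a 32-byte hash), and load_tip_header plus its load_header callback are hypothetical names, not the Amaru chain store API.

```rust
/// Simplified stand-in for a chain point; the real kernel type differs.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Point {
    Origin,
    Specific(u64, [u8; 32]),
}

/// Loads the header for the tip, treating Origin (a fresh node) as
/// "nothing to load" instead of querying the store unconditionally.
/// `load_header` is a hypothetical store lookup passed in as a closure.
pub fn load_tip_header<H>(
    tip: Point,
    load_header: impl Fn(&[u8; 32]) -> Option<H>,
) -> Result<Option<H>, String> {
    match tip {
        // Fresh node: there is no header yet, and that is not an error.
        Point::Origin => Ok(None),
        // Existing chain: a missing header is a real problem.
        Point::Specific(slot, hash) => load_header(&hash)
            .map(Some)
            .ok_or_else(|| format!("no header stored for tip at slot {slot}")),
    }
}
```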
📚 Learning: 2025-05-12T14:21:27.470Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.

Applied to files:

  • crates/amaru-network/src/handshake/tests.rs
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru-network/src/handshake/tests.rs
📚 Learning: 2025-09-29T16:38:59.323Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:216-233
Timestamp: 2025-09-29T16:38:59.323Z
Learning: In crates/amaru-network/src/mux.rs, the segment length field uses u16 type which naturally limits values to 65535, matching MAX_SEGMENT_SIZE constant exactly. This provides type-level safety against oversized allocations without needing runtime bounds checking.

Applied to files:

  • crates/amaru-network/src/handshake/tests.rs
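The type-level bound from the segment length learning above can be shown in a few lines. The constant name and value come from the learning; parse_segment_len is a hypothetical helper, and the big-endian byte order is an assumption about the wire format rather than something stated here.

```rust
/// Largest segment the muxer accepts, as quoted in the learning.
pub const MAX_SEGMENT_SIZE: usize = 65535;

/// Hypothetical helper: decodes a two-byte length field.
/// Assumption: the field is big-endian (network byte order).
pub fn parse_segment_len(field: [u8; 2]) -> usize {
    // A u16 can never exceed 65535, so no runtime bounds check is needed
    // before using the result as an allocation size.
    usize::from(u16::from_be_bytes(field))
}
```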
📚 Learning: 2025-12-16T21:32:37.668Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/amaru-network/src/handshake/tests.rs:40-47
Timestamp: 2025-12-16T21:32:37.668Z
Learning: In Rust, shadowing a binding with a new let does not drop the previous binding until the end of the scope. All shadowed bindings in a scope are dropped in reverse-declaration order when the scope ends. Therefore, multiple let _guard = register_*() calls will keep all guards alive until the end of the function (or the surrounding scope). When reviewing code, be mindful that resources tied to shadowed bindings persist longer than the most recent binding; to release early, constrain the lifetime in an inner block or explicitly drop guards when appropriate.

Applied to files:

  • crates/amaru-network/src/handshake/tests.rs
  • crates/amaru-network/src/chainsync/initiator.rs
  • crates/amaru-network/src/chainsync/responder.rs
  • crates/amaru-network/src/chainsync/messages.rs
  • crates/amaru-network/src/keepalive/mod.rs
  • examples/shared/src/lib.rs
  • crates/pure-stage/src/simulation/resume.rs
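The shadowing behaviour from the learning above is easy to check with a guard that logs its own drop. The Guard type and function name below are made up for the demonstration and do not correspond to the register_*() helpers in the tests.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Demo guard that records its name in a shared log when dropped.
struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the drop log as seen just before the inner scope ends,
/// and again after it has ended.
fn shadowed_guard_drop_order() -> (Vec<&'static str>, Vec<&'static str>) {
    let log = Rc::new(RefCell::new(Vec::new()));
    let before_scope_end;
    {
        let _guard = Guard { name: "first", log: log.clone() };
        // Shadowing: "first" can no longer be named, but it is still alive.
        let _guard = Guard { name: "second", log: log.clone() };
        before_scope_end = log.borrow().clone();
    }
    // Both guards are dropped at the closing brace, newest first.
    let after_scope_end = log.borrow().clone();
    (before_scope_end, after_scope_end)
}
```

Nothing is logged while the scope is open, and afterwards the log reads "second" then "first", which is the reverse-declaration order the learning describes.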
📚 Learning: 2025-08-08T14:35:35.562Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/transaction_pointer.rs:36-44
Timestamp: 2025-08-08T14:35:35.562Z
Learning: In the amaru project, when decoding CBOR arrays, prefer using minicbor_extra::heterogenous_array with the expected length to validate definite-length arrays and correctly handle indefinite-length arrays. Example: crates/amaru-kernel/src/transaction_pointer.rs Decode should use heterogenous_array(d, 2, …) instead of ignoring the length from d.array().

Applied to files:

  • crates/amaru-network/src/chainsync/messages.rs
  • examples/shared/src/lib.rs
📚 Learning: 2025-09-06T09:16:25.025Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 435
File: crates/amaru/src/bin/amaru/cmd/convert_ledger_state.rs:113-116
Timestamp: 2025-09-06T09:16:25.025Z
Learning: In cardano-node serialized ledger state CBOR encoding, indefinite-length structures may be terminated with 0xFF "break" markers. The current code in convert_ledger_state.rs unconditionally strips the last byte (bytes[p..bytes.len() - 1]), which could corrupt data if the trailing byte is not actually a CBOR break marker.

Applied to files:

  • crates/amaru-network/src/chainsync/messages.rs
  • examples/shared/src/lib.rs
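One way to avoid the unconditional strip mentioned in the learning above is to check the trailing byte first. This is only a sketch with a hypothetical helper name, not the fix applied in convert_ledger_state.rs. Note that a trailing 0xFF is necessary but not sufficient evidence of a break marker, so a full decoder is still the safer route.

```rust
/// CBOR "break" stop code that terminates indefinite-length items.
pub const CBOR_BREAK: u8 = 0xFF;

/// Hypothetical helper: strips a single trailing break byte.
/// Returns None when the last byte is not 0xFF, so the caller can decide
/// how to handle input that is not terminated the way it expected.
pub fn strip_trailing_break(bytes: &[u8]) -> Option<&[u8]> {
    match bytes.split_last() {
        Some((&CBOR_BREAK, rest)) => Some(rest),
        _ => None,
    }
}
```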
📚 Learning: 2025-08-08T14:34:06.105Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/minicbor-extra/src/lib.rs:50-55
Timestamp: 2025-08-08T14:34:06.105Z
Learning: Project uses Rust 1.88 stable; using Result::is_err_and is acceptable and should not be flagged as unstable. In particular, crates/minicbor-extra/src/lib.rs relies on is_err_and when checking end_of_input; future reviews should not suggest replacing it for stability reasons.

Applied to files:

  • crates/amaru-network/src/chainsync/messages.rs
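For readers unfamiliar with Result::is_err_and, a small standalone example follows. DecodeError and hit_end_of_input are hypothetical names invented for the sketch; the real check in minicbor-extra works on minicbor's own error type.

```rust
/// Hypothetical decode error used only for this illustration.
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    EndOfInput,
    Other,
}

/// True only when the result is an error and that error is EndOfInput.
pub fn hit_end_of_input<T>(res: &Result<T, DecodeError>) -> bool {
    // as_ref() turns &Result<T, E> into Result<&T, &E>, so the predicate
    // receives a reference and the original result is left untouched.
    res.as_ref()
        .is_err_and(|e| matches!(e, DecodeError::EndOfInput))
}
```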
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • crates/amaru-network/src/chainsync/messages.rs
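The idea of separate result types from the learning above might look something like this. Everything in the sketch (Chain, ForwardOutcome, RollbackOutcome, and modelling the chain as a list of slot numbers) is hypothetical and much simpler than Amaru's actual chain selection.

```rust
/// What a roll-forward can produce. A rollback-only outcome cannot appear.
#[derive(Debug, PartialEq)]
pub enum ForwardOutcome {
    NewTip(u64),
    NoChange,
}

/// What a rollback can produce. A "new tip" outcome cannot appear.
#[derive(Debug, PartialEq)]
pub enum RollbackOutcome {
    RolledBackTo(u64),
    BeyondKnownChain,
}

/// Toy chain that stores only slot numbers, for illustration.
pub struct Chain {
    slots: Vec<u64>,
}

impl Chain {
    pub fn new() -> Self {
        Self { slots: Vec::new() }
    }

    pub fn roll_forward(&mut self, slot: u64) -> ForwardOutcome {
        let tip = self.slots.last().copied();
        match tip {
            // Not strictly ahead of the current tip: nothing changes.
            Some(tip) if slot <= tip => ForwardOutcome::NoChange,
            _ => {
                self.slots.push(slot);
                ForwardOutcome::NewTip(slot)
            }
        }
    }

    pub fn rollback(&mut self, slot: u64) -> RollbackOutcome {
        match self.slots.iter().rposition(|&s| s == slot) {
            Some(idx) => {
                // Keep everything up to and including the rollback point.
                self.slots.truncate(idx + 1);
                RollbackOutcome::RolledBackTo(slot)
            }
            None => RollbackOutcome::BeyondKnownChain,
        }
    }
}
```

Because each operation has its own enum, a caller matching on a rollback never needs an unreachable arm for forward-only cases, so the compiler does the checking that a runtime panic would otherwise do.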
📚 Learning: 2025-08-08T14:46:53.013Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/pool_params.rs:107-116
Timestamp: 2025-08-08T14:46:53.013Z
Learning: In crates/amaru-kernel/src/pool_params.rs, when serializing Relay::SingleHostAddr IPv6 to text, the project intentionally reverses each 4-byte chunk before constructing std::net::Ipv6Addr. This matches cardano-ledger’s IPv6 representation (four little-endian Word32 chunks). Do not “simplify” by passing the raw 16 bytes directly to Ipv6Addr::from; that would break ledger compatibility.

Applied to files:

  • crates/amaru-network/src/chainsync/messages.rs
  • examples/shared/src/lib.rs
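The per-chunk reversal from the IPv6 learning above can be sketched with the standard library alone. The function name ledger_bytes_to_ipv6 is hypothetical, and the real code in pool_params.rs may be structured differently.

```rust
use std::net::Ipv6Addr;

/// Hypothetical helper: converts 16 bytes stored as four little-endian
/// 32-bit words into a standard `Ipv6Addr`.
pub fn ledger_bytes_to_ipv6(raw: [u8; 16]) -> Ipv6Addr {
    let mut out = [0u8; 16];
    for (dst, src) in out.chunks_exact_mut(4).zip(raw.chunks_exact(4)) {
        dst.copy_from_slice(src);
        // Undo the little-endian word layout for this 4-byte chunk.
        dst.reverse();
    }
    Ipv6Addr::from(out)
}
```

Passing the raw bytes straight to Ipv6Addr::from would instead swap the bytes within every 32-bit word, which is the compatibility break the learning warns about.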
📚 Learning: 2025-09-29T16:44:14.807Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/protocol.rs:94-106
Timestamp: 2025-09-29T16:44:14.807Z
Learning: In the amaru-network crate protocol.rs file, the correct Cardano mini-protocol ID assignments are: PROTO_N2N_KEEP_ALIVE = 8 and PROTO_N2N_PEER_SHARE = 10, as verified against the network specification by the maintainer.

Applied to files:

  • crates/amaru-network/src/keepalive/mod.rs
📚 Learning: 2025-08-20T13:02:25.763Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.

Applied to files:

  • examples/shared/src/lib.rs
📚 Learning: 2025-06-03T06:31:57.736Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 236
File: simulation/amaru-sim/src/simulator/generate.rs:141-145
Timestamp: 2025-06-03T06:31:57.736Z
Learning: In the amaru project, the team prefers to use as_bytes() instead of hex::decode() for converting hash and header strings to bytes in simulation/amaru-sim/src/simulator/generate.rs, even though they appear to be hex-encoded strings.

Applied to files:

  • examples/shared/src/lib.rs
📚 Learning: 2025-08-08T14:43:20.218Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.

Applied to files:

  • examples/shared/src/lib.rs
🧬 Code graph analysis (6)
crates/amaru-network/src/handshake/tests.rs (1)
crates/amaru-network/src/handshake/mod.rs (1)
  • handler_transform (189-194)
crates/amaru-network/src/chainsync/initiator.rs (3)
crates/amaru-network/src/protocol.rs (1)
  • outcome (48-53)
crates/amaru-network/src/chainsync/responder.rs (2)
  • new (41-57)
  • step (251-302)
crates/amaru-network/src/chainsync/messages.rs (4)
  • decode (92-127)
  • decode (131-165)
  • encode (38-88)
  • encode (169-199)
crates/amaru-network/src/chainsync/responder.rs (2)
crates/amaru-network/src/protocol.rs (1)
  • outcome (48-53)
crates/amaru-network/src/chainsync/messages.rs (4)
  • decode (92-127)
  • decode (131-165)
  • encode (38-88)
  • encode (169-199)
crates/amaru-network/src/chainsync/messages.rs (2)
crates/amaru-kernel/src/protocol_messages/tip.rs (3)
  • decode (68-74)
  • encode (54-64)
  • point (30-32)
crates/amaru-kernel/src/protocol_messages/point.rs (3)
  • decode (119-139)
  • encode (104-115)
  • from (66-68)
crates/amaru-network/src/keepalive/mod.rs (1)
crates/amaru-network/src/keepalive/messages.rs (5)
  • new (27-29)
  • decode (60-62)
  • decode (97-120)
  • encode (49-56)
  • encode (73-93)
examples/shared/src/lib.rs (2)
crates/pure-stage/src/serde.rs (2)
  • from_cbor (654-656)
  • to_cbor (640-651)
crates/amaru-kernel/src/memoized/plutus_data.rs (1)
  • hash (44-46)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (rust)
🔇 Additional comments (23)
crates/amaru-network/src/handshake/tests.rs (2)

18-18: Clean import refactor, mate!

The import change is spot on - you've dropped HandlerMessage since the inline closure's been replaced with handshake::handler_transform, but kept MuxMessage for the Register messages down below. Nice and tidy! 🎯


94-95: Refactor's looking fresh as! 🔥

Replacing that inline closure with handshake::handler_transform is a ripper move - you've eliminated code duplication across both tests and made the intent crystal clear. The function signature matches what contramap needs, and it's properly exported from the handshake module.

This kind of "extract and reuse" is exactly the sort of level-up refactor that makes the codebase easier to maintain. Well played! 🎮

Also applies to: 188-189

crates/pure-stage/src/simulation/resume.rs (1)

303-310: Nailed the pattern like a seasoned speedrunner!

The state mutation and response flow here is absolutely mint - you've followed the same validate-then-mutate pattern that's running through all the other resume functions in this file. The comment at line 303 is like a save point reminder, the panic at line 305 is safe 'cause of the validation above, and the into_inner() call is consistent with how resume_wire_stage_internal does it at line 280.

Clean execution, mate. No worries here.

crates/amaru-network/src/chainsync/responder.rs (6)

30-57: LGTM! Solid struct design for tracking responder state.

The Responder struct cleanly encapsulates all the necessary state - upstream tip, peer info, connection details, and the muxer reference. Starting with Point::Origin for the pointer is a sensible default, like starting a new game from the first checkpoint. Based on learnings, handling Origin vs Specific points is important for fresh nodes.


66-154: Clean async loop implementation with proper state machine stepping.

The message transformation via the internal Msg enum and the From impl is a nice pattern - keeps the external API clean while allowing internal flexibility. The loop structure for processing actions until completion is like a proper game loop, mate.

One small thing to note: the or_terminate pattern is used consistently for error handling, which is grand for ensuring the responder doesn't continue in an inconsistent state.


174-184: Verify ancestor traversal terminates if pointer hash doesn't exist.

The error at line 178 handles the case where the pointer's hash isn't in the store, but if the client sends a completely bogus point that happens to not be on any known chain, this will bail out appropriately. That said, the ancestors iterator at line 179 needs to eventually find common ground or exhaust - if it doesn't, line 184 will correctly error. Looks solid, like a well-scripted heist movie where every exit is covered.


202-224: Intersection search logic is correct with early termination optimization.

The sorting in descending order (Reverse) and the early break at line 219-221 when the current point is smaller than the smallest candidate is a nice optimization - no point checking further ancestors if we've already passed all candidate points. Clever stuff, like knowing when to fold in poker.
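
For reference, the early-exit idea can be sketched roughly as below. This is only an illustration, not the code in responder.rs: points are reduced to plain `u64` slots (real points also carry a header hash), `find_intersection` is a hypothetical name, and the ancestor chain is assumed to be yielded from newest to oldest.

```rust
use std::cmp::Reverse;

/// Hypothetical sketch: return the most recent candidate that lies on our chain.
/// `ancestors` yields the slots of our chain from newest to oldest.
fn find_intersection(
    mut candidates: Vec<u64>,
    ancestors: impl IntoIterator<Item = u64>,
) -> Option<u64> {
    // Sort descending: newest candidate first, smallest candidate last.
    candidates.sort_by_key(|slot| Reverse(*slot));
    // No candidates means no intersection.
    let smallest = *candidates.last()?;

    for current in ancestors {
        // Once we are older than the smallest candidate, no further ancestor can match.
        if current < smallest {
            break;
        }
        if candidates.contains(&current) {
            return Some(current);
        }
    }
    None
}
```

Because the walk goes from newest to oldest, the first hit is also the most recent intersection, so the search can stop there.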


250-302: State machine transitions look complete and well-structured.

The step function handles all the expected transitions for the responder role:

  • FindIntersect → Intersect state
  • Intersection results → back to Idle with appropriate send_rollback flag
  • RequestNext → CanAwait → MustReply flow
  • RollForward/RollBackward from awaiting states

The catch-all at line 300 that bails on invalid state/input combos is good defensive programming. It's like having a bouncer that knows exactly who's on the guest list.


190-199: Hardcoding variant: 6 assumes Conway-era headers, but BlockHeader doesn't expose era information to extract dynamically.

The concern is valid — variant 6 assumes all headers are Conway-era. However, your suggestion to extract the era from BlockHeader itself isn't currently feasible: the BlockHeader type doesn't expose any era or variant field. The underlying Header (from pallas) comes into amaru as a Babbage-compatible structure, but BlockHeader doesn't wrap or expose the era information.

The codebase comment explicitly states: "There's no difference in headers' structure between Babbage and Conway era. The idea is that we only keep concrete the header from the latest era." This is by design—variant 6 is correct for the current single-era approach.

If multi-era support becomes needed, you'd need to either track the era separately (outside BlockHeader) or refactor BlockHeader to store era metadata alongside the header data, rather than trying to derive it from the header itself.

crates/amaru-network/src/keepalive/mod.rs (3)

42-84: Consider whether the 1-second wait should apply after registration.

Hey mate! The eff.wait(Duration::from_secs(1)).await at line 75 runs after the match block, meaning it fires for both Registered and FromNetwork cases. This means even on initial registration, there's a 1-second delay before sending the first keepalive.

If that's intentional (to avoid hammering the network right after connection), it's all good - just wanted to flag it in case it was meant to only apply after receiving a response. Like the difference between a dramatic pause and accidentally forgetting your lines!


86-116: Responder implementation is clean and straightforward.

The responder does exactly what it should - receives a KeepAlive, echoes back a ResponseKeepAlive with the same cookie, and requests the next message. Simple as a Sunday arvo.

Using .responder() on the protocol ID for sending (line 105) and WantNext (line 110) is correct for the responder role.


30-40: LGTM! Clean struct design.

Nice and simple - just the muxer reference and the cookie. The #[derive] chain includes Clone which makes sense for passing the state around. No unnecessary baggage here.

crates/amaru-network/src/chainsync/initiator.rs (4)

35-61: LGTM! Initiator struct properly encapsulates connection state.

Starting with upstream: None and state: InitiatorState::Idle is the right call - we don't know the upstream tip until we get a response. The pipeline reference for forwarding results upstream completes the picture nicely.


129-145: Exponential spacing for intersection points is well-implemented.

Ah, the classic binary search-esque approach for finding the intersection! Pushing points at indices 1, 2, 4, 8, 16, etc. gives you O(log n) points to check while covering the entire chain. And line 143 ensures we always include the oldest point we know about - can't miss the genesis, like never skipping the tutorial level.

One tiny nit: the #[expect(clippy::expect_used)] at line 133 suppresses the panic lint. The comment implies it's a valid assumption, but if somehow the best chain hash doesn't exist (corrupted store?), this would panic. Given this is initialization code, that's probably acceptable.
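
A rough sketch of the spacing scheme described above, for readers who want to see the shape of it. The function name `intersection_indices` and the convention that index 0 is the tip are assumptions made for illustration; the actual code works on stored headers rather than bare indices.

```rust
/// Hypothetical sketch: choose indices at exponentially growing distance from the tip.
/// `chain_len` is the number of known points; index 0 is assumed to be the tip.
fn intersection_indices(chain_len: usize) -> Vec<usize> {
    let mut indices = Vec::new();
    if chain_len == 0 {
        return indices;
    }

    let mut step = 1usize;
    while step < chain_len {
        indices.push(step);
        // saturating_mul avoids overflow on extremely long chains.
        step = step.saturating_mul(2);
    }

    // Always include the oldest known point, unless it was already picked.
    let oldest = chain_len - 1;
    if indices.last() != Some(&oldest) {
        indices.push(oldest);
    }
    indices
}
```

For a chain of ten known points this yields the indices 1, 2, 4, 8 and 9, so the number of points sent grows only logarithmically with chain length.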


110-124: Tip update logic is consistent across all result types.

Every result variant updates initiator.upstream with the received tip before forwarding the message upstream. This ensures we always have the latest view of the peer's chain. Consistent handling across all four cases - tidy as a well-organized inventory screen.


170-208: State machine looks spot on for the happy path and early exits.

The transitions nail the standard flow: Idle → FindIntersect → Intersect, with proper handling for both success and failure cases through IntersectFound/IntersectNotFound. The roll forward/backward cycling and MustReply mechanics are textbook stuff.

Lines 203-205 allowing Done from Intersect is legit—it's your escape hatch when you've found what you need or decided to bail. Per the Cardano spec, Done can terminate from any state, so this matches the protocol dialect nicely.

crates/amaru-network/src/chainsync/messages.rs (5)

18-28: Clean Message enum design covering all ChainSync message types.

All the expected ChainSync mini-protocol messages are present and accounted for. The tuple variants with their payloads (HeaderContent, Point, Tip, Vec<Point>) match the protocol requirements. Like a complete character roster in a fighting game - everyone's here.


37-88: Message encoding is consistent and well-structured.

The encoding pattern is clear: array(N) followed by the label and payload items. The label assignments (0-7) are consistent with the decode implementation. Each variant correctly counts its elements in the array:

  • 1-element: RequestNext, AwaitReply, Done (just label)
  • 2-element: FindIntersect (label + points array), IntersectNotFound (label + tip)
  • 3-element: RollForward, RollBackward, IntersectFound (label + 2 payload items)

Nice work addressing the previous review comment about IntersectNotFound - line 79 now correctly uses array(2).


91-128: Message decoding correctly mirrors the encoding.

The decode implementation properly handles all label values and constructs the appropriate variants. The unknown variant error at line 123-125 is good defensive coding - reject what you don't understand rather than silently corrupting.


130-166: HeaderContent decoding handles Byron vs Shelley+ variants correctly.

The Byron-specific handling (variant 0) with its prefix tuple is a necessary quirk of the Cardano protocol - Byron headers have that extra metadata before the CBOR payload. The comment at lines 140-141 acknowledging the unclear structure is honest and helpful for future maintainers. Sometimes you just gotta work with what the docs give you, like deciphering ancient game manuals.


168-200: HeaderContent encoding properly validates Byron variant requires prefix.

The guard at line 181-189 that returns an error if we have variant 0 but no byron_prefix is good defensive programming. Can't let inconsistent state slip through to the wire. The nested array structure for Byron (line 179, 182) matches the decode logic perfectly.

examples/shared/src/lib.rs (2)

16-20: Sweet as! All them imports are properly wired up.

No drama here, mate. All the symbols you've imported from amaru_kernel are exactly where they should be—Bytes, Hash, Hasher, the lot. The protocol_messages restructure didn't bork anything; Point's still hanging out in protocol_messages::point and getting re-exported cleanly. Your to_cbor and from_cbor buddies are coming through amaru_minicbor_extra without a hitch, and NetworkName and GlobalParameters are accessible via their respective modules. The code compiles like a charm—you're right as rain!


65-68: Spot on with the Point upgrade, mate! The changes are bang on – slot.into() correctly converts the u64 slot to the strong-typed Slot for Point::Specific, thanks to the From implementation in the slot-arithmetic crate, and Hasher::<256>::hash() returns the right Hash<32> type that matches HeaderHash. The pattern aligns perfectly with the PR's introduction of proper type safety for Point construction.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

♻️ Duplicate comments (1)
crates/pure-stage/src/simulation/resume.rs (1)

294-294: The starts_with check is playing a bit too fast and loose here, legend.

This is the same issue flagged in a previous review – the prefix matching can cause false positives. A name like "foobar" would match new_name = "foo", when you actually want to match only names with the -{counter} suffix pattern that stage_name() generates (like "foo-1", "foo-2").

You've got a name_match() helper in blocked.rs (around lines 129-143) that does this properly – might be worth extracting that logic or using exact equality like the original check does.

It's a bit like in Dark Souls – a small hitbox miscalculation and suddenly you're taking damage from attacks that looked like they missed by a mile!

🔎 Suggested approaches

Option 1: Use exact equality like the original check:

-    if !matches!(waiting_for, StageEffect::Contramap { original, new_name, .. } if original == &orig && name.as_str().starts_with(new_name.as_str()))
+    if !matches!(waiting_for, StageEffect::Contramap { original, new_name, .. } if original == &orig && &name == new_name)

Option 2: Extract and reuse the name_match() pattern from blocked.rs that validates the suffix is -{digits}.
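
If Option 2 is the way to go, the check could look something like the sketch below. The helper name `matches_generated_name` is hypothetical, and this is not a copy of the `name_match()` in blocked.rs; it only assumes that generated names have the form `{base}-{counter}` with a decimal counter.

```rust
/// Hypothetical helper: true only when `name` is `base`, then `-`, then one or more ASCII digits.
fn matches_generated_name(name: &str, base: &str) -> bool {
    name.strip_prefix(base)
        .and_then(|rest| rest.strip_prefix('-'))
        .is_some_and(|digits| !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit()))
}
```

With this, "foo-1" and "foo-12" match a base of "foo", while "foobar", "foo-" and "foo-1a" do not.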

🧹 Nitpick comments (26)
crates/amaru-network/src/protocol.rs (1)

26-53: Struct-based Outcome with builder pattern — flexible design!

The optional fields approach gives ya more flexibility than the previous enum would've. You can now have send-only, result-only, or both.

Tiny nitpick though: the result() method at line 40 takes a parameter called done, but it sets the field called result. Might be worth aligning the names for clarity, yeah? Like when a game's tooltip says one thing but the actual effect does another — can trip ya up.

🔎 Optional naming alignment
-    pub fn result(self, done: D) -> Self {
+    pub fn result(self, result: D) -> Self {
         Self {
             send: self.send,
-            result: Some(done),
+            result: Some(result),
         }
     }
crates/amaru-network/src/keepalive/mod.rs (2)

46-88: Initiator logic is on point!

The cookie validation (lines 63-66) with early termination is exactly what you want—no mucking about with mismatched cookies. The state progression with cookie.next() keeps your keepalive rounds distinct.

One thing though: the 1-second wait on line 79 is hardcoded. Might be worth pulling that out as a constant if it's a protocol requirement, or making it configurable if different scenarios need different intervals.

🎨 Optional refactor: extract keepalive interval constant
+const KEEPALIVE_INTERVAL: Duration = Duration::from_secs(1);
+
 pub async fn initiator(
     mut state: KeepAlive,
     msg: HandlerMessage,
     eff: Effects<HandlerMessage>,
 ) -> KeepAlive {
     match msg {
         HandlerMessage::Registered(_) => {}
         HandlerMessage::FromNetwork(non_empty_bytes) => {
             // ... existing code ...
         }
     }
-    eff.wait(Duration::from_secs(1)).await;
+    eff.wait(KEEPALIVE_INTERVAL).await;
     // ... rest of code ...
 }

75-75: TODO noted for timing metrics

Fair dinkum leaving this TODO for later—tracking keepalive timings for monitoring makes total sense, but doesn't need to block the initial implementation.

Want me to sketch out what a timing tracker implementation might look like, or open an issue to track this?

crates/amaru/src/stages/consensus/forward_chain/mod.rs (1)

24-31: Functional conversion helper – consider using accessors for clarity.

The function does the job like a proper side quest NPC, but accessing tuple fields directly (tip.0, tip.1) is a bit like reading subtitles in a language you half-understand. Based on the relevant code snippet from crates/amaru-kernel/src/protocol_messages/tip.rs, there's a point() accessor available. Using it would make the intent clearer:

🔎 Suggested improvement for readability
 pub fn to_pallas_tip(
     tip: amaru_kernel::protocol_messages::tip::Tip,
 ) -> pallas_network::miniprotocols::chainsync::Tip {
     pallas_network::miniprotocols::chainsync::Tip(
-        amaru_network::point::to_network_point(tip.0),
-        tip.1.as_u64(),
+        amaru_network::point::to_network_point(tip.point()),
+        tip.block_height().as_u64(),
     )
 }

This assumes block_height() accessor exists – if not, tip.1.as_u64() is perfectly fine!

crates/amaru-network/src/socket.rs (1)

168-184: Test helper works, but the error handling is a bit of a maze.

The function uses multiple unwrap() calls inside the timeout future (lines 179-180), which means connection failures will panic before the timeout can properly handle them. Then the timeout result gets unwrapped again at line 182. While this is test code and panics are fine, it might make debugging trickier when tests fail—you'll get a panic instead of a clear timeout or connection error.

Consider propagating errors with ? inside the async block and handling them consistently, like this:

🔎 Optional refactor for clearer error handling
 #[cfg(test)]
 pub async fn create_connection(conn: &ConnectionResource) -> anyhow::Result<ConnectionId> {
-    Ok(
-        tokio::time::timeout(std::time::Duration::from_secs(5), async {
-            let addr = crate::socket_addr::ToSocketAddrs::String(
-                std::env::var("PEER").unwrap_or_else(|_| "127.0.0.1:3000".to_string()),
-            )
-            .resolve()
-            .await
-            .unwrap();
-            conn.connect(addr).await.unwrap()
-        })
-        .await?,
-    )
+    tokio::time::timeout(std::time::Duration::from_secs(5), async {
+        let addr = crate::socket_addr::ToSocketAddrs::String(
+            std::env::var("PEER").unwrap_or_else(|_| "127.0.0.1:3000".to_string()),
+        )
+        .resolve()
+        .await?;
+        conn.connect(addr).await
+    })
+    .await?
 }

This way timeout errors and connection errors are distinguishable—like knowing whether you died from fall damage or enemy fire in a game.

crates/amaru-network/src/tx_submission/tests/faulty_tx_validator.rs (1)

24-38: Faulty validator logic works, but could be clearer.

The validator rejects transactions when the counter is odd, which means the sequence goes: pass, fail, pass, fail, etc. The comment on line 26 says "reject every second transaction" which is accurate, but since the counter starts at 0, it might be a bit confusing. The first call has count=0 (even), second has count=1 (odd, rejected), etc.

The logic is sound for a test helper, but you might consider making it more explicit:

🔎 Optional refactor for clarity
     fn validate_transaction(&self, _tx: Tx) -> Result<(), TransactionValidationError> {
-        // Reject every second transaction
         let mut count = self.count.lock().unwrap();
-        let is_valid = (*count).is_multiple_of(2);
         *count += 1;
-        if is_valid {
+        // Reject every second transaction (2nd, 4th, 6th, etc. if counting from 1)
+        if (*count - 1).is_multiple_of(2) {
             Ok(())
         } else {
             Err(TransactionValidationError::new(anyhow::anyhow!(
                 "Transaction is invalid"
             )))
         }
     }

Or even simpler, increment first then check:

🔎 Alternative: increment-then-check pattern
     fn validate_transaction(&self, _tx: Tx) -> Result<(), TransactionValidationError> {
         let mut count = self.count.lock().unwrap();
+        *count += 1;
-        // Reject every second transaction
-        let is_valid = (*count).is_multiple_of(2);
-        *count += 1;
-        if is_valid {
+        // Reject every second call (1st call: count=1, valid; 2nd call: count=2, invalid)
+        if count.is_multiple_of(2) {
+            Err(TransactionValidationError::new(anyhow::anyhow!(
+                "Transaction is invalid"
+            )))
+        } else {
             Ok(())
-        } else {
-            Err(TransactionValidationError::new(anyhow::anyhow!(
-                "Transaction is invalid"
-            )))
         }
     }

Either way works—just a matter of making the intent crystal clear, like tutorial tooltips that actually explain the mechanics.
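
To make the pass/fail sequence easy to check in isolation, here is a std-only sketch of the increment-then-check variant. `AlternatingValidator` and its `String` error are stand-ins invented for this example; the real helper implements `CanValidateTransactions<Tx>` and returns `TransactionValidationError`.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the faulty validator: accepts the 1st, 3rd, 5th... call
/// and rejects the 2nd, 4th, 6th... call.
struct AlternatingValidator {
    calls: Mutex<u64>,
}

impl AlternatingValidator {
    fn new() -> Self {
        Self { calls: Mutex::new(0) }
    }

    fn validate(&self) -> Result<(), String> {
        let mut calls = self.calls.lock().unwrap();
        // Increment first, so `calls` is the 1-based number of the current call.
        *calls += 1;
        if *calls % 2 == 0 {
            Err("Transaction is invalid".to_string())
        } else {
            Ok(())
        }
    }
}
```

Counting from 1 means the comment and the condition describe the same thing: even-numbered calls are rejected.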

crates/amaru-network/src/tx_submission/tests/nodes_options.rs (1)

73-78: Heads up: hardcoded capacity in with_responder_tx_validator.

Like a side quest in Elden Ring you might overlook – the capacity is hardcoded to 4 here. If someone wants a different capacity and a custom validator, they'd need to call with_responder_mempool directly with a manually constructed SizedMempool.

Not a blocker, just a wee gotcha for future test authors.

🔎 Optional: Add a more flexible builder method
pub fn with_responder_mempool_capacity_and_validator(
    self,
    capacity: u64,
    tx_validator: Arc<dyn CanValidateTransactions<Tx>>,
) -> Self {
    self.with_responder_mempool(Arc::new(SizedMempool::with_tx_validator(capacity, tx_validator)))
}
crates/amaru-network/src/tx_submission/tests/assertions.rs (2)

31-36: Dead assignment, like a dropped loot item no one picks up.

Line 31 assigns an empty vec to actual, but it's immediately reassigned on lines 33-36. You can simplify this, cobber.

🔎 Proposed fix
-    let mut actual = vec![];
-
-    actual = tx_ids
+    let actual: Vec<_> = tx_ids
         .iter()
         .filter(|tx_id| mempool.contains(tx_id))
         .collect();

27-29: Doc says "eventually" but it's more like "right now, mate".

The comment mentions "eventually present" but the function checks synchronously without any retries or polling. Might want to tweak the wording to avoid confusion – or add a wee polling loop if eventual consistency is the intent.

🔎 Suggested doc fix
-/// Check that all the given transactions are eventually present in the given mempool.
+/// Check that all the given transactions are present in the given mempool.
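
If eventual consistency really is the intent, a small retry helper along these lines would do. `eventually` is a hypothetical name, and the sketch uses a blocking `std::thread::sleep`; inside an async test the equivalent would need an async sleep instead.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: call `check` up to `attempts` times, pausing `delay` between tries.
/// Returns true as soon as `check` passes, false if every attempt fails.
fn eventually(attempts: u32, delay: Duration, mut check: impl FnMut() -> bool) -> bool {
    for attempt in 0..attempts {
        if check() {
            return true;
        }
        // Skip the pause after the final attempt.
        if attempt + 1 < attempts {
            sleep(delay);
        }
    }
    false
}
```

The assertion would then wrap the existing containment check in the closure and fail only when the helper returns false.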
crates/amaru-network/src/handshake/tests.rs (1)

160-161: Slight inconsistency with the first test, mate.

The first test (test_against_node) extracts network_magic into a variable, but here you're using NetworkMagic::MAINNET directly. Not a game-breaker, just a bit like having different control schemes in co-op mode – works, but slightly jarring.

🔎 For consistency with the first test
+    let network_magic = NetworkMagic::MAINNET;
+
     let handshake = network.stage("handshake", handshake::stage);
     let handshake = network.wire_up(
         handshake,
         handshake::Handshake::new(
             mux.clone().without_state(),
             output,
             Role::Initiator,
-            VersionTable::v11_and_above(NetworkMagic::MAINNET, true),
+            VersionTable::v11_and_above(network_magic, true),
         ),
     );
     ...
     assert_eq!(
         result,
         handshake::HandshakeResult::Accepted(
             VersionNumber::V14,
-            VersionData::new(NetworkMagic::MAINNET, true, PEER_SHARING_DISABLED, false),
+            VersionData::new(network_magic, true, PEER_SHARING_DISABLED, false),
         )
     );

Also applies to: 186-187

crates/amaru-network/src/tx_submission/responder_state.rs (1)

192-206: Unnecessary generic shadowing the imported Tx type.

This function introduces a generic <Tx: Send + Sync + 'static> that shadows the imported amaru_kernel::Tx. Since the function doesn't actually use the Tx type (only TxId), this generic seems unnecessary and could cause confusion.

🔎 Remove the unnecessary generic
-    fn received_tx_ids<Tx: Send + Sync + 'static>(
+    fn received_tx_ids(
         &mut self,
-        mempool: &dyn TxSubmissionMempool<Tx>,
+        mempool: &dyn TxSubmissionMempool<Tx>,
         tx_ids: Vec<(TxId, u32)>,
     ) {
crates/amaru-network/src/tx_submission/tests/test_data.rs (1)

39-45: Optional: Could be a tad more Rustic with iterators.

This works a treat, but you could express it more idiomatically like a proper Melbourne hipster sipping flat whites:

🔎 Iterator-based alternative
 pub fn create_transactions(number: u64) -> Vec<Tx> {
-    let mut txs = vec![];
-    for i in 0..number {
-        txs.push(create_transaction(i));
-    }
-    txs
+    (0..number).map(create_transaction).collect()
 }
crates/amaru-network/src/tx_submission/outcome.rs (1)

106-117: The PartialEq for Outcome could skip the string dance.

Mate, ProtocolError already derives PartialEq, so you're doing extra work here – like taking the scenic route through the Outback when there's a highway. You can compare errors directly without the string formatting detour:

🔎 Direct comparison fix
 impl PartialEq for Outcome {
     fn eq(&self, other: &Self) -> bool {
         match (self, other) {
             (Outcome::Done, Outcome::Done) => true,
-            (Outcome::Error(e1), Outcome::Error(e2)) => format!("{}", e1) == format!("{}", e2),
+            (Outcome::Error(e1), Outcome::Error(e2)) => e1 == e2,
             (Outcome::Send(msg1), Outcome::Send(msg2)) => msg1 == msg2,
             _ => false,
         }
     }
 }
crates/amaru-network/src/tx_submission/stage.rs (3)

176-179: Magic numbers spotted - like finding a cheat code without the manual!

ResponderParams::new(2, 3) - what do these numbers mean? A quick comment or named constants would help future you (or anyone else) understand without having to trace through the code like you're solving a puzzle in Myst.

🔎 Consider named constants
+// TODO: Document the meaning of these responder parameters
+const DEFAULT_ACK_THRESHOLD: u16 = 2;
+const DEFAULT_REQ_THRESHOLD: u16 = 3;
+
 Role::Responder => {
     let tx_submission = eff.stage("tx_submission", responder_stage).await;
     eff.wire_up(
         tx_submission,
         TxSubmission::new(
             mux.clone(),
-            TxSubmissionResponderState::new(ResponderParams::new(2, 3)),
+            TxSubmissionResponderState::new(ResponderParams::new(DEFAULT_ACK_THRESHOLD, DEFAULT_REQ_THRESHOLD)),
         ),
     )
     .await
 }

203-208: Another magic number in the wild!

max_buffer: 5760 - is this derived from some protocol spec or network constraint? Would be bonzer to have a constant or at least a comment explaining the reasoning. Like knowing why Link starts with 3 hearts in Zelda.


250-251: FIXME acknowledged - timeout value TBD.

Good that you've flagged this with a FIXME. The 1-second timeout is reasonable as a placeholder. Just wanted to give this a nod - you're aware it needs tuning, no drama here.

Would you like me to open an issue to track finding the right timeout value?

crates/amaru-network/src/mempool_effects.rs (1)

186-217: TxIdsSince struct missing Eq derive.

Line 186 derives only Debug, PartialEq, Serialize, Deserialize but the other effect structs also derive Eq. For consistency and to enable use in HashSets/BTreeSets if ever needed:

🔎 Proposed fix for consistency
-#[derive(Debug, PartialEq, serde::Serialize, serde::Deserialize)]
+#[derive(Debug, PartialEq, Eq, serde::Serialize, serde::Deserialize)]
 struct TxIdsSince {
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1)

150-159: Initial rollback now wraps with PallasTip correctly.

The conversion PallasTip(to_network_point(init_tip.0), init_tip.1.as_u64()) properly bridges between kernel and pallas types. Using .0 and .1 tuple access works but is less readable than using accessor methods.

🔎 Consider using accessor methods for clarity
         if let Some(ref init_tip) = self.initial {
             let result = Some(ClientOp::Backward(PallasTip(
-                to_network_point(init_tip.0),
-                init_tip.1.as_u64(),
+                to_network_point(init_tip.point()),
+                init_tip.block_height().as_u64(),
             )));

Based on the relevant_code_snippets showing Tip::point() and Tip::block_height() accessors exist.

crates/amaru/src/stages/consensus/forward_chain/tcp_forward_chain_server.rs (1)

85-85: Unnecessary .clone() on a moved value.

Since our_tip is passed by value and immediately consumed by the Mutex::new(), the .clone() is doing a bit of overtime for nothing, like grinding XP when you're already max level.

🔎 Optional fix
-        let our_tip = Arc::new(Mutex::new(our_tip.clone()));
+        let our_tip = Arc::new(Mutex::new(our_tip));
crates/amaru-network/src/tx_submission/initiator_state.rs (2)

176-185: Unnecessary .clone() on tx_ids.

At line 179, you're cloning tx_ids then immediately consuming it with into_iter(). Since you own tx_ids, you can skip the clone - it's like copying a save file just to delete the original straight after.

🔎 Optional fix
     fn get_next_tx_ids<Tx: Send + Debug + Sync + 'static>(
         &mut self,
         mempool: &dyn TxSubmissionMempool<Tx>,
         required_next: u16,
     ) -> anyhow::Result<Vec<(TxId, u32)>> {
         let tx_ids = mempool.tx_ids_since(self.next_seq(), required_next);
         let result = tx_ids
-            .clone()
             .into_iter()
             .map(|(tx_id, tx_size, _)| (tx_id, tx_size))
             .collect();
-        self.update(tx_ids);
+        self.update(mempool.tx_ids_since(self.next_seq(), required_next));
         Ok(result)
     }

Actually, that would call mempool twice. Better approach - iterate once and collect both:

     fn get_next_tx_ids<Tx: Send + Debug + Sync + 'static>(
         &mut self,
         mempool: &dyn TxSubmissionMempool<Tx>,
         required_next: u16,
     ) -> anyhow::Result<Vec<(TxId, u32)>> {
         let tx_ids = mempool.tx_ids_since(self.next_seq(), required_next);
-        let result = tx_ids
-            .clone()
-            .into_iter()
-            .map(|(tx_id, tx_size, _)| (tx_id, tx_size))
-            .collect();
-        self.update(tx_ids);
+        let result: Vec<(TxId, u32)> = tx_ids
+            .iter()
+            .map(|(tx_id, tx_size, _)| (*tx_id, *tx_size))
+            .collect();
+        self.update(tx_ids);
         Ok(result)
     }

187-192: VecDeque drain could be slightly cleaner.

The current approach works grand, but VecDeque supports removing from the front in place (via pop_front, or drain without collecting), so there's no need to rebuild the deque. Minor efficiency win if you're into that sort of optimisation.

🔎 Optional alternative
     fn discard(&mut self, acknowledged: u16) {
-        if self.window.len() >= acknowledged as usize {
-            self.window = self.window.drain(acknowledged as usize..).collect();
-        }
+        for _ in 0..acknowledged.min(self.window.len() as u16) {
+            self.window.pop_front();
+        }
     }
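
A drain-based version of the same idea, as a sketch on a free function rather than the real state struct (`discard_front` is a hypothetical name). Note two assumptions: it clamps to the window length instead of doing nothing when `acknowledged` is too large, which differs from the current behaviour, and it widens `acknowledged` to `usize` rather than narrowing the length to `u16`, which avoids wrap-around on very long windows.

```rust
use std::collections::VecDeque;

/// Hypothetical sketch: drop up to `acknowledged` entries from the front, in place.
fn discard_front<T>(window: &mut VecDeque<T>, acknowledged: u16) {
    // Widen to usize and clamp so the range never exceeds the deque length.
    let count = usize::from(acknowledged).min(window.len());
    // Dropping the drain iterator removes the drained range from the deque.
    drop(window.drain(..count));
}
```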
crates/amaru-mempool/src/strategies/in_memory_mempool.rs (4)

134-151: Redundant sort in tx_ids_since.

Line 149 sorts by seq_no, but BTreeMap::range() already iterates in key order (ascending by MempoolSeqNo). The sort is doing nothing but burning a few cycles - like fast-forwarding a cutscene you've already skipped.

🔎 Remove redundant sort
     fn tx_ids_since(&self, from_seq: MempoolSeqNo, limit: u16) -> Vec<(TxId, u32, MempoolSeqNo)> {
-        let mut result: Vec<(TxId, u32, MempoolSeqNo)> = self
+        self
             .entries_by_seq
             .range(from_seq..)
             .take(limit as usize)
             .map(|(seq, tx_id)| {
                 let Some(entry) = self.entries_by_id.get(tx_id) else {
                     panic!(
                         "Inconsistent mempool state: entry missing for tx_id {:?}",
                         tx_id
                     )
                 };
                 (*tx_id, entry.tx_size, *seq)
             })
-            .collect();
-        result.sort_by_key(|(_, _, seq_no)| *seq_no);
-        result
+            .collect()
     }
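
The ordering guarantee behind this suggestion can be checked in isolation. Below is a minimal sketch, assuming a plain `u64` key in place of `MempoolSeqNo` and a string in place of `TxId`; the helper name `ids_since` is hypothetical and not part of the crate.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for `tx_ids_since`: returns up to `limit` entries
/// whose key is at least `from_seq`. `BTreeMap::range` already yields them
/// in ascending key order, so no extra sort is applied.
fn ids_since(
    entries_by_seq: &BTreeMap<u64, &'static str>,
    from_seq: u64,
    limit: u16,
) -> Vec<(u64, &'static str)> {
    entries_by_seq
        .range(from_seq..)
        .take(limit as usize)
        .map(|(seq, id)| (*seq, *id))
        .collect()
}
```

Inserting keys out of order and then calling `ids_since` still returns them sorted by key, which is why the trailing `sort_by_key` adds nothing.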

153-166: Could use direct lookups instead of full iteration.

get_txs_for_ids iterates all entries then filters. Since you have entries_by_id, you could lookup directly for each requested id, potentially faster when the mempool is large and the request set is small.

🔎 Direct lookup approach
     fn get_txs_for_ids(&self, ids: &[TxId]) -> Vec<Tx> {
-        // Make sure that the result are sorted by seq_no
-        let mut result: Vec<(&TxId, &MempoolEntry<Tx>)> = self
-            .entries_by_id
-            .iter()
-            .filter(|(key, _)| ids.contains(*key))
-            .collect();
-        result.sort_by_key(|(_, entry)| entry.seq_no);
-        result
-            .into_iter()
-            .map(|(_, entry)| entry.tx.clone())
-            .collect()
+        let mut entries: Vec<&MempoolEntry<Tx>> = ids
+            .iter()
+            .filter_map(|id| self.entries_by_id.get(id))
+            .collect();
+        entries.sort_by_key(|entry| entry.seq_no);
+        entries.into_iter().map(|entry| entry.tx.clone()).collect()
     }

259-286: Double iteration in acknowledge.

You're iterating entries_by_id twice: once to collect seq_nos_to_remove (lines 268-277), then again in retain (lines 278-282). Could combine these passes, though the current approach is clear and correct.

🔎 Single-pass alternative
     fn acknowledge<TxKey: Ord, I>(&self, tx: &Tx, keys: fn(&Tx) -> I)
     where
         I: IntoIterator<Item = TxKey>,
         Self: Sized,
     {
         let keys_to_remove = BTreeSet::from_iter(keys(tx));
         let mut inner = self.inner.write();
 
-        // remove entries matching the keys criteria in both maps
-        let seq_nos_to_remove: Vec<MempoolSeqNo> = inner
-            .entries_by_id
-            .values()
-            .filter(|entry| {
-                keys(&entry.tx)
-                    .into_iter()
-                    .any(|k| keys_to_remove.contains(&k))
-            })
-            .map(|entry| entry.seq_no)
-            .collect();
-        inner.entries_by_id.retain(|_, entry| {
-            !keys(&entry.tx)
-                .into_iter()
-                .any(|k| keys_to_remove.contains(&k))
-        });
-        for seq_no in seq_nos_to_remove {
-            inner.entries_by_seq.remove(&seq_no);
-        }
+        inner.entries_by_id.retain(|_, entry| {
+            let should_remove = keys(&entry.tx)
+                .into_iter()
+                .any(|k| keys_to_remove.contains(&k));
+            if should_remove {
+                inner.entries_by_seq.remove(&entry.seq_no);
+            }
+            !should_remove
+        });
     }

Wait, that won't compile as written: inside the retain closure, entries_by_seq can't be mutably borrowed through the same write guard that entries_by_id is already borrowed from. Getting a single pass would mean splitting the guard into separate field borrows first, so the current two-pass approach is a perfectly reasonable way to handle this. Scratch that suggestion, never mind!
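
For reference, the borrow conflict comes from going through the lock guard twice; a single pass does compile once the guard is destructured into disjoint field borrows. This is a minimal sketch under stated assumptions: it uses `std::sync::RwLock` rather than whatever lock the crate uses, and `Inner` and `remove_ids` are simplified, hypothetical stand-ins for `MempoolInner` and `acknowledge`.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::sync::RwLock;

/// Hypothetical, simplified mempool state: id -> seq_no and seq_no -> id.
struct Inner {
    entries_by_id: BTreeMap<u32, u64>,
    entries_by_seq: BTreeMap<u64, u32>,
}

/// Removes every id in `ids` from both maps in one pass over `entries_by_id`.
fn remove_ids(inner: &RwLock<Inner>, ids: &BTreeSet<u32>) {
    let mut guard = inner.write().unwrap();
    // Dereference the guard once and split it into two independent field
    // borrows, so the closure may mutate one map while `retain` walks the other.
    let Inner { entries_by_id, entries_by_seq } = &mut *guard;
    entries_by_id.retain(|id, seq_no| {
        let remove = ids.contains(id);
        if remove {
            entries_by_seq.remove(&*seq_no);
        }
        !remove
    });
}
```

Whether this is worth the extra line depends on how hot the path is; the two-pass version stays easier to read.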


300-326: Good basic test, could use more edge cases.

The test covers the happy path well. Consider adding tests for:

  • Mempool full rejection (max_txs limit)
  • Duplicate transaction rejection
  • acknowledge behavior
  • take draining the mempool

But hey, this is solid groundwork - you can always add more test levels later, like DLC.

crates/amaru-network/src/chainsync/initiator.rs (1)

129-145: The duplicate concern is spot-on, mate, but harmless for the protocol.

The exponential spacing pattern (1, 2, 4, 8...) is class – classic Bitcoin locator vibes. The protocol just needs points dense near the tip and sparse going back, with the receiver matching the first/best point in preference order, so duplicates won't break anything.

However, your edge case is legit: if the loop ends exactly when index == spacing, that point gets pushed inside the loop (line 139), then again after (line 143). Not a bug per se, but a wee bit untidy. Consider tracking whether you pushed at the final spacing to avoid the double-dip.
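
One way to avoid the double push is to stop the doubling loop strictly before the oldest point and push that point exactly once afterwards. The sketch below is an assumption-laden illustration, not the code in initiator.rs: `locator_offsets` is a hypothetical helper that works on offsets back from the tip (offset 0) and assumes the tip itself is included.

```rust
/// Hypothetical helper: offsets back from the tip to sample for intersection
/// finding, dense near the tip and sparse further back. The oldest offset is
/// always last and appears once, even when it is a power of two.
fn locator_offsets(len: usize) -> Vec<usize> {
    if len == 0 {
        return Vec::new();
    }
    let oldest = len - 1;
    let mut offsets = vec![0];
    let mut spacing = 1usize;
    // Strictly less than `oldest`, so the final push below can never repeat
    // a value that the loop already produced.
    while spacing < oldest {
        offsets.push(spacing);
        spacing = spacing.saturating_mul(2);
    }
    if oldest > 0 {
        offsets.push(oldest);
    }
    offsets
}
```

With nine points the loop yields 1, 2, 4 and stops at 8, which is then pushed once as the oldest offset.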

Signed-off-by: Roland Kuhn <[email protected]>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (4)
crates/pure-stage/src/simulation/running.rs (1)

756-777: Contramap effect handling in handle_effect - the name validation looks fixed now!

Looking at this with fresh eyes (and checking the resume_contramap_internal code in resume.rs), the flow is:

  1. Line 761: Generate unique name like "my-adapter-1" from prefix "my-adapter"
  2. Line 767: Pass both original and the generated name to resume_contramap_internal
  3. In resume.rs line 294: The check is name.as_str().starts_with(new_name.as_str())

So when new_name="my-adapter" and name="my-adapter-1", the starts_with check passes correctly. The validation now properly accounts for the counter suffix that stage_name() adds. This addresses the concern from the past review - the validation was updated to use starts_with instead of exact equality.

The Adapter is then correctly inserted with the generated name as both name and key.

crates/amaru-mempool/src/strategies/in_memory_mempool.rs (1)

216-240: No termination path in the async wait loop!

Righto, this is the issue flagged in the previous review and it's still present. The loop will wait forever if the target seq_no is never reached and there's no shutdown mechanism. Looking at MempoolInner (lines 73-88), there's no closed flag or shutdown signal.

If the mempool stops accepting transactions, callers will hang indefinitely — like waiting for Half-Life 3 that never comes! 🎮

Consider adding:

  • A closed: bool field to MempoolInner
  • Set it to true on shutdown and call notify.notify_waiters()
  • Check the flag in the loop and return false if closed
🔎 Example approach to add shutdown handling

Add a closed field to MempoolInner:

 struct MempoolInner<Tx> {
     next_seq: u64,
     entries_by_id: BTreeMap<TxId, MempoolEntry<Tx>>,
     entries_by_seq: BTreeMap<MempoolSeqNo, TxId>,
     notify: Arc<Notify>,
+    closed: bool,
 }

Update default:

 impl<Tx> Default for MempoolInner<Tx> {
     fn default() -> Self {
         MempoolInner {
             next_seq: 1,
             entries_by_id: Default::default(),
             entries_by_seq: Default::default(),
             notify: Arc::new(Notify::new()),
+            closed: false,
         }
     }
 }

Check in wait_for_at_least:

 fn wait_for_at_least(
     &self,
     seq_no: MempoolSeqNo,
 ) -> Pin<Box<dyn Future<Output = bool> + Send + '_>> {
     Box::pin(async move {
         loop {
-            let (current_next_seq, notify) = {
+            let (current_next_seq, closed, notify) = {
                 let inner = self.inner.read();
-                (inner.next_seq, inner.notify.clone())
+                (inner.next_seq, inner.closed, inner.notify.clone())
             };
             let notified = notify.notified();
 
+            if closed {
+                return false;
+            }
+            
             if current_next_seq >= seq_no.0 {
                 return true;
             }
 
             notified.await;
         }
     })
 }
crates/amaru-network/src/mempool_effects.rs (1)

84-89: Future import is still missing, legend.

Line 87 uses Pin<Box<dyn Future<Output = bool> + Send + '_>> but std::future::Future isn't imported in lines 15-26. You've got Pin but not Future, which'll cause a compile error. Add use std::future::Future; to your imports and you're golden.

(This matches the past review comment—looks like the fix didn't land yet.)

crates/amaru-network/src/tx_submission/initiator_state.rs (1)

425-439: Test uses wrong blocking mode (already flagged).

This was spotted in a previous review – the test name says "non_blocking" but uses Blocking::Yes. Classic copy-paste situation, happens to the best of us!

🧹 Nitpick comments (17)
crates/amaru-minicbor-extra/src/lib.rs (1)

26-32: G'day! Consider memory retention for the thread-local buffer, cobber.

Right, so here's the yarn: this pattern's got a bit of a "hoarder" vibe going on (think that episode where someone's house is full of old newspapers, but it's Vec capacity instead). When you serialize the occasional chonky object, the thread-local buffer grows to match it, but clear() only drops the length—capacity sticks around like a stubborn stain on your favourite band tee.

For long-lived threads (your typical server workers, async runtimes, that sort of thing), this can lead to memory creeping up over time. Imagine serializing a 10MB response once in a blue moon—that thread's now permanently holding 10MB capacity, even if most serializations are just a few KB.

A couple of options to keep this from turning into a memory leak side quest:

  1. Add a capacity threshold with shrink_to_fit: After clearing, check if capacity exceeds some reasonable limit (say, 1MB or 4MB), and call buffer.shrink_to_fit() to reclaim the memory.
  2. Hybrid approach: If the serialization size exceeds a threshold, bypass the thread-local buffer entirely and allocate directly.

Example with shrink_to_fit:

🔎 Suggested improvement with capacity management
 BUFFER.with_borrow_mut(|buffer| {
     #[expect(clippy::expect_used)]
     cbor::encode(value, &mut *buffer).expect("serialization should not fail");
     let ret = buffer.as_slice().to_vec();
     buffer.clear();
+    // Shrink if capacity exceeds 1MB to avoid memory retention
+    if buffer.capacity() > 1024 * 1024 {
+        buffer.shrink_to_fit();
+    }
     ret
 })

The current implementation is totally fine for short-lived threads or bounded workloads, but worth considering for production scenarios where threads stick around indefinitely.

crates/amaru-consensus/src/consensus/effects/store_effects.rs (1)

106-112: Spot on refactor, mate!

Point implements Copy (with Slot and HeaderHash both being Copy types), so swapping point.clone() to *point at lines 107 and 111 is absolutely the right move. This keeps your code idiomatic and efficient—bit-wise copies instead of explicit clone calls. Both changes stay consistent with each other, which is chef's kiss.

Quick heads-up though: found a few other point.clone() calls kicking around elsewhere in the codebase that could get the same treatment (receive_header.rs lines 204 and 216, consensus_effects.rs line 193, and a couple others). Not blockers for this PR, but worth adding to the backlog if you're on a refactor spree.

crates/pure-stage/src/simulation/resume.rs (1)

283-310: Implementation's solid and follows the pattern nicely, mate!

The resume_contramap_internal function nails the validation-then-consume sequence like its siblings. The starts_with check at line 294 does what's needed given the names generated by stage_name() like "my-adapter-1" from prefix "my-adapter".

One thing worth a look though: that starts_with check could theoretically let "foobar" slip through if new_name is "foo". Over in blocked.rs (lines 129–140), the name_match() function takes a stricter approach—it validates that starts_with is true AND the next character is '-' AND the rest parses as a number. If you're feeling extra paranoid (can't blame you in simulation code), adopting that pattern here would be more defensive, especially if invalid names ever leak into the system.
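
The stricter check described above could look roughly like this. It is a sketch of the idea only: `matches_generated_name` is a hypothetical name, and the real `name_match()` in blocked.rs may differ in signature and details.

```rust
/// Hypothetical stricter check: `name` must be exactly `prefix`, followed by
/// a '-', followed by a non-empty run of ASCII digits (the counter suffix).
fn matches_generated_name(name: &str, prefix: &str) -> bool {
    name.strip_prefix(prefix)
        .and_then(|rest| rest.strip_prefix('-'))
        .is_some_and(|counter| {
            !counter.is_empty() && counter.bytes().all(|b| b.is_ascii_digit())
        })
}
```

With this shape, "my-adapter-1" matches the prefix "my-adapter", while "foobar" no longer matches the prefix "foo".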

simulation/amaru-sim/src/simulator/run.rs (1)

115-219: Testing's on the horizon, yeah?

I see from the PR description that tests are coming when the new stages are complete. Given the type migrations and syntax concerns I flagged earlier (especially that dodgy (*hash.bytes) business), it'd be ace to have some tests verify the Point/Tip conversions are working as expected.

When you're ready to add those tests, make sure they exercise the RollForward and Rollback event handling paths, particularly the hash conversions. Nothing like a good test suite to catch those sneaky type conversion bugs—learned that the hard way after too many "but it compiled!" moments!

Want me to sketch out some test cases for the event handling once your stages are ready?

crates/amaru-network/src/socket.rs (1)

168-184: Test helper looks choice, but could clean up the error handling a tad.

The function does what it says on the tin—connects to a peer (either from PEER env or localhost:3000) with a timeout. However, you've got three unwrap() calls daisy-chained at lines 175, 178, and 180. Since you're already returning anyhow::Result, why not swap those unwraps for the ? operator? Makes the error propagation cleaner and the failure messages more descriptive when tests go sideways.

🔎 Suggested refactor
 #[cfg(test)]
 pub async fn create_connection(conn: &ConnectionResource) -> anyhow::Result<ConnectionId> {
     Ok(
         tokio::time::timeout(std::time::Duration::from_secs(5), async {
             let addr = crate::socket_addr::ToSocketAddrs::String(
                 std::env::var("PEER").unwrap_or_else(|_| "127.0.0.1:3000".to_string()),
             )
             .resolve()
-            .await
-            .unwrap();
-            conn.connect(addr).await.unwrap()
+            .await?;
+            conn.connect(addr).await
         })
-        .await?,
+        .await??,
     )
 }
crates/amaru-mempool/Cargo.toml (1)

14-22: G'day mate, these deps are looking proper tidy!

The switch to workspace-based dependencies keeps everything nice and DRY across the crates - like a well-organised inventory in a Zelda game. The tokio features (time, macros, rt) make sense for async mempool shenanigans.

One tiny thing on line 18: you've got "workspace" = true with quotes around the key. It's valid TOML and works fine, but it's inconsistent with the other workspace deps. Might want to drop the quotes for consistency, yeah?

-tokio = { "workspace" = true, features = ["time", "macros", "rt"] }
+tokio = { workspace = true, features = ["time", "macros", "rt"] }
crates/amaru-network/src/tx_submission/tests/nodes.rs (1)

59-65: Nice helper for test scenarios, feels like a proper co-op mode setup!

The insert_client_transactions method does the job, but that .unwrap() on line 63 could leave you scratching your head if something goes wrong during test setup. Even in test utilities, a wee bit of context in the error message can save debugging time - like having a mini-map in Dark Souls.

🔎 Optional: Add context to the unwrap
     pub fn insert_client_transactions(&self, txs: &[Tx]) {
         for tx in txs.iter() {
             self.initiator_mempool
                 .insert(tx.clone(), amaru_ouroboros_traits::TxOrigin::Remote)
-                .unwrap();
+                .expect("failed to insert test transaction into initiator mempool");
         }
     }
crates/amaru-network/src/tx_submission/tests/assertions.rs (1)

29-53: Works, but there's a wee bit of unnecessary shadowing here.

You're creating an empty actual vec on line 31, then immediately replacing it with the filtered result on line 33. It's like ordering a coffee, pouring it out, then ordering another one — technically works, but why not just skip straight to the good stuff?

Consider removing line 31 and declaring actual directly with the filter result. Keeps things cleaner and avoids the shadowing confusion.

🔎 Proposed refactor
 pub fn expect_transactions(mempool: Arc<dyn TxSubmissionMempool<Tx>>, txs: Vec<Tx>) {
     let tx_ids: Vec<_> = txs.iter().map(TxId::from).collect();
-    let mut actual = vec![];
 
-    actual = tx_ids
+    let actual: Vec<_> = tx_ids
         .iter()
         .filter(|tx_id| mempool.contains(tx_id))
         .collect();
crates/amaru-network/src/handshake/tests.rs (1)

120-191: Looks good, but small style inconsistency with the other test.

The updates here mirror the first test — using create_connection and handshake::handler_transform — which is great. However, I noticed this test uses NetworkMagic::MAINNET directly (lines 160, 186), while test_against_node introduces a network_magic variable.

For consistency's sake, might be worth extracting the magic to a variable here too, but it's totally optional. Either way works fine!

crates/amaru-network/src/chainsync/responder.rs (1)

66-154: Responder loop looks solid, but one small note on Store creation.

The state machine loop here is well-structured — internal Msg enum flattens the variants nicely, error handling with or_terminate is consistent, and the loop continuation pattern for chaining actions is clever. Like a choose-your-own-adventure book, but for blockchain sync!

One tiny thing: you're creating a new Store::new(eff.clone()) on lines 126 and 136 each time through. Not a bug, but if Store construction has overhead, you might consider lifting it out of the match arms. Totally optional though — if it's lightweight, this is fine.

crates/amaru-network/src/tx_submission/responder_state.rs (2)

153-188: Ack/request logic is sound, but those expects might bite in prod.

The windowing and acknowledgment logic here is solid — you acknowledge what's already in the mempool, request as much as fits in the window, and block when necessary. The checked arithmetic is good defensive coding.

However, the expect() calls on lines 165 and 178 will panic if protocol invariants are violated (overflow/underflow). The #[allow(clippy::expect_used)] suggests you're aware, but consider whether panicking is the right behavior in production. If a misbehaving peer sends too many acks, you'd crash rather than gracefully terminating that connection.

For now it's probably fine for testing, but worth thinking about whether these should return Result and let the caller decide how to handle violations.
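
As an illustration of the Result-returning alternative, the checked arithmetic can be mapped to a typed error instead of an expect(). Everything here is hypothetical (the `WindowError` enum, the `update_window` function and its parameters); it only shows the shape, not the actual state in responder_state.rs.

```rust
/// Hypothetical error for a peer that violates the windowing invariants.
#[derive(Debug, PartialEq, Eq)]
enum WindowError {
    TooManyAcks { outstanding: u16, acked: u16 },
    WindowExceeded { max_window: u16 },
}

/// Applies `acked` acknowledgements, then `requested` new ids, to a window
/// currently holding `outstanding` ids. Returns the new outstanding count,
/// or an error the caller can use to terminate just that connection.
fn update_window(
    outstanding: u16,
    acked: u16,
    requested: u16,
    max_window: u16,
) -> Result<u16, WindowError> {
    let remaining = outstanding
        .checked_sub(acked)
        .ok_or(WindowError::TooManyAcks { outstanding, acked })?;
    match remaining.checked_add(requested) {
        Some(total) if total <= max_window => Ok(total),
        _ => Err(WindowError::WindowExceeded { max_window }),
    }
}
```

The caller can then log the violation and close the offending connection rather than bringing the whole process down.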


225-248: TX processing logic works, but silent skip on mismatches could hide bugs.

The FIFO queue matching (line 233) relies on txs arriving in the same order they were requested, which is a protocol requirement. The validation and insertion flow is clean.

One thing though: if pop_front() returns None (meaning we've exhausted the in-flight queue), the tx is silently skipped. While the upstream check in process_txs_reply already validates batch size, silently ignoring unexpected txs could mask protocol violations or bugs. Consider logging a warning or returning an error if this happens, so you catch edge cases during testing.

🔎 Suggested improvement
         for tx in txs {
             // this is the exact id we requested for this body (FIFO)
-            if let Some(requested_id) = self.inflight_fetch_queue.pop_front() {
+            let Some(requested_id) = self.inflight_fetch_queue.pop_front() else {
+                tracing::warn!("received more txs than requested");
+                return Err(anyhow::anyhow!("received more txs than in-flight queue"));
+            };
                 self.inflight_fetch_set.remove(&requested_id);
 
                 match mempool.validate_transaction(tx.clone()) {
                     Ok(_) => {
                         mempool.insert(tx, TxOrigin::Remote)?;
                     }
                     Err(e) => {
                         tracing::warn!("received invalid transaction {}: {}", requested_id, e);
                     }
                 }
-            }
         }
crates/amaru-kernel/src/lib.rs (1)

117-117: Consider explicit re-exports over wildcard for clearer API boundaries.

Switching from pub use is_header::{BlockHeader, IsHeader} to a wildcard export (pub use is_header::*) might unintentionally expose more than you bargained for, mate. Wildcard exports can make it tricky to track what's actually public and can accidentally leak implementation details that weren't meant for the outside world.

If you're intentionally expanding the public surface, no worries! But if you just want BlockHeader and IsHeader visible, keeping it explicit helps future devs (and your future self) know exactly what's on offer. It's like keeping your inventory tidy in an RPG—you want to know exactly what's in your kit, yeah?

crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (1)

79-79: Consider using Tip::new consistently for clarity.

Lines 79 and 140 use tuple-struct construction Tip(Point::Origin, 0.into()), while line 169 in is_header/mod.rs shows Tip::new(...) is available. Using Tip::new(Point::Origin, 0.into()) consistently would make the construction more explicit and easier to spot in code reviews, but the current tuple form is fine if that's the team style.

Also applies to: 140-140

crates/amaru-ouroboros-traits/src/mempool.rs (2)

105-113: Potential integer overflow on sequence number operations.

G'day! These next() and add() methods could theoretically overflow if the sequence number reaches u64::MAX. Now, I know what you're thinking – "that's 18 quintillion transactions, mate, we'll be grand!" And you're probably right for any practical scenario. It's like worrying about running out of save slots in a game with unlimited saves.

But if you're feeling defensive (and in the blockchain world, that's usually wise), you could use saturating_add or checked_add to be extra safe. Just a thought – not a deal-breaker by any means.

🔎 Optional: Use saturating arithmetic for extra safety
 impl MempoolSeqNo {
     pub fn next(&self) -> MempoolSeqNo {
-        MempoolSeqNo(self.0 + 1)
+        MempoolSeqNo(self.0.saturating_add(1))
     }

     pub fn add(&self, n: u64) -> MempoolSeqNo {
-        MempoolSeqNo(self.0 + n)
+        MempoolSeqNo(self.0.saturating_add(n))
     }
 }

181-189: Consider renaming from to avoid confusion with From trait.

The method TxId::from<Tx>(tx: &Tx) shadows the conventional From trait method naming. It's a bit like naming your character "Link" in a Zelda game – works fine, but might confuse people expecting the original.

Since you can't easily implement From<&Tx> for all Tx: Encode<()> without orphan rule issues, consider renaming this to something like compute or from_tx to make it clear this is a regular associated function rather than a trait implementation.

🔎 Suggested rename for clarity
 impl TxId {
     pub fn new(hash: Hash<32>) -> Self {
         TxId(hash)
     }

-    pub fn from<Tx: Encode<()>>(tx: &Tx) -> Self {
+    pub fn from_tx<Tx: Encode<()>>(tx: &Tx) -> Self {
         TxId(Hasher::<{ 32 * 8 }>::hash_cbor(tx))
     }
 }
crates/amaru-network/src/tx_submission/initiator_state.rs (1)

172-185: Minor: Unnecessary clone in get_next_tx_ids.

The tx_ids.clone().into_iter() followed by self.update(tx_ids) works, but you're cloning the entire vector just to iterate it. It's like photocopying your quest map when you could just look at it twice.

You could restructure to avoid the clone:

🔎 Avoid the clone with a single-pass approach
     fn get_next_tx_ids<Tx: Send + Debug + Sync + 'static>(
         &mut self,
         mempool: &dyn TxSubmissionMempool<Tx>,
         required_next: u16,
     ) -> anyhow::Result<Vec<(TxId, u32)>> {
         let tx_ids = mempool.tx_ids_since(self.next_seq(), required_next);
-        let result = tx_ids
-            .clone()
-            .into_iter()
-            .map(|(tx_id, tx_size, _)| (tx_id, tx_size))
-            .collect();
-        self.update(tx_ids);
+        let mut result = Vec::with_capacity(tx_ids.len());
+        for (tx_id, tx_size, seq_no) in tx_ids {
+            result.push((tx_id, tx_size));
+            self.window.push_back((tx_id, seq_no));
+            self.last_seq = Some(seq_no);
+        }
         Ok(result)
     }

Signed-off-by: Roland Kuhn <[email protected]>
Signed-off-by: Roland Kuhn <[email protected]>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
crates/amaru-mempool/src/strategies/in_memory_mempool.rs (1)

219-243: No termination path in wait_for_at_least—still hanging around like a bad sequel.

This infinite loop issue was flagged in the previous review and it's still here, mate! If the mempool stops accepting transactions or gets shut down, callers will hang forever—like waiting for Half-Life 3. The trait signature returns bool to indicate "give up," but there's no path to return false.

You'll want to add a shutdown/closed flag to MempoolInner that gets checked in the loop. When the mempool closes, set the flag and call notify.notify_waiters() so the loop can wake up, see it's closed, and return false.

🔎 Recommended approach

Add a closed: AtomicBool field to MempoolInner:

struct MempoolInner<Tx> {
    next_seq: u64,
    entries_by_id: BTreeMap<TxId, MempoolEntry<Tx>>,
    entries_by_seq: BTreeMap<MempoolSeqNo, TxId>,
    notify: Arc<Notify>,
    closed: AtomicBool,  // Add this
}

Then in wait_for_at_least, check if closed before awaiting:

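loop {
    // Create the Notified future before reading the state, so that a
    // notify_waiters() call landing between the check and the await is not missed.
    let notify = {
        let inner = self.inner.read();
        inner.notify.clone()
    };
    let notified = notify.notified();

    let (current_next_seq, closed) = {
        let inner = self.inner.read();
        (inner.next_seq, inner.closed.load(Ordering::Relaxed))
    };

    if closed {
        return false;  // Mempool shut down
    }

    if current_next_seq >= seq_no.0 {
        return true;
    }

    notified.await;
}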

And implement a shutdown method that sets closed = true and calls notify.notify_waiters().

🧹 Nitpick comments (11)
simulation/amaru-sim/src/sync/chain_sync_message.rs (1)

19-19: Nice one, mate! The refactor to Hash types looks solid.

The import of Hash and the updated conversions for both RollForward and Rollback are spot on - you've properly aligned with the broader move to Hash<32> instead of the old representation. Both branches are consistent, which is bonzer.

Tiny style thing for code harmony: Just noticed that lines 78-79 in this same file use hash.bytes.as_slice() for a similar conversion, while you're using &*hash.bytes here. Both work like a charm, but keeping it consistent with the existing pattern might make future devs feel more at home, y'know? Think of it like matching the visual style in a game level - not critical, but it makes the whole experience flow better.

🔎 Optional style tweak for consistency
-                point: Point::Specific(slot, Hash::from(&*hash.bytes)),
+                point: Point::Specific(slot, Hash::from(hash.bytes.as_slice())),
-                rollback_point: Point::Specific(slot, Hash::from(&*hash.bytes)),
+                rollback_point: Point::Specific(slot, Hash::from(hash.bytes.as_slice())),

Also applies to: 204-204, 214-214

crates/amaru/src/stages/mod.rs (3)

241-258: G'day! Quick word on the commented code, cobber.

Fair dinkum, I get that this is work-in-progress scaffolding (as per the PR description), but having a big chunk of commented code sitting around is like leaving Easter eggs in the game that players can't access yet – bit confusing, yeah?

Consider these alternatives:

  • Pop it behind a feature flag so it's compiled conditionally
  • Move it to a separate branch until ready to activate
  • If keeping it for reference, maybe add a tracking issue link in the comment

No dramas if you want to keep it for now since you're actively working on it, but future-you (or the next dev) will appreciate cleaner code!


307-307: Hardcoded timeout spotted, champ!

The 5-second timeout is hardcoded here. Since this is temporary code (as your doc comment notes), it's not a biggie, but if you reckon this might stick around longer than expected, consider making it configurable via a const or parameter. Like giving players difficulty settings instead of locking it on "normal," y'know?


308-308: Hidden env var Easter egg!

The PEER environment variable can override the peer address, but there's no mention of this in the docs or comments. It's like finding a secret level with no hints – cool for those who know, but confusing for everyone else!

Worth adding a quick comment explaining this override behavior, or documenting it somewhere visible if this pattern sticks around post-temporary phase.

crates/amaru-ouroboros-traits/src/mempool.rs (1)

107-113: Consider overflow protection for sequence arithmetic.

Right, so these arithmetic methods on MempoolSeqNo don't check for overflow. With a u64, even at one billion transactions per second it would take roughly 584 years to overflow, so it's more of a "heat death of the universe" scenario than a realistic concern. But if you're feeling extra defensive (like preparing for a FromSoftware boss fight), adding saturating_add or overflow checks wouldn't hurt.

🔎 Optional: Use saturating arithmetic
 impl MempoolSeqNo {
     pub fn next(&self) -> MempoolSeqNo {
-        MempoolSeqNo(self.0 + 1)
+        MempoolSeqNo(self.0.saturating_add(1))
     }
 
     pub fn add(&self, n: u64) -> MempoolSeqNo {
-        MempoolSeqNo(self.0 + n)
+        MempoolSeqNo(self.0.saturating_add(n))
     }
 }
crates/amaru-mempool/src/strategies/in_memory_mempool.rs (2)

140-145: Panic on inconsistent state—consider returning an error instead.

Right, so if the mempool state gets inconsistent between the two maps, this panics. While defensive programming is good (like saving before a boss fight), panicking will crash the entire mempool. Consider returning a Result from tx_ids_since and handling this as a recoverable error, or at least logging before panicking.

That said, if this inconsistency represents a genuine bug that should never happen, the panic is defensible as a fail-fast mechanism.
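
If the recoverable-error route is preferred, the lookup can surface the inconsistency as an `Err` instead of panicking. This is a sketch with simplified, hypothetical types (`u32` ids, `u64` sequence numbers, an `InconsistentMempool` error), not the crate's actual signature.

```rust
use std::collections::BTreeMap;

/// Hypothetical error: the sequence index points at an id with no entry.
#[derive(Debug, PartialEq, Eq)]
struct InconsistentMempool {
    missing_id: u32,
}

/// Returns `(id, size, seq_no)` triples from `from_seq` onwards, or an error
/// naming the first id that is present in `by_seq` but absent from `sizes_by_id`.
fn tx_ids_since(
    by_seq: &BTreeMap<u64, u32>,
    sizes_by_id: &BTreeMap<u32, u32>,
    from_seq: u64,
    limit: u16,
) -> Result<Vec<(u32, u32, u64)>, InconsistentMempool> {
    by_seq
        .range(from_seq..)
        .take(limit as usize)
        .map(|(seq, id)| {
            sizes_by_id
                .get(id)
                .map(|size| (*id, *size, *seq))
                .ok_or(InconsistentMempool { missing_id: *id })
        })
        // Collecting into Result stops at the first error.
        .collect()
}
```

The caller then decides whether to log and drop the connection or to treat the inconsistency as fatal.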


262-289: Nested iteration in acknowledge could be optimized.

The acknowledge method iterates through entries_by_id twice—once to collect sequence numbers (lines 271-280) and once to retain entries (lines 281-285). Both loops call keys(&entry.tx) for each entry. For a large mempool, this could be a bit sluggish.

You could optimize by doing a single pass, building both the removal sets and updating the maps in one go. But honestly, for most use cases this is probably fine—premature optimization and all that. Just something to keep in mind if profiling shows this as a hot path.

crates/amaru-network/src/tx_submission/stage.rs (3)

31-83: Consider creating MemoryPool once at function start, mate.

You're calling MemoryPool::new(eff.clone()) twice in this function (lines 43 and 63) - bit like ordering the same coffee twice instead of getting a large to begin with. Since the mempool is just wrapping the effects, creating it once at the start would be cleaner and avoid redundant allocations.

🔎 Proposed refactor
 pub async fn initiator_stage(
     mut tx_submission: TxSubmission<TxSubmissionInitiatorState>,
     msg: TxSubmissionMessage,
     eff: Effects<TxSubmissionMessage>,
 ) -> TxSubmission<TxSubmissionInitiatorState> {
+    let mempool = MemoryPool::new(eff.clone());
     let (protocol_state, outcome) = match msg {
         TxSubmissionMessage::Registered => {
             tracing::trace!("tx submission protocol registered");
             tx_submission
                 .role_state
                 .step(
-                    &MemoryPool::new(eff.clone()),
+                    &mempool,
                     &tx_submission.protocol_state,
                     Message::Init,
                 )
                 .await
                 .or_terminate(&eff, async |err| {
                     tracing::error!(%err, "failed to initialize");
                 })
                 .await
         }
         TxSubmissionMessage::FromNetwork(non_empty_bytes) => {
             tracing::trace!("received a message from the network");
             let msg: Message = minicbor::decode(&non_empty_bytes)
                 .or_terminate(&eff, async |err| {
                     tracing::error!(%err, "failed to decode message from network");
                 })
                 .await;
             tx_submission
                 .role_state
                 .step(
-                    &MemoryPool::new(eff.clone()),
+                    &mempool,
                     &tx_submission.protocol_state,
                     msg,
                 )
                 .await
                 .or_terminate(&eff, async |err| {
                     tracing::error!(%err, "failed to step tx submission state machine");
                 })
                 .await
         }
     };

209-209: Magic number alert! That 5760's feeling a bit lonely.

The max_buffer value is hardcoded here, mate. Like the FIXME for the timeout below, this could benefit from being a named constant or configurable parameter. Makes it easier to tune or understand why 5760 specifically.

🔎 Suggested improvement

Consider adding a constant at module level:

// Maximum buffer size for tx submission protocol messages
const TX_SUBMISSION_MAX_BUFFER: usize = 5760;

Then use it in the registration:

     eff.send(
         &mux,
         MuxMessage::Register {
             protocol: PROTO_N2N_TX_SUB.erase(),
             frame: mux::Frame::OneCborItem,
             handler: tx_submission_bytes,
-            max_buffer: 5760,
+            max_buffer: TX_SUBMISSION_MAX_BUFFER,
         },
     )
     .await;

252-253: FIXME spotted - timeout value needs tuning.

You've got a placeholder timeout of 1 second here with a FIXME. Based on the learnings about network operations needing proper timeout handling, this is definitely worth revisiting once you have some real-world data on network latencies.

Want me to open an issue to track finding the right timeout value based on performance testing?

crates/amaru-network/src/tx_submission/responder_state.rs (1)

122-155: Origin's going on quite the journey with all those clones.

At line 76 self.origin gets cloned, then passed here at line 144, and then in received_txs (line 243) it's cloned again for each transaction insertion. That's a fair few clones for the same data. Consider having received_txs take &TxOrigin instead to reduce allocations.

🔎 Proposed refactor

Change the signature of received_txs:

 fn received_txs(
     &mut self,
     mempool: &dyn TxSubmissionMempool<Tx>,
     txs: Vec<Tx>,
-    origin: TxOrigin,
+    origin: &TxOrigin,
 ) -> anyhow::Result<()> {

Then at line 144:

-        self.received_txs(mempool, txs, origin)?;
+        self.received_txs(mempool, txs, &origin)?;

And inside received_txs at line 243, the clone is still needed for mempool.insert, but we've saved one clone in the chain.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between c3f0c09 and ac29b0f.

📒 Files selected for processing (22)
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs (12 hunks)
  • crates/amaru-consensus/src/consensus/headers_tree/data_generation/actions.rs (2 hunks)
  • crates/amaru-consensus/src/consensus/stages/receive_header.rs (4 hunks)
  • crates/amaru-mempool/src/strategies/in_memory_mempool.rs (1 hunks)
  • crates/amaru-network/src/connection.rs (1 hunks)
  • crates/amaru-network/src/keepalive/tests.rs (1 hunks)
  • crates/amaru-network/src/socket.rs (1 hunks)
  • crates/amaru-network/src/tx_submission/initiator_state.rs (1 hunks)
  • crates/amaru-network/src/tx_submission/responder_state.rs (1 hunks)
  • crates/amaru-network/src/tx_submission/stage.rs (1 hunks)
  • crates/amaru-network/src/tx_submission/tests/assertions.rs (1 hunks)
  • crates/amaru-network/src/tx_submission/tests/nodes.rs (1 hunks)
  • crates/amaru-network/src/tx_submission/tests/system_test.rs (1 hunks)
  • crates/amaru-network/src/tx_submission/tests/test_cases.rs (1 hunks)
  • crates/amaru-network/src/tx_submission/tests/test_data.rs (1 hunks)
  • crates/amaru-ouroboros-traits/src/mempool.rs (1 hunks)
  • crates/amaru-stores/src/lib.rs (1 hunks)
  • crates/amaru/src/bin/ledger/cmd/sync.rs (3 hunks)
  • crates/amaru/src/stages/mod.rs (8 hunks)
  • simulation/amaru-sim/src/simulator/data_generation/generate.rs (1 hunks)
  • simulation/amaru-sim/src/simulator/run.rs (4 hunks)
  • simulation/amaru-sim/src/sync/chain_sync_message.rs (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
  • crates/amaru-consensus/src/consensus/stages/receive_header.rs
  • crates/amaru-network/src/tx_submission/tests/test_cases.rs
  • crates/amaru-network/src/tx_submission/tests/assertions.rs
  • crates/amaru-network/src/tx_submission/tests/system_test.rs
  • crates/amaru-network/src/tx_submission/tests/test_data.rs
  • crates/amaru-consensus/src/consensus/headers_tree/data_generation/actions.rs
  • crates/amaru-network/src/tx_submission/initiator_state.rs
🧰 Additional context used
🧠 Learnings (26)
📚 Learning: 2025-08-08T14:43:20.218Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.

Applied to files:

  • crates/amaru/src/bin/ledger/cmd/sync.rs
  • simulation/amaru-sim/src/sync/chain_sync_message.rs
  • simulation/amaru-sim/src/simulator/run.rs
  • crates/amaru-stores/src/lib.rs
📚 Learning: 2025-08-20T13:02:25.763Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.

Applied to files:

  • crates/amaru/src/bin/ledger/cmd/sync.rs
  • simulation/amaru-sim/src/sync/chain_sync_message.rs
  • simulation/amaru-sim/src/simulator/data_generation/generate.rs
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
  • crates/amaru-mempool/src/strategies/in_memory_mempool.rs
  • crates/amaru-stores/src/lib.rs
📚 Learning: 2025-06-03T06:31:57.736Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 236
File: simulation/amaru-sim/src/simulator/generate.rs:141-145
Timestamp: 2025-06-03T06:31:57.736Z
Learning: In the amaru project, the team prefers to use as_bytes() instead of hex::decode() for converting hash and header strings to bytes in simulation/amaru-sim/src/simulator/generate.rs, even though they appear to be hex-encoded strings.

Applied to files:

  • crates/amaru/src/bin/ledger/cmd/sync.rs
  • simulation/amaru-sim/src/sync/chain_sync_message.rs
  • simulation/amaru-sim/src/simulator/run.rs
  • crates/amaru-stores/src/lib.rs
📚 Learning: 2025-08-08T14:46:53.013Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/pool_params.rs:107-116
Timestamp: 2025-08-08T14:46:53.013Z
Learning: In crates/amaru-kernel/src/pool_params.rs, when serializing Relay::SingleHostAddr IPv6 to text, the project intentionally reverses each 4-byte chunk before constructing std::net::Ipv6Addr. This matches cardano-ledger’s IPv6 representation (four little-endian Word32 chunks). Do not “simplify” by passing the raw 16 bytes directly to Ipv6Addr::from; that would break ledger compatibility.

Applied to files:

  • crates/amaru/src/bin/ledger/cmd/sync.rs
  • simulation/amaru-sim/src/simulator/run.rs
📚 Learning: 2025-12-16T21:32:37.668Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/amaru-network/src/handshake/tests.rs:40-47
Timestamp: 2025-12-16T21:32:37.668Z
Learning: In Rust, shadowing a binding with a new let does not drop the previous binding until the end of the scope. All shadowed bindings in a scope are dropped in reverse-declaration order when the scope ends. Therefore, multiple let _guard = register_*() calls will keep all guards alive until the end of the function (or the surrounding scope). When reviewing code, be mindful that resources tied to shadowed bindings persist longer than the most recent binding; to release early, constrain the lifetime in an inner block or explicitly drop guards when appropriate.

Applied to files:

  • crates/amaru/src/bin/ledger/cmd/sync.rs
  • crates/amaru-network/src/socket.rs
  • crates/amaru-network/src/connection.rs
  • simulation/amaru-sim/src/sync/chain_sync_message.rs
  • crates/amaru-network/src/keepalive/tests.rs
  • crates/amaru/src/stages/mod.rs
  • crates/amaru-network/src/tx_submission/responder_state.rs
  • simulation/amaru-sim/src/simulator/data_generation/generate.rs
  • crates/amaru-network/src/tx_submission/stage.rs
  • crates/amaru-network/src/tx_submission/tests/nodes.rs
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
  • simulation/amaru-sim/src/simulator/run.rs
  • crates/amaru-mempool/src/strategies/in_memory_mempool.rs
  • crates/amaru-stores/src/lib.rs
  • crates/amaru-ouroboros-traits/src/mempool.rs
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • simulation/amaru-sim/src/sync/chain_sync_message.rs
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
  • simulation/amaru-sim/src/simulator/run.rs
📚 Learning: 2025-04-22T09:18:19.893Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.

Applied to files:

  • simulation/amaru-sim/src/sync/chain_sync_message.rs
  • simulation/amaru-sim/src/simulator/run.rs
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • simulation/amaru-sim/src/sync/chain_sync_message.rs
  • simulation/amaru-sim/src/simulator/data_generation/generate.rs
  • simulation/amaru-sim/src/simulator/run.rs
📚 Learning: 2025-04-22T09:18:19.893Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, the ValidateHeader::handle_roll_forward method returns a PullEvent, not a ValidateHeaderEvent.

Applied to files:

  • simulation/amaru-sim/src/sync/chain_sync_message.rs
📚 Learning: 2025-05-12T14:21:27.470Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.

Applied to files:

  • crates/amaru-network/src/keepalive/tests.rs
  • simulation/amaru-sim/src/simulator/data_generation/generate.rs
  • crates/amaru-network/src/tx_submission/tests/nodes.rs
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
  • simulation/amaru-sim/src/simulator/run.rs
📚 Learning: 2025-04-20T17:56:48.565Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.

Applied to files:

  • crates/amaru/src/stages/mod.rs
📚 Learning: 2025-05-09T13:09:47.915Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 206
File: crates/pure-stage/src/simulation/running.rs:240-242
Timestamp: 2025-05-09T13:09:47.915Z
Learning: Cloning messages in the pure-stage crate should be avoided for performance reasons. The current implementation in SimulationRunning deliberately avoids duplicating message data structures.

Applied to files:

  • simulation/amaru-sim/src/simulator/data_generation/generate.rs
📚 Learning: 2025-09-05T17:30:55.869Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Applied to files:

  • crates/amaru-network/src/tx_submission/stage.rs
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
  • crates/amaru-mempool/src/strategies/in_memory_mempool.rs
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru-network/src/tx_submission/stage.rs
📚 Learning: 2025-05-21T18:58:48.631Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.

Applied to files:

  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
  • crates/amaru-mempool/src/strategies/in_memory_mempool.rs
📚 Learning: 2025-09-29T20:08:29.906Z
Learnt from: yHSJ
Repo: pragma-org/amaru PR: 453
File: crates/amaru-metrics/src/ledger.rs:60-104
Timestamp: 2025-09-29T20:08:29.906Z
Learning: In amaru-ledger/src/state.rs, the txs_processed field represents the number of transactions in the current block (block.transaction_bodies.len() as u64), not a cumulative total. Therefore, using a Counter with .add() in the metrics is correct for accumulating per-block transaction counts.

Applied to files:

  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
📚 Learning: 2025-08-12T12:28:24.027Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 372
File: simulation/amaru-sim/src/simulator/mod.rs:410-412
Timestamp: 2025-08-12T12:28:24.027Z
Learning: In the Amaru project, panic statements are acceptable in simulation/test code (like amaru-sim crate) as they help identify configuration issues quickly during development, rather than needing proper error handling like production code.

Applied to files:

  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
📚 Learning: 2025-04-20T17:57:23.233Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Applied to files:

  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
📚 Learning: 2025-01-21T15:32:17.911Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 69
File: crates/amaru/src/ledger/state/diff_epoch_reg.rs:112-117
Timestamp: 2025-01-21T15:32:17.911Z
Learning: When suggesting code changes in Rust, always verify that the types align correctly, especially when dealing with references and Options. The `Fold::Registered` variant in `diff_epoch_reg.rs` expects a reference `&'a V`, so unwrapping an `Option<&V>` requires only a single `.expect()`.

Applied to files:

  • simulation/amaru-sim/src/simulator/run.rs
📚 Learning: 2025-08-08T14:35:35.562Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/transaction_pointer.rs:36-44
Timestamp: 2025-08-08T14:35:35.562Z
Learning: In the amaru project, when decoding CBOR arrays, prefer using minicbor_extra::heterogenous_array with the expected length to validate definite-length arrays and correctly handle indefinite-length arrays. Example: crates/amaru-kernel/src/transaction_pointer.rs Decode should use heterogenous_array(d, 2, …) instead of ignoring the length from d.array().

Applied to files:

  • simulation/amaru-sim/src/simulator/run.rs
📚 Learning: 2025-09-06T09:16:25.025Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 435
File: crates/amaru/src/bin/amaru/cmd/convert_ledger_state.rs:113-116
Timestamp: 2025-09-06T09:16:25.025Z
Learning: In cardano-node serialized ledger state CBOR encoding, indefinite-length structures may be terminated with 0xFF "break" markers. The current code in convert_ledger_state.rs unconditionally strips the last byte (bytes[p..bytes.len() - 1]), which could corrupt data if the trailing byte is not actually a CBOR break marker.

Applied to files:

  • simulation/amaru-sim/src/simulator/run.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.

Applied to files:

  • simulation/amaru-sim/src/simulator/run.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In the amaru consensus pipeline, headers are now stored before validation (store_header stage runs before validate_header stage). The validity state of headers should be tracked in the database to avoid revalidating already-valid headers and to prevent flood attacks with duplicate invalid headers.

Applied to files:

  • simulation/amaru-sim/src/simulator/run.rs
📚 Learning: 2025-08-18T08:10:32.640Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:427-433
Timestamp: 2025-08-18T08:10:32.640Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp, so unimplemented methods like set_proposals_roots and set_constitution are intentionally left as placeholders until the revamp is complete.

Applied to files:

  • crates/amaru-mempool/src/strategies/in_memory_mempool.rs
📚 Learning: 2025-08-18T08:10:35.849Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:431-433
Timestamp: 2025-08-18T08:10:35.849Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp soon, so unimplemented methods like set_constitution, set_proposals_roots are intentionally left aside until the revamp is complete.

Applied to files:

  • crates/amaru-mempool/src/strategies/in_memory_mempool.rs
📚 Learning: 2025-06-14T16:38:35.449Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 263
File: crates/amaru-consensus/src/consensus/store.rs:220-223
Timestamp: 2025-06-14T16:38:35.449Z
Learning: In `NetworkName::Preprod.into()` when converting to `&EraHistory`, the From implementation returns a static reference to a constant value, not a temporary. This makes it safe to return directly from functions expecting `&EraHistory` without storing it in a struct field.

Applied to files:

  • crates/amaru-stores/src/lib.rs
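
The learning above about shadowed `_guard` bindings (from PR 584) can be checked with a small standalone sketch. The `Guard` type and the shared log are hypothetical and exist only to make the drop order observable; they are not taken from the amaru test helpers.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Records its name into a shared log when dropped.
struct Guard {
    name: &'static str,
    log: Log,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn shadowed_guards(log: &Log) {
    let _guard = Guard { name: "first", log: log.clone() };
    // Shadowing does not drop "first": both guards live until the function returns.
    let _guard = Guard { name: "second", log: log.clone() };
    log.borrow_mut().push("end of body");
    // On return, locals are dropped in reverse declaration order.
}

fn scoped_guard(log: &Log) {
    {
        let _guard = Guard { name: "inner", log: log.clone() };
    } // "inner" is dropped here, at the end of the inner block.
    log.borrow_mut().push("after block");
}
```

Wrapping a guard in an inner block, as in `scoped_guard`, is one way to release it early; calling `drop(guard)` explicitly is another.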
🧬 Code graph analysis (8)
crates/amaru/src/bin/ledger/cmd/sync.rs (1)
crates/amaru-kernel/src/protocol_messages/point.rs (2)
  • hash (41-47)
  • from (66-68)
simulation/amaru-sim/src/sync/chain_sync_message.rs (2)
crates/amaru-consensus/src/consensus/headers_tree/data_generation/actions.rs (1)
  • hash (70-75)
crates/amaru-kernel/src/protocol_messages/point.rs (1)
  • hash (41-47)
crates/amaru-network/src/keepalive/tests.rs (2)
crates/amaru-network/src/socket.rs (3)
  • create_connection (171-181)
  • new (45-48)
  • new (72-77)
crates/amaru-network/src/connection.rs (2)
  • new (39-49)
  • stage (84-97)
simulation/amaru-sim/src/simulator/data_generation/generate.rs (1)
crates/amaru-kernel/src/is_header/mod.rs (1)
  • header (152-154)
crates/amaru-network/src/tx_submission/stage.rs (4)
crates/amaru-network/src/protocol.rs (1)
  • outcome (48-53)
crates/amaru-network/src/tx_submission/initiator_state.rs (1)
  • new (48-53)
crates/amaru-network/src/tx_submission/responder_state.rs (1)
  • new (48-57)
crates/amaru-network/src/tx_submission/messages.rs (2)
  • decode (96-124)
  • message_type (35-44)
crates/amaru-network/src/tx_submission/tests/nodes.rs (2)
crates/amaru-network/src/tx_submission/initiator_state.rs (1)
  • step (55-72)
crates/amaru-network/src/tx_submission/responder_state.rs (2)
  • txs (134-134)
  • step (59-91)
crates/amaru-consensus/src/consensus/effects/consensus_effects.rs (2)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (1)
  • point (87-89)
crates/amaru-mempool/src/strategies/in_memory_mempool.rs (2)
  • default (43-45)
  • default (81-88)
simulation/amaru-sim/src/simulator/run.rs (3)
crates/amaru-kernel/src/is_header/mod.rs (8)
  • tip (168-170)
  • slot (52-52)
  • slot (217-219)
  • slot (239-241)
  • hash (37-37)
  • hash (139-141)
  • hash (205-207)
  • hash (227-229)
crates/amaru-kernel/src/protocol_messages/tip.rs (2)
  • slot (34-36)
  • hash (38-40)
crates/amaru-kernel/src/protocol_messages/point.rs (1)
  • hash (41-47)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: clippy
  • GitHub Check: Build riscv32
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Test the CLI
  • GitHub Check: Test coverage
  • GitHub Check: Benches
  • GitHub Check: Analyze (rust)
🔇 Additional comments (31)
crates/amaru-stores/src/lib.rs (1)

207-207: Spot on with the Hash migration, mate!

The switch from Vec<u8> to Hash::from([0u8; 32]) for Point::Specific is bang on: it aligns perfectly with the broader PR goal of moving toward Copy semantics and reducing allocations. Using a zero hash for test fixture data is totally fair dinkum for this scenario.

crates/amaru-network/src/keepalive/tests.rs (1)

28-68: Nice scaffolding for the integration test, mate!

The test is appropriately marked #[ignore] since it needs a live node. A couple of thoughts:

  1. The test name says test_keepalive_with_node but it wires up the full connection::stage rather than testing keepalive in isolation. That's totally grand if the intent is end-to-end, but might want to rename it to something like test_connection_keepalive_with_node for clarity.

  2. The timeout path at lines 64-66 logs "test timed out as expected" but returns () just like the success path. If the keepalive is expected to run indefinitely (like those never-ending Dark Souls runs), that's fair dinkum. But if there's a specific behavior you're waiting for, you might want to assert on it rather than just timing out.

crates/amaru-network/src/socket.rs (1)

167-181: Solid test utility, like finding a health pack right when you need it!

Clean helper function with sensible defaults. The 5-second timeout prevents tests from hanging like a frozen game. The PEER env var override is a nice touch for flexibility.

One small note: the nested async block means errors from resolve() and connect() get wrapped by the timeout error. The current approach with ? propagation handles this fine, but just be aware that a resolution failure will surface as anyhow::Error rather than a distinct error type if that matters for debugging.

crates/amaru-network/src/connection.rs (4)

60-75: State machine looking crisp, like a fresh character build!

The State::Initiator variant stores a bunch of StageRefs. Quick question though: are the handshake, keepalive, and tx_submission refs actually used after the connection is established, or are they just kept alive to prevent dropping? If they're needed for future interactions (like a responder state transition), that's totally ace. But if they're only retained for lifetime purposes, a comment would help future devs understand the intent.


89-95: Heads up on the unimplemented! - it's like a "You Died" screen waiting to happen!

The catch-all unimplemented!("{x:?}") at line 94 will panic on any unexpected state/message combo. Given the PR description mentions this is WIP ("no individual test yet"), that's fair for now. But before this goes to prod, you'll want to either:

  1. Replace with proper error handling (maybe eff.terminate() with logging?)
  2. Or explicitly match all valid transitions and use unreachable!() for truly impossible states

The debug format in the panic message is handy for debugging though - nice touch!
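
As a rough illustration of option 1, here is a minimal sketch of a transition function that reports unexpected combinations as an error value instead of panicking. The `State`, `Msg` and `UnexpectedTransition` types are hypothetical and much simpler than the real connection stage; how the error is then surfaced (logging, terminating the stage) is left open.

```rust
// Hypothetical states and messages; the real connection stage has richer types.
#[derive(Debug, Clone, PartialEq)]
enum State {
    Initial,
    Handshaking,
    Active,
}

#[derive(Debug, Clone, PartialEq)]
enum Msg {
    Start,
    HandshakeDone,
    Data(u8),
}

/// Describes a state/message pair that has no defined transition.
#[derive(Debug, PartialEq)]
struct UnexpectedTransition {
    state: State,
    msg: Msg,
}

fn step(state: State, msg: Msg) -> Result<State, UnexpectedTransition> {
    match (state, msg) {
        (State::Initial, Msg::Start) => Ok(State::Handshaking),
        (State::Handshaking, Msg::HandshakeDone) => Ok(State::Active),
        (State::Active, Msg::Data(_)) => Ok(State::Active),
        // Everything else is returned to the caller, which keeps the
        // debug-friendly details without aborting the process.
        (state, msg) => Err(UnexpectedTransition { state, msg }),
    }
}
```

The caller can still log the `Debug` output of the error, so the diagnostic value of the current panic message is preserved.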


99-149: The initialization choreography is smooth as a speedrun!

Nice use of contramap for the message transformations - keeps the wiring clean. The flow from muxer → handshake → handler registration is well structured. The VersionTable::v11_and_above with initiator_only_diffusion_mode: true makes sense for an initiating client.


151-178: Handshake handling is solid, like nailing a QTE!

Clean extraction of version info and proper termination on refusal. The TxOrigin::Remote(peer.clone()) is correct since you need ownership there.

One thing though - State::Initiator is returned regardless of the role parameter. If a Role::Responder goes through this path, it'll end up in a state called Initiator which could be confusing. Might be worth renaming to something like State::Active or State::Connected, unless responders take a different code path entirely?

crates/amaru/src/stages/mod.rs (4)

196-197: Proper Origin handling, legend!

The code correctly handles both the case where a header exists and the fresh node scenario (Origin with height 0). This aligns with the learnings about bootstrapping nodes properly.

Based on learnings, this is the right way to initialize the tip for a fresh node.


224-224: Tip conversion looks ace!

The to_pallas_tip conversion cleanly bridges the kernel Tip to the pallas_network Tip type for the forward chain server. Nice separation of concerns!


259-263: Mempool registration is spot on!

The ResourceMempool registration follows the same clean pattern as the other resources. InMemoryMempool wrapped in Arc – textbook stuff!


440-443: Clean Tip construction, mate!

The direct tuple construction Tip(self.point(), self.block_height()) is much cleaner than going through a conversion function. Straightforward and type-safe – chef's kiss! 👌

crates/amaru-ouroboros-traits/src/mempool.rs (1)

138-145: Nice one—TxOrigin doc comment now matches the code!

Cheers for fixing this! The previous review flagged that the doc mentioned Remote(Peer) but the enum was fieldless. Now line 144 properly includes the Peer field, so the docs and code are in sync. That's the kind of consistency that'd make even the strictest code reviewer crack a smile!

crates/amaru-mempool/src/strategies/in_memory_mempool.rs (1)

91-126: Insert logic looks solid.

The insertion path handles the key constraints well—checks capacity, prevents duplicates, assigns sequence numbers, and maintains both indexes. The CBOR size calculation and notification of waiters is also spot on. No dramas here!

crates/amaru-consensus/src/consensus/effects/consensus_effects.rs (3)

19-21: Clean mempool integration into ConsensusOps.

This is textbook integration work, mate! You've added the mempool accessor to the trait (lines 31-33), implemented it in ConsensusEffects by wrapping it in a MemoryPool (lines 58-60), and delegated it properly in the trait impl (lines 84-86). The pattern matches the existing store() and ledger() accessors perfectly. Chef's kiss! 👨‍🍳

Also applies to: 58-60, 84-86


287-294: Mock call() stubs are appropriate for testing.

Adding these call() method stubs to both MockBaseOps implementations is the right move for supporting the expanded BaseOps trait API. Returning None is a sensible default for mock behavior—like a test dummy that politely declines to help. If specific tests need custom responses, they can extend the mock later.

Also applies to: 323-330


360-369: Mock setup with InMemoryMempool looks good.

The mock_consensus_ops() helper now properly initializes a mock_mempool using InMemoryMempool::default() (line 363). This gives tests a working mempool implementation out of the box, which is exactly what you want for test scaffolding.

Just note that the underlying InMemoryMempool has the infinite loop issue in wait_for_at_least that was flagged earlier in this review. Once that's sorted, this mock will be even more reliable!

crates/amaru-network/src/tx_submission/stage.rs (3)

85-127: Nice one! Responder stage looks solid.

The logic here is clean and the error handling is spot-on. Same MemoryPool pattern as the initiator, so if you refactor that one, might as well keep this consistent too.


129-155: Clean outcome handling, good stuff!

The pattern here is solid - send messages, handle errors, and terminate when done. Like a well-choreographed dance routine.


246-250: The expect pattern here is fair dinkum.

Using expect() for encoding that's guaranteed by the message format is reasonable. The #[expect(clippy::expect_used)] shows this is intentional, and these really shouldn't fail given the protocol guarantees.

crates/amaru-network/src/tx_submission/responder_state.rs (3)

28-45: Nice data structure choices, champion!

Love the dual tracking of inflight fetches with both queue and set - like having both a to-do list and a quick reference card. The field docs are clear too.
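
For readers unfamiliar with the pattern, a minimal sketch of queue-plus-set tracking follows. The `Inflight` type and its `u64` ids are hypothetical; the actual responder state uses its own field names and transaction id type.

```rust
use std::collections::{HashSet, VecDeque};

/// Tracks in-flight ids in request order, with O(1) membership checks.
#[derive(Default)]
struct Inflight {
    order: VecDeque<u64>,
    members: HashSet<u64>,
}

impl Inflight {
    /// Registers an id; returns false if it was already in flight.
    fn push(&mut self, id: u64) -> bool {
        if !self.members.insert(id) {
            return false;
        }
        self.order.push_back(id);
        true
    }

    /// Removes and returns the oldest in-flight id, keeping both views in sync.
    fn pop(&mut self) -> Option<u64> {
        let id = self.order.pop_front()?;
        self.members.remove(&id);
        Some(id)
    }

    fn contains(&self, id: u64) -> bool {
        self.members.contains(&id)
    }
}
```

The cost of the pattern is that every mutation has to update both collections, which is why funnelling changes through a couple of small methods helps.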


157-192: Protocol invariant guards are on point.

Using expect() for checked arithmetic on protocol invariants is the right call here - if these overflow or underflow, something's gone seriously wrong with the protocol logic. The #[allow(clippy::expect_used)] shows you've thought this through.


255-413: Test coverage is top-notch!

Great to see comprehensive tests covering window limits, batch sizes, and the inflight tracking. The test helpers make the scenarios nice and readable too.

crates/amaru-network/src/tx_submission/tests/nodes.rs (2)

26-38: Solid test harness structure!

The two-node setup with separate states and mempools is clean, and exposing the mempools and outcomes publicly makes verification easy in tests. Nice work.


81-126: Step logic is nice and clear.

The agency alternation between initiator and responder is easy to follow, and while there are a few clones happening (lines 90, 103, 120, 121), that's perfectly fine for test harness code where clarity trumps performance.

simulation/amaru-sim/src/simulator/data_generation/generate.rs (1)

257-257: Noice! Trimmed the fat off that conversion, mate.

Removing Slot::from() here is spot on—header.slot() already returns the right type, so wrapping it was just doing a walkabout for no reason. Clean and lean, just like a good flat white.

simulation/amaru-sim/src/simulator/run.rs (3)

36-38: Solid imports, bringing in the new crew.

Adding Hash and Tip from amaru_kernel to replace the old HeaderTip setup—that's the ticket. These are the building blocks for the type refactoring below.


182-182: Right on with the Tip refactor!

Swapping HeaderTip::new(Point::Origin, 0) for Tip::new(Point::Origin, 0.into()) is the way to go. The .into() there converts the block height to whatever newtype wrapper it needs—probably BlockHeight. Origin point for genesis, zero height, Bob's your uncle.


159-159: Hash conversion compiles and is sound—no issues here.

The Hash::from(&*hash.bytes) pattern is idiomatic Rust converting Vec<u8> to &[u8]. This same pattern is already used in the project's test code (point.rs line 153) with Hash::from(random_bytes(HEADER_HASH_SIZE).as_slice()), so you're in good company. Since HeaderHash is Hash<32> and the project maintains the invariant that header hashes are always 32 bytes, the From implementation doesn't need runtime length validation—that's by design.
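
To make the trade-off concrete, here is a small sketch with a stand-in type. `DemoHash` and `try_hash` are hypothetical and are not the project's `Hash`; the sketch only assumes that an unchecked `From<&[u8]>` copies the bytes and panics on a length mismatch, and shows what a checked alternative could look like.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a fixed-size hash; not the project's `Hash` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DemoHash<const N: usize>([u8; N]);

impl<const N: usize> From<&[u8]> for DemoHash<N> {
    /// Panics if `bytes.len() != N`; callers are expected to uphold the
    /// length invariant before converting.
    fn from(bytes: &[u8]) -> Self {
        let mut out = [0u8; N];
        out.copy_from_slice(bytes);
        DemoHash(out)
    }
}

/// Checked alternative: returns `None` when the length does not match.
pub fn try_hash<const N: usize>(bytes: &[u8]) -> Option<DemoHash<N>> {
    <[u8; N]>::try_from(bytes).ok().map(DemoHash)
}
```

With a `Vec<u8>` named `bytes`, `DemoHash::<32>::from(&*bytes)` mirrors the `&*hash.bytes` deref-to-slice pattern discussed above.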

crates/amaru/src/bin/ledger/cmd/sync.rs (3)

19-19: Solid import additions, mate!

Adding Hash and FromStr to support the new parsing logic below. Clean as a whistle.

Also applies to: 35-35


169-169: Beauty! Point's got Copy semantics now.

Swapping num.clone() for *num is a ripper optimization—no more unnecessary allocations when sorting. This is exactly the kind of quality-of-life improvement that makes the codebase feel like a well-oiled machine. Level up! 🎮


159-161: Solid upgrade to idiomatic hash parsing, mate!

The shift to Hash::from_str() is spot-on—it's the proper Rust way to handle string-to-type conversions via the FromStr trait. Same format, same permissiveness as the old hex decode approach, just cleaner. The slot.into() conversion meshes perfectly with the type standardization you're pushing through.

No compatibility worries here—archive filenames will parse just fine with the standard hex format.
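
For illustration, a self-contained sketch of hex parsing through `FromStr`. `HexHash`, `parse_stem` and the `<slot>.<hex hash>` stem layout are assumptions for this example, not the project's `Hash` type or its actual archive naming.

```rust
use std::str::FromStr;

/// Hypothetical fixed-size hash parsed from hex; not the project's `Hash`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HexHash<const N: usize>(pub [u8; N]);

impl<const N: usize> FromStr for HexHash<N> {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Require exactly 2 * N ASCII hex digits, nothing else.
        if s.len() != 2 * N || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
            return Err(format!("expected {} hex digits", 2 * N));
        }
        let mut out = [0u8; N];
        for (i, byte) in out.iter_mut().enumerate() {
            // Slicing is safe here because the input is known to be ASCII.
            *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16)
                .map_err(|e| e.to_string())?;
        }
        Ok(HexHash(out))
    }
}

/// Parses a hypothetical "<slot>.<hex hash>" file stem.
pub fn parse_stem(stem: &str) -> Option<(u64, HexHash<4>)> {
    let (slot, hash) = stem.split_once('.')?;
    Some((slot.parse().ok()?, hash.parse().ok()?))
}
```

Implementing `FromStr` is what lets call sites write `hash.parse()` and have the target type inferred from context.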

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

♻️ Duplicate comments (5)
Cargo.toml (1)

48-55: Hold up, mate—are these pallas versions actually released?

The previous review flagged that pallas 0.34.0 doesn't exist yet, with the latest stable being v0.33.0 (July 13, 2024). Before you merge this, we need to confirm these versions are actually published on crates.io, otherwise your CI build is gonna have a bad time trying to fetch non-existent crates.

Run this to check the actual published versions:

#!/bin/bash
# Check latest published versions of pallas crates on crates.io

for crate in pallas-addresses pallas-codec pallas-crypto pallas-hardano pallas-math pallas-network pallas-primitives pallas-traverse; do
  echo "Checking $crate..."
  curl -s "https://crates.io/api/v1/crates/$crate" | jq -r '.crate.max_stable_version // .crate.max_version // "NOT FOUND"'
done

Also search the web to confirm:

What is the latest published version of pallas-primitives on crates.io?
crates/amaru-protocols/src/store_effects.rs (1)

76-82: Stub methods returning None — already flagged.

These methods were noted in a previous review. I'll not repeat the same tune twice, like a broken record in a hipster vinyl shop.

crates/amaru-protocols/src/mempool_effects.rs (1)

15-26: Missing Future import will cause a compile error.

G'day! The past review flagged this and it's still an issue - line 87 uses Pin<Box<dyn Future<Output = bool> + Send + '_>> but Future isn't imported. You've got std::pin::Pin but not std::future::Future. Same goes for the test module at line 382.

🔎 Proposed fix to add Future import
 use std::fmt::Debug;
+use std::future::Future;
 use std::pin::Pin;
crates/amaru-protocols/src/chainsync/initiator.rs (1)

103-109: Timeout result from mux call is silently ignored - consider handling.

The eff.call() returns Option<Resp> where None indicates a timeout. Currently, the result is discarded. If the send times out, the message is lost without any indication. At minimum, logging the timeout would provide visibility into network issues.

This was flagged in a previous review and is still applicable.

🔎 Suggested fix: Handle the timeout case
-        eff.call(&initiator.muxer, NETWORK_SEND_TIMEOUT, move |cr| {
+        let result = eff.call(&initiator.muxer, NETWORK_SEND_TIMEOUT, move |cr| {
             MuxMessage::Send(PROTO_N2N_CHAIN_SYNC.erase(), msg, cr)
         })
         .await;
+        if result.is_none() {
+            tracing::warn!(peer = %initiator.peer, "chainsync send timed out");
+        }
crates/amaru-ouroboros-traits/src/mempool.rs (1)

20-23: Missing std::future::Future import.

Line 94 uses Future in the return type Pin<Box<dyn Future<Output = bool> + Send + '_>>, but Future isn't imported. In Rust 2021, it's not in the prelude, so you'll need to add it explicitly. Otherwise the compiler will be as confused as a character with no quest marker!

🔎 Proposed fix
 use std::fmt;
 use std::fmt::{Display, Formatter};
+use std::future::Future;
 use std::pin::Pin;
 use std::sync::Arc;
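
As a standalone illustration of the import in context, here is a sketch of a trait method returning the boxed-future type quoted above. `WaitsForTxs`, `FixedPool` and `poll_once` are hypothetical names invented for this example; only the return type is taken from the review.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical trait using the boxed-future return type quoted above.
pub trait WaitsForTxs {
    fn wait_for_at_least(&self, n: usize) -> Pin<Box<dyn Future<Output = bool> + Send + '_>>;
}

/// Toy pool with a fixed size, for illustration only.
pub struct FixedPool {
    pub len: usize,
}

impl WaitsForTxs for FixedPool {
    fn wait_for_at_least(&self, n: usize) -> Pin<Box<dyn Future<Output = bool> + Send + '_>> {
        // Resolves immediately; a real mempool would wait for new entries.
        Box::pin(std::future::ready(self.len >= n))
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a boxed future exactly once with a waker that does nothing.
pub fn poll_once<T>(mut fut: Pin<Box<dyn Future<Output = T> + Send + '_>>) -> Poll<T> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    fut.as_mut().poll(&mut cx)
}
```

Dropping the `use std::future::Future;` line from this sketch makes both the return types and the `poll` call fail to resolve on editions where `Future` is not in the prelude.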
🧹 Nitpick comments (18)
crates/amaru-kernel/Cargo.toml (1)

44-45: Same WASM FIXME as in amaru-ouroboros—time to sort this out?

You've got the same FIXME comment here about the WASM story, and getrandom is marked as ignored by cargo-machete (line 61). This suggests it's not directly used but needed for feature resolution. Since this is now workspace-unified, consider documenting the reasoning in a central location (maybe root Cargo.toml or a WASM.md doc) rather than scattering FIXMEs across multiple crates.

Would you like to consolidate the WASM dependency rationale into a single doc or comment?

crates/amaru/src/stages/mod.rs (1)

233-317: Strewth, that's a chunky block of commented code! Like finding a cassette tape in 2025.

I get it: you're scaffolding for the upcoming network protocol stack, and the PR description says you'll test when stages are complete. But keeping 85 lines of commented-out code (lines 233-317) in the main branch is a bit dodge, mate. It can get stale faster than bread in summer and confuses anyone reading through later.

Consider these alternatives instead:

  • Open a tracking issue and link it with a TODO comment
  • Use feature flags if you want to merge partial implementations
  • Keep the scaffolding in a separate WIP branch until it's ready to integrate
crates/amaru-protocols/src/handshake/tests.rs (1)

162-188: Minor inconsistency with network magic handling.

In test_against_node, you've got a nice network_magic variable that's reused (lines 49, 81, 117). But here in test_against_node_with_tokio, you're hardcoding NetworkMagic::MAINNET directly in two places. It's like having a config file but then typing values manually anyway — works, but not as tidy.

🔎 Optional: Extract to a variable for consistency
+    let network_magic = NetworkMagic::MAINNET;
+
     let handshake = network.stage("handshake", handshake::stage);
     let handshake = network.wire_up(
         handshake,
         handshake::Handshake::new(
             mux.clone().without_state(),
             output,
             Role::Initiator,
-            VersionTable::v11_and_above(NetworkMagic::MAINNET, true),
+            VersionTable::v11_and_above(network_magic, true),
         ),
     );
     ...
     assert_eq!(
         result,
         handshake::HandshakeResult::Accepted(
             VersionNumber::V14,
-            VersionData::new(NetworkMagic::MAINNET, true, PEER_SHARING_DISABLED, false),
+            VersionData::new(network_magic, true, PEER_SHARING_DISABLED, false),
         )
     );
crates/amaru-protocols/src/tx_submission/tests/system_test.rs (1)

35-38: Use try_init() to avoid panics when tracing is already initialized.

Using .init() will panic if the subscriber is already set (e.g., when running multiple tests in parallel). It's like trying to load a save file that's already loaded — the game crashes. Using .try_init() is more forgiving.

🔎 Switch to try_init()
-    tracing_subscriber::fmt()
-        .with_env_filter(EnvFilter::from_default_env())
-        .with_test_writer()
-        .init();
+    let _ = tracing_subscriber::fmt()
+        .with_env_filter(EnvFilter::from_default_env())
+        .with_test_writer()
+        .try_init();
crates/amaru-protocols/src/effects.rs (1)

170-187: Use the imported ToSocketAddrs instead of the fully qualified path

Line 16 already imports ToSocketAddrs from amaru_ouroboros, while line 180 spells out the full path amaru_ouroboros_traits::connection::ToSocketAddrs::String(...). Since amaru_ouroboros re-exports everything from amaru_ouroboros_traits, both paths name the same type. For consistency and readability, swap the full path for the ToSocketAddrs::String(...) you've already got in scope: cleaner code, no redundancy.

crates/amaru-protocols/src/tx_submission/messages.rs (1)

143-161: Minor nit on Display formatting.

The Display implementation works, but you could tighten it up a bit using write! with iterators directly instead of collecting into a Vec<_> first. Not a blocker though — just a "nice to have" like finding a shortcut in a speedrun.

🔎 Optional: Avoid intermediate Vec allocation
 Message::ReplyTxIds(ids) => {
-    write!(
-        f,
-        "ReplyTxIds(ids: [{}])",
-        ids.iter()
-            .map(|(id, size)| format!("({}, {})", id, size))
-            .collect::<Vec<_>>()
-            .join(", ")
-    )
+    write!(f, "ReplyTxIds(ids: [")?;
+    for (i, (id, size)) in ids.iter().enumerate() {
+        if i > 0 { write!(f, ", ")?; }
+        write!(f, "({}, {})", id, size)?;
+    }
+    write!(f, "])")
 }
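
The same idea as a compilable sketch, in case it helps to try it outside the crate. `ReplyTxIds` here is a hypothetical newtype over `(u64, u32)` pairs; the real message uses the project's own id type.

```rust
use std::fmt;

/// Hypothetical stand-in for the ReplyTxIds payload: (id, size) pairs.
pub struct ReplyTxIds(pub Vec<(u64, u32)>);

impl fmt::Display for ReplyTxIds {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "ReplyTxIds(ids: [")?;
        for (i, (id, size)) in self.0.iter().enumerate() {
            // Separator goes before every element except the first.
            if i > 0 {
                write!(f, ", ")?;
            }
            write!(f, "({}, {})", id, size)?;
        }
        write!(f, "])")
    }
}
```

The rendered text is identical to the `collect` plus `join` version; the only difference is that each element is written straight into the formatter.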
crates/amaru-protocols/src/chainsync/messages.rs (1)

91-128: Decode implementation mirrors encoding nicely.

The decode logic follows the same pattern as encoding with proper variant matching. One small thing though - the error message at line 124 doesn't include the actual variant number received, which could make debugging a bit of a treasure hunt.

🔎 Optional: Include the unknown variant in the error message
-            _ => Err(decode::Error::message(
-                "unknown variant for chainsync message",
-            )),
+            n => Err(decode::Error::message(format!(
+                "unknown variant for chainsync message: {}",
+                n
+            ))),
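
A stripped-down sketch of the pattern with no CBOR involved. `DemoMessage`, `decode_tag` and the tag numbers are assumptions for illustration and not the crate's decoder; the point is only that binding the fallthrough arm to a name lets the error report what was actually received.

```rust
/// Hypothetical subset of message kinds, for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub enum DemoMessage {
    RequestNext,
    AwaitReply,
    Done,
}

/// Maps a variant tag to a message kind, reporting unknown tags by value.
pub fn decode_tag(tag: u16) -> Result<DemoMessage, String> {
    match tag {
        0 => Ok(DemoMessage::RequestNext),
        1 => Ok(DemoMessage::AwaitReply),
        7 => Ok(DemoMessage::Done),
        // Bind the unmatched value so it can appear in the error text.
        n => Err(format!("unknown variant for chainsync message: {}", n)),
    }
}
```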
crates/amaru-protocols/src/mempool_effects.rs (1)

116-132: Consider documenting the expect message pattern.

All the #[expect(clippy::expect_used)] annotations with identical expect messages ("ResourceMempool requires a mempool") are fine for internal code, but if this resource is ever missing at runtime, the error might not give enough context about where it failed.

Not a blocker by any means - just something to keep in mind if debugging becomes a hassle later. Like trying to find that one missing collectible in an open world game without a guide.

crates/amaru-protocols/src/tx_submission/outcome.rs (1)

106-117: PartialEq uses string comparison when it doesn't need to.

Hey mate, ProtocolError already derives PartialEq, so the format!("{}", e1) == format!("{}", e2) at line 110 is doing extra work for nothing. It allocates two strings and compares them when you could just do e1 == e2 directly.

Not a major drama, but it's like taking the scenic route when there's a highway available.

🔎 Proposed fix for efficient PartialEq
 impl PartialEq for Outcome {
     fn eq(&self, other: &Self) -> bool {
         match (self, other) {
             (Outcome::Done, Outcome::Done) => true,
-            (Outcome::Error(e1), Outcome::Error(e2)) => format!("{}", e1) == format!("{}", e2),
+            (Outcome::Error(e1), Outcome::Error(e2)) => e1 == e2,
             (Outcome::Send(msg1), Outcome::Send(msg2)) => msg1 == msg2,
             _ => false,
         }
     }
 }
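
Beyond the extra allocations, comparing rendered strings can also disagree with structural equality whenever Display leaves out part of the value. The sketch below shows that with hypothetical `DemoError` and `DemoOutcome` types; whether the real ProtocolError's Display is lossy is not something this review establishes.

```rust
use std::fmt;

/// Hypothetical error type whose Display omits the payload.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DemoError {
    Timeout,
    UnexpectedMessage(String),
}

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DemoError::Timeout => write!(f, "timeout"),
            // The payload is deliberately left out of the rendered text.
            DemoError::UnexpectedMessage(_) => write!(f, "unexpected message"),
        }
    }
}

/// With a comparable error type, equality can simply be derived.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DemoOutcome {
    Done,
    Error(DemoError),
}

/// String-based comparison, shown only for contrast.
pub fn eq_by_display(a: &DemoError, b: &DemoError) -> bool {
    a.to_string() == b.to_string()
}
```

In this sketch two `UnexpectedMessage` values with different payloads compare equal by display but not by derived `PartialEq`.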
crates/amaru-protocols/src/tx_submission/tests/nodes_options.rs (1)

73-78: Hardcoded capacity might trip you up like a surprise quick-time event.

The capacity 4 is hardcoded here, which won't respect any max_window you've set via with_max_window(). For test code this is probably grand, but if you ever need consistency, consider deriving from self.responder_params.max_window.

🔎 Optional: derive capacity from params
     pub fn with_responder_tx_validator(
         self,
         tx_validator: Arc<dyn CanValidateTransactions<Tx>>,
     ) -> Self {
-        self.with_responder_mempool(Arc::new(SizedMempool::with_tx_validator(4, tx_validator)))
+        self.with_responder_mempool(Arc::new(SizedMempool::with_tx_validator(
+            self.responder_params.max_window.into(),
+            tx_validator,
+        )))
     }
crates/amaru-protocols/src/connection.rs (1)

107-145: Initialization flow looks solid, but those magic numbers are sneaky.

The buffer size 5760 appears twice here. Might be worth extracting to a named constant – like defining your keybindings in one place rather than hardcoding them throughout the game. Not a blocker, just a nice-to-have for maintainability.

crates/amaru-protocols/src/keepalive/mod.rs (1)

46-88: Initiator logic is sound with proper cookie validation.

Cookie mismatch terminates the connection – good security hygiene, like verifying save file checksums. The TODO at line 75 about tracking timings is noted.

Want me to open an issue to track implementing the timing metrics mentioned in the TODO comment?

crates/amaru-protocols/src/tx_submission/tests/nodes.rs (1)

63-70: Consider handling the unwrap() more gracefully.

While this is test code (so panicking on failure is often acceptable), the unwrap() on line 68 will give a less informative error if mempool insertion fails. A wee bit of context in the panic message could save debugging time later.

🔎 Optional: Add context to the unwrap
     pub fn insert_client_transactions(&self, txs: &[Tx], origin: TxOrigin) {
         for tx in txs.iter() {
             self.initiator_mempool
                 .insert(tx.clone(), origin.clone())
-                .unwrap();
+                .expect("failed to insert transaction into initiator mempool");
         }
     }
crates/amaru-protocols/src/tx_submission/initiator_state.rs (2)

29-29: Consider making MAX_REQUESTED_TX_IDS configurable or documented.

The constant is set to 10, but there's no documentation explaining why this value was chosen or whether it should match any protocol specification. Just a thought for future maintainers, like leaving a note in your save file about why you made certain choices.


171-185: Redundant generic type parameter Tx shadows the imported type.

The function signature fn get_next_tx_ids<Tx: Send + Debug + Sync + 'static> introduces a new generic Tx that shadows the imported amaru_kernel::Tx. This works because TxSubmissionMempool<Tx> uses the same generic, but it's confusing and potentially error-prone. The imported Tx already satisfies these bounds.

🔎 Remove the redundant generic parameter
-    fn get_next_tx_ids<Tx: Send + Debug + Sync + 'static>(
+    fn get_next_tx_ids(
         &mut self,
         mempool: &dyn TxSubmissionMempool<Tx>,
         required_next: u16,
     ) -> anyhow::Result<Vec<(TxId, u32)>> {
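
A tiny sketch of why the shadowing is confusing. `Tx`, `Pool`, `VecPool` and both functions are hypothetical stand-ins, not the project's types: inside the generic function, `Tx` is just a type parameter, so nothing specific to the concrete struct is available.

```rust
/// Hypothetical concrete transaction type standing in for the imported `Tx`.
#[derive(Debug, Clone, PartialEq)]
pub struct Tx(pub u32);

/// Hypothetical pool trait, generic over the element type.
pub trait Pool<T> {
    fn first(&self) -> Option<T>;
}

pub struct VecPool<T>(pub Vec<T>);

impl<T: Clone> Pool<T> for VecPool<T> {
    fn first(&self) -> Option<T> {
        self.0.first().cloned()
    }
}

/// Here `Tx` is a fresh type parameter that shadows the struct above,
/// so the body cannot use any field or method of the concrete `Tx`.
pub fn first_generic<Tx: Clone>(pool: &dyn Pool<Tx>) -> Option<Tx> {
    pool.first()
}

/// Without the parameter, `Tx` is the concrete struct again.
pub fn first_id(pool: &dyn Pool<Tx>) -> Option<u32> {
    pool.first().map(|tx| tx.0)
}
```

Note that `first_generic::<String>` type-checks just as well as `first_generic::<Tx>`, which is usually not what a reader expects from a parameter called `Tx`.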
crates/amaru-network/src/connection.rs (1)

75-97: Consider extracting shared connection logic to reduce duplication.

Lines 83-95 are essentially a copy-paste of lines 59-71 from connect(). It's not a big deal for now, but if you fancy a wee refactor later, extracting the common "create connection and insert" logic into a helper would keep things DRY.

🔎 Possible refactor
+    fn do_connect(
+        resource: Arc<Mutex<Connections>>,
+        read_buf_size: usize,
+        addr: &[SocketAddr],
+    ) -> BoxFuture<'static, std::io::Result<ConnectionId>> {
+        let addr = addr.to_vec();
+        Box::pin(async move {
+            let (reader, writer) = TcpStream::connect(&*addr).await?.into_split();
+            let id = ConnectionId::new();
+            resource.lock().insert(
+                id,
+                Connection {
+                    reader: Arc::new(AsyncMutex::new((
+                        reader,
+                        BytesMut::with_capacity(read_buf_size),
+                    ))),
+                    writer: Arc::new(AsyncMutex::new(writer)),
+                },
+            );
+            Ok(id)
+        })
+    }
crates/amaru-protocols/src/tx_submission/stage.rs (2)

178-181: Consider documenting or making configurable the hardcoded ResponderParams values.

ResponderParams::new(2, 3) on line 180 uses magic numbers. What do 2 and 3 represent? Would be grand to either:

  • Add a comment explaining these values
  • Define named constants
  • Make them configurable

Same goes for max_buffer: 5760 on line 209 - that's a suspiciously specific number, like a secret code in a puzzle game.

🔎 Suggested improvement
+// TODO: Document these protocol parameters
+// First param: [meaning], Second param: [meaning]
+const DEFAULT_TX_IDS_BLOCKING: u16 = 2;
+const DEFAULT_TX_IDS_NON_BLOCKING: u16 = 3;
+const TX_SUBMISSION_MAX_BUFFER: usize = 5760;
+
 // In register_tx_submission:
-                    TxSubmissionResponderState::new(ResponderParams::new(2, 3), origin),
+                    TxSubmissionResponderState::new(
+                        ResponderParams::new(DEFAULT_TX_IDS_BLOCKING, DEFAULT_TX_IDS_NON_BLOCKING),
+                        origin,
+                    ),

246-253: LGTM with noted FIXME.

The expects in to_bytes are justified - encoding a valid Message to CBOR shouldn't fail, and a non-empty message gives non-empty bytes. The #[expect] attribute documents the intentionality.

The FIXME on NETWORK_SEND_TIMEOUT is noted - 1 second might be too aggressive or too lenient depending on network conditions. Worth revisiting when you've got real-world metrics, like tuning game difficulty after playtesting.

Would you like me to open an issue to track the NETWORK_SEND_TIMEOUT tuning?

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac29b0f and 208ac7b.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (47)
  • Cargo.toml (4 hunks)
  • crates/amaru-consensus/Cargo.toml (1 hunks)
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs (12 hunks)
  • crates/amaru-kernel/Cargo.toml (3 hunks)
  • crates/amaru-kernel/src/bytes.rs (2 hunks)
  • crates/amaru-kernel/src/lib.rs (4 hunks)
  • crates/amaru-network/src/connection.rs (5 hunks)
  • crates/amaru-network/src/lib.rs (1 hunks)
  • crates/amaru-network/src/socket_addr.rs (1 hunks)
  • crates/amaru-ouroboros-traits/src/connection.rs (1 hunks)
  • crates/amaru-ouroboros-traits/src/lib.rs (1 hunks)
  • crates/amaru-ouroboros-traits/src/mempool.rs (1 hunks)
  • crates/amaru-ouroboros/Cargo.toml (1 hunks)
  • crates/amaru-protocols/Cargo.toml (1 hunks)
  • crates/amaru-protocols/src/chainsync/initiator.rs (1 hunks)
  • crates/amaru-protocols/src/chainsync/messages.rs (1 hunks)
  • crates/amaru-protocols/src/chainsync/mod.rs (1 hunks)
  • crates/amaru-protocols/src/chainsync/responder.rs (1 hunks)
  • crates/amaru-protocols/src/connection.rs (1 hunks)
  • crates/amaru-protocols/src/effects.rs (3 hunks)
  • crates/amaru-protocols/src/handshake/mod.rs (6 hunks)
  • crates/amaru-protocols/src/handshake/tests.rs (7 hunks)
  • crates/amaru-protocols/src/keepalive/messages.rs (1 hunks)
  • crates/amaru-protocols/src/keepalive/mod.rs (1 hunks)
  • crates/amaru-protocols/src/keepalive/tests.rs (1 hunks)
  • crates/amaru-protocols/src/lib.rs (1 hunks)
  • crates/amaru-protocols/src/mempool_effects.rs (1 hunks)
  • crates/amaru-protocols/src/mux.rs (9 hunks)
  • crates/amaru-protocols/src/protocol.rs (1 hunks)
  • crates/amaru-protocols/src/store_effects.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/initiator_state.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/messages.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/mod.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/outcome.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/responder_params.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/responder_state.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/stage.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/tests/assertions.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/tests/faulty_tx_validator.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/tests/mod.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/tests/nodes.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/tests/nodes_options.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/tests/sized_mempool.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/tests/system_test.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/tests/test_cases.rs (1 hunks)
  • crates/amaru-protocols/src/tx_submission/tests/test_data.rs (1 hunks)
  • crates/amaru/src/stages/mod.rs (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/amaru-kernel/src/lib.rs
  • crates/amaru-consensus/Cargo.toml
🧰 Additional context used
🧠 Learnings (27)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.
📚 Learning: 2025-09-29T20:08:29.906Z
Learnt from: yHSJ
Repo: pragma-org/amaru PR: 453
File: crates/amaru-metrics/src/ledger.rs:60-104
Timestamp: 2025-09-29T20:08:29.906Z
Learning: In amaru-ledger/src/state.rs, the txs_processed field represents the number of transactions in the current block (block.transaction_bodies.len() as u64), not a cumulative total. Therefore, using a Counter with .add() in the metrics is correct for accumulating per-block transaction counts.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/tests/faulty_tx_validator.rs
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
📚 Learning: 2025-12-16T21:32:37.668Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/amaru-network/src/handshake/tests.rs:40-47
Timestamp: 2025-12-16T21:32:37.668Z
Learning: In Rust, shadowing a binding with a new let does not drop the previous binding until the end of the scope. All shadowed bindings in a scope are dropped in reverse-declaration order when the scope ends. Therefore, multiple let _guard = register_*() calls will keep all guards alive until the end of the function (or the surrounding scope). When reviewing code, be mindful that resources tied to shadowed bindings persist longer than the most recent binding; to release early, constrain the lifetime in an inner block or explicitly drop guards when appropriate.
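
The drop-order behaviour described in this learning can be observed with a small sketch. `Guard`, `register` and the two `run_*` functions are hypothetical names; the guard simply records its own drop in a shared log.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Guard that records its own drop in a shared log.
pub struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

pub fn register(name: &'static str, log: &Rc<RefCell<Vec<String>>>) -> Guard {
    Guard { name, log: Rc::clone(log) }
}

/// Shadowing keeps both guards alive until the end of the inner scope.
pub fn run_shadowed() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _guard = register("first", &log);
        let _guard = register("second", &log); // shadows, does not drop "first"
        log.borrow_mut().push("body".to_string());
    } // both guards drop here, in reverse declaration order
    let events = log.borrow().clone();
    events
}

/// An explicit `drop` releases the guard before the rest of the body runs.
pub fn run_explicit_drop() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let first = register("first", &log);
    drop(first);
    log.borrow_mut().push("body".to_string());
    let events = log.borrow().clone();
    events
}
```

`run_shadowed()` yields `body`, `drop second`, `drop first`, while `run_explicit_drop()` yields `drop first`, `body`.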

Applied to files:

  • crates/amaru-protocols/src/tx_submission/tests/faulty_tx_validator.rs
  • crates/amaru-ouroboros-traits/src/lib.rs
  • crates/amaru-protocols/src/store_effects.rs
  • crates/amaru-protocols/src/chainsync/messages.rs
  • crates/amaru-protocols/src/chainsync/mod.rs
  • crates/amaru-network/src/lib.rs
  • crates/amaru-protocols/src/effects.rs
  • crates/amaru-protocols/src/lib.rs
  • crates/amaru-protocols/src/tx_submission/tests/system_test.rs
  • crates/amaru-protocols/src/tx_submission/responder_state.rs
  • crates/amaru-protocols/src/tx_submission/tests/assertions.rs
  • crates/amaru-protocols/src/mempool_effects.rs
  • crates/amaru-protocols/src/tx_submission/tests/nodes_options.rs
  • crates/amaru-protocols/src/keepalive/messages.rs
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
  • crates/amaru-protocols/src/tx_submission/initiator_state.rs
  • crates/amaru-protocols/src/tx_submission/tests/test_data.rs
  • crates/amaru-protocols/src/tx_submission/messages.rs
  • crates/amaru-network/src/socket_addr.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru-kernel/src/bytes.rs
  • crates/amaru/src/stages/mod.rs
  • crates/amaru-protocols/src/chainsync/initiator.rs
  • crates/amaru-protocols/src/tx_submission/outcome.rs
  • crates/amaru-protocols/src/tx_submission/tests/mod.rs
  • crates/amaru-protocols/src/tx_submission/tests/sized_mempool.rs
  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/amaru-ouroboros-traits/src/mempool.rs
  • crates/amaru-protocols/src/tx_submission/responder_params.rs
  • crates/amaru-protocols/src/keepalive/tests.rs
  • crates/amaru-protocols/src/tx_submission/tests/nodes.rs
  • crates/amaru-protocols/src/handshake/mod.rs
  • crates/amaru-protocols/src/mux.rs
  • crates/amaru-ouroboros-traits/src/connection.rs
  • crates/amaru-protocols/src/tx_submission/tests/test_cases.rs
  • crates/amaru-protocols/src/connection.rs
  • crates/amaru-network/src/connection.rs
  • crates/amaru-protocols/src/protocol.rs
  • crates/amaru-protocols/src/keepalive/mod.rs
  • crates/amaru-protocols/src/tx_submission/stage.rs
  • crates/amaru-protocols/src/handshake/tests.rs
📚 Learning: 2025-08-20T13:02:25.763Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.

Applied to files:

  • crates/amaru-ouroboros/Cargo.toml
  • crates/amaru-protocols/src/store_effects.rs
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
  • crates/amaru-protocols/src/tx_submission/tests/mod.rs
  • Cargo.toml
  • crates/amaru-protocols/src/tx_submission/tests/test_cases.rs
  • crates/amaru-protocols/Cargo.toml
  • crates/amaru-kernel/Cargo.toml
  • crates/amaru-protocols/src/handshake/tests.rs
📚 Learning: 2025-08-12T12:28:24.027Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 372
File: simulation/amaru-sim/src/simulator/mod.rs:410-412
Timestamp: 2025-08-12T12:28:24.027Z
Learning: In the Amaru project, panic statements are acceptable in simulation/test code (like amaru-sim crate) as they help identify configuration issues quickly during development, rather than needing proper error handling like production code.

Applied to files:

  • crates/amaru-ouroboros/Cargo.toml
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
  • crates/amaru-protocols/src/tx_submission/tests/mod.rs
  • Cargo.toml
  • crates/amaru-protocols/src/keepalive/tests.rs
  • crates/amaru-protocols/Cargo.toml
📚 Learning: 2025-05-21T18:58:48.631Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.

Applied to files:

  • crates/amaru-protocols/src/store_effects.rs
  • crates/amaru-protocols/src/tx_submission/tests/assertions.rs
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
📚 Learning: 2025-08-18T08:10:32.640Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:427-433
Timestamp: 2025-08-18T08:10:32.640Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp, so unimplemented methods like set_proposals_roots and set_constitution are intentionally left as placeholders until the revamp is complete.

Applied to files:

  • crates/amaru-protocols/src/store_effects.rs
📚 Learning: 2025-08-18T08:10:35.849Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:431-433
Timestamp: 2025-08-18T08:10:35.849Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp soon, so unimplemented methods like set_constitution, set_proposals_roots are intentionally left aside until the revamp is complete.

Applied to files:

  • crates/amaru-protocols/src/store_effects.rs
📚 Learning: 2025-08-08T14:43:20.218Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.

Applied to files:

  • crates/amaru-protocols/src/store_effects.rs
  • crates/amaru-kernel/src/bytes.rs
📚 Learning: 2025-08-18T08:11:20.028Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:109-118
Timestamp: 2025-08-18T08:11:20.028Z
Learning: The proposals_roots() method in the MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is intentionally left returning all None values rather than reading from stored state, as it's planned for the upcoming major MemoryStore revamp.

Applied to files:

  • crates/amaru-protocols/src/store_effects.rs
📚 Learning: 2025-08-08T14:35:35.562Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/transaction_pointer.rs:36-44
Timestamp: 2025-08-08T14:35:35.562Z
Learning: In the amaru project, when decoding CBOR arrays, prefer using minicbor_extra::heterogenous_array with the expected length to validate definite-length arrays and correctly handle indefinite-length arrays. Example: crates/amaru-kernel/src/transaction_pointer.rs Decode should use heterogenous_array(d, 2, …) instead of ignoring the length from d.array().

Applied to files:

  • crates/amaru-protocols/src/chainsync/messages.rs
  • crates/amaru-kernel/src/bytes.rs
📚 Learning: 2025-09-06T09:16:25.025Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 435
File: crates/amaru/src/bin/amaru/cmd/convert_ledger_state.rs:113-116
Timestamp: 2025-09-06T09:16:25.025Z
Learning: In cardano-node serialized ledger state CBOR encoding, indefinite-length structures may be terminated with 0xFF "break" markers. The current code in convert_ledger_state.rs unconditionally strips the last byte (bytes[p..bytes.len() - 1]), which could corrupt data if the trailing byte is not actually a CBOR break marker.

Applied to files:

  • crates/amaru-protocols/src/chainsync/messages.rs
  • crates/amaru-kernel/src/bytes.rs
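
A minimal sketch of the safer variant hinted at in the break-marker learning: drop the final byte only when it is `0xFF`. The function name and signature are hypothetical. A trailing `0xFF` can still be ordinary data, so this narrows the risk rather than removing it; a real decoder would be the robust option.

```rust
/// CBOR "break" stop code that terminates indefinite-length items.
const CBOR_BREAK: u8 = 0xFF;

/// Returns the bytes from `start` onward, without the final byte if (and only if)
/// that byte is 0xFF. Returns `None` when `start` is past the end of the input.
pub fn payload_without_break(bytes: &[u8], start: usize) -> Option<&[u8]> {
    let tail = bytes.get(start..)?;
    match tail.split_last() {
        // Last byte looks like a break marker: strip it.
        Some((&CBOR_BREAK, rest)) => Some(rest),
        // Anything else (including an empty tail) is returned untouched.
        _ => Some(tail),
    }
}
```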
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • crates/amaru-protocols/src/chainsync/messages.rs
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
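
To illustrate the "separate result types" idea from this learning, here is a small sketch. `Chain`, `ForwardResult` and `RollbackResult` are hypothetical names, and the chain is reduced to a list of slots; the point is only that a rollback can never return a roll-forward outcome, and vice versa.

```rust
/// Outcomes that only make sense for a roll-forward.
#[derive(Debug, PartialEq)]
pub enum ForwardResult {
    NewTip(u64),
    NoChange,
}

/// Outcomes that only make sense for a rollback.
#[derive(Debug, PartialEq)]
pub enum RollbackResult {
    RolledBackTo(u64),
    PointNotFound,
}

/// Hypothetical chain, reduced to an increasing list of slots.
pub struct Chain {
    slots: Vec<u64>,
}

impl Chain {
    pub fn new() -> Self {
        Chain { slots: Vec::new() }
    }

    pub fn roll_forward(&mut self, slot: u64) -> ForwardResult {
        let tip = self.slots.last().copied();
        match tip {
            // A slot at or before the current tip does not extend the chain.
            Some(t) if slot <= t => ForwardResult::NoChange,
            _ => {
                self.slots.push(slot);
                ForwardResult::NewTip(slot)
            }
        }
    }

    pub fn rollback(&mut self, slot: u64) -> RollbackResult {
        match self.slots.iter().position(|&s| s == slot) {
            Some(index) => {
                // Keep everything up to and including the rollback point.
                self.slots.truncate(index + 1);
                RollbackResult::RolledBackTo(slot)
            }
            None => RollbackResult::PointNotFound,
        }
    }
}
```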
📚 Learning: 2025-08-08T14:34:06.105Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/minicbor-extra/src/lib.rs:50-55
Timestamp: 2025-08-08T14:34:06.105Z
Learning: Project uses Rust 1.88 stable; using Result::is_err_and is acceptable and should not be flagged as unstable. In particular, crates/minicbor-extra/src/lib.rs relies on is_err_and when checking end_of_input; future reviews should not suggest replacing it for stability reasons.

Applied to files:

  • crates/amaru-protocols/src/chainsync/messages.rs
  • crates/amaru-protocols/src/mempool_effects.rs
  • crates/amaru-kernel/src/bytes.rs
  • Cargo.toml
  • crates/amaru-ouroboros-traits/src/mempool.rs
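
For readers unfamiliar with `Result::is_err_and`, a minimal sketch of an end-of-input check follows. `DecodeError` and `ran_out_of_input` are hypothetical and are not the minicbor-extra API; only `is_err_and` itself is the standard-library method the learning refers to.

```rust
/// Hypothetical decoder error with an end-of-input case.
#[derive(Debug)]
pub enum DecodeError {
    EndOfInput,
    Malformed,
}

impl DecodeError {
    pub fn is_end_of_input(&self) -> bool {
        matches!(self, DecodeError::EndOfInput)
    }
}

/// True only when the result is an error and that error is end-of-input.
pub fn ran_out_of_input<T>(result: Result<T, DecodeError>) -> bool {
    result.is_err_and(|e| e.is_end_of_input())
}
```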
📚 Learning: 2025-08-08T14:46:53.013Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/pool_params.rs:107-116
Timestamp: 2025-08-08T14:46:53.013Z
Learning: In crates/amaru-kernel/src/pool_params.rs, when serializing Relay::SingleHostAddr IPv6 to text, the project intentionally reverses each 4-byte chunk before constructing std::net::Ipv6Addr. This matches cardano-ledger’s IPv6 representation (four little-endian Word32 chunks). Do not “simplify” by passing the raw 16 bytes directly to Ipv6Addr::from; that would break ledger compatibility.

Applied to files:

  • crates/amaru-protocols/src/chainsync/messages.rs
  • crates/amaru-protocols/src/effects.rs
  • crates/amaru-network/src/socket_addr.rs
  • crates/amaru-network/src/connection.rs
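
The chunk reversal described in the IPv6 learning can be sketched with the standard library alone. `ledger_ipv6` is a hypothetical helper name, not the function in `pool_params.rs`.

```rust
use std::net::Ipv6Addr;

/// Reverses each 4-byte chunk of the raw address before building an `Ipv6Addr`,
/// mirroring a representation made of four little-endian 32-bit words.
pub fn ledger_ipv6(raw: [u8; 16]) -> Ipv6Addr {
    let mut bytes = raw;
    for chunk in bytes.chunks_exact_mut(4) {
        chunk.reverse();
    }
    Ipv6Addr::from(bytes)
}
```

Passing the raw bytes straight to `Ipv6Addr::from` would skip the per-chunk reversal and yield a different address for any input whose chunks are not palindromes.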
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru-protocols/src/effects.rs
  • crates/amaru-protocols/src/chainsync/initiator.rs
  • crates/amaru-protocols/src/tx_submission/outcome.rs
  • crates/amaru-protocols/src/handshake/mod.rs
  • crates/amaru-protocols/src/mux.rs
  • crates/amaru-network/src/connection.rs
  • crates/amaru-protocols/src/protocol.rs
  • crates/amaru-protocols/src/keepalive/mod.rs
  • crates/amaru-protocols/src/tx_submission/stage.rs
  • crates/amaru-protocols/src/handshake/tests.rs
📚 Learning: 2025-09-29T16:38:59.323Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:216-233
Timestamp: 2025-09-29T16:38:59.323Z
Learning: In crates/amaru-network/src/mux.rs, the segment length field uses u16 type which naturally limits values to 65535, matching MAX_SEGMENT_SIZE constant exactly. This provides type-level safety against oversized allocations without needing runtime bounds checking.

Applied to files:

  • crates/amaru-protocols/src/effects.rs
  • crates/amaru-protocols/src/mux.rs
  • crates/amaru-protocols/src/handshake/tests.rs
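
A tiny sketch of the type-level bound from the segment-length learning: any value decoded from a two-byte length field fits within 65535 by construction. `segment_len` is a hypothetical helper, and reading the field as big-endian is an assumption here rather than something taken from `mux.rs`.

```rust
/// Largest payload a segment can announce, as stated in the learning above.
pub const MAX_SEGMENT_SIZE: usize = 65535;

/// Decodes a two-byte length field (assumed big-endian) into a `usize`.
/// The result can never exceed `MAX_SEGMENT_SIZE` because a `u16` cannot.
pub fn segment_len(len_field: [u8; 2]) -> usize {
    usize::from(u16::from_be_bytes(len_field))
}
```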
📚 Learning: 2025-05-12T14:21:27.470Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/tests/system_test.rs
  • crates/amaru-protocols/src/tx_submission/tests/nodes_options.rs
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
  • crates/amaru-protocols/src/keepalive/tests.rs
  • crates/amaru-protocols/src/tx_submission/tests/nodes.rs
  • crates/amaru-protocols/src/tx_submission/tests/test_cases.rs
  • crates/amaru-protocols/src/handshake/tests.rs
📚 Learning: 2025-09-05T17:30:55.869Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Applied to files:

  • crates/amaru-protocols/src/mempool_effects.rs
  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
  • crates/amaru-protocols/src/tx_submission/initiator_state.rs
  • crates/amaru-protocols/src/chainsync/initiator.rs
  • crates/amaru-ouroboros-traits/src/mempool.rs
  • crates/amaru-network/src/connection.rs
📚 Learning: 2025-08-08T14:39:50.527Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/borrowed_datum.rs:32-39
Timestamp: 2025-08-08T14:39:50.527Z
Learning: In the amaru project, when converting BorrowedDatumOption::Data to an owned DatumOption in crates/amaru-kernel/src/borrowed_datum.rs, the call `.unwrap()` refers to pallas’s KeepRaw::unwrap, which is infallible (always returns the inner value) and is not a panic risk. Future reviews should not flag this unwrap as dangerous.

Applied to files:

  • crates/amaru-protocols/src/mempool_effects.rs
  • crates/amaru-ouroboros-traits/src/mempool.rs
📚 Learning: 2025-04-20T17:57:23.233Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Applied to files:

  • crates/amaru-consensus/src/consensus/effects/consensus_effects.rs
  • crates/amaru-protocols/src/tx_submission/tests/mod.rs
📚 Learning: 2025-04-20T17:56:39.223Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:0-0
Timestamp: 2025-04-20T17:56:39.223Z
Learning: For mpsc::channel in Tokio-based test code, use buffer sizes larger than 1 (e.g., 8) to avoid potential deadlocks when producers send multiple messages before consumers can process them.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/initiator_state.rs
  • crates/amaru-protocols/src/tx_submission/messages.rs
  • crates/amaru-protocols/src/handshake/tests.rs
📚 Learning: 2025-06-03T06:31:57.736Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 236
File: simulation/amaru-sim/src/simulator/generate.rs:141-145
Timestamp: 2025-06-03T06:31:57.736Z
Learning: In the amaru project, the team prefers to use as_bytes() instead of hex::decode() for converting hash and header strings to bytes in simulation/amaru-sim/src/simulator/generate.rs, even though they appear to be hex-encoded strings.

Applied to files:

  • crates/amaru-kernel/src/bytes.rs
  • crates/amaru-kernel/Cargo.toml
📚 Learning: 2025-06-14T16:36:04.502Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 263
File: crates/pure-stage/src/simulation/state.rs:33-36
Timestamp: 2025-06-14T16:36:04.502Z
Learning: In simulation and replay systems that require cloneable and serializable states, error types must often be converted to String rather than stored as trait objects (like Box<dyn Error> or anyhow::Error) because trait objects cannot be cloned, which breaks the snapshotting and replay functionality needed for deterministic simulation.

Applied to files:

  • crates/amaru-kernel/src/bytes.rs
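
As a rough illustration of the snapshotting learning, the sketch below stores a failure as a `String` so the state can derive `Clone` and be compared between runs. `StageState` and `fail` are hypothetical names and not pure-stage types.

```rust
use std::fmt::Display;

/// Hypothetical snapshot-friendly state: every field is cloneable and comparable.
#[derive(Clone, Debug, PartialEq)]
pub enum StageState {
    Running,
    Failed(String),
}

/// Converts any displayable error into a cloneable failure state.
/// A `Box<dyn Error>` field would prevent `#[derive(Clone)]` on `StageState`.
pub fn fail(err: impl Display) -> StageState {
    StageState::Failed(err.to_string())
}
```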
📚 Learning: 2025-04-20T17:56:48.565Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.

Applied to files:

  • crates/amaru/src/stages/mod.rs
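
The Origin-versus-Specific check from the bootstrapping learning might be sketched as follows. `Tip` here is a hypothetical two-variant enum and `lookup` stands in for a chain-store read; neither mirrors the real Amaru signatures.

```rust
/// Hypothetical tip: either the origin of the chain or a specific slot.
#[derive(Clone, Debug, PartialEq)]
pub enum Tip {
    Origin,
    Specific(u64),
}

/// Loads the header for the tip, treating a fresh node (Origin) as "no header yet"
/// instead of querying the store and failing.
pub fn load_tip_header(
    tip: &Tip,
    lookup: impl Fn(u64) -> Option<String>,
) -> Result<Option<String>, String> {
    match tip {
        Tip::Origin => Ok(None),
        Tip::Specific(slot) => lookup(*slot)
            .map(Some)
            .ok_or_else(|| format!("no header stored for slot {}", slot)),
    }
}
```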
📚 Learning: 2025-09-29T16:44:14.807Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/protocol.rs:94-106
Timestamp: 2025-09-29T16:44:14.807Z
Learning: In the amaru-network crate protocol.rs file, the correct Cardano mini-protocol ID assignments are: PROTO_N2N_KEEP_ALIVE = 8 and PROTO_N2N_PEER_SHARE = 10, as verified against the network specification by the maintainer.

Applied to files:

  • Cargo.toml
  • crates/amaru-protocols/src/keepalive/tests.rs
  • crates/amaru-protocols/src/mux.rs
  • crates/amaru-protocols/src/keepalive/mod.rs
  • crates/amaru-protocols/src/handshake/tests.rs
📚 Learning: 2025-01-21T15:32:17.911Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 69
File: crates/amaru/src/ledger/state/diff_epoch_reg.rs:112-117
Timestamp: 2025-01-21T15:32:17.911Z
Learning: When suggesting code changes in Rust, always verify that the types align correctly, especially when dealing with references and Options. The `Fold::Registered` variant in `diff_epoch_reg.rs` expects a reference `&'a V`, so unwrapping an `Option<&V>` requires only a single `.expect()`.

Applied to files:

  • crates/amaru-ouroboros-traits/src/mempool.rs
🧬 Code graph analysis (26)
crates/amaru-protocols/src/tx_submission/tests/faulty_tx_validator.rs (3)
crates/amaru-ouroboros-traits/src/connection.rs (1)
  • new (36-39)
crates/amaru-ouroboros-traits/src/mempool.rs (1)
  • new (186-188)
crates/amaru-protocols/src/effects.rs (1)
  • new (38-40)
crates/amaru-ouroboros-traits/src/lib.rs (1)
crates/amaru-consensus/src/consensus/effects/consensus_effects.rs (4)
  • mempool (33-33)
  • mempool (58-60)
  • mempool (84-86)
  • mempool (149-151)
crates/amaru-protocols/src/store_effects.rs (4)
crates/pure-stage/src/stage_ref.rs (1)
  • sync (163-163)
crates/amaru-protocols/src/effects.rs (5)
  • new (38-40)
  • run (75-87)
  • run (101-113)
  • run (127-139)
  • run (152-164)
crates/amaru-protocols/src/mempool_effects.rs (11)
  • new (41-43)
  • new (111-113)
  • new (161-163)
  • external_sync (46-54)
  • run (118-125)
  • run (139-146)
  • run (168-175)
  • run (201-208)
  • run (230-238)
  • run (260-267)
  • run (281-288)
crates/pure-stage/src/effect.rs (1)
  • wrap_sync (409-416)
crates/amaru-protocols/src/chainsync/messages.rs (2)
crates/amaru-protocols/src/keepalive/messages.rs (4)
  • decode (60-62)
  • decode (97-120)
  • encode (49-56)
  • encode (73-93)
crates/amaru-protocols/src/handshake/messages.rs (4)
  • decode (113-126)
  • decode (175-198)
  • encode (92-106)
  • encode (144-167)
crates/amaru-protocols/src/chainsync/mod.rs (2)
crates/amaru-protocols/src/chainsync/initiator.rs (1)
  • initiator (70-127)
crates/amaru-protocols/src/chainsync/responder.rs (1)
  • responder (65-153)
crates/amaru-protocols/src/effects.rs (1)
crates/amaru-network/src/socket_addr.rs (1)
  • resolve (19-33)
crates/amaru-protocols/src/lib.rs (1)
crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (1)
  • tx_submission (356-365)
crates/amaru-protocols/src/tx_submission/tests/system_test.rs (3)
crates/amaru-protocols/src/effects.rs (2)
  • create_connection (174-187)
  • new (38-40)
crates/amaru-network/src/connection.rs (1)
  • new (46-51)
crates/amaru-protocols/src/connection.rs (1)
  • new (38-48)
crates/amaru-protocols/src/tx_submission/tests/assertions.rs (1)
crates/amaru-protocols/src/tx_submission/responder_state.rs (2)
  • txs (134-134)
  • tx_ids (135-139)
crates/amaru-protocols/src/mempool_effects.rs (2)
crates/amaru-ouroboros-traits/src/mempool.rs (7)
  • new (186-188)
  • insert (55-55)
  • get_tx (73-73)
  • tx_ids_since (81-81)
  • wait_for_at_least (91-94)
  • get_txs_for_ids (97-97)
  • last_seq_no (100-100)
crates/pure-stage/src/effect.rs (2)
  • wrap_sync (409-416)
  • wrap (396-406)
crates/amaru-protocols/src/tx_submission/tests/nodes_options.rs (2)
crates/amaru-protocols/src/tx_submission/responder_params.rs (1)
  • default (30-35)
crates/amaru-protocols/src/tx_submission/tests/sized_mempool.rs (2)
  • with_tx_validator (49-57)
  • with_capacity (45-47)
crates/amaru-consensus/src/consensus/effects/consensus_effects.rs (4)
crates/amaru-kernel/src/protocol_messages/tip.rs (1)
  • point (30-32)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (1)
  • point (87-89)
crates/amaru-protocols/src/tx_submission/tests/nodes_options.rs (1)
  • default (31-40)
crates/amaru-mempool/src/strategies/in_memory_mempool.rs (2)
  • default (43-45)
  • default (81-88)
crates/amaru-protocols/src/tx_submission/tests/test_data.rs (2)
crates/amaru-kernel/src/lib.rs (7)
  • from (281-283)
  • from (304-306)
  • from (342-344)
  • from (510-515)
  • from (519-526)
  • from (578-586)
  • new (321-323)
crates/amaru-ouroboros-traits/src/mempool.rs (1)
  • new (186-188)
crates/amaru-protocols/src/tx_submission/messages.rs (1)
crates/amaru-protocols/src/tx_submission/responder_params.rs (2)
  • decode (69-75)
  • encode (55-65)
crates/amaru-protocols/src/chainsync/responder.rs (2)
crates/amaru-protocols/src/protocol.rs (4)
  • outcome (48-53)
  • decode (66-68)
  • encode (62-64)
  • result (40-45)
crates/amaru-protocols/src/chainsync/messages.rs (4)
  • decode (92-127)
  • decode (131-165)
  • encode (38-88)
  • encode (169-199)
crates/amaru-protocols/src/chainsync/initiator.rs (2)
crates/amaru-protocols/src/protocol.rs (4)
  • outcome (48-53)
  • decode (66-68)
  • encode (62-64)
  • result (40-45)
crates/amaru-protocols/src/chainsync/messages.rs (4)
  • decode (92-127)
  • decode (131-165)
  • encode (38-88)
  • encode (169-199)
crates/amaru-protocols/src/tx_submission/outcome.rs (1)
crates/amaru-protocols/src/tx_submission/messages.rs (1)
  • fmt (127-175)
crates/amaru-protocols/src/tx_submission/tests/sized_mempool.rs (1)
crates/amaru-ouroboros-traits/src/mempool.rs (8)
  • insert (55-55)
  • get_tx (73-73)
  • tx_ids_since (81-81)
  • wait_for_at_least (91-94)
  • get_txs_for_ids (97-97)
  • last_seq_no (100-100)
  • take (36-36)
  • acknowledge (41-44)
crates/amaru-protocols/src/tx_submission/mod.rs (1)
crates/amaru-protocols/src/protocol.rs (1)
  • outcome (48-53)
crates/amaru-protocols/src/tx_submission/responder_params.rs (1)
crates/amaru-protocols/src/tx_submission/tests/nodes_options.rs (1)
  • default (31-40)
crates/amaru-protocols/src/tx_submission/tests/nodes.rs (2)
crates/amaru-protocols/src/tx_submission/responder_state.rs (2)
  • txs (134-134)
  • step (59-91)
crates/amaru-protocols/src/tx_submission/initiator_state.rs (1)
  • step (55-72)
crates/amaru-protocols/src/handshake/mod.rs (2)
crates/amaru-protocols/src/mux.rs (2)
  • encode (52-54)
  • encode (331-348)
crates/amaru-protocols/src/handshake/messages.rs (2)
  • encode (92-106)
  • encode (144-167)
crates/amaru-protocols/src/mux.rs (5)
crates/amaru-protocols/src/tx_submission/stage.rs (2)
  • eff (148-148)
  • eff (152-152)
crates/amaru-kernel/src/bytes.rs (2)
  • len (72-75)
  • new (38-44)
crates/pure-stage/src/tokio.rs (1)
  • trace_buffer (493-495)
crates/pure-stage/src/stage_ref.rs (1)
  • sync (163-163)
crates/amaru-network/src/connection.rs (1)
  • new (46-51)
crates/amaru-protocols/src/tx_submission/tests/test_cases.rs (5)
crates/amaru-protocols/src/tx_submission/tests/test_data.rs (5)
  • create_transactions (39-41)
  • reply_tx_ids (23-25)
  • reply_txs (27-29)
  • request_tx_ids (31-33)
  • request_txs (35-37)
crates/amaru-protocols/src/tx_submission/responder_state.rs (2)
  • request_tx_ids (160-192)
  • txs (134-134)
crates/amaru-protocols/src/tx_submission/initiator_state.rs (1)
  • request_txs (131-152)
crates/amaru-protocols/src/tx_submission/tests/assertions.rs (1)
  • assert_outcomes_eq (55-71)
crates/amaru-protocols/src/tx_submission/tests/nodes_options.rs (1)
  • default (31-40)
crates/amaru-network/src/connection.rs (1)
crates/amaru-network/src/socket_addr.rs (1)
  • resolve (19-33)
crates/amaru-protocols/src/keepalive/mod.rs (1)
crates/amaru-protocols/src/keepalive/messages.rs (5)
  • new (27-29)
  • decode (60-62)
  • decode (97-120)
  • encode (49-56)
  • encode (73-93)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.1.4)
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Build wasm32
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Benches
  • GitHub Check: Test coverage
  • GitHub Check: Analyze (rust)
  • GitHub Check: Test the CLI
  • GitHub Check: clippy

Signed-off-by: Roland Kuhn <[email protected]>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
crates/amaru-network/src/connection.rs (3)

74-96: Two hangpoints for the price of one – crikey!

This method's got a double whammy:

  1. Line 81: resolve(addr).await – DNS resolution without timeout can hang if the resolver's stuffed
  2. Line 82: TcpStream::connect – same timeout issue as the other connect method

That's like having two points of failure in your escape plan – not ideal when you're trying to build something production-ready.

🔎 Suggested fix with timeouts
 fn connect_addrs(
     &self,
     addr: ToSocketAddrs,
 ) -> BoxFuture<'static, std::io::Result<ConnectionId>> {
     let resource = self.connections.clone();
     let read_buf_size = self.read_buf_size;
     Box::pin(async move {
-        let addr = resolve(addr).await?;
-        let (reader, writer) = TcpStream::connect(&*addr).await?.into_split();
+        let addr = tokio::time::timeout(
+            std::time::Duration::from_secs(5),
+            resolve(addr)
+        )
+        .await
+        .map_err(|_| std::io::Error::new(std::io::ErrorKind::TimedOut, "resolve timeout"))??;
+        let stream = tokio::time::timeout(
+            std::time::Duration::from_secs(10),
+            TcpStream::connect(&*addr)
+        )
+        .await
+        .map_err(|_| std::io::Error::new(std::io::ErrorKind::TimedOut, "connection timeout"))??;
+        let (reader, writer) = stream.into_split();
         let id = ConnectionId::new();

98-116: No timeout on write_all – that's a potential blocker, mate!

Line 113's write_all(&data).await has no timeout, so if the receiver stops draining or the network goes pear-shaped, this'll hang indefinitely. That's like sending a letter and waiting forever for confirmation it was delivered.

For resilience, you'd want a bounded wait on write operations, especially in a distributed system where the other end might've carked it.

🔎 Suggested fix with write timeout
 fn send(
     &self,
     conn: ConnectionId,
     data: NonEmptyBytes,
 ) -> BoxFuture<'static, std::io::Result<()>> {
     let resource = self.connections.clone();
     Box::pin(async move {
         let connection = resource
             .lock()
             .get(&conn)
             .ok_or_else(|| {
                 std::io::Error::other(format!("connection {conn} not found for send"))
             })?
             .writer
             .clone();
-        connection.lock().await.write_all(&data).await?;
+        tokio::time::timeout(
+            std::time::Duration::from_secs(30),
+            connection.lock().await.write_all(&data)
+        )
+        .await
+        .map_err(|_| std::io::Error::new(std::io::ErrorKind::TimedOut, "write timeout"))??;
         Ok(())
     })
 }

118-147: Read loop without timeout – recipe for a deadlock, champion!

The recv logic has a fundamental problem: the while loop at lines 136-140 keeps calling read_buf with no timeout. If the remote end stops sending (or sends less than requested), you'll be stuck in limbo like waiting for the sequel that never comes.

That's a reliability gotcha that could lock up your connection handling. Each individual read needs a deadline.

🔎 Suggested fix with read timeout
 fn recv(
     &self,
     conn: ConnectionId,
     bytes: NonZeroUsize,
 ) -> BoxFuture<'static, std::io::Result<NonEmptyBytes>> {
     let resource = self.connections.clone();
     Box::pin(async move {
         let connection = resource
             .lock()
             .get(&conn)
             .ok_or_else(|| {
                 std::io::Error::other(format!("connection {conn} not found for recv"))
             })?
             .reader
             .clone();
         let mut guard = connection.lock().await;
         let (reader, buf) = &mut *guard;
         buf.reserve(bytes.get() - buf.remaining().min(bytes.get()));
         while buf.remaining() < bytes.get() {
-            if reader.read_buf(buf).await? == 0 {
+            let n = tokio::time::timeout(
+                std::time::Duration::from_secs(30),
+                reader.read_buf(buf)
+            )
+            .await
+            .map_err(|_| std::io::Error::new(std::io::ErrorKind::TimedOut, "read timeout"))??;
+            if n == 0 {
                 return Err(std::io::ErrorKind::UnexpectedEof.into());
             };
         }
♻️ Duplicate comments (1)
crates/amaru-protocols/src/chainsync/responder.rs (1)

190-198: Hardcoded Conway era variant is still lurking here, mate.

Same issue that's been spotted before – the variant: 6 assumes every header is Conway era. If your chain's got headers from earlier eras (Babbage, Alonzo, etc.), you'll be sending incorrect era tags to peers, which is gonna cause protocol violations.

The era info should be extracted from the header itself rather than hardcoded. This is a duplicate of the previous review comment.

🧹 Nitpick comments (2)
crates/amaru-network/src/connection.rs (2)

149-158: Shutdown without timeout – one last gotcha for the road!

Line 155's shutdown().await has no timeout. While shutdown is typically quick, on a broken or stalled connection it could hang. That's like trying to politely say goodbye to someone who's already left – you might wait forever.

Less critical than the other timeout issues, but for completeness you'd want a bounded wait here too, mate.

🔎 Suggested fix with shutdown timeout
 fn close(&self, conn: ConnectionId) -> BoxFuture<'static, std::io::Result<()>> {
     let resource = self.connections.clone();
     Box::pin(async move {
         let connection = resource.lock().remove(&conn).ok_or_else(|| {
             std::io::Error::other(format!("connection {conn} not found for close"))
         })?;
-        connection.writer.lock().await.shutdown().await?;
+        tokio::time::timeout(
+            std::time::Duration::from_secs(5),
+            connection.writer.lock().await.shutdown()
+        )
+        .await
+        .map_err(|_| std::io::Error::new(std::io::ErrorKind::TimedOut, "shutdown timeout"))??;
         Ok(())
     })
 }

36-36: Consider switching back to HashMap for connections storage.

You've got BTreeMap handling connections, which keeps keys sorted but costs O(log n) per lookup versus HashMap's average-case O(1). Since ConnectionId (a u64 wrapper) supports hashing just fine and you're only doing basic get/insert/remove operations (no range queries or sorted iteration anywhere in sight), BTreeMap's ordering guarantee isn't earning its keep. Unless you've got a specific reason to maintain sorted connections (which I don't see in the code), HashMap will give you snappier constant-time access as the connection pool scales.
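
A minimal sketch of the HashMap-backed storage suggested here, using only get/insert/remove. `ConnId` and `Registry` are hypothetical stand-ins for `ConnectionId` and the real connection table, and the `String` payload replaces the actual reader/writer halves.

```rust
use std::collections::HashMap;

/// Hypothetical u64 wrapper that derives `Hash` and `Eq`, as a map key needs.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ConnId(pub u64);

/// Hypothetical connection table keyed by `ConnId`.
pub struct Registry {
    conns: HashMap<ConnId, String>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { conns: HashMap::new() }
    }

    /// Inserts a connection, returning the previous entry for that id, if any.
    pub fn insert(&mut self, id: ConnId, peer: &str) -> Option<String> {
        self.conns.insert(id, peer.to_string())
    }

    pub fn get(&self, id: ConnId) -> Option<&String> {
        self.conns.get(&id)
    }

    pub fn remove(&mut self, id: ConnId) -> Option<String> {
        self.conns.remove(&id)
    }
}
```

If deterministic iteration order ever matters (for example in simulation or replay), that would be a reason to stay with BTreeMap.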

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 208ac7b and ac38c15.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • crates/amaru-kernel/src/protocol_messages/tip.rs (1 hunks)
  • crates/amaru-network/Cargo.toml (1 hunks)
  • crates/amaru-network/src/connection.rs (5 hunks)
  • crates/amaru-protocols/Cargo.toml (1 hunks)
  • crates/amaru-protocols/src/chainsync/responder.rs (1 hunks)
  • crates/amaru-protocols/src/effects.rs (4 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (16 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (5 hunks)
  • crates/amaru/src/stages/consensus/forward_chain/mod.rs (1 hunks)
  • crates/amaru/src/stages/mod.rs (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • crates/amaru/src/stages/mod.rs
  • crates/amaru-kernel/src/protocol_messages/tip.rs
  • crates/amaru-network/Cargo.toml
  • crates/amaru-protocols/Cargo.toml
🧰 Additional context used
🧠 Learnings (15)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.
📚 Learning: 2025-05-12T14:21:27.470Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.

Applied to files:

  • crates/amaru-protocols/src/effects.rs
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru-protocols/src/effects.rs
  • crates/amaru-network/src/connection.rs
📚 Learning: 2025-08-08T14:46:53.013Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/pool_params.rs:107-116
Timestamp: 2025-08-08T14:46:53.013Z
Learning: In crates/amaru-kernel/src/pool_params.rs, when serializing Relay::SingleHostAddr IPv6 to text, the project intentionally reverses each 4-byte chunk before constructing std::net::Ipv6Addr. This matches cardano-ledger’s IPv6 representation (four little-endian Word32 chunks). Do not “simplify” by passing the raw 16 bytes directly to Ipv6Addr::from; that would break ledger compatibility.

Applied to files:

  • crates/amaru-protocols/src/effects.rs
  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru-network/src/connection.rs
📚 Learning: 2025-09-29T16:38:59.323Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:216-233
Timestamp: 2025-09-29T16:38:59.323Z
Learning: In crates/amaru-network/src/mux.rs, the segment length field uses u16 type which naturally limits values to 65535, matching MAX_SEGMENT_SIZE constant exactly. This provides type-level safety against oversized allocations without needing runtime bounds checking.

Applied to files:

  • crates/amaru-protocols/src/effects.rs
📚 Learning: 2025-12-16T21:32:37.668Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/amaru-network/src/handshake/tests.rs:40-47
Timestamp: 2025-12-16T21:32:37.668Z
Learning: In Rust, shadowing a binding with a new let does not drop the previous binding until the end of the scope. All shadowed bindings in a scope are dropped in reverse-declaration order when the scope ends. Therefore, multiple let _guard = register_*() calls will keep all guards alive until the end of the function (or the surrounding scope). When reviewing code, be mindful that resources tied to shadowed bindings persist longer than the most recent binding; to release early, constrain the lifetime in an inner block or explicitly drop guards when appropriate.

Applied to files:

  • crates/amaru-protocols/src/effects.rs
  • crates/amaru/src/stages/consensus/forward_chain/mod.rs
  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru-network/src/connection.rs
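
The shadowing behaviour described in this learning can be checked with a small standard-library sketch. `Guard` and `register` are hypothetical; each guard records its name in a shared log when dropped.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical guard that logs its name when dropped.
pub struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

pub fn register(name: &'static str, log: &Rc<RefCell<Vec<&'static str>>>) -> Guard {
    Guard { name, log: Rc::clone(log) }
}

/// Returns how many guards were dropped before the scope ended,
/// and the order in which they were dropped once it did.
pub fn shadowed_guards() -> (usize, Vec<&'static str>) {
    let log = Rc::new(RefCell::new(Vec::new()));
    let dropped_inside;
    {
        let _guard = register("a", &log);
        let _guard = register("b", &log); // shadows "a" without dropping it
        let _guard = register("c", &log); // shadows "b" without dropping it
        dropped_inside = log.borrow().len();
    }
    let order = log.borrow().clone();
    (dropped_inside, order)
}
```

Wrapping a guard in its own inner block, or calling `drop` on it explicitly, releases it earlier.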
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-04-22T09:18:19.893Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-04-20T17:56:48.565Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-05-21T18:58:48.631Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs
📚 Learning: 2025-08-08T14:43:20.218Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.

Applied to files:

  • crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs
📚 Learning: 2025-09-06T09:16:25.025Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 435
File: crates/amaru/src/bin/amaru/cmd/convert_ledger_state.rs:113-116
Timestamp: 2025-09-06T09:16:25.025Z
Learning: In cardano-node serialized ledger state CBOR encoding, indefinite-length structures may be terminated with 0xFF "break" markers. The current code in convert_ledger_state.rs unconditionally strips the last byte (bytes[p..bytes.len() - 1]), which could corrupt data if the trailing byte is not actually a CBOR break marker.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-09-05T17:30:55.869Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Applied to files:

  • crates/amaru-network/src/connection.rs
🧬 Code graph analysis (4)
crates/amaru-protocols/src/effects.rs (1)
crates/amaru-network/src/socket_addr.rs (1)
  • resolve (19-33)
crates/amaru/src/stages/consensus/forward_chain/mod.rs (9)
crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (1)
  • tip (97-105)
crates/amaru-kernel/src/is_header/mod.rs (2)
  • tip (168-170)
  • point (40-42)
crates/amaru-ledger/src/state.rs (1)
  • tip (242-254)
crates/amaru-stores/src/in_memory/mod.rs (2)
  • tip (94-98)
  • tip (362-364)
crates/amaru-kernel/src/protocol_messages/tip.rs (1)
  • point (30-32)
crates/amaru-consensus/src/consensus/effects/network_effects.rs (1)
  • point (116-121)
crates/amaru-kernel/src/consensus_events.rs (1)
  • point (128-133)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (1)
  • point (87-89)
crates/amaru-network/src/point.rs (1)
  • to_network_point (18-23)
crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (6)
crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (1)
  • tip (97-105)
crates/amaru-kernel/src/protocol_messages/tip.rs (2)
  • point (30-32)
  • block_height (42-44)
crates/amaru/src/stages/consensus/forward_chain/test_infra.rs (2)
  • point (87-89)
  • new (102-127)
crates/amaru-network/src/point.rs (2)
  • from_network_point (25-32)
  • to_network_point (18-23)
crates/amaru/src/stages/consensus/forward_chain/mod.rs (1)
  • to_pallas_tip (24-31)
crates/amaru-kernel/src/protocol_messages/block_height.rs (2)
  • from (24-26)
  • from (30-32)
crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (1)
crates/amaru-network/src/point.rs (2)
  • from_network_point (25-32)
  • to_network_point (18-23)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 10.5.3)
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build wasm32
  • GitHub Check: Test coverage
  • GitHub Check: Benches
  • GitHub Check: Test the CLI
  • GitHub Check: clippy
  • GitHub Check: Analyze (rust)
🔇 Additional comments (27)
crates/amaru-protocols/src/effects.rs (4)

15-16: Noice refactoring, mate!

Clean import switch to the external crates. This sets up the foundation for the rest of the changes below.


109-109: Clean API alignment – the send() method accepts NonEmptyBytes directly now.

The signature's already expecting NonEmptyBytes for that data parameter, so dropping the .into() is spot on. No type shenanigans, just a proper bit of tidying up – like finally organizing your game collection after years of chaos on the shelf.


83-83: Connect effect refactor is bang on!

The shift from manual .resolve() to letting connect_addrs do the heavy lifting is the right move. Since the field is already typed as ToSocketAddrs, the method will handle resolution internally—no need for the extra legwork. The trait bound ensures all address types are properly supported without any hassle.


170-187: The public visibility is necessary—this helper is imported and used across multiple test modules.

G'day! The pub modifier under #[cfg(test)] isn't a quirk here—it's a requirement. This function is imported in handshake/tests.rs, tx_submission/tests/system_test.rs, and keepalive/tests.rs, so it needs to be public for cross-module test integration.

The type signature checks out too. The resolve() function returns Vec<SocketAddr>, which matches exactly what ConnectionProvider::connect() expects. Error handling is clean, and the timeout wrapper is idiomatic Tokio—all good.

The 5-second timeout works fine as a default for test code. If you want to make it more flexible in the future, you could read it from an env var, but that's optional polish rather than a fix.
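If you do go for the env-var polish later, a minimal sketch might look like the following. The variable name `AMARU_TEST_CONNECT_TIMEOUT_SECS` and both function names are made up for illustration and are not part of the code base; the 5-second default is the only value taken from the helper under review.

```rust
use std::time::Duration;

/// Default used when the variable is unset or unparsable
/// (5 seconds, matching the test helper discussed above).
const DEFAULT_CONNECT_TIMEOUT: Duration = Duration::from_secs(5);

/// Parse an optional raw value (whole seconds) into a timeout.
/// Kept separate from the environment lookup so it can be tested
/// without touching process-wide state.
pub fn parse_timeout(raw: Option<&str>) -> Duration {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(DEFAULT_CONNECT_TIMEOUT)
}

/// Read the timeout from the environment.
/// ASSUMPTION: the variable name is hypothetical.
pub fn connect_timeout_from_env() -> Duration {
    parse_timeout(
        std::env::var("AMARU_TEST_CONNECT_TIMEOUT_SECS")
            .ok()
            .as_deref(),
    )
}
```

The resulting `Duration` could then be passed to the existing timeout wrapper in place of the hard-coded value.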

crates/amaru-protocols/src/chainsync/responder.rs (10)

1-27: G'day! License and imports looking choice.

All the imports are spot on and you've got everything you need for the responder to do its thing. Clean start, mate!


29-57: Responder struct is locked and loaded!

Nice clean structure here, cobber. The initial state of Idle { send_rollback: false } with Point::Origin makes perfect sense for a fresh responder. All the derives are bang on for what you need.


59-63: Message types looking sweet as!

Simple and effective enum for your responder messages. No dramas here.


65-153: Responder loop is ace, mate!

This is a ripper of a design pattern. The internal Msg enum to handle both external messages and internal actions is clever, and the loop that keeps churning through follow-up actions until there's nothing left to do is exactly what you want for a state machine like this.

The .or_terminate() error handling is appropriate for a protocol handler – if something goes pear-shaped, you log it and bail out gracefully.
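For readers skimming the review, the "keep churning through follow-up actions" pattern can be sketched roughly as below. This is a toy model, not the responder's actual code: `Msg`, `Action`, `Responder`, and the notion of "send one item per step until caught up with the tip" are all simplified stand-ins.

```rust
/// Hypothetical stand-in for the responder's internal message type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Msg {
    /// A message arriving from outside (here: a new tip height).
    NewTip(u64),
    /// A follow-up produced by a previous step.
    Internal(Action),
}

/// Hypothetical follow-up action.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Action {
    SendNext,
}

/// Toy responder state: how far the tip is and how much was sent.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Responder {
    pub tip: u64,
    pub sent: u64,
    pub log: Vec<u64>,
}

impl Responder {
    /// Handle one message and optionally return a follow-up action.
    fn step(&mut self, msg: Msg) -> Option<Action> {
        match msg {
            Msg::NewTip(tip) => self.tip = self.tip.max(tip),
            Msg::Internal(Action::SendNext) => {
                self.sent += 1;
                self.log.push(self.sent);
            }
        }
        // Request another step while the peer is still behind the tip.
        (self.sent < self.tip).then_some(Action::SendNext)
    }

    /// Process an external message, then drain follow-ups until none remain.
    pub fn handle(&mut self, external: Msg) {
        let mut next = Some(external);
        while let Some(msg) = next {
            next = self.step(msg).map(Msg::Internal);
        }
    }
}
```

Wrapping follow-ups in the same `Msg` type means external and internal inputs share one code path, which is the property praised above.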


155-172: State checks and early returns are mint!

The early guards are clean as – checking state type, send_rollback flag, and whether we've caught up to the tip. Sweet logic flow, champion.


201-223: Intersection finder is a beauty!

The logic here is spot on, mate. Sorting points in reverse order and walking back from the tip makes perfect sense. The early exit on line 218 when you've gone past the lowest point is a nice little optimization – not gonna waste time searching when you know there's no match left.

The contains() call is O(n) but the points list is typically small enough that it's not worth worrying about.
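To make the described search concrete, here is a rough sketch under simplifying assumptions: `Point` is reduced to a slot plus a one-byte hash, the chain is a plain slice ordered oldest to newest, and `find_intersect` is a hypothetical name rather than the function in the PR.

```rust
/// Simplified stand-in for a chain point (ordering: slot first, then hash).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Point {
    pub slot: u64,
    pub hash: u8,
}

/// Walk back from the tip and return the most recent chain point that the
/// peer also requested. `chain` is ordered oldest -> newest.
pub fn find_intersect(chain: &[Point], mut requested: Vec<Point>) -> Option<Point> {
    // Sort in reverse order so the lowest requested point ends up last.
    requested.sort_by(|a, b| b.cmp(a));
    let lowest = *requested.last()?; // no requested points -> no intersection

    for candidate in chain.iter().rev() {
        // Early exit: everything further back is older than any requested point.
        if candidate.slot < lowest.slot {
            return None;
        }
        // Linear scan; acceptable while the requested list stays small.
        if requested.contains(candidate) {
            return Some(*candidate);
        }
    }
    None
}
```

A hash mismatch at the same slot (a point on another fork) is simply skipped, and the walk continues toward older blocks.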


225-232: Internal action enum is choice!

Private enum with all the actions you need for driving the state machine. Clean and purposeful – no worries here.


234-239: Result enum looking solid!

Public API surface for results is clean. The three variants map nicely to the different types of follow-up actions your responder needs to handle.


240-247: State enum is top-notch!

The state machine states are well thought out, mate. The send_rollback flag being threaded through Idle, CanAwait, and MustReply makes perfect sense for tracking whether you need to send a rollback on the next request.

One tiny thing I noticed – the Done state doesn't seem to be used anywhere in the current implementation. Not necessarily a problem if you're planning to use it down the track, but worth noting.


249-302: State machine is absolutely brilliant!

This is proper FSM design right here, champion. Every valid transition is handled explicitly, and the type system is doing the heavy lifting to prevent impossible states – exactly what the Amaru project prefers based on the learnings.

The logic flow makes sense:

  • Finding intersection sets send_rollback: true so the next request knows to rollback to that point
  • Roll actions complete and reset back to Idle with send_rollback: false
  • The catchall on line 299 ensures you'll catch any invalid transitions during testing

The pattern matching is clean as, and the state transitions are easy to follow. This is some top-shelf code, mate!

Based on learnings, separate result types for state transitions are preferred over runtime checks.
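A stripped-down sketch of the transition shape described above, for illustration only. The `Event` enum, the `Intersect` state, and the `step` function are assumptions; only the `send_rollback` flag on `Idle`, `CanAwait`, and `MustReply`, the reset to `Idle` after a roll, and the catch-all arm are taken from the review text.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Idle { send_rollback: bool },
    CanAwait { send_rollback: bool },
    MustReply { send_rollback: bool },
    /// ASSUMPTION: intermediate state while an intersection is being searched.
    Intersect,
    Done,
}

/// ASSUMPTION: hypothetical event names, not the PR's message/action types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    RequestNext,
    AwaitReply,
    FindIntersect,
    IntersectFound,
    RollForward,
    RollBackward,
    Done,
}

/// Pure transition function: every valid move is listed explicitly,
/// everything else is reported as an error by the final arm.
pub fn step(state: State, event: Event) -> Result<State, String> {
    Ok(match (state, event) {
        (State::Idle { send_rollback }, Event::RequestNext) => State::CanAwait { send_rollback },
        (State::Idle { .. }, Event::FindIntersect) => State::Intersect,
        (State::Idle { .. }, Event::Done) => State::Done,
        // Finding an intersection arms the rollback for the next request.
        (State::Intersect, Event::IntersectFound) => State::Idle { send_rollback: true },
        (State::CanAwait { send_rollback }, Event::AwaitReply) => State::MustReply { send_rollback },
        // Any roll completes the request and clears the flag.
        (
            State::CanAwait { .. } | State::MustReply { .. },
            Event::RollForward | Event::RollBackward,
        ) => State::Idle { send_rollback: false },
        (state, event) => return Err(format!("invalid transition: {state:?} on {event:?}")),
    })
}
```

Keeping `step` free of I/O makes the whole table easy to exercise in unit tests, including the invalid combinations.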

crates/amaru/src/stages/consensus/forward_chain/mod.rs (1)

24-31: Lovely bit of glue code, mate!

This conversion helper is clean as a whistle. You're bridging the internal Tip representation to Pallas' network protocol format with proper delegation to existing utilities. No dramas here - the logic is sound, and it's exactly what you need for interop between the kernel and network layers.

crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (5)

20-20: Nice tidy-up on the imports!

Consolidating the point conversion imports makes it crystal clear what utilities you're pulling in. Good housekeeping, no worries here.


87-89: Much clearer debug output, legend!

Using header.as_tip().block_height() instead of index-based access makes this way more readable. Anyone debugging chainsync issues will thank you for this clarity. Cheers!


100-103: Solid tip construction logic!

The explicit Tip construction with proper point and height conversions is spot on. You're maintaining type safety while bridging between the internal and network representations. Good stuff!


195-202: Clean point conversion setup!

Mapping the requested points through from_network_point before initializing the ChainFollower is exactly right. You're ensuring the follower works with kernel types internally, which keeps the type boundaries clean. Love it!


213-213: Proper boundary conversion, mate!

Converting the intersection to a network point before sending it out is the right move. You're maintaining clean separation between internal and external representations. Textbook stuff!

crates/amaru/src/stages/consensus/forward_chain/chain_follower.rs (6)

18-22: Top-notch import organization!

The explicit imports and the PallasTip alias make this code way more readable. Anyone jumping into this file will immediately understand what Tip variant they're dealing with. That's quality setup right there!


79-79: Origin fallback looks sound!

Using Tip::new(Point::Origin, 0.into()) as your fallback when the anchor or best intersection can't be loaded makes sense - it's like respawning at the starting checkpoint. This aligns with typical chain-sync behavior where you fall back to genesis if you can't find a better starting point.

Just double-check this matches your consensus expectations, especially for fresh nodes or weird edge cases where the store is in an unexpected state.

Also applies to: 140-140


153-156: Perfect PallasTip wrapping!

Wrapping the initial tip with proper point and height conversions for the Pallas network type is exactly what the doctor ordered. Clean and consistent with your conversion pattern throughout.


162-188: Solid behind-anchor progression logic!

The updates here are on point (pun absolutely intended). Using proper accessor methods like .block_height() and .point(), plus the consistent point conversions with to_network_point, keep everything type-safe. The flow from store lookup → header load → intersection update is clean as.


220-442: Test updates looking sharp, but don't forget the sequel!

The test refactoring is beautifully consistent with your type migrations - proper use of from_network_point, to_pallas_tip, and BlockHeight throughout. However, your PR description mentions "no individual test yet, will test when new stages are complete."

Just a friendly reminder to circle back and add comprehensive tests once the stages are done. You've got the test infrastructure updated, so you're set up for success - just need to fill in those test scenarios. Think of it like having all the weapons but not testing them against the final boss yet!


72-72: Point::hash() is properly implemented for all variants.

The Point type already has a well-designed hash() method that gracefully handles both Origin and Specific variants. Origin returns a zero hash by convention, while Specific returns the stored header hash. The code at line 72 is sound, no issues here.

Likely an incorrect or invalid review comment.

crates/amaru-network/src/connection.rs (1)

38-51: Nice split between sync and async mutexes, legend!

Using parking_lot's Mutex for the connections map (short-lived lock) and tokio's AsyncMutex for per-connection I/O (held across awaits) is spot on. That's like knowing when to use a sprint vs a marathon pace – textbook stuff.

Signed-off-by: Roland Kuhn <[email protected]>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
crates/amaru-protocols/src/store_effects.rs (1)

73-79: These stub implementations are already on the radar.

G'day! Just a heads up that load_from_best_chain and next_best_chain are returning None unconditionally. This was already flagged in a previous review round. Since the PR description mentions this is scaffolding work ("no individual test yet"), I'm guessing you're aware these need fleshing out when the stages are complete.

🧹 Nitpick comments (6)
crates/amaru-protocols/src/protocol/check.rs (1)

63-66: Consider handling init() error more gracefully.

Right now you're calling unwrap() on init(), which could panic at runtime if the protocol state's initialization fails. Since this is validation code, a panic might be intentional, but an expect() with a descriptive message would be more helpful for debugging - like getting a proper game over screen instead of just a crash, y'know?

🔎 Proposed improvement
-        assert_eq!(
-            initial.init().unwrap().0.send.is_some(),
+        let init_outcome = initial.init().expect("initial state should be valid");
+        assert_eq!(
+            init_outcome.0.send.is_some(),
             role == Role::Initiator
         );
crates/amaru-protocols/src/protocol/miniprotocol.rs (1)

121-148: Network input handling looks solid.

The pattern of decoding the wire message, stepping the protocol state, and conditionally sending a response is well-structured. The or_terminate with err() provides good error context. One thing though - when HandlerMessage isn't FromNetwork, you call proto.init(), which implies this is the only other variant. Might be worth adding a comment clarifying this assumption, like a tooltip in a game, so future devs don't get lost.

🔎 Consider adding a clarifying comment
                 } else {
+                    // HandlerMessage::Start or similar initialization variant
                     (proto.init(), "failed to initialize protocol state")
                 };
crates/amaru-protocols/src/protocol/mod.rs (1)

24-29: Consider the naming similarity between Input and Inputs.

You've got Input<L, R> here with Local/Remote variants, and Inputs<L> in miniprotocol.rs with Local/Network variants. Both are re-exported from this module. While they serve different purposes (generic vs handler-specific), the naming is pretty similar - could trip someone up, like mixing up similar-looking keys in Resident Evil.

Maybe consider renaming one for clarity? Something like ProtocolInput vs HandlerInputs, or just leave a doc comment explaining the distinction. Not a blocker, just a readability suggestion, mate.

🔎 Consider adding doc comments to distinguish
-/// Input to a protocol step
+/// Generic input type for protocol steps with distinct local and remote message types.
+/// See also [`Inputs`] for miniprotocol handler-specific inputs.
 #[derive(Debug, Clone, PartialEq, Eq, serde::Serialize, serde::Deserialize)]
 pub enum Input<L, R> {
     Local(L),
     Remote(R),
 }
crates/amaru-protocols/src/store_effects.rs (1)

117-533: This boilerplate is giving me serious déjà vu!

You've got 14 effect implementations here, all following the same pattern: struct → new() → ExternalEffect::run() → ExternalEffectAPI → ExternalEffectSync. While it works perfectly fine and is clear, this much repetition is like watching the same cutscene 14 times; eventually you want a skip button.

Consider introducing a macro to generate these effect types. Something like declare_effect! that takes the effect name, fields, response type, and the store method to call. Would cut down the line count significantly and make future additions less tedious.

That said, given this is scaffolding work (per the PR description), maybe it's fine to leave as-is for now and refactor once the full shape is clear.
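To give a feel for what such a macro could look like, here is a hedged sketch. It does not use the real pure-stage traits: `Store`, `SyncEffect`, `MemStore`, and the two generated effects are hypothetical stand-ins, and a real `declare_effect!` would also need to emit the `ExternalEffectAPI` and `ExternalEffectSync` impls.

```rust
/// Hypothetical stand-in for the store the effects call into.
pub trait Store {
    fn get_tip(&self) -> u64;
    fn has_header(&self, hash: &u64) -> bool;
}

/// Hypothetical stand-in for the effect trait(s) each type implements.
pub trait SyncEffect {
    type Response;
    fn run(self, store: &dyn Store) -> Self::Response;
}

/// Generates the struct, its `new()` constructor, and the trait impl.
macro_rules! declare_effect {
    ($name:ident { $($field:ident : $ty:ty),* $(,)? } -> $resp:ty,
     |$store:ident, $this:ident| $body:expr) => {
        #[derive(Debug, Clone, PartialEq)]
        pub struct $name { $(pub $field: $ty),* }

        impl $name {
            pub fn new($($field: $ty),*) -> Self {
                Self { $($field),* }
            }
        }

        impl SyncEffect for $name {
            type Response = $resp;
            fn run(self, $store: &dyn Store) -> $resp {
                let $this = self;
                $body
            }
        }
    };
}

// Two example effects; names and fields are illustrative only.
declare_effect!(GetTip {} -> u64, |store, _this| store.get_tip());
declare_effect!(HasHeader { hash: u64 } -> bool, |store, this| store.has_header(&this.hash));

/// Tiny in-memory store used to exercise the generated effects.
pub struct MemStore {
    pub tip: u64,
    pub headers: Vec<u64>,
}

impl Store for MemStore {
    fn get_tip(&self) -> u64 {
        self.tip
    }
    fn has_header(&self, hash: &u64) -> bool {
        self.headers.contains(hash)
    }
}
```

Each new effect then costs one line instead of a struct plus three impl blocks, at the price of slightly less greppable definitions.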

crates/amaru-protocols/src/chainsync/messages.rs (2)

18-28: Quick question about the derives, chief.

You've got Ord and PartialOrd on this enum—are these messages actually being sorted or used in a BTreeMap somewhere? Just curious if there's a specific use case for total ordering on protocol messages, since it's not super common in this context.

If there's no need for ordering, you could trim those derives to keep things minimal.


130-166: Consider adding some defensive validation, mate.

The decode logic works, but based on the learnings, you could add a bit more robustness:

  1. Array length validation: The learning from KtorZ suggests preferring minicbor_extra::heterogenous_array to validate array lengths (e.g., lines 132, 138).

  2. Tag validation: Lines 144 and 155 read CBOR tags but don't verify they're the expected IanaTag::Cbor (tag 24). A mismatch could indicate malformed data.

These aren't critical—the decode will fail anyway if the structure is wrong—but explicit validation makes for clearer error messages and catches protocol violations earlier.

Based on learnings from KtorZ about using minicbor_extra::heterogenous_array for array validation.
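As an illustration of the tag check in point 2, here is a byte-level sketch that deliberately avoids the minicbor API (whose exact tag types vary between versions). It relies only on the CBOR encoding of tag 24, which is the two bytes `0xD8 0x18`; the function name `strip_tag_24` is hypothetical.

```rust
/// Verify that `bytes` starts with CBOR tag 24 ("encoded CBOR data item")
/// and return the remainder. Tag 24 is encoded as 0xD8 (major type 6,
/// one-byte argument) followed by 0x18 (= 24).
pub fn strip_tag_24(bytes: &[u8]) -> Result<&[u8], String> {
    match bytes {
        [0xD8, 0x18, rest @ ..] => Ok(rest),
        [0xD8, other, ..] => Err(format!("expected CBOR tag 24, found tag {other}")),
        [0xD8] => Err("truncated CBOR tag".to_string()),
        [first, ..] => Err(format!(
            "expected a CBOR tag, found initial byte {first:#04x}"
        )),
        [] => Err("unexpected end of input".to_string()),
    }
}
```

With a decoder library the same idea is a comparison of the decoded tag against the expected value, returning a decode error with a message that names the mismatch.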

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac38c15 and 0d2082d.

📒 Files selected for processing (8)
  • crates/amaru-protocols/src/chainsync/messages.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru-protocols/src/protocol/check.rs
  • crates/amaru-protocols/src/protocol/miniprotocol.rs
  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/amaru-protocols/src/store_effects.rs
  • crates/pure-stage/src/lib.rs
  • crates/pure-stage/src/types.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/amaru-protocols/src/chainsync/responder.rs
🧰 Additional context used
🧠 Learnings (18)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.
📚 Learning: 2025-08-12T12:28:24.027Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 372
File: simulation/amaru-sim/src/simulator/mod.rs:410-412
Timestamp: 2025-08-12T12:28:24.027Z
Learning: In the Amaru project, panic statements are acceptable in simulation/test code (like amaru-sim crate) as they help identify configuration issues quickly during development, rather than needing proper error handling like production code.

Applied to files:

  • crates/pure-stage/src/types.rs
📚 Learning: 2025-12-16T21:32:37.668Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/amaru-network/src/handshake/tests.rs:40-47
Timestamp: 2025-12-16T21:32:37.668Z
Learning: In Rust, shadowing a binding with a new let does not drop the previous binding until the end of the scope. All shadowed bindings in a scope are dropped in reverse-declaration order when the scope ends. Therefore, multiple let _guard = register_*() calls will keep all guards alive until the end of the function (or the surrounding scope). When reviewing code, be mindful that resources tied to shadowed bindings persist longer than the most recent binding; to release early, constrain the lifetime in an inner block or explicitly drop guards when appropriate.

Applied to files:

  • crates/pure-stage/src/types.rs
  • crates/amaru-protocols/src/chainsync/messages.rs
  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/amaru-protocols/src/store_effects.rs
  • crates/amaru-protocols/src/protocol/check.rs
  • crates/pure-stage/src/lib.rs
  • crates/amaru-protocols/src/protocol/miniprotocol.rs
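The shadowing behaviour described in this learning can be checked with a small std-only sketch. `Guard` and `shadowed_guards` are illustrative names; the guard simply records its name in a shared log when dropped.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Records its name in a shared log when dropped.
pub struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order in which two shadowed guards were dropped.
pub fn shadowed_guards() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _guard = Guard { name: "first", log: Rc::clone(&log) };
        // Shadowing hides the first binding but does not drop it.
        let _guard = Guard { name: "second", log: Rc::clone(&log) };
        // Both guards are still alive at this point.
        assert!(log.borrow().is_empty());
    } // scope ends: guards are dropped in reverse declaration order
    let order = log.borrow().clone();
    order
}
```

Calling `shadowed_guards()` yields the drop order `["second", "first"]`. Note that this applies to named bindings such as `_guard`; the bare pattern `_` does not bind at all, so a value assigned to it is dropped immediately.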
📚 Learning: 2025-08-08T14:35:35.562Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/transaction_pointer.rs:36-44
Timestamp: 2025-08-08T14:35:35.562Z
Learning: In the amaru project, when decoding CBOR arrays, prefer using minicbor_extra::heterogenous_array with the expected length to validate definite-length arrays and correctly handle indefinite-length arrays. Example: crates/amaru-kernel/src/transaction_pointer.rs Decode should use heterogenous_array(d, 2, …) instead of ignoring the length from d.array().

Applied to files:

  • crates/amaru-protocols/src/chainsync/messages.rs
📚 Learning: 2025-09-06T09:16:25.025Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 435
File: crates/amaru/src/bin/amaru/cmd/convert_ledger_state.rs:113-116
Timestamp: 2025-09-06T09:16:25.025Z
Learning: In cardano-node serialized ledger state CBOR encoding, indefinite-length structures may be terminated with 0xFF "break" markers. The current code in convert_ledger_state.rs unconditionally strips the last byte (bytes[p..bytes.len() - 1]), which could corrupt data if the trailing byte is not actually a CBOR break marker.

Applied to files:

  • crates/amaru-protocols/src/chainsync/messages.rs
📚 Learning: 2025-08-08T14:34:06.105Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/minicbor-extra/src/lib.rs:50-55
Timestamp: 2025-08-08T14:34:06.105Z
Learning: Project uses Rust 1.88 stable; using Result::is_err_and is acceptable and should not be flagged as unstable. In particular, crates/minicbor-extra/src/lib.rs relies on is_err_and when checking end_of_input; future reviews should not suggest replacing it for stability reasons.

Applied to files:

  • crates/amaru-protocols/src/chainsync/messages.rs
  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/amaru-protocols/src/protocol/miniprotocol.rs
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • crates/amaru-protocols/src/chainsync/messages.rs
📚 Learning: 2025-08-08T14:46:53.013Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/pool_params.rs:107-116
Timestamp: 2025-08-08T14:46:53.013Z
Learning: In crates/amaru-kernel/src/pool_params.rs, when serializing Relay::SingleHostAddr IPv6 to text, the project intentionally reverses each 4-byte chunk before constructing std::net::Ipv6Addr. This matches cardano-ledger’s IPv6 representation (four little-endian Word32 chunks). Do not “simplify” by passing the raw 16 bytes directly to Ipv6Addr::from; that would break ledger compatibility.

Applied to files:

  • crates/amaru-protocols/src/chainsync/messages.rs
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/amaru-protocols/src/protocol/miniprotocol.rs
📚 Learning: 2025-05-21T18:58:48.631Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.

Applied to files:

  • crates/amaru-protocols/src/store_effects.rs
📚 Learning: 2025-08-20T13:02:25.763Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.

Applied to files:

  • crates/amaru-protocols/src/store_effects.rs
📚 Learning: 2025-08-18T08:10:32.640Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:427-433
Timestamp: 2025-08-18T08:10:32.640Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp, so unimplemented methods like set_proposals_roots and set_constitution are intentionally left as placeholders until the revamp is complete.

Applied to files:

  • crates/amaru-protocols/src/store_effects.rs
📚 Learning: 2025-08-18T08:10:35.849Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:431-433
Timestamp: 2025-08-18T08:10:35.849Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp soon, so unimplemented methods like set_constitution, set_proposals_roots are intentionally left aside until the revamp is complete.

Applied to files:

  • crates/amaru-protocols/src/store_effects.rs
📚 Learning: 2025-08-08T14:43:20.218Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.

Applied to files:

  • crates/amaru-protocols/src/store_effects.rs
📚 Learning: 2025-08-18T08:11:20.028Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:109-118
Timestamp: 2025-08-18T08:11:20.028Z
Learning: The proposals_roots() method in the MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is intentionally left returning all None values rather than reading from stored state, as it's planned for the upcoming major MemoryStore revamp.

Applied to files:

  • crates/amaru-protocols/src/store_effects.rs
📚 Learning: 2025-05-05T08:15:24.192Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 206
File: crates/pure-stage/src/simulation/state.rs:4-8
Timestamp: 2025-05-05T08:15:24.192Z
Learning: The `State` trait in the pure-stage crate already requires `Send` with its definition: `pub trait State: Any + fmt::Debug + Send + 'static`, making additional `+ Send` bounds redundant when using `Box<dyn State>`.

Applied to files:

  • crates/pure-stage/src/lib.rs
📚 Learning: 2025-06-14T16:31:53.134Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 263
File: simulation/amaru-sim/src/simulator/simulate.rs:298-300
Timestamp: 2025-06-14T16:31:53.134Z
Learning: StageRef in the pure-stage crate supports serde serialization and deserialization (derives serde::Serialize and serde::Deserialize), enabling it to be used in structs that also derive these traits for TraceBuffer and replay functionality.

Applied to files:

  • crates/pure-stage/src/lib.rs
📚 Learning: 2025-09-29T16:44:14.807Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/protocol.rs:94-106
Timestamp: 2025-09-29T16:44:14.807Z
Learning: In the amaru-network crate protocol.rs file, the correct Cardano mini-protocol ID assignments are: PROTO_N2N_KEEP_ALIVE = 8 and PROTO_N2N_PEER_SHARE = 10, as verified against the network specification by the maintainer.

Applied to files:

  • crates/amaru-protocols/src/protocol/miniprotocol.rs
🧬 Code graph analysis (6)
crates/amaru-protocols/src/chainsync/messages.rs (3)
crates/amaru-protocols/src/handshake/messages.rs (4)
  • decode (113-126)
  • decode (175-198)
  • encode (92-106)
  • encode (144-167)
crates/amaru-protocols/src/keepalive/messages.rs (6)
  • decode (60-62)
  • decode (97-120)
  • encode (49-56)
  • encode (73-93)
  • from (37-39)
  • from (43-45)
crates/amaru-protocols/src/tx_submission/messages.rs (3)
  • decode (95-123)
  • encode (47-91)
  • d (115-115)
crates/amaru-protocols/src/protocol/mod.rs (1)
crates/amaru-protocols/src/protocol/miniprotocol.rs (2)
  • miniprotocol (102-191)
  • outcome (52-57)
crates/amaru-protocols/src/store_effects.rs (4)
crates/amaru-protocols/src/effects.rs (9)
  • new (38-40)
  • run (75-87)
  • run (101-113)
  • run (127-139)
  • run (152-164)
  • resources (78-79)
  • resources (104-105)
  • resources (130-131)
  • resources (155-156)
crates/amaru-consensus/src/consensus/effects/consensus_effects.rs (5)
  • new (50-52)
  • store (28-28)
  • store (54-56)
  • store (80-82)
  • store (141-143)
crates/amaru-protocols/src/mempool_effects.rs (18)
  • new (41-43)
  • new (111-113)
  • new (161-163)
  • external_sync (46-54)
  • run (118-125)
  • run (139-146)
  • run (168-175)
  • run (201-208)
  • run (230-238)
  • run (260-267)
  • run (281-288)
  • resources (120-121)
  • resources (141-142)
  • resources (170-171)
  • resources (203-204)
  • resources (232-233)
  • resources (262-263)
  • resources (283-284)
crates/pure-stage/src/effect.rs (1)
  • wrap_sync (409-416)
crates/amaru-protocols/src/protocol/check.rs (2)
crates/amaru-protocols/src/protocol/mod.rs (3)
  • fmt (48-50)
  • fmt (80-82)
  • role (173-181)
crates/amaru-protocols/src/protocol/miniprotocol.rs (5)
  • init (67-67)
  • send (37-42)
  • outcome (52-57)
  • network (68-71)
  • network (88-93)
crates/pure-stage/src/lib.rs (1)
crates/pure-stage/src/types.rs (2)
  • err (381-383)
  • warn (385-387)
crates/amaru-protocols/src/protocol/miniprotocol.rs (3)
crates/amaru-protocols/src/protocol/mod.rs (4)
  • fmt (48-50)
  • fmt (80-82)
  • decode (42-44)
  • encode (38-40)
crates/pure-stage/src/types.rs (7)
  • err (381-383)
  • std (245-245)
  • std (284-284)
  • fmt (78-80)
  • fmt (227-229)
  • fmt (243-247)
  • fmt (282-286)
crates/amaru-protocols/src/chainsync/messages.rs (4)
  • decode (92-127)
  • decode (131-165)
  • encode (38-88)
  • encode (169-199)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (rust)
🔇 Additional comments (21)
crates/pure-stage/src/lib.rs (1)

50-50: Solid re-export, mate!

Extending the public API with these error and warning helpers is a ripper move. They pair nicely with the or_terminate pattern, giving users a tidy way to log errors without the ceremony. Clean as.

crates/pure-stage/src/types.rs (1)

381-387: Cracking little helpers, these!

These utility functions are spot-on for reducing boilerplate with the or_terminate pattern. The lifetime handling is sound—the '_ annotation means the returned closure borrows msg, and the compiler's got your back preventing any dodgy usage. Using structured logging with %err is proper tracing form, and the E: Display constraint meshes perfectly with it.

The names are beautifully terse—like a good flat white, no fuss, just what you need. Perfect for chaining like msg.or_terminate(&eff, err("oops")).await without typing out the whole async block ceremony each time.

crates/amaru-protocols/src/protocol/check.rs (4)

18-28: Clean Default implementation, mate!

The ProtoSpec struct and its Default impl are nicely done. Using BTreeMap for ordered traversal during checks is a solid choice - like picking the right tool for the job, yeah?


35-47: Panic on duplicate transition - fair dinkum for a spec checker.

The arrow() method panics on duplicate transitions, which is the right call for a protocol specification validator. You want to catch these conflicts early, like spotting a bug before the final boss fight.


71-85: Solid message routing logic.

The bifurcation between local and network messages is well-structured. The assertion on lines 72-75 ensuring network messages aren't accepted as local is a nice defensive check - like having a bouncer at both doors of the pub.


86-101: Transition validation looks thorough.

The match on (to, outcome) handles all four cases cleanly, and the subsequent validation for local vs network paths with send verification is solid. The logic at lines 95-98 where you follow the chain of transitions when there's a subsequent send is particularly clever - reminds me of combo chains in a fighting game.

crates/amaru-protocols/src/protocol/miniprotocol.rs (6)

22-27: Clean input enum definition.

The Inputs<L> enum nicely separates local from network inputs. The derive macros are appropriate for the use case. Good stuff!


29-57: Builder-style Outcome is well-designed.

The Outcome struct with its builder methods send() and result() follows a clean pattern. The outcome() factory function for creating empty outcomes is handy - like having a quick-select wheel in your inventory, yeah?
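For readers who don't have the file open, the builder shape described above can be sketched roughly as follows. This is a minimal sketch under assumptions: the field names, the two generic parameters, and the `Option` wrappers are hypothetical and may not match the real `Outcome` in `miniprotocol.rs`.

```rust
/// Hypothetical stand-in for `Outcome`: an optional wire message to send
/// plus an optional result to hand back to the local stage.
#[derive(Debug, PartialEq)]
pub struct Outcome<S, R> {
    pub send: Option<S>,
    pub result: Option<R>,
}

impl<S, R> Outcome<S, R> {
    /// Attach a message to send; consumes and returns the outcome for chaining.
    pub fn send(mut self, msg: S) -> Self {
        self.send = Some(msg);
        self
    }

    /// Attach a result; consumes and returns the outcome for chaining.
    pub fn result(mut self, res: R) -> Self {
        self.result = Some(res);
        self
    }
}

/// Factory for an empty outcome (nothing to send, no result).
pub fn outcome<S, R>() -> Outcome<S, R> {
    Outcome { send: None, result: None }
}
```

A call site would then read `outcome().send(msg).result(value)`, which keeps the "nothing happened" case (`outcome()`) as short as possible.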


62-73: ProtocolState trait is well-structured.

The separation of concerns with WireMsg, Action, and Out types is clean. The three methods (init, network, local) cover the essential state transitions. The CBOR encode/decode bounds on WireMsg make sense for wire protocol work.


78-94: Good use of #[expect(async_fn_in_trait)].

Using expect rather than allow is the right move here - it'll warn you if this lint ever becomes unnecessary. The StageState trait design with separate local and network handlers is solid. Based on learnings, this project uses Rust 1.88 stable, so async fn in trait is fully supported.


150-171: Decision-making phase is clean.

The three-way match on LocalOrNetwork handles all paths appropriately. The pattern of updating stage after each operation maintains proper state flow. Nice and tidy!


173-190: Network send phase completes the loop nicely.

The final action handling with message encoding and sending mirrors the earlier pattern, which is good for consistency. The whole miniprotocol function ties together protocol state and stage state elegantly - like a well-choreographed co-op sequence!

crates/amaru-protocols/src/protocol/mod.rs (1)

18-22: Module structure and re-exports look good.

The new check and miniprotocol modules are properly declared and their public APIs are re-exported cleanly. This keeps the public surface tidy - like organizing your inventory before a big quest!

crates/amaru-protocols/src/store_effects.rs (3)

15-42: Nice setup, mate!

The Store<T> wrapper and the delegation pattern through external_sync looks solid. This follows the same pattern as mempool_effects.rs, keeping things consistent across the codebase. The public API surface is minimal and clear.


82-110: Proper type safety for chain operations!

The ChainStore impl looks ace. I particularly dig how roll_forward_chain returns Result<(), StoreError> while rollback_chain returns Result<usize, StoreError> — this leverages the type system to prevent impossible states, which aligns with the project's patterns. Nice work!


92-92: RawBlock is fire (cheap clone), the others are chill too

Quick check done: RawBlock wraps Arc<[u8]>, so cloning is just bumping the reference counter—dead cheap. Nonces is tiny (~136 bytes: four Hash<32>s plus an Epoch). BlockHeader pairs a Header with its hash specifically to skip redundant hashing, which is an intentional design choice. These aren't going through a tight loop anyway—they're flowing through the effect system. No performance alarm bells here, mate.

crates/amaru-protocols/src/chainsync/messages.rs (5)

1-16: Solid foundation, mate!

The license header and imports are spot on. You've got the necessary bits from amaru_kernel and minicbor to get the job done.


37-89: Nice work on the encoding, legend!

All the message variants are encoded correctly with proper array lengths matching the elements. I can see you've sorted the IntersectNotFound issue from the previous review—it's now correctly using array(2) for the label and tip. Cheers for that fix!

The encoding logic is clean and mirrors the protocol spec nicely.


91-128: Decode logic looking mint, cobber!

The decode implementation correctly mirrors the encode side, with proper variant dispatch and error handling for unknown message types. The symmetry between encode and decode is chef's kiss.


168-200: Encoding's looking choice, no worries!

The HeaderContent encode correctly mirrors the decode, handling the Byron special case with its extra nesting layer and properly encoding the CBOR tag. The error check for missing byron_prefix on variant 0 (lines 186-188) is a nice touch—good defensive programming there.

The asymmetry between Byron and other variants might look a bit unusual at first glance, but it's intentional to match the protocol spec.


140-142: Byron header structure needs source documentation reference.

The comment flags the right concern—that (u8, u64) Byron-specific structure lacks a clear reference. While Byron addresses and general CBOR encoding are well-documented, the specific Byron header prefix format decoded here deserves a citation. Consider adding a link to the specification or codebase reference where this structure is defined (e.g., ouroboros-consensus or cardano-ledger specs).

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (2)
crates/amaru-protocols/src/chainsync/responder.rs (2)

156-167: That error message is telling porkies, cobber.

You're loading by pointer.hash() but claiming "tip not found"—bit like saying you lost your keys when you're actually missing your wallet. When debugging kicks off, that'll send folks down the wrong rabbit hole entirely.

Switch it to something like "pointer header not found" or "client pointer unknown" so it actually reflects what's being looked up.

🔎 Clearer error message
         let header = store
             .load_header(&pointer.hash())
-            .ok_or_else(|| anyhow::anyhow!("tip not found"))?;
+            .ok_or_else(|| anyhow::anyhow!("client pointer header not found"))?;

173-181: Hardcoded era variant is gonna bite when you hit older blocks.

That variant: 6 assumes every header is Conway era, but if your chain includes Shelley, Allegra, Mary, Alonzo, or Babbage blocks, you're mislabelling them. Peers decoding these messages expect the correct era marker—send them a Babbage header tagged as Conway and they'll either reject it or throw a wobbly.

The era info should be derivable from the header metadata or CBOR structure. Extract it dynamically rather than hardcoding it, otherwise you're setting yourself up for protocol violations once you sync pre-Conway blocks.

Check if BlockHeader or the pallas types expose era information:

#!/bin/bash
# Search for era-related methods or fields in BlockHeader and related types
rg -n "era|variant|Era|HeaderContent" --type rust -A 3 -B 1 crates/amaru-kernel/src/is_header/ crates/amaru-kernel/src/lib.rs | head -100
🧹 Nitpick comments (8)
crates/amaru-protocols/src/protocol/check.rs (4)

18-28: Consider adding documentation to the public API.

The ProtoSpec struct is public and forms part of the protocol checking API, but it's missing doc comments. Adding a brief description of what it represents (a protocol specification checker for state machines) would help future developers understand its purpose without diving into the implementation—like leaving breadcrumbs in Hansel and Gretel, but less likely to be eaten by birds.


35-47: Method lacks documentation.

The i() method (presumably "initiator") builds the state machine but has no doc comment explaining what it does or when it panics. A quick docstring would save developers from the "wait, what does this do again?" moment—like subtitles for a Christopher Nolan film.


49-61: Method lacks documentation (and spot the code duplication).

Like i(), the r() method (presumably "responder") needs documentation. Also, there's significant duplication between i() and r()—they differ only by the Role enum variant. While it's not critical, you could potentially extract a helper method like add_transition(from, msg, to, role) to keep things DRY. But hey, sometimes a bit of duplication is clearer than over-abstraction—your call, champion.
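The suggested `add_transition` helper could look something like the sketch below. Everything here is a simplified assumption: states and messages are plain strings, and the map layout is hypothetical rather than the actual `ProtoSpec` representation in `check.rs`.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Initiator,
    Responder,
}

/// Hypothetical, simplified spec: (from state, message) -> (sender role, to state).
#[derive(Debug, Default)]
pub struct ProtoSpec {
    transitions: BTreeMap<(String, String), (Role, String)>,
}

impl ProtoSpec {
    /// Shared helper for `i()` and `r()`.
    /// Panics if the (state, message) pair already has a transition.
    fn add_transition(&mut self, from: &str, msg: &str, to: &str, role: Role) {
        let key = (from.to_owned(), msg.to_owned());
        let previous = self.transitions.insert(key, (role, to.to_owned()));
        assert!(previous.is_none(), "duplicate transition {from} --{msg}-->");
    }

    /// Register a transition triggered by a message from the initiator.
    pub fn i(mut self, from: &str, msg: &str, to: &str) -> Self {
        self.add_transition(from, msg, to, Role::Initiator);
        self
    }

    /// Register a transition triggered by a message from the responder.
    pub fn r(mut self, from: &str, msg: &str, to: &str) -> Self {
        self.add_transition(from, msg, to, Role::Responder);
        self
    }

    /// Look up which role is allowed to send `msg` in state `from`.
    pub fn role_of(&self, from: &str, msg: &str) -> Option<Role> {
        self.transitions
            .get(&(from.to_owned(), msg.to_owned()))
            .map(|(role, _)| *role)
    }
}
```

With the helper in place, `i()` and `r()` shrink to one-liners and the duplicate-transition panic lives in a single spot.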


63-142: This method could use some love, mate.

The check() method is the workhorse of the protocol validator—it's doing a lot of heavy lifting with ~80 lines and multiple nested loops. Here are some thoughts:

  1. Missing documentation: This is a complex public method that would really benefit from a docstring explaining what it validates, what the closures are for, and when it panics. Think of it as the director's commentary for your code.

  2. High cognitive complexity: The nested logic (state loop → message loop → local/network branching → send handling) makes this method a bit of a mental maze. Consider extracting helper methods like validate_local_transition() and validate_network_transition() to break it down—like splitting a boss fight into phases.

  3. Error messages could be more helpful: Some panics (lines 88-90, 103-104, 131) could provide more context about why the invariant failed. Future debugging-you will thank present-you for the extra detail.

That said, the logic appears sound and the approach is reasonable for a validation method. These are all quality-of-life improvements rather than critical issues.

crates/amaru-protocols/src/chainsync/responder.rs (4)

15-15: Scoped suppression beats blanket silence, mate.

The file-level #![allow(dead_code)] is a bit of a blunt instrument—like using a sledgehammer to crack a walnut. It'll hide legitimate warnings once this code integrates with the rest of the system.

Since you mentioned testing comes later, fair dinkum, but consider either:

  • Removing it once the integration is complete, or
  • Scoping it to specific items with #[allow(dead_code)] on just the structs/functions that are genuinely WIP

161-167: Unbounded ancestor walk could turn into a marathon.

This loop walks ancestors until it finds a common point on the best chain, but there's no depth limit. If the client's on a deep fork (think something crazy like a 10k block reorg during a network partition—rare but not impossible), you're gonna be iterating for quite a while.

Consider adding a max depth check or at least a log warning if the search goes beyond a reasonable threshold (say, 2160 blocks). That way you don't accidentally DoS yourself when some cheeky peer sends a dodgy pointer.

🔎 Proposed safeguard with depth limit
+        const MAX_ROLLBACK_DEPTH: usize = 2160; // ~12 hours on mainnet
+        let mut depth = 0;
         for header in store.ancestors(header) {
+            depth += 1;
+            if depth > MAX_ROLLBACK_DEPTH {
+                tracing::warn!("rollback search exceeded max depth of {}", MAX_ROLLBACK_DEPTH);
+                anyhow::bail!("rollback search exceeded maximum depth");
+            }
             if store.load_from_best_chain(&header.point()).is_some() {
                 return Ok(Some(ResponderAction::RollBackward(header.point(), tip)));
             }
         }

184-206: Vec::contains in a loop? That's O(n²) territory, chief.

You've sorted points (line 192), which is brilliant prep work, but then you're doing points.contains(&point) at line 198 inside the ancestor walk. That's O(n) for each header you check—like searching through a phone book by reading every entry instead of using the index.

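Since points is already sorted, swap contains() for binary_search() and you'll drop that to O(log n). One caveat: binary_search() assumes ascending order, so double-check which way line 192 sorts (a descending sort needs binary_search_by() with a reversed comparison). Much snappier when syncing with peers that send large intersection lists.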

🔎 Use binary_search for O(log n) lookup
     for header in store.ancestors(header) {
         let point = header.point();
-        if points.contains(&point) {
+        if points.binary_search(&point).is_ok() {
             return Ok(ResponderAction::IntersectFound(point, tip));
         }
         if Some(&point) < points.last() {
             break;
         }
     }
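As a self-contained illustration of the lookup pattern, here is a minimal sketch that uses plain `u64` slots in place of `Point` and sorts ascending. Both choices are assumptions for the example; the real code walks `store.ancestors(header)` and may sort in the other direction.

```rust
/// Walk `ancestors_newest_first` and return the first entry that also appears
/// in `points`. Slots stand in for chain points in this sketch.
pub fn find_intersection(ancestors_newest_first: &[u64], points: &[u64]) -> Option<u64> {
    // Sort ascending so that `binary_search` is valid.
    let mut sorted = points.to_vec();
    sorted.sort_unstable();
    // Oldest requested point; an empty request can never intersect.
    let oldest = *sorted.first()?;

    for &candidate in ancestors_newest_first {
        // O(log n) membership test instead of a linear `contains`.
        if sorted.binary_search(&candidate).is_ok() {
            return Some(candidate);
        }
        // Once we are older than every requested point, stop walking.
        if candidate < oldest {
            break;
        }
    }
    None
}
```

If the list is kept in descending order instead, the equivalent lookup is `sorted.binary_search_by(|probe| candidate.cmp(probe))`, and the early-exit check compares against `last()` rather than `first()`.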

234-309: State machine transitions look good—FSM is tight.

The ProtocolState implementation handles all the valid state transitions properly. The ensure! guards (lines 282, 286) preventing AwaitReply/RollForward after intersection are spot-on: they are runtime checks rather than type-level ones, but they keep the state machine out of states the spec doesn't allow.

One tiny polish: the bail! error messages at lines 261 and 306 could include the message variant name in the error (currently just showing the debug output). Makes logs easier to grep when troubleshooting.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d2082d and 760aa62.

📒 Files selected for processing (3)
  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru-protocols/src/protocol/check.rs
  • crates/amaru-protocols/src/protocol/mod.rs
🧰 Additional context used
🧠 Learnings (12)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.
📚 Learning: 2025-12-16T21:32:37.668Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/amaru-network/src/handshake/tests.rs:40-47
Timestamp: 2025-12-16T21:32:37.668Z
Learning: In Rust, shadowing a binding with a new let does not drop the previous binding until the end of the scope. All shadowed bindings in a scope are dropped in reverse-declaration order when the scope ends. Therefore, multiple let _guard = register_*() calls will keep all guards alive until the end of the function (or the surrounding scope). When reviewing code, be mindful that resources tied to shadowed bindings persist longer than the most recent binding; to release early, constrain the lifetime in an inner block or explicitly drop guards when appropriate.

Applied to files:

  • crates/amaru-protocols/src/protocol/check.rs
  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
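The shadowing behaviour described in this learning can be observed with a small drop-logging guard. The `Guard` type and `shadowing_demo` function below are hypothetical names made up for the sketch, not anything from the Amaru tests.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical guard that records its own drop in a shared log.
pub struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Shadow `_guard` twice and report the order in which things happen.
pub fn shadowing_demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _guard = Guard { name: "first", log: log.clone() };
        // Shadowing does NOT drop the first guard here.
        let _guard = Guard { name: "second", log: log.clone() };
        log.borrow_mut().push("end of scope".to_string());
    } // Both guards are dropped here, newest first.
    let events = log.borrow().clone();
    events
}
```

Calling `shadowing_demo()` yields `["end of scope", "drop second", "drop first"]`: both guards outlive the last statement of the block and are released in reverse declaration order.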
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru-protocols/src/protocol/mod.rs
📚 Learning: 2025-08-08T14:34:06.105Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/minicbor-extra/src/lib.rs:50-55
Timestamp: 2025-08-08T14:34:06.105Z
Learning: Project uses Rust 1.88 stable; using Result::is_err_and is acceptable and should not be flagged as unstable. In particular, crates/minicbor-extra/src/lib.rs relies on is_err_and when checking end_of_input; future reviews should not suggest replacing it for stability reasons.

Applied to files:

  • crates/amaru-protocols/src/protocol/mod.rs
📚 Learning: 2025-04-22T09:18:19.893Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-08-08T14:46:53.013Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/pool_params.rs:107-116
Timestamp: 2025-08-08T14:46:53.013Z
Learning: In crates/amaru-kernel/src/pool_params.rs, when serializing Relay::SingleHostAddr IPv6 to text, the project intentionally reverses each 4-byte chunk before constructing std::net::Ipv6Addr. This matches cardano-ledger’s IPv6 representation (four little-endian Word32 chunks). Do not “simplify” by passing the raw 16 bytes directly to Ipv6Addr::from; that would break ledger compatibility.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
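The chunk reversal this learning refers to amounts to something like the following. It is a minimal sketch with a hypothetical function name (`ledger_ipv6`), not the actual code in `pool_params.rs`.

```rust
use std::net::Ipv6Addr;

/// Convert 16 bytes stored as four little-endian 32-bit words into an
/// `Ipv6Addr` by reversing each 4-byte chunk before construction.
pub fn ledger_ipv6(bytes: [u8; 16]) -> Ipv6Addr {
    let mut fixed = bytes;
    for chunk in fixed.chunks_exact_mut(4) {
        chunk.reverse();
    }
    Ipv6Addr::from(fixed)
}
```

Passing the raw bytes straight to `Ipv6Addr::from` would skip the per-word byte swap and produce a different address for the same stored value.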
📚 Learning: 2025-09-06T09:16:25.025Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 435
File: crates/amaru/src/bin/amaru/cmd/convert_ledger_state.rs:113-116
Timestamp: 2025-09-06T09:16:25.025Z
Learning: In cardano-node serialized ledger state CBOR encoding, indefinite-length structures may be terminated with 0xFF "break" markers. The current code in convert_ledger_state.rs unconditionally strips the last byte (bytes[p..bytes.len() - 1]), which could corrupt data if the trailing byte is not actually a CBOR break marker.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
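One way to avoid the unconditional strip is to look at the last byte first. The helper below is a hypothetical sketch, not the code from `convert_ledger_state.rs`, and it is only a partial safeguard: a trailing `0xFF` could still be ordinary data, so a full fix would need to track whether an indefinite-length structure is actually open.

```rust
/// Strip a trailing CBOR "break" byte (0xFF) only when one is present;
/// otherwise return the input unchanged.
pub fn strip_break(bytes: &[u8]) -> &[u8] {
    match bytes.split_last() {
        Some((&0xFF, rest)) => rest,
        _ => bytes,
    }
}
```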
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-04-20T17:56:48.565Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
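The Origin-versus-Specific check from this learning can be sketched as below. The `Point` enum here is a simplified stand-in and `load_tip_header` is a hypothetical helper; the real chain store API is not reproduced.

```rust
/// Simplified stand-in for the kernel's `Point` type.
#[derive(Debug, Clone, PartialEq)]
pub enum Point {
    Origin,
    Specific(u64, [u8; 32]),
}

/// Load the header for `tip`, treating `Origin` as "no header yet" instead of
/// an error. `load` is a hypothetical lookup by header hash.
pub fn load_tip_header<F>(tip: &Point, load: F) -> Result<Option<Vec<u8>>, String>
where
    F: Fn(&[u8; 32]) -> Option<Vec<u8>>,
{
    match tip {
        // Fresh node: nothing to load, and that is fine.
        Point::Origin => Ok(None),
        // Known tip: a missing header is a real error.
        Point::Specific(_, hash) => load(hash)
            .map(Some)
            .ok_or_else(|| "tip header not found".to_string()),
    }
}
```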
📚 Learning: 2025-08-08T14:43:20.218Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (rust)
🔇 Additional comments (6)
crates/amaru-protocols/src/protocol/mod.rs (3)

18-22: Solid module organization, mate!

The new check and miniprotocol modules with their re-exports create a clean separation of concerns. It's like organizing your vinyl collection by genre—everything has its place and it's easy to find what you need.


24-29: Nice and clean input abstraction!

The Input<L, R> enum is a ripper—simple, generic, and with all the right derives for serialization. It's the protocol equivalent of a Swiss Army knife: versatile without being overcomplicated.


101-108: opposite() is spot on!

This const method does exactly what it says on the tin—flips the role like a pancake on Sunday morning. Implementation is clean and correct.

crates/amaru-protocols/src/chainsync/responder.rs (3)

33-68: Clean struct setup—no dramas here.

The Responder struct and its constructor are well-structured. Initializing the pointer to Point::Origin is spot-on for a fresh connection, and returning the initial state alongside the responder follows good FSM patterns.


70-122: StageState impl is solid as a rock.

Both local and network methods handle their respective inputs cleanly, with proper error context propagation. The logging on Done is a nice touch for observability.


311-373: Test coverage hits all the key transitions—nice work.

The protocol state machine test is comprehensive, covering all the major paths: intersection handling, await/reply cycles, and forward/backward rolls. Using ProtoSpec to validate role-based protocol behavior is a solid approach.

(Note: line 333 has the same hardcoded variant: 6, but since it's test data and not the production issue, it's less of a drama.)

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
crates/amaru-protocols/src/chainsync/responder.rs (2)

149-160: Misleading error message when loading pointer header.

Line 153 says "tip not found" but you're actually loading the header by pointer.hash(), not the tip. If this fails, it means the pointer header isn't in the store—bit like searching for a save file that got corrupted.

Change it to something like "pointer header not found" or "client pointer unknown" so future-you (or future-teammate) doesn't spend an hour chasing the wrong ghost.

🔎 Suggested fix
         let header = store
             .load_header(&pointer.hash())
-            .ok_or_else(|| anyhow::anyhow!("tip not found"))?;
+            .ok_or_else(|| anyhow::anyhow!("client pointer header not found"))?;

Based on past review comments.


166-174: Hardcoded era variant will break multi-era chains.

The variant: 6 at line 168 assumes all headers are Conway era. But if your chain's got headers from Shelley, Allegra, Mary, Alonzo, or Babbage, you're tagging them wrong—like putting a PS5 game in a PS4 case. Peers expecting accurate era markers will chuck a wobbly.

You need to extract the era from the header metadata (likely in the CBOR or pallas Header structure) and map it to the correct variant. Same issue pops up in client_protocol.rs:321 according to past reviews, so might be worth a systematic fix across the codebase.

Run this to find all the hardcoded variant usages in chain-sync code:

#!/bin/bash
# Search for hardcoded variant: 6 in chainsync-related code
rg -n "variant:\s*6" crates/amaru-protocols/src/chainsync/ crates/amaru/src/stages/consensus/ --type rust

Based on past review comments.
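A minimal sketch of what a dynamic mapping might look like, assuming an `Era` enum is available from the header (hypothetical here) and that variants are numbered in era order. The review text only pins down Byron as variant 0 and Conway as variant 6; the values in between are an assumption that should be checked against the protocol spec.

```rust
/// Hypothetical era tag; how it is obtained from a header is not shown.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Era {
    Byron,
    Shelley,
    Allegra,
    Mary,
    Alonzo,
    Babbage,
    Conway,
}

/// Map an era to the header variant number used on the wire
/// (assumed to follow era order, Byron = 0 through Conway = 6).
pub fn header_variant(era: Era) -> u8 {
    match era {
        Era::Byron => 0,
        Era::Shelley => 1,
        Era::Allegra => 2,
        Era::Mary => 3,
        Era::Alonzo => 4,
        Era::Babbage => 5,
        Era::Conway => 6,
    }
}
```

An exhaustive `match` means adding a future era to the enum becomes a compile error at this site instead of a silently mislabelled header.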

🧹 Nitpick comments (2)
crates/amaru-protocols/src/protocol/mod.rs (1)

26-31: Mate, Input vs Inputs might trip folks up.

You've got Input<L, R> here (generic over both types) and Inputs<L> from miniprotocol (hardcoded Network(HandlerMessage)). They serve different purposes but the similar names could cause confusion. Maybe add a doc comment explaining when to use which, or consider renaming one to be more distinctive?

crates/amaru-protocols/src/protocol/check.rs (1)

91-149: Complex validation logic - working but could use some love!

The main validation loop handles both local and network messages with careful role checking. The network path allows implementations to combine request-response into a single atomic transition, which is clever.

However, this logic is pretty dense. When a state sends a response (lines 132-140), the checker validates against the spec's expected target state (`to`) rather than the state machine's intermediate state. This is intentional (allows atomic request-response), but a comment explaining this design choice would help future maintainers understand the abstraction.

💡 Optional: Add explanatory comment

Consider adding a comment around line 132 explaining why we look up transitions from `to` (the spec's target state) rather than from the state machine's intermediate state:

// When a response is sent, we validate the final state rather than intermediate state.
// This allows implementations to handle request-response atomically, even if the spec
// models it as two separate transitions.
if let Some(send) = send {
    ...
}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 760aa62 and 81e6956.

📒 Files selected for processing (6)
  • crates/amaru-protocols/src/chainsync/initiator.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru-protocols/src/lib.rs
  • crates/amaru-protocols/src/protocol/check.rs
  • crates/amaru-protocols/src/protocol/miniprotocol.rs
  • crates/amaru-protocols/src/protocol/mod.rs
🧰 Additional context used
🧠 Learnings (14)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/amaru-protocols/src/chainsync/initiator.rs
  • crates/amaru-protocols/src/protocol/miniprotocol.rs
📚 Learning: 2025-08-08T14:34:06.105Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/minicbor-extra/src/lib.rs:50-55
Timestamp: 2025-08-08T14:34:06.105Z
Learning: Project uses Rust 1.88 stable; using Result::is_err_and is acceptable and should not be flagged as unstable. In particular, crates/minicbor-extra/src/lib.rs relies on is_err_and when checking end_of_input; future reviews should not suggest replacing it for stability reasons.

Applied to files:

  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/amaru-protocols/src/protocol/miniprotocol.rs
📚 Learning: 2025-12-16T21:32:37.668Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/amaru-network/src/handshake/tests.rs:40-47
Timestamp: 2025-12-16T21:32:37.668Z
Learning: In Rust, shadowing a binding with a new let does not drop the previous binding until the end of the scope. All shadowed bindings in a scope are dropped in reverse-declaration order when the scope ends. Therefore, multiple let _guard = register_*() calls will keep all guards alive until the end of the function (or the surrounding scope). When reviewing code, be mindful that resources tied to shadowed bindings persist longer than the most recent binding; to release early, constrain the lifetime in an inner block or explicitly drop guards when appropriate.

Applied to files:

  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru-protocols/src/lib.rs
  • crates/amaru-protocols/src/chainsync/initiator.rs
  • crates/amaru-protocols/src/protocol/miniprotocol.rs
  • crates/amaru-protocols/src/protocol/check.rs
📚 Learning: 2025-04-22T09:18:19.893Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-08-08T14:46:53.013Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/pool_params.rs:107-116
Timestamp: 2025-08-08T14:46:53.013Z
Learning: In crates/amaru-kernel/src/pool_params.rs, when serializing Relay::SingleHostAddr IPv6 to text, the project intentionally reverses each 4-byte chunk before constructing std::net::Ipv6Addr. This matches cardano-ledger’s IPv6 representation (four little-endian Word32 chunks). Do not “simplify” by passing the raw 16 bytes directly to Ipv6Addr::from; that would break ledger compatibility.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-09-06T09:16:25.025Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 435
File: crates/amaru/src/bin/amaru/cmd/convert_ledger_state.rs:113-116
Timestamp: 2025-09-06T09:16:25.025Z
Learning: In cardano-node serialized ledger state CBOR encoding, indefinite-length structures may be terminated with 0xFF "break" markers. The current code in convert_ledger_state.rs unconditionally strips the last byte (bytes[p..bytes.len() - 1]), which could corrupt data if the trailing byte is not actually a CBOR break marker.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
  • crates/amaru-protocols/src/protocol/check.rs
📚 Learning: 2025-04-20T17:56:48.565Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/mod.rs:0-0
Timestamp: 2025-04-20T17:56:48.565Z
Learning: When bootstrapping a node in Amaru, it's important to handle the case where the tip is Origin (for a fresh node). Instead of unconditionally trying to load a header from the chain store, check if the tip is Origin or Specific first, and handle each case appropriately.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-08-08T14:43:20.218Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/point.rs:45-52
Timestamp: 2025-08-08T14:43:20.218Z
Learning: In crates/amaru-kernel/src/point.rs, Point::Specific stores a header hash that is always exactly 32 bytes by project invariant. Therefore, converting it with `impl From<&Point> for Hash<32>` without a runtime length check is acceptable; future reviews should not request an error-returning check inside this `From` impl. If validation is ever desired, it should be done via `TryFrom` or upstream constructors.

Applied to files:

  • crates/amaru-protocols/src/chainsync/responder.rs
📚 Learning: 2025-09-05T17:30:55.869Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Applied to files:

  • crates/amaru-protocols/src/chainsync/initiator.rs
📚 Learning: 2025-09-29T16:44:14.807Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/protocol.rs:94-106
Timestamp: 2025-09-29T16:44:14.807Z
Learning: In the amaru-network crate protocol.rs file, the correct Cardano mini-protocol ID assignments are: PROTO_N2N_KEEP_ALIVE = 8 and PROTO_N2N_PEER_SHARE = 10, as verified against the network specification by the maintainer.

Applied to files:

  • crates/amaru-protocols/src/protocol/miniprotocol.rs
🧬 Code graph analysis (3)
crates/amaru-protocols/src/protocol/mod.rs (2)
crates/amaru-protocols/src/protocol/check.rs (1)
  • check (63-150)
crates/amaru-protocols/src/protocol/miniprotocol.rs (2)
  • miniprotocol (109-190)
  • outcome (52-57)
crates/amaru-protocols/src/lib.rs (1)
crates/amaru/src/stages/consensus/forward_chain/client_protocol.rs (1)
  • tx_submission (356-365)
crates/amaru-protocols/src/protocol/check.rs (3)
crates/amaru-protocols/src/protocol/mod.rs (2)
  • opposite (104-109)
  • opposite (167-169)
crates/amaru-protocols/src/chainsync/responder.rs (3)
  • init (232-234)
  • network (102-129)
  • network (236-256)
crates/amaru-protocols/src/protocol/miniprotocol.rs (6)
  • init (67-67)
  • send (37-42)
  • result (44-49)
  • outcome (52-57)
  • network (68-71)
  • network (88-93)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (rust)
🔇 Additional comments (23)
crates/amaru-protocols/src/chainsync/responder.rs (6)

32-76: LGTM! Solid foundation for the responder protocol.

The responder() constructor and Responder struct are properly wired up. Starting at Point::Origin with Idle state is the right move—like spawning at the tutorial zone before you go adventuring. Clean initialization, no dramas.


78-130: Handlers look solid.

The local() and network() handlers are properly routing messages through the state machine. Good use of .context() for error breadcrumbs—makes debugging less of a treasure hunt.


177-199: Intersection logic is sound.

The intersect() function properly sorts points and walks the ancestor chain to find a common point. The early-break optimization at lines 194-196 is a nice touch—no point wandering into the ancient blocks if we've already passed the client's oldest point. It's like not checking the basement when you know your keys are on the kitchen counter.
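
For readers without the file open, the sort-then-walk approach with the early break might look roughly like the sketch below. It is an illustration only: the `Point` alias, the `(slot, hash)` tuple shape and the iterator of ancestors are assumptions made for this example, not the actual types in `responder.rs`.

```rust
/// Simplified point: (slot, hash). Real points carry a 32-byte header hash.
type Point = (u64, u8);

/// Find the most recent point on our chain that the client also knows.
/// `ancestors` yields our chain from the tip backwards (newest first).
fn intersect(
    mut client_points: Vec<Point>,
    ancestors: impl IntoIterator<Item = Point>,
) -> Option<Point> {
    // Sort ascending so the first entry is the client's oldest point.
    client_points.sort();
    let oldest_slot = client_points.first()?.0;
    for ancestor in ancestors {
        // Early break: nothing older than the client's oldest point can match.
        if ancestor.0 < oldest_slot {
            break;
        }
        // The sorted list allows a binary search per ancestor.
        if client_points.binary_search(&ancestor).is_ok() {
            return Some(ancestor);
        }
    }
    None
}
```

With a chain of slots 50, 40, 30, 20, 10 and client points at slots 30 and 45 (the latter on a different fork), the walk stops at slot 30 and never visits slots 20 or 10.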


201-225: State machine types are well-structured.

The enum hierarchy is clean. Having send_rollback flags in Idle and CanAwait states lets you track post-intersection rollback requirements—clever bit of state tracking that keeps the transitions clear.


227-302: Protocol state machine transitions are correct.

The ProtocolState implementation properly enforces the chain-sync protocol rules. The ensure! guards at lines 275 and 279 prevent sending AwaitReply or RollForward after an intersection—good defensive programming that catches protocol violations early rather than letting them propagate. It's like having guardrails on a mountain road: you'll know immediately if you've gone off-track.


304-385: Test coverage is comprehensive.

The test validates all protocol transitions and checks refinement against the initiator spec. This dual-side verification is solid—like having both player 1 and player 2 controllers plugged in to make sure the co-op mode actually works.

crates/amaru-protocols/src/chainsync/initiator.rs (6)

28-77: LGTM! Initiator setup is clean.

The Initiator struct and constructor are properly wired. Starting with upstream: None and me: StageRef::blackhole() makes sense—you don't have a handler until Initialize runs, like not having a save slot until you've actually started the game.


85-141: Handler implementation looks solid.

The local() handler maps control messages to actions, and network() processes protocol results. The contramap setup at lines 112-118 creates the handler reference properly. The eff.send() to the pipeline is fire-and-forget, which is appropriate here—you're just notifying the pipeline about what happened, not waiting for a response.


143-159: Intersection point algorithm is efficient.

The exponential spacing (1, 2, 4, 8...) is a clever approach—like binary search but for chain history. You'll find a common ancestor quickly without sending hundreds of points. The expect() at line 148 is acceptable since the best chain hash should always be valid by invariant; if it's not, that's a programming error that should crash early.
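
A minimal sketch of the exponential spacing, assuming the chain can be addressed by a plain offset from the tip. The function name `intersection_offsets` is hypothetical, and including the tip (offset 0) and the oldest header are choices made for this example; the real initiator works on chain store hashes rather than integers.

```rust
/// Offsets back from the tip at which to sample points: 0, 1, 2, 4, 8, ...
/// `chain_len` is the number of headers on our best chain (hypothetical input).
fn intersection_offsets(chain_len: u64) -> Vec<u64> {
    let mut offsets = Vec::new();
    if chain_len == 0 {
        return offsets;
    }
    // Always include the tip itself.
    offsets.push(0);
    let mut step = 1u64;
    while step < chain_len {
        offsets.push(step);
        // Double the distance each time; saturate to avoid overflow.
        step = step.saturating_mul(2);
    }
    // Include the oldest header so that a shared point can be found on any common chain.
    let oldest = chain_len - 1;
    if offsets.last() != Some(&oldest) {
        offsets.push(oldest);
    }
    offsets
}
```

The number of sampled points grows logarithmically with the chain length, which is why only a handful of points are needed even for long chains.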


161-185: State machine types are well-designed.

The enum hierarchy clearly separates actions (what we want to do), results (what we got from the network), and states (where we are in the protocol). Clean separation of concerns—like having different button mappings for different game modes.


187-238: Protocol state machine is correctly implemented.

The state transitions follow the chain-sync protocol spec. Nice touch at line 205 where finding an intersection automatically sends RequestNext—keeps the protocol moving forward without an extra round-trip. It's like auto-advancing dialogue in a cutscene: you already know what happens next, so why wait?


240-296: Test coverage is thorough.

The tests validate all protocol transitions and provide a public spec() function for cross-protocol verification. Making spec() public is smart—lets the responder tests check that both sides agree on the protocol, like having both players test the same multiplayer map.

crates/amaru-protocols/src/protocol/miniprotocol.rs (5)

29-57: Nice builder pattern, mate!

The Outcome struct with its builder methods is clean as a whistle. Consuming self and returning Self lets ya chain calls like a boss. Good stuff!
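
The consuming-builder shape being praised here can be illustrated with a small stand-in. The field names and the `new()` constructor below are assumptions for the example and may not match the real `Outcome` in `miniprotocol.rs`.

```rust
/// Hypothetical stand-in for the real `Outcome`; field names are assumptions.
#[derive(Debug, PartialEq)]
struct Outcome<S, R> {
    send: Option<S>,
    result: Option<R>,
}

impl<S, R> Outcome<S, R> {
    fn new() -> Self {
        Self { send: None, result: None }
    }

    /// Takes `self` by value and hands it back, which is what allows chaining.
    fn send(mut self, msg: S) -> Self {
        self.send = Some(msg);
        self
    }

    fn result(mut self, res: R) -> Self {
        self.result = Some(res);
        self
    }
}
```

A call such as `Outcome::new().send(msg).result(res)` then reads as one expression, with no intermediate mutable binding.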


59-73: Solid trait separation, champion!

The ProtocolState trait nicely isolates network protocol mechanics from decision-making logic. The asymmetry between network() returning Outcome (can emit results) and local() returning Option<WireMsg> (just sends) makes sense for the use case.


75-94: StageState trait looks ace!

The decision-making layer is well-structured. Both methods returning Option<Proto::Action> gives flexibility to produce actions or not. The #[expect(async_fn_in_trait)] is totally fine in modern Rust.


120-190: Crikey, this orchestration is well thought-out!

The async handler threads protocol and stage state through the flow beautifully. Love how you've separated network handling, decision-making, and action emission into clear phases. The error handling with or_terminate is spot on for protocol violations.


135-135: NonEmptyBytes::encode panic is intentional validation, not a trap.

The .expect("CBOR item should not be empty") on line 62 of bytes.rs is deliberate. Well-formed protocol messages should never encode to empty CBOR, and this panic catches that invariant violation early. No changes needed here, mate.

crates/amaru-protocols/src/protocol/mod.rs (2)

18-24: Good modular split, legend!

Breaking out check and miniprotocol into separate modules and re-exporting the public API is clean architecture. Makes the codebase way easier to navigate.


103-110: Lovely little helper, that!

The opposite() method is dead simple and useful. Making it const fn is a nice touch for compile-time evaluation. Good on ya!

crates/amaru-protocols/src/lib.rs (1)

17-26: Clean public API surface!

All the module exports look properly organized. This gives consumers a nice clear entry point to the protocol implementations.

crates/amaru-protocols/src/protocol/check.rs (3)

35-61: Intentional fail-fast on duplicates - ace!

The i() and r() methods panic on duplicate transitions, which is perfect for a test helper. This catches programming errors early when building protocol specs. The panic messages are descriptive too.


77-89: Initial state validation looks solid!

The init checks properly enforce that Initiators send or produce results, while Responders stay quiet. This matches the protocol semantics.


152-171: Refinement checking looks proper!

The assert_refines() method correctly validates that a detailed spec refines an abstract one via a surjection. The state projection logic is sound.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/amaru-consensus/src/consensus/stages/fetch_block.rs`:
- Around line 56-66: Replace the .await.unwrap_or_default() on the
eff.base().call(...) call with a match on the Result so you preserve and log the
actual error before falling back to default Blocks; specifically, match the
result of eff.base().call(&manager, Duration::from_secs(5), ...).await, on Ok
assign blocks as before, on Err(e) emit a tracing::warn!(?e, ?point, "block
fetch failed") and then use Blocks::default(), then keep the existing check
using blocks.blocks.into_iter().next() to send
ValidationFailed::new(&peer, ConsensusError::FetchBlockFailed(point)) when no
block is present.
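
The shape of that change can be sketched in isolation. Everything below is a stand-in: `Blocks`, `FetchOutcome` and `first_block` are hypothetical names for this example, and a `Vec<String>` replaces the `tracing::warn!` call so the snippet needs no external crates.

```rust
/// Hypothetical stand-in for the real `Blocks` type used by `fetch_block.rs`.
#[derive(Debug, Default, PartialEq)]
struct Blocks {
    blocks: Vec<Vec<u8>>,
}

#[derive(Debug, PartialEq)]
enum FetchOutcome {
    Block(Vec<u8>),
    Failed,
}

/// Keep the error visible instead of collapsing it with `unwrap_or_default()`.
fn first_block<E: std::fmt::Debug>(
    result: Result<Blocks, E>,
    log: &mut Vec<String>,
) -> FetchOutcome {
    let blocks = match result {
        Ok(blocks) => blocks,
        Err(e) => {
            // The real stage would emit a tracing warning here; a Vec stands in for the log.
            log.push(format!("block fetch failed: {e:?}"));
            Blocks::default()
        }
    };
    // Same downstream check as before: no first block means the fetch failed.
    match blocks.blocks.into_iter().next() {
        Some(block) => FetchOutcome::Block(block),
        None => FetchOutcome::Failed,
    }
}
```

The fallback behaviour is unchanged; the only difference is that the cause of an `Err` is recorded before it is discarded.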
♻️ Duplicate comments (6)
crates/amaru-consensus/src/consensus/stages/fetch_block.rs (1)

47-48: The hardcoded 5-second timeout needs calibration—previously flagged.

Already noted in a past review: Cardano uses a ~10-second grace period plus a 60-second protocol timeout. The 5-second wall here is aggressive and could cause false negatives on slower networks. Making this configurable would be the go.

crates/amaru-protocols/src/manager.rs (1)

180-199: FetchBlocks routing works – just a wee note on the error path.

The happy path forwarding to the connection stage is spot on. For the error case at lines 193-196, returning Blocks::default() is functional but means callers can't distinguish "peer has no blocks" from "peer doesn't exist." It's like getting an empty loot drop without knowing if the chest was locked or just empty, yeah?

This was flagged in a prior review already, so consider it a gentle reminder if you fancy improving error signaling down the track. Not blocking by any means.

crates/pure-stage/src/effect.rs (2)

28-33: Duplicate import of parking_lot::Mutex on riscv32.

G'day mate! This one's been chatted about before - the unconditional import on line 28 already covers riscv32, so the cfg-gated duplicate on lines 29-30 is like bringing two umbrellas to the same rainstorm. Works fine, but a bit redundant, yeah?


205-211: Nested call prevention panic - already discussed.

This one's like déjà vu in The Matrix - we've been here before! Per rkuhn's call, keeping the runtime guard is the play for now. Future-proofs without over-engineering. All good.

crates/pure-stage/src/simulation/running/mod.rs (2)

677-686: Past concern about call error handling appears unaddressed.

Hey mate, there was a previous review comment flagging that any error from resume_call_send_internal terminates the sim with Blocked::Terminated(from). The suggestion was that if the callee is already terminated, it should probably be non-fatal (caller can time out instead).

Is this intentional design now, or is the richer error handling still planned? Like, in most RPGs, if an NPC despawns before you finish your conversation, the game doesn't crash – it just moves on.


226-236: Schedule wakeup will panic if Effect::Wait gets a Duration::ZERO.

Right, here's the rub: ScheduleIds::next_at(instant) doesn't actually guarantee anything about the instant being in the future—it just wraps whatever instant you give it. If duration is Duration::ZERO, you're calling next_at(now), which returns a ScheduleId with time() == now. Then the assert in schedule_wakeup goes "nope, mate" and panics because now > now is false.

The fix would be to bump the scheduled time forward by at least one time unit when duration is zero, or validate that durations are never zero before reaching this code path. Right now, you're still sitting on a ticking bomb—just waiting for some poor soul to pass Duration::ZERO to Effect::Wait and watch everything blow up like a speedrun reset.
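
One possible shape for the "bump forward" option, sketched with `std::time` types. The simulation may use its own clock type, and treating one nanosecond as "one time unit" is an assumption of this example; `wakeup_at` is a hypothetical helper name.

```rust
use std::time::{Duration, Instant};

/// Compute a wakeup instant that is strictly after `now`, even for a zero duration.
/// The one-nanosecond minimum is an assumption about what "one time unit" means.
fn wakeup_at(now: Instant, duration: Duration) -> Instant {
    // Clamp the duration from below so that `now + duration > now` always holds.
    let duration = duration.max(Duration::from_nanos(1));
    now + duration
}
```

Non-zero durations pass through unchanged, so only the degenerate case is affected.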

🧹 Nitpick comments (12)
crates/amaru-consensus/src/consensus/stages/fetch_block.rs (2)

25-30: Consider a named struct for better readability—optional refactor.

Four-element tuples start feeling a bit like trying to remember the Dark Souls boss order without a guide. A named struct would make the stage signature self-documenting:

struct FetchBlockState {
    downstream: StageRef<ValidateBlockEvent>,
    failures: StageRef<ValidationFailed>,
    errors: StageRef<ProcessingFailed>,
    manager: StageRef<ManagerMessage>,
}

Not critical—just a nice-to-have for when this inevitably grows or someone new dives into the codebase.


121-152: Test covers the happy path—consider adding failure scenario coverage.

G'day, the sunny-day test looks spot on! But there's no test for when the block fetch fails (empty blocks response or timeout). Given the new error handling on lines 58-66, a test verifying that a ValidationFailed wrapping ConsensusError::FetchBlockFailed gets sent to the failures channel would be bonzer.

I know the PR description mentions testing will happen when stages are complete, so this might be intentional deferral. Just flagging it for tracking!

Would you like me to draft a test case for the failure path, or should we track this as a TODO for the broader testing effort?

crates/amaru-protocols/src/protocol/mod.rs (2)

33-34: TODO tracked for network timeout tuning.

The 1-second timeout is a fair starting point - like choosing your loadout before knowing the map. When you get to fine-tuning this, consider if different operations need different timeouts, or if this should be configurable.

Want me to open an issue to track finding the right value for this timeout?


26-31: Naming heads-up: Input<L, R> vs Inputs<L> could cause confusion down the line.

The Input<L, R> enum here isn't actually used anywhere yet, but you've got Inputs<L> from miniprotocol that is actively in the wild. Both are doing similar work but with different variant names (Remote vs Network), which is a bit like having two controllers for the same console—someone's gonna grab the wrong one eventually.

If this Input<L, R> is earmarked for future protocols, maybe slap a doc comment on it explaining the intended use case and how it differs from Inputs. Something like "use this for protocol steps, not miniprotocol handlers" would save a teammate from pulling their hair out later. Otherwise, if it's not needed, it's just dead weight.

crates/amaru-protocols/src/tx_submission/initiator.rs (2)

66-66: Consider documenting the protocol constant.

MAX_REQUESTED_TX_IDS = 10 is used for validation but lacks a comment explaining why this value was chosen. Is it a Cardano protocol spec requirement, or a tuning parameter? A quick comment would help future maintainers understand if this is set in stone or adjustable.

📝 Suggested documentation
-const MAX_REQUESTED_TX_IDS: u16 = 10;
+/// Maximum number of transaction IDs that can be requested in a single batch.
+/// This limit helps control memory usage and network overhead per request.
+const MAX_REQUESTED_TX_IDS: u16 = 10;

369-377: Unnecessary clone on tx_ids.

You're cloning tx_ids just to iterate over it, but you could avoid the allocation by restructuring slightly. Not a big deal—it's like that extra loading screen that adds a few milliseconds—but worth tidying up if you're in the area.

♻️ Avoid the clone
 fn get_next_tx_ids<Tx: Send + Debug + Sync + 'static>(
     &mut self,
     mempool: &dyn TxSubmissionMempool<Tx>,
     required_next: u16,
 ) -> anyhow::Result<Vec<(TxId, u32)>> {
     let tx_ids = mempool.tx_ids_since(self.next_seq(), required_next);
-    let result = tx_ids
-        .clone()
-        .into_iter()
-        .map(|(tx_id, tx_size, _)| (tx_id, tx_size))
+    let result = tx_ids
+        .iter()
+        .map(|(tx_id, tx_size, _)| (*tx_id, *tx_size))
         .collect();
     self.update(tx_ids);
     Ok(result)
 }

Iterating by reference yields borrowed tuple fields, so the map has to copy them out. This assumes TxId is Copy; if it is not, clone the id inside the map instead of dereferencing it.
crates/amaru-protocols/src/tx_submission/mod.rs (2)

97-101: Magic numbers in ResponderParams could use context.

ResponderParams::new(2, 3) has me scratching my head like trying to remember what that cryptic item does in Dark Souls. What do 2 and 3 represent here? A quick comment or named constants would help future you (or any of us) understand the intent without diving into the ResponderParams definition.

📝 Suggested clarification
-        let (state, stage) = responder::TxSubmissionResponder::new(
-            muxer.clone(),
-            ResponderParams::new(2, 3),
-            origin,
-        );
+        // Responder configuration: request up to 2 blocking batches
+        // and 3 non-blocking batches of transaction IDs
+        let (state, stage) = responder::TxSubmissionResponder::new(
+            muxer.clone(),
+            ResponderParams::new(2, 3),
+            origin,
+        );

Or consider extracting to named constants if these values are used elsewhere.


116-125: Consider documenting or naming the max_buffer value.

max_buffer: 5760 looks like a protocol-specific limit. If this comes from the Cardano spec, a brief comment or a named constant would make it self-documenting. Not a blocker though—just one of those "would be nice" things.

crates/pure-stage/src/effect.rs (1)

959-965: Consider using native JSON boolean for call field.

Small nitpick here - format!("{:?}", call) gives you a string "true" instead of a proper JSON boolean true. It's like serving fish and chips wrapped in newspaper when you've got proper plates. Works, but JSON consumers might expect the native type.

♻️ Proposed fix
             serde_json::json!({
                 "type": "send",
                 "from": from,
                 "to": to,
-                "call": format!("{:?}", call),
+                "call": call,
                 "msg": format!("{msg}"),
             })
crates/pure-stage/src/simulation/running/mod.rs (3)

202-220: The return value might be misleading when wakeups schedule new wakeups.

G'day! The logic here checks self.scheduled.len() != initial_scheduled_nb to determine if wakeups were performed. But here's the thing – like a game of whack-a-mole, if a wakeup callback schedules a new wakeup at the same time slot, the length could remain unchanged even though work was done.

Consider tracking whether wakeups was non-empty instead:

-        // return true if any wakeups were performed
-        self.scheduled.len() != initial_scheduled_nb
+        // return true if any wakeups were performed
+        !wakeups.is_empty()

This would be more semantically accurate – "did we wake anything up?" rather than "did the queue size change?".
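
To see why the length comparison can mislead, here is a toy model of the situation. The `Sim` struct, its `BTreeMap` queue and the `run_due` method are all invented for this illustration and are much simpler than the real simulation; the point is only to compare the two ways of answering "was anything woken?".

```rust
use std::collections::BTreeMap;

/// Toy scheduler: maps a time to the names of tasks due at that time.
struct Sim {
    scheduled: BTreeMap<u64, Vec<&'static str>>,
}

impl Sim {
    /// Run everything due at or before `now`.
    /// Returns (queue_length_changed, any_task_woken) so both checks can be compared.
    fn run_due(&mut self, now: u64, reschedule_at: Option<u64>) -> (bool, bool) {
        let initial_len = self.scheduled.len();
        // `split_off` returns the entries with key >= now + 1; what stays behind is due.
        let later = self.scheduled.split_off(&(now + 1));
        let due = std::mem::replace(&mut self.scheduled, later);
        for tasks in due.values() {
            for task in tasks {
                // A woken task may schedule a follow-up wakeup.
                if let Some(at) = reschedule_at {
                    self.scheduled.entry(at).or_default().push(*task);
                }
            }
        }
        (self.scheduled.len() != initial_len, !due.is_empty())
    }
}
```

With one entry due at time 1 that reschedules itself for time 2, the queue length is 1 before and after, so the length check reports no work while the emptiness check reports that a task ran.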


886-943: Note the TODO for external effects cleanup.

Solid termination logic here, mate! The recursive child termination and sender resumption is clean. However, I noticed the TODO at lines 891-893:

// TODO(network):
// - add kill switch to scheduled external effects to terminate them
// - record source stage for scheduled messages to remove them

This means terminated stages might still have scheduled messages that could be delivered after termination, or external effects that continue running. For now this is probably fine – like a game where NPCs finish their current animation even after you've left the area – but worth tracking for a future cleanup pass.

Want me to open an issue to track this cleanup work?


485-499: Consider addressing the TODO about clock ceiling.

Hey, spotted this TODO at line 485:

// TODO: shouldn't this have a clock ceiling?

The run_one_step method currently calls skip_to_next_wakeup(None) which has no time limit. If there's a use case where you want bounded time progression per step, this could cause issues – like a game running at uncapped FPS when you only wanted 60.

Want me to help sketch out what a clock ceiling parameter might look like here, or open an issue to track this?

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 003fd8c and e0c1886.

📒 Files selected for processing (9)
  • crates/amaru-consensus/src/consensus/stages/fetch_block.rs
  • crates/amaru-protocols/src/manager.rs
  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/amaru-protocols/src/tx_submission/initiator.rs
  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/pure-stage/src/effect.rs
  • crates/pure-stage/src/serde.rs
  • crates/pure-stage/src/simulation/running/mod.rs
  • crates/pure-stage/tests/functional.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/pure-stage/src/serde.rs
🧰 Additional context used
🧠 Learnings (30)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:427-433
Timestamp: 2025-08-18T08:10:32.640Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp, so unimplemented methods like set_proposals_roots and set_constitution are intentionally left as placeholders until the revamp is complete.
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.
📚 Learning: 2025-09-05T17:30:55.869Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/fetch_block.rs
  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/amaru-protocols/src/manager.rs
  • crates/pure-stage/src/effect.rs
  • crates/amaru-protocols/src/tx_submission/initiator.rs
📚 Learning: 2025-12-28T19:26:35.354Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru-protocols/src/blockfetch/mod.rs:151-173
Timestamp: 2025-12-28T19:26:35.354Z
Learning: In crates/amaru-protocols/src/blockfetch/mod.rs, the blockfetch initiator uses .expect() when popping from the request queue on NoBlocks and Done results. These are intentional fail-fast assertions: the protocol state machine guarantees the queue is non-empty when these messages arrive, so an empty queue indicates a protocol violation. A misbehaving peer triggers an erroneous protocol transition that will close the connection (supervision to be implemented in a future PR). This follows the project's fail-fast philosophy for protocol invariants.
<!--

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/fetch_block.rs
  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/amaru-protocols/src/manager.rs
  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/amaru-protocols/src/tx_submission/initiator.rs
📚 Learning: 2025-12-28T19:39:16.476Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru/src/stages/mod.rs:233-239
Timestamp: 2025-12-28T19:39:16.476Z
Learning: In crates/amaru/src/stages/mod.rs, the network.preload() operation for manager stage has a mailbox capacity of 10 messages. When preloading peers at startup, breaking on the first failure (after potentially filling the mailbox) is intentional design—the mailbox cannot hold more than 10 messages, so continuing to attempt preloads after a failure would be futile and would break the application.

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/fetch_block.rs
  • crates/amaru-protocols/src/manager.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-04-22T09:18:19.893Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/fetch_block.rs
📚 Learning: 2025-06-24T06:36:05.931Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 295
File: crates/amaru-consensus/src/consensus/store_header.rs:52-53
Timestamp: 2025-06-24T06:36:05.931Z
Learning: In crates/amaru-consensus, flood prevention for duplicate invalid headers should be implemented in the validate_header stage or database layer, not in the store_header stage, since store_header runs before validation and cannot determine header validity.

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/fetch_block.rs
📚 Learning: 2025-08-23T15:44:44.318Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 392
File: crates/ouroboros-traits/src/is_header/fake.rs:52-88
Timestamp: 2025-08-23T15:44:44.318Z
Learning: In the Amaru project, FakeHeader in crates/ouroboros-traits/src/is_header/fake.rs is specifically designed for testing purposes, and panic behavior (like copy_from_slice() panicking on malformed hex) is the preferred approach rather than graceful error handling, as it helps identify test data issues quickly during development.

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/fetch_block.rs
  • crates/amaru-protocols/src/tx_submission/mod.rs
📚 Learning: 2025-12-16T21:32:37.668Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/amaru-network/src/handshake/tests.rs:40-47
Timestamp: 2025-12-16T21:32:37.668Z
Learning: In Rust, shadowing a binding with a new let does not drop the previous binding until the end of the scope. All shadowed bindings in a scope are dropped in reverse-declaration order when the scope ends. Therefore, multiple let _guard = register_*() calls will keep all guards alive until the end of the function (or the surrounding scope). When reviewing code, be mindful that resources tied to shadowed bindings persist longer than the most recent binding; to release early, constrain the lifetime in an inner block or explicitly drop guards when appropriate.

Applied to files:

  • crates/amaru-consensus/src/consensus/stages/fetch_block.rs
  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/amaru-protocols/src/manager.rs
  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/pure-stage/src/simulation/running/mod.rs
  • crates/pure-stage/src/effect.rs
  • crates/amaru-protocols/src/tx_submission/initiator.rs
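
The shadowing learning above can be checked with a small standalone sketch. `Guard`, `DropLog`, and `shadowed_drop_order` are hypothetical names made up for illustration; they are not the `register_*()` helpers from the repository.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared log that records the order in which guards are dropped.
type DropLog = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical guard that writes its name to the log when dropped.
struct Guard {
    name: &'static str,
    log: DropLog,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Shadows `_guard` once and returns the order in which both guards were dropped.
fn shadowed_drop_order() -> Vec<&'static str> {
    let log: DropLog = Rc::new(RefCell::new(Vec::new()));
    {
        let _guard = Guard { name: "first", log: Rc::clone(&log) };
        let _guard = Guard { name: "second", log: Rc::clone(&log) };
        // Shadowing did not drop "first": nothing has been logged yet.
        assert!(log.borrow().is_empty());
    }
    // Bind to a local so the `Ref` temporary is released before `log` goes away.
    let order = log.borrow().clone();
    order
}
```

Both guards stay alive until the inner block ends and are then dropped in reverse declaration order. To release one early, call `drop(guard)` on a named binding or wrap it in a narrower block.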
📚 Learning: 2025-08-08T14:34:06.105Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/minicbor-extra/src/lib.rs:50-55
Timestamp: 2025-08-08T14:34:06.105Z
Learning: Project uses Rust 1.88 stable; using Result::is_err_and is acceptable and should not be flagged as unstable. In particular, crates/minicbor-extra/src/lib.rs relies on is_err_and when checking end_of_input; future reviews should not suggest replacing it for stability reasons.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/pure-stage/src/effect.rs
  • crates/amaru-protocols/src/tx_submission/initiator.rs
📚 Learning: 2025-08-20T13:02:25.763Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/pure-stage/src/effect.rs
📚 Learning: 2025-08-12T12:28:24.027Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 372
File: simulation/amaru-sim/src/simulator/mod.rs:410-412
Timestamp: 2025-08-12T12:28:24.027Z
Learning: In the Amaru project, panic statements are acceptable in simulation/test code (like amaru-sim crate) as they help identify configuration issues quickly during development, rather than needing proper error handling like production code.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/pure-stage/src/simulation/running/mod.rs
  • crates/pure-stage/src/effect.rs
  • crates/amaru-protocols/src/tx_submission/initiator.rs
📚 Learning: 2025-05-12T14:21:27.470Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2026-01-11T20:05:19.348Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru-protocols/src/chainsync/responder.rs:191-213
Timestamp: 2026-01-11T20:05:19.348Z
Learning: In crates/amaru-protocols/src/chainsync/responder.rs, the chainsync responder intentionally does not support serving headers when the tip is Origin. Amaru is not designed to cold-start a new Cardano blockchain, so the intersect() function correctly fails when tip is Origin without needing special handling. This is a conscious design decision.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/amaru-protocols/src/tx_submission/initiator.rs
📚 Learning: 2025-04-20T17:57:23.233Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/pure-stage/src/effect.rs
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/amaru-protocols/src/manager.rs
  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/amaru-protocols/src/tx_submission/initiator.rs
📚 Learning: 2025-08-08T14:39:50.527Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/borrowed_datum.rs:32-39
Timestamp: 2025-08-08T14:39:50.527Z
Learning: In the amaru project, when converting BorrowedDatumOption::Data to an owned DatumOption in crates/amaru-kernel/src/borrowed_datum.rs, the call `.unwrap()` refers to pallas’s KeepRaw::unwrap, which is infallible (always returns the inner value) and is not a panic risk. Future reviews should not flag this unwrap as dangerous.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/amaru-protocols/src/manager.rs
  • crates/pure-stage/src/effect.rs
📚 Learning: 2025-09-29T16:44:14.807Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/protocol.rs:94-106
Timestamp: 2025-09-29T16:44:14.807Z
Learning: In the amaru-network crate protocol.rs file, the correct Cardano mini-protocol ID assignments are: PROTO_N2N_KEEP_ALIVE = 8 and PROTO_N2N_PEER_SHARE = 10, as verified against the network specification by the maintainer.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/mod.rs
  • crates/amaru-protocols/src/protocol/mod.rs
  • crates/amaru-protocols/src/tx_submission/initiator.rs
📚 Learning: 2025-09-01T14:23:45.389Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 416
File: crates/amaru-consensus/src/consensus/select_chain.rs:53-57
Timestamp: 2025-09-01T14:23:45.389Z
Learning: In the Amaru consensus system, the peer set in SyncTracker is static/predetermined, not dynamic. If a caught_up signal is received for an unknown peer, it should be logged as a warning rather than auto-inserted, as this indicates a potential configuration issue or system anomaly.

Applied to files:

  • crates/amaru-protocols/src/manager.rs
📚 Learning: 2026-01-11T20:38:05.696Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/pure-stage/src/effect.rs:156-162
Timestamp: 2026-01-11T20:38:05.696Z
Learning: For pure-stage Effects::call in crates/pure-stage/src/effect.rs, rkuhn prefers to keep the runtime panic that prevents nested calls for now and only lift/relax this constraint later if/when it becomes necessary.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
  • crates/pure-stage/src/effect.rs
📚 Learning: 2025-12-16T21:50:46.690Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/pure-stage/src/adapter.rs:67-100
Timestamp: 2025-12-16T21:50:46.690Z
Learning: In the pure-stage crate's adapter system (crates/pure-stage/src/adapter.rs), adapters cannot form cycles because an existing adapter cannot be repointed after creation. The Adapter's target field is immutable, preventing the formation of loops in the adapter chain.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-05-09T13:09:47.915Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 206
File: crates/pure-stage/src/simulation/running.rs:240-242
Timestamp: 2025-05-09T13:09:47.915Z
Learning: Cloning messages in the pure-stage crate should be avoided for performance reasons. The current implementation in SimulationRunning deliberately avoids duplicating message data structures.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-06-14T16:41:13.061Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 263
File: crates/pure-stage/src/simulation/running.rs:868-875
Timestamp: 2025-06-14T16:41:13.061Z
Learning: In the pure-stage simulation framework, the effect air-lock protocol is designed so that when a stage is polled, the stage implementation consumes/takes the value from the effect lock during polling. There's no need to manually clear the effect lock after Poll::Ready because "the other side will have taken the value out" - this is by design, not a bug.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
  • crates/pure-stage/src/effect.rs
📚 Learning: 2025-08-20T20:19:07.396Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/tests/simulation.rs:20-28
Timestamp: 2025-08-20T20:19:07.396Z
Learning: Waker::noop() was stabilized in Rust 1.85.0 (released February 20, 2025) and is available in std::task::Waker, so no external dependencies like futures-task are needed for creating no-op wakers in tests.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-20T20:19:07.396Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/tests/simulation.rs:20-28
Timestamp: 2025-08-20T20:19:07.396Z
Learning: Waker::noop() was stabilized in Rust 1.85.0 (released February 2025) and is available in std::task::Waker, so no external dependencies like futures-task are needed for creating no-op wakers in tests.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-20T20:18:50.214Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/tests/simulation.rs:459-469
Timestamp: 2025-08-20T20:18:50.214Z
Learning: Rust 1.85 stabilized the Waker::noop() API, making it the preferred way to create a no-op waker instead of using futures_task::noop_waker_ref(). Code using Waker::noop() in modern Rust codebases is correct and doesn't need to be changed to use the futures_task alternative.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-20T19:37:32.510Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/src/effect.rs:204-223
Timestamp: 2025-08-20T19:37:32.510Z
Learning: In the pure-stage framework, the terminate() method uses never() which panics if called. This is intentional design: if terminate() ever returns, it indicates a serious framework bug that should immediately panic rather than allowing potentially corrupted execution to continue. The panic serves as a failsafe to surface framework issues.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
  • crates/pure-stage/src/effect.rs
📚 Learning: 2025-06-14T16:31:53.134Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 263
File: simulation/amaru-sim/src/simulator/simulate.rs:298-300
Timestamp: 2025-06-14T16:31:53.134Z
Learning: StageRef in the pure-stage crate supports serde serialization and deserialization (derives serde::Serialize and serde::Deserialize), enabling it to be used in structs that also derive these traits for TraceBuffer and replay functionality.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
  • crates/pure-stage/src/effect.rs
📚 Learning: 2025-04-20T18:02:25.073Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward.rs:73-75
Timestamp: 2025-04-20T18:02:25.073Z
Learning: In the current development stage, rkuhn prefers using explicit panics (via `.expect()` or `.unwrap()`) for fatal errors in the application code that would tear down the node, rather than propagating errors with `Result`. The intention is to eventually transition to proper error handling with `Result` as the codebase matures.

Applied to files:

  • crates/pure-stage/src/effect.rs
  • crates/amaru-protocols/src/tx_submission/initiator.rs
📚 Learning: 2025-01-21T15:32:17.911Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 69
File: crates/amaru/src/ledger/state/diff_epoch_reg.rs:112-117
Timestamp: 2025-01-21T15:32:17.911Z
Learning: When suggesting code changes in Rust, always verify that the types align correctly, especially when dealing with references and Options. The `Fold::Registered` variant in `diff_epoch_reg.rs` expects a reference `&'a V`, so unwrapping an `Option<&V>` requires only a single `.expect()`.

Applied to files:

  • crates/pure-stage/src/effect.rs
📚 Learning: 2025-02-03T11:15:22.640Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.

Applied to files:

  • crates/amaru-protocols/src/tx_submission/initiator.rs
🧬 Code graph analysis (3)
crates/amaru-consensus/src/consensus/stages/fetch_block.rs (1)
crates/amaru-protocols/src/manager.rs (2)
  • stage (66-200)
  • new (57-63)
crates/pure-stage/src/effect.rs (2)
crates/pure-stage/src/simulation/replay.rs (2)
  • new (48-62)
  • clock (205-207)
crates/pure-stage/src/simulation/running/scheduled_runnables.rs (2)
  • schedule (69-76)
  • schedule (249-256)
crates/amaru-protocols/src/tx_submission/initiator.rs (3)
crates/amaru-protocols/src/tx_submission/outcome.rs (2)
  • fmt (47-112)
  • fmt (130-137)
crates/amaru-protocols/src/protocol/miniprotocol.rs (4)
  • miniprotocol (156-259)
  • outcome (86-93)
  • init (104-104)
  • result (49-56)
crates/amaru-protocols/src/tx_submission/tests/test_data.rs (2)
  • create_transactions_in_mempool (25-33)
  • create_transactions (21-23)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 182, 15, 60, 10.5.3)
  • GitHub Check: End-to-end snapshot tests (preview, 2, 680, true, 15, 60, 10.5.3)
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Build riscv32
  • GitHub Check: Build wasm32
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Benches
  • GitHub Check: Test coverage
  • GitHub Check: Analyze (rust)
🔇 Additional comments (43)
crates/amaru-consensus/src/consensus/stages/fetch_block.rs (3)

15-23: LGTM on imports, mate!

Clean additions here—ManagerMessage for the new fetch path, Duration for the timeout, and updated error types. Everything's used and accounted for, like a well-organised inventory in Resident Evil.


68-104: Block storage and rollback handling look solid.

The flow here is clean as a whistle:

  • Store the block, route storage errors to errors channel
  • Construct and forward the validated event downstream
  • Rollback events pass through untouched

All return paths correctly propagate the state tuple. Nice work—this is giving Portal vibes with how neatly everything routes through.


156-162: Test helper updated correctly for the new state shape.

The make_state() helper now includes the manager StageRef, and destructuring in the stage function matches up. All good here, mate!

crates/amaru-protocols/src/protocol/mod.rs (5)

18-24: LGTM! Clean module scaffolding, mate.

The module declarations and re-exports follow standard Rust conventions. Like a well-organised Tetris board - everything slots in nicely.


103-110: LGTM! Nice symmetric toggle.

Clean const fn implementation - like a light switch that just works. Pairs nicely with the ProtocolId::opposite method further down. The type system doing the heavy lifting here, as it should.


115-134: Solid trait bounds - the full toolkit.

That's quite the collection of supertraits, but for a protocol role type that needs to be serialized, hashed, compared, and sent across threads, it makes sense. Like equipping your character with every perk before the final boss.

The sealed::Sealed pattern keeps the role types as a closed set - no rogue implementations sneaking in. Good defensive design, mate.


136-167: LGTM! Marker types are well-implemented.

The derives perfectly match the RoleT trait bounds - no "missing required trait" surprises at compile time.

The Erased::Opposite = Erased is a clever choice - once you've erased the role info, calling opposite() keeps it erased rather than trying to conjure information from the void. Very "what happens in Vegas stays in Vegas" energy.
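
For readers who have not met the pattern, here is a minimal sketch of a sealed role trait with an associated `Opposite` type. The names mirror the ones in the review (`RoleT`, `Initiator`, `Responder`, `Erased`), but the bodies and the reduced set of bounds are assumptions, not the code from crates/amaru-protocols.

```rust
mod sealed {
    /// Only types in this file can implement `Sealed`, which closes the set of roles.
    pub trait Sealed {}
}

/// Simplified role trait; the real trait is said to carry more supertraits.
pub trait RoleT: sealed::Sealed + Copy + std::fmt::Debug + PartialEq {
    type Opposite: RoleT;
    fn opposite(self) -> Self::Opposite;
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Initiator;
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Responder;
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Erased;

impl sealed::Sealed for Initiator {}
impl sealed::Sealed for Responder {}
impl sealed::Sealed for Erased {}

impl RoleT for Initiator {
    type Opposite = Responder;
    fn opposite(self) -> Responder {
        Responder
    }
}

impl RoleT for Responder {
    type Opposite = Initiator;
    fn opposite(self) -> Initiator {
        Initiator
    }
}

impl RoleT for Erased {
    // Once the role is erased, flipping it cannot recover any information.
    type Opposite = Erased;
    fn opposite(self) -> Erased {
        Erased
    }
}
```

In this sketch `Initiator.opposite().opposite()` has type `Initiator` again, so the round trip is checked at compile time, while `Erased` is a fixed point.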


177-185: LGTM! Good protocol number documentation.

Nice touch keeping the N2C protocol IDs as comments for reference - like leaving a note for future archaeologists. The decision to not implement N2C is clearly documented.

PROTO_TEST at 257 is well clear of the real protocol range, so no accidental collisions during testing. Smart move using #[cfg(test)] to keep it out of production builds.

Based on learnings, the N2N protocol IDs (keep-alive at 8, peer-share at 10) are verified correct per the network spec.

crates/amaru-protocols/src/tx_submission/initiator.rs (8)

15-49: Top-notch module documentation, mate!

This is some bonzer docstring work right here. The protocol flow, key components, window management, and error handling sections read like a proper game manual—clear, concise, and leaves no room for confusion. Future devs will thank ya for this.


68-77: Clean public API setup!

Like a well-designed game menu—simple entry points that do exactly what they say on the tin. The register_deserializers() and initiator() functions follow the established pattern across the protocol modules. Easy to wire up, easy to understand.


79-119: StageState implementation is solid.

The network handler routes messages to the appropriate processing functions like a well-oiled combo system in a fighting game. The local() no-op is fine since initiators respond to network requests rather than local inputs. Clean dispatch pattern.


121-193: Protocol state machine is rock solid now.

Like watching a well-choreographed action sequence in John Wick: each state transition flows naturally into the next. The init() → Idle → various blocking/non-blocking states → back to Idle cycle is clean.

Good to see the InitiatorAction::Error variant now properly terminates the protocol (line 189) instead of falling through to the catch-all. That's the kind of explicit error handling that prevents nasty surprises down the road.


195-250: Clean type definitions for protocol actions and results.

These enums are like a well-designed inventory system—each variant has a clear purpose, and the Display impls give you just enough info for debugging without drowning in noise. The asymmetry between InitiatorAction (no serde) and InitiatorResult (with serde) makes sense given their roles in the system.


403-412: Helper and trait impl look good.

The protocol_error helper is a nice pattern for consistent error handling and logging. The AsRef impl for StageRef<MuxMessage> enables the generic protocol machinery to access the muxer. Clean and functional.


414-732: Comprehensive test coverage!

This test suite is like a thorough QA playthrough—you've covered the main questline (happy path), side quests (edge cases), and even the "what happens if I try to break the game" scenarios (protocol violations).

The blocking vs non-blocking rules tests at lines 618-665 are particularly good—those protocol invariants are easy to get wrong, and having explicit tests for them is 👨‍🍳💋.


734-811: Test helpers are a thing of beauty.

These helper functions are like those quality-of-life mods that make testing a breeze. The separation between run_stage (fire and forget), run_stage_and_return_state (inspect final state), and run_stage_and_return_state_with (resume from existing state) covers all the testing patterns you'd need.

The builder functions (reply_tx_ids, request_txs, etc.) make the test cases read almost like protocol specs. Grand job!

crates/amaru-protocols/src/tx_submission/mod.rs (3)

15-41: Module organization is clean.

Private modules for the actual implementations, public modules with re-exports for the types consumers need. The #[cfg(test)] gating on the test utilities keeps them out of production builds. It's like having a well-organized inventory with the good stuff easily accessible and the dev tools tucked away.


52-74: Protocol spec definition is solid.

The spec<R: RoleT>() function is like a blueprint for the protocol state machine—mapping out all valid transitions. Using closures for the message factories (|| RequestTxIdsBlocking(0, 0)) is a clever way to satisfy type requirements without needing actual data. The transitions align with the protocol documentation in the initiator module.


130-141: State enum is well-defined.

All the right variants for the protocol lifecycle, all the right derives for state machine usage. The Copy derive is a nice touch for a small enum—avoids unnecessary cloning when passing state around. Like having the perfect loadout for the mission ahead.

crates/amaru-protocols/src/manager.rs (6)

1-26: LGTM!

License header's spot on, and all the imports look necessary for the peer management functionality. Clean slate to build on, mate!


27-64: Clean state machine design, like a well-crafted RPG progression system!

The three-state model (Scheduled → Connected → Disconnecting) gives nice clarity to the connection lifecycle. The internal comment on ConnectionDied at lines 30-31 is a good touch to prevent misuse.


66-82: Nice work addressing the duplicate peer scenario!

The contains_key check at line 73 ensures we don't accidentally spawn duplicate connections like some buggy multiplayer lobby. The "insert as Scheduled, then queue Connect" flow is clean and predictable.


106-113: Reconnect scheduling on failure is solid – one small note on the assert.

Love that the reconnect is scheduled here (like a Dark Souls bonfire respawn timer!). The assert_eq! on line 110 is a defensive sanity check, which is fair enough – if that invariant breaks, something's gone proper sideways in the state machine. Just wanted to call it out in case you'd prefer logging over panicking in production, but given the single-threaded message processing it should hold true.


135-150: RemovePeer state transitions are bang on!

The transition to Disconnecting at line 143 before returning means ConnectionDied won't trigger a reconnect – that's the fix from the earlier review, and it's clean as a whistle. Removing Scheduled peers directly makes sense since there's no active connection to teardown.


151-178: ConnectionDied handler looks properly battle-tested now!

The state management here is correct – transitioning to Scheduled at line 164 ensures the subsequent Connect message can proceed. The sequential message processing in pure-stage guarantees no race between a new connection establishing and a stale ConnectionDied being processed – by the time Connect runs, the state will already be Scheduled.

The Disconnecting path at lines 170-177 cleanly removes the peer rather than attempting reconnection, which is the right call for intentional disconnects.
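
The lifecycle described in the manager.rs comments above can be sketched as a plain state table. This is a simplified model under stated assumptions: `Manager`, `ConnState`, and every method name here are hypothetical, peers are keyed by `String`, and the real stage sends messages and schedules timers instead of returning booleans.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ConnState {
    Scheduled,
    Connected,
    Disconnecting,
}

/// Hypothetical stand-in for the manager stage state.
#[derive(Default)]
struct Manager {
    peers: HashMap<String, ConnState>,
}

impl Manager {
    fn state(&self, peer: &str) -> Option<ConnState> {
        self.peers.get(peer).copied()
    }

    /// Returns true if a connect attempt should be queued (peer was not known yet).
    fn add_peer(&mut self, peer: &str) -> bool {
        if self.peers.contains_key(peer) {
            return false;
        }
        self.peers.insert(peer.to_string(), ConnState::Scheduled);
        true
    }

    /// Marks a scheduled peer as connected.
    fn connected(&mut self, peer: &str) {
        if let Some(state) = self.peers.get_mut(peer) {
            if *state == ConnState::Scheduled {
                *state = ConnState::Connected;
            }
        }
    }

    /// Connected peers move to Disconnecting; scheduled peers are dropped directly.
    fn remove_peer(&mut self, peer: &str) {
        match self.state(peer) {
            Some(ConnState::Connected) => {
                self.peers.insert(peer.to_string(), ConnState::Disconnecting);
            }
            Some(ConnState::Scheduled) => {
                self.peers.remove(peer);
            }
            _ => {}
        }
    }

    /// Returns true if a reconnect should be scheduled.
    fn connection_died(&mut self, peer: &str) -> bool {
        match self.state(peer) {
            Some(ConnState::Connected) => {
                self.peers.insert(peer.to_string(), ConnState::Scheduled);
                true
            }
            Some(ConnState::Disconnecting) => {
                // Intentional disconnect: forget the peer, no reconnect.
                self.peers.remove(peer);
                false
            }
            _ => false,
        }
    }
}
```

The point of the sketch is the asymmetry in `connection_died`: a connection that dies while Connected goes back to Scheduled, while one that dies while Disconnecting is removed for good.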

crates/pure-stage/src/effect.rs (7)

120-158: Solid implementation of platform-specific ID generation.

Nice one! Using Ordering::Relaxed here is spot-on - like ordering a coffee without the fancy latte art, you just need the caffeine (unique ID), not the synchronization theatrics. The riscv32 fallback with the mutex is a sensible choice for platforms without proper atomic support.
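
To make the Relaxed point concrete, here is a minimal sketch of a process-wide ID counter. `NEXT_ID`, `next_id`, and the helpers are illustrative names, and the sketch assumes a target with 64-bit atomics; it does not reproduce the mutex fallback mentioned for riscv32.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical global counter; each call to `next_id` hands out a fresh value.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

fn next_id() -> u64 {
    // `fetch_add` is an atomic read-modify-write, so no two callers can see the
    // same value even with Relaxed ordering. Relaxed only gives up ordering
    // guarantees relative to other memory locations.
    NEXT_ID.fetch_add(1, Ordering::Relaxed)
}

/// Draws `per_thread` IDs on each of `threads` threads and returns all of them.
fn ids_from_threads(threads: usize, per_thread: usize) -> Vec<u64> {
    let handles: Vec<thread::JoinHandle<Vec<u64>>> = (0..threads)
        .map(|_| thread::spawn(move || (0..per_thread).map(|_| next_id()).collect::<Vec<u64>>()))
        .collect();
    handles
        .into_iter()
        .flat_map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}

/// True if no ID appears twice.
fn all_unique(ids: &[u64]) -> bool {
    let unique: HashSet<u64> = ids.iter().copied().collect();
    unique.len() == ids.len()
}
```

Uniqueness comes from the atomicity of the increment, not from the memory ordering, which is why Relaxed is enough when the ID is the only thing being communicated.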


264-271: Timing semantics in schedule_after look intentional.

Just a quick heads-up, mate - self.clock.now() is captured synchronously when schedule_after is called, not when the future is polled. This is likely what you want - like setting an alarm for "10 minutes from now" means from when you set it, not from when you finally look at the clock again. If that's the intent, we're golden!
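
The difference between reading the clock at call time and at poll time can be shown without any async machinery. This sketch uses a hypothetical `ManualClock` and plain closures in place of futures; none of these names come from pure-stage.

```rust
use std::cell::Cell;

/// Hypothetical manually advanced clock, counting milliseconds.
struct ManualClock {
    now_ms: Cell<u64>,
}

impl ManualClock {
    fn new(start_ms: u64) -> Self {
        ManualClock { now_ms: Cell::new(start_ms) }
    }

    fn now(&self) -> u64 {
        self.now_ms.get()
    }

    fn advance(&self, ms: u64) {
        self.now_ms.set(self.now_ms.get() + ms);
    }
}

/// Eager variant: the deadline is fixed when the function is called.
fn schedule_after_eager(clock: &ManualClock, delay_ms: u64) -> impl FnOnce() -> u64 {
    let deadline = clock.now() + delay_ms;
    move || deadline
}

/// Lazy variant: the clock is read only when the returned closure runs.
fn schedule_after_lazy<'a>(clock: &'a ManualClock, delay_ms: u64) -> impl FnOnce() -> u64 + 'a {
    move || clock.now() + delay_ms
}
```

If the clock starts at 100, both variants are created with a delay of 10, and the clock then advances by 50 before either closure runs, the eager one yields 110 and the lazy one yields 160. The review comment describes the eager behaviour.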


415-430: Clean supervision pattern with type-level enforcement.

Love this, mate! Using CanSupervise as a marker type is like having a VIP wristband at a festival - you can only supervise if you've got the right credentials. The tombstone association is elegant too. Like getting a notification when your Uber arrives, except it's when your stage shuffles off this mortal coil.


432-461: Nice contravariant functor implementation!

Ah, contramap - the ol' reverse uno card of functional programming! If you've played any category theory drinking games, this is a classic move. The implementation is clean, and the expect for internal type errors aligns with the project's "fail fast" philosophy. Like a boss fight in Dark Souls - if something's fundamentally wrong, you want to know immediately, not three hours later.


1189-1203: Sensible default tombstone handling.

Good call on defaulting to CanSupervise when no tombstone is provided - it's like having a "return to sender" address on a parcel. The stage name in the marker helps with debugging when things go pear-shaped. Clean and practical.


1450-1580: Comprehensive PartialEq implementation for all new variants.

You've covered all the new bases here - Schedule, CancelSchedule, Contramap, and the tombstone field in WireStage. Like a good game of Pokémon, gotta catch 'em all! The field-by-field comparison approach is solid and maintainable.


796-810: Call timeout scheduling looks well-integrated.

The ScheduleId generation for call timeouts is a nice touch - like setting a kitchen timer so your roast doesn't burn. The panic on unexpected CallExtra::Scheduled is appropriate since it indicates a framework state machine bug, not user error. Clean transformation flow from closure to concrete message.

crates/pure-stage/src/simulation/running/mod.rs (11)

73-91: LGTM on the struct definition!

The SimulationRunning struct is nicely organized, mate. The new scheduling primitives (scheduled: ScheduledRunnables, schedule_ids: ScheduleIds) integrate cleanly with the existing architecture. Like fitting a new power-up into your inventory without breaking the UI.


554-591: LGTM on Receive effect handling!

Solid work here, cobber. The error handling with the downcast to UnsupervisedChildTermination is a nice touch – reminds me of how Souls games handle boss phase transitions, except here we're gracefully handling termination cascades. The blocked sender resumption after a successful receive is logically sound.


650-676: LGTM on the Call response send path!

Nice one! The handling here is like a well-designed quest system – even if the recipient has gone off to the great respawn point in the sky, you still clean up the schedule ID and resume the sender. The comment on lines 660-662 explaining why this is okay (call may time out) is chef's kiss for maintainability.


716-738: LGTM on Schedule effect handling!

Beauty! Unlike the Wait effect path, this one properly handles the "schedule for now or past" case by delivering immediately (lines 734-737). It's like the difference between setting an alarm for tomorrow vs. "remind me to do this... right now" – both should work without exploding.


793-815: LGTM on Terminate effect handling!

This is some proper Erlang-style supervision tree vibes! When a stage terminates, it notifies its supervisor with a tombstone message. If it's a top-level stage (supervised by blackhole), the whole simulation shuts down – like when the final boss goes down and the credits roll.


1475-1486: LGTM on block_reason!

The scheduled call handling at lines 1482-1484 is spot on. The fix for the needless borrow (contains(id) instead of contains_key(&id)) has been applied. This correctly categorizes stages waiting for scheduled calls as "sleeping" rather than "busy" – like how a character in a turn-based game is waiting for their cooldown, not actively engaged.


1511-1551: LGTM on poll_stage!

The effect airlock protocol here is clean – like a proper save point system where the game state is serialized between transitions. The handling of Poll::Ready (transition complete, back to idle) vs Poll::Pending (effect awaited) is textbook correct. The Waker::noop() usage is idiomatic for Rust 1.85+.


1379-1444: LGTM on OverrideExternalEffect encapsulation!

Nice module boundary here for keeping those fields properly private – like keeping your game's cheat codes behind a proper menu instead of leaving them on the main screen. The usize::MAX handling for infinite overrides (line 1433) is a neat trick.


1009-1063: LGTM on resume_send!

The dual handling of regular sends vs call responses (via to.extra()) is well structured. Like a dialogue system that handles both one-way barks and full conversation trees through the same interface. The schedule cleanup on call response (line 1046) prevents orphaned timeouts.


1594-1691: LGTM on the invariant test!

Top-notch test coverage here, mate! It's like a QA tester methodically trying every button combination to make sure only valid moves work. The pattern of iterating through operations and verifying "this should fail" vs "this should work" is excellent for catching state machine bugs.


856-877: The Contramap name validation fix is spot on—no worries there!

Right, so the flow checks out beautifully. The stage_name() function generates names like "new_name-1", "new_name-2", etc., by appending a counter to the prefix. Then when resume_contramap_internal validates at line 365, it uses name.as_str().starts_with(new_name.as_str()) to check if the generated name begins with the expected prefix from the effect. Since "new_name-1".starts_with("new_name") is true, the validation passes every time. The fix is solid—whoever sorted that out really knew what they were doing!
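
The prefix check described above can be sketched in isolation. `StageNamer`, `has_prefix`, and `has_exact_prefix` are hypothetical helpers written for this illustration; only the "prefix-counter" naming and the `starts_with` comparison are taken from the comment.

```rust
/// Hypothetical name generator that appends a running counter to a prefix.
#[derive(Default)]
struct StageNamer {
    counter: usize,
}

impl StageNamer {
    fn stage_name(&mut self, prefix: &str) -> String {
        self.counter += 1;
        format!("{}-{}", prefix, self.counter)
    }
}

/// The lenient check from the review: the generated name must start with the prefix.
fn has_prefix(name: &str, prefix: &str) -> bool {
    name.starts_with(prefix)
}

/// A stricter alternative (an assumption, not the project's code): the name must be
/// exactly the prefix, a dash, and one or more ASCII digits.
fn has_exact_prefix(name: &str, prefix: &str) -> bool {
    name.strip_prefix(prefix)
        .and_then(|rest| rest.strip_prefix('-'))
        .map_or(false, |digits| {
            !digits.is_empty() && digits.chars().all(|c| c.is_ascii_digit())
        })
}
```

Note that `starts_with` also accepts a name such as "new_name_other-1" for the prefix "new_name". Whether that leniency matters depends on how stage names are chosen, which this sketch does not try to settle.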

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.


impl<Msg> PartialEq for StageRef<Msg> {
    fn eq(&self, other: &Self) -> bool {
        self.name == other.name
Contributor Author

This already solves the problem, I think the modifications should be removed: the test must ensure that the right data deserializer is registered. This isn’t any more or less magic than with_extra_for_tests.

Contributor

I don't understand which deserializer should be registered since it seems to me that we are losing information during the serialization step, which is "is there an extra value or not?"

Contributor Author

I reverted your commit and fixed the test with 35118a5 instead.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/amaru-network/src/connection.rs`:
- Around line 86-104: The call using tokio::time::timeout in async fn connect
(and the similar call around the other occurrence) currently uses `.await??`
which fails because Elapsed can't be converted into std::io::Error; update the
timeout handling to explicitly map the timeout error into an std::io::Error
(e.g. with map_err(|e| std::io::Error::new(std::io::ErrorKind::TimedOut, e)) or
matching the Result and returning Err(std::io::Error::new(...))) before
unwrapping the TcpStream; apply the same explicit conversion at the other
timeout site referenced in the review so the function compiles and returns a
std::io::Result<ConnectionId>.
♻️ Duplicate comments (1)
crates/amaru-ouroboros-traits/src/connection.rs (1)

15-25: Future trait missing — build breaks.
BoxFuture on Line 25 uses Future, but it’s not in scope in the std import list. Add future::Future so this compiles, no "red screen of death" vibes.

🔧 Suggested fix
 use std::{
     fmt,
+    future::Future,
     net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr, SocketAddrV4, SocketAddrV6},
     num::NonZeroUsize,
     pin::Pin,
     sync::Arc,
     time::Duration,
 };
🧹 Nitpick comments (2)
crates/amaru-protocols/src/mux.rs (1)

633-1148: Consider reusing register_deserializers() in tests to cut drift.
You’ve now got a handy helper; wiring it in here reduces boilerplate and keeps test setup aligned if mux types grow.

♻️ Suggested tweak (apply similarly in test_tcp/test_muxing/test_tokio)
-        let _guard = pure_stage::register_data_deserializer::<MuxMessage>();
-        let _guard = pure_stage::register_data_deserializer::<NonEmptyBytes>();
-        let _guard = pure_stage::register_effect_deserializer::<SendEffect>();
-        let _guard = pure_stage::register_effect_deserializer::<RecvEffect>();
-        let _guard = pure_stage::register_data_deserializer::<State>();
+        let _guards = super::register_deserializers();
+        let _guard = pure_stage::register_effect_deserializer::<SendEffect>();
+        let _guard = pure_stage::register_effect_deserializer::<RecvEffect>();
crates/amaru-network/src/connection.rs (1)

200-207: Consider idempotent close (TODO).
Right now a second close raises an error (Line 205). If higher layers ever double-close during shutdown, this gets noisy. Making close idempotent keeps things chill — like letting Neo dodge two bullets without drama.

💡 Possible tweak
-                let connection = resource.lock().remove(&conn).ok_or_else(|| {
-                    // TODO: figure out how to not raise an error for a connection that has simply been closed already
-                    std::io::Error::other(format!("connection {conn} not found for close"))
-                })?;
+                let connection = match resource.lock().remove(&conn) {
+                    Some(connection) => connection,
+                    None => return Ok(()),
+                };
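The difference between the strict and the idempotent close can be sketched with a plain map. This is a minimal illustration, assuming a hypothetical `Connections` table keyed by `u64`; it does not model the real `ConnectionId`, the mutex, or the socket shutdown.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical stand-in for the connection table; the `String` value is a
/// placeholder for the real reader/writer pair.
pub struct Connections {
    open: HashMap<u64, String>,
}

impl Connections {
    pub fn new() -> Self {
        Self { open: HashMap::new() }
    }

    pub fn add(&mut self, id: u64, label: &str) {
        self.open.insert(id, label.to_string());
    }

    /// Strict close: a second close of the same id is reported as an error.
    pub fn close_strict(&mut self, id: u64) -> io::Result<()> {
        self.open
            .remove(&id)
            .map(drop)
            .ok_or_else(|| io::Error::other(format!("connection {id} not found for close")))
    }

    /// Idempotent close: closing an unknown or already closed id is a no-op.
    pub fn close_idempotent(&mut self, id: u64) -> io::Result<()> {
        if let Some(connection) = self.open.remove(&id) {
            // Real code would shut the connection down here.
            drop(connection);
        }
        Ok(())
    }
}
```

The trade-off is that the idempotent variant can no longer distinguish a double close from a close on an id that never existed, so a bug that passes the wrong id would go unnoticed.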
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 28b1739 and a8b87ea.

📒 Files selected for processing (4)
  • crates/amaru-network/src/connection.rs
  • crates/amaru-ouroboros-traits/src/connection.rs
  • crates/amaru-protocols/src/mux.rs
  • crates/pure-stage/tests/functional.rs
🧰 Additional context used
🧠 Learnings (16)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru-protocols/src/chainsync/responder.rs:191-213
Timestamp: 2026-01-11T20:05:19.348Z
Learning: In crates/amaru-protocols/src/chainsync/responder.rs, the chainsync responder intentionally does not support serving headers when the tip is Origin. Amaru is not designed to cold-start a new Cardano blockchain, so the intersect() function correctly fails when tip is Origin without needing special handling. This is a conscious design decision.
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-09-29T16:38:59.323Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:216-233
Timestamp: 2025-09-29T16:38:59.323Z
Learning: In crates/amaru-network/src/mux.rs, the segment length field uses u16 type which naturally limits values to 65535, matching MAX_SEGMENT_SIZE constant exactly. This provides type-level safety against oversized allocations without needing runtime bounds checking.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-06-14T16:31:53.134Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 263
File: simulation/amaru-sim/src/simulator/simulate.rs:298-300
Timestamp: 2025-06-14T16:31:53.134Z
Learning: StageRef in the pure-stage crate supports serde serialization and deserialization (derives serde::Serialize and serde::Deserialize), enabling it to be used in structs that also derive these traits for TraceBuffer and replay functionality.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2026-01-11T20:38:05.696Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/pure-stage/src/effect.rs:156-162
Timestamp: 2026-01-11T20:38:05.696Z
Learning: For pure-stage Effects::call in crates/pure-stage/src/effect.rs, rkuhn prefers to keep the runtime panic that prevents nested calls for now and only lift/relax this constraint later if/when it becomes necessary.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-12-28T19:26:35.354Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru-protocols/src/blockfetch/mod.rs:151-173
Timestamp: 2025-12-28T19:26:35.354Z
Learning: In crates/amaru-protocols/src/blockfetch/mod.rs, the blockfetch initiator uses .expect() when popping from the request queue on NoBlocks and Done results. These are intentional fail-fast assertions: the protocol state machine guarantees the queue is non-empty when these messages arrive, so an empty queue indicates a protocol violation. A misbehaving peer triggers an erroneous protocol transition that will close the connection (supervision to be implemented in a future PR). This follows the project's fail-fast philosophy for protocol invariants.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
  • crates/amaru-network/src/connection.rs
  • crates/amaru-ouroboros-traits/src/connection.rs
📚 Learning: 2025-09-29T16:44:14.807Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/protocol.rs:94-106
Timestamp: 2025-09-29T16:44:14.807Z
Learning: In the amaru-network crate protocol.rs file, the correct Cardano mini-protocol ID assignments are: PROTO_N2N_KEEP_ALIVE = 8 and PROTO_N2N_PEER_SHARE = 10, as verified against the network specification by the maintainer.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-12-28T19:39:16.476Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru/src/stages/mod.rs:233-239
Timestamp: 2025-12-28T19:39:16.476Z
Learning: In crates/amaru/src/stages/mod.rs, the network.preload() operation for manager stage has a mailbox capacity of 10 messages. When preloading peers at startup, breaking on the first failure (after potentially filling the mailbox) is intentional design—the mailbox cannot hold more than 10 messages, so continuing to attempt preloads after a failure would be futile and would break the application.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-05-12T14:21:27.470Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-04-20T17:56:39.223Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:0-0
Timestamp: 2025-04-20T17:56:39.223Z
Learning: For mpsc::channel in Tokio-based test code, use buffer sizes larger than 1 (e.g., 8) to avoid potential deadlocks when producers send multiple messages before consumers can process them.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
  • crates/amaru-network/src/connection.rs
📚 Learning: 2025-04-20T17:57:23.233Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-08-12T12:28:24.027Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 372
File: simulation/amaru-sim/src/simulator/mod.rs:410-412
Timestamp: 2025-08-12T12:28:24.027Z
Learning: In the Amaru project, panic statements are acceptable in simulation/test code (like amaru-sim crate) as they help identify configuration issues quickly during development, rather than needing proper error handling like production code.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-05-09T13:09:47.915Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 206
File: crates/pure-stage/src/simulation/running.rs:240-242
Timestamp: 2025-05-09T13:09:47.915Z
Learning: Cloning messages in the pure-stage crate should be avoided for performance reasons. The current implementation in SimulationRunning deliberately avoids duplicating message data structures.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2026-01-11T20:05:19.348Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru-protocols/src/chainsync/responder.rs:191-213
Timestamp: 2026-01-11T20:05:19.348Z
Learning: In crates/amaru-protocols/src/chainsync/responder.rs, the chainsync responder intentionally does not support serving headers when the tip is Origin. Amaru is not designed to cold-start a new Cardano blockchain, so the intersect() function correctly fails when tip is Origin without needing special handling. This is a conscious design decision.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-12-16T21:32:37.668Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/amaru-network/src/handshake/tests.rs:40-47
Timestamp: 2025-12-16T21:32:37.668Z
Learning: In Rust, shadowing a binding with a new let does not drop the previous binding until the end of the scope. All shadowed bindings in a scope are dropped in reverse-declaration order when the scope ends. Therefore, multiple let _guard = register_*() calls will keep all guards alive until the end of the function (or the surrounding scope). When reviewing code, be mindful that resources tied to shadowed bindings persist longer than the most recent binding; to release early, constrain the lifetime in an inner block or explicitly drop guards when appropriate.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
  • crates/amaru-network/src/connection.rs
  • crates/amaru-ouroboros-traits/src/connection.rs
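The drop-order behaviour described in the learning above can be checked with a small sketch that uses only the standard library. The `Guard` type and `register` function are hypothetical stand-ins for the guards returned by the `register_*` helpers.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical guard that records its name in a shared log when dropped.
pub struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

pub fn register(name: &'static str, log: &Rc<RefCell<Vec<&'static str>>>) -> Guard {
    Guard { name, log: Rc::clone(log) }
}

/// Returns how many guards were dropped before the inner scope ended,
/// followed by the order in which all guards were eventually dropped.
pub fn shadowed_guards() -> (usize, Vec<&'static str>) {
    let log = Rc::new(RefCell::new(Vec::new()));
    let dropped_before_scope_end;
    {
        // Each `let _guard` shadows the previous binding without dropping it.
        let _guard = register("first", &log);
        let _guard = register("second", &log);
        let _guard = register("third", &log);
        dropped_before_scope_end = log.borrow().len();
    } // all three guards are dropped here, newest first
    let order = log.borrow().clone();
    (dropped_before_scope_end, order)
}
```

Note that this relies on the binding being named `_guard`; a bare `let _ = register(...)` would drop the guard immediately instead of keeping it alive until the end of the scope.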
📚 Learning: 2025-09-05T17:30:55.869Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Applied to files:

  • crates/amaru-network/src/connection.rs
  • crates/amaru-ouroboros-traits/src/connection.rs
🧬 Code graph analysis (1)
crates/amaru-network/src/connection.rs (3)
crates/amaru-network/src/socket_addr.rs (1)
  • resolve (19-35)
crates/amaru-ouroboros-traits/src/connection.rs (2)
  • initial (38-40)
  • connect (123-127)
crates/amaru-kernel/src/bytes.rs (1)
  • len (72-75)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: End-to-end snapshot tests (preview, 2, 680, true, 15, 60, 10.5.3)
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 182, 15, 60, 10.5.3)
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Build riscv32
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build wasm32
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Benches
  • GitHub Check: Test coverage
  • GitHub Check: Test the CLI
  • GitHub Check: Analyze (rust)
🔇 Additional comments (6)
crates/amaru-protocols/src/mux.rs (6)

15-24: Imports look tidy after the StageRef/NonEmptyBytes swap.
Clean migration to the new types—feels like a smooth “Mass Effect” upgrade from old tech to new.


35-44: Nice addition of register_deserializers() for mux types.
Handy helper and clear scope.


110-129: Doc + Send signature update looks solid.
StageRef-based Send reads clean and the note about WantNext stays crisp.


388-401: Register flow looks good with StageRef handler.
Instrumentation + immediate handler notification feels right.


447-509: Outgoing/next_segment/received/want_next changes are tidy.
Trace-level instrumentation and StageRef usage are consistent across the mux path.


517-620: Per-proto notifier switch to StageRef is clean.
The send notification path is straightforward and easy to follow.


Comment on lines +86 to +104
async fn connect(
    addr: Vec<SocketAddr>,
    resource: Arc<Mutex<Connections>>,
    read_buf_size: usize,
    timeout: Duration,
) -> std::io::Result<ConnectionId> {
    let (reader, writer) = tokio::time::timeout(timeout, TcpStream::connect(&*addr))
        .await??
        .into_split();
    tracing::debug!(?addr, "connected");
    let mut connections = resource.lock();
    let id = connections.add_connection(Connection {
        reader: Arc::new(AsyncMutex::new((
            reader,
            BytesMut::with_capacity(read_buf_size),
        ))),
        writer: Arc::new(AsyncMutex::new(writer)),
    });
    Ok(id)
Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "connection.rs" | head -5

Repository: pragma-org/amaru

Length of output: 233


🏁 Script executed:

cat -n crates/amaru-network/src/connection.rs | head -120

Repository: pragma-org/amaru

Length of output: 4354


🏁 Script executed:

grep -r "tokio::time::timeout" --include="*.rs" | head -20

Repository: pragma-org/amaru

Length of output: 833


🏁 Script executed:

fd "Cargo.toml" crates/amaru-network

Repository: pragma-org/amaru

Length of output: 91


🏁 Script executed:

cat crates/amaru-network/Cargo.toml

Repository: pragma-org/amaru

Length of output: 893


🏁 Script executed:

cat -n crates/amaru-network/src/connection.rs | sed -n '130,165p'

Repository: pragma-org/amaru

Length of output: 1480


🏁 Script executed:

cat -n crates/amaru-network/src/connection.rs | sed -n '190,220p'

Repository: pragma-org/amaru

Length of output: 1215


🏁 Script executed:

cd crates && grep -r "timeout.*\.await" --include="*.rs" -A 2 -B 2 | head -80

Repository: pragma-org/amaru

Length of output: 2993


🏁 Script executed:

# Check if code compiles and what the actual error handling looks like
cd crates/amaru-network && cargo check 2>&1 | head -100

Repository: pragma-org/amaru

Length of output: 169


🏁 Script executed:

# Check what other error handling patterns are used for timeout in the codebase
grep -B 5 -A 5 "timeout.*await" crates/amaru/src/bin/amaru/cmd/fetch_chain_headers.rs

Repository: pragma-org/amaru

Length of output: 484


🏁 Script executed:

# Look at what pure-stage does with timeout
cat -n crates/pure-stage/src/tokio.rs | sed -n '1,100p'

Repository: pragma-org/amaru

Length of output: 3839


🏁 Script executed:

# Search for error handling with timeout in match statements
grep -B 3 -A 3 "match.*timeout" crates -r --include="*.rs" | head -60

Repository: pragma-org/amaru

Length of output: 2680


🏁 Script executed:

# Check if there's a custom error implementation or From impl for Elapsed
rg "impl.*From.*Elapsed\|impl.*From.*Elapsed" crates/amaru-network --type rs

Repository: pragma-org/amaru

Length of output: 86


🏁 Script executed:

# Look at the actual usage - is the code supposed to work?
# Check if there's error mapping elsewhere
grep -r "Elapsed\|timeout" crates/amaru-network/src/connection.rs -B 5 -A 5

Repository: pragma-org/amaru

Length of output: 1955


🏁 Script executed:

# Check tokio version in workspace
cat Cargo.toml | grep -A 10 "\[workspace.dependencies\]"

Repository: pragma-org/amaru

Length of output: 414


🏁 Script executed:

# Get full tokio dependency info
grep -A 2 "^tokio = " Cargo.toml

Repository: pragma-org/amaru

Length of output: 148


🏁 Script executed:

# Check if tokio::time::error::Elapsed has From<Elapsed> for io::Error
# Let's look for any error conversion patterns in the codebase
rg "map_err.*timeout\|map_err.*Elapsed" crates --max-count 20

Repository: pragma-org/amaru

Length of output: 42


🏁 Script executed:

# Let's verify if the double ? operator works with Elapsed
# by checking similar patterns in the codebase
rg "\.await\?\?" crates --max-count 30

Repository: pragma-org/amaru

Length of output: 180


🏁 Script executed:

# Check if Elapsed implements From for io::Error by looking at actual tokio docs
# Let's see if this code actually compiles by checking recent git history
git log --oneline crates/amaru-network/src/connection.rs | head -10

Repository: pragma-org/amaru

Length of output: 125


🏁 Script executed:

# Check if there's any custom error handling trait in the amaru-network crate
grep -r "impl.*From\|impl.*Into" crates/amaru-network/src/ --include="*.rs"

Repository: pragma-org/amaru

Length of output: 42


🏁 Script executed:

# Let's look at the full context again and see if there's error context we're missing
cat -n crates/amaru-network/src/connection.rs | sed -n '86,105p'

Repository: pragma-org/amaru

Length of output: 831


🏁 Script executed:

# Let's verify the actual error by checking if tokio's Elapsed type has a From impl for io::Error
# First, let's create a small test to see what happens
cat > /tmp/test_timeout.rs << 'EOF'
use std::io;
use tokio::time::{timeout, Duration, error::Elapsed};

fn main() {
    // This should fail if Elapsed doesn't convert to io::Error
    let _: Result<i32, io::Error> = Err(Elapsed::new());
}
EOF
cat /tmp/test_timeout.rs

Repository: pragma-org/amaru

Length of output: 262


🏁 Script executed:

# Let's check the Cargo.lock to understand what versions are pinned
git log --oneline -20 -- Cargo.lock | head -5

Repository: pragma-org/amaru

Length of output: 125


🏁 Script executed:

# Let's see if there's a build or test CI configuration
find . -name "*.yml" -o -name "*.yaml" | grep -E "(github|gitlab|ci)" | head -10

Repository: pragma-org/amaru

Length of output: 339


🏁 Script executed:

# Let's check the continuous integration workflow to see if it tests compilation
cat .github/workflows/continuous-integration.yml | head -100

Repository: pragma-org/amaru

Length of output: 3563


🏁 Script executed:

# Check if there are any test files that exercise the connection code
find crates/amaru-network -name "*.rs" -path "*/tests/*" -o -name "*.rs" -path "*/test/*"

Repository: pragma-org/amaru

Length of output: 42


🏁 Script executed:

# Let's look for test modules in the connection.rs file
grep -n "mod test\|#\[test\]\|#\[tokio::test\]" crates/amaru-network/src/connection.rs

Repository: pragma-org/amaru

Length of output: 42


🏁 Script executed:

# Let's verify the actual conversion by checking if tokio provides From impl
# Search for any dependencies that might extend tokio's error handling
grep -r "From.*Elapsed\|From.*timeout" crates/ --include="*.rs"

Repository: pragma-org/amaru

Length of output: 42


🏁 Script executed:

# Let's check if the code actually builds by looking at the most recent changes
git log -1 --stat crates/amaru-network/src/connection.rs

Repository: pragma-org/amaru

Length of output: 428


🏁 Script executed:

# Let's see if there are any compilation issues or error handling patterns documented
grep -r "ErrorKind\|io::Error" crates/amaru-network/src/connection.rs -B 2 -A 2

Repository: pragma-org/amaru

Length of output: 1108


🏁 Script executed:

# Check if the file was added in a recent commit or if it's been there a while
git log --all --oneline -- crates/amaru-network/src/connection.rs | head -20

Repository: pragma-org/amaru

Length of output: 125


🏁 Script executed:

# See if there's a PR or issue discussion about this
git log --all --format="%H %s" -- crates/amaru-network/src/connection.rs | head -5

Repository: pragma-org/amaru

Length of output: 157


🏁 Script executed:

# Let's check the actual Tokio documentation by seeing what versions support what
# For now, let's verify the exact error by checking if Elapsed has conversions
# Create a test to see what the actual type error would be
python3 << 'EOF'
# tokio 1.45.0 - Elapsed does not implement From<io::Error> or Into<io::Error>
# The ? operator requires a From impl from the error type to the target type
# So Elapsed -> io::Error requires a From implementation

# Let's verify by checking if there's custom error handling
import subprocess
result = subprocess.run(['rg', 'io::Error::.*timeout', 'crates/', '-A', '2'], 
                       capture_output=True, text=True)
print("Custom timeout error creations:")
print(result.stdout)
EOF

Repository: pragma-org/amaru

Length of output: 146


🏁 Script executed:

# Let's verify the actual PR or recent history more carefully
git show a8b87eaf:crates/amaru-network/src/connection.rs | head -120 | tail -40

Repository: pragma-org/amaru

Length of output: 1177


🏁 Script executed:

# Check if there are any test runs that would have caught this
git log --all --format="%H %s %b" -- crates/amaru-network/src/connection.rs | head -30

Repository: pragma-org/amaru

Length of output: 209


🏁 Script executed:

# Search for any custom error implementations in the workspace
find . -name "*.rs" -type f -exec grep -l "impl.*From.*Elapsed\|Elapsed.*io::Error" {} \;

Repository: pragma-org/amaru

Length of output: 42


🌐 Web query:

tokio::time::error::Elapsed From impl std::io::Error conversion

💡 Result:

Short answer: tokio::time::error::Elapsed does not implement Into/From<std::io::Error> for you; convert it manually, e.g. by creating an io::Error with ErrorKind::TimedOut.

Example:
use std::io::{Error, ErrorKind};
use tokio::time::error::Elapsed;

fn elapsed_to_io(e: Elapsed) -> Error {
    Error::new(ErrorKind::TimedOut, e)
}

Docs: tokio::time::error::Elapsed. [1] std::io::Error / Error::new and std::io::ErrorKind::TimedOut. [2][3]

Sources:
[1] tokio::time::error::Elapsed (docs).
[2] std::io::Error (Error::new) (docs).
[3] std::io::ErrorKind::TimedOut (docs).


🏁 Script executed:

# Let's check the pattern more carefully - maybe there's a custom From impl somewhere
grep -r "impl From" crates/amaru-network --include="*.rs" -A 3

Repository: pragma-org/amaru

Length of output: 42


🏁 Script executed:

# Check if the code actually uses the timeout result - maybe the ?? is doing something else
git show a8b87eaf:crates/amaru-network/src/connection.rs | sed -n '86,165p'

Repository: pragma-org/amaru

Length of output: 2606


Map timeout errors explicitly — this won't compile as-is.

tokio::time::timeout returns Result<T, Elapsed>, and Elapsed doesn't implement From<std::io::Error>. The double ? in .await?? assumes the first ? can convert Elapsed into std::io::Error, which it can't. Like trying to sneak into a pub with the wrong ID, mate — you'll get bounced at the door. Gotta make the conversion explicit so the compiler knows you're serious.

🛠️ Suggested fix
-    let (reader, writer) = tokio::time::timeout(timeout, TcpStream::connect(&*addr))
-        .await??
-        .into_split();
+    let stream = tokio::time::timeout(timeout, TcpStream::connect(&*addr))
+        .await
+        .map_err(|_| std::io::Error::new(std::io::ErrorKind::TimedOut, "connect timeout"))??;
+    let (reader, writer) = stream.into_split();
-                tokio::time::timeout(
-                    Duration::from_secs(100),
-                    connection.lock().await.write_all(&data),
-                )
-                .await??;
+                tokio::time::timeout(
+                    Duration::from_secs(100),
+                    connection.lock().await.write_all(&data),
+                )
+                .await
+                .map_err(|_| std::io::Error::new(std::io::ErrorKind::TimedOut, "send timeout"))??;

Also applies to: 155-159
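The shape of the suggested fix, mapping the outer timeout error explicitly and then flattening the nested result, can be sketched without tokio. This is an illustration only: `TimedOut` is a hypothetical stand-in for the timeout error type and `flatten_timeout` is a hypothetical helper, so the sketch says nothing about which `From` conversions tokio itself provides.

```rust
use std::io;

/// Hypothetical stand-in for a timeout error such as the one returned by a
/// timeout wrapper around an I/O future.
#[derive(Debug)]
pub struct TimedOut;

/// Flatten `Result<io::Result<T>, TimedOut>` into `io::Result<T>`.
/// The outer error is mapped explicitly to `ErrorKind::TimedOut` with a
/// message naming the operation; the inner I/O error passes through as is.
pub fn flatten_timeout<T>(
    nested: Result<io::Result<T>, TimedOut>,
    what: &'static str,
) -> io::Result<T> {
    // `?` unwraps the outer layer; the inner `io::Result<T>` is returned.
    nested.map_err(|_| io::Error::new(io::ErrorKind::TimedOut, what))?
}
```

An explicit mapping like this also lets each call site attach its own message ("connect timeout", "send timeout"), which a blanket conversion cannot do.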

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@crates/amaru-protocols/src/manager.rs`:
- Around line 74-77: The doc comment after the bullet list in manager.rs is
being treated as a lazy continuation by Clippy; add a blank line between the
list and the following paragraph so the sentence starting "A peer can be
added..." becomes a separate paragraph. Edit the doc comment above the Manager
operations (around the lines mentioning AddPeer and RemovePeer) to insert a
single empty line before the "A peer can be added right after being removed..."
sentence so Clippy's doc-lazy-continuation lint is satisfied.
- Around line 104-134: The Connect branch in ManagerMessage is awaiting
Network::new(...).connect(...) inside the manager stage which blocks processing
of other messages; move the blocking connect work out of the manager by spawning
an asynchronous task/future (via your executor helper, e.g., eff.spawn_task or
similar) that performs Network::connect(addr, Duration::from_secs(10)) and then
sends a result message back to the manager (e.g., ManagerMessage::ConnectResult
or reuse ManagerMessage::Connect with a result wrapper), so the manager only
updates manager.peers and returns immediately; keep the protocol invariant check
(the assert_eq! against ConnectionState::Scheduled) in the manager’s handler
that processes the connect result message rather than after the await here, and
preserve the retry logic by having the spawned task schedule
ManagerMessage::Connect(peer) on failure via eff.schedule_after.
♻️ Duplicate comments (3)
crates/amaru-network/src/connection.rs (1)

86-104: Fix timeout error mapping (this won’t compile as-is).

timeout(...).await?? can’t convert Elapsed into std::io::Error. You’ll need to map the timeout error explicitly, or the compiler will chuck a wobbly like a dodgy frame rate in Cyberpunk.

🎬 Suggested fix (explicit TimedOut mapping)
-    let (reader, writer) = tokio::time::timeout(timeout, TcpStream::connect(&*addr))
-        .await??
-        .into_split();
+    let stream = tokio::time::timeout(timeout, TcpStream::connect(&*addr))
+        .await
+        .map_err(|_| std::io::Error::new(std::io::ErrorKind::TimedOut, "connect timeout"))??;
+    let (reader, writer) = stream.into_split();
-                tokio::time::timeout(
-                    Duration::from_secs(100),
-                    connection.lock().await.write_all(&data),
-                )
-                .await??;
+                tokio::time::timeout(
+                    Duration::from_secs(100),
+                    connection.lock().await.write_all(&data),
+                )
+                .await
+                .map_err(|_| std::io::Error::new(std::io::ErrorKind::TimedOut, "send timeout"))??;

Also applies to: 155-159

crates/amaru-ouroboros-traits/src/connection.rs (1)

15-25: Import Future for BoxFuture alias.

Right now the alias references Future without bringing it into scope — this will fail to compile faster than a speedrun skip gone wrong.

🛠️ Minimal fix
 use std::{
     fmt,
+    future::Future,
     net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr, SocketAddrV4, SocketAddrV6},
     num::NonZeroUsize,
     pin::Pin,
     sync::Arc,
     time::Duration,
 };
crates/pure-stage/src/simulation/running/mod.rs (1)

1123-1141: Confirm resume_call_send’s error mapping still matches the internal semantics.
Line 1129+ maps false to “stage was not waiting for a call effect”. If resume_call_send_internal can also return false for “callee terminated”, the public error is misleading and can mask the real cause. Can you double-check resume_call_send_internal and tighten the mapping if needed? This is the same concern raised earlier, so I’m just making sure it didn’t slip back in.

#!/bin/bash
# Inspect resume_call_send_internal return semantics.
rg -n "fn resume_call_send_internal" crates/pure-stage/src/simulation/running/resume.rs -C4
rg -n "resume_call_send_internal" crates/pure-stage/src/simulation/running/resume.rs -C6
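
If the internal function does turn out to conflate two causes, one way to tighten the mapping is to return an enum instead of a `bool`. The sketch below is purely illustrative: `ResumeOutcome`, `ResumeError`, and `map_outcome` are hypothetical names, not the pure-stage API, and whether a "callee terminated" case exists at all is exactly what the script above is meant to check.

```rust
/// Hypothetical replacement for a bare `bool` return value.
#[derive(Debug, PartialEq)]
enum ResumeOutcome {
    Resumed,
    NotWaiting,
    CalleeTerminated,
}

/// Hypothetical public error with one variant per distinct cause.
#[derive(Debug, PartialEq)]
enum ResumeError {
    NotWaitingForCall,
    CalleeTerminated,
}

/// Maps each internal outcome to its own public error, so a terminated
/// callee is never reported as "stage was not waiting for a call effect".
fn map_outcome(outcome: ResumeOutcome) -> Result<(), ResumeError> {
    match outcome {
        ResumeOutcome::Resumed => Ok(()),
        ResumeOutcome::NotWaiting => Err(ResumeError::NotWaitingForCall),
        ResumeOutcome::CalleeTerminated => Err(ResumeError::CalleeTerminated),
    }
}
```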
🧹 Nitpick comments (2)
crates/amaru-protocols/src/mux.rs (1)

1054-1148: Good addition for TokioBuilder coverage.

This new test is like having both a simulation mode and a live playtest – covering both SimulationBuilder and TokioBuilder paths is solid. One small thing though: unlike test_tcp which asserts on the termination reason (Line 782: assert_eq!(&t(join_handle).await.unwrap(), mux.name())), this test just calls running.join() without checking what stage terminated or why. If the error case matters for validation, you might want to add a similar assertion here for consistency.

💡 Optional: Add termination assertion for parity with test_tcp
         // wrong protocol ID
         buf[5] += 1;
         t(tcp.write_all(&buf)).await.unwrap();
         t(tcp.flush()).await.unwrap();
-        t(running.join()).await;
+        let terminated = t(running.join()).await;
+        // Optionally assert on which stage terminated, similar to test_tcp
 
         trace_guard.defuse();
crates/pure-stage/src/simulation/running/mod.rs (1)

205-223: Return value should reflect actual wakeups, not schedule size deltas.
Right now Line 205–223 returns self.scheduled.len() != initial_scheduled_nb. If a wakeup schedules a replacement, the length can end up unchanged and you’ll report “no wakeups” even though you just ran some — a bit of a Mad Max “witness me” moment for run_until_*. Suggest keying off wakeups directly.

♻️ Proposed tweak
-        self.scheduled.len() != initial_scheduled_nb
+        !wakeups.is_empty()
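
The failure mode can be reproduced in isolation. The sketch below uses a hypothetical `Sim` type (not the pure-stage simulation) in which a `"tick"` wakeup re-schedules itself, so the schedule length is unchanged even though a wakeup ran:

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the simulation schedule: (due time, label).
struct Sim {
    now: u64,
    scheduled: VecDeque<(u64, &'static str)>,
}

impl Sim {
    /// Runs every entry that is due at `now`. A "tick" wakeup schedules
    /// its own replacement. Returns `(len_changed, ran_any)` so the two
    /// candidate return values can be compared side by side.
    fn run_due(&mut self) -> (bool, bool) {
        let initial_len = self.scheduled.len();
        let mut wakeups = Vec::new();
        let mut rest = VecDeque::new();
        // Split the schedule into due wakeups and everything else.
        while let Some(entry) = self.scheduled.pop_front() {
            if entry.0 <= self.now {
                wakeups.push(entry);
            } else {
                rest.push_back(entry);
            }
        }
        self.scheduled = rest;
        // Running a "tick" adds a replacement entry 10 time units later.
        for (due, label) in &wakeups {
            if *label == "tick" {
                self.scheduled.push_back((due + 10, "tick"));
            }
        }
        (self.scheduled.len() != initial_len, !wakeups.is_empty())
    }
}
```

With a single due `"tick"` entry, the length-based signal reports no change while the wakeup-based signal correctly reports that work was done.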
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0026c7c and 35118a5.

📒 Files selected for processing (9)
  • crates/amaru-network/src/connection.rs
  • crates/amaru-ouroboros-traits/src/connection.rs
  • crates/amaru-protocols/src/manager.rs
  • crates/amaru-protocols/src/mux.rs
  • crates/pure-stage/src/simulation/running/mod.rs
  • crates/pure-stage/src/stage_ref.rs
  • crates/pure-stage/tests/functional.rs
  • crates/pure-stage/tests/simulation.rs
  • simulation/amaru-sim/src/simulator/run.rs
🧰 Additional context used
🧠 Learnings (26)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 374
File: crates/amaru-stores/src/in_memory/mod.rs:427-433
Timestamp: 2025-08-18T08:10:32.640Z
Learning: The MemoryStore in crates/amaru-stores/src/in_memory/mod.rs is planned for a major revamp, so unimplemented methods like set_proposals_roots and set_constitution are intentionally left as placeholders until the revamp is complete.
📚 Learning: 2025-12-28T19:39:16.476Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru/src/stages/mod.rs:233-239
Timestamp: 2025-12-28T19:39:16.476Z
Learning: In crates/amaru/src/stages/mod.rs, the network.preload() operation for manager stage has a mailbox capacity of 10 messages. When preloading peers at startup, breaking on the first failure (after potentially filling the mailbox) is intentional design—the mailbox cannot hold more than 10 messages, so continuing to attempt preloads after a failure would be futile and would break the application.

Applied to files:

  • crates/amaru-protocols/src/manager.rs
  • crates/amaru-protocols/src/mux.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-12-28T19:26:35.354Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru-protocols/src/blockfetch/mod.rs:151-173
Timestamp: 2025-12-28T19:26:35.354Z
Learning: In crates/amaru-protocols/src/blockfetch/mod.rs, the blockfetch initiator uses .expect() when popping from the request queue on NoBlocks and Done results. These are intentional fail-fast assertions: the protocol state machine guarantees the queue is non-empty when these messages arrive, so an empty queue indicates a protocol violation. A misbehaving peer triggers an erroneous protocol transition that will close the connection (supervision to be implemented in a future PR). This follows the project's fail-fast philosophy for protocol invariants.

Applied to files:

  • crates/amaru-protocols/src/manager.rs
  • crates/amaru-ouroboros-traits/src/connection.rs
  • crates/amaru-network/src/connection.rs
  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-09-05T17:30:55.869Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Applied to files:

  • crates/amaru-protocols/src/manager.rs
  • crates/amaru-ouroboros-traits/src/connection.rs
  • crates/amaru-network/src/connection.rs
📚 Learning: 2025-09-29T16:39:24.001Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:317-325
Timestamp: 2025-09-29T16:39:24.001Z
Learning: In crates/amaru-network/src/mux.rs, the outgoing() method intentionally uses unwrap() after get_mut(&proto_id) as a fail-fast mechanism. This panic is designed to catch programming errors where an actor tries to send on an unregistered protocol, and should not be changed to return a Result since it represents internal code bugs that should terminate the process, not external input that should be handled gracefully.

Applied to files:

  • crates/amaru-protocols/src/manager.rs
  • crates/amaru-network/src/connection.rs
  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-08-08T14:39:50.527Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/amaru-kernel/src/borrowed_datum.rs:32-39
Timestamp: 2025-08-08T14:39:50.527Z
Learning: In the amaru project, when converting BorrowedDatumOption::Data to an owned DatumOption in crates/amaru-kernel/src/borrowed_datum.rs, the call `.unwrap()` refers to pallas’s KeepRaw::unwrap, which is infallible (always returns the inner value) and is not a panic risk. Future reviews should not flag this unwrap as dangerous.

Applied to files:

  • crates/amaru-protocols/src/manager.rs
📚 Learning: 2025-09-01T14:23:45.389Z
Learnt from: abailly
Repo: pragma-org/amaru PR: 416
File: crates/amaru-consensus/src/consensus/select_chain.rs:53-57
Timestamp: 2025-09-01T14:23:45.389Z
Learning: In the Amaru consensus system, the peer set in SyncTracker is static/predetermined, not dynamic. If a caught_up signal is received for an unknown peer, it should be logged as a warning rather than auto-inserted, as this indicates a potential configuration issue or system anomaly.

Applied to files:

  • crates/amaru-protocols/src/manager.rs
📚 Learning: 2025-12-16T21:32:37.668Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/amaru-network/src/handshake/tests.rs:40-47
Timestamp: 2025-12-16T21:32:37.668Z
Learning: In Rust, shadowing a binding with a new let does not drop the previous binding until the end of the scope. All shadowed bindings in a scope are dropped in reverse-declaration order when the scope ends. Therefore, multiple let _guard = register_*() calls will keep all guards alive until the end of the function (or the surrounding scope). When reviewing code, be mindful that resources tied to shadowed bindings persist longer than the most recent binding; to release early, constrain the lifetime in an inner block or explicitly drop guards when appropriate.

Applied to files:

  • crates/amaru-protocols/src/manager.rs
  • crates/amaru-ouroboros-traits/src/connection.rs
  • crates/amaru-network/src/connection.rs
  • crates/amaru-protocols/src/mux.rs
  • crates/pure-stage/src/stage_ref.rs
  • crates/pure-stage/src/simulation/running/mod.rs
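
The drop-order behaviour described in this learning can be checked with a small std-only sketch. `Guard`, `register`, and `shadowed_guards` are hypothetical names for illustration and are unrelated to the project's `register_*()` functions:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical guard that records its name in a shared log when dropped.
struct Guard {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn register(name: &'static str, log: &Rc<RefCell<Vec<&'static str>>>) -> Guard {
    Guard { name, log: Rc::clone(log) }
}

/// Returns how many guards were dropped just before the inner scope ended,
/// and the order in which they were dropped once it did end.
fn shadowed_guards() -> (usize, Vec<&'static str>) {
    let log = Rc::new(RefCell::new(Vec::new()));
    let dropped_before_scope_end;
    {
        let _guard = register("first", &log);
        // Shadowing hides "first" but does not drop it.
        let _guard = register("second", &log);
        dropped_before_scope_end = log.borrow().len();
    }
    let dropped = log.borrow().clone();
    (dropped_before_scope_end, dropped)
}
```

Both guards stay alive until the closing brace and are then dropped in reverse declaration order.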
📚 Learning: 2025-04-20T17:56:39.223Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:0-0
Timestamp: 2025-04-20T17:56:39.223Z
Learning: For mpsc::channel in Tokio-based test code, use buffer sizes larger than 1 (e.g., 8) to avoid potential deadlocks when producers send multiple messages before consumers can process them.

Applied to files:

  • crates/amaru-network/src/connection.rs
  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-08-08T14:34:06.105Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/minicbor-extra/src/lib.rs:50-55
Timestamp: 2025-08-08T14:34:06.105Z
Learning: Project uses Rust 1.88 stable; using Result::is_err_and is acceptable and should not be flagged as unstable. In particular, crates/minicbor-extra/src/lib.rs relies on is_err_and when checking end_of_input; future reviews should not suggest replacing it for stability reasons.

Applied to files:

  • crates/amaru-network/src/connection.rs
📚 Learning: 2025-09-29T16:38:59.323Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/mux.rs:216-233
Timestamp: 2025-09-29T16:38:59.323Z
Learning: In crates/amaru-network/src/mux.rs, the segment length field uses u16 type which naturally limits values to 65535, matching MAX_SEGMENT_SIZE constant exactly. This provides type-level safety against oversized allocations without needing runtime bounds checking.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-06-14T16:31:53.134Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 263
File: simulation/amaru-sim/src/simulator/simulate.rs:298-300
Timestamp: 2025-06-14T16:31:53.134Z
Learning: StageRef in the pure-stage crate supports serde serialization and deserialization (derives serde::Serialize and serde::Deserialize), enabling it to be used in structs that also derive these traits for TraceBuffer and replay functionality.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
  • crates/pure-stage/src/stage_ref.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2026-01-11T20:38:12.301Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/pure-stage/src/effect.rs:156-162
Timestamp: 2026-01-11T20:38:12.301Z
Learning: For pure-stage Effects::call in crates/pure-stage/src/effect.rs, rkuhn prefers to keep the runtime panic that prevents nested calls for now and only lift/relax this constraint later if/when it becomes necessary.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
  • crates/pure-stage/src/stage_ref.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-09-29T16:44:14.807Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 471
File: crates/amaru-network/src/protocol.rs:94-106
Timestamp: 2025-09-29T16:44:14.807Z
Learning: In the amaru-network crate protocol.rs file, the correct Cardano mini-protocol ID assignments are: PROTO_N2N_KEEP_ALIVE = 8 and PROTO_N2N_PEER_SHARE = 10, as verified against the network specification by the maintainer.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-05-12T14:21:27.470Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-04-20T17:57:23.233Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-08-12T12:28:24.027Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 372
File: simulation/amaru-sim/src/simulator/mod.rs:410-412
Timestamp: 2025-08-12T12:28:24.027Z
Learning: In the Amaru project, panic statements are acceptable in simulation/test code (like amaru-sim crate) as they help identify configuration issues quickly during development, rather than needing proper error handling like production code.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-05-09T13:09:47.915Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 206
File: crates/pure-stage/src/simulation/running.rs:240-242
Timestamp: 2025-05-09T13:09:47.915Z
Learning: Cloning messages in the pure-stage crate should be avoided for performance reasons. The current implementation in SimulationRunning deliberately avoids duplicating message data structures.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
  • crates/pure-stage/src/stage_ref.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2026-01-11T20:05:26.594Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru-protocols/src/chainsync/responder.rs:191-213
Timestamp: 2026-01-11T20:05:26.594Z
Learning: In crates/amaru-protocols/src/chainsync/responder.rs, the chainsync responder intentionally does not support serving headers when the tip is Origin. Amaru is not designed to cold-start a new Cardano blockchain, so the intersect() function correctly fails when tip is Origin without needing special handling. This is a conscious design decision.

Applied to files:

  • crates/amaru-protocols/src/mux.rs
📚 Learning: 2025-05-05T08:15:24.192Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 206
File: crates/pure-stage/src/simulation/state.rs:4-8
Timestamp: 2025-05-05T08:15:24.192Z
Learning: The `State` trait in the pure-stage crate already requires `Send` with its definition: `pub trait State: Any + fmt::Debug + Send + 'static`, making additional `+ Send` bounds redundant when using `Box<dyn State>`.

Applied to files:

  • crates/pure-stage/src/stage_ref.rs
📚 Learning: 2025-12-16T21:50:46.690Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/pure-stage/src/adapter.rs:67-100
Timestamp: 2025-12-16T21:50:46.690Z
Learning: In the pure-stage crate's adapter system (crates/pure-stage/src/adapter.rs), adapters cannot form cycles because an existing adapter cannot be repointed after creation. The Adapter's target field is immutable, preventing the formation of loops in the adapter chain.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-06-14T16:41:13.061Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 263
File: crates/pure-stage/src/simulation/running.rs:868-875
Timestamp: 2025-06-14T16:41:13.061Z
Learning: In the pure-stage simulation framework, the effect air-lock protocol is designed so that when a stage is polled, the stage implementation consumes/takes the value from the effect lock during polling. There's no need to manually clear the effect lock after Poll::Ready because "the other side will have taken the value out" - this is by design, not a bug.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-20T20:19:07.396Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/tests/simulation.rs:20-28
Timestamp: 2025-08-20T20:19:07.396Z
Learning: Waker::noop() was stabilized in Rust 1.85.0 (released February 20, 2025) and is available in std::task::Waker, so no external dependencies like futures-task are needed for creating no-op wakers in tests.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-20T20:18:50.214Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/tests/simulation.rs:459-469
Timestamp: 2025-08-20T20:18:50.214Z
Learning: Rust 1.85 stabilized the Waker::noop() API, making it the preferred way to create a no-op waker instead of using futures_task::noop_waker_ref(). Code using Waker::noop() in modern Rust codebases is correct and doesn't need to be changed to use the futures_task alternative.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-20T19:37:32.510Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/src/effect.rs:204-223
Timestamp: 2025-08-20T19:37:32.510Z
Learning: In the pure-stage framework, the terminate() method uses never() which panics if called. This is intentional design: if terminate() ever returns, it indicates a serious framework bug that should immediately panic rather than allowing potentially corrupted execution to continue. The panic serves as a failsafe to surface framework issues.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
🧬 Code graph analysis (3)
crates/amaru-protocols/src/manager.rs (2)
crates/amaru-protocols/src/chainsync/initiator.rs (1)
  • new (67-84)
crates/amaru-protocols/src/connection.rs (2)
  • new (49-66)
  • stage (124-154)
crates/amaru-network/src/connection.rs (3)
crates/amaru-network/src/socket_addr.rs (1)
  • resolve (19-35)
crates/amaru-ouroboros-traits/src/connection.rs (6)
  • initial (38-40)
  • connect (123-127)
  • connect_addrs (129-133)
  • send (135-139)
  • recv (141-145)
  • close (147-147)
crates/amaru-kernel/src/bytes.rs (1)
  • len (72-75)
crates/amaru-protocols/src/mux.rs (2)
crates/pure-stage/src/serde.rs (7)
  • register_data_deserializer (162-174)
  • boxed (179-181)
  • boxed (385-387)
  • boxed (535-537)
  • new (31-33)
  • new (376-378)
  • from (400-408)
crates/pure-stage/src/stage_ref.rs (1)
  • named_for_tests (100-102)
🪛 GitHub Actions: Coding Practices
crates/amaru-protocols/src/manager.rs

[error] 77-77: doc list item without indentation. If this is supposed to be its own paragraph, add a blank line. Clippy: doc-lazy-continuation.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: End-to-end snapshot tests (preview, 2, 680, true, 15, 60, 10.5.3)
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 182, 15, 60, 10.5.3)
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build riscv32
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build wasm32
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Benches
  • GitHub Check: Analyze (rust)
  • GitHub Check: Test the CLI
  • GitHub Check: Test coverage
🔇 Additional comments (25)
crates/amaru-protocols/src/mux.rs (15)

15-24: LGTM!

G'day mate! These import changes look ripper – they're properly aligned with the CallRef → StageRef migration happening across the codebase. Like swapping out your old NES for a Switch, same spirit but modernised internals.


35-44: LGTM!

This is a solid addition, like a well-organized inventory screen in an RPG. Registering all the mux-related deserializers in one clean function makes life easier for consumers. All the types here properly derive the serde traits, so you're golden.


113-128: Nice doc improvement and API update.

Love the clarified documentation about WantNext – it's like adding proper tooltips to a UI, helps folks understand the flow without diving into the source like it's a Dark Souls wiki. The CallRef → StageRef swap here is consistent with the rest of the PR.


388-401: LGTM!

Instrumentation choices here are spot on, mate. Keeping registration at DEBUG level and skipping the bulky params – that's proper observability hygiene without cluttering logs like a messy inventory.


447-456: LGTM!

TRACE level for the outgoing path is the right call – like setting your minimap zoom just right, you don't want it cluttering up the main display unless you're specifically looking for it. The skip_all with explicit fields pattern is clean.


458-478: LGTM!

The TRACE level update here is sweet – keeps your logs clean during normal ops but still gives you the visibility when you crank up the verbosity, like enabling developer console in a game. The fail-fast .expect() on Line 469 is intentional as per project philosophy.


480-493: LGTM!

Clean instrumentation, mate! Logging the byte length without the actual content is a pro move – like showing your loadout without revealing your secret combo moves.


495-509: LGTM!

TRACE level for want_next makes sense – it's like the background music volume, you want it there but not drowning out the important dialogue.


512-522: LGTM!

The notifiers field update from CallRef to StageRef is consistent with the broader migration. All your ducks are in a row here.


604-608: LGTM!

Signature update is solid. Like upgrading your gear in a looter – same slot, better stats.


610-626: LGTM!

The switch from eff.respond() to eff.send() on Line 620 is the correct adaptation for StageRef messaging. The notification logic for sent bytes is preserved – like patching a game without breaking the core mechanics.


629-653: LGTM!

Test imports are all lined up nicely – got your TokioConnections, your ConnectionResource, and your TokioBuilder ready to rock. It's like loading up your inventory before a boss fight.


672-785: LGTM!

The test_tcp updates are looking sharp, mate! The migration to TokioConnections and StageRef-based messaging is clean. The byte layout assertion on Line 738 ([1, 1, 0, 3, 1, 24, 33]) correctly expects proto ID 257 (0x0101) for PROTO_TEST, length 3, and the payload bytes. Proper unit testing – like doing a speedrun but actually verifying the strats work.
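
The byte layout in that assertion can be reproduced with a small sketch. `encode_segment` is a hypothetical helper, not the mux implementation, and it covers only the fields named above (big-endian protocol ID, big-endian `u16` length, payload); any other header fields the real mux writes are out of scope here.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: protocol ID and payload length as big-endian u16s,
/// followed by the payload. Returns `None` if the payload does not fit in
/// a u16 length field (more than 65535 bytes).
fn encode_segment(proto_id: u16, payload: &[u8]) -> Option<Vec<u8>> {
    let len = u16::try_from(payload.len()).ok()?;
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&proto_id.to_be_bytes());
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(payload);
    Some(out)
}
```

Encoding protocol ID 257 with payload `[1, 24, 33]` yields the seven bytes from the test, and the `u16` conversion rejects oversized payloads at the type boundary.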


804-804: Good follow-up on the naming.

I see ConnectionId::initial() here – nice one addressing the earlier review feedback about the naming. Much clearer than default() for communicating intent. It's like renaming your save file from "asdf" to "Main Campaign".


833-1021: LGTM!

The test_muxing simulation test is a proper stress test for the mux logic. All the helpers are updated for StageRef, the round-robin verification across protocols is thorough, and the buffer overflow test at the end is a nice edge case check. It's like having a QA team that actually plays your game instead of just checking boxes.

crates/pure-stage/src/stage_ref.rs (6)

15-16: Imports fit the new payload bits.
Looks tidy — like swapping in the right loadout before a boss fight.


27-32: Sender handle now initializes extra cleanly.
Simple and consistent — nice.


37-43: Confirm extra being skipped in serde is OK for trace/replay.
Since extra won’t survive (de)serialization, please sanity‑check that replay/TraceBuffer flows don’t rely on it — otherwise we may need a serializable wrapper or a doc note. Based on learnings, StageRef is serialized for TraceBuffer/replay.


52-60: Eq/Clone updates look spot‑on.
Cloning extra via Arc feels like the right trade‑off.


84-114: Helper APIs stay consistent with extra.
Nice flow — feels like a clean save point in a long RPG.


171-176: Test scaffolding updated correctly.
All good here.

crates/amaru-protocols/src/manager.rs (4)

84-103: LGTM! AddPeer state handling is solid.

Nice work here, legend! The state machine handles all the edge cases like a proper Souls game checkpoint system — whether you're already connected, scheduled, disconnecting, or fresh off the boat, there's a clear path forward. The transition from Disconnecting to Scheduled while letting the old connection clean itself up is a particularly elegant touch.


158-173: LGTM! RemovePeer correctly transitions to Disconnecting.

Top shelf, mate! The state transition to Disconnecting before awaiting the Disconnect send ensures that when ConnectionDied arrives later, it knows to clean up rather than schedule a reconnect. That was a sneaky bug that got squashed like a Goomba.

The handling of Scheduled and Disconnecting states by removing them directly is also correct — no active connection means nothing to wait for.


174-200: LGTM! ConnectionDied handles stale messages correctly.

Absolute ripper implementation here! The conn_id check at line 183 is the key defensive move that prevents stale ConnectionDied messages from corrupting a newer connection's state — like having a save-state checkpoint that doesn't get overwritten by an old autosave.

The unconditional socket close before state checks ensures resources get released regardless of the current peer state. Clean, defensive, and correct.
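
The stale-message guard can be sketched with a simplified model. Everything below is hypothetical (`ConnState`, `Outcome`, `on_connection_died`, a `u64` connection ID, `String` peers) and only mirrors the behaviour described in these comments; in particular, ignoring the message in the `Scheduled` state is an assumption, not something the review confirms.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical peer states.
#[derive(Debug, PartialEq, Clone, Copy)]
enum ConnState {
    Scheduled,
    Connected(u64), // holds the current connection ID
    Disconnecting,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    IgnoredStale,
    Reconnect,
    Removed,
    UnknownPeer,
}

/// Handles a "connection died" notification for `peer` and `conn_id`.
fn on_connection_died(
    peers: &mut HashMap<String, ConnState>,
    peer: &str,
    conn_id: u64,
) -> Outcome {
    match peers.get(peer).copied() {
        // Only the connection that is currently tracked may change state.
        Some(ConnState::Connected(current)) if current == conn_id => {
            peers.insert(peer.to_string(), ConnState::Scheduled);
            Outcome::Reconnect
        }
        // A different ID means the message refers to an older connection.
        Some(ConnState::Connected(_)) => Outcome::IgnoredStale,
        // The peer was being removed: clean up instead of reconnecting.
        Some(ConnState::Disconnecting) => {
            peers.remove(peer);
            Outcome::Removed
        }
        // Assumption: nothing to do while a reconnect is already scheduled.
        Some(ConnState::Scheduled) => Outcome::IgnoredStale,
        None => Outcome::UnknownPeer,
    }
}
```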


201-218: FetchBlocks works, but the error signaling could use some love down the track.

Fair dinkum, the unwrap issue from before is sorted — nice one! The if-let pattern is much safer than that previous .unwrap() that could've sent you straight to the Game Over screen.

The existing comment from past reviews about distinguishing "peer not found" from "no blocks in range" is still relevant but not blocking. When you circle back to polish this up (like a New Game+ run), consider wrapping Blocks in a Result or adding an error variant so callers know what actually went wrong.
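
As a rough illustration of that suggestion, the sketch below separates the two outcomes with a `Result`. The names (`FetchError`, `fetch_blocks`) and the data model (peers mapped to block slot numbers) are hypothetical and much simpler than the real `Blocks` type:

```rust
use std::collections::HashMap;

/// Hypothetical error: the request could not be served at all.
#[derive(Debug, PartialEq)]
enum FetchError {
    PeerNotFound,
}

/// Returns the slots held by `peer` within `from..=through`.
/// An unknown peer is an error; a known peer with nothing in range
/// is a successful, empty answer.
fn fetch_blocks(
    peers: &HashMap<String, Vec<u64>>,
    peer: &str,
    from: u64,
    through: u64,
) -> Result<Vec<u64>, FetchError> {
    let known = peers.get(peer).ok_or(FetchError::PeerNotFound)?;
    Ok(known
        .iter()
        .copied()
        .filter(|slot| (from..=through).contains(slot))
        .collect())
}
```

A caller can then tell `Err(FetchError::PeerNotFound)` apart from `Ok(vec![])` instead of receiving an empty result in both cases.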

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +104 to +134
ManagerMessage::Connect(peer) => {
// TODO(network) slow connection will block the manager, should delegate to a child stage
let entry = match manager.peers.get_mut(&peer) {
Some(ConnectionState::Connected(..)) => {
tracing::debug!(%peer, "discarding connection request, already connected");
return manager;
}
Some(entry @ ConnectionState::Scheduled) => entry,
Some(ConnectionState::Disconnecting) => {
tracing::debug!(%peer, "discarding connection request, already disconnecting");
return manager;
}
None => {
tracing::debug!(%peer, "discarding connection request, not added");
return manager;
}
};
let addr = ToSocketAddrs::String(peer.to_string());
let conn_id = match Network::new(&eff)
.connect(addr, Duration::from_secs(10))
.await
{
Ok(conn_id) => conn_id,
Err(err) => {
tracing::error!(?err, %peer, "failed to connect to peer");
eff.schedule_after(ManagerMessage::Connect(peer), Duration::from_secs(10))
.await;
assert_eq!(*entry, ConnectionState::Scheduled);
return manager;
}
};

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's understand the project structure and find pure-stage references
fd -t f "Cargo.toml" | head -5

Repository: pragma-org/amaru

Length of output: 195


🏁 Script executed:

# Search for pure-stage in the codebase
rg "pure" --type rust -i | head -20

Repository: pragma-org/amaru

Length of output: 1908


🏁 Script executed:

# Look at the manager.rs file to understand context around the assert and TODO
fd -t f "manager.rs" | grep -E "amaru.*manager\.rs"

Repository: pragma-org/amaru

Length of output: 97


🏁 Script executed:

# Check if there's documentation about the manager or message processing model
rg "ManagerMessage" --type rust -B 2 -A 2 | head -40

Repository: pragma-org/amaru

Length of output: 3043


🏁 Script executed:

# Look for other instances of assert_eq or similar patterns in manager context
rg "assert_eq|assert!" --type rust crates/amaru/src/stages/manager.rs -B 2 -A 2 | head -50

Repository: pragma-org/amaru

Length of output: 134


🏁 Script executed:

# Look at the manager.rs file (correct path: crates/amaru-protocols)
cat -n crates/amaru-protocols/src/manager.rs | head -160

Repository: pragma-org/amaru

Length of output: 7668


🏁 Script executed:

# Search for assert patterns in the correct manager.rs file
rg "assert" crates/amaru-protocols/src/manager.rs -B 3 -A 3

Repository: pragma-org/amaru

Length of output: 402


🏁 Script executed:

# Look at the stage function signature and pure-stage integration
rg "pure_stage|stage\(" crates/amaru-protocols/src/manager.rs -B 2 -A 2

Repository: pragma-org/amaru

Length of output: 812


🏁 Script executed:

# Check the Cargo.toml to see pure_stage dependency
cat crates/amaru-protocols/Cargo.toml

Repository: pragma-org/amaru

Length of output: 1055


🏁 Script executed:

# Let me check the pure_stage crate documentation or usage patterns
rg "pure_stage" --type rust -A 3 | head -80

Repository: pragma-org/amaru

Length of output: 5816


🏁 Script executed:

# Look for other async/await patterns in manager.rs to understand state mutation patterns
cat -n crates/amaru-protocols/src/manager.rs | sed -n '78,170p'

Repository: pragma-org/amaru

Length of output: 4955


🏁 Script executed:

# Check if there are other examples of assert_eq after await in the codebase to understand the pattern
rg "assert" --type rust -B 5 -A 2 | grep -A 7 "\.await"

Repository: pragma-org/amaru

Length of output: 33934


🏁 Script executed:

# Search for documentation or comments about pure_stage's message processing model
rg "sequential|concurrent|processing" crates/amaru-protocols/src/manager.rs -B 2 -A 2

Repository: pragma-org/amaru

Length of output: 42


🏁 Script executed:

# Check the Cargo.lock or workspace version to understand pure_stage version
fd Cargo.lock

Repository: pragma-org/amaru

Length of output: 114


🏁 Script executed:

# Look for pure-stage documentation or source comments about message processing
cat -n crates/pure-stage/src/lib.rs 2>/dev/null | head -100

Repository: pragma-org/amaru

Length of output: 2004


🏁 Script executed:

# Check if there's any documentation about sequential message processing
rg "sequential|single-threaded|one message" crates/pure-stage --type rust -B 2 -A 2

Repository: pragma-org/amaru

Length of output: 1081


🏁 Script executed:

# Look at how other stages handle assert after await to understand the pattern
rg "assert.*after\|after.*assert" crates/amaru-protocols --type rust -B 3 -A 3 | head -40

Repository: pragma-org/amaru

Length of output: 42


🏁 Script executed:

# Examine the network connect operation to see if it could mutate manager state
cat -n crates/amaru-protocols/src/network_effects.rs | head -80

Repository: pragma-org/amaru

Length of output: 3167


🏁 Script executed:

# Look at ConnectionState to understand the size and whether reference invalidation is possible
rg "enum ConnectionState" crates/amaru-protocols/src/manager.rs -A 6

Repository: pragma-org/amaru

Length of output: 280


🏁 Script executed:

# Check if there are any other places that modify the peers HashMap during Connect handling
rg "peers\.remove\|peers\.insert" crates/amaru-protocols/src/manager.rs -B 2 -A 2

Repository: pragma-org/amaru

Length of output: 42


🏁 Script executed:

# Verify that pure-stage is indeed single-message-at-a-time per stage
rg "single.threaded|message.*queue|actor" crates/pure-stage/src --type rust -B 1 -A 1 | head -40

Repository: pragma-org/amaru

Length of output: 2551


The TODO is legit—blocking connect could stall the manager, but the assert is sound.

Mate, you've nailed it with that TODO(network) comment on line 105. If the connection takes its time (or the 10s timeout keeps firing for multiple peers), the manager stage gets stuck like a broken lift—can't process any other peer operations while waiting. It's a real tradeoff between simplicity and responsiveness.

The assert_eq! on line 131 is doing proper work though. Pure-stage processes messages sequentially per stage (actor model), so while we're awaiting that connect, no other message can sneak in and mutate the peers map. The assert verifies a protocol invariant—if the entry ever wasn't Scheduled after the await, it'd signal a serious bug that warrants a fail-fast. Solid defensive programming that fits the project's philosophy.

🤖 Prompt for AI Agents
In `@crates/amaru-protocols/src/manager.rs` around lines 104 - 134, The Connect
branch in ManagerMessage is awaiting Network::new(...).connect(...) inside the
manager stage which blocks processing of other messages; move the blocking
connect work out of the manager by spawning an asynchronous task/future (via
your executor helper, e.g., eff.spawn_task or similar) that performs
Network::connect(addr, Duration::from_secs(10)) and then sends a result message
back to the manager (e.g., ManagerMessage::ConnectResult or reuse
ManagerMessage::Connect with a result wrapper), so the manager only updates
manager.peers and returns immediately; keep the protocol invariant check (the
assert_eq! against ConnectionState::Scheduled) in the manager’s handler that
processes the connect result message rather than after the await here, and
preserve the retry logic by having the spawned task schedule
ManagerMessage::Connect(peer) on failure via eff.schedule_after.
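
The restructuring described in this prompt can be sketched with plain threads and channels. This is only an illustration under loud assumptions: `Manager`, `ManagerMessage::ConnectResult`, `Connector`, `fake_connect` and `run_demo` are hypothetical stand-ins, a `std::thread` replaces whatever spawning facility pure-stage offers, and the retry scheduling is left out.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical stand-ins; these are not the actual amaru-protocols types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConnectionState {
    Scheduled,
    Connected,
    Failed,
}

#[derive(Debug)]
pub enum ManagerMessage {
    Connect(String),
    ConnectResult(String, Result<(), String>),
}

/// Hypothetical connector signature; a real one would perform network I/O.
pub type Connector = fn(&str) -> Result<(), String>;

#[derive(Default)]
pub struct Manager {
    pub peers: HashMap<String, ConnectionState>,
}

impl Manager {
    /// Handles one message without ever blocking on the network.
    pub fn handle(&mut self, msg: ManagerMessage, me: &Sender<ManagerMessage>, connect: Connector) {
        match msg {
            ManagerMessage::Connect(peer) => {
                self.peers.insert(peer.clone(), ConnectionState::Scheduled);
                let me = me.clone();
                // The slow connect runs on another thread and reports back as a message.
                thread::spawn(move || {
                    let result = connect(&peer);
                    let _ = me.send(ManagerMessage::ConnectResult(peer, result));
                });
            }
            ManagerMessage::ConnectResult(peer, result) => {
                // Messages are handled one at a time, so the entry must still be Scheduled.
                assert_eq!(self.peers.get(&peer), Some(&ConnectionState::Scheduled));
                let next = if result.is_ok() {
                    ConnectionState::Connected
                } else {
                    ConnectionState::Failed
                };
                self.peers.insert(peer, next);
            }
        }
    }
}

/// Hypothetical connector for the demo: addresses starting with "bad" fail.
pub fn fake_connect(addr: &str) -> Result<(), String> {
    if addr.starts_with("bad") {
        Err(format!("connection to {addr} refused"))
    } else {
        Ok(())
    }
}

/// Sends `Connect` for each (distinct) peer and processes messages until all results are in.
pub fn run_demo(peers: &[&str]) -> HashMap<String, ConnectionState> {
    let (tx, rx) = channel();
    let mut manager = Manager::default();
    for peer in peers {
        tx.send(ManagerMessage::Connect((*peer).to_string())).unwrap();
    }
    let mut pending = peers.len();
    while pending > 0 {
        let msg = rx.recv().unwrap();
        if matches!(msg, ManagerMessage::ConnectResult(..)) {
            pending -= 1;
        }
        manager.handle(msg, &tx, fake_connect);
    }
    manager.peers
}
```

In this shape the invariant check lives in the handler for the result message, so the manager stays responsive while connects are in flight.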

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@crates/pure-stage/src/simulation/running/mod.rs`:
- Around line 233-243: The assert in schedule_wakeup (ScheduleId.time() >
self.clock.now()) can panic if Effect::Wait constructs a wakeup at now when
given Duration::ZERO; update the Effect::Wait handling (the code that computes
id = ScheduleId::new(now + duration) in the Effect::Wait branch) to validate the
duration is strictly positive before computing now + duration (reject or return
an error when duration == Duration::ZERO), or alternatively normalize zero to a
minimum positive duration, and document this behavior; ensure the check
references the Effect::Wait handler and ScheduleId construction so
Duration::ZERO never reaches schedule_wakeup.

In `@crates/pure-stage/src/stagegraph.rs`:
- Around line 45-49: The Display implementation for ScheduleId prints fields in
the wrong order: ScheduleId is (Instant, u64) but the format string "id {} at
{}" expects (id, time); update the impl fmt::Display for ScheduleId (the fmt
method using fmt::Formatter and write!) to print the u64 id first and the
Instant second (e.g., use self.1 then self.0 or destructure into (instant, id)
and format "id {} at {}" with id then instant) so the output matches "id <u64>
at <Instant>".
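
A minimal sketch of the corrected `Display` ordering, assuming a tuple struct laid out as `(Instant, u64)` as described above. The type below is a hypothetical stand-in, not the crate's definition, and because `Instant` has no `Display` impl the sketch prints it with `{:?}`.

```rust
use std::fmt;
use std::time::Instant;

/// Hypothetical stand-in mirroring the `(Instant, u64)` layout described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct ScheduleId(Instant, u64);

impl ScheduleId {
    pub fn new(time: Instant, id: u64) -> Self {
        ScheduleId(time, id)
    }
}

impl fmt::Display for ScheduleId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Destructure by name so the argument order cannot silently drift.
        let ScheduleId(time, id) = self;
        write!(f, "id {} at {:?}", id, time)
    }
}
```

Destructuring into named bindings instead of using `self.0` and `self.1` makes a swapped argument order easier to spot in review.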
🧹 Nitpick comments (2)
crates/pure-stage/src/simulation/running/scheduled_runnables.rs (1)

47-56: The Option::iter().all() idiom is clever but a wee bit cryptic.

Fair play, this works correctly - max_time.iter() produces an empty iterator when None (so all() returns true), or a single-element iterator when Some. It's like a hidden easter egg in the code that only seasoned Rustaceans will spot immediately.

Consider adding a brief comment for future maintainers, or alternatively using a more explicit pattern:

💡 Alternative for clarity (optional)
     pub fn wakeup(&mut self, max_time: Option<Instant>) -> Option<(Instant, Runnable)> {
         let id = self
             .by_id
             .first_key_value()
-            .and_then(|(id, _r)| max_time.iter().all(|t| id.time() <= *t).then_some(*id))?;
+            .and_then(|(id, _r)| {
+                // If max_time is None, always proceed; otherwise check time constraint
+                let within_limit = max_time.is_none_or(|t| id.time() <= t);
+                within_limit.then_some(*id)
+            })?;
         let runnable = self.by_id.remove(&id)?;
         Some((id.time(), runnable))
     }
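
The equivalence of the two forms can be checked in isolation. The sketch below uses plain `u64` values in place of `Instant`, and both function names are made up for illustration; `Option::is_none_or` needs Rust 1.82 or newer.

```rust
/// Original idiom: iterating `None` yields nothing, so `all` is vacuously true.
pub fn within_limit_iter(time: u64, max_time: Option<u64>) -> bool {
    max_time.iter().all(|t| time <= *t)
}

/// Explicit form suggested in the review.
pub fn within_limit_explicit(time: u64, max_time: Option<u64>) -> bool {
    max_time.is_none_or(|t| time <= t)
}
```

Both return `true` when no limit is given and otherwise compare against the limit, so the choice between them is purely one of readability.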
crates/pure-stage/src/simulation/running/mod.rs (1)

893-950: terminate_stage recursion looks proper!

This is like a proper cascade shutdown in a game engine - parent terminates, children follow, state gets cleaned up. The recursive approach at lines 941-944 ensures no orphaned stages remain.

The TODOs at lines 898-900 are worth tracking - killing scheduled external effects and cleaning up scheduled messages would complete the picture. Want me to open an issue to track those?
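
The cascade described in this comment can be sketched as a recursive walk over a parent-to-children map. Everything here is assumed for illustration: the map layout, the string stage names and the `terminated` log are not taken from the crate, and the TODO items (scheduled external effects and messages) are not modelled.

```rust
use std::collections::HashMap;

/// Terminates `name` and then, recursively, every stage it supervises.
/// `children` maps a stage name to the names of its direct children (hypothetical layout).
pub fn terminate_stage(
    children: &mut HashMap<String, Vec<String>>,
    name: &str,
    terminated: &mut Vec<String>,
) {
    // Removing the entry first means an already-terminated stage is simply skipped.
    let Some(kids) = children.remove(name) else {
        return;
    };
    terminated.push(name.to_string());
    for kid in &kids {
        terminate_stage(children, kid, terminated);
    }
}
```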

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 35118a5 and 9531c04.

📒 Files selected for processing (3)
  • crates/pure-stage/src/simulation/running/mod.rs
  • crates/pure-stage/src/simulation/running/scheduled_runnables.rs
  • crates/pure-stage/src/stagegraph.rs
🧰 Additional context used
🧠 Learnings (21)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru-protocols/src/chainsync/responder.rs:191-213
Timestamp: 2026-01-11T20:05:26.594Z
Learning: In crates/amaru-protocols/src/chainsync/responder.rs, the chainsync responder intentionally does not support serving headers when the tip is Origin. Amaru is not designed to cold-start a new Cardano blockchain, so the intersect() function correctly fails when tip is Origin without needing special handling. This is a conscious design decision.
📚 Learning: 2025-08-08T14:34:06.105Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/minicbor-extra/src/lib.rs:50-55
Timestamp: 2025-08-08T14:34:06.105Z
Learning: Project uses Rust 1.88 stable; using Result::is_err_and is acceptable and should not be flagged as unstable. In particular, crates/minicbor-extra/src/lib.rs relies on is_err_and when checking end_of_input; future reviews should not suggest replacing it for stability reasons.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
📚 Learning: 2025-01-21T15:32:17.911Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 69
File: crates/amaru/src/ledger/state/diff_epoch_reg.rs:112-117
Timestamp: 2025-01-21T15:32:17.911Z
Learning: When suggesting code changes in Rust, always verify that the types align correctly, especially when dealing with references and Options. The `Fold::Registered` variant in `diff_epoch_reg.rs` expects a reference `&'a V`, so unwrapping an `Option<&V>` requires only a single `.expect()`.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
📚 Learning: 2026-01-11T20:38:12.301Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/pure-stage/src/effect.rs:156-162
Timestamp: 2026-01-11T20:38:12.301Z
Learning: For pure-stage Effects::call in crates/pure-stage/src/effect.rs, rkuhn prefers to keep the runtime panic that prevents nested calls for now and only lift/relax this constraint later if/when it becomes necessary.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-12T12:28:24.027Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 372
File: simulation/amaru-sim/src/simulator/mod.rs:410-412
Timestamp: 2025-08-12T12:28:24.027Z
Learning: In the Amaru project, panic statements are acceptable in simulation/test code (like amaru-sim crate) as they help identify configuration issues quickly during development, rather than needing proper error handling like production code.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-12-28T19:26:35.354Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru-protocols/src/blockfetch/mod.rs:151-173
Timestamp: 2025-12-28T19:26:35.354Z
Learning: In crates/amaru-protocols/src/blockfetch/mod.rs, the blockfetch initiator uses .expect() when popping from the request queue on NoBlocks and Done results. These are intentional fail-fast assertions: the protocol state machine guarantees the queue is non-empty when these messages arrive, so an empty queue indicates a protocol violation. A misbehaving peer triggers an erroneous protocol transition that will close the connection (supervision to be implemented in a future PR). This follows the project's fail-fast philosophy for protocol invariants.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
📚 Learning: 2025-04-20T18:02:25.073Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward.rs:73-75
Timestamp: 2025-04-20T18:02:25.073Z
Learning: In the current development stage, rkuhn prefers using explicit panics (via `.expect()` or `.unwrap()`) for fatal errors in the application code that would tear down the node, rather than propagating errors with `Result`. The intention is to eventually transition to proper error handling with `Result` as the codebase matures.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
📚 Learning: 2025-04-20T17:57:23.233Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
📚 Learning: 2025-09-05T17:30:55.869Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
  • crates/pure-stage/src/simulation/running/scheduled_runnables.rs
📚 Learning: 2025-06-14T16:31:53.134Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 263
File: simulation/amaru-sim/src/simulator/simulate.rs:298-300
Timestamp: 2025-06-14T16:31:53.134Z
Learning: StageRef in the pure-stage crate supports serde serialization and deserialization (derives serde::Serialize and serde::Deserialize), enabling it to be used in structs that also derive these traits for TraceBuffer and replay functionality.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-05-05T08:15:24.192Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 206
File: crates/pure-stage/src/simulation/state.rs:4-8
Timestamp: 2025-05-05T08:15:24.192Z
Learning: The `State` trait in the pure-stage crate already requires `Send` with its definition: `pub trait State: Any + fmt::Debug + Send + 'static`, making additional `+ Send` bounds redundant when using `Box<dyn State>`.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
📚 Learning: 2025-12-16T21:32:37.668Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/amaru-network/src/handshake/tests.rs:40-47
Timestamp: 2025-12-16T21:32:37.668Z
Learning: In Rust, shadowing a binding with a new let does not drop the previous binding until the end of the scope. All shadowed bindings in a scope are dropped in reverse-declaration order when the scope ends. Therefore, multiple let _guard = register_*() calls will keep all guards alive until the end of the function (or the surrounding scope). When reviewing code, be mindful that resources tied to shadowed bindings persist longer than the most recent binding; to release early, constrain the lifetime in an inner block or explicitly drop guards when appropriate.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
  • crates/pure-stage/src/simulation/running/scheduled_runnables.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-12-16T21:50:46.690Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/pure-stage/src/adapter.rs:67-100
Timestamp: 2025-12-16T21:50:46.690Z
Learning: In the pure-stage crate's adapter system (crates/pure-stage/src/adapter.rs), adapters cannot form cycles because an existing adapter cannot be repointed after creation. The Adapter's target field is immutable, preventing the formation of loops in the adapter chain.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-05-09T13:09:47.915Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 206
File: crates/pure-stage/src/simulation/running.rs:240-242
Timestamp: 2025-05-09T13:09:47.915Z
Learning: Cloning messages in the pure-stage crate should be avoided for performance reasons. The current implementation in SimulationRunning deliberately avoids duplicating message data structures.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-06-14T16:41:13.061Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 263
File: crates/pure-stage/src/simulation/running.rs:868-875
Timestamp: 2025-06-14T16:41:13.061Z
Learning: In the pure-stage simulation framework, the effect air-lock protocol is designed so that when a stage is polled, the stage implementation consumes/takes the value from the effect lock during polling. There's no need to manually clear the effect lock after Poll::Ready because "the other side will have taken the value out" - this is by design, not a bug.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-20T20:19:07.396Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/tests/simulation.rs:20-28
Timestamp: 2025-08-20T20:19:07.396Z
Learning: Waker::noop() was stabilized in Rust 1.85.0 (released February 20, 2025) and is available in std::task::Waker, so no external dependencies like futures-task are needed for creating no-op wakers in tests.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-20T20:18:50.214Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/tests/simulation.rs:459-469
Timestamp: 2025-08-20T20:18:50.214Z
Learning: Rust 1.85 stabilized the Waker::noop() API, making it the preferred way to create a no-op waker instead of using futures_task::noop_waker_ref(). Code using Waker::noop() in modern Rust codebases is correct and doesn't need to be changed to use the futures_task alternative.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-20T19:37:32.510Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/src/effect.rs:204-223
Timestamp: 2025-08-20T19:37:32.510Z
Learning: In the pure-stage framework, the terminate() method uses never() which panics if called. This is intentional design: if terminate() ever returns, it indicates a serious framework bug that should immediately panic rather than allowing potentially corrupted execution to continue. The panic serves as a failsafe to surface framework issues.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-05-12T14:21:27.470Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-12-28T19:39:16.476Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru/src/stages/mod.rs:233-239
Timestamp: 2025-12-28T19:39:16.476Z
Learning: In crates/amaru/src/stages/mod.rs, the network.preload() operation for manager stage has a mailbox capacity of 10 messages. When preloading peers at startup, breaking on the first failure (after potentially filling the mailbox) is intentional design—the mailbox cannot hold more than 10 messages, so continuing to attempt preloads after a failure would be futile and would break the application.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
🧬 Code graph analysis (1)
crates/pure-stage/src/simulation/running/scheduled_runnables.rs (1)
crates/pure-stage/src/stagegraph.rs (2)
  • new (36-38)
  • time (40-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: End-to-end snapshot tests (preview, 2, 680, true, 15, 60, 10.5.3)
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 182, 15, 60, 10.5.3)
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build wasm32
  • GitHub Check: Build riscv32
  • GitHub Check: Build aarch64/linux
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Benches
  • GitHub Check: Test the CLI
  • GitHub Check: Test coverage
  • GitHub Check: Analyze (rust)
🔇 Additional comments (11)
crates/pure-stage/src/stagegraph.rs (2)

30-33: LGTM on the ScheduleId struct!

Grand job here, mate! The struct now properly derives PartialEq instead of the manual wildcard implementation that was causing all those headaches with Eq/Ord contract violations. Like fixing a game-breaking bug before launch day - sorted!

The ordering by (Instant, u64) ensures time-first ordering which aligns with the doc comment. Proper sound design, that is.


110-119: Nice simplification of the StageGraph trait!

Swapping out the RefAux associated type for Box<dyn Any + Send> is a solid move - keeps things flexible without the type complexity. Like going from a complex skill tree to a more streamlined progression system. Good stuff!

crates/pure-stage/src/simulation/running/scheduled_runnables.rs (2)

19-25: Clean data structure design!

Using a BTreeMap<ScheduleId, Runnable> is a clever choice here - since ScheduleId is ordered by time first (thanks to the Ord derive and field ordering), you get O(log n) inserts and O(log n) access to the earliest scheduled item via first_key_value(). It's like having a perfectly sorted quest log that always shows you what's due next!
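
A minimal sketch of that structure, assuming a key ordered by time and then by a sequence counter. `Key` and `Scheduled` are hypothetical names and the payload type is left generic; only the `BTreeMap` calls are real standard-library API.

```rust
use std::collections::BTreeMap;
use std::time::Instant;

/// Hypothetical key: the derived `Ord` compares `time` first, then `seq`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Key {
    time: Instant,
    seq: u64,
}

pub struct Scheduled<T> {
    by_id: BTreeMap<Key, T>,
    next_seq: u64,
}

impl<T> Scheduled<T> {
    pub fn new() -> Self {
        Self { by_id: BTreeMap::new(), next_seq: 0 }
    }

    /// Inserts an item; the counter keeps keys unique for equal times.
    pub fn schedule(&mut self, time: Instant, item: T) -> Key {
        let key = Key { time, seq: self.next_seq };
        self.next_seq += 1;
        self.by_id.insert(key, item);
        key
    }

    /// The smallest key is always the earliest wakeup.
    pub fn next_wakeup_time(&self) -> Option<Instant> {
        self.by_id.first_key_value().map(|(key, _)| key.time)
    }

    /// Removes the earliest item unless it lies beyond `max_time`.
    pub fn wakeup(&mut self, max_time: Option<Instant>) -> Option<(Instant, T)> {
        let key = *self.by_id.first_key_value()?.0;
        if let Some(limit) = max_time {
            if key.time > limit {
                return None;
            }
        }
        let item = self.by_id.remove(&key)?;
        Some((key.time, item))
    }
}
```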


93-119: Solid test coverage!

These tests are like a proper QA playthrough - covering the happy path, edge cases, and boundary conditions. The next_wakeup_time_is_the_smallest_time test nicely verifies the BTreeMap ordering works as expected. Good work, mate!

crates/pure-stage/src/simulation/running/mod.rs (7)

84-94: Good integration of ScheduledRunnables and ScheduleIds!

The refactor from the previous scheduling approach to ScheduledRunnables and ScheduleIds is clean. Like upgrading your inventory system from a basic list to a proper sorted container - much more efficient for what you're doing here!


205-227: Skip-to-next-wakeup logic is solid!

The approach of waking up all runnables at the same scheduled time before advancing the clock is proper simulation semantics - like ensuring all events in the same game tick resolve before moving to the next frame. The max_time limiting is a nice touch for controlled time advancement.
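
As a rough illustration of these semantics, the sketch below uses integer ticks as virtual time and strings as runnables; the function name, the `(time, seq)` key and the signature are all assumptions rather than the crate's API.

```rust
use std::collections::BTreeMap;

/// Collects every entry due at the earliest scheduled time, then moves the clock there.
/// Returns nothing (and leaves the clock alone) if that time lies beyond `max_time`.
pub fn skip_to_next_wakeup(
    clock: &mut u64,
    scheduled: &mut BTreeMap<(u64, u64), String>,
    max_time: Option<u64>,
) -> Vec<String> {
    let Some((&(next, _), _)) = scheduled.first_key_value() else {
        return Vec::new();
    };
    if let Some(limit) = max_time {
        if next > limit {
            return Vec::new();
        }
    }
    let mut woken = Vec::new();
    // Keys are ordered by time first, so all entries for `next` sit at the front.
    while let Some(entry) = scheduled.first_entry() {
        if entry.key().0 != next {
            break;
        }
        woken.push(entry.remove());
    }
    *clock = next;
    woken
}
```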


723-745: Nice handling of immediate vs future scheduling in Effect::Schedule!

I see you've got the if id.time() > now check here (line 733) to handle cases where the schedule time is now or in the past - delivering immediately instead of trying to schedule. That's proper defensive coding, like having a fallback when a quick-time event fails.

However, this same pattern isn't applied to Effect::Wait (lines 704-721), which would panic on Duration::ZERO. Might want to add similar handling there for consistency.
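
One way to give a wait the same treatment is to decide up front whether anything needs scheduling at all. The sketch below is an assumption about how that could look; `WaitPlan` and `plan_wait` are hypothetical names and do not exist in pure-stage.

```rust
use std::time::{Duration, Instant};

/// Hypothetical outcome of handling a wait request in the simulation.
#[derive(Debug, PartialEq, Eq)]
pub enum WaitPlan {
    /// Resume the stage right away; nothing is scheduled.
    ResumeNow,
    /// Schedule a wakeup at a time strictly after `now`.
    WakeAt(Instant),
}

/// A zero duration never reaches the scheduler, so a "strictly in the future"
/// assertion on scheduled wakeups cannot fire for it.
pub fn plan_wait(now: Instant, duration: Duration) -> WaitPlan {
    if duration.is_zero() {
        WaitPlan::ResumeNow
    } else {
        WaitPlan::WakeAt(now + duration)
    }
}
```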


1386-1447: Nice encapsulation pattern with the private module!

Using a nested module to enforce field privacy is a classic Rust pattern - like putting your secret items in a hidden room that only certain code paths can access. The usize::MAX check at line 1440 prevents any sneaky underflow issues with "infinite" overrides. Solid work!


1455-1508: Block reason logic is comprehensive!

The categorization into Idle, Sleeping, Busy, and Deadlock states covers all the bases. And I notice the needless borrow on scheduled.contains(id) from the past review is all sorted now - no more double referencing. Clean as a whistle!


1514-1554: poll_stage integration with ScheduleIds is clean!

The effect splitting at line 1548 now properly passes the schedule_ids factory and current time, enabling the new scheduling semantics. Like upgrading your game engine to a new physics system without breaking the existing gameplay loop. Well done!


1597-1694: Simulation invariants test is a proper stress test!

This test is like running a full debug playthrough - checking that each effect type works correctly and that the simulation maintains its invariants throughout. The loop through ops with predicate checking is clever for catching any state corruption. Top-notch QA, mate!

rkuhn added 2 commits January 16, 2026 13:46
this is tracked internally

Signed-off-by: Roland Kuhn <[email protected]>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/pure-stage/src/simulation/running/mod.rs`:
- Around line 720-741: In the Effect::Schedule handling, delivery failures from
deliver_message are currently ignored; update both the scheduled closure passed
to schedule_wakeup and the immediate-send branch to match on the Result from
deliver_message and emit a log/tracing event when Err occurs (include the timer
id, at_stage, msg metadata or type, and the returned error like Full/NotFound).
Locate deliver_message calls in this block (the closure inside schedule_wakeup
and the immediate call using &mut self.stages, self.mailbox_size, at_stage, msg)
and replace the throwaway let _ = ... with a match or if let Err(e) = ... that
logs the error with context instead of silently dropping it.
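
The shape of that change can be sketched as follows. The mailbox model, the `DeliverError` enum and both function signatures are assumptions made for the example, and a `Vec<String>` stands in for a real log or tracing event so the behaviour can be observed.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical failure modes, named after the ones mentioned above.
#[derive(Debug, PartialEq, Eq)]
pub enum DeliverError {
    Full,
    NotFound,
}

/// Simplified stand-in for message delivery into a bounded mailbox.
pub fn deliver_message(
    mailboxes: &mut HashMap<String, VecDeque<String>>,
    mailbox_size: usize,
    at_stage: &str,
    msg: String,
) -> Result<(), DeliverError> {
    let mailbox = mailboxes.get_mut(at_stage).ok_or(DeliverError::NotFound)?;
    if mailbox.len() >= mailbox_size {
        return Err(DeliverError::Full);
    }
    mailbox.push_back(msg);
    Ok(())
}

/// Delivers a scheduled message and records a failure instead of discarding it.
pub fn deliver_scheduled(
    mailboxes: &mut HashMap<String, VecDeque<String>>,
    mailbox_size: usize,
    timer_id: u64,
    at_stage: &str,
    msg: String,
    log: &mut Vec<String>,
) {
    if let Err(e) = deliver_message(mailboxes, mailbox_size, at_stage, msg) {
        // Stand-in for a warning-level log line with the same context fields.
        log.push(format!(
            "scheduled message dropped: timer={timer_id} stage={at_stage} error={e:?}"
        ));
    }
}
```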
♻️ Duplicate comments (2)
crates/pure-stage/src/simulation/running/mod.rs (2)

687-690: Call-send failure still hard‑terminates the caller.
Feels like the same earlier concern: if the callee is already gone, letting the caller time out is likely kinder than nuking the sim.


1123-1141: resume_call_send still conflates failure modes.
This is the same earlier mismatch: false can mean “callee terminated”, not just “not waiting.”
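
One possible way to separate the two failure modes is to return an enum instead of a `bool`. The sketch below is speculative: the stage map, `StageState`, `ResumeCallSend` and the order of the checks are invented for illustration and do not reflect the actual signature of resume_call_send.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StageState {
    WaitingForCallSend,
    Running,
}

/// Hypothetical result type that keeps the failure modes apart.
#[derive(Debug, PartialEq, Eq)]
pub enum ResumeCallSend {
    Resumed,
    CallerNotWaiting,
    CalleeTerminated,
}

pub fn resume_call_send(
    stages: &mut HashMap<String, StageState>,
    caller: &str,
    callee: &str,
) -> ResumeCallSend {
    // A missing callee is reported on its own so the caller can react differently.
    if !stages.contains_key(callee) {
        return ResumeCallSend::CalleeTerminated;
    }
    match stages.get_mut(caller) {
        Some(state) if *state == StageState::WaitingForCallSend => {
            *state = StageState::Running;
            ResumeCallSend::Resumed
        }
        _ => ResumeCallSend::CallerNotWaiting,
    }
}
```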

🧹 Nitpick comments (2)
crates/pure-stage/src/stagegraph.rs (1)

110-125: Consider a tiny alias/doc note for the boxed aux type.
With Box<dyn Any + Send> now in the public trait, a short alias (or a brief doc note that a StageBuildRef must be wired by the builder that created it) could make the API feel less like a Diablo loot box.

crates/pure-stage/src/simulation/running/mod.rs (1)

895-898: TODO on kill‑switch for scheduled externals.
If you want, I can sketch an issue + plan for the cancellation plumbing.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74dd8dd and 3d84165.

📒 Files selected for processing (5)
  • crates/pure-stage/src/simulation/running/mod.rs
  • crates/pure-stage/src/stagegraph.rs
  • crates/pure-stage/src/tokio.rs
  • crates/pure-stage/tests/functional.rs
  • crates/pure-stage/tests/simulation.rs
🧰 Additional context used
🧠 Learnings (21)
📓 Common learnings
Learnt from: abailly
Repo: pragma-org/amaru PR: 75
File: crates/amaru/src/consensus/mod.rs:164-165
Timestamp: 2025-02-03T11:15:22.640Z
Learning: In the Amaru project, chain selection operations (roll_forward and rollback) should use separate result types to leverage the type system for preventing impossible states, rather than using runtime checks or panics.
Learnt from: abailly
Repo: pragma-org/amaru PR: 228
File: crates/amaru-stores/src/rocksdb/consensus.rs:89-128
Timestamp: 2025-05-21T18:58:48.631Z
Learning: The InMemConsensusStore implementation in crates/amaru-stores/src/rocksdb/consensus.rs will be fleshed out incrementally on a by-need basis, driven by test requirements rather than implementing all functionality upfront.
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru-protocols/src/chainsync/responder.rs:191-213
Timestamp: 2026-01-11T20:05:26.594Z
Learning: In crates/amaru-protocols/src/chainsync/responder.rs, the chainsync responder intentionally does not support serving headers when the tip is Origin. Amaru is not designed to cold-start a new Cardano blockchain, so the intersect() function correctly fails when tip is Origin without needing special handling. This is a conscious design decision.
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.
Learnt from: jeluard
Repo: pragma-org/amaru PR: 387
File: crates/amaru-stores/src/lib.rs:40-40
Timestamp: 2025-08-20T13:02:25.763Z
Learning: In the amaru-stores crate, amaru_slot_arithmetic types like Epoch and EraHistory are used throughout the main crate code in modules like in_memory/mod.rs, rocksdb/consensus.rs, and rocksdb/ledger/columns/, not just in tests. This means amaru-slot-arithmetic should be a regular dependency, not a dev-dependency.
Learnt from: abailly
Repo: pragma-org/amaru PR: 195
File: simulation/amaru-sim/src/simulator/mod.rs:167-182
Timestamp: 2025-04-22T09:18:19.893Z
Learning: In the Amaru consensus pipeline refactor, ValidateHeader::handle_roll_forward returns a Result<PullEvent, ConsensusError>, not ValidateHeaderEvent as might be expected from the older code structure.
📚 Learning: 2025-06-14T16:31:53.134Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 263
File: simulation/amaru-sim/src/simulator/simulate.rs:298-300
Timestamp: 2025-06-14T16:31:53.134Z
Learning: StageRef in the pure-stage crate supports serde serialization and deserialization (derives serde::Serialize and serde::Deserialize), enabling it to be used in structs that also derive these traits for TraceBuffer and replay functionality.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-08T14:34:06.105Z
Learnt from: KtorZ
Repo: pragma-org/amaru PR: 370
File: crates/minicbor-extra/src/lib.rs:50-55
Timestamp: 2025-08-08T14:34:06.105Z
Learning: Project uses Rust 1.88 stable; using Result::is_err_and is acceptable and should not be flagged as unstable. In particular, crates/minicbor-extra/src/lib.rs relies on is_err_and when checking end_of_input; future reviews should not suggest replacing it for stability reasons.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
📚 Learning: 2025-01-21T15:32:17.911Z
Learnt from: jeluard
Repo: pragma-org/amaru PR: 69
File: crates/amaru/src/ledger/state/diff_epoch_reg.rs:112-117
Timestamp: 2025-01-21T15:32:17.911Z
Learning: When suggesting code changes in Rust, always verify that the types align correctly, especially when dealing with references and Options. The `Fold::Registered` variant in `diff_epoch_reg.rs` expects a reference `&'a V`, so unwrapping an `Option<&V>` requires only a single `.expect()`.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
📚 Learning: 2026-01-11T20:38:12.301Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/pure-stage/src/effect.rs:156-162
Timestamp: 2026-01-11T20:38:12.301Z
Learning: For pure-stage Effects::call in crates/pure-stage/src/effect.rs, rkuhn prefers to keep the runtime panic that prevents nested calls for now and only lift/relax this constraint later if/when it becomes necessary.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-12T12:28:24.027Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 372
File: simulation/amaru-sim/src/simulator/mod.rs:410-412
Timestamp: 2025-08-12T12:28:24.027Z
Learning: In the Amaru project, panic statements are acceptable in simulation/test code (like amaru-sim crate) as they help identify configuration issues quickly during development, rather than needing proper error handling like production code.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-12-28T19:26:35.354Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru-protocols/src/blockfetch/mod.rs:151-173
Timestamp: 2025-12-28T19:26:35.354Z
Learning: In crates/amaru-protocols/src/blockfetch/mod.rs, the blockfetch initiator uses .expect() when popping from the request queue on NoBlocks and Done results. These are intentional fail-fast assertions: the protocol state machine guarantees the queue is non-empty when these messages arrive, so an empty queue indicates a protocol violation. A misbehaving peer triggers an erroneous protocol transition that will close the connection (supervision to be implemented in a future PR). This follows the project's fail-fast philosophy for protocol invariants.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
📚 Learning: 2025-04-20T18:02:25.073Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward.rs:73-75
Timestamp: 2025-04-20T18:02:25.073Z
Learning: In the current development stage, rkuhn prefers using explicit panics (via `.expect()` or `.unwrap()`) for fatal errors in the application code that would tear down the node, rather than propagating errors with `Result`. The intention is to eventually transition to proper error handling with `Result` as the codebase matures.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
📚 Learning: 2025-04-20T17:57:23.233Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 149
File: crates/amaru/src/stages/consensus/chain_forward/test_infra.rs:272-285
Timestamp: 2025-04-20T17:57:23.233Z
Learning: In test infrastructure code, rkuhn prefers explicit panics (using .unwrap() or similar) over returning Result types, as test failures should be immediate and obvious.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
📚 Learning: 2025-09-05T17:30:55.869Z
Learnt from: etorreborre
Repo: pragma-org/amaru PR: 432
File: crates/amaru/src/stages/consensus/clients_block_fetcher.rs:0-0
Timestamp: 2025-09-05T17:30:55.869Z
Learning: In crates/amaru/src/stages/consensus/clients_block_fetcher.rs, the fetch method currently holds a mutex lock across an await operation and lacks timeout handling. The author etorreborre has acknowledged this should be improved later by releasing the lock before awaiting and adding a bounded timeout around the network fetch operation.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
📚 Learning: 2025-05-05T08:15:24.192Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 206
File: crates/pure-stage/src/simulation/state.rs:4-8
Timestamp: 2025-05-05T08:15:24.192Z
Learning: The `State` trait in the pure-stage crate already requires `Send` with its definition: `pub trait State: Any + fmt::Debug + Send + 'static`, making additional `+ Send` bounds redundant when using `Box<dyn State>`.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
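To illustrate why the extra bound is redundant, here is a sketch that copies the trait definition quoted above and moves a `Box<dyn State>` to another thread without writing `+ Send`. The `Counter` type and `debug_on_other_thread` helper are hypothetical.

```rust
use std::any::Any;
use std::fmt;

// Same supertrait list as quoted in the learning.
trait State: Any + fmt::Debug + Send + 'static {}

/// Hypothetical state type for the illustration.
#[derive(Debug)]
struct Counter(u32);
impl State for Counter {}

/// Moves the boxed state into a new thread and formats it there.
/// This compiles only because `dyn State` is already `Send` via its supertrait.
fn debug_on_other_thread(state: Box<dyn State>) -> String {
    std::thread::spawn(move || format!("{state:?}"))
        .join()
        .expect("thread panicked")
}
```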
📚 Learning: 2025-12-16T21:32:37.668Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/amaru-network/src/handshake/tests.rs:40-47
Timestamp: 2025-12-16T21:32:37.668Z
Learning: In Rust, shadowing a binding with a new let does not drop the previous binding until the end of the scope. All shadowed bindings in a scope are dropped in reverse-declaration order when the scope ends. Therefore, multiple let _guard = register_*() calls will keep all guards alive until the end of the function (or the surrounding scope). When reviewing code, be mindful that resources tied to shadowed bindings persist longer than the most recent binding; to release early, constrain the lifetime in an inner block or explicitly drop guards when appropriate.

Applied to files:

  • crates/pure-stage/src/stagegraph.rs
  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-12-16T21:50:46.690Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 584
File: crates/pure-stage/src/adapter.rs:67-100
Timestamp: 2025-12-16T21:50:46.690Z
Learning: In the pure-stage crate's adapter system (crates/pure-stage/src/adapter.rs), adapters cannot form cycles because an existing adapter cannot be repointed after creation. The Adapter's target field is immutable, preventing the formation of loops in the adapter chain.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-05-09T13:09:47.915Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 206
File: crates/pure-stage/src/simulation/running.rs:240-242
Timestamp: 2025-05-09T13:09:47.915Z
Learning: Cloning messages in the pure-stage crate should be avoided for performance reasons. The current implementation in SimulationRunning deliberately avoids duplicating message data structures.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-06-14T16:41:13.061Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 263
File: crates/pure-stage/src/simulation/running.rs:868-875
Timestamp: 2025-06-14T16:41:13.061Z
Learning: In the pure-stage simulation framework, the effect air-lock protocol is designed so that when a stage is polled, the stage implementation consumes/takes the value from the effect lock during polling. There's no need to manually clear the effect lock after Poll::Ready because "the other side will have taken the value out" - this is by design, not a bug.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-20T20:19:07.396Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/tests/simulation.rs:20-28
Timestamp: 2025-08-20T20:19:07.396Z
Learning: Waker::noop() was stabilized in Rust 1.85.0 (released February 20, 2025) and is available in std::task::Waker, so no external dependencies like futures-task are needed for creating no-op wakers in tests.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-20T20:19:07.396Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/tests/simulation.rs:20-28
Timestamp: 2025-08-20T20:19:07.396Z
Learning: Waker::noop() was stabilized in Rust 1.85.0 (released February 2025) and is available in std::task::Waker, so no external dependencies like futures-task are needed for creating no-op wakers in tests.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-08-20T20:18:50.214Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/tests/simulation.rs:459-469
Timestamp: 2025-08-20T20:18:50.214Z
Learning: Rust 1.85 stabilized the Waker::noop() API, making it the preferred way to create a no-op waker instead of using futures_task::noop_waker_ref(). Code using Waker::noop() in modern Rust codebases is correct and doesn't need to be changed to use the futures_task alternative.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
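A minimal sketch of polling a future once with `Waker::noop()`, assuming Rust 1.85 or later as stated in the learnings above. The `poll_once` helper is hypothetical and is not part of pure-stage.

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, Waker};

/// Polls the given future exactly once with a waker that does nothing.
fn poll_once<F: Future>(fut: F) -> Poll<F::Output> {
    // Pin the future on the stack so it can be polled.
    let mut fut = pin!(fut);
    // `Waker::noop()` returns a `&'static Waker`, so no external crate is needed.
    let mut cx = Context::from_waker(Waker::noop());
    fut.as_mut().poll(&mut cx)
}
```

An already-completed future returns `Poll::Ready` on the first poll, while a future that never completes stays `Poll::Pending`.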
📚 Learning: 2025-08-20T19:37:32.510Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 384
File: crates/pure-stage/src/effect.rs:204-223
Timestamp: 2025-08-20T19:37:32.510Z
Learning: In the pure-stage framework, the terminate() method uses never() which panics if called. This is intentional design: if terminate() ever returns, it indicates a serious framework bug that should immediately panic rather than allowing potentially corrupted execution to continue. The panic serves as a failsafe to surface framework issues.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-05-12T14:21:27.470Z
Learnt from: stevana
Repo: pragma-org/amaru PR: 210
File: simulation/amaru-sim/src/simulator/simulate.rs:264-277
Timestamp: 2025-05-12T14:21:27.470Z
Learning: The team plans to replace the out-of-process test in `simulation/amaru-sim/src/simulator/simulate.rs` with an in-process NodeHandle implementation in the future, eliminating the need for hard-coded binary paths (`../../target/debug/echo`) and making tests more reliable.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
📚 Learning: 2025-12-28T19:39:16.476Z
Learnt from: rkuhn
Repo: pragma-org/amaru PR: 612
File: crates/amaru/src/stages/mod.rs:233-239
Timestamp: 2025-12-28T19:39:16.476Z
Learning: In crates/amaru/src/stages/mod.rs, the network.preload() operation for manager stage has a mailbox capacity of 10 messages. When preloading peers at startup, breaking on the first failure (after potentially filling the mailbox) is intentional design—the mailbox cannot hold more than 10 messages, so continuing to attempt preloads after a failure would be futile and would break the application.

Applied to files:

  • crates/pure-stage/src/simulation/running/mod.rs
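The break-on-first-failure reasoning can be sketched with a bounded standard-library channel standing in for the stage mailbox. This is an assumption for illustration only: the real code uses the pure-stage mailbox, and `preload` here is a hypothetical helper.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Tries to enqueue every peer into a bounded mailbox and stops at the first
/// failure. Returns how many peers were accepted.
fn preload(peers: &[&str], capacity: usize) -> usize {
    // `_rx` is kept alive so the channel stays connected for the whole loop.
    let (tx, _rx) = sync_channel::<String>(capacity);
    let mut accepted = 0;
    for peer in peers {
        match tx.try_send(peer.to_string()) {
            Ok(()) => accepted += 1,
            // Nothing drains the mailbox during preload, so once it is full
            // every later attempt would fail as well.
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}
```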
🧬 Code graph analysis (2)
crates/pure-stage/src/stagegraph.rs (1)
crates/pure-stage/src/simulation/simulation_builder.rs (1)
  • stage (268-320)
crates/pure-stage/src/simulation/running/mod.rs (1)
crates/pure-stage/src/simulation/running/scheduled_runnables.rs (2)
  • new (30-34)
  • wakeup (49-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: End-to-end snapshot tests (preview, 2, 680, true, 15, 60, 10.5.3)
  • GitHub Check: End-to-end snapshot tests (preprod, 1, 182, 15, 60, 10.5.3)
  • GitHub Check: Build x86_64/linux
  • GitHub Check: Build riscv32
  • GitHub Check: Benches
  • GitHub Check: Test coverage
  • GitHub Check: Test the CLI
  • GitHub Check: Build wasm32
  • GitHub Check: Build x86_64/windows
  • GitHub Check: Build aarch64/macos
  • GitHub Check: Build aarch64/linux
  • GitHub Check: clippy
  • GitHub Check: Analyze (rust)
🔇 Additional comments (9)
crates/pure-stage/src/stagegraph.rs (2)

15-23: Imports look sweet—Any is wired in cleanly.
No dramas here; the additions read tidy and purposeful.


25-48: ScheduleId’s ordering + Display are consistent.
Time-first tuple plus “id … at …” formatting lines up nicely—feels like the Fellowship marching in the right order.
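As a rough illustration of why a time-first layout gives the expected ordering, here is a sketch with a hypothetical `ScheduleId` that uses plain `u64` fields. The real type in `stagegraph.rs` may differ in field types and exact formatting.

```rust
use std::fmt;

/// Hypothetical stand-in: `time` is declared first, so the derived `Ord`
/// compares by time and only falls back to `id` on ties.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct ScheduleId {
    time: u64,
    id: u64,
}

impl fmt::Display for ScheduleId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Mirrors the "id … at …" shape mentioned in the comment above.
        write!(f, "id {} at {}", self.id, self.time)
    }
}
```

With this layout an earlier time always sorts first, regardless of how large its `id` is.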

crates/pure-stage/src/simulation/running/mod.rs (7)

168-193: Sweet override hook — clean chain-of-responsibility.
Type-gated transform + finite-use removal reads tidy and predictable, like a slick Zelda item swap.


205-239: Scheduling loop looks solid.
Clock advances once per wakeup batch and trace updates are consistent — nice and clean.


403-409: Good guard on terminated-stage external delivery.
The warn + early-return keeps the sim steady instead of hard-crashing — nice save.


504-523: Optimistic receive wakeups feel right.
Keeps things snappy without forcing a full run loop — like a quick‑save in Bioshock.


558-592: Receive resume error path is robust.
Mapping unsupervised termination to Blocked::Terminated is the right fail‑fast move.


1511-1548: Polling split with schedule_ids + now is clean.
Feels deterministic and easier to reason about — nice touch.


701-718: No action needed here, mate. The ScheduleIds::next_at call is completely safe as-is.

Here's why: Rust's Duration type is always non-negative by design, so now + duration can never produce a time in the past—it's mathematically impossible, like trying to go backwards in The Dark Knight. Even with duration == 0, you're scheduling at the present moment, not some retroactive nightmare. The code has no validation logic anywhere in the chain (ScheduleIds::next_at → ScheduleId::new → ScheduledRunnables::schedule), but it doesn't need it. The type system's got your back.
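A quick sketch of that argument using `std::time::Instant` as a stand-in for the simulation clock, which is an assumption since the simulation uses its own clock type. The `deadline` helper is hypothetical.

```rust
use std::time::{Duration, Instant};

/// Computes the wakeup time for a delay relative to `now`.
fn deadline(now: Instant, delay: Duration) -> Instant {
    // `Duration` cannot be negative, so the result is never earlier than `now`.
    now + delay
}
```

A zero delay schedules at `now` itself, and any positive delay lands strictly later.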

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +720 to +741
            Effect::Schedule { at_stage, msg, id } => {
                let data = self
                    .stages
                    .get_mut(&at_stage)
                    .log_termination(&at_stage)?
                    .assert_stage("which cannot schedule");
                resume_schedule_internal(data, run, id)
                    .expect("schedule effect is always runnable");
                // Now schedule the wakeup (after run is dropped)
                let now = self.clock.now();
                if id.time() > now {
                    // Schedule wakeup
                    self.schedule_wakeup(id, {
                        move |sim| {
                            let _ =
                                deliver_message(&mut sim.stages, sim.mailbox_size, at_stage, msg);
                        }
                    });
                } else {
                    // Send immediately
                    let _ = deliver_message(&mut self.stages, self.mailbox_size, at_stage, msg);
                }

⚠️ Potential issue | 🟡 Minor

Scheduled delivery silently drops on Full/NotFound.
Right now failures are ignored, which can hide lost timer messages. I’d at least log drops so they’re visible in traces.

🎬 Suggested logging on delivery failures
-                if id.time() > now {
-                    // Schedule wakeup
-                    self.schedule_wakeup(id, {
-                        move |sim| {
-                            let _ =
-                                deliver_message(&mut sim.stages, sim.mailbox_size, at_stage, msg);
-                        }
-                    });
-                } else {
-                    // Send immediately
-                    let _ = deliver_message(&mut self.stages, self.mailbox_size, at_stage, msg);
-                }
+                if id.time() > now {
+                    // Schedule wakeup
+                    self.schedule_wakeup(id, move |sim| {
+                        match deliver_message(&mut sim.stages, sim.mailbox_size, at_stage, msg) {
+                            DeliverMessageResult::Delivered(_) => {}
+                            DeliverMessageResult::Full(_, _) => {
+                                tracing::warn!(stage = %at_stage, "scheduled delivery dropped: mailbox full");
+                            }
+                            DeliverMessageResult::NotFound => {
+                                tracing::warn!(stage = %at_stage, "scheduled delivery dropped: stage terminated");
+                            }
+                        }
+                    });
+                } else {
+                    // Send immediately
+                    match deliver_message(&mut self.stages, self.mailbox_size, at_stage, msg) {
+                        DeliverMessageResult::Delivered(_) => {}
+                        DeliverMessageResult::Full(_, _) => {
+                            tracing::warn!(stage = %at_stage, "scheduled delivery dropped: mailbox full");
+                        }
+                        DeliverMessageResult::NotFound => {
+                            tracing::warn!(stage = %at_stage, "scheduled delivery dropped: stage terminated");
+                        }
+                    }
+                }
🤖 Prompt for AI Agents
In `@crates/pure-stage/src/simulation/running/mod.rs` around lines 720 - 741, in
the Effect::Schedule handling, delivery failures from deliver_message are
currently ignored; update both the scheduled closure passed to schedule_wakeup
and the immediate-send branch to match on the DeliverMessageResult returned by
deliver_message and emit a log/tracing event for every outcome other than
Delivered (Full or NotFound), including the timer id, at_stage, and the msg
metadata or type. Locate the deliver_message calls in this block (the closure
inside schedule_wakeup and the immediate call using &mut self.stages,
self.mailbox_size, at_stage, msg) and replace the throwaway let _ = ... with a
match that logs the failure with context instead of silently dropping it.

