network: Investigate sudden sync peer count drops #5236

Open
lexnv opened this issue Aug 5, 2024 · 1 comment
lexnv commented Aug 5, 2024

Investigate sudden sync peer count drops.
[Screenshot 2024-08-05 at 12:19:43: sync peer count over time]

Over the past few days, the libp2p node (yellow) has been more susceptible to peer count drops than the litep2p node (green).

This may be a side effect of the fact that litep2p submits far more Kademlia random queries than libp2p (1.4k vs 8) to keep a healthy view of the network (see the sketch after the screenshot below):
[Screenshot 2024-08-05 at 11:42:44: Kademlia random query counts, litep2p vs libp2p]
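For context, a rough sketch (illustrative only, not the actual Substrate or litep2p discovery code) of what a Kademlia "random walk" query looks like with rust-libp2p. Import paths and type names differ across libp2p versions (`kad::Kademlia` vs `kad::Behaviour`); the handle and function names here are assumptions:

```rust
// Illustrative sketch of a Kademlia random walk with rust-libp2p.
// Names and import paths are assumptions and vary between libp2p versions.
use libp2p::kad::{store::MemoryStore, Behaviour as Kademlia};
use libp2p::PeerId;

fn start_random_kad_query(kademlia: &mut Kademlia<MemoryStore>) {
    // Querying for the peers closest to a freshly generated random PeerId
    // walks towards an arbitrary point in the keyspace, refreshing the
    // routing-table buckets with live, reachable peers.
    let random_target = PeerId::random();
    let _query_id = kademlia.get_closest_peers(random_target);
}
```

The more often such queries run, the fresher the routing table stays, which in turn affects how quickly a node can (re)fill its sync peer slots.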

Libp2p logs:

2024-08-03 03:06:51.000 ERROR tokio-runtime-worker beefy: 🥩 Error: ConsensusReset. Restarting voter.    
2024-08-03 03:06:51.033  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/56904/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:52.800  INFO tokio-runtime-worker substrate: 💤 Idle (51 peers), best: #24317220 (0x7649…c2c7), finalized #24317216 (0xb9e7…97f4), ⬇ 2.6MiB/s ⬆ 937.8kiB/s    
2024-08-03 03:06:53.670  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/51336/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:53.816  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/53266/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:54.395  INFO tokio-runtime-worker substrate: 🏆 Imported #24317221 (0x7649…c2c7 → 0x1792…fe76)    
2024-08-03 03:06:56.737  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip6/2001:bc8:701:700:3eec:efff:feff:183c/tcp/57534/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:57.456  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/43108/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:57.800  INFO tokio-runtime-worker substrate: 💤 Idle (51 peers), best: #24317221 (0x1792…fe76), finalized #24317217 (0xb2dd…2d65), ⬇ 3.8MiB/s ⬆ 1.1MiB/s    
2024-08-03 03:06:59.917  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/35934/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:07:00.486  INFO tokio-runtime-worker substrate: 🏆 Imported #24317222 (0x1792…fe76 → 0x9626…cfa5)    
2024-08-03 03:07:01.962  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip6/2001:bc8:701:700:3eec:efff:feff:183c/tcp/57784/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:07:02.136  INFO tokio-runtime-worker beefy: 🥩 BEEFY gadget waiting for BEEFY pallet to become available...    
2024-08-03 03:07:02.137  INFO tokio-runtime-worker beefy: 🥩 BEEFY pallet available: block 24316220 beefy genesis 21943872    
2024-08-03 03:07:02.798  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/39836/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:07:02.807  INFO tokio-runtime-worker substrate: 💤 Idle (27 peers), best: #24317222 (0x9626…cfa5), finalized #24317219 (0x20bc…fadc), ⬇ 4.6MiB/s ⬆ 3.0MiB/s    

No banned peers were reported during the count drops.

There has been at least one instance where I've seen a large gap (around 2 minutes, IIRC) in the logs produced by libp2p.
This could be caused by some subsystem performing a blocking task inside a non-blocking async function (see the sketch below).
Prometheus scrapes metrics at 15-second intervals by default, so we may drop all connected peers and then slowly reconnect to the network within a single scrape interval, which would show up as a sudden dip in the graph.
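If that hypothesis holds, the failure mode would look roughly like the following (hypothetical sketch, not taken from the codebase):

```rust
// Hypothetical illustration of how blocking work inside an async task can
// stall every future on the same tokio worker thread, which would explain a
// multi-minute gap in log output and dropped keep-alives.
use std::time::Duration;

async fn bad_subsystem_tick() {
    // BAD: a synchronous sleep (or any CPU/IO-heavy blocking call) holds the
    // executor thread; timers, substream keep-alives and log lines scheduled
    // on it stop making progress until the call returns.
    std::thread::sleep(Duration::from_secs(120));
}

async fn good_subsystem_tick() {
    // Better: move the blocking work onto tokio's dedicated blocking pool so
    // the async worker threads keep driving the networking futures.
    tokio::task::spawn_blocking(|| {
        std::thread::sleep(Duration::from_secs(120));
    })
    .await
    .expect("blocking task panicked");
}
```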

@lexnv lexnv added this to Networking Aug 5, 2024

lexnv commented Aug 5, 2024

Another data point is that beefy reaches around 3 MiB of usage on its channels. The only error happening in the vicinity of one incident is the beefy consensus restart (should be revalidated with #5197 applied); however, I don't believe offhand that this is beefy-related.
