network: Investigate sudden sync peer count drops #5236

Open
lexnv opened this issue Aug 5, 2024 · 1 comment
lexnv commented Aug 5, 2024

Investigate sudden sync peer count drops.
[Screenshot 2024-08-05 at 12:19:43: sync peer count over time]

Over the past few days, the libp2p node (yellow) has been more susceptible to peer count drops than the litep2p node (green).

This may be a side effect of the fact that litep2p submits far more Kademlia random queries than libp2p (1.4k vs 8) to keep a healthy view of the network (see the sketch after the screenshot below):
[Screenshot 2024-08-05 at 11:42:44: Kademlia random query counts, litep2p vs libp2p]
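For context, a rough sketch (illustrative only, not the actual Substrate or litep2p discovery code) of what a Kademlia "random walk" query looks like with rust-libp2p. Import paths and type names differ across libp2p versions (`kad::Kademlia` vs `kad::Behaviour`); the handle and function names here are assumptions:

```rust
// Illustrative sketch of a Kademlia random walk with rust-libp2p.
// Names and import paths are assumptions and vary between libp2p versions.
use libp2p::kad::{store::MemoryStore, Behaviour as Kademlia};
use libp2p::PeerId;

fn start_random_kad_query(kademlia: &mut Kademlia<MemoryStore>) {
    // Querying for the peers closest to a freshly generated random PeerId
    // walks towards an arbitrary point in the keyspace, refreshing the
    // routing-table buckets with live, reachable peers.
    let random_target = PeerId::random();
    let _query_id = kademlia.get_closest_peers(random_target);
}
```

The more often such queries run, the fresher the routing table stays, which in turn affects how quickly a node can (re)fill its sync peer slots.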

Libp2p logs:

2024-08-03 03:06:51.000 ERROR tokio-runtime-worker beefy: 🥩 Error: ConsensusReset. Restarting voter.    
2024-08-03 03:06:51.033  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/56904/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:52.800  INFO tokio-runtime-worker substrate: 💤 Idle (51 peers), best: #24317220 (0x7649…c2c7), finalized #24317216 (0xb9e7…97f4), ⬇ 2.6MiB/s ⬆ 937.8kiB/s    
2024-08-03 03:06:53.670  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/51336/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:53.816  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/53266/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:54.395  INFO tokio-runtime-worker substrate: 🏆 Imported #24317221 (0x7649…c2c7 → 0x1792…fe76)    
2024-08-03 03:06:56.737  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip6/2001:bc8:701:700:3eec:efff:feff:183c/tcp/57534/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:57.456  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/43108/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:57.800  INFO tokio-runtime-worker substrate: 💤 Idle (51 peers), best: #24317221 (0x1792…fe76), finalized #24317217 (0xb2dd…2d65), ⬇ 3.8MiB/s ⬆ 1.1MiB/s    
2024-08-03 03:06:59.917  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/35934/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:07:00.486  INFO tokio-runtime-worker substrate: 🏆 Imported #24317222 (0x1792…fe76 → 0x9626…cfa5)    
2024-08-03 03:07:01.962  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip6/2001:bc8:701:700:3eec:efff:feff:183c/tcp/57784/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:07:02.136  INFO tokio-runtime-worker beefy: 🥩 BEEFY gadget waiting for BEEFY pallet to become available...    
2024-08-03 03:07:02.137  INFO tokio-runtime-worker beefy: 🥩 BEEFY pallet available: block 24316220 beefy genesis 21943872    
2024-08-03 03:07:02.798  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/39836/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:07:02.807  INFO tokio-runtime-worker substrate: 💤 Idle (27 peers), best: #24317222 (0x9626…cfa5), finalized #24317219 (0x20bc…fadc), ⬇ 4.6MiB/s ⬆ 3.0MiB/s    

No banned peers were reported during the count drops.

There has been at least one instance where I've seen a large gap (around 2 minutes, IIRC) in the logs produced by libp2p.
This could be caused by some subsystem performing a blocking task inside a non-blocking async function (see the sketch below).
Prometheus scrapes metrics at 15-second intervals by default, so we may drop all connected peers and then slowly reconnect to the network within a single scrape interval, which would show up as a sudden dip in the graph.
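If that hypothesis holds, the failure mode would look roughly like the following (hypothetical sketch, not taken from the codebase):

```rust
// Hypothetical illustration of how blocking work inside an async task can
// stall every future on the same tokio worker thread, which would explain a
// multi-minute gap in log output and dropped keep-alives.
use std::time::Duration;

async fn bad_subsystem_tick() {
    // BAD: a synchronous sleep (or any CPU/IO-heavy blocking call) holds the
    // executor thread; timers, substream keep-alives and log lines scheduled
    // on it stop making progress until the call returns.
    std::thread::sleep(Duration::from_secs(120));
}

async fn good_subsystem_tick() {
    // Better: move the blocking work onto tokio's dedicated blocking pool so
    // the async worker threads keep driving the networking futures.
    tokio::task::spawn_blocking(|| {
        std::thread::sleep(Duration::from_secs(120));
    })
    .await
    .expect("blocking task panicked");
}
```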

@lexnv lexnv added this to Networking Aug 5, 2024

lexnv commented Aug 5, 2024

Another data point is that beefy reaches around 3 MiB of usage on its channels. The only error happening in the vicinity of one incident is the beefy consensus restart (should be revalidated with #5197 applied); however, I don't believe offhand that this is beefy-related.
