Description
Is there an existing issue?
- I have searched the existing issues
Experiencing problems? Have you tried our Stack Exchange first?
- This is not a support question.
Motivation
Right now SyncingStrategy is created in the constructor of SyncingEngine and the whole thing assumes there are just 3 syncing strategies that transition like this:
polkadot-sdk/substrate/client/network/sync/src/strategy.rs
Lines 520 to 522 in 6619277
This is a problem for chains that don't have or use Warp sync, that have custom sync strategies, or both (Subspace/Autonomys already has two custom sync strategies).
The assumption that certain strategies exist makes it impossible to implement custom ones without forking Substrate and injecting custom logic in strategic places, which is fragile and hard to maintain long-term. It also frequently results in unexpected side effects like #4607, #4407 and more.
Specific examples of strategies we have already implemented with hacks in Subspace/Autonomys:
- Sync from DSN (Distributed Storage Network): we download and decode most of the blocks from archival history and import them sequentially before switching back to the regular chain sync strategy
- Snap sync: we download the first block of the last segment of archived history from DSN, download and verify the corresponding state from one of the nodes, and import the block with that state before switching back to regular chain sync (similar to Warp sync, except we do not intend to ever sync the gap afterwards)
Request
Make it possible to define custom sync strategies and their transition sequence without forking.
Solution
Probably the first step would be to extract a trait out of the existing SyncingStrategy, then extract the different strategies out of SyncingStrategy and make them composable, and finally expose a way to configure a custom syncing strategy instead of the one used by default.
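To make the idea more concrete, here is a minimal sketch of what a trait-based, composable design could look like. Everything in it is hypothetical and for illustration only: the SyncStrategy trait, the Progress enum, the Sequence combinator and the two toy strategies are not existing Substrate APIs, and a plain u64 stands in for real chain state.

```rust
use std::collections::VecDeque;

/// Hypothetical result of driving a strategy one step.
#[derive(Debug, PartialEq)]
enum Progress {
    /// The strategy still has work to do.
    Pending,
    /// The strategy is done; the next one in the sequence may take over.
    Finished,
}

/// Hypothetical trait extracted from the concrete strategy type.
/// `best_block` is a stand-in for real chain state.
trait SyncStrategy {
    fn name(&self) -> &'static str;
    fn poll(&mut self, best_block: &mut u64) -> Progress;
}

/// Runs strategies one after another, so the transition sequence is
/// defined by the caller rather than hard-coded.
struct Sequence {
    strategies: VecDeque<Box<dyn SyncStrategy>>,
}

impl Sequence {
    fn new() -> Self {
        Sequence { strategies: VecDeque::new() }
    }

    /// Appends a strategy that runs after all previously added ones.
    fn then(mut self, strategy: impl SyncStrategy + 'static) -> Self {
        self.strategies.push_back(Box::new(strategy));
        self
    }
}

/// A sequence is itself a strategy, which makes sequences nestable.
impl SyncStrategy for Sequence {
    fn name(&self) -> &'static str {
        self.strategies.front().map(|s| s.name()).unwrap_or("idle")
    }

    fn poll(&mut self, best_block: &mut u64) -> Progress {
        // Drive only the strategy at the front of the queue.
        let finished = match self.strategies.front_mut() {
            Some(current) => current.poll(best_block) == Progress::Finished,
            None => return Progress::Finished,
        };
        if finished {
            self.strategies.pop_front();
        }
        if self.strategies.is_empty() { Progress::Finished } else { Progress::Pending }
    }
}

/// Toy strategy: jumps straight to a target block (think snap sync).
struct JumpTo { target: u64 }

impl SyncStrategy for JumpTo {
    fn name(&self) -> &'static str { "jump" }
    fn poll(&mut self, best_block: &mut u64) -> Progress {
        *best_block = (*best_block).max(self.target);
        Progress::Finished
    }
}

/// Toy strategy: imports one block per poll until the tip is reached.
struct ImportUntil { tip: u64 }

impl SyncStrategy for ImportUntil {
    fn name(&self) -> &'static str { "import" }
    fn poll(&mut self, best_block: &mut u64) -> Progress {
        if *best_block < self.tip {
            *best_block += 1;
        }
        if *best_block >= self.tip { Progress::Finished } else { Progress::Pending }
    }
}

/// Polls a strategy until it finishes and returns the number of polls.
fn run_to_completion(strategy: &mut dyn SyncStrategy, best_block: &mut u64) -> usize {
    let mut polls = 1;
    while strategy.poll(best_block) == Progress::Pending {
        polls += 1;
    }
    polls
}
```

With something along these lines, a chain could build its own sequence, for example Sequence::new().then(JumpTo { target: 100 }).then(ImportUntil { tip: 103 }), and hand it to the engine in place of the default. A real trait would of course need to deal with peers, requests and responses rather than a single counter.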
Not sure what to do with build_network: it contains a significant amount of boilerplate, and while it could be copy-pasted and modified in a chain-specific codebase, it feels like there should be a better solution.
Open to suggestions on how to approach this so we can make progress and reduce the diff with upstream in our fork.
Examples of hacks that we had to introduce downstream to implement custom sync protocols:
- Temporarily pausing Substrate's sync so it doesn't attempt to catch up while we make progress towards the same objective with our custom sync protocol, avoiding unnecessary network usage, errors and eventual banning due to "Node can send multiple same block requests while syncing from other nodes" (#531) / "ChainSync requesting the same block multiple times" (#1915): autonomys@4670a2c
- API for sync strategy restart: autonomys@9897ffa
- API for force-clearing block gap: autonomys@d537512
- Expose state sync internals in order to be able to make the same requests ourselves: autonomys@de6a7e3
These are all fragile workarounds that we'd really like to get rid of sooner rather than later.
Are you willing to help with this request?
Yes!