Conversation


@turuslan turuslan commented Jul 17, 2025

Description

Currently state sync accumulates all key-values in memory and only writes them to disk at the end.
This increases memory usage and can cause an out-of-memory crash.
Also, re-encoding key-values into trie nodes may be inconsistent (e.g. during the state v0 to v1 migration).
State sync responses in no_proof=false mode contain the original encoded trie nodes,
which can be recursively verified against the block header's state root hash and written to the database.
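
As a rough illustration of the recursive verification (a hedged sketch only, not the sc-network-sync or sp-trie code; u64 hashes from std's DefaultHasher and a made-up node format where a node is just the concatenation of its children's 8-byte hashes stand in for blake2-256 and the real trie node codec):

// Hedged sketch, not the actual implementation: check that every node in a
// response is reachable from the block's state root and matches its own hash
// before it is written to the database.
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, HashSet};
use std::hash::Hasher;

type Hash = u64;

// Stand-in for the real trie hasher (blake2-256 in Substrate).
fn hash(node: &[u8]) -> Hash {
    let mut h = DefaultHasher::new();
    h.write(node);
    h.finish()
}

// Stand-in for decoding a trie node: here a node is just the concatenation of
// its children's 8-byte hashes.
fn child_hashes(node: &[u8]) -> Vec<Hash> {
    node.chunks_exact(8)
        .map(|c| Hash::from_le_bytes(c.try_into().unwrap()))
        .collect()
}

// Returns the subset of `received` that is reachable from `root` and whose
// bytes hash to their claimed key; only these nodes would be written to the db.
fn verified_nodes(root: Hash, received: &HashMap<Hash, Vec<u8>>) -> HashSet<Hash> {
    let mut verified = HashSet::new();
    let mut stack = vec![root];
    while let Some(h) = stack.pop() {
        // Children outside the response are fine: a range proof only covers
        // part of the trie, and they will arrive in later responses.
        let Some(node) = received.get(&h) else { continue };
        if hash(node) != h || !verified.insert(h) {
            continue;
        }
        stack.extend(child_hashes(node));
    }
    verified
}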

Related issues

Integration

This PR changes implementation of StateSync component of sc-network-sync crate.
Some related interfaces are also changed and propagated, to allow StateSync to import partial state into the STATE database column and mark the state as available:

  • sc-client-api: Backend.import_partial_state, BlockImportOperation.commit_complete_partial_state
  • sc-consensus: BlockImport.import_partial_state, ImportedState
  • sc-network: SyncingAction::ImportPartialState, ImportResult
  • sp-trie: fix decode_compact
  • trait usage: cumulus-client-consensus-aura, cumulus-client-consensus-common, sc-consensus-babe, sc-consensus-beefy, sc-consensus-grandpa, sc-consensus-pow, sc-client-db, sc-service

Review Notes

  • no_proof=false responses contain the original encoded trie nodes.
  • Write received nodes to the database using import_partial_state.
  • Mark the state as available once the database contains all trie nodes, using commit_complete_partial_state.
  • Fix decoding of a compact trie proof into a prefixed db.
  • Fix the in-memory backend to store all nodes in one map/dict, like the db backend does.

Notes

  • todo: Code comments and documentation?
  • need-help: Not sure about naming types/functions/...
  • need-help: Should I reuse existing crate/trait ProofProviderHashDb, or where to add it?
  • assume: State sync import can't abort sync due to an invalid state root or a database write error.

Checklist

  • My PR includes a detailed description as outlined in the "Description" and its two subsections above.
  • My PR follows the labeling requirements of this project (at minimum one label for T required)
  • I have made corresponding changes to the documentation (if applicable)
  • I have added tests that prove my fix is effective or that my feature works (if applicable)
    • I tested by state syncing one local network node from another.
      • I tested trie with state v1 hashed values.
      • I tested trie with :child_storage:default:.
      • I tested with responses containing one new key-value, syncing one key at a time to validate transitions.
    • I tested by state syncing Astar (astar-collator) with low memory usage

Bot Commands

/cmd label T0-node

@bkchr
Member

bkchr commented Jul 18, 2025

Hey, thank you for the pull request. If you want to work on this, please check out my comment and the work for that was already started here.

@turuslan
Author

turuslan commented Sep 11, 2025

Hey, thank you for the pull request. If you want to work on this, please check out my comment and the work for that was already started here.

add new keys to the same state and recalculate the state root

Rebuilding the trie from key-values may change the trie structure and root hash.

This has already happened on a live network during the StateVersion V0->V1 migration. Before the migration, on StateVersion V0, all values were stored as inline values.
When the migration began, the runtime API already reported StateVersion V1, meaning that values longer than 32 bytes should be hashed and stored in separate nodes.
But old values were still stored as inline values.
So rebuilding "key"="long ... value" from V0 {"prefix":"key","value":"long ... value"} into V1 {"prefix":"key","value":{"hash":"..."}} would change the root hash.
The migration script iterated keys in lexicographic order and wrote them back as V1.
But other runtime pallets could insert or overwrite their keys as V1 during the migration.
So the trie was inconsistent.
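
A toy sketch of why re-encoding breaks the root hash (stand-ins only: std's DefaultHasher instead of blake2-256 and a made-up leaf encoding instead of sp-trie's codec); the same key-value pair encodes to different leaf bytes under V0 and V1, so every hash up to the root changes:

use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Stand-in hasher, not blake2-256.
fn hash(data: &[u8]) -> [u8; 8] {
    let mut h = DefaultHasher::new();
    h.write(data);
    h.finish().to_le_bytes()
}

// V0: the value is always embedded in the leaf node.
fn leaf_v0(key: &[u8], value: &[u8]) -> Vec<u8> {
    [key, value].concat()
}

// V1: values longer than 32 bytes are replaced by their hash and stored in a
// separate value node.
fn leaf_v1(key: &[u8], value: &[u8]) -> Vec<u8> {
    if value.len() > 32 {
        [key, &hash(value)[..]].concat()
    } else {
        [key, value].concat()
    }
}

fn main() {
    let key = b"key";
    let long_value = [7u8; 64];
    // Different leaf bytes => different leaf hash => different root hash.
    assert_ne!(leaf_v0(key, &long_value), leaf_v1(key, &long_value));
    assert_ne!(hash(&leaf_v0(key, &long_value)), hash(&leaf_v1(key, &long_value)));
}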

State sync response proofs are the originally encoded trie nodes with matching hashes.
Reusing them instead of rebuilding would allow syncing even an inconsistent trie.

Ignoring nodes that are already stored in the db makes state sync incremental and may speed it up.
Storing received nodes of an incomplete trie in the db allows resuming sync after process restarts.
Child nodes are stored first and the root node last, to ensure that node dependencies already exist in the db.

During sync some child nodes don't yet have a parent node referencing them in the db.
Their hashes may be stored in the db for garbage collection.

@turuslan
Author

This PR was backported and tested on Astar, which had an issue with OOM during parachain state sync.
Logs and RAM usage were collected, and the plot below suggests that the modified state sync doesn't increase memory usage.

[plot: RAM usage during Astar parachain state sync]

@bkchr
Member

bkchr commented Sep 11, 2025

State sync response proofs are the originally encoded trie nodes with matching hashes.
Reusing them instead of rebuilding would allow syncing even an inconsistent trie.

Not sure what you are saying here?

The state proofs are already "proofs"; that means all the nodes from the storage root down to the leaves that contain the actual data. If we take these nodes and stick them directly into the db, we don't need to recalculate anything, because we get exactly the nodes.

This has already happened on a live network during the StateVersion V0->V1 migration.

Not sure how this is related here, as we download the nodes directly.

@turuslan
Author

Not sure what you are saying here?
take these nodes and stick them directly into the db, we don't need to recalculate

Yes

Not sure how this is related here, as we download the nodes directly.

Currently polkadot-sdk:

  1. Requests a proof consisting of encoded trie nodes.
  2. Checks node hashes with verify_range_proof.
  3. Converts nodes to key-values with verify_range_proof.
  4. Rebuilds the trie from key-values with reset_storage.

So the received nodes are not used directly, but rebuilt/re-encoded from key-values, which can cause problems.
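
For contrast, a minimal sketch of the alternative argued for here (toy types: a plain map as the db and std's DefaultHasher standing in for the trie hasher), keeping the received encodings and storing each node under its own hash, H(node) => node, so nothing is re-encoded:

use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

// Stand-in for the real trie hasher.
fn hash(node: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(node);
    h.finish()
}

// Insert already-verified proof nodes into the db exactly as they were received.
fn insert_proof_nodes(db: &mut HashMap<u64, Vec<u8>>, proof: Vec<Vec<u8>>) {
    for node in proof {
        // Idempotent: the same node appearing in several responses is harmless.
        db.entry(hash(&node)).or_insert(node);
    }
}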

@bkchr
Member

bkchr commented Sep 11, 2025

So the received nodes are not used directly, but rebuilt/re-encoded from key-values, which can cause problems.

My point being that we directly forward these trie nodes to the db, as it was already started here: #5956

@turuslan
Author

Thanks, I checked #5956 again.

I don't see the complete changes yet:

  • StateImporter is not used yet.
  • import_state still accepts Storage (key-value).

Your review comment suggests forwarding the proof PrefixedMemoryDB to import_state.
But this batch shouldn't be merged into the db completely.
Also, nodes should be inserted in reverse topological order.
Example:

// Root node referencing two leaves
root1 -> leaf1, leaf2
// No nodes in db
db == []

// Request first leaf
response1 == [root1, leaf1]
// Derive next request prefix

a. Insert whole proof into db
  // Insert root node and first leaf
  db == [root1, leaf1]

  a1. Sync continues
    // Request second leaf
    response2 == [root1, leaf2]

    // Insert root node (duplicate) and second leaf
    db == [root1, leaf1, leaf2]

    // All nodes in db, sync complete

  a2. Process restarts, sync restarts
    // Root node is already in database,
    // so sync is considered complete,
    // but db is missing second leaf.
    db == [root1, leaf1, (leaf2)]

b. Insert a node if its dependencies are already in db
  // First leaf doesn't depend on anything, insert
  db == [leaf1]
  // Root node still depends on second leaf, don't insert yet

  b1. Sync continues
    // Continue sync

  b2. Process restarts, sync restarts
    // Root node is not in database,
    // so sync should resume

    // Request log(N) branches, will skip prefixes of nodes already in db
    // May cache stack of not yet inserted nodes to reduce requests after restart
    response1 == [root1, leaf1]
    ...

  // Request second leaf
  response2 == [root1, leaf2]

  // Second leaf doesn't depend on anything, insert
  db == [leaf1, leaf2]
  // Now all root dependencies are in db, insert
  db == [leaf1, leaf2, root1]

  // All nodes in db, sync complete
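
A minimal Rust sketch of option (b), assuming toy types (u64 hashes, a map-backed db, and children() standing in for real trie-node decoding); this is only an illustration, not the StateSync implementation:

use std::collections::HashMap;

type Hash = u64;

// Stand-in for decoding a node and listing the child hashes it references.
fn children(_node: &[u8]) -> Vec<Hash> {
    Vec::new()
}

// Move nodes from `pending` into `db` only once all of their children are in
// the db, so parents (and finally the root) are always written last.
fn flush_ready(db: &mut HashMap<Hash, Vec<u8>>, pending: &mut HashMap<Hash, Vec<u8>>) {
    loop {
        let ready: Vec<Hash> = pending
            .iter()
            .filter(|(_, node)| children(node.as_slice()).iter().all(|c| db.contains_key(c)))
            .map(|(h, _)| *h)
            .collect();
        if ready.is_empty() {
            // Whatever is left (e.g. the root) waits for future responses.
            break;
        }
        for h in ready {
            let node = pending.remove(&h).expect("collected from pending above");
            db.insert(h, node);
        }
    }
}

// On each response: hash-verify the nodes first, then buffer and flush.
fn import_partial_state(
    db: &mut HashMap<Hash, Vec<u8>>,
    pending: &mut HashMap<Hash, Vec<u8>>,
    verified_response: Vec<(Hash, Vec<u8>)>,
) {
    pending.extend(verified_response);
    flush_ready(db, pending);
}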

If there are problems with syncing child storage,
they may be related to a missing child_storage_root_hash prefix in the db key.

db[nibble_prefix + node_hash] = node
db[child_storage_root_hash + nibble_prefix + node_hash] = child_storage_node
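
A tiny sketch of that key scheme (a simplification for illustration, not sc-client-db's exact key layout):

// Top-level trie nodes are keyed by (nibble prefix ++ node hash); child-trie
// nodes additionally get the child storage root in front, so equal nodes in
// different child tries don't end up sharing one db entry.
fn top_level_key(nibble_prefix: &[u8], node_hash: &[u8]) -> Vec<u8> {
    [nibble_prefix, node_hash].concat()
}

fn child_trie_key(child_root: &[u8], nibble_prefix: &[u8], node_hash: &[u8]) -> Vec<u8> {
    [child_root, nibble_prefix, node_hash].concat()
}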

@bkchr
Member

bkchr commented Sep 12, 2025

But this batch shouldn't be merged into the db completely.

Why? All the nodes that are part of the proof are part of the original trie. Why should we not merge all these nodes into the db?

Also nodes should be inserted in reverse topological order.

I don't get why the order is important here. Again, I'm basically just saying that we write the nodes with H(Node) => Node into the backend.

@turuslan
Author

why not merge all? why is the order important?

The example shows that merging the whole proof may break the database.

In that example there are three nodes:
a root node with hash root1,
and its two child leaves with hashes leaf1 and leaf2.
When state sync starts,
root1 is not yet in the database,
so sync is not complete.
After receiving the first proof with the root1 and leaf1 nodes,
they may be merged into the db.

  • (whole proof)
    After merging the whole first proof,
    the db contains root1 and leaf1.
    If the process restarts after merging the first proof,
    it will see that root1 is already in the db,
    and consider state sync completed.
    But it hasn't received leaf2 yet,
    so the db doesn't contain the whole trie.

  • (order)
    If the process stops between inserting the root1 and leaf1 nodes,
    the db would contain root1, but not leaf1.
    Again, if root1 is in the db,
    state sync is considered complete.

I assume that state sync doesn't recurse into a branch
if that branch hash is already in the db,
i.e. the db contains the whole subtree under that branch.

@bkchr
Member

bkchr commented Sep 12, 2025

  • If the process restarts after merging the first proof,
    it will see that root1 is already in the db,
    and consider state sync completed.

We can just store that we did not yet finish the state sync. Right now we don't support a restart anyway.
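
A minimal sketch of such a marker, assuming a simple made-up key-value store (not an existing Substrate API): record the target root while a state sync is in flight, and only trust the root's presence in the db once the marker is cleared.

const STATE_SYNC_IN_PROGRESS: &[u8] = b"state_sync_in_progress";

// Stand-in for whatever auxiliary key-value storage the backend exposes.
trait AuxKv {
    fn put(&mut self, key: &[u8], value: &[u8]);
    fn delete(&mut self, key: &[u8]);
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
}

fn start_state_sync(aux: &mut impl AuxKv, target_root: &[u8]) {
    aux.put(STATE_SYNC_IN_PROGRESS, target_root);
}

fn finish_state_sync(aux: &mut impl AuxKv) {
    // Cleared only after the last partial state import made the state complete.
    aux.delete(STATE_SYNC_IN_PROGRESS);
}

fn state_is_complete(aux: &impl AuxKv, root_in_db: bool) -> bool {
    root_in_db && aux.get(STATE_SYNC_IN_PROGRESS).is_none()
}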

- fix trie decode compact prefix db
- partial state import operation
- support block import after partial state import
- import state sync proofs as partial state instead of accumulating key-values
return ImportResult::BadResponse
}
let complete = if !self.metadata.skip_proof {
let (complete, partial_state) = if !self.metadata.skip_proof {
Member

I think we can change it to always require proofs. Otherwise a non-verified state sync would be problematic.

Member

We just don't need to verify the proof.

Author

Can the old sync without proofs be removed in a separate PR, to simplify the review process?
I may create an issue to specify the requirements.

async fn import_block(&self, block: BlockImportParams<B>) -> Result<ImportResult, Self::Error>;

/// Import partial state.
async fn import_partial_state(&self, partial_state: PrefixedMemoryDB<HashingFor<B>>) -> Result<(), Self::Error>;
Member

I'm not sure I'm 100% happy with this approach (I mean introducing a new function).

However, I still need to think a bit about what would be best.

Author

There is an import_justification function separate from import_block.
import_block imports a block, and may import related data.
import_justification imports a justification without a block.

Like justification import, partial state import is separate from block import.
The partial state import operation is repeated many times,
so the block can't be imported until the last partial state import makes the state complete.
Also, import_block has many side effects unrelated to partial state, which should happen after the block is imported.

Have you found better solutions?
How should we proceed?


@bkchr do you have any suggestions with how to proceed with the PR?
cc @turuslan

Member

I know the already existing functions are very sparse in documentation too, but that's not a good standard - so let's please add some docs: What this function is good for, why it exists, which invariants it assumes, what even is "partial state" in this context, ... What happens if partial state never becomes complete? How does it interact with other functions? ...

@turuslan turuslan mentioned this pull request Nov 12, 2025
@turuslan turuslan marked this pull request as draft November 14, 2025 09:18
}

fn import_partial_state(&self, mut partial_state: PrefixedMemoryDB<HashingFor<Block>>) -> sp_blockchain::Result<()> {
self.storage.db.commit(Transaction(


Is there a way this data is cleaned up from the storage in case of, for example, unsuccessful state sync attempts? Won't it be possible to flood a node with invalid partial states somehow?

Author

The client checks with the merkle proof that received nodes are reachable from the state root,
so all inserted nodes are valid.
Most of these nodes don't change and would be reused in subsequent state sync attempts.


@Harrm Harrm left a comment


Accidentally sent part of the comments from another account

@turuslan turuslan marked this pull request as ready for review November 26, 2025 15:04

/// Commit complete partial state.
/// `sc-client-db` expects blocks with state to be marked.
/// Otherwise it complains that state is not found.
Member

What is a complete partial state? Sounds like an oxymoron deserving a better description.


@github-actions github-actions bot requested a review from Harrm December 10, 2025 05:46

@turuslan turuslan requested a review from eskimor December 10, 2025 05:47

turuslan commented Dec 12, 2025

Attaching Claude security analysis (from Element chat) PR_9247_SECURITY_ANALYSIS.md

@turuslan
Author

Attaching Claude security analysis (from Element chat) PR_9247_SECURITY_ANALYSIS.md

  1. Partial State Bypasses StateDB - Orphaned Nodes Accumulate Forever
    ✅ Refactored.
    Added StateDB integration.
    State sync is used on finalized blocks, so the pruner only uses Changeset.deleted and doesn't use Changeset.inserted.
    State sync only generates Changeset.inserted and doesn't generate Changeset.deleted, because the previous block state is not available.
    State-synced block keys will be deleted later, when there is a descendant block with a corresponding Changeset.deleted.

  2. No Final State Validation After All Chunks Imported
    ❓ Refactor?
    decode_compact and verify_range_proof decode the same nodes.
    Block import happens only after successful verify_range_proof verification.
    The duplicate decode_compact call was added because verify_range_proof doesn't return PrefixedMemoryDB and uses MemoryDB instead.

  • Should we add a function similar to verify_range_proof, but returning PrefixedMemoryDB instead of key-values?

  3. No Transaction Rollback for Failed Partial State Imports
    ❓ Need discussion.
    Initially this PR proposed adding a function for importing partial state to ProofProvider.
    This way StateSync could see whether the partial state import was successful, or whether there was a problem writing to the db.
    In a later discussion during a call with the Parity team, they asked to use the block import pipeline, so we tried to make it similar to block/justification import.
    We refactored StateSync so it wouldn't call the ProofProvider function directly,
    but return ImportResult with the partial state, so the parent component would put that partial state into the import queue.
    Import queue functions (e.g. import_blocks/import_justifications) don't return an error,
    so StateSync doesn't know whether the partial state was imported or not.
  • Should the node panic if there was a write error during partial state import?
  • Should we add import_partial_state back to ProofProvider, instead of using the import queue, to receive an error result?
  • What should StateSync do if a write error occurs?
    Should it panic/retry/hang?
    Should it return an error to the parent component, and what should that parent component do?

  4. Race Condition: Concurrent Partial State Imports
    ✅ Fixed.
    Added StateDB integration and write deduplication.
    Now StateDB stores a set of partial state keys to avoid writing a key twice for a given block (see the sketch after this list).

  5. Memory Exhaustion via Unbounded Channel
    ❓ Analogous code exists in master.
    The client sends a state sync request, receives a state sync response, and then imports it as partial state.
    So the server can't flood the client without requests from the client side.
    The same applies to block and justification import.
    We may remove the channel and call Client/Backend directly (see question 3).

  • Should this and the block/justification channels be bounded in another PR?
  • Should we add import_partial_state back to ProofProvider, which is used by the state sync client, instead of using import queue channels?

  6. No Atomicity Between Partial State Chunks
    ✅ Fixed.
    Added write deduplication (see question 4 and the sketch after this list).
    There are no missing chunks (see question 2).

  7. No Error Handling for Database Commit Failures
    ❓ (see question 3)

  8. Missing State Root Validation on Resume
    ❓ Need discussion.
    A usual node doesn't state sync multiple blocks simultaneously,
    but it may resume state sync from a later block.
    Removal of an incomplete previous state sync should not happen on a write error or node restart,
    because the nodes already in the db can be reused by state sync to other blocks
    (not in scope of this PR, see State sync v3 #10296).
    In this PR StateDB stores partial state keys, so code cleaning up an incomplete state sync can be added.

  • Should this cleanup be added in this PR?
  • Should this cleanup occur on block import after state sync completes?

  9. Proof Verification Happens After Decode
    ✅ Fixed.
    Reordered the decode_compact and verify_range_proof calls.

  10. Channel Closure Handling
    ✅ Fixed.
    Fixed the log message.
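
A minimal sketch of the deduplication mentioned in items 4 and 6 (toy types, not the sc-state-db API): remember which node keys were already committed for the block being synced and drop duplicates from later responses.

use std::collections::HashSet;

struct PartialStateTracker {
    // Keys already committed for the block currently being state-synced.
    written: HashSet<Vec<u8>>,
}

impl PartialStateTracker {
    fn new() -> Self {
        Self { written: HashSet::new() }
    }

    // Keep only the (key, node) pairs that were not written before.
    fn filter_new(
        &mut self,
        nodes: impl IntoIterator<Item = (Vec<u8>, Vec<u8>)>,
    ) -> Vec<(Vec<u8>, Vec<u8>)> {
        nodes
            .into_iter()
            .filter(|(key, _)| self.written.insert(key.clone()))
            .collect()
    }
}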

