Fine-grained Pending state #1575

Open
NingLin-P wants to merge 2 commits into master

Conversation

NingLin-P

The Pending state is a list used to keep track of all the (transaction, status, receipt) tuples in the block during block execution; in on_finalize this data is then moved to CurrentBlock/CurrentReceipts/CurrentTransactionStatuses for RPC service usage.

Because the Pending state is a StorageValue<_, Vec<...>>, which is stored as a single value in the trie, appending an item to the list means reading, updating, and writing that same value. This gets slower and slower as the list grows during block execution, and depending on the number of txs in the block it can result in a 3-20x slowdown (see below).

This PR fixes the issue by refactoring the Pending state to replace the StorageValue with a StorageMap, so each tx is stored as a separate value in the trie. Although this results in more db deletes in on_finalize, in practice it is much faster than updating a single big list.
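
For illustration, the shape of the change looks roughly like this (a simplified before/after sketch of the storage declarations, not the exact diff; the tuple types follow pallet-ethereum and the counter name NextTxIndex is the one discussed below):

```rust
// Before: the whole pending list lives behind one storage key, so every append
// re-reads and re-writes the entire Vec.
#[pallet::storage]
pub type Pending<T: Config> =
    StorageValue<_, Vec<(Transaction, TransactionStatus, Receipt)>, ValueQuery>;

// After (sketch): one entry per transaction, keyed by its index within the block,
// so each append only writes one small value; a separate counter tracks the length.
#[pallet::storage]
pub type Pending<T: Config> =
    StorageMap<_, Twox64Concat, u32, (Transaction, TransactionStatus, Receipt), OptionQuery>;

#[pallet::storage]
pub type NextTxIndex<T: Config> = StorageValue<_, u32, ValueQuery>;
```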

Below is a table of the block execution time for the different Pending states, where N is the total number of Ethereum::transact calls in the block; each tx transfers funds from the same Alice account to a different new account. The test was run on a MacBook Pro, Apple M1 Pro, 10 cores.

| Tx in block | N = 1000 | N = 2000 | N = 4000 | N = 5000 | N = 6000 | N = 8000 |
| --- | --- | --- | --- | --- | --- | --- |
| Current Pending state | 690ms | 2_498ms | 9_493ms | 14_921ms | 21_383ms | 37_426ms |
| No Pending state | 194ms | 382ms | 780ms | 1_047ms | 1_187ms | 1_596ms |
| Fine-grained Pending state | 221ms | 449ms | 869ms | 1_103ms | 1_329ms | 1_772ms |

NingLin-P requested a review from sorpaas as a code owner on December 20, 2024
@conr2d (Contributor) commented Dec 25, 2024

Interesting results. I thought the Polkadot SDK was optimized enough for this case, but the benchmark you shared suggests otherwise. Could you share the script you used for benchmarking?

@NingLin-P (Author)

> Could you share the script you used for benchmarking?

The benchmark was originally done in our testing framework for the EVM domain (an optimistic rollup based on frontier), but there are too many other factors there that may affect the result, so I wrote a few unit tests in frontier to demonstrate the issue more easily.

You can pull this branch and run the test in the last 3 commits:

```sh
cargo test -r -p pallet-ethereum -- --exact tests::evm_transact_1000 --nocapture
cargo test -r -p pallet-ethereum -- --exact tests::evm_transact_4000 --nocapture
cargo test -r -p pallet-ethereum -- --exact tests::evm_transact_8000 --nocapture
```

The last 3 commits correspond to the Current Pending state, No Pending state, and Fine-grained Pending state cases respectively.

NOTE: the result is not exactly the same as the one in the PR description, since this runs as a unit test in TestExternalities, but it should be enough to show the improvement this PR brings.

@NingLin-P (Author)

Could you take a look, please? @conr2d @boundless-forest

@conr2d (Contributor) commented Jan 8, 2025

Your tests contain some issues that should be addressed:

  • Deriving private keys and signing transactions should be performed outside of the time measurement.
  • The benchmark should include not only transact but also on_finalize to provide a complete picture.

That said, I did observe the performance degradation you mentioned. However, there are a few considerations to take into account:

  • sp_io::TestExternalities operates entirely in memory, which does not accurately reflect real-world scenarios.
  • Your changes may increase database reads, which are typically expensive in most cases.
  • During single block execution, storage changes are not written directly to persistent storage (disk) all at once; they are handled in memory. Nonetheless, benchmarking for weight calculations treats them as database reads.
  • You might find CountedStorageMap useful for your specific case.

While I agree that there is potential to improve the current structure, we need to find the root cause of the performance degradation and validate it with benchmark results in a real-world environment. This should include testing in a WASM runtime with RocksDB to ensure the results are representative.

@NingLin-P (Author)

Thank you! @conr2d

> Your tests contain some issues that should be addressed:

You are right. I have force-pushed the branch and updated the tests as you suggested, and the results do look slightly different (taking evm_transact_8000 as an example):

  • For No Pending state and Fine-grained Pending state, the time is reduced from 1703ms & 1726ms -> 819ms & 896ms respectively
  • For Current Pending state, the time is mostly the same: 17961ms -> 17881ms

> Your changes may increase database reads, which are typically expensive in most cases.
> During single block execution, storage changes are not written directly to persistent storage (disk) all at once; they are handled in memory. Nonetheless, benchmarking for weight calculations treats them as database reads.
> You might find CountedStorageMap useful for your specific case.

Do you mean the read of NextTxIndex during apply extrinsic? I think it should be fine, because appending an item to the Pending state requires 1 read & 1 write, while the change requires 1 read of NextTxIndex and 2 writes (for NextTxIndex and Pending). Also, I did consider CountedStorageMap, but it doesn't preserve insertion order when iterating and may produce the wrong result during store_block.
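
To make the read/write counts above concrete, the append path during apply extrinsic would look roughly like this (a sketch using the illustrative names from this discussion, not the exact diff):

```rust
// 1 read (NextTxIndex) + 2 writes (the new Pending entry and the bumped counter),
// versus the old path's read-modify-write of the entire pending Vec.
let index = NextTxIndex::<T>::get();
Pending::<T>::insert(index, (transaction, status, receipt));
NextTxIndex::<T>::put(index + 1);
```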

> sp_io::TestExternalities operates entirely in memory, which does not accurately reflect real-world scenarios.
> we need to find the root cause of the performance degradation and validate it with benchmark results in a real-world environment. This should include testing in a WASM runtime with RocksDB to ensure the results are representative.

I agree that the test is not rigorous benchmarking; it is more of a comparison between the different types of Pending state, as there is a 10-20x difference in execution time. The result in the PR description was produced in a more real-world scenario (with a node, backend db, etc), but there are other factors in our code that may affect the result, so it may not be representative. I will find a way to benchmark with the template node in this repo and come back later.

As for the root cause, when benchmarking in our code I did observe in the TRACE log that the time is indeed spent on appending to the Pending state: while the first tx spent <1ms, the 1000th tx spent >10ms (see the key 2013754dd003840aea66b349f8241e25f3a612d53af4d3e14369f02b54c7ca4f):

```
2024-12-17 03:42:01.379 TRACE tokio-runtime-worker state: [//Alice (DomainId(0))] method="Get" ext_id=9e9b key=26aa394eea5630e07c48ae0c9558cef799e7f93fc6a98f0874fd057f111c4d2d result=Some(..)
2024-12-17 03:42:01.379 TRACE tokio-runtime-worker state: [//Alice (DomainId(0))] method="Get" ext_id=9e9b key=2013754dd003840aea66b349f8241e25f3a612d53af4d3e14369f02b54c7ca4f result=Some(..)
2024-12-17 03:42:01.391 TRACE tokio-runtime-worker state: [//Alice (DomainId(0))] method="Append" ext_id=9e9b key=2013754dd003840aea66b349f8241e25f3a612d53af4d3e14369f02b54c7ca4f value=..
2024-12-17 03:42:01.391 TRACE tokio-runtime-worker state: [//Alice (DomainId(0))] method="Get" ext_id=9e9b key=26aa394eea5630e07c48ae0c9558cef702a5c1b19ab7a04f536c519aca4983ac result=Some(..)
```

Furthermore, I checked in the unit test that the current Pending state grows to ~6.7MB in size after 8000 txs, and a Vec is a contiguous chunk of memory, so if it takes 1ms to allocate/copy such a big chunk of memory then that alone adds up to 1s for the last 1000 txs.
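
To illustrate this growth pattern outside of Substrate, below is a small standalone sketch (plain Rust, no storage involved; the per-item size is a rough assumption derived from the ~6.7MB / 8000 txs figure above) showing why repeatedly re-reading and re-writing one growing value scales roughly quadratically while per-item writes stay linear:

```rust
use std::time::Instant;

fn main() {
    const TXS: usize = 8_000;
    const ITEM_SIZE: usize = 850; // rough per-tx payload, ~6.7MB / 8000 txs

    // Single growing value: every "append" copies the whole buffer
    // (read + push + write back), so total work is O(n^2) in the number of txs.
    let mut single_value: Vec<u8> = Vec::new();
    let start = Instant::now();
    for _ in 0..TXS {
        let mut copy = single_value.clone();                  // read the whole list
        copy.extend(std::iter::repeat(0u8).take(ITEM_SIZE));  // append one item
        single_value = copy;                                  // write the whole list back
    }
    println!("single value: {:?}", start.elapsed());

    // One entry per tx: each "append" only touches its own small value, O(n) total.
    let mut per_item: Vec<Vec<u8>> = Vec::new();
    let start = Instant::now();
    for _ in 0..TXS {
        per_item.push(vec![0u8; ITEM_SIZE]);
    }
    println!("one entry per tx: {:?}", start.elapsed());
}
```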

@conr2d (Contributor) commented Jan 9, 2025

> Do you mean the read of NextTxIndex during apply extrinsic? I think it should be fine, because appending an item to the Pending state requires 1 read & 1 write, while the change requires 1 read of NextTxIndex and 2 writes (for NextTxIndex and Pending). Also, I did consider CountedStorageMap, but it doesn't preserve insertion order when iterating and may produce the wrong result during store_block.

Regarding on_finalize, with the original code, all pending transactions can be accessed with a single database read. However, with the new implementation, each transaction needs to be fetched individually from the database.
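
For comparison, the two on_finalize read patterns discussed here look roughly like this (a sketch with illustrative names; the actual code differs):

```rust
// Original shape: one storage read returns (and clears) the whole pending list.
let pending: Vec<(Transaction, TransactionStatus, Receipt)> = Pending::<T>::take();

// New shape: one read (and delete) per transaction, walked in insertion order.
let count = NextTxIndex::<T>::take();
let pending: Vec<(Transaction, TransactionStatus, Receipt)> =
    (0..count).filter_map(|i| Pending::<T>::take(i)).collect();
```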

For more on CountedStorageMap, you can refer to this commit: conr2d/frontier@e7a2ac9.

@conr2d (Contributor) commented Jan 12, 2025

@boundless-forest @crystalin What are your thoughts on this? The issue addressed in this PR appears reasonable to me, as the Pending storage is used only within the same block and cleared afterward. As such, its overhead is primarily influenced by large-sized memory allocations rather than persistent database read/writes.

@boundless-forest (Collaborator)

The changes make sense to me. I want to confirm once more that the changes don't pose a storage compatibility issue, as the storage is only utilized within a block, and no migration is necessary.

@NingLin-P (Author)

Hey, I have updated this branch to include a benchmark-block-execution command. Its workload is similar to the previous unit test, but it runs in a real-world environment (e.g. with a client, db backend, WASM runtime, etc).

To run the benchmark on your machine, pull the branch, and for each of the last 3 commits (which correspond to the Current Pending state, No Pending state, and Fine-grained Pending state cases respectively) run:

```sh
# Build the template node
cargo build -r --bin frontier-template-node

# Run the benchmark to execute a block with 8000 txs
./target/release/frontier-template-node benchmark-block-execution --dev --number 8000
```

On my machine, the result is Current Pending state: 36_878ms, No Pending state: 1_759ms, and Fine-grained Pending state: 1_820ms (which is similar to the result in the PR description).

Also, this PR has been updated to use CountedStorageMap. PTAL, thanks!
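
For reference, with CountedStorageMap the entry count is maintained by the storage type itself, so the declaration would take roughly this shape (a sketch only, not the exact diff; the tuple types are assumed from pallet-ethereum):

```rust
#[pallet::storage]
pub type Pending<T: Config> = CountedStorageMap<
    _,
    Twox64Concat,
    u32,
    (Transaction, TransactionStatus, Receipt),
    OptionQuery,
>;
```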
