Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compute diff of validators list manually #6556

Merged

Conversation

dapplion
Copy link
Collaborator

Issue Addressed

From #5978 (review) I've seen that xdelta3 struggles with very large inputs.

The current approach of throwing the entire state to diff algorithm is nice because we don't need to deal with forks. However the state has only a few fields that are relevant for byte size

N = number of validators

BeaconStateElectra Field Size (bytes) Change
genesis_time -
genesis_validators_root -
slot -
fork -
latest_block_header -
block_roots 32 * 8192 Mutate 32 items per epoch
state_roots 32 * 8192 Mutate 32 items per epoch
historical_roots: 32 * 328724 Frozen since capella
eth1_data -
eth1_data_votes 0 .. 147456 1 add every slot
eth1_deposit_index -
validators 121 * N 0..8 adds every epoch (bounded by churn), unbounded mutations
balances 8 * N Mutate most values change every epoch
randao_mixes 32 * 65536 Mutate 1 every epoch
slashings -
previous_epoch_participation 1 * N Mutate most values every epoch
current_epoch_participation 1 * N Mutate most values every epoch
justification_bits -
previous_justified_checkpoin -
current_justified_checkpoint -
finalized_checkpoint -
inactivity_scores 8 * N Always 0 unless non-finality, then most could mutate
current_sync_committee -
next_sync_committee -
latest_execution_payload_hea -
next_withdrawal_index -
next_withdrawal_validator_in -
historical_summaries 64 * TODO 1 add every 256 epochs
deposit_requests_start_index -
deposit_balance_to_consume -
exit_balance_to_consume -
earliest_exit_epoch -
consolidation_balance_to_con -
earliest_consolidation_epoch -
pending_deposits 192 * ?? Unbounded push-pop queue
pending_partial_withdrawals 24 * ?? Unbounded push-pop queue
pending_consolidations 16 * ?? Unbounded push-pop queue

Plotting the impact on total state size per field we get

Screenshot 2024-10-31 at 20 50 05

So this PR explores handling the biggest offenders manually and leaving the rest to xdelta3

Proposed Changes

Compute a special ValidatorsDiff which is:

  • Compute a validator record diff where each field is the y value if changed, or a default value if equal.
  • Instead of storing all items, store a tuple of (index, validator_diff) to reduce the total input the compressor has to manage. This makes a huge difference in performance!
  • Compress the result. This last step of compression is not super important but makes it easier to compress records that are mostly untouched

Also re-use the existing logic for balance diffs for inactivity scores

Results

tree-states-archive branch

commit f5b42e2

Times

compute hdiff n=1000000 v_mut=0 v_add=0 [35.187 ms 35.421 ms 35.843 ms]
compute hdiff n=1500000 v_mut=0 v_add=0 [54.489 ms 55.719 ms 58.521 ms]
compute hdiff n=2000000 v_mut=0 v_add=0 [73.075 ms 73.718 ms 75.247 ms]

compute hdiff n=1000000 v_mut=100 v_add=10 [342.67 ms 440.49 ms 598.40 ms]
compute hdiff n=1500000 v_mut=100 v_add=10 [518.29 ms 602.20 ms 743.59 ms]
compute hdiff n=2000000 v_mut=100 v_add=10 [649.77 ms 670.14 ms 700.15 ms]

compute hdiff n=1000000 v_mut=1000 v_add=100 [342.39 ms 353.71 ms 363.08 ms]
compute hdiff n=1500000 v_mut=1000 v_add=100 [506.88 ms 566.03 ms 643.94 ms]
compute hdiff n=2000000 v_mut=1000 v_add=100 [665.69 ms 814.25 ms 1.0424 s]

compute hdiff n=1000000 v_mut=100000 v_add=100 [1.0735 s 1.1561 s 1.2658 s]
compute hdiff n=1500000 v_mut=100000 v_add=100 [1.0565 s 1.1065 s 1.1676 s]
compute hdiff n=2000000 v_mut=100000 v_add=100 [1.6224 s 1.7003 s 1.8073 s]

Diff size

compute hdiff n=1000000 v_mut=1000 v_add=100 - 3,066,453
compute hdiff n=1500000 v_mut=1000 v_add=100 - 4,549,252
compute hdiff n=2000000 v_mut=1000 v_add=100 - 6,030,667

compute hdiff n=1000000 v_mut=100000 v_add=100 - 11,236,678
compute hdiff n=1500000 v_mut=100000 v_add=100 - 12,869,200

This branch

commit aaeffcc

compute hdiff n=1000000 v_mut=0 v_add=0 [25.360 ms 25.402 ms 25.484 ms]
compute hdiff n=1500000 v_mut=0 v_add=0 [38.702 ms 38.886 ms 39.273 ms]
compute hdiff n=2000000 v_mut=0 v_add=0 [52.190 ms 52.402 ms 52.895 ms]

compute hdiff n=1000000 v_mut=1000 v_add=100 [26.545 ms 26.652 ms 26.848 ms]
compute hdiff n=1500000 v_mut=1000 v_add=100 [39.236 ms 39.431 ms 39.772 ms]
compute hdiff n=2000000 v_mut=1000 v_add=100 [53.665 ms 53.904 ms 54.391 ms]

compute hdiff n=1000000 v_mut=100000 v_add=100 [102.26 ms 115.23 ms 145.19 ms]
compute hdiff n=1500000 v_mut=100000 v_add=100 [110.87 ms 112.32 ms 114.46 ms]
compute hdiff n=2000000 v_mut=100000 v_add=100 [116.55 ms 136.00 ms 176.70 ms]

Note: Storing all validator diffs increases compute time by 10x

Diff size

compute hdiff n=1000000 v_mut=1000 v_add=100 - 3,011,040
compute hdiff n=1500000 v_mut=1000 v_add=100 - 4,494,934
compute hdiff n=2000000 v_mut=1000 v_add=100 - 5,978,865

} else {
let pubkey_changed = y.pubkey != x.pubkey;
// Note: If researchers attempt to change the Validator container, go quickly to
// All Core Devs and push hard to add another List in the BeaconState instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

   😹

Copy link
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good

I think the main thing we should add before we sync some nodes with this is versioning on the HDiff type. I think superstruct & and a tagged SSZ repr should be sufficient.

beacon_node/store/src/hdiff.rs Outdated Show resolved Hide resolved
beacon_node/store/src/hdiff.rs Outdated Show resolved Hide resolved
beacon_node/store/src/hdiff.rs Outdated Show resolved Hide resolved
beacon_node/store/src/hdiff.rs Show resolved Hide resolved
beacon_node/store/src/hdiff.rs Outdated Show resolved Hide resolved
beacon_node/store/src/hdiff.rs Show resolved Hide resolved
beacon_node/store/src/hdiff.rs Outdated Show resolved Hide resolved
beacon_node/store/src/hdiff.rs Outdated Show resolved Hide resolved
beacon_node/store/src/hdiff.rs Show resolved Hide resolved
beacon_node/store/src/hdiff.rs Outdated Show resolved Hide resolved
@dapplion dapplion marked this pull request as ready for review November 11, 2024 19:54
@michaelsproul michaelsproul merged commit b14a924 into sigp:tree-states-archive Nov 14, 2024
28 checks passed
@michaelsproul michaelsproul deleted the tree-states-archive-diffplus branch November 14, 2024 06:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants