-
-
Notifications
You must be signed in to change notification settings - Fork 405
feat: sync from unfinalized checkpoint state #8527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: unstable
Are you sure you want to change the base?
Conversation
**Motivation** - we already have local checkpoint states in either db or file, this PR use one of them to save syncing time the next time the node is restarted - see #7504 (comment) **Description** 2 new init unfinalized state options: - new `--lastPersistedCheckpointState`, true by default to load from the last safe persisted checkpoint state. And the last safe persisted checkpoint state is considered unfinalized - use `--checkpointState` option: this is old option and support finalized state only, this PR supports booting from unfinalized state similar to `--lastPersistedCheckpointState` ==> to quickly sync a new node, we can use this option. Then remove it and the next time node will use `--lastPersistedCheckpointState` option by default - we can configure one of our nodes with `chain.nHistoricalStatesFileDataStore = true` - then feed other nodes with a persisted "safe checkpoint state" from there in `~/checkpoint_states` folder - a persisted checkpoint state is consider to be safe to boot if - it should be the checkpoint state that's unique in its epoch - its last processed block slot should be at epoch boundary or last slot of previous epoch - state slot should be at epoch boundary - state slot should be equal to epoch * SLOTS_PER_EPOCH other options to boot from `stateArchived` or `checkpointSyncUrl` are considered finalized states including `wssCheckpoint`. It's not possible to use `wssCheckpoint` option with unfinalized state for now. **TODO** - this is for `holesky-rescue`, consider supporting this for `unstable` branch too - update document in that case --------- Co-authored-by: Tuyen Nguyen <[email protected]>
**Motivation** - implement an api to get a node synced asap **Description** - new api: `eth/v1/lodestar/persisted_checkpoint_state` to return a state based on an optional `rootHex:epoch` param - if not specified, return the latest safe checkpoint state - a node need to specify `--checkpointState` from the previous PR #7509 **Test** - [x] `curl -H "Accept: application/octet-stream" http://localhost:9596/eth/v1/lodestar/persisted_checkpoint_state -o latest_checkpoint_state.ssz` - [x] `curl -H "Accept: application/octet-stream" http://localhost:9596/eth/v1/lodestar/persisted_checkpoint_state?checkpoint_id=0x4f4d4c1b81141fe77a4a1c6a376dbe64ed9baa8f123664195e4f710c9fc4238d:118936 -o state_epoch_118936.ssz` --------- Co-authored-by: Tuyen Nguyen <[email protected]> Co-authored-by: Nico Flaig <[email protected]>
…ied (#7542) We have added a new api in #7541 to serve latest checkpoint states. However the problem is that `--checkpointState` right now assumes that we receive a ssz-serialized state bytes, so we also need to ensure that the REST API only returns SSZ responses by applying the correct Accept header. This PR includes the following changes - Ensure we only receive SSZ responses if REST API is queried - Consider `--forceCheckpointSync` flag and if set always prioritze remote checkpoint from `--checkpointState` over latest safe checkpoint in db - Minor tweaks to logs to make it easier to understand what the node is doing - Misc formatting changes
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## unstable #8527 +/- ##
============================================
- Coverage 52.21% 52.03% -0.19%
============================================
Files 852 852
Lines 65101 65528 +427
Branches 4770 4782 +12
============================================
+ Hits 33995 34098 +103
- Misses 31037 31361 +324
Partials 69 69 🚀 New features to boost your workflow:
|
Performance Report✔️ no performance regression detected Full benchmark results
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a significant feature: the ability to sync from an unfinalized checkpoint state. This is particularly useful for faster node startup and recovery, especially during periods of non-finality. The implementation is comprehensive, touching upon API definitions, CLI options, state initialization logic, and the core fork choice mechanism. The introduction of initializeForkChoiceFromUnfinalizedState
with its proto-array scaffolding is a clever approach to handle this complex scenario. The code is well-structured, and the new "unsafe" options are appropriately hidden and documented with warnings. Overall, this is a solid implementation of a complex feature. I have a couple of minor suggestions for code quality and performance improvements.
/** | ||
* Extract a checkpoint from a string in the format `rootHex:epoch`. | ||
*/ | ||
export function getCheckpointFromArg(checkpointStr: string): Checkpoint { | ||
const checkpointRegex = /^(?:0x)?([0-9a-f]{64}):([0-9]+)$/; | ||
const match = checkpointRegex.exec(checkpointStr.toLowerCase()); | ||
if (!match) { | ||
throw new ApiError(400, `Could not parse checkpoint string: ${checkpointStr}`); | ||
} | ||
return {root: fromHex(match[1]), epoch: parseInt(match[2])}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a minor performance improvement and to follow best practices, the regular expression can be defined as a constant outside the function. This prevents it from being recompiled on every function call.
/** | |
* Extract a checkpoint from a string in the format `rootHex:epoch`. | |
*/ | |
export function getCheckpointFromArg(checkpointStr: string): Checkpoint { | |
const checkpointRegex = /^(?:0x)?([0-9a-f]{64}):([0-9]+)$/; | |
const match = checkpointRegex.exec(checkpointStr.toLowerCase()); | |
if (!match) { | |
throw new ApiError(400, `Could not parse checkpoint string: ${checkpointStr}`); | |
} | |
return {root: fromHex(match[1]), epoch: parseInt(match[2])}; | |
} | |
const CHECKPOINT_REGEX = /^(?:0x)?([0-9a-f]{64}):([0-9]+)$/; | |
/** | |
* Extract a checkpoint from a string in the format `rootHex:epoch`. | |
*/ | |
export function getCheckpointFromArg(checkpointStr: string): Checkpoint { | |
const match = CHECKPOINT_REGEX.exec(checkpointStr.toLowerCase()); | |
if (!match) { | |
throw new ApiError(400, `Could not parse checkpoint string: ${checkpointStr}`); | |
} | |
return {root: fromHex(match[1]), epoch: parseInt(match[2])}; | |
} |
if (datastoreKey == null) { | ||
// should not happen | ||
continue; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No description provided.