-
-
Notifications
You must be signed in to change notification settings - Fork 405
refactor: update the structure of state regen operation #8509
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feature/differential-archive
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @nazarhussain, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This PR refactors the state regeneration logic by breaking it down into three distinct phases: planning the regeneration steps, fetching the necessary data (snapshots and differentials), and applying these artifacts to construct the final state. This change improves code organization and readability by replacing a single large operation file with several smaller, focused modules and a new orchestration layer. Highlights
Accompanying these functional changes, the pull request also includes:
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This is a great refactoring that improves the structure and clarity of the state regeneration logic by splitting it into plan, fetch, and apply stages. I've found a critical issue and a few areas for improvement regarding performance and code clarity. Please see my comments below.
ctx.logger?.verbose("Replaying blocks", { | ||
fromSlot: plan.blockReplay.fromSlot, | ||
tillSlot: plan.blockReplay.tillSlot, | ||
}); | ||
|
||
const replayed = await replayBlocks(ctx, { | ||
stateBytes, | ||
fromSlot: plan.blockReplay.fromSlot, | ||
toSlot: plan.blockReplay.tillSlot, | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a critical bug here in determining the fromSlot
for block replay. plan.blockReplay.fromSlot
is calculated based on the assumption that all planned diffs are available. However, if some diffs are missing (as warned on line 37), stateWithDiffApplied
will be at an earlier slot. Starting the block replay from the planned fromSlot
will cause some blocks to be skipped, leading to an incorrect final state.
The block replay must start from the slot of the state after applying all available diffs, which is stateWithDiffApplied.slot
.
const fromSlot = stateWithDiffApplied.slot;
const toSlot = plan.blockReplay.tillSlot;
ctx.logger?.verbose("Replaying blocks", {
fromSlot,
tillSlot: toSlot,
});
const replayed = await replayBlocks(ctx, {
stateBytes,
fromSlot,
toSlot,
});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a check orderedDiffs.at(-1)?.slot !== plan.blockReplay.fromSlot - 1
that will make sure the last diff we have is adjacent to block replay.
const orderedDiffs = []; | ||
for (const diffSlot of plan.diffSlots) { | ||
const diff = artifacts.diffs.find((d) => d.slot === diffSlot); | ||
if (diff) { | ||
orderedDiffs.push(diff); | ||
} | ||
} | ||
|
||
if (orderedDiffs.length + artifacts.missingDiffs.length !== plan.diffSlots.length) { | ||
throw new Error(`Can not find required state diffs ${plan.diffSlots.join(",")}`); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block of code for ordering diffs and checking for missing ones can be made more efficient and concise.
- Inefficient Loop: The current implementation uses a
for
loop withartifacts.diffs.find()
inside, which results in a time complexity of roughly O(M*N), where M isplan.diffSlots.length
and N isartifacts.diffs.length
. This can be optimized to O(M+N) by using aMap
for faster lookups. - Redundant Check: The
if
condition on line 52 is redundant. ThefetchStateRegenArtifacts
function ensures thatartifacts.diffs.length + artifacts.missingDiffs.length
is always equal toplan.diffSlots.length
. SinceorderedDiffs.length
will equalartifacts.diffs.length
, this condition will never be met and can be removed.
Here is a suggested refactoring that addresses both points:
const diffsMap = new Map(artifacts.diffs.map((d) => [d.slot, d]));
const orderedDiffs = plan.diffSlots.flatMap((slot) => {
const diff = diffsMap.get(slot);
return diff ? [diff] : [];
});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We prefer to use for
over map
.
for (const edge of plan.diffSlots) { | ||
const diff = await getStateDifferential({db}, {slot: edge}); | ||
diff ? diffs.push(diff) : missingDiffs.push(edge); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation fetches state differentials from the database sequentially within a for...of
loop. This can be inefficient if there are multiple diffs to retrieve. To improve performance, you can fetch them in parallel using Promise.all
.
for (const edge of plan.diffSlots) { | |
const diff = await getStateDifferential({db}, {slot: edge}); | |
diff ? diffs.push(diff) : missingDiffs.push(edge); | |
} | |
await Promise.all( | |
plan.diffSlots.map(async (edge) => { | |
const diff = await getStateDifferential({db}, {slot: edge}); | |
diff ? diffs.push(diff) : missingDiffs.push(edge); | |
}) | |
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We prefer to use for
over map and there will be maximum of 4 diffs in worst case anyway so not a big problem.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## feature/differential-archive #8509 +/- ##
===============================================================
Coverage ? 52.29%
===============================================================
Files ? 859
Lines ? 65291
Branches ? 4804
===============================================================
Hits ? 34142
Misses ? 31080
Partials ? 69 🚀 New features to boost your workflow:
|
Motivation
Use smaller and pure functions for complex logic.
Description
Steps to test or reproduce
Run all tests