Versioned bundle and Execution receipt #3623
…y code for previous fraud proof version
5e66e75 to
335e1d6
teor2345
left a comment
Looks good overall, but it's a large change. What is our testing and risk mitigation strategy?
There are a lot of new functions and types without documentation. Can you add a short description to the production ones? (Tests aren't as important, but it can still be useful to say what a test is meant to check.)
What does a version upgrade look like? Do we need documentation that says how to do it, and tests to make sure it works? (Or a documented manual test process.)
There are some duplicated code blocks and types we could fix up, to avoid confusion.
Same as usual.
Both should cover all the cases
No, it's purely defined by the runtime, and the client uses whatever the runtime expects. The only thing we should be aware of is that the node release should go in first, followed by the runtime upgrade for Taurus. For mainnet, since there are no domains, we can release a client as soon as Taurus is tested.
These things seem important to document somewhere, because we'll be doing another upgrade on mainnet with domains at some point in the future.
Not necessarily. Take a look at how XDM is versioned. This is the same as that. The runtime defines the versions that need to be used and the client uses those. As for the protocol specs, there are no changes to them, since nothing has changed; the types have only been made versioned.
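To make the "runtime defines the version, client follows" idea concrete, here is a minimal sketch. All names in it (`BundleVersion`, `BundleV0`, `VersionedBundle`, `build_bundle`) are hypothetical stand-ins for illustration, not the types from this PR, and the way the client obtains the version from the runtime is assumed.

```rust
/// Hypothetical version tag that the runtime reports as current.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BundleVersion {
    V0,
}

/// Hypothetical payload for the only version that exists so far.
#[derive(Debug, Clone, PartialEq, Eq)]
struct BundleV0 {
    extrinsics: Vec<Vec<u8>>,
}

/// Versioned wrapper: a future format becomes a new variant,
/// while the existing payload stays untouched.
#[derive(Debug, Clone, PartialEq, Eq)]
enum VersionedBundle {
    V0(BundleV0),
}

/// Client side: build whichever version the runtime says is current.
/// The client never picks a version on its own.
fn build_bundle(runtime_version: BundleVersion, extrinsics: Vec<Vec<u8>>) -> VersionedBundle {
    match runtime_version {
        BundleVersion::V0 => VersionedBundle::V0(BundleV0 { extrinsics }),
    }
}
```

With this shape, adding a `V1` variant makes the `match` non-exhaustive, so the compiler points at every client call site that needs updating.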
…emove unnecessary clone
335e1d6 to
77abccd
NingLin-P
left a comment
There are a lot of changes and a lot of duplicated code. I'm not confident I can find every issue (if any) by eye, so let's see how the tests go.
@NingLin-P I think there may be some confusion about the Bundle version and the ER version when it comes to storing them.
A runtime will only accept the Bundle version that is defined in the runtime's CurrentVersion.
Now, coming to the ER version, it changes a bit.
Now, coming to how we store the versions, we hook into the
I have added tests to explain this better. Let me know if this is clear, or we can do a sync call.
Note: conflicts will be fixed once we have an approval from the team; otherwise merge commits will pollute the actual commits from this PR.
86f298b to
fea69d7
…ck at which runtime upgrade took place
96df613 to
40c14ef
40c14ef to
5ed52a2
vedhavyas
left a comment
More specifically, when the ER version is upgraded from V0 to V1 in consensus block #N, the runtime will assume using V0 for block #N and using V1 starting from block #N+1. While on the client side, when querying the ER versions in block #N it will return V1, thus it will use V1 to construct ER derived from #N. This inconsistency will cause the bundle to be rejected and the domain chain to stop progressing.
Thanks for pointing it out, @NingLin-P. This was indeed missed on my end. It should be fixed in the new commits.
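The boundary described above (V0 still applies to block #N, V1 starts at #N+1) can be sketched as a lookup over a map of upgrade points. This is only an illustration under stated assumptions: `ErVersion` and `version_for_consensus_number` are hypothetical names, block numbers are plain `u32`, and the map is assumed to store, for each upgrade block, the version that was in force up to and including that block. The actual storage layout in the PR may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical ER version tag; the real type lives in the runtime.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ErVersion {
    V0,
    V1,
}

/// Returns the version to use for an ER derived from `er_derived_number`.
///
/// Assumed layout: `previous_versions` maps the consensus block at which an
/// upgrade was applied to the version in force up to and including that block.
fn version_for_consensus_number(
    er_derived_number: u32,
    previous_versions: &BTreeMap<u32, ErVersion>,
    current_version: ErVersion,
) -> ErVersion {
    previous_versions
        // First upgrade point at or after the block the ER is derived from.
        .range(er_derived_number..)
        .next()
        .map(|(_, version)| *version)
        // No later upgrade point: the current version applies.
        .unwrap_or(current_version)
}
```

If the runtime and the client both resolve the version through the same lookup, an ER derived from the upgrade block itself resolves to V0 on both sides, which avoids the mismatch described in the comment.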
I wonder if it is better to do ongoing maintenance of the current code. I think there are risks in merging within the next week (or two). And risks and costs in delaying other plans to fit it in.
@teor2345 The whole reason the compatibility code is removed is to actually push to devnet and subsequently to mainnet, since we don't have a Taurus beyond this point.
Another way to reduce the risk is splitting the PR into:
I do not agree with this. If a PR changes or introduces something, then its tests and everything affected by it should be part of the PR. We did this earlier and it did not go well, especially the multiple steps of opening a new PR and context switching to something else.
I would rather have reviewers take as much time as they need to understand the changes and the side effects they bring, if any, before merging the PR; otherwise it's a no-go from my end.
Hmm, I'm not sure if there's been a misunderstanding here. Why are tests needed for "refactors that do not change functionality at all"?
teor2345
left a comment
I have some non-blocking code style questions.
There will be some residual risk no matter how much we review this. Let's do some initial testing, and see if any bugs remain?
That will give us a better idea of the risk of this change, and we can decide what to do next when we know more.
pub(crate) fn set_previous_bundle_and_execution_receipt_version<SV, PV, BEV>(
    block_number: BlockNumberFor<T>,
    set_version: SV,
    previous_versions: PV,
    current_version: BEV,
) where
    SV: Fn(BTreeMap<BlockNumberFor<T>, BEV>),
    PV: Fn() -> BTreeMap<BlockNumberFor<T>, BEV>,
    BEV: PartialEq,
{
    let mut versions = previous_versions();
This is an unusual code style.
Normally we would just pass a BTreeMap<BlockNumberFor<T>, BEV> directly to the function, return a BTreeMap<BlockNumberFor<T>, BEV> from the function, and then use the return value to set the storage.
Since the get and set are called unconditionally, I'm not sure why we're passing Fns to this function.
Passing Fns will reduce the amount of inlining and optimisation the compiler can do, potentially reducing performance (or increasing code size).
explained why I did above
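For reference, the value-in, value-out style suggested in the review could look roughly like the sketch below. The function body in the PR is not shown in this thread, so the logic here is an assumption: it records the outgoing version at the upgrade block unless the latest entry already holds that version. The name `with_previous_version` and the `u32` block number are hypothetical.

```rust
use std::collections::BTreeMap;

/// Records `outgoing_version` as the version in force up to `block_number`,
/// unless the most recent entry already holds the same version.
/// (Assumed logic, for illustration only.)
fn with_previous_version<V: PartialEq>(
    mut versions: BTreeMap<u32, V>,
    block_number: u32,
    outgoing_version: V,
) -> BTreeMap<u32, V> {
    // Compare against the entry with the highest block number, if any.
    let already_recorded = versions
        .last_key_value()
        .is_some_and(|(_, latest)| *latest == outgoing_version);
    if !already_recorded {
        versions.insert(block_number, outgoing_version);
    }
    versions
}
```

The caller reads the map from storage, passes it in, and writes the returned map back. Because the function is generic over `V`, a test can call it with a mock version type and an in-memory map without passing closures.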
pub(crate) fn bundle_and_execution_receipt_version_for_consensus_number<PV, BEV>(
    er_derived_number: BlockNumberFor<T>,
    previous_versions: PV,
    current_version: BEV,
) -> Option<BEV>
where
    PV: Fn() -> BTreeMap<BlockNumberFor<T>, BEV>,
    BEV: Copy + Clone,
{
    let versions = previous_versions();
Similar feedback here, normally we would just pass BTreeMap<BlockNumberFor<T>, BEV> to this function directly.
Passing a Fn will reduce the amount of inlining and optimisation the compiler can do, potentially reducing performance (or increasing code size).
This is done specifically for testing. PTAL at how the tests exercise this logic with mock versions and mock storage while reusing the same logic to pick the correct versions.
I made these comments based on the test code.
What stops us calling the production code as:
    bundle_and_execution_receipt_version_for_consensus_number(
        er_derived_number,
        CurrentBundleAndExecutionReceiptVersion::get(),
        PreviousBundleAndExecutionReceiptVersions::get(),
    )
And the test code as:
    bundle_and_execution_receipt_version_for_consensus_number(
        er_derived_number,
        MockCurrentBundleAndExecutionReceiptVersion::get(),
        MockPreviousBundleAndExecutionReceiptVersions::get(),
    )
This is due to the change in type between production and testing for BundleAndExecutionReceiptVersion and in the inner types.
I think we might be talking about different things here, I'll open a PR to show what I mean.
teor2345
left a comment
It appears 3 benchmarks are broken now:
- pallet_messenger_from_domains_extension::from_domains_relay_message
- pallet_messenger_from_domains_extension::from_domains_relay_message_channel_open
- pallet_messenger_from_domains_extension::from_domains_relay_message_response
https://github.com/autonomys/subspace/actions/runs/16361523792/job/46230168298?pr=3623#step:6:5460
Nice! Very helpful. Thanks 🙏🏼
/// Returns AutoId genesis domain.
/// Note: Currently unused since dev or devnet uses EVM domain and not AutoId
#[allow(dead_code)]
If this code is always unused, you could mark it as:
- #[allow(dead_code)]
+ #[expect(dead_code)]
Then if we use it in future, the compiler will report the unfulfilled expectation (a warning by default, an error where warnings are denied), and we can remove that annotation.
Moved to PR #3646
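A small sketch of the `#[expect(dead_code)]` suggestion, assuming a toolchain of Rust 1.81 or newer (where `#[expect]` is stable). The function names are hypothetical placeholders, not the genesis helpers from this repository.

```rust
/// Stand-in for a helper that nothing calls yet.
/// `expect` suppresses the dead_code lint like `allow` does, but the compiler
/// reports an unfulfilled expectation once the function gains a caller.
#[expect(dead_code)]
fn autoid_genesis_placeholder() -> &'static str {
    "auto-id"
}

/// A helper that is in use, so it needs no lint annotation.
fn evm_genesis_placeholder() -> &'static str {
    "evm"
}
```

With plain `#[allow(dead_code)]`, the annotation would silently stay in place after the function starts being used.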
teor2345
left a comment
Looks good, thanks!
First of all, the changes may appear big, but most of them are cosmetic.
I did not change everything at once but rather introduced the necessary intermediate types so that I'm aware of which parts require direct changes, like the runtime, and which parts require actual compatibility code. These intermediate types are then removed in the following commits.
Notable changes
For the reviewers: you can either go commit by commit to see how the migration is applied or look at the overall changes. The overall changes are surprisingly easy to understand.
Please let me know, @NingLin-P, if there are changes I missed that may cause incompatibility with the current Taurus.
Cleanup
We can safely remove
Once the new Taurus is deployed. As for mainnet, assuming this PR lands before we instantiate domains, this should be safe to remove as well.
Closes: #2942