-
Notifications
You must be signed in to change notification settings - Fork 11
fix: 🚑 make it so is_in_bucket is consistent across same file key records
#598
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Starting 👀 |
| -- If ANY file record with a given file_key has is_in_bucket=true, then ALL | ||
| -- records for that file_key should have is_in_bucket=true. This is because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the only way this can happen is if the SR is already accepted by the MSP and the user send other one to increase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as that's the only way to have more than one file record for the same file key.
| let block_hash_bytes = block_hash.as_bytes().to_vec(); | ||
| let tx_hash_bytes = evm_tx_hash.map(|h| h.as_bytes().to_vec()); | ||
|
|
||
| // Check if this file key is already present in the bucket of the MSP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So when the user ask for more redundancy the MSP needs to "re-accept" it, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, otherwise the new storage request will expire without an MSP response, so it will be considered rejected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
For tracking reasons, it would be cool to be bit more specific with the current bug.
e.g: exactly what was happening with the files with the flag set on false, and why these where triggering a failure in the block import and thus the stall.
Not for this PR, but it would be high priority that we add a test that replicates this specific case, that way we can't re introduce the bug later.
@santikaplan
It's a bit convoluted and hard to explain in detail so it wouldn't add much context, but simplifying it a bit the issue we faced was: |
This PR fixes a bug where the
is_in_bucketfield of files in the indexer DB wasn't being populated correctly when creating new file records for a file that was already previously stored by the MSP.We allow new storage requests for file keys that have already been fulfilled to allow the user to add redundancy to them (in the form of more BSPs), but since the MSP already has the file from the previous storage request, it can accept the new one with an inclusion proof instead of a non-inclusion proof. This makes it so the runtime doesn't update its bucket root (since it shouldn't, as the bucket already had the file from before), which means the
MutationsAppliedevent is never emitted, and this is the event that the indexer detects to update theis_in_bucketfield of a file record.This made it so the new file records in the indexer DB created by these subsequent storage requests permanently had their
is_in_bucketfield set tofalse, and so this could create inconsistencies which could lead to the indexer trying to delete a file record that still has an MSP association, failing to import the block and stalling.The fix is twofold:
is_in_bucketset tofalseand a sibling record (i.e. another file record for the same file key) that has itsis_in_bucketset to true (I believe it would have been enough to check the oldest sibling record, but just in case we check all of them), and setting theis_in_bucketfield for those files totrue.is_in_bucketfield of the new record to false, it checks if any sibling record has itsis_in_bucketfield set totrueand if so, creates the new record with the field set totrueas well.Note
As a comment, I left a function to check MSP associations to file records in the indexer. This is not currently used, but I believe we should add that check before every deletion once we are up and running in an environment where the indexer stalling would end up being critical and we rather keep it working while we work on the fix of whatever caused the inconsistency. Fixing inconsistent information could prove hard though, so we might end up deciding not to use this function at all and maintain the indexer's current behaviour.
Short description
There's a new migration for the indexer DB that must be executed for all existing DBs.
Who is affected
Suggested code changes
No code changes required.