Consolidated destructive tablet fate code into a new class #3853
base: main
I like this concept of sharing code, however I am not sure how it will work for delete and merge because of the following differences.
- Split always starts with a single exact tablet. If that tablet does not exist or has an opid when split runs, then it just returns without error and does not wait.
- Delete and merge will need to set the opid on one or more tablets within a range, but will not know ahead of time which tablets are in the range. They will also need to wait if there is an opid for another operation, instead of just returning without error. Currently, because of tablet locks, a merge could only see concurrent splits, and it could just wait for them to finish.
- Merge may need to release opids while it is trying to set an opid on each tablet if it sees an overlapping opid for a delete or another merge, to avoid deadlock. This is only needed if we drop table locks, though.
- Delete table will not just delete opids at the end of the operation; it will delete entire tablets.
I think there may be an opportunity to share code, but we will need to see what the other operations need first. I was about to start working on making merge use conditional mutations.
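To make the first two differences concrete, here is a minimal in-memory sketch, not code from this PR. It assumes tablets are keyed by end row in a sorted map whose value is the current opid (or null), and that a range lines up with tablet boundaries. The names `ReserveSketch`, `Outcome`, `reserveForSplit`, and `reserveForRange` are hypothetical, and the real operations would use conditional mutations against the metadata table rather than a local map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical illustration only; none of these names come from Accumulo.
final class ReserveSketch {
  enum Outcome { RESERVED, SKIP, RETRY_LATER }

  // Split: one exact tablet. A missing tablet or an existing opid means
  // "return without error and do not wait".
  static Outcome reserveForSplit(Map<String, String> opIdByEndRow, String endRow, String opId) {
    if (!opIdByEndRow.containsKey(endRow) || opIdByEndRow.get(endRow) != null) {
      return Outcome.SKIP;
    }
    opIdByEndRow.put(endRow, opId);
    return Outcome.RESERVED;
  }

  // Delete/merge: the tablets in the range are only known after looking.
  // If any tablet carries another operation's opid, set nothing and ask the
  // caller to wait and retry, so no partial reservation is left behind.
  static Outcome reserveForRange(NavigableMap<String, String> opIdByEndRow,
      String startExclusive, String endInclusive, String opId) {
    NavigableMap<String, String> range =
        opIdByEndRow.subMap(startExclusive, false, endInclusive, true);
    for (String existing : range.values()) {
      if (existing != null && !existing.equals(opId)) {
        return Outcome.RETRY_LATER;
      }
    }
    // Copy the keys first so the writes do not happen during iteration of the view.
    List<String> endRows = new ArrayList<>(range.keySet());
    for (String endRow : endRows) {
      opIdByEndRow.put(endRow, opId);
    }
    return Outcome.RESERVED;
  }
}
```

The sketch is single-threaded, so it sidesteps the deadlock concern in the third bullet. With per-tablet conditional mutations the check and the set cannot be done for the whole range at once, which is where releasing already-set opids would come in.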
return null;
if (!canContinue(tid, manager)) {
  throw new IllegalStateException(
      "Tablet is in an unexpected condition: " + splitInfo.getOriginal());
This changes the behavior: it throws exceptions under different conditions.
+ (newTabletMetadata == null ? null : newTabletMetadata.getOperationId()));
}
}
if (!canContinue(tid, manager)) {
I don't think this method should be called here. The split code creates new tablets and eventually changes the original tablet's prevrow, which could make it look like the original does not exist. In process failure conditions where this code might be rerunning, it checks these conditions to see where it is at and what is left to be done. However, it is expected that the code will see either the old tablet or the new tablet with an operation id set; if it sees neither, that is an indication of a bug. So I think it should always continue on to check whether the new tablet exists when the old one does not.
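The resume check described above could be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: tablet metadata is modeled as a plain map from extent to opid, and `SplitResumeSketch`, `Progress`, and `locate` are hypothetical names.

```java
import java.util.Map;

// Hypothetical illustration only; not the Accumulo split implementation.
final class SplitResumeSketch {
  enum Progress { ORIGINAL_PRESENT, NEW_TABLET_PRESENT }

  // On a rerun, look for the original tablet first, then fall through to the
  // new tablet. Seeing neither with this operation's id indicates a bug.
  static Progress locate(Map<String, String> opIdByExtent, String originalExtent,
      String newExtent, String opId) {
    if (opId.equals(opIdByExtent.get(originalExtent))) {
      return Progress.ORIGINAL_PRESENT;
    }
    if (opId.equals(opIdByExtent.get(newExtent))) {
      return Progress.NEW_TABLET_PRESENT;
    }
    throw new IllegalStateException("Neither " + originalExtent + " nor " + newExtent
        + " has operation id " + opId);
  }
}
```

The point of the ordering is that a missing original tablet is not an error by itself; the exception is reserved for the case where the new tablet is missing as well.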
No description provided.