dlmarion

No description provided.

@dlmarion dlmarion requested a review from keith-turner October 16, 2023 21:45
@dlmarion dlmarion self-assigned this Oct 16, 2023

@keith-turner keith-turner left a comment

I like this concept of sharing code; however, I am not sure how it will work for delete and merge because of the following differences.

  • Split always starts with a single exact tablet. If that tablet does not exist or has an opid when split runs, then it just returns w/o error and does not wait.
  • Delete and merge will need to set the opid on one or more tablets within a range, but will not know ahead of time which tablets are in the range. They will also need to wait if there is an opid for another operation, instead of just returning w/o error. Currently, because of table locks, a merge can only see concurrent splits, and it can just wait for them to finish.
  • While it is trying to set an opid on each tablet, merge may need to release the opids it has already set if it sees an overlapping opid for a delete or another merge, to avoid deadlock. This is only needed if we drop table locks, though.
  • Delete table will not just delete opids at the end of the operation; it will delete entire tablets.
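
To make the first three differences concrete, here is a minimal in-memory sketch. Everything in it is hypothetical (the class `OpidReservationSketch`, its methods, and the end-row keyed map); it does not use Accumulo's metadata or conditional mutation APIs, and it simplifies the merge case by releasing on any conflicting opid rather than only on a delete or another merge.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical model: tablets keyed by end row, each holding at most one opid. */
public class OpidReservationSketch {

  public enum Outcome { RESERVED, SKIPPED, MUST_WAIT }

  /** Sentinel meaning "no operation id set" (an assumption of this sketch). */
  public static final String NONE = "";

  private final TreeMap<String, String> tablets = new TreeMap<>();

  public void addTablet(String endRow) { tablets.put(endRow, NONE); }

  public String opidOf(String endRow) { return tablets.get(endRow); }

  public void clearOpid(String endRow) { tablets.put(endRow, NONE); }

  /** Split style: one exact tablet; if it is missing or busy, return without waiting. */
  public Outcome reserveForSplit(String endRow, String opid) {
    String current = tablets.get(endRow);
    if (current == null || !current.equals(NONE)) {
      return Outcome.SKIPPED;
    }
    tablets.put(endRow, opid);
    return Outcome.RESERVED;
  }

  /**
   * Merge/delete style: the tablets in (startExclusive, endInclusive] are only
   * discovered by scanning. On a conflicting opid, release what this call set
   * and tell the caller to wait and retry.
   */
  public Outcome reserveRange(String startExclusive, String endInclusive, String opid) {
    List<String> rows =
        new ArrayList<>(tablets.subMap(startExclusive, false, endInclusive, true).keySet());
    List<String> newlySet = new ArrayList<>();
    for (String row : rows) {
      String current = tablets.get(row);
      if (current.equals(opid)) {
        continue; // already reserved by this operation, e.g. from an earlier run
      }
      if (!current.equals(NONE)) {
        for (String r : newlySet) {
          tablets.put(r, NONE); // release to avoid holding opids while blocked
        }
        return Outcome.MUST_WAIT;
      }
      tablets.put(row, opid);
      newlySet.add(row);
    }
    return Outcome.RESERVED;
  }
}
```

The point of the sketch is that the two entry points have different contracts: the split path never waits, while the range path has to report "wait" and undo partial progress, which is why a single shared helper may not fit both.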

I think there may be an opportunity to share code, but we will need to see what the other operations need first. I was about to start working on making merge use conditional mutations.

return null;
if (!canContinue(tid, manager)) {
throw new IllegalStateException(
"Tablet is in an unexpected condition: " + splitInfo.getOriginal());

This changes the behavior: it's throwing exceptions under different conditions.

+ (newTabletMetadata == null ? null : newTabletMetadata.getOperationId()));
}
}
if (!canContinue(tid, manager)) {

I don't think this method should be called here. The split code creates new tablets and eventually changes the original tablet's prevrow, which could make it look like the original does not exist. In process-failure conditions, where this code might be rerunning, it checks these conditions to see where it is and what is left to be done. However, the code is expected to see either the old tablet or the new tablet with an operation id set; if it sees neither, that indicates a bug. So I think it should always continue on to check whether the new tablet exists when the old one does not.
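
A rough sketch of the check described above, under stated assumptions: `SplitResumeSketch`, `Stage`, and `locate` are hypothetical names, and the two `Optional` arguments stand in for whatever the real code reads from tablet metadata. An empty `Optional` here covers both "tablet not found" and "tablet has no opid".

```java
import java.util.Optional;

/** Hypothetical helper; not part of the PR or of Accumulo. */
public class SplitResumeSketch {

  public enum Stage { ORIGINAL_PRESENT, NEW_TABLET_PRESENT }

  /**
   * Decide where a (possibly rerunning) split is.
   *
   * @param myOpid operation id of the running split
   * @param originalOpid opid seen on the original tablet, empty if none was seen
   * @param newTabletOpid opid seen on the post-split tablet, empty if none was seen
   */
  public static Stage locate(String myOpid, Optional<String> originalOpid,
      Optional<String> newTabletOpid) {
    if (originalOpid.filter(myOpid::equals).isPresent()) {
      return Stage.ORIGINAL_PRESENT; // prevrow not yet changed
    }
    // The original not being visible is not an error by itself: keep going and
    // look for the new tablet before concluding anything.
    if (newTabletOpid.filter(myOpid::equals).isPresent()) {
      return Stage.NEW_TABLET_PRESENT; // a later step already ran
    }
    // Neither tablet carries this operation's id: treated as a bug.
    throw new IllegalStateException(
        "Neither original nor new tablet has operation id " + myOpid);
  }
}
```

The exception is thrown only after both lookups fail, which keeps the "tablet is missing" case from short-circuiting a legitimate rerun.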

@ctubbsii ctubbsii added this to the 4.0.0 milestone Jul 12, 2024
@dlmarion dlmarion changed the base branch from elasticity to main August 26, 2024 12:17