-
Notifications
You must be signed in to change notification settings - Fork 129
Provide a way to remove old commits #681
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Motivation: Central Dogma uses jGit to store data. Due to the nature of Git that stores unlimited history, Central Dogma will eventually get in trouble managing disk usage. We can handle this by maintaing the primary and secondary Git repository internally. This works in this way: 1 Commits are pushed to the primary Git repository. 2 If the number of commits exceed the threshold (`minRetentionCommits`), then the secondary Git repository is created. 3 Commits are pushed to the both primary and secondary Git repositories. 4 If the secondary Git repository has the number of commits more than the threshold; - The secondary Git repository is promoted to the primary Git repository. - The primary Git repository is removed completely. - Another secondary Git repository is created. 5 Back to 3. Modifications: - TBD Result: - Close 575 - TBD Todo: - Provide a way to set `minRetentionCommits` and `minRetentionDay` for each repository. - Support mirroring from CD to external Git.
…itory by the leader
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #681 +/- ##
============================================
- Coverage 65.69% 63.53% -2.17%
- Complexity 3351 3473 +122
============================================
Files 358 366 +8
Lines 13893 15093 +1200
Branches 1496 1687 +191
============================================
+ Hits 9127 9589 +462
- Misses 3916 4591 +675
- Partials 850 913 +63 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This somewhat sounds like a surprising behavior to me. This means a diff command that once worked as expected starts to return completely different result once rolling occurs. Can we instead treat any revision between |
|
This issue seems to be a necessary feature to keep the cluster running reliably. |
|
@aidanbae Sorry for making you wait for this. I'm working on it now. Please stay tuned. |
That makes sense. Let me apply that approach. |
Motivation: Before we apply line#681 that removes old commits, it would be nice if we can find any usages that access data with the INIT Revision or revisions that are more than 5000 (default minimum number of commit retentions) commits ahead from the head. Modifications: - Add a counter that is incresed when INIT Revision or revisions that are more than 5000 ahead from the head are used. Result: - Track the number of usages. - This commit will be reverted after we find the usage and fix it.
Had a chat with the team, and we decided to raise an exception instead of treating it as an empty state. This PR is ready, so PTAL. 😉 |
Motivation: Before we apply #681 that removes old commits, it would be nice if we could find any usages that access data with the INIT Revision or revisions that are more than 5000 (default minimum number of commit retentions) commits ahead from the head. Modifications: - Add a counter that is increased when INIT Revision or revisions that are more than 5000 ahead from the head are used. Result: - Track the number of usages. - This commit will be reverted after we find the usage and fix it.
jrhee17
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall! Left some minor questions 🙇
| requireNonNull(initialRevision, "initialRevision"); | ||
| checkArgument(minRetentionCommits > 0, "minRetentionCommits: %s (expected: > 0)", | ||
| minRetentionCommits); | ||
| checkArgument(minRetentionDays > 0, "minRetentionDays: %s (expected: > 0)", minRetentionDays); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question) isn't it possible for minRetentionDays to be 0?
| checkArgument(minRetentionDays > 0, "minRetentionDays: %s (expected: > 0)", minRetentionDays); | |
| checkArgument(minRetentionDays >= 0, "minRetentionDays: %s (expected: >= 0)", minRetentionDays); |
| this.parent = requireNonNull(parent, "parent"); | ||
| originalRepoName = requireNonNull(repositoryDir, "repositoryDir").getName(); | ||
| this.repositoryWorker = requireNonNull(repositoryWorker, "repositoryWorker"); | ||
| requireNonNull(author, "author"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And we can remove the assignment at L171
| requireNonNull(author, "author"); | |
| this.author = requireNonNull(author, "author"); |
| } | ||
| } | ||
|
|
||
| private int applyChanges(@Nullable Revision baseRevision, @Nullable ObjectId baseTreeId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would honestly be very helpful if these kind of methods were deduped instead of copy-pasted so that we don't have to review them again 😅
Anyways, let me just assume that these methods were copy-pasted since I don't think it's realistic to invest more time into looking in detail through this file
| if (!primaryRepoDir.mkdirs()) { | ||
| throw new StorageException("failed to create " + primaryRepoDir + " while migrating to V2."); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be safer to do this before L249 so that we run less of a risk. (If this step fails, then repoDir doesn't exist and only tmpRepoDir exists at this point)
| continue; | ||
| } | ||
|
|
||
| final boolean fetchContent = FindOption.FETCH_CONTENT.get(options); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can compute this once at L441
| this.headRevision = res.revision; | ||
| final InternalRepository secondaryRepo = this.secondaryRepo; | ||
| if (secondaryRepo != null) { | ||
| assert headRevision.equals(secondaryRepo.headRevision()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question) Can this be guaranteed?
For instance, if a CentralDogma server is shut down forcefully due to a jvm crash before the secondaryRepo.commit is invoked, will the server be recoverable? or do we need to manually delete the secondaryRepo folder?
This scenario could also be possible if secondaryRepo == null while a commit is occurring. (due to secondary repo rolling)
Rather, would a mechanism which catches up to the primary repo be a better idea?
| assert headRevision.equals(secondaryRepo.headRevision()); | ||
|
|
||
| // Push the same commit to the secondary repo. | ||
| secondaryRepo.commit(headRevision, headRevision.forward(1), commitTimeMillis, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a StorageException is thrown for whatever reason (e.g. insufficient disk), should the overall call to the commit also fail?
| secondaryRepo.commit(headRevision, headRevision.forward(1), commitTimeMillis, | |
| try { | |
| secondaryRepo.commit(headRevision, headRevision.forward(1), commitTimeMillis, | |
| } (catch ...) {} |
| primarySuffix = newPrimarySuffix; | ||
| writeSuffix(newPrimarySuffix); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should update the suffix in case writeSuffix fails (primaryRepoDir will return the wrong value)
| primarySuffix = newPrimarySuffix; | |
| writeSuffix(newPrimarySuffix); | |
| writeSuffix(newPrimarySuffix); | |
| primarySuffix = newPrimarySuffix; |
| } | ||
|
|
||
| @Override | ||
| public void createRollingRepository(Revision rollingRepositoryInitialRevision, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question) I'm assuming that we don't have a way to synchronize two CreateRollingRepositoryCommand being issued on a single repository simultaneously. (and judging from the current implementation I don't think it should be allowed)
Do you think this scenario is impossible? or do we introduce a separate lock for this command?
Motivation:
Central Dogma uses jGit to store data. Due to the nature of Git that stores unlimited history,
Central Dogma will eventually get in trouble managing disk usage.
We can handle this by maintaining the primary and secondary Git repositories internally.
This works in this way:
minRetentionCommits), then the secondary Git repository is created.Modifications:
CreateRollingRepositoryCommandthat creates the rolling repository by theCommitRetentionManagementPlugin.GitRepositoryV2that manages the rolling jGit repositories.foo_0000000000,foo_0000000001,foo_0000000002and so onRepositoryMetadataDatabasehas the suffix in its file database.GitRepositoryis not removed for the migration test.InternalRepositorythat has jGit repository andCommitIdDatabase.diff,watch,history, etc.) is lower than thefirstRevisionof the current primary repo?Revision.INIT(1)is used, thefirstRevisionis used instead.diff(INIT, new Revision(100) ...)is equals todiff(new Revision(firstRevisionNumber), new Revision(100) ...)RevisionbetweenRevision.INIT(1)and thefirstRevisionis used, aRevisionNotFoundExceptionis raised.watchandfindLatestRevision.Result:
minRetentionCommitswhile keeping the commits made in the recentminRetentionDay.Todo:
minRetentionCommitsandminRetentionDayfor each repository.