-
Notifications
You must be signed in to change notification settings - Fork 829
conflicts: add conflict labels to MergedTree and store in backends
#7850
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
f795177 to
f04b616
Compare
|
Hey @martinvonz, could you take a look and let me know your thoughts on this approach when you get a chance? |
6307b55 to
7398991
Compare
7398991 to
0d3d525
Compare
c2c5fd9 to
41f8405
Compare
41f8405 to
17ed34e
Compare
scott2000
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martinvonz I believe I addressed everything we had talked about before, so could you take another look when you get a chance? I also added comments about some specific things I'm unsure about, and I'd appreciate any feedback you have.
| /// them more efficient to clone. | ||
| #[derive(ContentHash, PartialEq, Eq, Clone)] | ||
| pub struct ConflictLabels { | ||
| labels: Option<Arc<Merge<String>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if the Arc is necessary. This type does get cloned fairly often, but it might be premature optimization, and we already clone Merge<TreeId> in many places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like it's only cloned when MergedTree is cloned, which is not often, so I would drop the Arc.
| /// resolved or if any label is empty, the labels will be discarded, since | ||
| /// resolved merges cannot have labels, and labels cannot be empty. | ||
| pub fn new(labels: Merge<String>) -> Self { | ||
| if labels.is_resolved() || labels.iter().any(|label| label.is_empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's good to make labels all-or-nothing, in that if any side is missing a label, then we discard all of the labels and fall back to the old side #1 and side #2 labels, but I'd like to hear other people's thoughts on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we could use empty strings when there's no label (and render empty strings specially in conflict markers etc). It seems slightly useful to have labels for some terms even if they're missing for other terms. Seems like that should be simpler too.
| /// Returns both the underlying tree IDs and any conflict labels. This can | ||
| /// be used to check whether there are changes in files to be materialized | ||
| /// in the working copy. | ||
| pub fn tree_ids_and_labels(&self) -> (&Merge<TreeId>, &ConflictLabels) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this for comparisons that need to check whether conflict labels or trees changed (e.g. for materializing in the working copy). I think it matches MergedTree::into_tree_ids_and_labels, but I'm not sure if it's strange to have a getter like this.
| #[serde(skip)] // TODO: should be exposed? | ||
| pub root_tree: Merge<TreeId>, | ||
| #[serde(skip)] | ||
| pub conflict_labels: ConflictLabels, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Storing ConflictLabels here makes the code simple, but I'm not sure if it's too high-level of a type. Would it be better to use Option<Merge<String>> or Vec<String> instead? These types don't enforce the requirements of ConflictLabels though (e.g. that resolved trees can't have labels).
We also can't do root_tree: MergedTree, since MergedTree contains an Arc<Store>, and I believe this data is supposed to be independent of the store probably.
d77683d to
aa55930
Compare
| /// them more efficient to clone. | ||
| #[derive(ContentHash, PartialEq, Eq, Clone)] | ||
| pub struct ConflictLabels { | ||
| labels: Option<Arc<Merge<String>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like it's only cloned when MergedTree is cloned, which is not often, so I would drop the Arc.
| /// resolved or if any label is empty, the labels will be discarded, since | ||
| /// resolved merges cannot have labels, and labels cannot be empty. | ||
| pub fn new(labels: Merge<String>) -> Self { | ||
| if labels.is_resolved() || labels.iter().any(|label| label.is_empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we could use empty strings when there's no label (and render empty strings specially in conflict markers etc). It seems slightly useful to have labels for some terms even if they're missing for other terms. Seems like that should be simpler too.
| impl From<Merge<String>> for ConflictLabels { | ||
| fn from(value: Merge<String>) -> Self { | ||
| Self::new(value) | ||
| } | ||
| } | ||
|
|
||
| impl From<Merge<&'_ str>> for ConflictLabels { | ||
| fn from(value: Merge<&str>) -> Self { | ||
| Self::new(value.map(|&label| label.to_owned())) | ||
| } | ||
| } | ||
|
|
||
| impl<T> From<Option<T>> for ConflictLabels | ||
| where | ||
| T: Into<Self>, | ||
| { | ||
| fn from(value: Option<T>) -> Self { | ||
| value.map_or_else(Self::unlabeled, T::into) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I often find named constructors clearer than .into(). Maybe that's just me, so feel free to ignore.
Adding this as a separate type will help maintain the invariants that resolved merges cannot have labels, and labels cannot be the empty string. I also added `Arc` to make cloning more efficient, since the conflict labels will be cloned whenever `Commit::tree` is called. I think that storing separate conflict labels for each term of the conflict is the best approach for a couple reasons. Mainly, I think it integrates well with the existing conflict algebra. For instance, a diff of (A - B) and a diff of (B - C) can be easily combined to create a new diff of (A - C), and if we associate a label with each term, then the labels will also naturally be carried over as well. Also, I think it would be simpler to implement than other approaches (such as storing labels for diffs instead of terms), since conflict labels can re-use existing logic from `Merge<T>`. For simplicity, I also think we shouldn't allow mixing labeled terms and unlabeled terms (i.e. if any term doesn't have a label, then we discard all labels and leave the entire merge unlabeled). I think it could be confusing to have conflicts where, for instance, one side says "rebase destination" and another side only says "side #2" with no further information. In cases like these, I think it's better to just fall back to the old labels. In the future, I expect that most conflicts should have labels (since we should eventually be adding labels everywhere conflicts can happen).
To implement simplification of conflict labels, I decided to add more functions such as `zip` and `unzip` to `Merge`. I think these functions could be useful in other situations so I thought this was a nice solution, but an alternative solution could be to make `get_simplified_mapping` and `apply_simplified_mapping` public and manually apply the same mapping to both merges.
The old method is renamed to `MergedTree::merge_unlabeled` to make it easy to find unmigrated callers. The goal is that almost all callers will eventually use `MergedTree::merge` to add labels, unless the resulting tree is never visible to the user.
Conflict labels are stored in a separate header for backwards compatibility.
aa55930 to
e1dea15
Compare
#1176
I split this PR off of #7692 because I think these changes could be reviewed separately to figure out the storage of conflict labels before we go into the specifics of adding the labels for each type of conflict and rendering them in conflict markers. I added some justification for this approach in the description of the first commit, but I'd be open to a different approach if reviewers think it would work better.
Checklist
If applicable:
CHANGELOG.mdREADME.md,docs/,demos/)cli/src/config-schema.json)