@scott2000
Contributor

#1176

I split this PR off of #7692 because I think these changes could be reviewed separately to figure out the storage of conflict labels before we go into the specifics of adding the labels for each type of conflict and rendering them in conflict markers. I added some justification for this approach in the description of the first commit, but I'd be open to a different approach if reviewers think it would work better.

Checklist

If applicable:

  • I have updated CHANGELOG.md
  • I have updated the documentation (README.md, docs/, demos/)
  • I have updated the config schema (cli/src/config-schema.json)
  • I have added/updated tests to cover my changes

@scott2000 scott2000 requested a review from a team as a code owner October 26, 2025 02:34
@scott2000 scott2000 force-pushed the scott2000/conflict-labels-storage branch 3 times, most recently from f795177 to f04b616 Compare October 31, 2025 23:27
@scott2000
Contributor Author

Hey @martinvonz, could you take a look and let me know your thoughts on this approach when you get a chance?

@scott2000 scott2000 force-pushed the scott2000/conflict-labels-storage branch 2 times, most recently from 6307b55 to 7398991 Compare November 2, 2025 03:12
@scott2000 scott2000 force-pushed the scott2000/conflict-labels-storage branch from 7398991 to 0d3d525 Compare November 2, 2025 14:26
@scott2000 scott2000 force-pushed the scott2000/conflict-labels-storage branch 5 times, most recently from c2c5fd9 to 41f8405 Compare November 10, 2025 23:06
@scott2000 scott2000 force-pushed the scott2000/conflict-labels-storage branch from 41f8405 to 17ed34e Compare November 17, 2025 23:11
Contributor Author

@scott2000 scott2000 left a comment

@martinvonz I believe I addressed everything we had talked about before, so could you take another look when you get a chance? I also added comments about some specific things I'm unsure about, and I'd appreciate any feedback you have.

/// them more efficient to clone.
#[derive(ContentHash, PartialEq, Eq, Clone)]
pub struct ConflictLabels {
    labels: Option<Arc<Merge<String>>>,
Contributor Author

I'm not sure if the Arc is necessary. This type does get cloned fairly often, but it might be premature optimization, and we already clone Merge<TreeId> in many places.

Member

It looks like it's only cloned when MergedTree is cloned, which is not often, so I would drop the Arc.

/// resolved or if any label is empty, the labels will be discarded, since
/// resolved merges cannot have labels, and labels cannot be empty.
pub fn new(labels: Merge<String>) -> Self {
    if labels.is_resolved() || labels.iter().any(|label| label.is_empty()) {
Contributor Author

I think it's good to make labels all-or-nothing, in that if any side is missing a label, then we discard all of the labels and fall back to the old side #1 and side #2 labels, but I'd like to hear other people's thoughts on this.

Member

Or we could use empty strings when there's no label (and render empty strings specially in conflict markers etc). It seems slightly useful to have labels for some terms even if they're missing for other terms. Seems like that should be simpler too.
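
To make the two options in this thread concrete, here is a minimal sketch. It is not the PR's code: `ToyMerge` is a hypothetical stand-in for `Merge<T>` (assumed here to be an odd-length list of terms), and `all_or_nothing` and `render_label` are illustrative names only.

```rust
/// Hypothetical stand-in for `Merge<T>`: an odd-length list of terms.
/// This is an assumption for illustration, not the real type.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ToyMerge<T>(pub Vec<T>);

impl<T> ToyMerge<T> {
    /// A merge with a single term has no conflict.
    pub fn is_resolved(&self) -> bool {
        self.0.len() == 1
    }
}

/// Option A (all-or-nothing): keep the labels only if the merge is
/// conflicted and every term has a non-empty label.
pub fn all_or_nothing(labels: ToyMerge<String>) -> Option<ToyMerge<String>> {
    if labels.is_resolved() || labels.0.iter().any(|label| label.is_empty()) {
        None
    } else {
        Some(labels)
    }
}

/// Option B (per-term): keep empty strings in storage and choose a
/// fallback for each missing label only when rendering.
pub fn render_label(labels: &ToyMerge<String>, index: usize, fallback: &str) -> String {
    match labels.0.get(index) {
        Some(label) if !label.is_empty() => label.clone(),
        _ => fallback.to_owned(),
    }
}
```

With labels `["rebase destination", "base", ""]`, option A discards everything, while option B renders the first term with its label and falls back to something like `side #2` for the last term. That mixed output is the case the commit message argues could be confusing.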

/// Returns both the underlying tree IDs and any conflict labels. This can
/// be used to check whether there are changes in files to be materialized
/// in the working copy.
pub fn tree_ids_and_labels(&self) -> (&Merge<TreeId>, &ConflictLabels) {
Contributor Author

I added this for comparisons that need to check whether conflict labels or trees changed (e.g. for materializing in the working copy). I think it matches MergedTree::into_tree_ids_and_labels, but I'm not sure if it's strange to have a getter like this.

#[serde(skip)] // TODO: should be exposed?
pub root_tree: Merge<TreeId>,
#[serde(skip)]
pub conflict_labels: ConflictLabels,
Contributor Author

Storing ConflictLabels here makes the code simple, but I'm not sure if it's too high-level a type. Would it be better to use Option<Merge<String>> or Vec<String> instead? These types don't enforce the requirements of ConflictLabels though (e.g. that resolved trees can't have labels).

We also can't do root_tree: MergedTree, since MergedTree contains an Arc<Store>, and I believe this data is supposed to be independent of the store.

@scott2000 scott2000 force-pushed the scott2000/conflict-labels-storage branch 3 times, most recently from d77683d to aa55930 Compare December 2, 2025 22:52
Comment on lines +100 to +135
impl From<Merge<String>> for ConflictLabels {
    fn from(value: Merge<String>) -> Self {
        Self::new(value)
    }
}

impl From<Merge<&'_ str>> for ConflictLabels {
    fn from(value: Merge<&str>) -> Self {
        Self::new(value.map(|&label| label.to_owned()))
    }
}

impl<T> From<Option<T>> for ConflictLabels
where
    T: Into<Self>,
{
    fn from(value: Option<T>) -> Self {
        value.map_or_else(Self::unlabeled, T::into)
    }
}
Member

I often find named constructors clearer than .into(). Maybe that's just me, so feel free to ignore.

Adding this as a separate type will help maintain the invariants that
resolved merges cannot have labels, and labels cannot be the empty
string. I also added `Arc` to make cloning more efficient, since the
conflict labels will be cloned whenever `Commit::tree` is called.

I think that storing separate conflict labels for each term of the
conflict is the best approach for a couple of reasons. Mainly, I think it
integrates well with the existing conflict algebra. For instance, a diff
of (A - B) and a diff of (B - C) can be easily combined to create a new
diff of (A - C), and if we associate a label with each term, then the
labels will naturally be carried over as well. Also, I think it
would be simpler to implement than other approaches (such as storing
labels for diffs instead of terms), since conflict labels can re-use
existing logic from `Merge<T>`.
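
The (A - B) + (B - C) = (A - C) example can be sketched roughly as follows. This is a toy model, not the real `Merge<T>`: `Term`, `Diff`, and `combine` are hypothetical names, and cancelling terms by tree alone (ignoring their labels) is an assumption made for illustration.

```rust
/// A labeled term: a hypothetical tree identifier plus its conflict label.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Term {
    pub tree: &'static str,
    pub label: &'static str,
}

/// Toy diff: the terms being added and the terms being removed.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Diff {
    pub adds: Vec<Term>,
    pub removes: Vec<Term>,
}

/// Combines two diffs and cancels each removed tree against one matching
/// added tree. Surviving terms keep their labels.
pub fn combine(left: Diff, right: Diff) -> Diff {
    let mut adds: Vec<Term> = left.adds.into_iter().chain(right.adds).collect();
    let mut removes = Vec::new();
    for removed in left.removes.into_iter().chain(right.removes) {
        match adds.iter().position(|added| added.tree == removed.tree) {
            // Assumption: terms cancel when their trees match, whatever
            // their labels say.
            Some(index) => {
                adds.remove(index);
            }
            None => removes.push(removed),
        }
    }
    Diff { adds, removes }
}
```

Combining a diff that adds A and removes B with a diff that adds B and removes C leaves A as the only add and C as the only remove, each still carrying the label it started with.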

For simplicity, I also think we shouldn't allow mixing labeled terms and
unlabeled terms (i.e. if any term doesn't have a label, then we discard
all labels and leave the entire merge unlabeled). I think it could be
confusing to have conflicts where, for instance, one side says "rebase
destination" and another side only says "side #2" with no further
information. In cases like these, I think it's better to just fall back
to the old labels. In the future, I expect that most conflicts should
have labels (since we should eventually be adding labels everywhere
conflicts can happen).

To implement simplification of conflict labels, I decided to add more
functions such as `zip` and `unzip` to `Merge`. I think these functions
could be useful in other situations so I thought this was a nice
solution, but an alternative solution could be to make
`get_simplified_mapping` and `apply_simplified_mapping` public and
manually apply the same mapping to both merges.
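
As a rough illustration of the `zip`/`unzip` idea, here is a sketch on a toy type. `ToyMerge` is a hypothetical stand-in, and the signatures below are assumptions rather than the ones added to `Merge` in this PR.

```rust
/// Hypothetical stand-in for `Merge<T>`: just an ordered list of terms.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ToyMerge<T>(pub Vec<T>);

impl<T> ToyMerge<T> {
    /// Pairs up terms position by position. Panics if the two merges do
    /// not have the same number of terms.
    pub fn zip<U>(self, other: ToyMerge<U>) -> ToyMerge<(T, U)> {
        assert_eq!(
            self.0.len(),
            other.0.len(),
            "merges must have the same number of terms"
        );
        ToyMerge(self.0.into_iter().zip(other.0).collect())
    }
}

impl<T, U> ToyMerge<(T, U)> {
    /// Splits a merge of pairs back into two merges with the same shape.
    pub fn unzip(self) -> (ToyMerge<T>, ToyMerge<U>) {
        let (left, right): (Vec<T>, Vec<U>) = self.0.into_iter().unzip();
        (ToyMerge(left), ToyMerge(right))
    }
}
```

The point of zipping is that any operation that reorders or drops terms can run once on the merge of (tree, label) pairs, so trees and labels cannot drift out of step; `unzip` then recovers the two separate merges.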

The old method is renamed to `MergedTree::merge_unlabeled` to make it
easy to find unmigrated callers. The goal is that almost all callers
will eventually use `MergedTree::merge` to add labels, unless the
resulting tree is never visible to the user.

Conflict labels are stored in a separate header for backwards
compatibility.

@scott2000 scott2000 force-pushed the scott2000/conflict-labels-storage branch from aa55930 to e1dea15 Compare December 5, 2025 03:55