[v2,08/11] merge-ort: add implementation of rename collisions

Message ID edd610321a053b431def5b06bb2983c7f4a84547.1607962900.git.gitgitgadget@gmail.com
State Superseded
Series merge-ort: add basic rename detection

Commit Message

Elijah Newren Dec. 14, 2020, 4:21 p.m. UTC
From: Elijah Newren <newren@gmail.com>

Implement rename/rename(2to1) and rename/add handling, i.e. a file is
renamed into a location where another file is added (with that other
file either being a plain add or itself coming from a rename).  Note
that rename collisions can also have a special case stacked on top: the
file being renamed on one side of history is deleted on the other
(yielding either a rename/add/delete conflict or perhaps a
rename/rename(2to1)/delete[/delete] conflict).

One thing to note here is that when there is a double rename, the code
in question only handles one of them at a time; a later iteration
through the loop will handle the other.  After they've both been
handled, process_entry()'s normal add/add code can handle the collision.

This code replaces the following from merge-recursive.c:

  * all the 2to1 code in process_renames()
  * the RENAME_TWO_FILES_TO_ONE case of process_entry()
  * handle_rename_rename_2to1()
  * handle_rename_add()

Also, there is some shared code from merge-recursive.c for multiple
different rename cases which we will no longer need for this case (or
other rename cases):

  * handle_file_collision()
  * setup_rename_conflict_info()

The consolidation of six separate codepaths into one is made possible
by a change in design: process_renames() tweaks the conflict_info
entries within opt->priv->paths such that process_entry() can then
handle all the non-rename conflict types (directory/file, modify/delete,
etc.) orthogonally.  This means we're much less likely to miss a special
implementation for some combination of conflict types (see
commits brought in by 66c62eaec6 ("Merge branch 'en/merge-tests'",
2020-11-18), especially commit ef52778708 ("merge tests: expect improved
directory/file conflict handling in ort", 2020-10-26) for more details).
That, together with letting worktree/index updating be handled
orthogonally in the merge_switch_to_result() function, dramatically
simplifies the code for various special rename cases.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 51 insertions(+), 3 deletions(-)
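
As a rough illustration of the design described in the commit message (this
is not code from the series): the toy C sketch below shows how handling one
rename at a time can leave the target path looking like an ordinary add/add
to a later pass that knows nothing about renames.  Every toy_* name, the
string-valued stages, and the trivial merge rule are assumptions made for
the sketch; the real code operates on conflict_info and version_info through
handle_content_merge().  Treating `3 - target_index` as the index of the
non-renaming side is also an assumption, since that computation is not part
of this hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types; the real conflict_info is much richer. */
struct toy_entry {
	const char *path;
	const char *stages[3]; /* 0 = base, 1 = side1, 2 = side2; NULL = absent */
};

struct toy_rename {
	const char *oldpath, *newpath;
	int target_index; /* side (1 or 2) on which the rename happened */
};

static struct toy_entry *toy_lookup(struct toy_entry *e, size_t n,
				    const char *path)
{
	for (size_t i = 0; i < n; i++)
		if (!strcmp(e[i].path, path))
			return &e[i];
	return NULL;
}

/* Trivial stand-in for a three-way content merge; returns 1 if clean. */
static int toy_merge(const char *base, const char *a, const char *b,
		     const char **result)
{
	if (!strcmp(a, base)) {
		*result = b;
		return 1;
	}
	if (!strcmp(b, base) || !strcmp(a, b)) {
		*result = a;
		return 1;
	}
	*result = a; /* placeholder; real code would emit conflict markers */
	return 0;
}

/*
 * Handle exactly one rename.  A second rename into the same newpath is
 * handled by a later call.  Returns 1 if clean, 0 on a content conflict,
 * and -1 if the toy model cannot handle the input (e.g. source deleted).
 */
static int toy_process_rename(struct toy_entry *e, size_t n,
			      const struct toy_rename *r)
{
	struct toy_entry *oldinfo = toy_lookup(e, n, r->oldpath);
	struct toy_entry *newinfo = toy_lookup(e, n, r->newpath);
	const char *merged;
	int other, clean;

	assert(r->target_index == 1 || r->target_index == 2);
	other = 3 - r->target_index; /* assumption: the non-renaming side */

	if (!oldinfo || !newinfo)
		return -1;
	if (!oldinfo->stages[0] || !oldinfo->stages[other] ||
	    !newinfo->stages[r->target_index])
		return -1;

	clean = toy_merge(oldinfo->stages[0], oldinfo->stages[other],
			  newinfo->stages[r->target_index], &merged);
	/* Store the merged result in the renaming side's slot of newpath. */
	newinfo->stages[r->target_index] = merged;
	return clean;
}

/* Generic pass: looks only at which stage slots are filled. */
static const char *toy_classify(const struct toy_entry *ci)
{
	if (!ci->stages[0] && ci->stages[1] && ci->stages[2])
		return strcmp(ci->stages[1], ci->stages[2]) ? "add/add" : "clean";
	return "other";
}
```

With entries for "a", "b" and "c", where side1 renamed a to c and side2
renamed b to c, calling toy_process_rename() once per rename and then
toy_classify() on "c" reports "add/add" whenever the two merged results
differ.  The classifier never needs a rename-specific case, which is the
point of the design.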

Comments

Derrick Stolee Dec. 15, 2020, 2:09 p.m. UTC | #1
On 12/14/2020 11:21 AM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> Implement rename/rename(2to1) and rename/add handling, i.e. a file is
> renamed into a location where another file is added (with that other
> file either being a plain add or itself coming from a rename).  Note
> that rename collisions can also have a special case stacked on top: the
> file being renamed on one side of history is deleted on the other
> (yielding either a rename/add/delete conflict or perhaps a
> rename/rename(2to1)/delete[/delete] conflict).
> 
> One thing to note here is that when there is a double rename, the code
> in question only handles one of them at a time; a later iteration
> through the loop will handle the other.  After they've both been
> handled, process_entry()'s normal add/add code can handle the collision.
> 
> This code replaces the following from merge-recursive.c:
> 
>   * all the 2to1 code in process_renames()
>   * the RENAME_TWO_FILES_TO_ONE case of process_entry()
>   * handle_rename_rename_2to1()
>   * handle_rename_add()
> 
> Also, there is some shared code from merge-recursive.c for multiple
> different rename cases which we will no longer need for this case (or
> other rename cases):
> 
>   * handle_file_collision()
>   * setup_rename_conflict_info()
> 
> The consolidation of six separate codepaths into one is made possible
> by a change in design: process_renames() tweaks the conflict_info
> entries within opt->priv->paths such that process_entry() can then
> handle all the non-rename conflict types (directory/file, modify/delete,
> etc.) orthogonally.  This means we're much less likely to miss a special
> implementation for some combination of conflict types (see
> commits brought in by 66c62eaec6 ("Merge branch 'en/merge-tests'",
> 2020-11-18), especially commit ef52778708 ("merge tests: expect improved
> directory/file conflict handling in ort", 2020-10-26) for more details).
> That, together with letting worktree/index updating be handled
> orthogonally in the merge_switch_to_result() function, dramatically
> simplifies the code for various special rename cases.

I'm really happy that you broke out the cases earlier, and describe
them so well in the message. It makes this hunk of code really easy
to understand:

> +			const char *pathnames[3];
> +			struct version_info merged;
> +
> +			struct conflict_info *base, *side1, *side2;
> +			unsigned clean;
> +
> +			pathnames[0] = oldpath;
> +			pathnames[other_source_index] = oldpath;
> +			pathnames[target_index] = newpath;
> +
> +			base = strmap_get(&opt->priv->paths, pathnames[0]);
> +			side1 = strmap_get(&opt->priv->paths, pathnames[1]);
> +			side2 = strmap_get(&opt->priv->paths, pathnames[2]);
> +
> +			VERIFY_CI(base);
> +			VERIFY_CI(side1);
> +			VERIFY_CI(side2);
> +
> +			clean = handle_content_merge(opt, pair->one->path,
> +						     &base->stages[0],
> +						     &side1->stages[1],
> +						     &side2->stages[2],
> +						     pathnames,
> +						     1 + 2*opt->priv->call_depth,

nit: " * "

> +						     &merged);
> +
> +			memcpy(&newinfo->stages[target_index], &merged,
> +			       sizeof(merged));
> +			if (!clean) {
> +				path_msg(opt, newpath, 0,
> +					 _("CONFLICT (rename involved in "
> +					   "collision): rename of %s -> %s has "
> +					   "content conflicts AND collides "
> +					   "with another path; this may result "
> +					   "in nested conflict markers."),
> +					 oldpath, newpath);

I was briefly taken aback by "AND collides with another path" wondering if
that wording helps users understand the type of conflict here. But I can't
think of anything better, so *shrug*.

> +			}
>  		} else if (collision && source_deleted) {
> -			/* rename/add/delete or rename/rename(2to1)/delete */
> -			die("Not yet implemented");
> +			/*
> +			 * rename/add/delete or rename/rename(2to1)/delete:
> +			 * since oldpath was deleted on the side that didn't
> +			 * do the rename, there's not much of a content merge
> +			 * we can do for the rename.  oldinfo->merged.is_null
> +			 * was already set, so we just leave things as-is so
> +			 * they look like an add/add conflict.
> +			 */
> +
> +			newinfo->path_conflict = 1;
> +			path_msg(opt, newpath, 0,
> +				 _("CONFLICT (rename/delete): %s renamed "
> +				   "to %s in %s, but deleted in %s."),
> +				 oldpath, newpath, rename_branch, delete_branch);

I think this branch is added in the wrong patch. My compiler is complaining
that 'rename_branch' and 'delete_branch' are not declared (yet).

Thanks,
-Stolee

Elijah Newren Dec. 15, 2020, 4:56 p.m. UTC | #2
On Tue, Dec 15, 2020 at 6:09 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 12/14/2020 11:21 AM, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@gmail.com>
> >
> > Implement rename/rename(2to1) and rename/add handling, i.e. a file is
> > renamed into a location where another file is added (with that other
> > file either being a plain add or itself coming from a rename).  Note
> > that rename collisions can also have a special case stacked on top: the
> > file being renamed on one side of history is deleted on the other
> > (yielding either a rename/add/delete conflict or perhaps a
> > rename/rename(2to1)/delete[/delete] conflict).
> >
> > One thing to note here is that when there is a double rename, the code
> > in question only handles one of them at a time; a later iteration
> > through the loop will handle the other.  After they've both been
> > handled, process_entry()'s normal add/add code can handle the collision.
> >
> > This code replaces the following from merge-recursive.c:
> >
> >   * all the 2to1 code in process_renames()
> >   * the RENAME_TWO_FILES_TO_ONE case of process_entry()
> >   * handle_rename_rename_2to1()
> >   * handle_rename_add()
> >
> > Also, there is some shared code from merge-recursive.c for multiple
> > different rename cases which we will no longer need for this case (or
> > other rename cases):
> >
> >   * handle_file_collision()
> >   * setup_rename_conflict_info()
> >
> > The consolidation of six separate codepaths into one is made possible
> > by a change in design: process_renames() tweaks the conflict_info
> > entries within opt->priv->paths such that process_entry() can then
> > handle all the non-rename conflict types (directory/file, modify/delete,
> > etc.) orthogonally.  This means we're much less likely to miss a special
> > implementation for some combination of conflict types (see
> > commits brought in by 66c62eaec6 ("Merge branch 'en/merge-tests'",
> > 2020-11-18), especially commit ef52778708 ("merge tests: expect improved
> > directory/file conflict handling in ort", 2020-10-26) for more details).
> > That, together with letting worktree/index updating be handled
> > orthogonally in the merge_switch_to_result() function, dramatically
> > simplifies the code for various special rename cases.
>
> I'm really happy that you broke out the cases earlier, and describe
> them so well in the message. It makes this hunk of code really easy
> to understand:
>
> > +                     const char *pathnames[3];
> > +                     struct version_info merged;
> > +
> > +                     struct conflict_info *base, *side1, *side2;
> > +                     unsigned clean;
> > +
> > +                     pathnames[0] = oldpath;
> > +                     pathnames[other_source_index] = oldpath;
> > +                     pathnames[target_index] = newpath;
> > +
> > +                     base = strmap_get(&opt->priv->paths, pathnames[0]);
> > +                     side1 = strmap_get(&opt->priv->paths, pathnames[1]);
> > +                     side2 = strmap_get(&opt->priv->paths, pathnames[2]);
> > +
> > +                     VERIFY_CI(base);
> > +                     VERIFY_CI(side1);
> > +                     VERIFY_CI(side2);
> > +
> > +                     clean = handle_content_merge(opt, pair->one->path,
> > +                                                  &base->stages[0],
> > +                                                  &side1->stages[1],
> > +                                                  &side2->stages[2],
> > +                                                  pathnames,
> > +                                                  1 + 2*opt->priv->call_depth,
>
> nit: " * "

Will fix.

> > +                                                  &merged);
> > +
> > +                     memcpy(&newinfo->stages[target_index], &merged,
> > +                            sizeof(merged));
> > +                     if (!clean) {
> > +                             path_msg(opt, newpath, 0,
> > +                                      _("CONFLICT (rename involved in "
> > +                                        "collision): rename of %s -> %s has "
> > +                                        "content conflicts AND collides "
> > +                                        "with another path; this may result "
> > +                                        "in nested conflict markers."),
> > +                                      oldpath, newpath);
>
> I was briefly taken aback by "AND collides with another path" wondering if
> that wording helps users understand the type of conflict here. But I can't
> think of anything better, so *shrug*.
>
> > +                     }
> >               } else if (collision && source_deleted) {
> > -                     /* rename/add/delete or rename/rename(2to1)/delete */
> > -                     die("Not yet implemented");
> > +                     /*
> > +                      * rename/add/delete or rename/rename(2to1)/delete:
> > +                      * since oldpath was deleted on the side that didn't
> > +                      * do the rename, there's not much of a content merge
> > +                      * we can do for the rename.  oldinfo->merged.is_null
> > +                      * was already set, so we just leave things as-is so
> > +                      * they look like an add/add conflict.
> > +                      */
> > +
> > +                     newinfo->path_conflict = 1;
> > +                     path_msg(opt, newpath, 0,
> > +                              _("CONFLICT (rename/delete): %s renamed "
> > +                                "to %s in %s, but deleted in %s."),
> > +                              oldpath, newpath, rename_branch, delete_branch);
>
> I think this branch is added in the wrong patch. My compiler is complaining
> that 'rename_branch' and 'delete_branch' are not declared (yet).

Whoops.  This used to be separate patches, with the second half coming
after one of the later patches in the series.  But in the commit
message it seemed most natural to talk about "rename collisions" which
then means both types of conflicts here.  So I squashed them...and
broke the build.  I'll rearrange this one to come after the
rename/delete patch so that rename_branch and delete_branch will be
defined.
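
For readers following the first hunk under review, the pathnames[] setup
can be sketched in isolation.  The helper below is hypothetical (the name
toy_fill_pathnames is not from merge-ort.c), and it assumes that
other_source_index is simply the side that did not do the rename, i.e.
`3 - target_index`; that computation lies outside the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper mirroring the three assignments in the hunk:
 * slot 0 is the merge base, slots 1 and 2 are the two sides.  The side
 * that renamed sees newpath; the base and the other side see oldpath.
 */
static void toy_fill_pathnames(const char *pathnames[3],
			       const char *oldpath, const char *newpath,
			       int target_index)
{
	int other_source_index;

	assert(target_index == 1 || target_index == 2);
	other_source_index = 3 - target_index; /* assumption, see above */

	pathnames[0] = oldpath;
	pathnames[other_source_index] = oldpath;
	pathnames[target_index] = newpath;
}
```

If side1 did the rename (target_index == 1), the array ends up as
{ oldpath, newpath, oldpath }; if side2 did it, { oldpath, oldpath,
newpath }.  That is why the subsequent lookups can index the array by
stage number without caring which side renamed.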

Patch

diff --git a/merge-ort.c b/merge-ort.c
index 19477cfae60..04a16837849 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -785,10 +785,58 @@  static int process_renames(struct merge_options *opt,
 		/* Need to check for special types of rename conflicts... */
 		if (collision && !source_deleted) {
 			/* collision: rename/add or rename/rename(2to1) */
-			die("Not yet implemented");
+			const char *pathnames[3];
+			struct version_info merged;
+
+			struct conflict_info *base, *side1, *side2;
+			unsigned clean;
+
+			pathnames[0] = oldpath;
+			pathnames[other_source_index] = oldpath;
+			pathnames[target_index] = newpath;
+
+			base = strmap_get(&opt->priv->paths, pathnames[0]);
+			side1 = strmap_get(&opt->priv->paths, pathnames[1]);
+			side2 = strmap_get(&opt->priv->paths, pathnames[2]);
+
+			VERIFY_CI(base);
+			VERIFY_CI(side1);
+			VERIFY_CI(side2);
+
+			clean = handle_content_merge(opt, pair->one->path,
+						     &base->stages[0],
+						     &side1->stages[1],
+						     &side2->stages[2],
+						     pathnames,
+						     1 + 2*opt->priv->call_depth,
+						     &merged);
+
+			memcpy(&newinfo->stages[target_index], &merged,
+			       sizeof(merged));
+			if (!clean) {
+				path_msg(opt, newpath, 0,
+					 _("CONFLICT (rename involved in "
+					   "collision): rename of %s -> %s has "
+					   "content conflicts AND collides "
+					   "with another path; this may result "
+					   "in nested conflict markers."),
+					 oldpath, newpath);
+			}
 		} else if (collision && source_deleted) {
-			/* rename/add/delete or rename/rename(2to1)/delete */
-			die("Not yet implemented");
+			/*
+			 * rename/add/delete or rename/rename(2to1)/delete:
+			 * since oldpath was deleted on the side that didn't
+			 * do the rename, there's not much of a content merge
+			 * we can do for the rename.  oldinfo->merged.is_null
+			 * was already set, so we just leave things as-is so
+			 * they look like an add/add conflict.
+			 */
+
+			newinfo->path_conflict = 1;
+			path_msg(opt, newpath, 0,
+				 _("CONFLICT (rename/delete): %s renamed "
+				   "to %s in %s, but deleted in %s."),
+				 oldpath, newpath, rename_branch, delete_branch);
 		} else {
 			/* a few different cases... */
 			if (type_changed) {