diff mbox series

[v2,08/17] merge-ort: implement handle_directory_level_conflicts()

Message ID f6efa4350d621cd955965e13329f15f4d1b91bb4.1610055365.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Add directory rename detection to merge-ort | expand

Commit Message

Elijah Newren Jan. 7, 2021, 9:35 p.m. UTC
From: Elijah Newren <newren@gmail.com>

This is modelled on the version of handle_directory_level_conflicts()
from merge-recursive.c, but is massively simplified due to the following
factors:
  * strmap API provides simplifications over using direct hashamp
  * we have a dirs_removed field in struct rename_info that we have an
    easy way to populate from collect_merge_info(); this was already
    used in compute_rename_counts() and thus we do not need to check
    for condition #2.
  * The removal of condition #2 by handling it earlier in the code also
    obviates the need to check for condition #3 -- if both sides renamed
    a directory, meaning that the directory no longer exists on either
    side, then neither side could have added any new files to that
    directory, and thus there are no files whose locations we need to
    move due to such a directory rename.

In fact, the same logic that makes condition #3 irrelevant means
condition #1 is also irrelevant so we could drop this function.
However, it is cheap to check if both sides rename the same directory,
and doing so can save future computation.  So, simply remove any
directories that both sides renamed from the list of directory renames.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

Comments

Taylor Blau Jan. 18, 2021, 9 p.m. UTC | #1
On Thu, Jan 07, 2021 at 09:35:56PM +0000, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
>
> This is modelled on the version of handle_directory_level_conflicts()
> from merge-recursive.c, but is massively simplified due to the following
> factors:
>   * strmap API provides simplifications over using direct hashamp
>   * we have a dirs_removed field in struct rename_info that we have an
>     easy way to populate from collect_merge_info(); this was already
>     used in compute_rename_counts() and thus we do not need to check
>     for condition #2.
>   * The removal of condition #2 by handling it earlier in the code also
>     obviates the need to check for condition #3 -- if both sides renamed
>     a directory, meaning that the directory no longer exists on either
>     side, then neither side could have added any new files to that
>     directory, and thus there are no files whose locations we need to
>     move due to such a directory rename.
>
> In fact, the same logic that makes condition #3 irrelevant means
> condition #1 is also irrelevant so we could drop this function.
> However, it is cheap to check if both sides rename the same directory,
> and doing so can save future computation.  So, simply remove any
> directories that both sides renamed from the list of directory renames.

Beautiful.

>  static void handle_directory_level_conflicts(struct merge_options *opt)
>  {
> -	die("Not yet implemented!");
> +	struct hashmap_iter iter;
> +	struct strmap_entry *entry;
> +	struct string_list duplicated = STRING_LIST_INIT_NODUP;
> +	struct strmap *side1_dir_renames = &opt->priv->renames.dir_renames[1];
> +	struct strmap *side2_dir_renames = &opt->priv->renames.dir_renames[2];

Obviously saying "1" or "MERGE_SIDE1" are two ways of saying the same
thing, but perhaps the latter is more easily grep-able? Dunno.

> +	int i;
> +
> +	strmap_for_each_entry(side1_dir_renames, &iter, entry) {
> +		if (strmap_contains(side2_dir_renames, entry->key))
> +			string_list_append(&duplicated, entry->key);
> +	}
> +
> +	for (i=0; i<duplicated.nr; ++i) {

One small nit: this spacing and prefixed ++ reads a little oddly to me.
Perhaps:

  for (i = 0; i < duplicated.nr; i++) {

?

> +		strmap_remove(side1_dir_renames, duplicated.items[i].string, 0);
> +		strmap_remove(side2_dir_renames, duplicated.items[i].string, 0);
> +	}
> +	string_list_clear(&duplicated, 0);

And this looks like a faithful implementation of what you described
above. Thanks.

Thanks,
Taylor
Elijah Newren Jan. 18, 2021, 9:36 p.m. UTC | #2
On Mon, Jan 18, 2021 at 1:00 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> On Thu, Jan 07, 2021 at 09:35:56PM +0000, Elijah Newren via GitGitGadget wrote:
> > From: Elijah Newren <newren@gmail.com>
> >
> > This is modelled on the version of handle_directory_level_conflicts()
> > from merge-recursive.c, but is massively simplified due to the following
> > factors:
> >   * strmap API provides simplifications over using direct hashamp

Whoops, should be "hashmap"

> >   * we have a dirs_removed field in struct rename_info that we have an
> >     easy way to populate from collect_merge_info(); this was already
> >     used in compute_rename_counts() and thus we do not need to check
> >     for condition #2.
> >   * The removal of condition #2 by handling it earlier in the code also
> >     obviates the need to check for condition #3 -- if both sides renamed
> >     a directory, meaning that the directory no longer exists on either
> >     side, then neither side could have added any new files to that
> >     directory, and thus there are no files whose locations we need to
> >     move due to such a directory rename.
> >
> > In fact, the same logic that makes condition #3 irrelevant means
> > condition #1 is also irrelevant so we could drop this function.
> > However, it is cheap to check if both sides rename the same directory,
> > and doing so can save future computation.  So, simply remove any
> > directories that both sides renamed from the list of directory renames.
>
> Beautiful.
>
> >  static void handle_directory_level_conflicts(struct merge_options *opt)
> >  {
> > -     die("Not yet implemented!");
> > +     struct hashmap_iter iter;
> > +     struct strmap_entry *entry;
> > +     struct string_list duplicated = STRING_LIST_INIT_NODUP;
> > +     struct strmap *side1_dir_renames = &opt->priv->renames.dir_renames[1];
> > +     struct strmap *side2_dir_renames = &opt->priv->renames.dir_renames[2];
>
> Obviously saying "1" or "MERGE_SIDE1" are two ways of saying the same
> thing, but perhaps the latter is more easily grep-able? Dunno.
>
> > +     int i;
> > +
> > +     strmap_for_each_entry(side1_dir_renames, &iter, entry) {
> > +             if (strmap_contains(side2_dir_renames, entry->key))
> > +                     string_list_append(&duplicated, entry->key);
> > +     }
> > +
> > +     for (i=0; i<duplicated.nr; ++i) {
>
> One small nit: this spacing and prefixed ++ reads a little oddly to me.
> Perhaps:
>
>   for (i = 0; i < duplicated.nr; i++) {
>
> ?

Sure, will fix.

>
> > +             strmap_remove(side1_dir_renames, duplicated.items[i].string, 0);
> > +             strmap_remove(side2_dir_renames, duplicated.items[i].string, 0);
> > +     }
> > +     string_list_clear(&duplicated, 0);
>
> And this looks like a faithful implementation of what you described
> above. Thanks.
>
> Thanks,
> Taylor
diff mbox series

Patch

diff --git a/merge-ort.c b/merge-ort.c
index a8fcc026031..feeb838231a 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -941,7 +941,23 @@  static void get_provisional_directory_renames(struct merge_options *opt,
 
 static void handle_directory_level_conflicts(struct merge_options *opt)
 {
-	die("Not yet implemented!");
+	struct hashmap_iter iter;
+	struct strmap_entry *entry;
+	struct string_list duplicated = STRING_LIST_INIT_NODUP;
+	struct strmap *side1_dir_renames = &opt->priv->renames.dir_renames[1];
+	struct strmap *side2_dir_renames = &opt->priv->renames.dir_renames[2];
+	int i;
+
+	strmap_for_each_entry(side1_dir_renames, &iter, entry) {
+		if (strmap_contains(side2_dir_renames, entry->key))
+			string_list_append(&duplicated, entry->key);
+	}
+
+	for (i=0; i<duplicated.nr; ++i) {
+		strmap_remove(side1_dir_renames, duplicated.items[i].string, 0);
+		strmap_remove(side2_dir_renames, duplicated.items[i].string, 0);
+	}
+	string_list_clear(&duplicated, 0);
 }
 
 /*** Function Grouping: functions related to regular rename detection ***/