diff mbox series

[v2,12/20] merge-ort: have process_entries operate in a defined order

Message ID 20201102204344.342633-13-newren@gmail.com (mailing list archive)
State New, archived
Headers show
Series fundamentals of merge-ort implementation | expand

Commit Message

Elijah Newren Nov. 2, 2020, 8:43 p.m. UTC
We want to handle paths below a directory before needing to handle the
directory itself.  Also, we want to handle the directory immediately
after the paths below it, so we can't use simple lexicographic ordering
from strcmp (which would insert foo.txt between foo and foo/file.c).
Copy string_list_df_name_compare() from merge-recursive.c, and set up a
string list of paths sorted by that function so that we can iterate in
the desired order.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-ort.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 47 insertions(+), 2 deletions(-)

Comments

Derrick Stolee Nov. 11, 2020, 4:09 p.m. UTC | #1
On 11/2/2020 3:43 PM, Elijah Newren wrote:
> We want to handle paths below a directory before needing to handle the
> directory itself.  Also, we want to handle the directory immediately
> after the paths below it, so we can't use simple lexicographic ordering
> from strcmp (which would insert foo.txt between foo and foo/file.c).
> Copy string_list_df_name_compare() from merge-recursive.c, and set up a
> string list of paths sorted by that function so that we can iterate in
> the desired order.

This is at least the second time we've copied something from
merge-recursive.c. Should we be starting a merge-utils.[c|h] to group
these together under a common implementation?

> +	/* Put every entry from paths into plist, then sort */
>  	strmap_for_each_entry(&opt->priv->paths, &iter, e) {
> +		string_list_append(&plist, e->key)->util = e->value;
> +	}

nit: are braces required here?

-Stolee
Elijah Newren Nov. 11, 2020, 6:58 p.m. UTC | #2
On Wed, Nov 11, 2020 at 8:09 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 11/2/2020 3:43 PM, Elijah Newren wrote:
> > We want to handle paths below a directory before needing to handle the
> > directory itself.  Also, we want to handle the directory immediately
> > after the paths below it, so we can't use simple lexicographic ordering
> > from strcmp (which would insert foo.txt between foo and foo/file.c).
> > Copy string_list_df_name_compare() from merge-recursive.c, and set up a
> > string list of paths sorted by that function so that we can iterate in
> > the desired order.
>
> This is at least the second time we've copied something from
> merge-recursive.c. Should we be starting a merge-utils.[c|h] to group
> these together under a common implementation?

I'm actually going to replace the function later for performance
reasons, but trying to make the series as simple as possible prompted
me to "just copy something for a starting point".

There will be more functions that I copy, yes, but since I sometimes
also tweak and since the goal is to delete merge-recursive.[ch], I
didn't really want to set up an infrastructure to share stuff.

> > +     /* Put every entry from paths into plist, then sort */
> >       strmap_for_each_entry(&opt->priv->paths, &iter, e) {
> > +             string_list_append(&plist, e->key)->util = e->value;
> > +     }
>
> nit: are braces required here?

It might not be with the current macro definition of
strmap_for_each_entry(), but I think at one point it was (the macro
has undergone some changes over time).  Given the difficulty of
digging through the layers of macros (and the possible risk of it
changing in the future with hashmap or strmap changes), I wonder if
it's simpler for readers to just keep them?
diff mbox series

Patch

diff --git a/merge-ort.c b/merge-ort.c
index 92bbdc7255..3d46d62ed3 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -323,6 +323,33 @@  static int detect_and_process_renames(struct merge_options *opt,
 	return clean;
 }
 
+static int string_list_df_name_compare(const char *one, const char *two)
+{
+	int onelen = strlen(one);
+	int twolen = strlen(two);
+	/*
+	 * Here we only care that entries for D/F conflicts are
+	 * adjacent, in particular with the file of the D/F conflict
+	 * appearing before files below the corresponding directory.
+	 * The order of the rest of the list is irrelevant for us.
+	 *
+	 * To achieve this, we sort with df_name_compare and provide
+	 * the mode S_IFDIR so that D/F conflicts will sort correctly.
+	 * We use the mode S_IFDIR for everything else for simplicity,
+	 * since in other cases any changes in their order due to
+	 * sorting cause no problems for us.
+	 */
+	int cmp = df_name_compare(one, onelen, S_IFDIR,
+				  two, twolen, S_IFDIR);
+	/*
+	 * Now that 'foo' and 'foo/bar' compare equal, we have to make sure
+	 * that 'foo' comes before 'foo/bar'.
+	 */
+	if (cmp)
+		return cmp;
+	return onelen - twolen;
+}
+
 /* Per entry merge function */
 static void process_entry(struct merge_options *opt,
 			  const char *path,
@@ -413,23 +440,41 @@  static void process_entries(struct merge_options *opt,
 {
 	struct hashmap_iter iter;
 	struct strmap_entry *e;
+	struct string_list plist = STRING_LIST_INIT_NODUP;
+	struct string_list_item *entry;
 
 	if (strmap_empty(&opt->priv->paths)) {
 		oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
 		return;
 	}
 
+	/* Hack to pre-allocate plist to the desired size */
+	ALLOC_GROW(plist.items, strmap_get_size(&opt->priv->paths), plist.alloc);
+
+	/* Put every entry from paths into plist, then sort */
 	strmap_for_each_entry(&opt->priv->paths, &iter, e) {
+		string_list_append(&plist, e->key)->util = e->value;
+	}
+	plist.cmp = string_list_df_name_compare;
+	string_list_sort(&plist);
+
+	/*
+	 * Iterate over the items in reverse order, so we can handle paths
+	 * below a directory before needing to handle the directory itself.
+	 */
+	for (entry = &plist.items[plist.nr-1]; entry >= plist.items; --entry) {
+		char *path = entry->string;
 		/*
 		 * WARNING: If ci->merged.clean is true, then ci does not
 		 * actually point to a conflict_info but a struct merge_info.
 		 */
-		struct conflict_info *ci = e->value;
+		struct conflict_info *ci = entry->util;
 
 		if (!ci->merged.clean)
-			process_entry(opt, e->key, e->value);
+			process_entry(opt, path, ci);
 	}
 
+	string_list_clear(&plist, 0);
 	die("Tree creation not yet implemented");
 }