[v2,11/13] merge-tree: provide easy access to `ls-files -u` style info

Message ID c322e4c6938b7270b6e90998994642074a2813e0.1643479633.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Series In-core git merge-tree ("Server side merges")

Commit Message

Elijah Newren Jan. 29, 2022, 6:07 p.m. UTC
From: Elijah Newren <newren@gmail.com>

Much like `git merge` updates the index with information of the form
    (mode, oid, stage, name)
provide this output for conflicted files for merge-tree as well.
Provide an --exclude-modes-oids-stages/-l option for users to exclude
the mode, oid, and stage and only get the list of conflicted filenames.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/git-merge-tree.txt | 30 ++++++++++++++++++++++++------
 builtin/merge-tree.c             | 11 ++++++++++-
 t/t4301-merge-tree-write-tree.sh | 26 ++++++++++++++++++++++++--
 3 files changed, 58 insertions(+), 9 deletions(-)

Comments

Junio C Hamano Feb. 2, 2022, 9:32 p.m. UTC | #1
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> @@ -450,7 +451,11 @@ static int real_merge(struct merge_tree_options *o,
>  		merge_get_conflicted_files(&result, &conflicted_files);
>  		for (i = 0; i < conflicted_files.nr; i++) {
>  			const char *name = conflicted_files.items[i].string;
> -			if (last && !strcmp(last, name))
> +			struct stage_info *c = conflicted_files.items[i].util;
> +			if (!o->exclude_modes_oids_stages)
> +				printf("%06o %s %d\t",
> +				       c->mode, oid_to_hex(&c->oid), c->stage);
> +			else if (last && !strcmp(last, name))
>  				continue;
>  			write_name_quoted_relative(
>  				name, prefix, stdout, line_termination);

OK.  The addition (and disabling of the deduping) is quite trivial.
We do not even have to worry about line termination since the extra
pieces of info are prepended to the pathname.  Nice.

> @@ -485,6 +490,10 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix)
>  			    N_("do a trivial merge only"), 't'),
>  		OPT_BOOL(0, "messages", &o.show_messages,
>  			 N_("also show informational/conflict messages")),
> +		OPT_BOOL_F('l', "exclude-modes-oids-stages",
> +			   &o.exclude_modes_oids_stages,
> +			   N_("list conflicted files without modes/oids/stages"),
> +			   PARSE_OPT_NONEG),

Why does "-l" give shorter output than without it?  "-l" strongly
hints a longer output than without, at least to me.  Just wondering
if this will not become a source of confusion to future scripting
users.

>  		OPT_END()
>  	};
>
Elijah Newren Feb. 2, 2022, 11:18 p.m. UTC | #2
On Wed, Feb 2, 2022 at 1:32 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > @@ -450,7 +451,11 @@ static int real_merge(struct merge_tree_options *o,
> >               merge_get_conflicted_files(&result, &conflicted_files);
> >               for (i = 0; i < conflicted_files.nr; i++) {
> >                       const char *name = conflicted_files.items[i].string;
> > -                     if (last && !strcmp(last, name))
> > +                     struct stage_info *c = conflicted_files.items[i].util;
> > +                     if (!o->exclude_modes_oids_stages)
> > +                             printf("%06o %s %d\t",
> > +                                    c->mode, oid_to_hex(&c->oid), c->stage);
> > +                     else if (last && !strcmp(last, name))
> >                               continue;
> >                       write_name_quoted_relative(
> >                               name, prefix, stdout, line_termination);
>
> OK.  The addition (and disabling of the deduping) is quite trivial.
> We do not even have to worry about line termination since the extra
> pieces of info are prepended to the pathname.  Nice.
>
> > @@ -485,6 +490,10 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix)
> >                           N_("do a trivial merge only"), 't'),
> >               OPT_BOOL(0, "messages", &o.show_messages,
> >                        N_("also show informational/conflict messages")),
> > +             OPT_BOOL_F('l', "exclude-modes-oids-stages",
> > +                        &o.exclude_modes_oids_stages,
> > +                        N_("list conflicted files without modes/oids/stages"),
> > +                        PARSE_OPT_NONEG),
>
> Why does "-l" give shorter output than without it?  "-l" strongly
> hints a longer output than without, at least to me.  Just wondering
> if this will not become a source of confusion to future scripting
> users.

Here's another example where I was struggling with naming.  Something
like ls-tree's `--name-only` would have been nice, but I was worried
it'd be confusing since it only affected the conflicted info section
and does not suppress the printing of the toplevel tree or the
informational messages sections.  And the name
--exclude-modes-oids-stages was long enough that I wanted a short flag
for it, and just used the first letter of the description ("list
conflicted files...").  I'm happy to change either the long or the
short name for this flag if anyone has suggestions.
Ævar Arnfjörð Bjarmason Feb. 3, 2022, 1:08 a.m. UTC | #3
On Wed, Feb 02 2022, Elijah Newren wrote:

> On Wed, Feb 2, 2022 at 1:32 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>> > @@ -450,7 +451,11 @@ static int real_merge(struct merge_tree_options *o,
>> >               merge_get_conflicted_files(&result, &conflicted_files);
>> >               for (i = 0; i < conflicted_files.nr; i++) {
>> >                       const char *name = conflicted_files.items[i].string;
>> > -                     if (last && !strcmp(last, name))
>> > +                     struct stage_info *c = conflicted_files.items[i].util;
>> > +                     if (!o->exclude_modes_oids_stages)
>> > +                             printf("%06o %s %d\t",
>> > +                                    c->mode, oid_to_hex(&c->oid), c->stage);
>> > +                     else if (last && !strcmp(last, name))
>> >                               continue;
>> >                       write_name_quoted_relative(
>> >                               name, prefix, stdout, line_termination);
>>
>> OK.  The addition (and disabling of the deduping) is quite trivial.
>> We do not even have to worry about line termination since the extra
>> pieces of info are prepended to the pathname.  Nice.
>>
>> > @@ -485,6 +490,10 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix)
>> >                           N_("do a trivial merge only"), 't'),
>> >               OPT_BOOL(0, "messages", &o.show_messages,
>> >                        N_("also show informational/conflict messages")),
>> > +             OPT_BOOL_F('l', "exclude-modes-oids-stages",
>> > +                        &o.exclude_modes_oids_stages,
>> > +                        N_("list conflicted files without modes/oids/stages"),
>> > +                        PARSE_OPT_NONEG),
>>
>> Why does "-l" give shorter output than without it?  "-l" strongly
>> hints a longer output than without, at least to me.  Just wondering
>> if this will not become a source of confusion to future scripting
>> users.
>
> Here's another example where I was struggling with naming.  Something
> like ls-tree's `--name-only` would have been nice, but I was worried
> it'd be confusing since it only affected the conflicted info section
> and does not suppress the printing of the toplevel tree or the
> informational messages sections.  And the name
> --exclude-modes-oids-stages was long enough that I wanted a short flag
> for it, and just used the first letter of the description ("list
> conflicted files...").  I'm happy to change either the long or the
> short name for this flag if anyone has suggestions.

There's always sidestepping it by replacing it with a --format :)

Anyway, I'd mentioned that in an earlier review in
<220124.864k5tigto.gmgdl@evledraar.gmail.com>. FWIW here's an experiment
to do that that I polished up (mostly copied from the ls-tree WIP code
I'd written already).

I don't know if it will ever be useful, or if you think it's
worthwhile/simpler, but in either case I think in doing this I spotted
the following issues or otherwise noted inconsistencies in the pre-image:

 * The docs say that "<stage> <path>" is SP-separated, but it's
   actually TAB-separated, the rest is SP-separated.

 * That you de-dupe --exclude-modes-oids-stages is a bit of a hidden feature,
   but arguably intuitive. Should it be optional? In any case my formatting
   experiment makes it optional, since it then needs to be generalized to de-dupe
   after we've formatted.

 * Perhaps we should support --abbrev as ls-tree does? The below diff shows
   it's easy enough.

 * The dance you have with sed-ing out the hash in the tests could be made much
   easier with "sed 1d <out >actual" and --no-messages for some existing tests.

diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
index 6a2ed475106..e906d1dc9bf 100644
--- a/Documentation/git-merge-tree.txt
+++ b/Documentation/git-merge-tree.txt
@@ -44,10 +44,9 @@ OPTIONS
 	newline.  Also begin the messages section with a NUL character
 	instead of a newline.  See OUTPUT below for more information.
 
---exclude-oids-and-modes::
-	Instead of writing a list of (mode, oid, stage, path) tuples
-	to output for conflicted files, just provide a list of
-	filenames with conflicts.
+--conflict-format::
+	Override the default "%(objectmode) %(objectname)
+	%(stage)%x09%(path)" format.
 
 --[no-]messages::
 	Write any informational messages such as "Auto-merging <path>"
@@ -89,13 +88,13 @@ Conflicted file info
 
 This is a sequence of lines with the format
 
-	<mode> <object> <stage> <filename>
+	%(objectmode) %(objectname) %(stage)%x09%(path)
 
 The filename will be quoted as explained for the configuration
-variable `core.quotePath` (see linkgit:git-config[1]).  However, if
-the `--exclude-oids-and-modes` option is passed, the mode, object, and
-stage will be omitted.  If `-z` is passed, the "lines" are terminated
-by a NUL character instead of a newline character.
+variable `core.quotePath` (see linkgit:git-config[1]).
+
+If `-z` is passed, the "lines" are terminated by a NUL character
+instead of a newline character.
 
 Informational messages
 ~~~~~~~~~~~~~~~~~~~~~~
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index 58c0ddc5a32..14fed95a8ce 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -395,9 +395,64 @@ struct merge_tree_options {
 	int mode;
 	int allow_unrelated_histories;
 	int show_messages;
-	int exclude_modes_oids_stages;
+	const char *conflict_format;
+	int unique_conflicts;
+	int abbrev;
 };
 
+struct expand_conflict_data {
+	const char *prefix;
+	struct string_list_item *item;
+	struct strbuf *scratch;
+	int abbrev;
+	struct strbuf *sb_tmp;
+};
+static size_t expand_conflict_format(struct strbuf *sb,
+				     const char *start,
+				     void *context)
+{
+	struct expand_conflict_data *data = context;
+	struct string_list_item *item = data->item;
+	struct stage_info *info = item->util;
+	const char *end;
+	const char *p;
+	size_t len;
+
+	len = strbuf_expand_literal_cb(sb, start, NULL);
+	if (len)
+		return len;
+
+	if (*start != '(')
+		die(_("bad format as of '%s'"), start);
+	end = strchr(start + 1, ')');
+	if (!end)
+		die(_("format element '%s' does not end in ')'"), start);
+	len = end - start + 1;
+
+	if (skip_prefix(start, "(objectmode)", &p)) {
+		strbuf_addf(sb, "%06o", info->mode);
+	} else if (skip_prefix(start, "(objectname)", &p)) {
+		strbuf_addstr(sb, find_unique_abbrev(&info->oid, data->abbrev));
+	} else if (skip_prefix(start, "(stage)", &p)) {
+		strbuf_addf(sb, "%d", info->stage);
+	} else if (skip_prefix(start, "(path)", &p)) {
+		const char *name = item->string;
+
+		if (data->prefix)
+			name = relative_path(name, data->prefix, data->scratch);
+		strbuf_addstr(sb, name);
+
+		strbuf_reset(data->sb_tmp);
+		/* The relative_path() function resets "scratch" */
+
+	} else {
+		unsigned int errlen = (unsigned long)len;
+		die(_("bad format specifier %%%.*s"), errlen, start);
+	}
+
+	return len;
+}
+
 static int real_merge(struct merge_tree_options *o,
 		      const char *branch1, const char *branch2,
 		      const char *prefix)
@@ -446,23 +501,43 @@ static int real_merge(struct merge_tree_options *o,
 	puts(oid_to_hex(&result.tree->object.oid));
 	if (!result.clean) {
 		struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
-		const char *last = NULL;
-		int i;
+		struct string_list_item *item;
+		char *last = NULL;
+		struct strbuf sb = STRBUF_INIT;
+		struct strbuf tmp = STRBUF_INIT;
 
 		merge_get_conflicted_files(&result, &conflicted_files);
-		for (i = 0; i < conflicted_files.nr; i++) {
-			const char *name = conflicted_files.items[i].string;
-			struct stage_info *c = conflicted_files.items[i].util;
-			if (!o->exclude_modes_oids_stages)
-				printf("%06o %s %d\t",
-				       c->mode, oid_to_hex(&c->oid), c->stage);
-			else if (last && !strcmp(last, name))
+		for_each_string_list_item(item, &conflicted_files) {
+			struct expand_conflict_data ctx = {
+				.prefix = prefix,
+				.item = item,
+				.abbrev = o->abbrev,
+				.scratch = &sb,
+				.sb_tmp = &tmp,
+			};
+
+			strbuf_expand(&sb, o->conflict_format, expand_conflict_format, &ctx);
+			strbuf_addch(&sb, line_termination);
+
+			if (o->unique_conflicts && last && !strcmp(last, sb.buf)) {
+				free(last);
+				last = strbuf_detach(&sb, NULL);
 				continue;
-			write_name_quoted_relative(
-				name, prefix, stdout, line_termination);
-			last = name;
+			}
+
+			fwrite(sb.buf, sb.len, 1, stdout);
+
+			if (o->unique_conflicts) {
+				free(last);
+				last = strbuf_detach(&sb, NULL);
+			} else {
+				strbuf_reset(&sb);
+			}
 		}
 		string_list_clear(&conflicted_files, 1);
+		strbuf_release(&sb);
+		strbuf_release(&tmp);
+		free(last);
 	}
 	if (o->show_messages) {
 		putchar(line_termination);
@@ -474,7 +549,11 @@ static int real_merge(struct merge_tree_options *o,
 
 int cmd_merge_tree(int argc, const char **argv, const char *prefix)
 {
-	struct merge_tree_options o = { .show_messages = -1 };
+	struct merge_tree_options o = {
+		.show_messages = -1,
+		.conflict_format = "%(objectmode) %(objectname) %(stage)%x09%(path)",
+		.unique_conflicts = 1,
+	};
 	int expected_remaining_argc;
 	int original_argc;
 
@@ -493,14 +572,15 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix)
 			 N_("also show informational/conflict messages")),
 		OPT_SET_INT('z', NULL, &line_termination,
 			    N_("separate paths with the NUL character"), '\0'),
-		OPT_BOOL_F('l', "exclude-modes-oids-stages",
-			   &o.exclude_modes_oids_stages,
-			   N_("list conflicted files without modes/oids/stages"),
-			   PARSE_OPT_NONEG),
+		OPT_STRING(0, "conflict-format", &o.conflict_format, N_("format"),
+			   N_("specify a custom format to use for conflicted files")),
+		OPT_BOOL(0, "unique-conflicts", &o.unique_conflicts,
+			 N_("omit duplicate --conflict-format lines")),
 		OPT_BOOL_F(0, "allow-unrelated-histories",
 			   &o.allow_unrelated_histories,
 			   N_("allow merging unrelated histories"),
 			   PARSE_OPT_NONEG),
+		OPT__ABBREV(&o.abbrev),
 		OPT_END()
 	};
 
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index 4de089d976d..e6354b2d284 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -93,7 +93,7 @@ test_expect_success 'Barf on too many arguments' '
 '
 
 test_expect_success 'test conflict notices and such' '
-	test_expect_code 1 git merge-tree --write-tree --exclude-modes-oids-stages side1 side2 >out &&
+	test_expect_code 1 git merge-tree --write-tree --conflict-format="%(path)" side1 side2 >out &&
 	sed -e "s/[0-9a-f]\{40,\}/HASH/g" out >actual &&
 
 	# Expected results:
@@ -115,8 +115,35 @@ test_expect_success 'test conflict notices and such' '
 	test_cmp expect actual
 '
 
+test_expect_success 'merge-tree --unique-conflicts is the default' '
+	test_expect_code 1 git merge-tree --write-tree --conflict-format="%(path)" --no-messages side1 side2 >out &&
+	sed 1d <out >actual &&
+	cat >expect <<-\EOF &&
+	greeting
+	whatever~side1
+	EOF
+	test_cmp expect actual &&
+
+	test_expect_code 1 git merge-tree --write-tree --conflict-format="%(path)" --no-messages side1 side2 >out2 &&
+	sed 1d <out2 >actual2 &&
+	test_cmp actual actual2
+'
+
+test_expect_success 'merge-tree --no-unique-conflicts' '
+	test_expect_code 1 git merge-tree --write-tree --conflict-format="%(path)" --no-unique-conflicts --no-messages side1 side2 >out &&
+	sed 1d <out >actual &&
+	cat >expect <<-\EOF &&
+	greeting
+	greeting
+	greeting
+	whatever~side1
+	whatever~side1
+	EOF
+	test_cmp expect actual
+'
+
 test_expect_success 'Just the conflicted files without the messages' '
-	test_expect_code 1 git merge-tree --write-tree --no-messages --exclude-modes-oids-stages side1 side2 >out &&
+	test_expect_code 1 git merge-tree --write-tree --no-messages --conflict-format="%(path)" side1 side2 >out &&
 	sed -e "s/[0-9a-f]\{40,\}/HASH/g" out >actual &&
 
 	test_write_lines HASH greeting whatever~side1 >expect &&
Elijah Newren Feb. 3, 2022, 8:39 a.m. UTC | #4
On Wed, Feb 2, 2022 at 5:22 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> On Wed, Feb 02 2022, Elijah Newren wrote:
>
> > On Wed, Feb 2, 2022 at 1:32 PM Junio C Hamano <gitster@pobox.com> wrote:
> >>
> >> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >>
> >> > @@ -450,7 +451,11 @@ static int real_merge(struct merge_tree_options *o,
> >> >               merge_get_conflicted_files(&result, &conflicted_files);
> >> >               for (i = 0; i < conflicted_files.nr; i++) {
> >> >                       const char *name = conflicted_files.items[i].string;
> >> > -                     if (last && !strcmp(last, name))
> >> > +                     struct stage_info *c = conflicted_files.items[i].util;
> >> > +                     if (!o->exclude_modes_oids_stages)
> >> > +                             printf("%06o %s %d\t",
> >> > +                                    c->mode, oid_to_hex(&c->oid), c->stage);
> >> > +                     else if (last && !strcmp(last, name))
> >> >                               continue;
> >> >                       write_name_quoted_relative(
> >> >                               name, prefix, stdout, line_termination);
> >>
> >> OK.  The addition (and disabling of the deduping) is quite trivial.
> >> We do not even have to worry about line termination since the extra
> >> pieces of info are prepended to the pathname.  Nice.
> >>
> >> > @@ -485,6 +490,10 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix)
> >> >                           N_("do a trivial merge only"), 't'),
> >> >               OPT_BOOL(0, "messages", &o.show_messages,
> >> >                        N_("also show informational/conflict messages")),
> >> > +             OPT_BOOL_F('l', "exclude-modes-oids-stages",
> >> > +                        &o.exclude_modes_oids_stages,
> >> > +                        N_("list conflicted files without modes/oids/stages"),
> >> > +                        PARSE_OPT_NONEG),
> >>
> >> Why does "-l" give shorter output than without it?  "-l" strongly
> >> hints a longer output than without, at least to me.  Just wondering
> >> if this will not become a source of confusion to future scripting
> >> users.
> >
> > Here's another example where I was struggling with naming.  Something
> > like ls-tree's `--name-only` would have been nice, but I was worried
> > it'd be confusing since it only affected the conflicted info section
> > and does not suppress the printing of the toplevel tree or the
> > informational messages sections.  And the name
> > --exclude-modes-oids-stages was long enough that I wanted a short flag
> > for it, and just used the first letter of the description ("list
> > conflicted files...").  I'm happy to change either the long or the
> > short name for this flag if anyone has suggestions.
>
> There's always sidestepping it by replacing it with a --format :)

Another solution that occurred to me, and I was _really_ close to
doing it for v3, was to just flat drop this patch entirely and not
include any such option.  But...

  * "Which files had conflicts?" seems like such an obvious question
  * I've used `git ls-files -u | awk {print\$4} | uniq` a lot in the
    past after `git merge` (or `git rebase`) to get this info (yeah, it
    turns out `git diff --name-only --diff-filter=U` is 4 fewer
    characters)
  * "display the list of files where conflicts were present in the web
    UI" was listed as an early usecase[1]

[1] https://lore.kernel.org/git/YYlqpuzv+bmZaFzz@nand.local/

So it seemed like making that question easy to answer was worthwhile.
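
As an illustration of the "which files had conflicts?" question, here is a
minimal sketch that can be run without a repository.  It assumes input in the
"<mode> <object> <stage>TAB<path>" layout discussed in this thread; the object
names are made-up placeholders, and `conflicted_names` and `sample_conflicts`
are hypothetical helpers, not anything provided by git.

```shell
# Hypothetical helper: reduce "<mode> <oid> <stage>\t<path>" lines to the
# list of conflicted paths.  cut splits on TAB by default, and uniq only
# collapses adjacent duplicates, so this relies on all stages of a path
# being listed next to each other.
conflicted_names () {
	cut -f2 | uniq
}

# Made-up sample data; the object names are placeholders.
sample_conflicts () {
	printf '100644 %s 1\tgreeting\n' 1111111111111111111111111111111111111111
	printf '100644 %s 2\tgreeting\n' 2222222222222222222222222222222222222222
	printf '100644 %s 3\tgreeting\n' 3333333333333333333333333333333333333333
	printf '100644 %s 2\twhatever~side1\n' 4444444444444444444444444444444444444444
}

sample_conflicts | conflicted_names
```

Splitting on the TAB rather than on whitespace keeps paths containing spaces
intact, which `awk {print\$4}` would not.  Paths quoted per `core.quotePath`
are passed through unchanged by this sketch.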

> Anyway, I'd mentioned that in an earlier review in
> <220124.864k5tigto.gmgdl@evledraar.gmail.com>. FWIW here's an experiment
> to do that that I polished up (mostly copied from the ls-tree WIP code
> I'd written already).
>
> I don't know if it will ever be useful, or if you think it's
> worthwhile/simpler, but in either case I think in doing this I spotted
> the following issues or otherwise noted inconsistencies in the pre-image:
>
>  * The docs say that "<stage> <path>" is SP-separated, but it's
>    actually TAB-separated, the rest is SP-separated.

Yeah, good catch.  However, it doesn't actually say they are
SP-separated; it's ambiguous about the spacing.  Which probably isn't
a good thing, but it was kind of copied from the ls-files manual:

"""
       git ls-files just outputs the filenames unless --stage is specified in
       which case it outputs:

           [<tag> ]<mode> <object> <stage> <file>
"""

(which also uses a tab between <stage> and <file> and a space
otherwise, but the output above may lead you to believe otherwise.)
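
The separator distinction matters to anyone parsing these lines.  A minimal
sketch, assuming the "<mode> <object> <stage>TAB<file>" layout described
above; `parse_conflict_line` is a hypothetical name and the object name is a
placeholder:

```shell
# Split one line on the first TAB, then split the metadata on spaces.
# Splitting the whole line on whitespace would break paths that
# contain spaces.
parse_conflict_line () {
	line=$1
	tab=$(printf '\t')
	meta=${line%%"$tab"*}	# everything before the first TAB
	path=${line#*"$tab"}	# everything after the first TAB
	set -- $meta		# word-split into mode, oid, stage
	printf 'mode=%s stage=%s path=%s\n' "$1" "$3" "$path"
}

parse_conflict_line "$(printf '100644 %s 2\tdir/my file.txt' \
	1111111111111111111111111111111111111111)"
```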

>  * That you de-dupe --exclude-modes-oids-stages is a bit of a hidden feature,
>    but arguably intuitive. Should it be optional? In any case my formatting
>    experiment makes it optional, since it then needs to be generalized to de-dupe
>    after we've formatted.

I think without de-duping the flag isn't helpful enough to bother
implementing.  Requiring two flags also seems painful, given the
common case scenario.

I hope I'm not coming across as dismissive.  I think eventually adding
a --format and --dedupe (the combination of which might be implied by
whatever flag is used now) might be useful additions.  Maybe --abbrev
too...eventually.  But I'm worried that it's distracting from focusing
on usecases.  In particular, I'm worried it leads to "well, script
writers technically can get what they want because we provided
everything" rather than focusing on making the most common things easy
to get, and then extending the command for flexibility as needed
later.

I'd really rather that early versions _just_ focus on actual usecases
as far as UI is concerned (and thus I was really happy to see Dscho
and Taylor concentrate on that side; I think Christian might have been
talking about that angle some but it was hard to differentiate from
the "merge-tree on steroids" spitballing).  While I want to be careful
to avoid preventing UI flexibility, I think building it in from the
beginning tends to lead to a design that is less usable.  (e.g. the
possible loss of de-duping that would naturally have arisen from
looking at things from the other angle.)  It's just a bias I have.

>  * Perhaps we should support --abbrev as ls-tree does? The below diff shows
>    it's easy enough.

This one is less problematic to me, but I'd still rather that the UI
side of things focused on the usecases for early versions.

>  * The dance you have with sed-ing out the hash in the tests could be made much
>    easier with "sed 1d <out >actual" and --no-messages for some existing tests.

Ignoring the first line is semantically different than verifying it
looks like a hash.  It also only works on the first line, and hashes
appear in multiple places, so you'd need a variety of different sed
commands for different parts of the output, which doesn't seem any
easier at all to me; I think using the same replacement everywhere is
simpler.  But perhaps I should turn it into a shell function that I
use in each case.
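
One possible shape for such a shell function, as a sketch only.  The name
`anonymize_hash` is an assumption made for illustration; the sed expression is
the one already used in the tests, and the sample object names are
placeholders.

```shell
# Hypothetical helper: replace every run of 40 or more hex digits with
# the literal word HASH, wherever it appears in the input.  Extra
# arguments (e.g. a file name) are passed through to sed.
anonymize_hash () {
	sed -e "s/[0-9a-f]\{40,\}/HASH/g" "$@"
}

printf '%s\n100644 %s 2\tgreeting\n' \
	1111111111111111111111111111111111111111 \
	2222222222222222222222222222222222222222 |
anonymize_hash
```

Because the same replacement applies to every line, it covers both the
toplevel tree on the first line and the object names in the conflicted file
info.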

> diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
> index 6a2ed475106..e906d1dc9bf 100644
> --- a/Documentation/git-merge-tree.txt
> +++ b/Documentation/git-merge-tree.txt
> @@ -44,10 +44,9 @@ OPTIONS
>         newline.  Also begin the messages section with a NUL character
>         instead of a newline.  See OUTPUT below for more information.
>
> ---exclude-oids-and-modes::
> -       Instead of writing a list of (mode, oid, stage, path) tuples
> -       to output for conflicted files, just provide a list of
> -       filenames with conflicts.
> +--conflict-format::
> +       Override the default "%(objectmode) %(objectname)
> +       %(stage)%x09%(path)" format.
>
>  --[no-]messages::
>         Write any informational messages such as "Auto-merging <path>"
> @@ -89,13 +88,13 @@ Conflicted file info
>
>  This is a sequence of lines with the format
>
> -       <mode> <object> <stage> <filename>
> +       %(objectmode) %(objectname) %(stage)%x09%(path)
>
>  The filename will be quoted as explained for the configuration
> -variable `core.quotePath` (see linkgit:git-config[1]).  However, if
> -the `--exclude-oids-and-modes` option is passed, the mode, object, and
> -stage will be omitted.  If `-z` is passed, the "lines" are terminated
> -by a NUL character instead of a newline character.
> +variable `core.quotePath` (see linkgit:git-config[1]).
> +
> +If `-z` is passed, the "lines" are terminated by a NUL character
> +instead of a newline character.
>
>  Informational messages
>  ~~~~~~~~~~~~~~~~~~~~~~
> diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> index 58c0ddc5a32..14fed95a8ce 100644
> --- a/builtin/merge-tree.c
> +++ b/builtin/merge-tree.c
> @@ -395,9 +395,64 @@ struct merge_tree_options {
>         int mode;
>         int allow_unrelated_histories;
>         int show_messages;
> -       int exclude_modes_oids_stages;
> +       const char *conflict_format;
> +       int unique_conflicts;
> +       int abbrev;
>  };
>
> +struct expand_conflict_data {
> +       const char *prefix;
> +       struct string_list_item *item;
> +       struct strbuf *scratch;
> +       int abbrev;
> +       struct strbuf *sb_tmp;
> +};
> +static size_t expand_conflict_format(struct strbuf *sb,
> +                                    const char *start,
> +                                    void *context)
> +{
> +       struct expand_conflict_data *data = context;
> +       struct string_list_item *item = data->item;
> +       struct stage_info *info = item->util;
> +       const char *end;
> +       const char *p;
> +       size_t len;
> +
> +       len = strbuf_expand_literal_cb(sb, start, NULL);
> +       if (len)
> +               return len;
> +
> +       if (*start != '(')
> +               die(_("bad format as of '%s'"), start);
> +       end = strchr(start + 1, ')');
> +       if (!end)
> +               die(_("format element '%s' does not end in ')'"), start);
> +       len = end - start + 1;
> +
> +       if (skip_prefix(start, "(objectmode)", &p)) {
> +               strbuf_addf(sb, "%06o", info->mode);
> +       } else if (skip_prefix(start, "(objectname)", &p)) {
> +               strbuf_addstr(sb, find_unique_abbrev(&info->oid, data->abbrev));
> +       } else if (skip_prefix(start, "(stage)", &p)) {
> +               strbuf_addf(sb, "%d", info->stage);
> +       } else if (skip_prefix(start, "(path)", &p)) {
> +               const char *name = item->string;
> +
> +               if (data->prefix)
> +                       name = relative_path(name, data->prefix, data->scratch);
> +               strbuf_addstr(sb, name);
> +
> +               strbuf_reset(data->sb_tmp);
> +               /* The relative_path() function resets "scratch" */
> +
> +       } else {
> +               unsigned int errlen = (unsigned long)len;
> +               die(_("bad format specifier %%%.*s"), errlen, start);
> +       }
> +
> +       return len;
> +}
> +
>  static int real_merge(struct merge_tree_options *o,
>                       const char *branch1, const char *branch2,
>                       const char *prefix)
> @@ -446,23 +501,43 @@ static int real_merge(struct merge_tree_options *o,
>         puts(oid_to_hex(&result.tree->object.oid));
>         if (!result.clean) {
>                 struct string_list conflicted_files = STRING_LIST_INIT_NODUP;
> -               const char *last = NULL;
> -               int i;
> +               struct string_list_item *item;
> +               char *last = NULL;
> +               struct strbuf sb = STRBUF_INIT;
> +               struct strbuf tmp = STRBUF_INIT;
>
>                 merge_get_conflicted_files(&result, &conflicted_files);
> -               for (i = 0; i < conflicted_files.nr; i++) {
> -                       const char *name = conflicted_files.items[i].string;
> -                       struct stage_info *c = conflicted_files.items[i].util;
> -                       if (!o->exclude_modes_oids_stages)
> -                               printf("%06o %s %d\t",
> -                                      c->mode, oid_to_hex(&c->oid), c->stage);
> -                       else if (last && !strcmp(last, name))
> +               for_each_string_list_item(item, &conflicted_files) {
> +                       struct expand_conflict_data ctx = {
> +                               .prefix = prefix,
> +                               .item = item,
> +                               .abbrev = o->abbrev,
> +                               .scratch = &sb,
> +                               .sb_tmp = &tmp,
> +                       };
> +
> +                       strbuf_expand(&sb, o->conflict_format, expand_conflict_format, &ctx);
> +                       strbuf_addch(&sb, line_termination);
> +
> +                       if (o->unique_conflicts && last && !strcmp(last, sb.buf)) {
> +                               free(last);
> +                               last = strbuf_detach(&sb, NULL);
>                                 continue;
> -                       write_name_quoted_relative(
> -                               name, prefix, stdout, line_termination);
> -                       last = name;
> +                       }
> +
> +                       fwrite(sb.buf, sb.len, 1, stdout);
> +
> +                       if (o->unique_conflicts) {
> +                               free(last);
> +                               last = strbuf_detach(&sb, NULL);
> +                       } else {
> +                               strbuf_reset(&sb);
> +                       }
>                 }
>                 string_list_clear(&conflicted_files, 1);
> +               strbuf_release(&sb);
> +               strbuf_release(&tmp);
> +               free(last);
>         }
>         if (o->show_messages) {
>                 putchar(line_termination);
> @@ -474,7 +549,11 @@ static int real_merge(struct merge_tree_options *o,
>
>  int cmd_merge_tree(int argc, const char **argv, const char *prefix)
>  {
> -       struct merge_tree_options o = { .show_messages = -1 };
> +       struct merge_tree_options o = {
> +               .show_messages = -1,
> +               .conflict_format = "%(objectmode) %(objectname) %(stage)%x09%(path)",
> +               .unique_conflicts = 1,
> +       };
>         int expected_remaining_argc;
>         int original_argc;
>
> @@ -493,14 +572,15 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix)
>                          N_("also show informational/conflict messages")),
>                 OPT_SET_INT('z', NULL, &line_termination,
>                             N_("separate paths with the NUL character"), '\0'),
> -               OPT_BOOL_F('l', "exclude-modes-oids-stages",
> -                          &o.exclude_modes_oids_stages,
> -                          N_("list conflicted files without modes/oids/stages"),
> -                          PARSE_OPT_NONEG),
> +               OPT_STRING(0, "conflict-format", &o.conflict_format, N_("format"),
> +                          N_("specify a custom format to use for conflicted files")),
> +               OPT_BOOL(0, "unique-conflicts", &o.unique_conflicts,
> +                        N_("omit duplicate --conflict-format lines")),

You didn't include the latter in the manual?  Also, unique_conflicts
is trivial to understand from the coding perspective, but it probably
requires quite a bit more explanation in the manual.  For example, if
objectname is included in the format, unique-conflicts is essentially
a no-op.  And that's the default...so you'd probably have to spend
time in the manual explaining under what circumstances it's useful.
I'm also not sure whether a user who wanted (mode, path) would want
unique_conflicts to default to 1; it may only be meaningful in the
particular case of "just give me conflicted filenames".
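
To illustrate the concern, here is a rough shell sketch (not merge-tree
itself) that mimics the adjacent-line dedup with `uniq`.  The sample
entries, the 100755 mode, and the helper names `sample_entries` and
`format_entries` are all made up for illustration:

```shell
#!/bin/sh
# Hypothetical conflict entries: mode, placeholder oid, stage, path.
sample_entries() {
	printf '%s\n' \
		'100644 aaaa 1 greeting' \
		'100644 bbbb 2 greeting' \
		'100644 cccc 3 greeting' \
		'100644 dddd 1 whatever~side1' \
		'100755 eeee 2 whatever~side1'
}

# Print the selected fields of each entry; "$1" is one of:
#   path | mode-path | full
format_entries() {
	while read -r mode oid stage path
	do
		case "$1" in
		path)      printf '%s\n' "$path" ;;
		mode-path) printf '%s %s\n' "$mode" "$path" ;;
		full)      printf '%s %s %s\t%s\n' "$mode" "$oid" "$stage" "$path" ;;
		esac
	done
}

# uniq only drops adjacent duplicates, like the "last" comparison in
# the quoted loop.
sample_entries | format_entries path | uniq
sample_entries | format_entries mode-path | uniq
sample_entries | format_entries full | uniq
```

With these made-up entries, the path-only format collapses to two
lines, the (mode, path) format to three, and the full format keeps all
five, so the dedup only changes anything once the oid and stage are
left out.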

>                 OPT_BOOL_F(0, "allow-unrelated-histories",
>                            &o.allow_unrelated_histories,
>                            N_("allow merging unrelated histories"),
>                            PARSE_OPT_NONEG),
> +               OPT__ABBREV(&o.abbrev),
>                 OPT_END()
>         };
>
> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> index 4de089d976d..e6354b2d284 100755
> --- a/t/t4301-merge-tree-write-tree.sh
> +++ b/t/t4301-merge-tree-write-tree.sh
> @@ -93,7 +93,7 @@ test_expect_success 'Barf on too many arguments' '
>  '
>
>  test_expect_success 'test conflict notices and such' '
> -       test_expect_code 1 git merge-tree --write-tree --exclude-modes-oids-stages side1 side2 >out &&
> +       test_expect_code 1 git merge-tree --write-tree --conflict-format="%(path)" side1 side2 >out &&
>         sed -e "s/[0-9a-f]\{40,\}/HASH/g" out >actual &&
>
>         # Expected results:
> @@ -115,8 +115,35 @@ test_expect_success 'test conflict notices and such' '
>         test_cmp expect actual
>  '
>
> +test_expect_success 'merge-tree --unique-conflicts is the default' '
> +       test_expect_code 1 git merge-tree --write-tree --conflict-format="%(path)" --no-messages side1 side2 >out &&
> +       sed 1d <out >actual &&
> +       cat >expect <<-\EOF &&
> +       greeting
> +       whatever~side1
> +       EOF
> +       test_cmp expect actual &&
> +
> +       test_expect_code 1 git merge-tree --write-tree --conflict-format="%(path)" --no-messages side1 side2 >out2 &&
> +       sed 1d <out2 >actual2 &&
> +       test_cmp actual actual2
> +'
> +
> +test_expect_success 'merge-tree --no-unique-conflicts' '
> +       test_expect_code 1 git merge-tree --write-tree --conflict-format="%(path)" --no-unique-conflicts --no-messages side1 side2 >out &&
> +       sed 1d <out >actual &&
> +       cat >expect <<-\EOF &&
> +       greeting
> +       greeting
> +       greeting
> +       whatever~side1
> +       whatever~side1
> +       EOF
> +       test_cmp expect actual
> +'
> +
>  test_expect_success 'Just the conflicted files without the messages' '
> -       test_expect_code 1 git merge-tree --write-tree --no-messages --exclude-modes-oids-stages side1 side2 >out &&
> +       test_expect_code 1 git merge-tree --write-tree --no-messages --conflict-format="%(path)" side1 side2 >out &&
>         sed -e "s/[0-9a-f]\{40,\}/HASH/g" out >actual &&
>
>         test_write_lines HASH greeting whatever~side1 >expect &&

Patch

diff --git a/Documentation/git-merge-tree.txt b/Documentation/git-merge-tree.txt
index 160e8f44b62..55bb7bc61c1 100644
--- a/Documentation/git-merge-tree.txt
+++ b/Documentation/git-merge-tree.txt
@@ -38,6 +38,11 @@  See `OUTPUT` below for details.
 OPTIONS
 -------
 
+--exclude-modes-oids-stages::
+	Instead of writing a list of (mode, oid, stage, path) tuples
+	to output for conflicted files, just provide a list of
+	filenames with conflicts.
+
 --[no-]messages::
 	Write any informational messages such as "Auto-merging <path>"
 	or CONFLICT notices to the end of stdout.  If unspecified, the
@@ -55,7 +60,7 @@  simply one line:
 Whereas for a conflicted merge, the output is by default of the form:
 
 	<OID of toplevel tree>
-	<Conflicted file list>
+	<Conflicted file info>
 	<Informational messages>
 
 These are discussed individually below.
@@ -67,18 +72,23 @@  This is a tree object that represents what would be checked out in the
 working tree at the end of `git merge`.  If there were conflicts, then
 files within this tree may have embedded conflict markers.
 
-Conflicted file list
+Conflicted file info
 ~~~~~~~~~~~~~~~~~~~~
 
-This is a sequence of lines containing a filename on each line, quoted
-as explained for the configuration variable `core.quotePath` (see
-linkgit:git-config[1]).
+This is a sequence of lines with the format
+
+	<mode> <object> <stage> <filename>
+
+The filename will be quoted as explained for the configuration
+variable `core.quotePath` (see linkgit:git-config[1]).  However, if
+the `--exclude-modes-oids-stages` option is passed, the mode, object, and
+stage will be omitted.
 
 Informational messages
 ~~~~~~~~~~~~~~~~~~~~~~
 
 This always starts with a blank line to separate it from the previous
-section, and then has free-form messages about the merge, such as:
+sections, and then has free-form messages about the merge, such as:
 
   * "Auto-merging <file>"
   * "CONFLICT (rename/delete): <oldfile> renamed...but deleted in..."
@@ -110,6 +120,14 @@  plumbing commands since the possibility of merge conflicts give it a
 much higher chance of the command not succeeding (and NEWTREE containing
 a bunch of stuff other than just a toplevel tree).
 
+git-merge-tree was written to provide users with the same information
+that they'd have access to if using `git merge`:
+  * what would be written to the working tree (the <OID of toplevel tree>)
+  * the higher order stages that would be written to the index (the
+    <Conflicted file info>)
+  * any messages that would have been printed to stdout (the <Informational
+    messages>)
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index 54dae018203..dc52cd02dce 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -394,6 +394,7 @@  static int trivial_merge(const char *base,
 struct merge_tree_options {
 	int mode;
 	int show_messages;
+	int exclude_modes_oids_stages;
 };
 
 static int real_merge(struct merge_tree_options *o,
@@ -450,7 +451,11 @@  static int real_merge(struct merge_tree_options *o,
 		merge_get_conflicted_files(&result, &conflicted_files);
 		for (i = 0; i < conflicted_files.nr; i++) {
 			const char *name = conflicted_files.items[i].string;
-			if (last && !strcmp(last, name))
+			struct stage_info *c = conflicted_files.items[i].util;
+			if (!o->exclude_modes_oids_stages)
+				printf("%06o %s %d\t",
+				       c->mode, oid_to_hex(&c->oid), c->stage);
+			else if (last && !strcmp(last, name))
 				continue;
 			write_name_quoted_relative(
 				name, prefix, stdout, line_termination);
@@ -485,6 +490,10 @@  int cmd_merge_tree(int argc, const char **argv, const char *prefix)
 			    N_("do a trivial merge only"), 't'),
 		OPT_BOOL(0, "messages", &o.show_messages,
 			 N_("also show informational/conflict messages")),
+		OPT_BOOL_F('l', "exclude-modes-oids-stages",
+			   &o.exclude_modes_oids_stages,
+			   N_("list conflicted files without modes/oids/stages"),
+			   PARSE_OPT_NONEG),
 		OPT_END()
 	};
 
diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
index 7113d060bc5..1572f460da0 100755
--- a/t/t4301-merge-tree-write-tree.sh
+++ b/t/t4301-merge-tree-write-tree.sh
@@ -47,6 +47,7 @@  test_expect_success 'Content merge and a few conflicts' '
 	expected_tree=$(cat .git/AUTO_MERGE) &&
 
 	# We will redo the merge, while we are still in a conflicted state!
+	git ls-files -u >conflicted-file-info &&
 	test_when_finished "git reset --hard" &&
 
 	test_expect_code 1 git merge-tree --write-tree side1 side2 >RESULT &&
@@ -86,7 +87,7 @@  test_expect_success 'Barf on too many arguments' '
 '
 
 test_expect_success 'test conflict notices and such' '
-	test_expect_code 1 git merge-tree --write-tree side1 side2 >out &&
+	test_expect_code 1 git merge-tree --write-tree --exclude-modes-oids-stages side1 side2 >out &&
 	sed -e "s/[0-9a-f]\{40,\}/HASH/g" out >actual &&
 
 	# Expected results:
@@ -109,7 +110,7 @@  test_expect_success 'test conflict notices and such' '
 '
 
 test_expect_success 'Just the conflicted files without the messages' '
-	test_expect_code 1 git merge-tree --write-tree --no-messages side1 side2 >out &&
+	test_expect_code 1 git merge-tree --write-tree --no-messages --exclude-modes-oids-stages side1 side2 >out &&
 	sed -e "s/[0-9a-f]\{40,\}/HASH/g" out >actual &&
 
 	test_write_lines HASH greeting whatever~side1 >expect &&
@@ -117,4 +118,25 @@  test_expect_success 'Just the conflicted files without the messages' '
 	test_cmp expect actual
 '
 
+test_expect_success 'Check conflicted oids and modes without messages' '
+	test_expect_code 1 git merge-tree --write-tree --no-messages side1 side2 >out &&
+	sed -e "s/[0-9a-f]\{40,\}/HASH/g" out >actual &&
+
+	# Compare the basic output format
+	q_to_tab >expect <<-\EOF &&
+	HASH
+	100644 HASH 1Qgreeting
+	100644 HASH 2Qgreeting
+	100644 HASH 3Qgreeting
+	100644 HASH 1Qwhatever~side1
+	100644 HASH 2Qwhatever~side1
+	EOF
+
+	test_cmp expect actual &&
+
+	# Check the actual hashes against the `ls-files -u` output too
+	tail -n +2 out | sed -e s/side1/HEAD/ >actual &&
+	test_cmp conflicted-file-info actual
+'
+
 test_done
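
As a rough illustration of how a caller might consume the default
output documented in the patch above, the following shell sketch splits
`<mode> <object> <stage><TAB><filename>` lines into fields.  The names
`list_conflicts` and `sample_output` and the sample input are
hypothetical.  The sketch assumes newline-terminated output (no `-z`)
and filenames that did not need `core.quotePath` quoting:

```shell
#!/bin/sh
# Hypothetical helper: read merge-tree style output on stdin, skip the
# first line (the toplevel tree OID), and stop at the blank line that
# precedes the informational messages.
list_conflicts() {
	tab=$(printf '\t')
	sed 1d | while IFS= read -r line
	do
		test -n "$line" || break
		info=${line%%"$tab"*}   # "<mode> <object> <stage>"
		path=${line#*"$tab"}    # everything after the first TAB
		mode=${info%% *}
		stage=${info##* }
		printf 'stage %s of %s (mode %s)\n' "$stage" "$path" "$mode"
	done
}

# Made-up sample in the documented shape; the OIDs are placeholders.
sample_output() {
	printf 'TREE_OID\n'
	printf '100644 OID_A 1\tgreeting\n'
	printf '100644 OID_B 2\tgreeting\n'
	printf '100644 OID_C 3\tgreeting\n'
	printf '\n'
	printf 'Auto-merging greeting\n'
}

sample_output | list_conflicts
```

Splitting on the first TAB rather than on whitespace keeps filenames
containing spaces intact, which is the reason the extra fields are
prepended to the path instead of appended.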