mbox series

[v2,0/8] Add a new --remerge-diff capability to show & log

Message ID pull.1103.v2.git.1640419159.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series Add a new --remerge-diff capability to show & log | expand

Message

Jean-Noël Avila via GitGitGadget Dec. 25, 2021, 7:59 a.m. UTC
Here are some patches to add a --remerge-diff capability to show & log,
which works by comparing merge commits to an automatic remerge (note that
the automatic remerge tree can contain files with conflict markers).

Changes since v1 (of the restarted submission, which technically was v2):

 * Restructured the series, so the first patch introduces the feature --
   with a bunch of caveats. Subsequent patches clean up those caveats. This
   avoids introducing not-yet-used functions, and hopefully makes review
   easier.
 * added testcases
 * numerous small improvements suggested by Ævar and Junio

Changes since original submission[1]:

 * Rebased on top of the version of ns/tmp-objdir that Neeraj submitted
   (Neeraj's patches were based on v2.34, but ns/tmp-objdir got applied on
   an old commit and does not even build because of that).
 * Modify ll-merge API to return a status, instead of printing "Cannot merge
   binary files" on stdout[2] (as suggested by Peff)
 * Make conflict messages and other such warnings into diff headers of the
   subsequent remerge-diff rather than appearing in the diff as file content
   of some funny looking filenames (as suggested by Peff[3] and Junio[4])
 * Sergey ack'ed the diff-merges.c portion of the patches, but that wasn't
   limited to one patch so not sure where to record that ack.

[1]
https://lore.kernel.org/git/pull.1080.git.git.1630376800.gitgitgadget@gmail.com/;
GitHub wouldn't let me change the target branch for the PR, so I had to
create a new one with the new base and thus the reason for not sending this
as v2 even though it is. [2]
https://lore.kernel.org/git/YVOZRhWttzF18Xql@coredump.intra.peff.net/,
https://lore.kernel.org/git/YVOZty9D7NRbzhE5@coredump.intra.peff.net/ [3]
https://lore.kernel.org/git/YVOXPTjsp9lrxmS6@coredump.intra.peff.net/ [4]
https://lore.kernel.org/git/xmqqr1d7e4ug.fsf@gitster.g/

=== FURTHER BACKGROUND (original cover letter material) ==

Here are some example commits you can try this out on (with git show
--remerge-diff $COMMIT):

 * git.git conflicted merge: 07601b5b36
 * git.git non-conflicted change: bf04590ecd
 * linux.git conflicted merge: eab3540562fb
 * linux.git non-conflicted change: 223cea6a4f05

Many more can be found by just running git log --merges --remerge-diff in
your repository of choice and searching for diffs (most merges tend to be
clean and unmodified and thus produce no diff but a search of '^diff' in the
log output tends to find the examples nicely).

Some basic high level details about this new option:

 * This option is most naturally compared to --cc, though the output seems
   to be much more understandable to most users than --cc output.
 * Since merges are often clean and unmodified, this new option results in
   an empty diff for most merges.
 * This new option shows things like the removal of conflict markers, which
   hunks users picked from the various conflicted sides to keep or remove,
   and shows changes made outside of conflict markers (which might reflect
   changes needed to resolve semantic conflicts or cleanups of e.g.
   compilation warnings or other additional changes an integrator felt
   belonged in the merged result).
 * This new option does not (currently) work for octopus merges, since
   merge-ort is specific to two-parent merges[1].
 * This option will not work on a read-only or full filesystem[2].
 * We discussed this capability at Git Merge 2020, and one of the
   suggestions was doing a periodic git gc --auto during the operation (due
   to potential new blobs and trees created during the operation). I found a
   way to avoid that; see [2].
 * This option is faster than you'd probably expect; it handles 33.5 merge
   commits per second in linux.git on my computer; see below.

In regards to the performance point above, the timing for running the
following command:

time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l


in linux.git (with v5.4 checked out, since my copy of linux is very out of
date) is as follows:

DIFF_FLAG=--cc:            71m 31.536s
DIFF_FLAG=--remerge-diff:  31m  3.170s


Note that there are 62476 merges in this history. Also, output size is:

DIFF_FLAG=--cc:            2169111 lines
DIFF_FLAG=--remerge-diff:  2458020 lines


So roughly the same amount of output as --cc, as you'd expect.

As a side note: git log --remerge-diff, when run in various repositories and
allowed to run all the way back to the beginning(s) of history, is a nice
stress test of sorts for merge-ort. Especially when users run it for you on
their repositories they are working on, whether intentionally or via a bug
in a tool triggering that command to be run unexpectedly. Long story short,
such a bug in an internal tool existed last December and this command was
run on an internal repository and found a platform-specific bug in merge-ort
on some really old merge commit from that repo. I fixed that bug (a
STABLE_QSORT thing) while upstreaming all the merge-ort patches in the mean
time, but it was nice getting extra testing. Having more folks run this on
their repositories might be useful extra testing of the new merge strategy.

Also, I previously mentioned --remerge-diff-only (a flag to show how
cherry-picks or reverts differ from an automatic cherry-pick or revert, in
addition to showing how merges differ from an automatic merge). This series
does not include the patches to introduce that option; I'll submit them
later.

Two other things that might be interesting but are not included and which I
haven't investigated:

 * some mechanism for passing extra merge options through (e.g.
   -Xignore-space-change)
 * a capability to compare the automatic merge to a second automatic merge
   done with different merge options. (Not sure if this would be of interest
   to end users, but might be interesting while developing new a
   --strategy-option, or maybe checking how changing some default in the
   merge algorithm would affect historical merges in various repositories).

[1] I have nebulous ideas of how an Octopus-centric ORT strategy could be
written -- basically, just repeatedly invoking ort and trying to make sure
nested conflicts can be differentiated. For now, though, a simple warning is
printed that octopus merges are not handled and no diff will be shown. [2]
New blobs/trees can be written by the three-way merging step. These are
written to a temporary area (via tmp-objdir.c) under the git object store
that is cleaned up at the end of the operation, with the new loose objects
from the remerge being cleaned up after each individual merge.

Elijah Newren (8):
  show, log: provide a --remerge-diff capability
  log: clean unneeded objects during `log --remerge-diff`
  ll-merge: make callers responsible for showing warnings
  merge-ort: capture and print ll-merge warnings in our preferred
    fashion
  merge-ort: mark a few more conflict messages as omittable
  merge-ort: format messages slightly different for use in headers
  diff: add ability to insert additional headers for paths
  show, log: include conflict/warning messages in --remerge-diff headers

 Documentation/diff-options.txt |   8 ++
 apply.c                        |   5 +-
 builtin/checkout.c             |  12 ++-
 builtin/log.c                  |  15 +++
 diff-merges.c                  |  12 +++
 diff.c                         | 116 +++++++++++++++++++++-
 diff.h                         |   3 +-
 ll-merge.c                     |  40 ++++----
 ll-merge.h                     |   9 +-
 log-tree.c                     |  70 +++++++++++++-
 merge-blobs.c                  |   5 +-
 merge-ort.c                    |  47 ++++++++-
 merge-ort.h                    |  10 ++
 merge-recursive.c              |   8 +-
 merge-recursive.h              |   1 +
 notes-merge.c                  |   5 +-
 rerere.c                       |  12 ++-
 revision.h                     |   6 +-
 t/t4069-remerge-diff.sh        | 172 +++++++++++++++++++++++++++++++++
 t/t6404-recursive-merge.sh     |   9 +-
 t/t6406-merge-attr.sh          |   9 +-
 tmp-objdir.c                   |   5 +
 tmp-objdir.h                   |   6 ++
 23 files changed, 538 insertions(+), 47 deletions(-)
 create mode 100755 t/t4069-remerge-diff.sh


base-commit: 4e44121c2d7bced65e25eb7ec5156290132bec94
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1103%2Fnewren%2Fremerge-diff-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1103/newren/remerge-diff-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1103

Range-diff vs v1:

  8:  5d5846be0bd !  1:  b3ae62083e1 show, log: provide a --remerge-diff capability
     @@ Commit message
          possibly, just to hide other random changes).
      
          This capability works by creating a temporary object directory and
     -    marking it as the primary object store, so that any blobs or trees
     -    created during the automatic merge, can be easily removed afterwards by
     -    just deleting all objects from the temporary object directory.  We can
     -    do this after handling each merge commit, in order to avoid the need to
     -    worry about doing `git gc --auto` runs while running `git log
     -    --remerge-diff`.
     +    marking it as the primary object store.  This makes it so that any blobs
     +    or trees created during the automatic merge easily removable afterwards
     +    by just deleting all objects from the temporary object directory.
     +
     +    There are a few ways that this implementation is suboptimal:
     +      * `log --remerge-diff` becomes slow, because the temporary object
     +        directory can fills with many loose objects while running
     +      * the log output can be muddied with misplaced "warning: cannot merge
     +        binary files" messages, since ll-merge.c unconditionally writes those
     +        messages to stderr while running instead of allowing callers to
     +        manage them.
     +      * important conflict and warning messages are simply dropped; thus for
     +        conflicts like modify/delete or rename/rename or file/directory which
     +        are not representable with content conflict markers, there may be no
     +        way for a user of --remerge-diff to know that there had been a
     +        conflict which was resolved (and which possibly motivated other
     +        changes in the merge commit).
     +    Subsequent commits will address these issues.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     + ## Documentation/diff-options.txt ##
     +@@ Documentation/diff-options.txt: ifdef::git-log[]
     + 	each of the parents. Separate log entry and diff is generated
     + 	for each parent.
     + +
     ++--diff-merges=remerge:::
     ++--diff-merges=r:::
     ++--remerge-diff:::
     ++	With this option, two-parent merge commits are remerged to
     ++	create a temporary tree object -- potentially containing files
     ++	with conflict markers and such.  A diff is then shown between
     ++	that temporary tree and the actual merge commit.
     +++
     + --diff-merges=combined:::
     + --diff-merges=c:::
     + -c:::
     +
       ## builtin/log.c ##
      @@
       #include "repository.h"
       #include "commit-reach.h"
       #include "range-diff.h"
     -+#include "dir.h"
      +#include "tmp-objdir.h"
       
       #define MAIL_DEFAULT_WRAP 72
       #define COVER_FROM_AUTO_MAX_SUBJECT_LEN 100
      @@ builtin/log.c: static int cmd_log_walk(struct rev_info *rev)
     + 	struct commit *commit;
       	int saved_nrl = 0;
       	int saved_dcctc = 0;
     - 
     ++	struct tmp_objdir *remerge_objdir = NULL;
     ++
      +	if (rev->remerge_diff) {
     -+		rev->remerge_objdir = tmp_objdir_create("remerge-diff");
     -+		if (!rev->remerge_objdir)
     -+			die(_("unable to create temporary object directory"));
     -+		tmp_objdir_replace_primary_odb(rev->remerge_objdir, 1);
     ++		remerge_objdir = tmp_objdir_create("remerge-diff");
     ++		if (!remerge_objdir)
     ++			die_errno(_("unable to create temporary object directory"));
     ++		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
      +	}
     -+
     + 
       	if (rev->early_output)
       		setup_early_output();
     - 
      @@ builtin/log.c: static int cmd_log_walk(struct rev_info *rev)
       	rev->diffopt.no_free = 0;
       	diff_free(&rev->diffopt);
       
     -+	if (rev->remerge_diff) {
     -+		tmp_objdir_destroy(rev->remerge_objdir);
     -+		rev->remerge_objdir = NULL;
     -+	}
     ++	if (rev->remerge_diff)
     ++		tmp_objdir_destroy(remerge_objdir);
      +
       	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
       	    rev->diffopt.flags.check_failed) {
     @@ builtin/log.c: int cmd_format_patch(int argc, const char **argv, const char *pre
       	if (rev.diffopt.output_format & DIFF_FORMAT_CHECKDIFF)
       		die(_("--check does not make sense"));
      +	if (rev.remerge_diff)
     -+		die(_("--remerge_diff does not make sense"));
     ++		die(_("--remerge-diff does not make sense"));
       
       	if (!use_patch_format &&
       		(!rev.diffopt.output_format ||
     @@ log-tree.c
       #include "config.h"
       #include "diff.h"
       #include "object-store.h"
     - #include "repository.h"
     -+#include "tmp-objdir.h"
     - #include "commit.h"
     +@@
       #include "tag.h"
       #include "graph.h"
       #include "log-tree.h"
     @@ log-tree.c
       #include "reflog-walk.h"
       #include "refs.h"
       #include "string-list.h"
     -@@
     - #include "line-log.h"
     - #include "help.h"
     - #include "range-diff.h"
     -+#include "dir.h"
     - 
     - static struct decoration name_decoration = { "object names" };
     - static int decoration_loaded;
      @@ log-tree.c: static int do_diff_combined(struct rev_info *opt, struct commit *commit)
       	return !opt->loginfo;
       }
     @@ log-tree.c: static int do_diff_combined(struct rev_info *opt, struct commit *com
      +{
      +	struct merge_options o;
      +	struct commit_list *bases;
     -+	struct merge_result res;
     ++	struct merge_result res = {0};
      +	struct pretty_print_context ctx = {0};
     -+	struct strbuf commit1 = STRBUF_INIT;
     -+	struct strbuf commit2 = STRBUF_INIT;
     ++	struct commit *parent1 = parents->item;
     ++	struct commit *parent2 = parents->next->item;
     ++	struct strbuf parent1_desc = STRBUF_INIT;
     ++	struct strbuf parent2_desc = STRBUF_INIT;
      +
      +	/* Setup merge options */
      +	init_merge_options(&o, the_repository);
     -+	memset(&res, 0, sizeof(res));
      +	o.show_rename_progress = 0;
      +
      +	ctx.abbrev = DEFAULT_ABBREV;
     -+	format_commit_message(parents->item,       "%h (%s)", &commit1, &ctx);
     -+	format_commit_message(parents->next->item, "%h (%s)", &commit2, &ctx);
     -+	o.branch1 = commit1.buf;
     -+	o.branch2 = commit2.buf;
     -+	o.record_conflict_msgs_as_headers = 1;
     ++	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
     ++	format_commit_message(parent2, "%h (%s)", &parent2_desc, &ctx);
     ++	o.branch1 = parent1_desc.buf;
     ++	o.branch2 = parent2_desc.buf;
      +
      +	/* Parse the relevant commits and get the merge bases */
     -+	parse_commit_or_die(parents->item);
     -+	parse_commit_or_die(parents->next->item);
     -+	bases = get_merge_bases(parents->item, parents->next->item);
     ++	parse_commit_or_die(parent1);
     ++	parse_commit_or_die(parent2);
     ++	bases = get_merge_bases(parent1, parent2);
      +
      +	/* Re-merge the parents */
     -+	merge_incore_recursive(&o,
     -+			       bases, parents->item, parents->next->item,
     -+			       &res);
     ++	merge_incore_recursive(&o, bases, parent1, parent2, &res);
      +
      +	/* Show the diff */
     -+	opt->diffopt.additional_path_headers = res.path_messages;
      +	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
      +	log_tree_diff_flush(opt);
      +
      +	/* Cleanup */
     -+	opt->diffopt.additional_path_headers = NULL;
     -+	strbuf_release(&commit1);
     -+	strbuf_release(&commit2);
     ++	strbuf_release(&parent1_desc);
     ++	strbuf_release(&parent2_desc);
      +	merge_finalize(&o, &res);
     -+
     -+	/* Clean up the temporary object directory */
     -+	if (opt->remerge_objdir != NULL)
     -+		tmp_objdir_discard_objects(opt->remerge_objdir);
     -+	else
     -+		BUG("unable to remove temporary object directory");
     ++	/* TODO: clean up the temporary object directory */
      +
      +	return !opt->loginfo;
      +}
     @@ revision.h: struct rev_info {
       
       	/* Format info */
       	int		show_notes;
     -@@ revision.h: struct rev_info {
     - 
     - 	/* misc. flags related to '--no-kept-objects' */
     - 	unsigned keep_pack_cache_flags;
     +
     + ## t/t4069-remerge-diff.sh (new) ##
     +@@
     ++#!/bin/sh
      +
     -+	/* Location where temporary objects for remerge-diff are written. */
     -+	struct tmp_objdir *remerge_objdir;
     - };
     - 
     - int ref_excluded(struct string_list *, const char *path);
     ++test_description='remerge-diff handling'
     ++
     ++. ./test-lib.sh
     ++
     ++test_expect_success 'setup basic merges' '
     ++	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
     ++	git add numbers &&
     ++	git commit -m base &&
     ++
     ++	git branch feature_a &&
     ++	git branch feature_b &&
     ++	git branch feature_c &&
     ++
     ++	git branch ab_resolution &&
     ++	git branch bc_resolution &&
     ++
     ++	git checkout feature_a &&
     ++	test_write_lines 1 2 three 4 5 6 7 eight 9 >numbers &&
     ++	git commit -a -m change_a &&
     ++
     ++	git checkout feature_b &&
     ++	test_write_lines 1 2 tres 4 5 6 7 8 9 >numbers &&
     ++	git commit -a -m change_b &&
     ++
     ++	git checkout feature_c &&
     ++	test_write_lines 1 2 3 4 5 6 7 8 9 10 >numbers &&
     ++	git commit -a -m change_c &&
     ++
     ++	git checkout bc_resolution &&
     ++	# fast forward
     ++	git merge feature_b &&
     ++	# no conflict
     ++	git merge feature_c &&
     ++
     ++	git checkout ab_resolution &&
     ++	# fast forward
     ++	git merge feature_a &&
     ++	# conflicts!
     ++	test_must_fail git merge feature_b &&
     ++	# Resolve conflict...and make another change elsewhere
     ++	test_write_lines 1 2 drei 4 5 6 7 acht 9 >numbers &&
     ++	git add numbers &&
     ++	git merge --continue
     ++'
     ++
     ++test_expect_success 'remerge-diff on a clean merge' '
     ++	git log -1 --oneline bc_resolution >expect &&
     ++	git show --oneline --remerge-diff bc_resolution >actual &&
     ++	test_cmp expect actual
     ++'
     ++
     ++test_expect_success 'remerge-diff with both a resolved conflict and an unrelated change' '
     ++	git log -1 --oneline ab_resolution >tmp &&
     ++	cat <<-EOF >>tmp &&
     ++	diff --git a/numbers b/numbers
     ++	index a1fb731..6875544 100644
     ++	--- a/numbers
     ++	+++ b/numbers
     ++	@@ -1,13 +1,9 @@
     ++	 1
     ++	 2
     ++	-<<<<<<< b0ed5cb (change_a)
     ++	-three
     ++	-=======
     ++	-tres
     ++	->>>>>>> 6cd3f82 (change_b)
     ++	+drei
     ++	 4
     ++	 5
     ++	 6
     ++	 7
     ++	-eight
     ++	+acht
     ++	 9
     ++	EOF
     ++	# Hashes above are sha1; rip them out so test works with sha256
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
     ++
     ++	git show --oneline --remerge-diff ab_resolution >tmp &&
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
     ++	test_cmp expect actual
     ++'
     ++
     ++test_done
  1:  fab1b2c69ea !  2:  54f1fb31d04 tmp_objdir: add a helper function for discarding all contained objects
     @@ Metadata
      Author: Elijah Newren <newren@gmail.com>
      
       ## Commit message ##
     -    tmp_objdir: add a helper function for discarding all contained objects
     +    log: clean unneeded objects during `log --remerge-diff`
     +
     +    The --remerge-diff option will need to create new blobs and trees
     +    representing the "automatic merge" state.  If one is traversing a
     +    long project history, one can easily get hundreds of thousands of
     +    loose objects generated during `log --remerge-diff`.  However, none of
     +    those loose objects are needed after we have completed our diff
     +    operation; they can be summarily deleted.
     +
     +    Add a new helper function to tmp_objdir to discard all the contained
     +    objects, and call it after each merge is handled.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     + ## builtin/log.c ##
     +@@ builtin/log.c: static int cmd_log_walk(struct rev_info *rev)
     + 	struct commit *commit;
     + 	int saved_nrl = 0;
     + 	int saved_dcctc = 0;
     +-	struct tmp_objdir *remerge_objdir = NULL;
     + 
     + 	if (rev->remerge_diff) {
     +-		remerge_objdir = tmp_objdir_create("remerge-diff");
     +-		if (!remerge_objdir)
     ++		rev->remerge_objdir = tmp_objdir_create("remerge-diff");
     ++		if (!rev->remerge_objdir)
     + 			die_errno(_("unable to create temporary object directory"));
     +-		tmp_objdir_replace_primary_odb(remerge_objdir, 1);
     ++		tmp_objdir_replace_primary_odb(rev->remerge_objdir, 1);
     + 	}
     + 
     + 	if (rev->early_output)
     +@@ builtin/log.c: static int cmd_log_walk(struct rev_info *rev)
     + 	rev->diffopt.no_free = 0;
     + 	diff_free(&rev->diffopt);
     + 
     +-	if (rev->remerge_diff)
     +-		tmp_objdir_destroy(remerge_objdir);
     ++	if (rev->remerge_diff) {
     ++		tmp_objdir_destroy(rev->remerge_objdir);
     ++		rev->remerge_objdir = NULL;
     ++	}
     + 
     + 	if (rev->diffopt.output_format & DIFF_FORMAT_CHECKDIFF &&
     + 	    rev->diffopt.flags.check_failed) {
     +
     + ## log-tree.c ##
     +@@
     + #include "diff.h"
     + #include "object-store.h"
     + #include "repository.h"
     ++#include "tmp-objdir.h"
     + #include "commit.h"
     + #include "tag.h"
     + #include "graph.h"
     +@@ log-tree.c: static int do_remerge_diff(struct rev_info *opt,
     + 	strbuf_release(&parent1_desc);
     + 	strbuf_release(&parent2_desc);
     + 	merge_finalize(&o, &res);
     +-	/* TODO: clean up the temporary object directory */
     ++
     ++	/* Clean up the contents of the temporary object directory */
     ++	if (opt->remerge_objdir)
     ++		tmp_objdir_discard_objects(opt->remerge_objdir);
     ++	else
     ++		BUG("unable to remove temporary object directory");
     + 
     + 	return !opt->loginfo;
     + }
     +
     + ## revision.h ##
     +@@ revision.h: struct rev_info {
     + 
     + 	/* misc. flags related to '--no-kept-objects' */
     + 	unsigned keep_pack_cache_flags;
     ++
     ++	/* Location where temporary objects for remerge-diff are written. */
     ++	struct tmp_objdir *remerge_objdir;
     + };
     + 
     + int ref_excluded(struct string_list *, const char *path);
     +
       ## tmp-objdir.c ##
      @@ tmp-objdir.c: static void remove_tmp_objdir_on_signal(int signo)
       	raise(signo);
  2:  d022176618d !  3:  d5566f5d136 ll-merge: make callers responsible for showing warnings
     @@ Commit message
          from ll-merge and instead modify the return status of ll_merge() to
          indicate when a merge of binary files has occurred.
      
     +    This commit continues printing the message as-is; future changes will
     +    start handling the new commit differently in the merge-ort codepath.
     +
          Note that my methodology included first modifying ll_merge() to return
          a struct, so that the compiler would catch all the callers for me and
          ensure I had modified all of them.  After modifying all of them, I then
     @@ apply.c: static int three_way_merge(struct apply_state *state,
       			  NULL);
      +	if (status == LL_MERGE_BINARY_CONFLICT)
      +		warning("Cannot merge binary files: %s (%s vs. %s)",
     -+			"base", "ours", "theirs");
     ++			path, "ours", "theirs");
       	free(base_file.ptr);
       	free(our_file.ptr);
       	free(their_file.ptr);
     @@ rerere.c: static int try_merge(struct index_state *istate,
       	mmfile_t base = {NULL, 0}, other = {NULL, 0};
       
       	if (read_mmfile(&base, rerere_path(id, "preimage")) ||
     - 	    read_mmfile(&other, rerere_path(id, "postimage")))
     +-	    read_mmfile(&other, rerere_path(id, "postimage")))
      -		ret = 1;
      -	else
     ++	    read_mmfile(&other, rerere_path(id, "postimage"))) {
      +		ret = LL_MERGE_CONFLICT;
     -+	else {
     ++	} else {
       		/*
       		 * A three-way merge. Note that this honors user-customizable
       		 * low-level merge driver settings.
  3:  f36395fdee0 =  4:  a02845f12db merge-ort: capture and print ll-merge warnings in our preferred fashion
  4:  1e7eef7705e !  5:  000933c5d7f merge-ort: mark a few more conflict messages as omittable
     @@ Commit message
          are just noise when trying to see what changes users made to create a
          merge commit.  Mark them as omittable.
      
     +    Note that there were already a few messages marked as omittable in
     +    merge-ort when doing a remerge-diff, because the development of
     +    --remerge-diff preceded the upstreaming of merge-ort and I was trying to
     +    ensure merge-ort could handle all the necessary requirements.  See
     +    commit c5a6f65527 ("merge-ort: add modify/delete handling and delayed
     +    output processing", 2020-12-03) for the initial details.  For some
     +    examples of already-marked-as-omittable messages, see either
     +    "Auto-merging <path>" or some of the submodule update hints.  This
     +    commit just adds two more messages that should also be omittable.
     +
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
       ## merge-ort.c ##
     @@ merge-ort.c: static void apply_directory_rename_modifications(struct merge_optio
       				 _("CONFLICT (file location): %s renamed to %s "
       				   "in %s, inside a directory that was renamed "
       				   "in %s, suggesting it should perhaps be "
     -@@ merge-ort.c: static void process_entry(struct merge_options *opt,
     - 				reason = _("add/add");
     - 			if (S_ISGITLINK(merged_file.mode))
     - 				reason = _("submodule");
     --			path_msg(opt, path, 0,
     -+			path_msg(opt, path, 1,
     - 				 _("CONFLICT (%s): Merge conflict in %s"),
     - 				 reason, path);
     - 		}
  7:  b307f63569f !  6:  887e46435c0 merge-ort: format messages slightly different for use in headers
     @@ Metadata
       ## Commit message ##
          merge-ort: format messages slightly different for use in headers
      
     -    We want to add an ability for users to run
     +    When users run
              git show --remerge-diff $MERGE_COMMIT
     -    or even
     +    or
              git log -p --remerge-diff ...
     -    and have git show the differences between where the merge machinery
     -    would stop and what is recorded in merge commits.  However, in such
     -    cases, stdout is not an appropriate location to dump conflict messages.
     -    We instead want these messages to appear as headers in the subsequent
     -    diff.  For them to work as headers, though, we need for any multiline
     +    stdout is not an appropriate location to dump conflict messages, but we
     +    do want to provide them to users.  We will include them in the diff
     +    headers instead...but for that to work, we need for any multiline
          messages to replace newlines with both a newline and a space.  Add a new
          flag to signal when we want these messages modified in such a fashion,
          and use it in path_msg() to modify these messages this way.
  6:  15600df925f !  7:  e9470651303 diff: add ability to insert additional headers for paths
     @@ Metadata
       ## Commit message ##
          diff: add ability to insert additional headers for paths
      
     -    In support of a remerge-diff ability we will add in a few commits, we
     -    want to be able to provide additional headers to show along with a diff.
     -    Add the plumbing necessary to enable this.
     +    When additional headers are provided, we need to
     +      * add diff_filepairs to diff_queued_diff for each paths in the
     +        additional headers map which, unless that path is part of
     +        another diff_filepair already found in diff_queued_diff
     +      * format the headers (colorization, line_prefix for --graph)
     +      * make sure the various codepaths that attempt to return early
     +        if there are "no changes" take into account the headers that
     +        need to be shown.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     @@ diff.c: struct userdiff_driver *get_textconv(struct repository *r,
       	return userdiff_get_textconv(r, one->driver);
       }
       
     -+static struct strbuf* additional_headers(struct diff_options *o,
     ++static struct strbuf *additional_headers(struct diff_options *o,
      +					 const char *path)
      +{
      +	if (!o->additional_path_headers)
     @@ diff.c: struct userdiff_driver *get_textconv(struct repository *r,
      +{
      +	char *next, *newline;
      +
     -+	next = more_headers->buf;
     -+	while ((newline = strchr(next, '\n'))) {
     -+		*newline = '\0';
     -+		strbuf_addf(msg, "%s%s%s%s\n", line_prefix, meta, next, reset);
     -+		*newline = '\n';
     -+		next = newline + 1;
     ++	for (next = more_headers->buf; *next; next = newline) {
     ++		newline = strchrnul(next, '\n');
     ++		strbuf_addf(msg, "%s%s%.*s%s\n", line_prefix, meta,
     ++			    (int)(newline - next), next, reset);
     ++		if (*newline)
     ++			newline++;
      +	}
     -+	if (*next)
     -+		strbuf_addf(msg, "%s%s%s%s\n", line_prefix, meta, next, reset);
      +}
      +
       static void builtin_diff(const char *name_a,
       			 const char *name_b,
       			 struct diff_filespec *one,
     +@@ diff.c: static void builtin_diff(const char *name_a,
     + 	b_two = quote_two(b_prefix, name_b + (*name_b == '/'));
     + 	lbl[0] = DIFF_FILE_VALID(one) ? a_one : "/dev/null";
     + 	lbl[1] = DIFF_FILE_VALID(two) ? b_two : "/dev/null";
     ++	if (!DIFF_FILE_VALID(one) && !DIFF_FILE_VALID(two)) {
     ++		/*
     ++		 * We should only reach this point for pairs from
     ++		 * create_filepairs_for_header_only_notifications().  For
     ++		 * these, we should avoid the "/dev/null" special casing
     ++		 * above, meaning we avoid showing such pairs as either
     ++		 * "new file" or "deleted file" below.
     ++		 */
     ++		lbl[0] = a_one;
     ++		lbl[1] = b_two;
     ++	}
     + 	strbuf_addf(&header, "%s%sdiff --git %s %s%s\n", line_prefix, meta, a_one, b_two, reset);
     + 	if (lbl[0][0] == '/') {
     + 		/* /dev/null */
      @@ diff.c: static void fill_metainfo(struct strbuf *msg,
       	const char *set = diff_get_color(use_color, DIFF_METAINFO);
       	const char *reset = diff_get_color(use_color, DIFF_RESET);
     @@ diff.c: static void fill_metainfo(struct strbuf *msg,
       
       	*must_show_header = 1;
       	strbuf_init(msg, PATH_MAX * 2 + 300);
     -+	if ((more_headers = additional_headers(o, name)))
     +@@ diff.c: static void fill_metainfo(struct strbuf *msg,
     + 	default:
     + 		*must_show_header = 0;
     + 	}
     ++	if ((more_headers = additional_headers(o, name))) {
      +		add_formatted_headers(msg, more_headers,
      +				      line_prefix, set, reset);
     - 	switch (p->status) {
     - 	case DIFF_STATUS_COPIED:
     - 		strbuf_addf(msg, "%s%ssimilarity index %d%%",
     ++		*must_show_header = 1;
     ++	}
     + 	if (one && two && !oideq(&one->oid, &two->oid)) {
     + 		const unsigned hexsz = the_hash_algo->hexsz;
     + 		int abbrev = o->abbrev ? o->abbrev : DEFAULT_ABBREV;
      @@ diff.c: int diff_unmodified_pair(struct diff_filepair *p)
       
       static void diff_flush_patch(struct diff_filepair *p, struct diff_options *o)
       {
      -	if (diff_unmodified_pair(p))
     ++	/*
     ++	 * Check if we can return early without showing a diff.  Note that
     ++	 * diff_filepair only stores {oid, path, mode, is_valid}
     ++	 * information for each path, and thus diff_unmodified_pair() only
     ++	 * considers those bits of info.  However, we do not want pairs
     ++	 * created by create_filepairs_for_header_only_notifications() to
     ++	 * be ignored, so return early if both p is unmodified AND
     ++	 * p->one->path is not in additional headers.
     ++	 */
      +	if (diff_unmodified_pair(p) && !additional_headers(o, p->one->path))
       		return;
       
     ++	/* Actually, we can also return early to avoid showing tree diffs */
       	if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) ||
     + 	    (DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))
     +-		return; /* no tree diffs in patch format */
     ++		return;
     + 
     + 	run_diff(p, o);
     + }
     +@@ diff.c: static void diff_flush_checkdiff(struct diff_filepair *p,
     + 	run_checkdiff(p, o);
     + }
     + 
     +-int diff_queue_is_empty(void)
     ++int diff_queue_is_empty(struct diff_options *o)
     + {
     + 	struct diff_queue_struct *q = &diff_queued_diff;
     + 	int i;
     ++
     ++	if (o->additional_path_headers &&
     ++	    !strmap_empty(o->additional_path_headers))
     ++		return 0;
     + 	for (i = 0; i < q->nr; i++)
     + 		if (!diff_unmodified_pair(q->queue[i]))
     + 			return 0;
     +@@ diff.c: void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc)
     + 		warning(_(rename_limit_advice), varname, needed);
     + }
     + 
     ++static void create_filepairs_for_header_only_notifications(struct diff_options *o)
     ++{
     ++	struct strset present;
     ++	struct diff_queue_struct *q = &diff_queued_diff;
     ++	struct hashmap_iter iter;
     ++	struct strmap_entry *e;
     ++	int i;
     ++
     ++	strset_init_with_options(&present, /*pool*/ NULL, /*strdup*/ 0);
     ++
     ++	/*
     ++	 * Find out which paths exist in diff_queued_diff, preferring
     ++	 * one->path for any pair that has multiple paths.
     ++	 */
     ++	for (i = 0; i < q->nr; i++) {
     ++		struct diff_filepair *p = q->queue[i];
     ++		char *path = p->one->path ? p->one->path : p->two->path;
     ++
     ++		if (strmap_contains(o->additional_path_headers, path))
     ++			strset_add(&present, path);
     ++	}
     ++
     ++	/*
     ++	 * Loop over paths in additional_path_headers; for each NOT already
     ++	 * in diff_queued_diff, create a synthetic filepair and insert that
     ++	 * into diff_queued_diff.
     ++	 */
     ++	strmap_for_each_entry(o->additional_path_headers, &iter, e) {
     ++		if (!strset_contains(&present, e->key)) {
     ++			struct diff_filespec *one, *two;
     ++			struct diff_filepair *p;
     ++
     ++			one = alloc_filespec(e->key);
     ++			two = alloc_filespec(e->key);
     ++			fill_filespec(one, null_oid(), 0, 0);
     ++			fill_filespec(two, null_oid(), 0, 0);
     ++			p = diff_queue(q, one, two);
     ++			p->status = DIFF_STATUS_MODIFIED;
     ++		}
     ++	}
     ++
     ++	/* Re-sort the filepairs */
     ++	diffcore_fix_diff_index();
     ++
     ++	/* Cleanup */
     ++	strset_clear(&present);
     ++}
     ++
     + static void diff_flush_patch_all_file_pairs(struct diff_options *o)
     + {
     + 	int i;
     +@@ diff.c: static void diff_flush_patch_all_file_pairs(struct diff_options *o)
     + 	if (o->color_moved)
     + 		o->emitted_symbols = &esm;
     + 
     ++	if (o->additional_path_headers)
     ++		create_filepairs_for_header_only_notifications(o);
     ++
     + 	for (i = 0; i < q->nr; i++) {
     + 		struct diff_filepair *p = q->queue[i];
     + 		if (check_pair_status(p))
     +@@ diff.c: void diff_flush(struct diff_options *options)
     + 	 * Order: raw, stat, summary, patch
     + 	 * or:    name/name-status/checkdiff (other bits clear)
     + 	 */
     +-	if (!q->nr)
     ++	if (!q->nr && !options->additional_path_headers)
     + 		goto free_queue;
     + 
     + 	if (output_format & (DIFF_FORMAT_RAW |
      
       ## diff.h ##
      @@ diff.h: struct diff_options {
     @@ diff.h: struct diff_options {
       
       	int no_free;
       };
     +@@ diff.h: void diffcore_fix_diff_index(void);
     + "                show all files diff when -S is used and hit is found.\n" \
     + "  -a  --text    treat all files as text.\n"
     + 
     +-int diff_queue_is_empty(void);
     ++int diff_queue_is_empty(struct diff_options*);
     + void diff_flush(struct diff_options*);
     + void diff_free(struct diff_options*);
     + void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc);
     +
     + ## log-tree.c ##
     +@@ log-tree.c: int log_tree_diff_flush(struct rev_info *opt)
     + 	opt->shown_dashes = 0;
     + 	diffcore_std(&opt->diffopt);
     + 
     +-	if (diff_queue_is_empty()) {
     ++	if (diff_queue_is_empty(&opt->diffopt)) {
     + 		int saved_fmt = opt->diffopt.output_format;
     + 		opt->diffopt.output_format = DIFF_FORMAT_NO_OUTPUT;
     + 		diff_flush(&opt->diffopt);
  5:  dd5461d45de !  8:  4cc53c55a6e merge-ort: make path_messages available to external callers
     @@ Metadata
      Author: Elijah Newren <newren@gmail.com>
      
       ## Commit message ##
     -    merge-ort: make path_messages available to external callers
     +    show, log: include conflict/warning messages in --remerge-diff headers
      
     -    merge-ort is designed to be more flexible so that it could be called as
     -    more of a library function.  Part of that design is not writing to the
     -    working tree or index unless and until requested.  Part of it is
     -    returning tree objects (rather than creating commits and making them
     -    part of HEAD), and allowing callers to do their own special thing with
     -    that merged tree.  Along the same lines, we want to enable callers to do
     -    something special with output messages (conflicts and other warnings)
     -    besides just automatically displaying on stdout/stderr.  Do so by making
     -    the output path messages accessible via a new member of struct
     -    merge_result named path_messages.
     +    Conflicts such as modify/delete, rename/rename, or file/directory are
     +    not representable via content conflict markers, and the normal output
     +    messages notifying users about these were dropped with --remerge-diff.
     +    While we don't want these messages randomly shown before the commit
     +    and diff headers, we do want them to still be shown; include them as
     +    part of the diff headers instead.
      
          Signed-off-by: Elijah Newren <newren@gmail.com>
      
     + ## log-tree.c ##
     +@@ log-tree.c: static int do_remerge_diff(struct rev_info *opt,
     + 	/* Setup merge options */
     + 	init_merge_options(&o, the_repository);
     + 	o.show_rename_progress = 0;
     ++	o.record_conflict_msgs_as_headers = 1;
     + 
     + 	ctx.abbrev = DEFAULT_ABBREV;
     + 	format_commit_message(parent1, "%h (%s)", &parent1_desc, &ctx);
     +@@ log-tree.c: static int do_remerge_diff(struct rev_info *opt,
     + 	merge_incore_recursive(&o, bases, parent1, parent2, &res);
     + 
     + 	/* Show the diff */
     ++	opt->diffopt.additional_path_headers = res.path_messages;
     + 	diff_tree_oid(&res.tree->object.oid, oid, "", &opt->diffopt);
     + 	log_tree_diff_flush(opt);
     + 
     + 	/* Cleanup */
     ++	opt->diffopt.additional_path_headers = NULL;
     + 	strbuf_release(&parent1_desc);
     + 	strbuf_release(&parent2_desc);
     + 	merge_finalize(&o, &res);
     +
       ## merge-ort.c ##
      @@ merge-ort.c: redo:
       	trace2_region_leave("merge", "process_entries", opt->repo);
     @@ merge-ort.h: struct merge_result {
       	/*
       	 * Additional metadata used by merge_switch_to_result() or future calls
       	 * to merge_incore_*().  Includes data needed to update the index (if
     +
     + ## t/t4069-remerge-diff.sh ##
     +@@ t/t4069-remerge-diff.sh: test_description='remerge-diff handling'
     + 
     + . ./test-lib.sh
     + 
     ++# --remerge-diff uses ort under the hood regardless of setting.  However,
     ++# we set up a file/directory conflict beforehand, and the different backends
     ++# handle the conflict differently, which would require separate code paths
     ++# to resolve.  There's not much point in making the code uglier to do that,
     ++# though, when the real thing we are testing (--remerge-diff) will hardcode
     ++# calls directly into the merge-ort API anyway.  So just force the use of
     ++# ort on the setup too.
     ++GIT_TEST_MERGE_ALGORITHM=ort
     ++
     + test_expect_success 'setup basic merges' '
     + 	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
     + 	git add numbers &&
     +@@ t/t4069-remerge-diff.sh: test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
     + 	git log -1 --oneline ab_resolution >tmp &&
     + 	cat <<-EOF >>tmp &&
     + 	diff --git a/numbers b/numbers
     ++	CONFLICT (content): Merge conflict in numbers
     + 	index a1fb731..6875544 100644
     + 	--- a/numbers
     + 	+++ b/numbers
     +@@ t/t4069-remerge-diff.sh: test_expect_success 'remerge-diff with both a resolved conflict and an unrelated
     + 	test_cmp expect actual
     + '
     + 
     ++test_expect_success 'setup non-content conflicts' '
     ++	git switch --orphan base &&
     ++
     ++	test_write_lines 1 2 3 4 5 6 7 8 9 >numbers &&
     ++	test_write_lines a b c d e f g h i >letters &&
     ++	test_write_lines in the way >content &&
     ++	git add numbers letters content &&
     ++	git commit -m base &&
     ++
     ++	git branch side1 &&
     ++	git branch side2 &&
     ++
     ++	git checkout side1 &&
     ++	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
     ++	git mv letters letters_side1 &&
     ++	git mv content file_or_directory &&
     ++	git add numbers &&
     ++	git commit -m side1 &&
     ++
     ++	git checkout side2 &&
     ++	git rm numbers &&
     ++	git mv letters letters_side2 &&
     ++	mkdir file_or_directory &&
     ++	echo hello >file_or_directory/world &&
     ++	git add file_or_directory/world &&
     ++	git commit -m side2 &&
     ++
     ++	git checkout -b resolution side1 &&
     ++	test_must_fail git merge side2 &&
     ++	test_write_lines 1 2 three 4 5 6 7 8 9 >numbers &&
     ++	git add numbers &&
     ++	git add letters_side1 &&
     ++	git rm letters &&
     ++	git rm letters_side2 &&
     ++	git add file_or_directory~HEAD &&
     ++	git mv file_or_directory~HEAD wanted_content &&
     ++	git commit -m resolved
     ++'
     ++
     ++test_expect_success 'remerge-diff with non-content conflicts' '
     ++	git log -1 --oneline resolution >tmp &&
     ++	cat <<-EOF >>tmp &&
     ++	diff --git a/file_or_directory~HASH (side1) b/wanted_content
     ++	similarity index 100%
     ++	rename from file_or_directory~HASH (side1)
     ++	rename to wanted_content
     ++	CONFLICT (file/directory): directory in the way of file_or_directory from HASH (side1); moving it to file_or_directory~HASH (side1) instead.
     ++	diff --git a/letters b/letters
     ++	CONFLICT (rename/rename): letters renamed to letters_side1 in HASH (side1) and to letters_side2 in HASH (side2).
     ++	diff --git a/letters_side2 b/letters_side2
     ++	deleted file mode 100644
     ++	index b236ae5..0000000
     ++	--- a/letters_side2
     ++	+++ /dev/null
     ++	@@ -1,9 +0,0 @@
     ++	-a
     ++	-b
     ++	-c
     ++	-d
     ++	-e
     ++	-f
     ++	-g
     ++	-h
     ++	-i
     ++	diff --git a/numbers b/numbers
     ++	CONFLICT (modify/delete): numbers deleted in HASH (side2) and modified in HASH (side1).  Version HASH (side1) of numbers left in tree.
     ++	EOF
     ++	# We still have some sha1 hashes above; rip them out so test works
     ++	# with sha256
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect &&
     ++
     ++	git show --oneline --remerge-diff resolution >tmp &&
     ++	sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual &&
     ++	test_cmp expect actual
     ++'
     ++
     + test_done
  9:  4f21969e357 <  -:  ----------- doc/diff-options: explain the new --remerge-diff option

Comments

Ævar Arnfjörð Bjarmason Dec. 26, 2021, 9:52 p.m. UTC | #1
On Sat, Dec 25 2021, Elijah Newren via GitGitGadget wrote:

> === FURTHER BACKGROUND (original cover letter material) ==
>
> Here are some example commits you can try this out on (with git show
> --remerge-diff $COMMIT):
>
>  * git.git conflicted merge: 07601b5b36
>  * git.git non-conflicted change: bf04590ecd
>  * linux.git conflicted merge: eab3540562fb
>  * linux.git non-conflicted change: 223cea6a4f05
>
> Many more can be found by just running git log --merges --remerge-diff in
> your repository of choice and searching for diffs (most merges tend to be
> clean and unmodified and thus produce no diff but a search of '^diff' in the
> log output tends to find the examples nicely).
>
> Some basic high level details about this new option:
>
>  * This option is most naturally compared to --cc, though the output seems
>    to be much more understandable to most users than --cc output.
>  * Since merges are often clean and unmodified, this new option results in
>    an empty diff for most merges.
>  * This new option shows things like the removal of conflict markers, which
>    hunks users picked from the various conflicted sides to keep or remove,
>    and shows changes made outside of conflict markers (which might reflect
>    changes needed to resolve semantic conflicts or cleanups of e.g.
>    compilation warnings or other additional changes an integrator felt
>    belonged in the merged result).
>  * This new option does not (currently) work for octopus merges, since
>    merge-ort is specific to two-parent merges[1].
>  * This option will not work on a read-only or full filesystem[2].
>  * We discussed this capability at Git Merge 2020, and one of the
>    suggestions was doing a periodic git gc --auto during the operation (due
>    to potential new blobs and trees created during the operation). I found a
>    way to avoid that; see [2].
>  * This option is faster than you'd probably expect; it handles 33.5 merge
>    commits per second in linux.git on my computer; see below.
>
> In regards to the performance point above, the timing for running the
> following command:
>
> time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l

I've been trying to come up with some other useful recipies for this new
option (which is already very useful, thanks!)

Some of these (if correct) are suggestions for incorporating into the
(now rather sparse) documentation. I.e. walking users through how to use
this, and how (if at all) it combines with other options.

I wanted to find all merges between "master".."seen" for which Junio's
had to resolve a conflict, a naïve version is:

    $ git log --oneline --remerge-diff -p --min-parents=2 origin/master..origin/seen|grep ^diff -B1 | grep Merge
    [...]

But I found that this new option nicely integrates with --diff-filter,
i.e. we'll end up showing a diff, and the diff machinery allows you to
to filter on it.

It seems to me like all the diffs you show fall under "M", so for
master..seen (2ae0a9cb829..61055c2920d) this is equivalent (and the
output is the same as the above):

    $ git -P log --oneline --remerge-diff --no-patch --min-parents=2 --diff-filter=M origin/master..origin/seen 
    95daa54b1c3 Merge branch 'hn/reftable-fixes' into seen
    26c4c09dd34 Merge branch 'gc/fetch-negotiate-only-early-return' into seen
    e3dc8d073f6 Merge branch 'gc/branch-recurse-submodules' into seen
    aeada898196 Merge branch 'js/branch-track-inherit' into seen
    4dd30e0da45 Merge branch 'jh/builtin-fsmonitor-part2' into seen
    337743b17d0 Merge branch 'ab/config-based-hooks-2' into seen
    261672178c0 Merge branch 'pw/fix-some-issues-in-reset-head' into seen
    1296d35b041 Merge branch 'ms/customizable-ident-expansion' into seen
    7a3d7d05126 Merge branch 'ja/i18n-similar-messages' into seen
    eda714bb8bc Merge branch 'tb/midx-bitmap-corruption-fix' into seen
    ba02295e3f8 Merge branch 'jh/p4-human-unit-numbers' into jch
    751773fc38b Merge branch 'es/test-chain-lint' into jch
    ec17879f495 Merge branch 'tb/cruft-packs' into tb/midx-bitmap-corruption-fix

However for "origin/master..origin/next" (next = 510f9eba9a2 currently)
we'll oddly show this with "-p":
    
    9af51fd1d0d Sync with 'master'
    diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
    CONFLICT (content): Merge conflict in t/lib-gpg.sh
    d6f56f3248e Merge branch 'es/test-chain-lint' into next
    diff --git a/t/t4126-apply-empty.sh b/t/t4126-apply-empty.sh
    CONFLICT (content): Merge conflict in t/t4126-apply-empty.sh
    index 996c93329c6..33860d38290 100755
    --- a/t/t4126-apply-empty.sh
    +++ b/t/t4126-apply-empty.sh
    [...]

The "oddly" applying only to that "9af51fd1d0d Sync with 'master'", not
the second d6f56f3248e, which shows the sort of conflict I'd expect. The
two-line "diff" of:

    diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
    CONFLICT (content): Merge conflict in t/lib-gpg.sh

Shows up with -p --remerge-diff, not a mere -p. I also tried the other
--diff-merges=* options, that behavior is new in
--diff-merges=remerge. Is this a bug?

My local build also has a --pickaxe-patch option. It's something I
submitted on-list before[1] and have been meaning to re-roll.

I'm discussing it here because it skips the stripping of the "+ " and "-
" prefixes under -G<regex> and allows you to search through the -U<n>
context. With that I'm able to do:

    git log --oneline --remerge-diff -p --min-parents=2 --pickaxe-patch -G'^\+' --diff-filter=M origin/master..origin/seen

I.e. on top of the above filter only show those diffs that have
additions. FAICT the conflicting diffs where the committer of the merge
conflict picked one side or the other will only have "-" lines".

So those diffs that have additions look to be those where the person
doing the merge needed to combine the two.

Well, usually. E.g. 26c4c09dd34 (Merge branch
'gc/fetch-negotiate-only-early-return' into seen, 2021-12-25) in that
range shows that isn't strictly true. Most such deletion-only diffs are
less interesting in picking one side or the other of the conflict, but
that one combines the two:
    
    -<<<<<<< d3419aac9f4 (Merge branch 'pw/add-p-hunk-split-fix' into seen)
                            warning(_("protocol does not support --negotiate-only, exiting"));
    -                       return 1;
    -=======
    -                       warning(_("Protocol does not support --negotiate-only, exiting."));
                            result = 1;
                            goto cleanup;
    ->>>>>>> 495e8601f28 (builtin/fetch: die on --negotiate-only and --recurse-submodules)

Which I guess is partially commentary and partially a request (either
for this series, or some follow-up) for something like a
--remerge-diff-filter option. I.e. it would be very useful to be able to
filter on some combination of:

 * Which side(s) of the conflict(s) were picked, or a combination?
 * Is there "new work" in the diff to resolve the conflict?
   AFIACT this will always mean we'll have "+ " lines.

Or maybe that's not useful at all, and just -G<rx> (maybe combined with
my --pickaxe-patch) will cover it?

1. https://lore.kernel.org/git/20190424152215.16251-3-avarab@gmail.com/
Elijah Newren Dec. 27, 2021, 9:11 p.m. UTC | #2
On Sun, Dec 26, 2021 at 2:28 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Sat, Dec 25 2021, Elijah Newren via GitGitGadget wrote:
>
> > === FURTHER BACKGROUND (original cover letter material) ==
> >
> > Here are some example commits you can try this out on (with git show
> > --remerge-diff $COMMIT):
> >
> >  * git.git conflicted merge: 07601b5b36
> >  * git.git non-conflicted change: bf04590ecd
> >  * linux.git conflicted merge: eab3540562fb
> >  * linux.git non-conflicted change: 223cea6a4f05
> >
> > Many more can be found by just running git log --merges --remerge-diff in
> > your repository of choice and searching for diffs (most merges tend to be
> > clean and unmodified and thus produce no diff but a search of '^diff' in the
> > log output tends to find the examples nicely).
> >
> > Some basic high level details about this new option:
> >
> >  * This option is most naturally compared to --cc, though the output seems
> >    to be much more understandable to most users than --cc output.
> >  * Since merges are often clean and unmodified, this new option results in
> >    an empty diff for most merges.
> >  * This new option shows things like the removal of conflict markers, which
> >    hunks users picked from the various conflicted sides to keep or remove,
> >    and shows changes made outside of conflict markers (which might reflect
> >    changes needed to resolve semantic conflicts or cleanups of e.g.
> >    compilation warnings or other additional changes an integrator felt
> >    belonged in the merged result).
> >  * This new option does not (currently) work for octopus merges, since
> >    merge-ort is specific to two-parent merges[1].
> >  * This option will not work on a read-only or full filesystem[2].
> >  * We discussed this capability at Git Merge 2020, and one of the
> >    suggestions was doing a periodic git gc --auto during the operation (due
> >    to potential new blobs and trees created during the operation). I found a
> >    way to avoid that; see [2].
> >  * This option is faster than you'd probably expect; it handles 33.5 merge
> >    commits per second in linux.git on my computer; see below.
> >
> > In regards to the performance point above, the timing for running the
> > following command:
> >
> > time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l
>
> I've been trying to come up with some other useful recipies for this new
> option (which is already very useful, thanks!)

I'm glad you like it.  :-)

> Some of these (if correct) are suggestions for incorporating into the
> (now rather sparse) documentation. I.e. walking users through how to use
> this, and how (if at all) it combines with other options.
>
> I wanted to find all merges between "master".."seen" for which Junio's
> had to resolve a conflict, a naïve version is:
>
>     $ git log --oneline --remerge-diff -p --min-parents=2 origin/master..origin/seen|grep ^diff -B1 | grep Merge
>     [...]

I think the naive version is
  $ git log --remerge-diff --min-parents=2 origin/master..origin/seen
  <search for "^diff" using your pager's search functionality>

Where the "--min-parents=2 origin/master..origin/seen" comes from your
problem description ("find all merges between master..seen").

You can add --oneline to format it, though it's an orthogonal concern.
Also, adding -p is unnecessary: --remerge-diff, like --cc, implies -p.

> But I found that this new option nicely integrates with --diff-filter,
> i.e. we'll end up showing a diff, and the diff machinery allows you to
> to filter on it.
>
> It seems to me like all the diffs you show fall under "M", so for

Yes, the diffs I happened to pick all fell under "M", but by no means
should you rely on that happening for all merges in history.  For
example, make a new merge commit, then add a completely new file (or
delete a file, or rename a file, or copy a file, or change its
mode/type), stage the new/deleted/renamed/copied/changed file, and run
"git commit --amend".

So, although --diff-filter=M can be interesting, I would not rely on it.

> master..seen (2ae0a9cb829..61055c2920d) this is equivalent (and the
> output is the same as the above):
>
>     $ git -P log --oneline --remerge-diff --no-patch --min-parents=2 --diff-filter=M origin/master..origin/seen
>     95daa54b1c3 Merge branch 'hn/reftable-fixes' into seen
>     26c4c09dd34 Merge branch 'gc/fetch-negotiate-only-early-return' into seen
>     e3dc8d073f6 Merge branch 'gc/branch-recurse-submodules' into seen
>     aeada898196 Merge branch 'js/branch-track-inherit' into seen
>     4dd30e0da45 Merge branch 'jh/builtin-fsmonitor-part2' into seen
>     337743b17d0 Merge branch 'ab/config-based-hooks-2' into seen
>     261672178c0 Merge branch 'pw/fix-some-issues-in-reset-head' into seen
>     1296d35b041 Merge branch 'ms/customizable-ident-expansion' into seen
>     7a3d7d05126 Merge branch 'ja/i18n-similar-messages' into seen
>     eda714bb8bc Merge branch 'tb/midx-bitmap-corruption-fix' into seen
>     ba02295e3f8 Merge branch 'jh/p4-human-unit-numbers' into jch
>     751773fc38b Merge branch 'es/test-chain-lint' into jch
>     ec17879f495 Merge branch 'tb/cruft-packs' into tb/midx-bitmap-corruption-fix
>
> However for "origin/master..origin/next" (next = 510f9eba9a2 currently)
> we'll oddly show this with "-p":
>
>     9af51fd1d0d Sync with 'master'
>     diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>     CONFLICT (content): Merge conflict in t/lib-gpg.sh
>     d6f56f3248e Merge branch 'es/test-chain-lint' into next
>     diff --git a/t/t4126-apply-empty.sh b/t/t4126-apply-empty.sh
>     CONFLICT (content): Merge conflict in t/t4126-apply-empty.sh
>     index 996c93329c6..33860d38290 100755
>     --- a/t/t4126-apply-empty.sh
>     +++ b/t/t4126-apply-empty.sh
>     [...]
>
> The "oddly" applying only to that "9af51fd1d0d Sync with 'master'", not
> the second d6f56f3248e, which shows the sort of conflict I'd expect. The
> two-line "diff" of:
>
>     diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>     CONFLICT (content): Merge conflict in t/lib-gpg.sh
>
> Shows up with -p --remerge-diff, not a mere -p. I also tried the other
> --diff-merges=* options, that behavior is new in
> --diff-merges=remerge. Is this a bug?

Ugh, this is related to my comment elsewhere that conflicts from inner
merges are not nicely differentiated.  If I also apply my other series
(which has not yet been submitted), this instead appears as follows:

$ git show --oneline --remerge-diff 9af51fd1d0d
9af51fd1d0 Sync with 'master'
diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
  From inner merge:  CONFLICT (content): Merge conflict in t/lib-gpg.sh

and the addition of the "From inner merge: " text makes it clearer why
that line appears.  This is an interesting case where a conflict
notice _only_ appears in the inner merge (i.e. the merge of merge
bases), which means that both sides on the outer merge changed the
relevant portion of the file in the same way, so the outer merge had
no conflict.

However, instead of trying to differentiate messages from inner
merges, I think for --remerge-diff's purposes we should just drop all
notices that come from the inner merges.  Those conflict notices might
be helpful when initially resolving a merge, but at the --remerge-diff
level, they're more likely to be distracting than helpful.

> My local build also has a --pickaxe-patch option. It's something I
> submitted on-list before[1] and have been meaning to re-roll.
>
> I'm discussing it here because it skips the stripping of the "+ " and "-
> " prefixes under -G<regex> and allows you to search through the -U<n>
> context. With that I'm able to do:
>
>     git log --oneline --remerge-diff -p --min-parents=2 --pickaxe-patch -G'^\+' --diff-filter=M origin/master..origin/seen
>
> I.e. on top of the above filter only show those diffs that have
> additions. FAICT the conflicting diffs where the committer of the merge
> conflict picked one side or the other will only have "-" lines".
>
> So those diffs that have additions look to be those where the person
> doing the merge needed to combine the two.
>
> Well, usually. E.g. 26c4c09dd34 (Merge branch
> 'gc/fetch-negotiate-only-early-return' into seen, 2021-12-25) in that
> range shows that isn't strictly true. Most such deletion-only diffs are
> less interesting in picking one side or the other of the conflict, but
> that one combines the two:
>
>     -<<<<<<< d3419aac9f4 (Merge branch 'pw/add-p-hunk-split-fix' into seen)
>                             warning(_("protocol does not support --negotiate-only, exiting"));
>     -                       return 1;
>     -=======
>     -                       warning(_("Protocol does not support --negotiate-only, exiting."));
>                             result = 1;
>                             goto cleanup;
>     ->>>>>>> 495e8601f28 (builtin/fetch: die on --negotiate-only and --recurse-submodules)
>
> Which I guess is partially commentary and partially a request (either
> for this series, or some follow-up) for something like a
> --remerge-diff-filter option. I.e. it would be very useful to be able to
> filter on some combination of:
>
>  * Which side(s) of the conflict(s) were picked, or a combination?
>  * Is there "new work" in the diff to resolve the conflict?
>    AFIACT this will always mean we'll have "+ " lines.

Do any of the following count as "new work"? :

  * the deletion of a file (perhaps one that had no conflict but was
deleted anyway)
  * mode changes (again, perhaps on files that had no conflict)
  * renames of files/directories?

If so, searching for "^+" lines might be insufficient, but it depends
on what you mean by new work.

> Or maybe that's not useful at all, and just -G<rx> (maybe combined with
> my --pickaxe-patch) will cover it?

I'd rather wait until we have a good idea of the potential range of
usecases before adding a filter.  (And I think for now, the -G and
--pickaxe-patch are probably good enough for this usecase.)  These
particular usecases you point out are interesting; thanks for
detailing them.  Here's some others to consider:

  * Finding out when text was added or removed: `git log
--remerge-diff -S<text>` (note that with only -p instead of
--remerge-diff, that command will annoyingly misses cases where a
merge introduced or removed the text)
  * Finding out how a merge differed from one run with some
non-default options (e.g. `git show --remerge-diff -Xours` or `git
show --remerge-diff -Xno-space-change`; although show doesn't take -X
options so this is just an idea at this point)
  * Finding out how a merge would have differed had it been run with
different options (so instead of comparing a remerge to the merge
recorded in history, compare one remerge with default options with a
different merge that uses e.g. -Xno-space-change)

Also, I've got a follow-up series that also introduces a
--remerge-diff-only flag which:
  * For single parent commits that cannot be identified as a revert or
cherry-pick, do not show a diff.
  * For single parent commits that can be identified as a revert or
cherry-pick, instead of showing a diff against the parent of the
commit, redo the revert or cherry-pick in memory and show a diff
against that.
  * For merge commits, act the same as --remerge-diff
Johannes Altmanninger Dec. 28, 2021, 10:55 a.m. UTC | #3
On Sat, Dec 25, 2021 at 07:59:11AM +0000, Elijah Newren via GitGitGadget wrote:
> Here are some example commits you can try this out on (with git show
> --remerge-diff $COMMIT):
> 
>  * git.git conflicted merge: 07601b5b36
>  * git.git non-conflicted change: bf04590ecd
>  * linux.git conflicted merge: eab3540562fb
>  * linux.git non-conflicted change: 223cea6a4f05
> 
> Many more can be found by just running git log --merges --remerge-diff in
> your repository of choice and searching for diffs (most merges tend to be
> clean and unmodified and thus produce no diff but a search of '^diff' in the
> log output tends to find the examples nicely).
> 
> Some basic high level details about this new option:
> 
>  * This option is most naturally compared to --cc, though the output seems
>    to be much more understandable to most users than --cc output.

Agreed. --cc is *simple* but I'm more comfortable reading conflict markers
from --remerge-diff, since I'm used to that.  So at least for content
conflicts it looks simpler.
Ævar Arnfjörð Bjarmason Jan. 10, 2022, 3:48 p.m. UTC | #4
On Mon, Dec 27 2021, Elijah Newren wrote:

> On Sun, Dec 26, 2021 at 2:28 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> On Sat, Dec 25 2021, Elijah Newren via GitGitGadget wrote:
>>
>> > === FURTHER BACKGROUND (original cover letter material) ==
>> >
>> > Here are some example commits you can try this out on (with git show
>> > --remerge-diff $COMMIT):
>> >
>> >  * git.git conflicted merge: 07601b5b36
>> >  * git.git non-conflicted change: bf04590ecd
>> >  * linux.git conflicted merge: eab3540562fb
>> >  * linux.git non-conflicted change: 223cea6a4f05
>> >
>> > Many more can be found by just running git log --merges --remerge-diff in
>> > your repository of choice and searching for diffs (most merges tend to be
>> > clean and unmodified and thus produce no diff but a search of '^diff' in the
>> > log output tends to find the examples nicely).
>> >
>> > Some basic high level details about this new option:
>> >
>> >  * This option is most naturally compared to --cc, though the output seems
>> >    to be much more understandable to most users than --cc output.
>> >  * Since merges are often clean and unmodified, this new option results in
>> >    an empty diff for most merges.
>> >  * This new option shows things like the removal of conflict markers, which
>> >    hunks users picked from the various conflicted sides to keep or remove,
>> >    and shows changes made outside of conflict markers (which might reflect
>> >    changes needed to resolve semantic conflicts or cleanups of e.g.
>> >    compilation warnings or other additional changes an integrator felt
>> >    belonged in the merged result).
>> >  * This new option does not (currently) work for octopus merges, since
>> >    merge-ort is specific to two-parent merges[1].
>> >  * This option will not work on a read-only or full filesystem[2].
>> >  * We discussed this capability at Git Merge 2020, and one of the
>> >    suggestions was doing a periodic git gc --auto during the operation (due
>> >    to potential new blobs and trees created during the operation). I found a
>> >    way to avoid that; see [2].
>> >  * This option is faster than you'd probably expect; it handles 33.5 merge
>> >    commits per second in linux.git on my computer; see below.
>> >
>> > In regards to the performance point above, the timing for running the
>> > following command:
>> >
>> > time git log --min-parents=2 --max-parents=2 $DIFF_FLAG | wc -l
>>
>> I've been trying to come up with some other useful recipies for this new
>> option (which is already very useful, thanks!)
>
> I'm glad you like it.  :-)
>
>> Some of these (if correct) are suggestions for incorporating into the
>> (now rather sparse) documentation. I.e. walking users through how to use
>> this, and how (if at all) it combines with other options.
>>
>> I wanted to find all merges between "master".."seen" for which Junio's
>> had to resolve a conflict, a naïve version is:
>>
>>     $ git log --oneline --remerge-diff -p --min-parents=2 origin/master..origin/seen|grep ^diff -B1 | grep Merge
>>     [...]
>
> I think the naive version is
>   $ git log --remerge-diff --min-parents=2 origin/master..origin/seen
>   <search for "^diff" using your pager's search functionality>
>
> Where the "--min-parents=2 origin/master..origin/seen" comes from your
> problem description ("find all merges between master..seen").
>
> You can add --oneline to format it, though it's an orthogonal concern.
> Also, adding -p is unnecessary: --remerge-diff, like --cc, implies -p.
>
>> But I found that this new option nicely integrates with --diff-filter,
>> i.e. we'll end up showing a diff, and the diff machinery allows you to
>> to filter on it.
>>
>> It seems to me like all the diffs you show fall under "M", so for
>
> Yes, the diffs I happened to pick all fell under "M", but by no means
> should you rely on that happening for all merges in history.  For
> example, make a new merge commit, then add a completely new file (or
> delete a file, or rename a file, or copy a file, or change its
> mode/type), stage the new/deleted/renamed/copied/changed file, and run
> "git commit --amend".
>
> So, although --diff-filter=M can be interesting, I would not rely on it.

*Nod* I hadn't thought of those (in retrospect, rather obvious) cases.

>> master..seen (2ae0a9cb829..61055c2920d) this is equivalent (and the
>> output is the same as the above):
>>
>>     $ git -P log --oneline --remerge-diff --no-patch --min-parents=2 --diff-filter=M origin/master..origin/seen
>>     95daa54b1c3 Merge branch 'hn/reftable-fixes' into seen
>>     26c4c09dd34 Merge branch 'gc/fetch-negotiate-only-early-return' into seen
>>     e3dc8d073f6 Merge branch 'gc/branch-recurse-submodules' into seen
>>     aeada898196 Merge branch 'js/branch-track-inherit' into seen
>>     4dd30e0da45 Merge branch 'jh/builtin-fsmonitor-part2' into seen
>>     337743b17d0 Merge branch 'ab/config-based-hooks-2' into seen
>>     261672178c0 Merge branch 'pw/fix-some-issues-in-reset-head' into seen
>>     1296d35b041 Merge branch 'ms/customizable-ident-expansion' into seen
>>     7a3d7d05126 Merge branch 'ja/i18n-similar-messages' into seen
>>     eda714bb8bc Merge branch 'tb/midx-bitmap-corruption-fix' into seen
>>     ba02295e3f8 Merge branch 'jh/p4-human-unit-numbers' into jch
>>     751773fc38b Merge branch 'es/test-chain-lint' into jch
>>     ec17879f495 Merge branch 'tb/cruft-packs' into tb/midx-bitmap-corruption-fix
>>
>> However for "origin/master..origin/next" (next = 510f9eba9a2 currently)
>> we'll oddly show this with "-p":
>>
>>     9af51fd1d0d Sync with 'master'
>>     diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>>     CONFLICT (content): Merge conflict in t/lib-gpg.sh
>>     d6f56f3248e Merge branch 'es/test-chain-lint' into next
>>     diff --git a/t/t4126-apply-empty.sh b/t/t4126-apply-empty.sh
>>     CONFLICT (content): Merge conflict in t/t4126-apply-empty.sh
>>     index 996c93329c6..33860d38290 100755
>>     --- a/t/t4126-apply-empty.sh
>>     +++ b/t/t4126-apply-empty.sh
>>     [...]
>>
>> The "oddly" applying only to that "9af51fd1d0d Sync with 'master'", not
>> the second d6f56f3248e, which shows the sort of conflict I'd expect. The
>> two-line "diff" of:
>>
>>     diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>>     CONFLICT (content): Merge conflict in t/lib-gpg.sh
>>
>> Shows up with -p --remerge-diff, not a mere -p. I also tried the other
>> --diff-merges=* options, that behavior is new in
>> --diff-merges=remerge. Is this a bug?
>
> Ugh, this is related to my comment elsewhere that conflicts from inner
> merges are not nicely differentiated.  If I also apply my other series
> (which has not yet been submitted), this instead appears as follows:
>
> $ git show --oneline --remerge-diff 9af51fd1d0d
> 9af51fd1d0 Sync with 'master'
> diff --git a/t/lib-gpg.sh b/t/lib-gpg.sh
>   From inner merge:  CONFLICT (content): Merge conflict in t/lib-gpg.sh
>
> and the addition of the "From inner merge: " text makes it clearer why
> that line appears.  This is an interesting case where a conflict
> notice _only_ appears in the inner merge (i.e. the merge of merge
> bases), which means that both sides on the outer merge changed the
> relevant portion of the file in the same way, so the outer merge had
> no conflict.
>
> However, instead of trying to differentiate messages from inner
> merges, I think for --remerge-diff's purposes we should just drop all
> notices that come from the inner merges.  Those conflict notices might
> be helpful when initially resolving a merge, but at the --remerge-diff
> level, they're more likely to be distracting than helpful.

Thanks. I see that v3 addresses this. Will try it out...

>> My local build also has a --pickaxe-patch option. It's something I
>> submitted on-list before[1] and have been meaning to re-roll.
>>
>> I'm discussing it here because it skips the stripping of the "+ " and "-
>> " prefixes under -G<regex> and allows you to search through the -U<n>
>> context. With that I'm able to do:
>>
>>     git log --oneline --remerge-diff -p --min-parents=2 --pickaxe-patch -G'^\+' --diff-filter=M origin/master..origin/seen
>>
>> I.e. on top of the above filter only show those diffs that have
>> additions. FAICT the conflicting diffs where the committer of the merge
>> conflict picked one side or the other will only have "-" lines".
>>
>> So those diffs that have additions look to be those where the person
>> doing the merge needed to combine the two.
>>
>> Well, usually. E.g. 26c4c09dd34 (Merge branch
>> 'gc/fetch-negotiate-only-early-return' into seen, 2021-12-25) in that
>> range shows that isn't strictly true. Most such deletion-only diffs are
>> less interesting in picking one side or the other of the conflict, but
>> that one combines the two:
>>
>>     -<<<<<<< d3419aac9f4 (Merge branch 'pw/add-p-hunk-split-fix' into seen)
>>                             warning(_("protocol does not support --negotiate-only, exiting"));
>>     -                       return 1;
>>     -=======
>>     -                       warning(_("Protocol does not support --negotiate-only, exiting."));
>>                             result = 1;
>>                             goto cleanup;
>>     ->>>>>>> 495e8601f28 (builtin/fetch: die on --negotiate-only and --recurse-submodules)
>>
>> Which I guess is partially commentary and partially a request (either
>> for this series, or some follow-up) for something like a
>> --remerge-diff-filter option. I.e. it would be very useful to be able to
>> filter on some combination of:
>>
>>  * Which side(s) of the conflict(s) were picked, or a combination?
>>  * Is there "new work" in the diff to resolve the conflict?
>>    AFIACT this will always mean we'll have "+ " lines.
>
> Do any of the following count as "new work"? :
>
>   * the deletion of a file (perhaps one that had no conflict but was
> deleted anyway)
>   * mode changes (again, perhaps on files that had no conflict)
>   * renames of files/directories?
>
> If so, searching for "^+" lines might be insufficient, but it depends
> on what you mean by new work.

Yes, I think at least for the use-cases I had in mind the useful thing
is knowing which state a merge commit is in:

 A: "Vanilla" merge (i.e. no --remerge-diff output)
 B: non-"vanilla" merge" (i.e. ew have --remerge-diff output)

What I'm requesting/musing about here (which is very much in the
category of stuff we can/should do later, and shouldn't prevent this
"good enough for now" series going in) is being able to disambiguate
those "B" cases.

I.e. being able to see if their/ours side "won", some mixture of the two
etc.

>> Or maybe that's not useful at all, and just -G<rx> (maybe combined with
>> my --pickaxe-patch) will cover it?
>
> I'd rather wait until we have a good idea of the potential range of
> usecases before adding a filter.  (And I think for now, the -G and
> --pickaxe-patch are probably good enough for this usecase.)  These
> particular usecases you point out are interesting; thanks for
> detailing them.  Here's some others to consider:
>
>   * Finding out when text was added or removed: `git log
> --remerge-diff -S<text>` (note that with only -p instead of
> --remerge-diff, that command will annoyingly misses cases where a
> merge introduced or removed the text)
>   * Finding out how a merge differed from one run with some
> non-default options (e.g. `git show --remerge-diff -Xours` or `git
> show --remerge-diff -Xno-space-change`; although show doesn't take -X
> options so this is just an idea at this point)
>   * Finding out how a merge would have differed had it been run with
> different options (so instead of comparing a remerge to the merge
> recorded in history, compare one remerge with default options with a
> different merge that uses e.g. -Xno-space-change)
>
> Also, I've got a follow-up series that also introduces a
> --remerge-diff-only flag which:
>   * For single parent commits that cannot be identified as a revert or
> cherry-pick, do not show a diff.
>   * For single parent commits that can be identified as a revert or
> cherry-pick, instead of showing a diff against the parent of the
> commit, redo the revert or cherry-pick in memory and show a diff
> against that.
>   * For merge commits, act the same as --remerge-diff

*nod*