Message ID | pull.1734.git.1731000007391.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | range-diff: optionally include merge commits' diffs in the analysis | expand |
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> writes: > Let's add corresponding support for `git range-diff`, too. This makes it > more convenient to spot differences between iterations of non-linear > contributions, where so-called "evil merges" are sometimes necessary and > need to be reviewed, too. Nice addition. As the "diff of the diff" outer diff part would be two-"blob" diff, we won't have to worry about this option causing confusions to users, unlike things like "-U8", to which layer of diffs the option would apply. Will queue. Thanks.
Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget: > +--diff-merges=<format>:: > + Instead of ignoring merge commits, generate diffs for them using the > + corresponding `--diff-merges=<format>` option of linkgit:git-log[1], > + and include them in the comparison. > ++ > +Note: Some of the formats supported by linkgit:git-log[1] make less sense in > +the context of the `range-diff` command than other formats, so choose wisely! Can we please be a bit more specific which options are usable or which are not usable? Perhaps you even mean to say that 'first-parent' is the only one that makes sense? At a minimum, we should mention that it is the one that makes most sense (if that is the case). -- Hannes
Hi Hannes On Fri, 8 Nov 2024, Johannes Sixt wrote: > Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget: > > +--diff-merges=<format>:: > > + Instead of ignoring merge commits, generate diffs for them using the > > + corresponding `--diff-merges=<format>` option of linkgit:git-log[1], > > + and include them in the comparison. > > ++ > > +Note: Some of the formats supported by linkgit:git-log[1] make less sense in > > +the context of the `range-diff` command than other formats, so choose wisely! > > Can we please be a bit more specific which options are usable or which > are not usable? There are a couple of reasons: - Most importantly: I had not even studied that question in depth when I wrote the original patch, and not even when I submitted the first iteration. In my reviews of branch thickets, I used `--diff-merges=1` and it served my needs well. Therefore I can speak to this mode, but not really to the other modes. - Which options are usable or not may lack a universally-correct answer. One person might find `--diff-merges=separate` confusing while the next person may find it quite useful. - If the documentation points out an unsuitable mode, then an obvious follow-up wish would be to reject this mode in the command, with an error message. Some users might, however, need precisely this mode, so I am loathe to do that. Maybe this compromise: Change the "Note:" to recommend the `first-parent` mode, with a bit more explanation why that is so (it is consistent with the idea that a merge is kind of a "meta patch", comprising all the merged commits' patches, which is equivalent to the first-parent diff). > Perhaps you even mean to say that 'first-parent' is the only one that > makes sense? I would not go _so_ far as to say that. Before submitting the patch yesterday, I played a little with a few modes, using the commit graph as per the new t3206 test case. What that test case range-diffs looks like this: - * - changeA - changeB ------------ \ \ \ \ conflict \ \ / \ rename - changeA - changeB ----- clean In other words, the main branch modifies a file's contents twice, the topic branch renames it first, then makes the same modifications. The clean merge simply merges the topic, which makes the main branch tree-same to the topic branch. The conflicting merge misses the tip commit of the topic branch, i.e. changeB on the renamed file. The result is tree-same to the clean merge because changeB on the non-renamed file is already part of the main branch. Out of habit, I used `--diff-merge=first-parent` in the new test case, whose output reiterates what I just said: both merges introduce the same first-parent diff (renaming the file, no other changes): 1: 1e6226b < -: ------- s/12/B/ 2: 043e738 = 1: 6ddec15 merge What about `separate`? 2: 043e738 = 1: 6ddec15 merge 1: 1e6226b ! 2: 6ddec15 s/12/B/ @@ ## Metadata ## -Author: Thomas Rast <trast@inf.ethz.ch> +Author: A U Thor <author@example.com> ## Commit message ## - s/12/B/ + merge ## renamed-file ## @@ renamed-file: A This might look confusing at first because there are two merge diffs on the right-hand side, but only one on the left-hand side. But that is all good because the diff of the clean merge between merge head and merge is empty and therefore not shown. And the `range-diff` demonstrates that the conflicting merge had to recapitulate the tip commit of the `renamed-file` topic branch. The `combined` and `dense-combined` modes produce identical output to `first-parent` in this instance, which is expected. Now, let's use `remerge`, which should be illuminating with regards to "evil merges" [*1*], as it shows what was done on top of the automatic merge: 1: 1e6226b < -: ------- s/12/B/ 2: 043e738 < -: ------- merge -: ------- > 1: 6ddec15 merge Huh. That is a bit surprising at first. Let's use a higher creation factor to ask `range-diff` to match correspondences more loosely (I used 999, which kinda forces even non-sensical pairings): 1: 1e6226b ! 1: 6ddec15 s/12/B/ @@ ## Metadata ## -Author: Thomas Rast <trast@inf.ethz.ch> +Author: A U Thor <author@example.com> ## Commit message ## - s/12/B/ + merge ## renamed-file ## + remerge CONFLICT (content): Merge conflict in renamed-file + index 3f1c0da..25ddf7a 100644 + --- renamed-file + +++ renamed-file @@ renamed-file: A 9 10 B +-<<<<<<< 2901f77 (s/12/B/):file + B +-======= -12 -+B +->>>>>>> 3ce7af6 (s/11/B/):renamed-file 13 14 15 2: 043e738 < -: ------- merge (Note that if you repeat this experiment, you will see something different due to a bug in `--remerge-diff` that I will fix separately.) Hold on, that looks wrong! It is actually not wrong, `range-diff` just finds the changeB a better pairing than the merge commit with an empty diff... So: This mode can be useful to detect where merge conflicts had to be resolved. In short, I think that in `range-diff`'s context all of these modes have merit, depending on a particular use case. It's just that `first-parent` is probably the most natural to use in the common case. > At a minimum, we should mention that it is the one that makes most sense > (if that is the case). Yeah, I'll go with that and drop saying anything about other modes. Ciao, Johannes Footnote *1*: We should really find a much better name for this than "evil merge". There is nothing evil about me having to add `#include "environment.h"` in v2.45.1^, for example. It was necessary so that the code would build. Tedious, yes, but not evil.
Johannes Sixt <j6t@kdbg.org> writes: > Am 07.11.24 um 18:20 schrieb Johannes Schindelin via GitGitGadget: >> +--diff-merges=<format>:: >> + Instead of ignoring merge commits, generate diffs for them using the >> + corresponding `--diff-merges=<format>` option of linkgit:git-log[1], >> + and include them in the comparison. >> ++ >> +Note: Some of the formats supported by linkgit:git-log[1] make less sense in >> +the context of the `range-diff` command than other formats, so choose wisely! > > Can we please be a bit more specific which options are usable or which > are not usable? Perhaps you even mean to say that 'first-parent' is the > only one that makes sense? At a minimum, we should mention that it is > the one that makes most sense (if that is the case). Good suggestion. I am curious to see how well things like "--cc" and "--remerge-diff" fare in this use case. I suspect that in a case where the original round forgot to make necessary semantic conflict resolutions (aka "evil merge") that the new round has, all of them (including the "--first-parent") would essentially show the same change, but what is shown in such a case would be more readable with "--first-parent" by treating a merge as if it were just a single parent commit making a lot of changes, iow, merely one among other commits with equal footing.
diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt index fbdbe0befeb..a964e856c3c 100644 --- a/Documentation/git-range-diff.txt +++ b/Documentation/git-range-diff.txt @@ -10,7 +10,7 @@ SYNOPSIS [verse] 'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>] [--no-dual-color] [--creation-factor=<factor>] - [--left-only | --right-only] + [--left-only | --right-only] [--diff-merges=<format>] ( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> ) [[--] <path>...] @@ -81,6 +81,14 @@ to revert to color all lines according to the outer diff markers Suppress commits that are missing from the second specified range (or the "right range" when using the `<rev1>...<rev2>` format). +--diff-merges=<format>:: + Instead of ignoring merge commits, generate diffs for them using the + corresponding `--diff-merges=<format>` option of linkgit:git-log[1], + and include them in the comparison. ++ +Note: Some of the formats supported by linkgit:git-log[1] make less sense in +the context of the `range-diff` command than other formats, so choose wisely! + --[no-]notes[=<ref>]:: This flag is passed to the `git log` program (see linkgit:git-log[1]) that generates the patches. diff --git a/builtin/range-diff.c b/builtin/range-diff.c index 1b33ab66a7b..e41719e0f0d 100644 --- a/builtin/range-diff.c +++ b/builtin/range-diff.c @@ -21,6 +21,7 @@ int cmd_range_diff(int argc, { struct diff_options diffopt = { NULL }; struct strvec other_arg = STRVEC_INIT; + struct strvec diff_merges_arg = STRVEC_INIT; struct range_diff_options range_diff_opts = { .creation_factor = RANGE_DIFF_CREATION_FACTOR_DEFAULT, .diffopt = &diffopt, @@ -36,6 +37,9 @@ int cmd_range_diff(int argc, OPT_PASSTHRU_ARGV(0, "notes", &other_arg, N_("notes"), N_("passed to 'git log'"), PARSE_OPT_OPTARG), + OPT_PASSTHRU_ARGV(0, "diff-merges", &diff_merges_arg, + N_("style"), N_("passed to 'git log'"), + PARSE_OPT_OPTARG), OPT_BOOL(0, "left-only", &left_only, N_("only emit output related to the first range")), OPT_BOOL(0, "right-only", &right_only, @@ -62,6 +66,12 @@ int cmd_range_diff(int argc, if (!simple_color) diffopt.use_color = 1; + /* If `--diff-merges` was specified, imply `--merges` */ + if (diff_merges_arg.nr) { + range_diff_opts.include_merges = 1; + strvec_pushv(&other_arg, diff_merges_arg.v); + } + for (i = 0; i < argc; i++) if (!strcmp(argv[i], "--")) { dash_dash = i; @@ -155,6 +165,7 @@ int cmd_range_diff(int argc, res = show_range_diff(range1.buf, range2.buf, &range_diff_opts); strvec_clear(&other_arg); + strvec_clear(&diff_merges_arg); strbuf_release(&range1); strbuf_release(&range2); diff --git a/range-diff.c b/range-diff.c index bbb0952264b..9e59733059b 100644 --- a/range-diff.c +++ b/range-diff.c @@ -38,7 +38,8 @@ struct patch_util { * as struct object_id (will need to be free()d). */ static int read_patches(const char *range, struct string_list *list, - const struct strvec *other_arg) + const struct strvec *other_arg, + unsigned int include_merges) { struct child_process cp = CHILD_PROCESS_INIT; struct strbuf buf = STRBUF_INIT, contents = STRBUF_INIT; @@ -49,7 +50,7 @@ static int read_patches(const char *range, struct string_list *list, size_t size; int ret = -1; - strvec_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges", + strvec_pushl(&cp.args, "log", "--no-color", "-p", "--reverse", "--date-order", "--decorate=no", "--no-prefix", "--submodule=short", /* @@ -64,6 +65,8 @@ static int read_patches(const char *range, struct string_list *list, "--pretty=medium", "--show-notes-by-default", NULL); + if (!include_merges) + strvec_push(&cp.args, "--no-merges"); strvec_push(&cp.args, range); if (other_arg) strvec_pushv(&cp.args, other_arg->v); @@ -96,11 +99,14 @@ static int read_patches(const char *range, struct string_list *list, } if (skip_prefix(line, "commit ", &p)) { + char *q; if (util) { string_list_append(list, buf.buf)->util = util; strbuf_reset(&buf); } CALLOC_ARRAY(util, 1); + if (include_merges && (q = strstr(p, " (from "))) + *q = '\0'; if (repo_get_oid(the_repository, p, &util->oid)) { error(_("could not parse commit '%s'"), p); FREE_AND_NULL(util); @@ -571,13 +577,14 @@ int show_range_diff(const char *range1, const char *range2, struct string_list branch1 = STRING_LIST_INIT_DUP; struct string_list branch2 = STRING_LIST_INIT_DUP; + unsigned int include_merges = range_diff_opts->include_merges; if (range_diff_opts->left_only && range_diff_opts->right_only) res = error(_("options '%s' and '%s' cannot be used together"), "--left-only", "--right-only"); - if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg)) + if (!res && read_patches(range1, &branch1, range_diff_opts->other_arg, include_merges)) res = error(_("could not parse log for '%s'"), range1); - if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg)) + if (!res && read_patches(range2, &branch2, range_diff_opts->other_arg, include_merges)) res = error(_("could not parse log for '%s'"), range2); if (!res) { diff --git a/range-diff.h b/range-diff.h index 2f69f6a434d..cd85000b5a0 100644 --- a/range-diff.h +++ b/range-diff.h @@ -16,6 +16,7 @@ struct range_diff_options { int creation_factor; unsigned dual_color:1; unsigned left_only:1, right_only:1; + unsigned include_merges:1; const struct diff_options *diffopt; /* may be NULL */ const struct strvec *other_arg; /* may be NULL */ }; diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh index 86010931ab6..c18a3fdab83 100755 --- a/t/t3206-range-diff.sh +++ b/t/t3206-range-diff.sh @@ -909,4 +909,20 @@ test_expect_success 'submodule changes are shown irrespective of diff.submodule' test_cmp expect actual ' +test_expect_success '--diff-merges' ' + renamed_oid=$(git rev-parse --short renamed-file) && + tree=$(git merge-tree unmodified renamed-file) && + clean=$(git commit-tree -m merge -p unmodified -p renamed-file $tree) && + clean_oid=$(git rev-parse --short $clean) && + conflict=$(git commit-tree -m merge -p unmodified -p renamed-file^ $tree) && + conflict_oid=$(git rev-parse --short $conflict) && + + git range-diff --diff-merges=1 $clean...$conflict >actual && + cat >expect <<-EOF && + 1: $renamed_oid < -: ------- s/12/B/ + 2: $clean_oid = 1: $conflict_oid merge + EOF + test_cmp expect actual +' + test_done