Message ID | 20200829201140.23425-1-sorganov@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Series | [v2] revision: add separate field for "-m" of "diff-index -m" |
Sergey Organov <sorganov@gmail.com> writes:

> Historically, in "diff-index -m", "-m" does not mean "do not ignore merges", but
> "match missing". Despite this, diff-index abuses 'ignore_merges' field being set
> by "-m", that in turn causes more troubles.

"causes more troubles"? When there is no trouble, and no "more" trouble, concretely mentioned, it is quite a weak justification.

There is no reason to say "historically" here, as it has been like so from the beginning of time, it still is so and it is relied upon. "diff-{files,index,tree}" are about comparing two things, and not about history (where a "merge" might influence "now we are showing this commit. which parent do we compare it with?"), so giving short-and-sweet "-m" its own meaning that is sensible within the context of "diff" was and is a perfectly sensible thing to do.

What is worth fixing is not that "-m" in diff-index means "match missing" while "-m" in log wants to mean "show merges". It is that, even though both commands use the same option parsing machinery, and the uses of these two options are mutually exclusive so there is no risk of confusion, the flag internally used to record the presence of the "em" option is not named neutrally (e.g. "revs->seen_em_option").

    The "log" family of commands and "diff" family of commands
    share the same command line parsing machinery. For the
    former, "-m" means "show merges" while for the latter it
    means "match missing". This is not a problem at the UI
    level, as "show/not show merges" is meaningless in the
    context of "diff", and similarly "match/not match missing"
    is meaningless in the context of "log".

    But there are two problems with this arrangement.

    1. the field the presence of the option on the command line
       is recorded in has to be given a name. It is currently
       called "ignore_merges", which gives an incorrect
       impression that using it for "diff" family is somehow a
       mistake, and renaming it to "match_missing" would not be
       a solution, as it will give an incorrect impression that
       "log" family is abusing it. However, naming the field to
       something neutral, e.g. "em_option", would make the code
       harder to understand.

    2. because it uses the same command line parser, giving a
       default for "diff -m" in a way that is different from the
       default for "log -m" is quite cumbersome if they use the
       same field to record it.

    Introduce a separate "match_missing" field, and flip it and
    "ignore_merges" when we see the "-m" option on the command
    line. That way, even when ignore_merges's default is
    affected by end-user configuration, the default for
    "match_missing" would not be affected.

I think the above would be in line with what you wanted to say but didn't, and I think it supports the split fairly well.

I have a very strong objection against changing the built-in default of "log -m", but I do agree that this split of the single field into two is a fairly good idea. So I do not want to be in the position that must reject this change because "log -m" and "diff-index -m" will never be on by default. Basing the justification of this change on end-user configurability would be a good way to sidestep the issue, and avoids taking this change hostage to the discussion on what should be the built-in default for "log/diff-index -m".
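The split described in the proposed commit message can be sketched in isolation. This is a toy model, not git's code: `struct toy_rev_info`, `toy_init()`, `toy_handle_opt()` and the `cfg_show_merges` knob are hypothetical names made up for illustration, and only the two field names come from the discussion above. It assumes a configuration setting may change the default of "ignore_merges" and shows that "match_missing" stays under the sole control of "-m".

```c
#include <string.h>

/* Toy stand-in for the two relevant bits of git's struct rev_info. */
struct toy_rev_info {
	unsigned int ignore_merges:1;	/* "log" family: do not diff merges */
	unsigned int match_missing:1;	/* "diff" family: "-m" was given */
};

/*
 * cfg_show_merges is a hypothetical end-user configuration value;
 * it is allowed to move the default of ignore_merges only.
 */
static void toy_init(struct toy_rev_info *revs, int cfg_show_merges)
{
	revs->ignore_merges = !cfg_show_merges;
	revs->match_missing = 0;	/* never driven by configuration */
}

/* Returns 1 if the argument was recognized, 0 otherwise. */
static int toy_handle_opt(struct toy_rev_info *revs, const char *arg)
{
	if (!strcmp(arg, "-m")) {
		/* One spelling, two meanings: record both. */
		revs->ignore_merges = 0;
		revs->match_missing = 1;
		return 1;
	}
	if (!strcmp(arg, "--no-diff-merges")) {
		/* Reverts only the "log" meaning of "-m". */
		revs->ignore_merges = 1;
		return 1;
	}
	return 0;
}
```

With this arrangement, a "diff" family caller reads only "match_missing" and a "log" family caller reads only "ignore_merges", so neither name misdescribes its user.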
Junio C Hamano <gitster@pobox.com> writes:

> Sergey Organov <sorganov@gmail.com> writes:
>
>> Historically, in "diff-index -m", "-m" does not mean "do not ignore
>> merges", but "match missing". Despite this, diff-index abuses
>> 'ignore_merges' field being set by "-m", that in turn causes more
>> troubles.
>
> "causes more troubles"? When there is no trouble, and no "more"
> trouble, concretely mentioned, it is quite a weak justification.

Well, the existing comment says "Backward compatibility wart", which sounds like trouble to me already. No?

Then, since "--[no-]diff-merges" was introduced, we have:

  $ git diff-index HEAD
  :100644 000000 4aec621a6d1a9a5892f0b4b6feb2ed329fd04bf2 0000000000000000000000000000000000000000 D	main/main.cc
  $ git diff-index -m HEAD
  $ git diff-index -m --no-diff-merges HEAD
  :100644 000000 4aec621a6d1a9a5892f0b4b6feb2ed329fd04bf2 0000000000000000000000000000000000000000 D	main/main.cc

which sounds like yet another trouble. That's why I used "more trouble" in my commit message.

If you say "compatibility wart" is not a trouble by itself, -- I'm fine with it, -- then "more" in my commit message is misplaced indeed.

> There is no reason to say "historically" here, as it has been like
> so from the beginning of time, it still is so and it is relied
> upon. "diff-{files,index,tree}" are about comparing two things, and
> not about history (where a "merge" might influence "now we are
> showing this commit. which parent do we compare it with?"), so
> giving short-and-sweet "-m" its own meaning that is sensible within
> the context of "diff" was and is a perfectly sensible thing to do.

Well, if "historically" makes you feel uncomfortable, -- I'm willing to get rid of it.

> What is worth fixing is not that "-m" in diff-index means "match
> missing" while "-m" in log wants to mean "show merges". It is that,
> even though both commands use the same option parsing machinery, and
> the uses of these two options are mutually exclusive so there is no
> risk of confusion, the flag internally used to record the presence
> of the "em" option is not named neutrally (e.g.
> "revs->seen_em_option").
>
>     The "log" family of commands and "diff" family of commands
>     share the same command line parsing machinery. For the
>     former, "-m" means "show merges" while for the latter it
>     means "match missing". This is not a problem at the UI
>     level, as "show/not show merges" is meaningless in the
>     context of "diff", and similarly "match/not match missing"
>     is meaningless in the context of "log".
>
>     But there are two problems with this arrangement.
>
>     1. the field the presence of the option on the command line
>        is recorded in has to be given a name. It is currently
>        called "ignore_merges", which gives an incorrect
>        impression that using it for "diff" family is somehow a
>        mistake, and renaming it to "match_missing" would not be
>        a solution, as it will give an incorrect impression that
>        "log" family is abusing it. However, naming the field to
>        something neutral, e.g. "em_option", would make the code
>        harder to understand.
>
>     2. because it uses the same command line parser, giving a
>        default for "diff -m" in a way that is different from the
>        default for "log -m" is quite cumbersome if they use the
>        same field to record it.
>
>     Introduce a separate "match_missing" field, and flip it and
>     "ignore_merges" when we see the "-m" option on the command
>     line. That way, even when ignore_merges's default is
>     affected by end-user configuration, the default for
>     "match_missing" would not be affected.
>
> I think the above would be in line with what you wanted to say but
> didn't, and I think it supports the split fairly well.
>
> I have a very strong objection against changing the built-in default
> of "log -m", but I do agree that this split of the single field into
> two is a fairly good idea. So I do not want to be in the position
> that must reject this change because "log -m" and "diff-index -m"
> will never be on by default. Basing the justification of this
> change on end-user configurability would be a good way to sidestep
> the issue, and avoids taking this change hostage to the discussion
> on what should be the built-in default for "log/diff-index -m".

This change has nothing to do with defaults. It is rather about correct and clear code.

I'll re-roll with a better commit message.

Thanks,
-- Sergey
Sergey Organov <sorganov@gmail.com> writes:

> $ git diff-index -m --no-diff-merges HEAD
> :100644 000000 4aec621a6d1a9a5892f0b4b6feb2ed329fd04bf2 0000000000000000000000000000000000000000 D	main/main.cc

At first glance, this looked like a good justification for this patch.

> If you say "compatibility wart" is not a trouble by itself, -- I'm fine
> with it, -- then "more" in my commit message is misplaced indeed.

Yeah, when I wrote the "compatibility wart" comment originally, I was describing "this needs tricky code because two independent options happen to share the command line parser" and nothing more.

I was not reacting to "more", by the way. I was reacting to the lack of a concrete problem description. "A '-m' option given to the 'diff-index' command can be defeated by giving '--no-diff-merges' later", which you showed above, can be a good replacement for "causes more troubles".

But in the ideal world, "--[no-]diff-merges" should be rejected as an irrelevant/unrecognised option to the "diff" family of commands (as I said in the message you are responding to, it is only relevant to the "log" family of commands, where the diff machinery is solely used to compare a commit with (some of) its parents, and in that context, what kind of special treatment, if any, is given to merge commits makes sense as an optional instruction to the command).

Splitting the field into two fields, setting both fields upon "-m" but toggling only one with longhand "--[no-]diff-merges", would allow the code to notice and make the above command line silently turn the "--[no-]diff-merges" into a no-op, so in that sense it would be a good first step. But an ideal solution would probably need to know if we are parsing for the "log" family or for the "diff" family, and error out upon seeing a "log"-only option like "--[no-]diff-merges" when checking the command line options for "diff".

> This change has nothing to do with defaults. It rather about correct and
> clear code.

OK, I misread your intention. Sorry about that.

Thanks.
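The "ideal solution" mentioned in the reply above, where the parser knows which family it is parsing for, might look roughly like the following. This is only a sketch under stated assumptions: `enum toy_family`, `struct toy_opts` and `toy_parse_opt()` are hypothetical names, nothing like them is claimed to exist in git, and only the "--no-diff-merges" spelling is handled to keep the example short.

```c
#include <string.h>

/* Hypothetical tag telling the parser which command family it serves. */
enum toy_family { TOY_FAMILY_LOG, TOY_FAMILY_DIFF };

struct toy_opts {
	unsigned int ignore_merges:1;	/* meaningful for "log" only */
	unsigned int match_missing:1;	/* meaningful for "diff" only */
};

/* Returns 0 on success, -1 if arg is not valid for this family. */
static int toy_parse_opt(struct toy_opts *o, enum toy_family family,
			 const char *arg)
{
	if (!strcmp(arg, "-m")) {
		/* Same spelling, family-specific meaning. */
		if (family == TOY_FAMILY_LOG)
			o->ignore_merges = 0;
		else
			o->match_missing = 1;
		return 0;
	}
	if (!strcmp(arg, "--no-diff-merges")) {
		/* "log"-only option: reject it for "diff" instead of ignoring it. */
		if (family != TOY_FAMILY_LOG)
			return -1;
		o->ignore_merges = 1;
		return 0;
	}
	return -1;	/* unknown option */
}
```

Compared with the two-field split alone, this variant turns "diff-index -m --no-diff-merges" into an error rather than a silent no-op, at the cost of threading the family through the parser.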
diff --git a/diff-lib.c b/diff-lib.c
index 50521e2093fc..f2aee78e7aa2 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -405,14 +405,8 @@ static void do_oneway_diff(struct unpack_trees_options *o,
 	/* if the entry is not checked out, don't examine work tree */
 	cached = o->index_only ||
 		(idx && ((idx->ce_flags & CE_VALID) || ce_skip_worktree(idx)));
-	/*
-	 * Backward compatibility wart - "diff-index -m" does
-	 * not mean "do not ignore merges", but "match_missing".
-	 *
-	 * But with the revision flag parsing, that's found in
-	 * "!revs->ignore_merges".
-	 */
-	match_missing = !revs->ignore_merges;
+
+	match_missing = revs->diff_index_match_missing;
 
 	if (cached && idx && ce_stage(idx)) {
 		struct diff_filepair *pair;
diff --git a/revision.c b/revision.c
index 96630e31867d..64b16f7d1033 100644
--- a/revision.c
+++ b/revision.c
@@ -2345,6 +2345,12 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->diffopt.flags.tree_in_recursive = 1;
 	} else if (!strcmp(arg, "-m")) {
 		revs->ignore_merges = 0;
+		/*
+		 * Backward compatibility wart - "diff-index -m" does
+		 * not mean "do not ignore merges", but "match_missing",
+		 * so set separate flag for it.
+		 */
+		revs->diff_index_match_missing = 1;
 	} else if ((argcount = parse_long_opt("diff-merges", argv, &optarg))) {
 		if (!strcmp(optarg, "off")) {
 			revs->ignore_merges = 1;
diff --git a/revision.h b/revision.h
index c1e5bcf139d7..5ae8254ffaed 100644
--- a/revision.h
+++ b/revision.h
@@ -188,6 +188,7 @@ struct rev_info {
 	unsigned int	diff:1,
 			full_diff:1,
 			show_root_diff:1,
+			diff_index_match_missing:1,
 			no_commit_id:1,
 			verbose_header:1,
 			combine_merges:1,
Historically, in "diff-index -m", "-m" does not mean "do not ignore merges", but
"match missing". Despite this, diff-index abuses 'ignore_merges' field being set
by "-m", that in turn causes more troubles.

Add separate 'diff_index_match_missing' field for diff-index to use and set it
when we encounter "-m" option. This field won't then be cleared when primary
meaning of "-m" is reverted (e.g., by "--no-diff-merges"), nor it will be
affected by future option(s) that might drive 'ignore_merges' field.

Use this new field from diff-lib:do_oneway_diff() instead of abusing
'ignore_merges' field.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
---

v2: rebased from 'maint' onto 'master'

 diff-lib.c | 10 ++--------
 revision.c |  6 ++++++
 revision.h |  1 +
 3 files changed, 9 insertions(+), 8 deletions(-)