diff mbox series

Revert 'diff-merges: let "-m" imply "-p"'

Message ID YQyUM2uZdFBX8G0r@google.com (mailing list archive)
State New, archived
Headers show
Series Revert 'diff-merges: let "-m" imply "-p"' | expand

Commit Message

Jonathan Nieder Aug. 6, 2021, 1:45 a.m. UTC
This reverts commit f5bfcc823ba242a46e20fb6f71c9fbf7ebb222fe, which
made "git log -m" imply "--patch" by default.  The logic was that
"-m", which makes diff generation for merges perform a diff against
each parent, has no use unless I am viewing the diff, so we could save
the user some typing by turning on display of the resulting diff
automatically.  That wasn't expected to adversely affect scripts
because scripts would either be using a command like "git diff-tree"
that already emits diffs by default or would be combining -m with a

Comments

Junio C Hamano Aug. 6, 2021, 5:21 p.m. UTC | #1
Jonathan Nieder <jrnieder@gmail.com> writes:

> The problem is that although diff generation options are only relevant
> for the displayed diff, a script author can imagine them affecting
> path limiting.  For example, I might run

I am somewhat puzzled.  What does "can imagine" exactly mean and
justify this change?  A script author may imagine "git cat-file" can
be expected to meow, but the command actually does not meow and end
up disappointing the author, but that wouldn't justify a rename of
"cat-file" to something else.

> 	git log -w --format=%H -- README
>
> hoping to list commits that edited README, excluding whitespace-only
> changes.  In fact, a whitespace-only change is not TREESAME so the use
> of -w here has no effect (since we don't apply these diff generation
> flags to the diff_options struct rev_info::pruning used for this
> purpose), but the documentation suggests that it should work
>
> 	Suppose you specified foo as the <paths>. We shall call
> 	commits that modify foo !TREESAME, and the rest TREESAME. (In
> 	a diff filtered for foo, they look different and equal,
> 	respectively.)
>
> and a script author who has not tested whitespace-only changes
> wouldn't notice.

It would need to be corrected by a bugfix of either TREESAME
computation, or a documentation fix, I would think.  I fail to see
the similarity you perceive to the "-m" issue at hand, though.

> Similarly, a script author could include
>
> 	git log -m --first-parent --format=%H -- README
>
> to filter the first-parent history for commits that modified README.
> The -m is a no-op but it reflects the script author's intent.

So the expectation is with "-m" we'd give single parent commits on
the fp chain, and merges from side branches that change README, in
addition to merges from side branches that was forked way before the
README was updated on the trunk (hence had ancient README but the
merge kept the version from the trunk)?

> For
> example, until 1e20a407fe2 (stash list: stop passing "-m" to "git
> log", 2021-05-21), "git stash list" did this.

This is not a example that supports your conclusion, though.  The
reason why 288c67ca (stash: default listing to working-tree diff,
2014-08-06) added "-m" on the command line to make it:

  git log --format="%gd: %gs" -g --first-parent -m "$@" $ref_stash --

is to prepare for the users who may pass "-p" as part of the "$@";
they wil get no patches out of these merge commits that represent
stash entries otherwise, and they'd have to pass "-m -p" instead,
without the change.

> As a result, we can't safely change "-m" to imply "-p" without fear of
> breaking such scripts.  Restore the previous behavior.

So the above is *not* an example of a script that would have been
broken with this change.
Junio C Hamano Aug. 6, 2021, 5:55 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

>> Similarly, a script author could include
>>
>> 	git log -m --first-parent --format=%H -- README
>>
>> to filter the first-parent history for commits that modified README.
>> The -m is a no-op but it reflects the script author's intent.
>
> So the expectation is with "-m" we'd give single parent commits on
> the fp chain, and merges from side branches that change README, in
> addition to merges from side branches that was forked way before the
> README was updated on the trunk (hence had ancient README but the
> merge kept the version from the trunk)?
>
>> For
>> example, until 1e20a407fe2 (stash list: stop passing "-m" to "git
>> log", 2021-05-21), "git stash list" did this.
>
> This is not a example that supports your conclusion, though.  The
> reason why 288c67ca (stash: default listing to working-tree diff,
> 2014-08-06) added "-m" on the command line to make it:
>
>   git log --format="%gd: %gs" -g --first-parent -m "$@" $ref_stash --
>
> is to prepare for the users who may pass "-p" as part of the "$@";
> they wil get no patches out of these merge commits that represent
> stash entries otherwise, and they'd have to pass "-m -p" instead,
> without the change.
>
>> As a result, we can't safely change "-m" to imply "-p" without fear of
>> breaking such scripts.  Restore the previous behavior.
>
> So the above is *not* an example of a script that would have been
> broken with this change.

Sorry, I have to take 70% of the above back.  While it is *not* an
example that shows an author's intent that changes from not just
trunk but all side branches to README are to be shown, the original
left after 288c67ca (stash: default listing to working-tree diff,
2014-08-06) would have been broken by the change you are proposing
to revert.  It used to be just "if you give -p we'll show a patch"
but if we make "-m" to mean "-m -p", it does get broken.

Which is BAD.

I am inclined to take the revert, but I do not think reverting it
alone will break more things than it fixes.

For example, 1e20a407 (stash list: stop passing "-m" to "git log",
2021-05-21) that dropped "-m" must be reverted as well, no?
Jonathan Nieder Aug. 6, 2021, 7:57 p.m. UTC | #3
Junio C Hamano wrote:

> For example, 1e20a407 (stash list: stop passing "-m" to "git log",
> 2021-05-21) that dropped "-m" must be reverted as well, no?

No, that change is fine.  The "-m" doesn't have an effect one way or
another after this revert.

Thanks,
Jonathan
Jonathan Nieder Aug. 7, 2021, 1:55 a.m. UTC | #4
Hi,

Junio C Hamano wrote:

> I am somewhat puzzled.  What does "can imagine" exactly mean and
> justify this change?  A script author may imagine "git cat-file" can
> be expected to meow, but the command actually does not meow and end
> up disappointing the author, but that wouldn't justify a rename of
> "cat-file" to something else.

Sorry for the lack of clarity.  I was describing what leads a script
author to include "-m" in a place where it has no effect.

You might be inclined to wonder why it matters _why_ a script author
would do such a thing, if the script author is wrong.  To me, it
matters because it allows us to estimate how common it is for scripts
to use "-m" in this way.

The motivating example (Rust) shows that there is at least one script
that _did_ use "-m" in this way.  Rust has mitigation, but the above
logic leads me to believe that they are not the only project that will
be affected.  And more generally, when a script author has a
reasonable reason to believe something will work, they write scripts
where it _does_ work, and then an update breaks their script, I think
it's reasonable for them not to be happy.

Jonathan
Johannes Sixt Aug. 7, 2021, 6:49 a.m. UTC | #5
Am 07.08.21 um 03:55 schrieb Jonathan Nieder:
> The motivating example (Rust) shows that there is at least one script
> that _did_ use "-m" in this way.  Rust has mitigation, but the above
> logic leads me to believe that they are not the only project that will
> be affected.  And more generally, when a script author has a
> reasonable reason to believe something will work, they write scripts
> where it _does_ work, and then an update breaks their script, I think
> it's reasonable for them not to be happy.

As you know, we have "plumbing" commands with a stable interface and
"porcelain" commands for which we reserve to change the behavior without
advance notice. By your reasoning we would not need to distinguish
between the two categories and were forced to keep all behavior stable.
This undoing of a behavior change in a "porcelain" command with the
argument that one script depended on the old behavior and that others
might do so as well would set an unwanted precedent.

Perhaps we need to point script authors to "plumbing" commands more clearly?

(BTW, I have no opinion on whether -m should or should not imply -p.)

-- Hannes
Jonathan Nieder Aug. 7, 2021, 1:51 p.m. UTC | #6
Hi Hannes,

Johannes Sixt wrote:
> Am 07.08.21 um 03:55 schrieb Jonathan Nieder:

>> The motivating example (Rust) shows that there is at least one script
>> that _did_ use "-m" in this way.  Rust has mitigation, but the above
>> logic leads me to believe that they are not the only project that will
>> be affected.  And more generally, when a script author has a
>> reasonable reason to believe something will work, they write scripts
>> where it _does_ work, and then an update breaks their script, I think
>> it's reasonable for them not to be happy.
>
> As you know, we have "plumbing" commands with a stable interface and
> "porcelain" commands for which we reserve to change the behavior without
> advance notice. By your reasoning we would not need to distinguish
> between the two categories and were forced to keep all behavior stable.
> This undoing of a behavior change in a "porcelain" command with the
> argument that one script depended on the old behavior and that others
> might do so as well would set an unwanted precedent.

Hm, this is worth elucidating a bit more, since I am definitely in
favor of continuing to change porcelain commands for the better where
we can.  If we decide that "git log --format=<fmt>" is no longer part
of the stable scripting interface we provide, then that would be a
huge change for our callers (and it's probably too late), but I would
certainly be in favor of us going back in time and doing that. :)

More generally, we've been able to make changes to porcelain commands
that don't hurt our ability to act as a platform for scripts, and I
want us to continue to be able to do that.  "Do not break any script"
is certainly not the standard I think we should apply, as illustrated
by my thoughts upthread when I thought '-m' in this Rust example was a
typo.

But by now it's very clear to me that it was not a typo.

In other words:

- this isn't only about one obscure script.  The point of the "this
  was not a typo" logic is to illustrate that in addition to the
  examples that we know about it is very likely that there are
  examples that we don't know about, in teams' script collections
  beyond the reach of search engines.

- In fact, in addition to the motivating example that makes it
  possible to build Rust, we had multiple in-tree scripts that would
  also have broken by this, if they had not been adapted to work
  around that in the same series!  I should have noticed that in
  review, and I'm sorry that I didn't.

> Perhaps we need to point script authors to "plumbing" commands more clearly?

I think the existence of "plumbing" is fairly well known, but users
don't always have an easy time using it.  The "porcelain" is what ends
up getting the most attention in improvements, and so while I
encourage script authors to use 'git rev-list <revs> | git diff-tree
-s --stdin --format=<fmt>' in place of 'git log --format=<fmt>
<revs>', most do not listen, and I can't really blame them given how
much more convenient the latter is and how many more options it
supports.

I don't think that situation will change unless we

 a. Maintain a second, parallel implementation of each porcelain
    command that only uses plumbing.  This would provide an example of
    how to use plumbing and would ensure that the plumbing grows in
    capability at the same time as the corresponding porcelain.  Or

 b. Expose a library interface, so that we can expose the actual
    helpers that support the standard implementation of porcelain
    commands.

I tried a little of (a) years ago by updating contrib/examples/ to
pass tests: https://lore.kernel.org/git/20100817065147.GA18293@burratino/.
It was fun but I don't think it's really sustainable.

In the long term, I think (b) is going to be an important thing to do,
and I think it will be helpful.  Some automated callers would
appreciate the ability to pass structured input instead of having to
pretend to be shell scripts. :)  True shell scripts would also benefit
because the plumbing commands can more directly map to such a library
API.

> (BTW, I have no opinion on whether -m should or should not imply -p.)

Nevertheless, thanks for weighing in.

Jonathan
Junio C Hamano Aug. 7, 2021, 5 p.m. UTC | #7
Jonathan Nieder <jrnieder@gmail.com> writes:

> More generally, we've been able to make changes to porcelain commands
> that don't hurt our ability to act as a platform for scripts, and I
> want us to continue to be able to do that.  "Do not break any script"
> is certainly not the standard I think we should apply, as illustrated
> by my thoughts upthread when I thought '-m' in this Rust example was a
> typo.
>
> But by now it's very clear to me that it was not a typo.

This is a tangent that does not change the conclusion, because the
use of "-m" in "stash list" was not a typo but a deliberate attempt
to allow "-p" from the end-users to do what they wanted to do, and
it was clearly broken by this change (as you said, the need to hide
the breakage in the same series should have ringed a loud bell for
us).

But I didn't see how you think your Rust thing is not a typo, and I
still don't.  Unless you think Rust folks expected "-m" to do what
"-m" was not designed to do, that is, and I do not think that
"people thought it did something entirely differently, when it was a
no-op, so we shouldn't suddenly make it not a no-op" is a good
rationale that affects how we choose the evolution path for our
tools.

THanks.
Jonathan Nieder Aug. 7, 2021, 6:08 p.m. UTC | #8
Hi,

Junio C Hamano wrote:

> But I didn't see how you think your Rust thing is not a typo, and I
> still don't.  Unless you think Rust folks expected "-m" to do what
> "-m" was not designed to do, that is, and I do not think that
> "people thought it did something entirely differently, when it was a
> no-op, so we shouldn't suddenly make it not a no-op" is a good
> rationale that affects how we choose the evolution path for our
> tools.

Please don't treat this as an attempt to be argumentative: as you've
said, there's plenty of other reason for us to know in retrospect that
making "-m" imply "-p" is problematic for scripts.  Since you asked, I
think it's still worth describing my logic about the Rust example.

I believe the Rust folks expected "-m" to do something that it is
designed to do.  They _also_ overlooked a different subtlety about the
interaction between diff generation and path limiting.  It's good that
Rust's bootstrap.py is fixed now to be more straightforward (by now
it's even using the plumbing command); but it is very easy for another
script author to have had the same confusion, which I might add was a
harmless confusion until this change.  If we changed the behavior to
match their expectation _better_ then it would be a perfectly fine
compatibility break that would be expected to improve the behavior of
more scripts than it hurts.  This change was not in that category.

1. When I add "-m --first-parent" to my "git log -p" invocation, it
   changes what diff it generates.  Until 9ab89a24390 (log: enable
   "-m" automatically with "--first-parent", 2020-07-29), the diff
   shown for a merge with --first-parent was simply "no diff".  A
   script written before mid-2020 that wants to operate on the
   --first-parent diff is highly likely to pass -m.

2. The -m only affects diff generation and does not affect path
   limiting.  So when no diff is being generated it is in fact a
   no-op.  This point is fairly subtle, though, and because it is not
   documented, script authors _in practice_ would only discover it by
   experimentation.

3. A script using -m with the intent of affecting path limiting
   doesn't get any feedback via experimentation that they've made a
   mistake because path limiting with --first-parent already does what
   the script author was hoping for.

What's relevant is not whether the script author was in the wrong or
in the right: it's whether we expect there to be a significant number
of scripts negatively affected by the change.  Because of (1) to (3)
above, I do.

Jonathan
Junio C Hamano Aug. 8, 2021, 12:42 a.m. UTC | #9
Jonathan Nieder <jrnieder@gmail.com> writes:

> Please don't treat this as an attempt to be argumentative:...

I won't.  I asked you what I didn't understand in what you said.
Answering the question is not being argumentative ;-)

> What's relevant is not whether the script author was in the wrong or
> in the right.

I do not agree with this reasoning at all.  Only if vast majority of
users incorrectly used the command and the option, we may need to
consider such a move as an exception, but not as a general rule.

But "stash list" example shows that "log --first-parent -m" without
"-p" in a script has a valid reason, and a change that hurts those
who correctly used a command and an option in a way they were
intended to do _is_ problematic.

Thanks.
Junio C Hamano Aug. 8, 2021, 5:55 p.m. UTC | #10
Jonathan Nieder <jrnieder@gmail.com> writes:

> Junio C Hamano wrote:
>
>> For example, 1e20a407 (stash list: stop passing "-m" to "git log",
>> 2021-05-21) that dropped "-m" must be reverted as well, no?
>
> No, that change is fine.  The "-m" doesn't have an effect one way or
> another after this revert.

Ah, we are saved by the fact that "--first-parent" was made to imply
"-m", so a "-p" coming from the command line of "git stash list"
would do "log --first-parent -p" that shows the patch we want
without the need for "-m"... nice.
Sergey Organov Aug. 17, 2021, 9:13 a.m. UTC | #11
Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Nieder <jrnieder@gmail.com> writes:
>
>> Junio C Hamano wrote:
>>
>>> For example, 1e20a407 (stash list: stop passing "-m" to "git log",
>>> 2021-05-21) that dropped "-m" must be reverted as well, no?
>>
>> No, that change is fine.  The "-m" doesn't have an effect one way or
>> another after this revert.
>
> Ah, we are saved by the fact that "--first-parent" was made to imply
> "-m", so a "-p" coming from the command line of "git stash list"
> would do "log --first-parent -p" that shows the patch we want
> without the need for "-m"... nice.

So, do I get it right that there is actually no reason to use "log
--first-parent -m" anymore, since the time the much older commit made
--first-parent imply -m?

If so, I'd object against this particular patch as the pros of patch
being reverted outweighs its cons, and the original patch never meant to
be entirely backward compatible in the first place, when it was
accepted.

Thanks,
-- Sergey Organov
Sergey Organov Aug. 17, 2021, 9:17 a.m. UTC | #12
Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Nieder <jrnieder@gmail.com> writes:
>
>> Please don't treat this as an attempt to be argumentative:...
>
> I won't.  I asked you what I didn't understand in what you said.
> Answering the question is not being argumentative ;-)
>
>> What's relevant is not whether the script author was in the wrong or
>> in the right.
>
> I do not agree with this reasoning at all.  Only if vast majority of
> users incorrectly used the command and the option, we may need to
> consider such a move as an exception, but not as a general rule.
>
> But "stash list" example shows that "log --first-parent -m" without
> "-p" in a script has a valid reason, and a change that hurts those
> who correctly used a command and an option in a way they were
> intended to do _is_ problematic.

The patch never meant to be entirely backward compatible in the first
place, and, as far as I can see, "log --first-parent -m" doesn't make
sense anymore, since --fist-parent implies -m, that has been settled
already.

Thanks,
Sergey Organov
Junio C Hamano Aug. 17, 2021, 10:10 p.m. UTC | #13
Sergey Organov <sorganov@gmail.com> writes:

> So, do I get it right that there is actually no reason to use "log
> --first-parent -m" anymore, since the time the much older commit made
> --first-parent imply -m?

It was necessary for scripts to say

    git log --first-parent -m "$@"

if they wanted to optionally show a first-parent diff for a merge
when the user of the script passes "-p" in "$@" (and not to show
patch if the user does not pass "-p").  

That changed with 9ab89a24 (log: enable "-m" automatically with
"--first-parent", 2020-07-29).

After that commit, it no longer was needed, but it still was correct
to expect that no patch will be shown with "--first-parent -m",
unless you give "-p" at the same time.  The original change that the
patch under discussion reverted broke that expectation.

We need to note that the "-m" implied by "--first-parent" is "if we
were to show some comparison, do so also for merge commits", not the
"if the user says '-m', it must mean that the user wants to see
comparison, period, so make it imply '-p'".  The latter is what was
reverted.

> If so, I'd object against this particular patch as the pros of patch
> being reverted outweighs its cons, and the original patch never meant to
> be entirely backward compatible in the first place, when it was
> accepted.

I agree that we both (and if there were other reviewers, they too)
mistakenly thought that the change in behaviour was innocuous enough
when we queued the patch, but our mistakes were caught while the
topic was still cooking in 'next', and I have Jonathan to thank for
being extra careful.

Thanks.
Sergey Organov Aug. 18, 2021, 8:56 a.m. UTC | #14
Junio C Hamano <gitster@pobox.com> writes:

> Sergey Organov <sorganov@gmail.com> writes:
>
>> So, do I get it right that there is actually no reason to use "log
>> --first-parent -m" anymore, since the time the much older commit made
>> --first-parent imply -m?
>
> It was necessary for scripts to say
>
>     git log --first-parent -m "$@"
>
> if they wanted to optionally show a first-parent diff for a merge
> when the user of the script passes "-p" in "$@" (and not to show
> patch if the user does not pass "-p").  
>
> That changed with 9ab89a24 (log: enable "-m" automatically with
> "--first-parent", 2020-07-29).

Yes, and since then it's no more needed to say "--first-parent -m" in
this case, as "--fist-parent" will do.

>
> After that commit, it no longer was needed, but it still was correct
> to expect that no patch will be shown with "--first-parent -m",
> unless you give "-p" at the same time.  The original change that the
> patch under discussion reverted broke that expectation.

>
> We need to note that the "-m" implied by "--first-parent" is "if we
> were to show some comparison, do so also for merge commits", not the
> "if the user says '-m', it must mean that the user wants to see
> comparison, period, so make it imply '-p'".  The latter is what was
> reverted.

Yes, there is minor backward incompatibility indeed, and that was
expected. This could be seen from the patch in the same series that
fixes "git stash" by removing unneeded -m.

The fix for the scripts is as simple as removing -m from "--first-parent
-m". It's a one-time change.

>
>> If so, I'd object against this particular patch as the pros of patch
>> being reverted outweighs its cons, and the original patch never meant to
>> be entirely backward compatible in the first place, when it was
>> accepted.
>
> I agree that we both (and if there were other reviewers, they too)
> mistakenly thought that the change in behaviour was innocuous enough
> when we queued the patch, but our mistakes were caught while the
> topic was still cooking in 'next', and I have Jonathan to thank for
> being extra careful.

So, what would be the procedure to get this change back, as this minor
backward incompatibility shouldn't be the show-stopper for the change
that otherwise is an improvement?

Thanks,
-- Sergey Organov
Junio C Hamano Aug. 19, 2021, 6:50 p.m. UTC | #15
Sergey Organov <sorganov@gmail.com> writes:

>> We need to note that the "-m" implied by "--first-parent" is "if we
>> were to show some comparison, do so also for merge commits", not the
>> "if the user says '-m', it must mean that the user wants to see
>> comparison, period, so make it imply '-p'".  The latter is what was
>> reverted.
>
> Yes, there is minor backward incompatibility indeed, and that was
> expected. This could be seen from the patch in the same series that
> fixes "git stash" by removing unneeded -m.
>
> The fix for the scripts is as simple as removing -m from "--first-parent
> -m". It's a one-time change.
> ...
>> I agree that we both (and if there were other reviewers, they too)
>> mistakenly thought that the change in behaviour was innocuous enough
>> when we queued the patch, but our mistakes were caught while the
>> topic was still cooking in 'next', and I have Jonathan to thank for
>> being extra careful.
>
> So, what would be the procedure to get this change back, as this minor
> backward incompatibility shouldn't be the show-stopper for the change
> that otherwise is an improvement?

Your repeating "minor" does not make it minor.  Anything you force
existing users and scripts to change is "fixing the scripts", but
"working around the breakage you brought to them", which is closer
to being a show-stopper.  I understand that you like this feature a
lot, but you'd need to be a bit more considerate to your users and
other people.

I think it is a design mistake to make a plain vanilla "-m" to imply
"-p" (or any "output of result of comparison"), simply because the
implication goes in the other direction, so there will never be "get
this change back", period, but see below.

"git log" when showing a commit and asked to "output result of
comparison" like patch, combined diff, raw diff, etc. would:

 - show the comparison for non-merge commits and when
   "--first-parent" is specified (the latter is natural since it
   makes us consistently pretend that the merges were squash
   merges).

 - shows the comparison for merge commits when -m is given.

but because "--cc" and "-c" (which are used to specify how the
result of comparison is shown; they are not about specifying if
"normally we show only non-merges" is disabled) do not make sense in
the context of non-merge commits (in other words, the user is better
off giving "-p" if merges are not to be shown), they are made to
imply "-m".  And that is a sensible design choice.  On the other
hand, "--raw" (which is used to specify how the result of comparison
is shown; it not about specifying if "normally we show only
non-merges" is disabled) does make sense in the context of non-merge
commits, so unlike "--cc"/"-c", it does not imply "-m".  And that
also is a sensible design choice.  "-p" falls into the same bucket
as "--raw", so it should not imply "-m".

But some folks may not like "log -p" to be silent about comparison
for merge commits (like you are).  To accomodate them, it might make
sense to have a configuration that says "I like -m, so when -p or
--raw or any 'how to show comparison result' option is given, please
make it imply '-m'", but it should not be the default.

Thanks.
Junio C Hamano Aug. 19, 2021, 6:51 p.m. UTC | #16
Junio C Hamano <gitster@pobox.com> writes:

> Your repeating "minor" does not make it minor.  Anything you force
> existing users and scripts to change is "fixing the scripts", but

... is NOT "fixing the scripts", of course.
Sergey Organov Aug. 20, 2021, 10:24 a.m. UTC | #17
Junio C Hamano <gitster@pobox.com> writes:

> Sergey Organov <sorganov@gmail.com> writes:
>
>>> We need to note that the "-m" implied by "--first-parent" is "if we
>>> were to show some comparison, do so also for merge commits", not the
>>> "if the user says '-m', it must mean that the user wants to see
>>> comparison, period, so make it imply '-p'".  The latter is what was
>>> reverted.
>>
>> Yes, there is minor backward incompatibility indeed, and that was
>> expected. This could be seen from the patch in the same series that
>> fixes "git stash" by removing unneeded -m.
>>
>> The fix for the scripts is as simple as removing -m from "--first-parent
>> -m". It's a one-time change.
>> ...
>>> I agree that we both (and if there were other reviewers, they too)
>>> mistakenly thought that the change in behaviour was innocuous enough
>>> when we queued the patch, but our mistakes were caught while the
>>> topic was still cooking in 'next', and I have Jonathan to thank for
>>> being extra careful.
>>
>> So, what would be the procedure to get this change back, as this minor
>> backward incompatibility shouldn't be the show-stopper for the change
>> that otherwise is an improvement?
>
> Your repeating "minor" does not make it minor.  Anything you force
> existing users and scripts to change is "fixing the scripts", but
> "working around the breakage you brought to them", which is closer
> to being a show-stopper.

Backward compatibility is important, no questions, but later on you
start to say that this change is a *design* mistake, so discussing
backward compatibility issues gets rather useless.

That said, scripts that still have "log --first-parent -m" are remnants
of former sub-optimal design that was improved by "--first-parent imply
-m" change about a year ago, and with current Git these scripts are
confusing anyway, so fixing them by removing the work around they
historically have would be a good idea no matter if the change in
question is accepted or not.

I mean your "working around the breakage you brought to them" is simply
wrong. These changes to the scripts in question are not work-arounds
they are rather improvements. It's "log --first-parent -m" in the
scripts that is a work-around, and getting rid of -m there is getting
rid of work-around that is not needed anymore (for about a year
already.)

> I understand that you like this feature a lot, but you'd need to be a
> bit more considerate to your users and other people.

First, I believe I *am* considerate, and second, I don't either "like"
or "dislike" the feature, personally. It's a matter of consistency of
UI, and the fact that such requests appear on the list (not from me)
only supports this view. There are other people here who do think this
is an improvement.

> I think it is a design mistake to make a plain vanilla "-m" to imply
> "-p" (or any "output of result of comparison"), simply because the
> implication goes in the other direction, so there will never be "get
> this change back", period, but see below.

Well, I thought we've already discussed this to death and agreed this is
an improvement, before I even started to implement the patches, and now
what? I'm confused.

I still believe it's reasonable for "git log -m" to output diffs without
need to explicitly specify -p, and I still see no design mistake here,
especially if it were implemented this way from the beginning,
especially given that "git log --cc" and and "git log -c" already behave
exactly this way.

>
> "git log" when showing a commit and asked to "output result of
> comparison" like patch, combined diff, raw diff, etc. would:
>
>  - show the comparison for non-merge commits and when
>    "--first-parent" is specified (the latter is natural since it
>    makes us consistently pretend that the merges were squash
>    merges).
>
>  - shows the comparison for merge commits when -m is given.
>
> but because "--cc" and "-c" (which are used to specify how the
> result of comparison is shown; they are not about specifying if
> "normally we show only non-merges" is disabled) do not make sense in
> the context of non-merge commits (in other words, the user is better
> off giving "-p" if merges are not to be shown), they are made to
> imply "-m". And that is a sensible design choice.  

No, sorry, they are made to imply -p, not -m.

> On the other
> hand, "--raw" (which is used to specify how the result of comparison
> is shown; it not about specifying if "normally we show only
> non-merges" is disabled) does make sense in the context of non-merge
> commits, so unlike "--cc"/"-c", it does not imply "-m".  And that
> also is a sensible design choice.  "-p" falls into the same bucket
> as "--raw", so it should not imply "-m".

Yes, but this has nothing to do with the patch in question, as -p still
doesn't imply -m with this patch. It's another way around: the patch
makes -m imply -p, the same way -c/--cc imply -p.

>
> But some folks may not like "log -p" to be silent about comparison
> for merge commits (like you are).

No, not me, and I didn't see anybody who insisted on it yet. It's fine
with me it's silent by default.

> To accomodate them, it might make sense to have a configuration that
> says "I like -m, so when -p or --raw or any 'how to show comparison
> result' option is given, please make it imply '-m'", but it should not
> be the default.

This has nothing to do with the patch in question, and I actually don't
like the idea, sorry.

Overall, my opinion is still that there is nothing wrong with "-m
implies -p", as implemented by the patch, as if user asks to output
diffs even for merge commits, it's likely they need diffs for *all* of
them. This is again consistent with how -c/--cc work.

Now, only provided we *again* and *finally* agree that -m should better
imply -p, we can get back to discussing backward incompatibility this
change does introduce, and how to get transition smoother if it needs to
be.

Thanks,
-- Sergey Organov
diff mbox series

Patch

diff generation option such as --name-status.  By saving typing for
interactive use without adversely affecting scripts in the wild, it
would be a pure improvement.

The problem is that although diff generation options are only relevant
for the displayed diff, a script author can imagine them affecting
path limiting.  For example, I might run

	git log -w --format=%H -- README

hoping to list commits that edited README, excluding whitespace-only
changes.  In fact, a whitespace-only change is not TREESAME so the use
of -w here has no effect (since we don't apply these diff generation
flags to the diff_options struct rev_info::pruning used for this
purpose), but the documentation suggests that it should work

	Suppose you specified foo as the <paths>. We shall call
	commits that modify foo !TREESAME, and the rest TREESAME. (In
	a diff filtered for foo, they look different and equal,
	respectively.)

and a script author who has not tested whitespace-only changes
wouldn't notice.

Similarly, a script author could include

	git log -m --first-parent --format=%H -- README

to filter the first-parent history for commits that modified README.
The -m is a no-op but it reflects the script author's intent.  For
example, until 1e20a407fe2 (stash list: stop passing "-m" to "git
log", 2021-05-21), "git stash list" did this.

As a result, we can't safely change "-m" to imply "-p" without fear of
breaking such scripts.  Restore the previous behavior.

Noticed because Rust's src/bootstrap/bootstrap.py made use of this
same construct: https://github.com/rust-lang/rust/pull/87513.  That
script has been updated to omit the unnecessary "-m" option, but we
can expect other scripts in the wild to have similar expectations.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Hi,

Joshua Nelson wrote[1]:
> Jonathan Nieder wrote:

>> What happens if someone wants to build an older version of Rust?
>
> For what it's worth, almost no one builds old versions of rust from source
> except for distros, and distros wouldn't ever set download-ci-llvm = true. So
> this shouldn't affect anyone in practice now that we've removed `-m` on master.

Thanks.  With that out of the way, I started thinking more clearly
about the intent behind this use of `-m` and I'm starting to think it
wasn't a typo after all.

As a result, in terms of

>>  a. Revert 'diff-merges: let "-m" imply "-p"'.  This buys us time to
>>     make a more targeted change, make the change more gradually in a
>>     future release, or just stop encouraging use of "-m" in docs.
>>
>>  b. Make "-m" imply "-p", except in some more 'script-ish'
>>     circumstances (e.g. when using log --format with a format string)
>>
>>  c. Go ahead with the change and advertise it in release notes.

now I lean toward (a).  How about this patch?

Thanks,
Jonathan

[1] https://lore.kernel.org/git/CAJ+j++Vj1gY93QuKDhDODXOJGXTiFFEzy0Oew+LWD7a5e7iaTA@mail.gmail.com/

 Documentation/diff-options.txt | 8 ++++----
 diff-merges.c                  | 1 -
 t/t4013-diff-various.sh        | 4 ++--
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 0aebe832057..c89d530d3d1 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -49,9 +49,10 @@  ifdef::git-log[]
 --diff-merges=m:::
 -m:::
 	This option makes diff output for merge commits to be shown in
-	the default format. The default format could be changed using
+	the default format. `-m` will produce the output only if `-p`
+	is given as well. The default format could be changed using
 	`log.diffMerges` configuration parameter, which default value
-	is `separate`. `-m` implies `-p`.
+	is `separate`.
 +
 --diff-merges=first-parent:::
 --diff-merges=1:::
@@ -61,8 +62,7 @@  ifdef::git-log[]
 --diff-merges=separate:::
 	This makes merge commits show the full diff with respect to
 	each of the parents. Separate log entry and diff is generated
-	for each parent. This is the format that `-m` produced
-	historically.
+	for each parent.
 +
 --diff-merges=combined:::
 --diff-merges=c:::
diff --git a/diff-merges.c b/diff-merges.c
index 0dfcaa1b11b..d897fd8a293 100644
--- a/diff-merges.c
+++ b/diff-merges.c
@@ -107,7 +107,6 @@  int diff_merges_parse_opts(struct rev_info *revs, const char **argv)
 
 	if (!strcmp(arg, "-m")) {
 		set_to_default(revs);
-		revs->merges_imply_patch = 1;
 	} else if (!strcmp(arg, "-c")) {
 		set_combined(revs);
 		revs->merges_imply_patch = 1;
diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh
index 7fadc985ccc..e561a8e4852 100755
--- a/t/t4013-diff-various.sh
+++ b/t/t4013-diff-various.sh
@@ -455,8 +455,8 @@  diff-tree --stat --compact-summary initial mode
 diff-tree -R --stat --compact-summary initial mode
 EOF
 
-test_expect_success 'log -m matches log -m -p' '
-	git log -m -p master >result &&
+test_expect_success 'log -m matches pure log' '
+	git log master >result &&
 	process_diffs result >expected &&
 	git log -m >result &&
 	process_diffs result >actual &&