diff mbox series

diff: fix behaviour of the "-s" option

Message ID xmqqfs8bith1.fsf_-_@gitster.g (mailing list archive)
State New, archived
Headers show
Series diff: fix behaviour of the "-s" option | expand

Commit Message

Junio C Hamano May 4, 2023, 11:10 p.m. UTC
Junio C Hamano <gitster@pobox.com> writes:

> I haven't run any tests (not just your new one, but existing ones)
> but ...

And of course, not writing tests fails to even realize that the bug
has two components, "-s" failing to clear the bits previously set,
and other options not clearing the bit set by "-s".

This version may still be rough, but at least the full test suite
has been run with it, so I have a bit more confidence than the
earlier one (which may not mean much).

------- >8 ------------- >8 ------------- >8 -------------
Sergey Organov noticed and reported "--patch --no-patch --raw"
behaves differently from "--raw".  It turns out there are a few
interesting bugs in the implementation and documentation.

 * First, the documentation for "--no-patch" was unclear that it
   could be read to mean "--no-patch" countermands an earlier
   "--patch" but not other things.  The intention of "--no-patch"
   ever since it was introduced at d09cd15d (diff: allow --no-patch
   as synonym for -s, 2013-07-16) was to serve as a synonym for
   "-s", so "--raw --patch --no-patch" should have produced no
   output, but it can be (mis)read to allow showing only "--raw"
   output.

 * Then the interaction between "-s" and other format options were
   poorly implemented.  Modern versions of Git uses one bit each to
   represent formatting options like "--patch", "--stat" in a single
   output_format word, but for historical reasons, "-s" also is
   represented as another bit in the same word.  This allows two
   interesting bugs to happen, and we have both.

   (1) After setting a format bit, then setting NO_OUTPUT with "-s",
       the code to process another "--<format>" option drops the
       NO_OUTPUT bit to allow output to be shown again.  However,
       the code to handle "-s" only set NO_OUTPUT without unsetting
       format bits set earlier, so the earlier format bit got
       revealed upon seeing the second "--<format>" option.  THis is
       the problem Sergey observed.

   (2) After setting NO_OUTPUT with "-s", code to process
       "--<format>" option can forget to unset NO_OUTPUT, leaving
       the command still silent.

It is tempting to change the meaning of "--no-patch" to mean
"disable only the patch format output" and reimplement "-s" as "not
showing anything", but it would be an end-user visible change in
behaviour.  Let's fix the interactions of these bits to first make
"-s" work as intended.

The fix is conceptually very simple.

 * Whenever we set DIFF_FORMAT_FOO becasuse we saw the "--foo"
   option (e.g. DIFF_FORMAT_RAW is set when the "--raw" option is
   given), we make sure we drop DIFF_FORMAT_NO_OUTPUT.  We forgot to
   do so in some of the options and caused (2) above.

 * When processing "-s" option, we should not just set
   DIFF_FORMAT_NO_OUTPUT bit, but clear other DIFF_FORMAT_* bits.
   We didn't do so and retained format bits set by options
   previously seen, causing (1) above.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/diff-options.txt |  7 +++++--
 diff.c                         | 24 +++++++++++++-----------
 t/t4000-diff-format.sh         | 32 +++++++++++++++++++++++++++++++-
 3 files changed, 49 insertions(+), 14 deletions(-)

Comments

Junio C Hamano May 5, 2023, 5:28 a.m. UTC | #1
Junio C Hamano <gitster@pobox.com> writes:

>  * Then the interaction between "-s" and other format options were
>    poorly implemented.  Modern versions of Git uses one bit each to
>    represent formatting options like "--patch", "--stat" in a single
>    output_format word, but for historical reasons, "-s" also is
>    represented as another bit in the same word.

An obvious improvement strategy is to stop using the NO_OUTPUT bit
and instead make "-s" to clear the "output_format" word, and make
"--[no-]raw", "--[no-]stat", "--[no-]patch", etc. to flip their own
bit in the same "output_format" word.  I think the "historical
reasons" why we did not do that was because we wanted to be able to
do a flexible defaulting.  We may want to say "if no output-format
option is given from the command line, default to "--patch", but
otherwise do not set the "--patch" bit on", for example.
Initializing the "output_format" word with "--patch" bit on would
not work---when "--raw" is given from the command line, we want to
clear that "--patch" bit we set for default and set "--raw" bit on.
We can initialize the "output_format" word to 0, and OR in the bits
for each format option as we process them, and then flip the
"--patch" bit on if "output_format" word is still 0 after command
line parsing is done.  This would almost work, except that it would
make it hard to tell "no command line options" case and "'-s' cleared
all bits" case apart (the former wants to default to "--patch",
while the latter wants to stay "no output"), and it probably was the
reason why we gave an extra NO_OUTPUT bit to the "-s" option.  In
hindsight, the arrangement certainly made other things harder and
prone to unnecessary bugs.

Anyway...
Sergey Organov May 5, 2023, 8:32 a.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> I haven't run any tests (not just your new one, but existing ones)
>> but ...
>
> And of course, not writing tests fails to even realize that the bug
> has two components, "-s" failing to clear the bits previously set,
> and other options not clearing the bit set by "-s".
>
> This version may still be rough, but at least the full test suite
> has been run with it, so I have a bit more confidence than the
> earlier one (which may not mean much).
>
> ------- >8 ------------- >8 ------------- >8 -------------
> Sergey Organov noticed and reported "--patch --no-patch --raw"
> behaves differently from "--raw".  It turns out there are a few
> interesting bugs in the implementation and documentation.
>
>  * First, the documentation for "--no-patch" was unclear that it
>    could be read to mean "--no-patch" countermands an earlier
>    "--patch" but not other things.  The intention of "--no-patch"
>    ever since it was introduced at d09cd15d (diff: allow --no-patch
>    as synonym for -s, 2013-07-16) was to serve as a synonym for
>    "-s", so "--raw --patch --no-patch" should have produced no
>    output, but it can be (mis)read to allow showing only "--raw"
>    output.
>
>  * Then the interaction between "-s" and other format options were
>    poorly implemented.  Modern versions of Git uses one bit each to
>    represent formatting options like "--patch", "--stat" in a single
>    output_format word, but for historical reasons, "-s" also is
>    represented as another bit in the same word.  This allows two
>    interesting bugs to happen, and we have both.
>
>    (1) After setting a format bit, then setting NO_OUTPUT with "-s",
>        the code to process another "--<format>" option drops the
>        NO_OUTPUT bit to allow output to be shown again.  However,
>        the code to handle "-s" only set NO_OUTPUT without unsetting
>        format bits set earlier, so the earlier format bit got
>        revealed upon seeing the second "--<format>" option.  THis is
>        the problem Sergey observed.
>
>    (2) After setting NO_OUTPUT with "-s", code to process
>        "--<format>" option can forget to unset NO_OUTPUT, leaving
>        the command still silent.
>
> It is tempting to change the meaning of "--no-patch" to mean
> "disable only the patch format output" and reimplement "-s" as "not
> showing anything", but it would be an end-user visible change in
> behaviour.  Let's fix the interactions of these bits to first make
> "-s" work as intended.
>
> The fix is conceptually very simple.
>
>  * Whenever we set DIFF_FORMAT_FOO becasuse we saw the "--foo"
>    option (e.g. DIFF_FORMAT_RAW is set when the "--raw" option is
>    given), we make sure we drop DIFF_FORMAT_NO_OUTPUT.  We forgot to
>    do so in some of the options and caused (2) above.
>
>  * When processing "-s" option, we should not just set
>    DIFF_FORMAT_NO_OUTPUT bit, but clear other DIFF_FORMAT_* bits.
>    We didn't do so and retained format bits set by options
>    previously seen, causing (1) above.

Sounds good to me. Doesn't this makes DIFF_FORMAT_NO_OUTPUT obsolete as
well, I wonder, as absence of any output bits effectively means "no
output"?

Thanks,
-- Sergey Organov
Junio C Hamano May 5, 2023, 4:31 p.m. UTC | #3
Sergey Organov <sorganov@gmail.com> writes:

>>  * Whenever we set DIFF_FORMAT_FOO becasuse we saw the "--foo"
>>    option (e.g. DIFF_FORMAT_RAW is set when the "--raw" option is
>>    given), we make sure we drop DIFF_FORMAT_NO_OUTPUT.  We forgot to
>>    do so in some of the options and caused (2) above.
>>
>>  * When processing "-s" option, we should not just set
>>    DIFF_FORMAT_NO_OUTPUT bit, but clear other DIFF_FORMAT_* bits.
>>    We didn't do so and retained format bits set by options
>>    previously seen, causing (1) above.
>
> Sounds good to me. Doesn't this makes DIFF_FORMAT_NO_OUTPUT obsolete as
> well, I wonder, as absence of any output bits effectively means "no
> output"?

Not quite.  The latter is not "set 0 to output_format word", but
"set 0 to output_format word and then flip only NO_OUTPUT bit on".
I've written a bit more on it in a follow-up message to the patch.

Thanks.
Junio C Hamano May 5, 2023, 4:51 p.m. UTC | #4
Junio C Hamano <gitster@pobox.com> writes:

> ...  I think the "historical
> reasons" why we did not do that was because we wanted to be able to
> do a flexible defaulting. ...
> This would almost work, except that it would
> make it hard to tell "no command line options" case and "'-s' cleared
> all bits" case apart (the former wants to default to "--patch",
> while the latter wants to stay "no output"), and it probably was the
> reason why we gave an extra NO_OUTPUT bit to the "-s" option.  In
> hindsight, the arrangement certainly made other things harder and
> prone to unnecessary bugs.
>
> Anyway...

The distinction between the presense of NO_OUTPUT bit and absolutely
empty output_format word indeed is used by "git show", in the
builtin/log.c::show_setup_revisions_tweak() function.

We could lose DIFF_FORMAT_NO_OUTPUT bit, but then we need to replace
it with something else (i.e. DIFF_FORMAT_OPTION_GIVEN bit), and

 * "--patch", "--raw", etc. will set DIFF_FORMAT_$format bit and
   DIFF_FORMAT_OPTION_GIVEN bit on for each format.  "--no-raw", 
   etc. will set off DIFF_FORMAT_$format bit but still record the
   fact that we saw an option from the command line by setting
   DIFF_FORMAT_OPTION_GIVEN bit.

 * "-s" (and its synonym "--no-patch") will set the
   DIFF_FORMAT_OPTION_GIVEN bit on, and clear all other bits.

which I suspect would make the code much cleaner without breaking
any end-user expectations.

Once that is in place, transitioning "--no-patch" to mean the
counterpart of "--patch", just like "--no-raw" only defeats an
earlier "--raw", would be quite simple at the code level.  The
social cost of migrating the end-user expectations might be too
great for it to be worth, but at least the "GIVEN" bit clean-up
alone may be worth it.

Not that I would be starting the process right away...
Sergey Organov May 5, 2023, 5:07 p.m. UTC | #5
Junio C Hamano <gitster@pobox.com> writes:

> Sergey Organov <sorganov@gmail.com> writes:
>
>>>  * Whenever we set DIFF_FORMAT_FOO becasuse we saw the "--foo"
>>> option (e.g. DIFF_FORMAT_RAW is set when the "--raw" option is
>>> given), we make sure we drop DIFF_FORMAT_NO_OUTPUT. We forgot to do
>>> so in some of the options and caused (2) above.
>>> * When processing "-s" option, we should not just set
>>> DIFF_FORMAT_NO_OUTPUT bit, but clear other DIFF_FORMAT_* bits. We
>>> didn't do so and retained format bits set by options previously
>>> seen, causing (1) above.
>> Sounds good to me. Doesn't this makes DIFF_FORMAT_NO_OUTPUT obsolete
>> as well, I wonder, as absence of any output bits effectively means
>> "no output"?
>
> Not quite. The latter is not "set 0 to output_format word", but "set 0
> to output_format word and then flip only NO_OUTPUT bit on". I've
> written a bit more on it in a follow-up message to the patch.

Yep, I've noticed that post after I sent the question.

Thanks,
-- Sergey Organov
Felipe Contreras May 9, 2023, 1:16 a.m. UTC | #6
Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> >  * Then the interaction between "-s" and other format options were
> >    poorly implemented.  Modern versions of Git uses one bit each to
> >    represent formatting options like "--patch", "--stat" in a single
> >    output_format word, but for historical reasons, "-s" also is
> >    represented as another bit in the same word.
> 
> An obvious improvement strategy is to stop using the NO_OUTPUT bit
> and instead make "-s" to clear the "output_format" word, and make
> "--[no-]raw", "--[no-]stat", "--[no-]patch", etc. to flip their own
> bit in the same "output_format" word.  I think the "historical
> reasons" why we did not do that was because we wanted to be able to
> do a flexible defaulting.  We may want to say "if no output-format
> option is given from the command line, default to "--patch", but
> otherwise do not set the "--patch" bit on", for example.
> Initializing the "output_format" word with "--patch" bit on would
> not work---when "--raw" is given from the command line, we want to
> clear that "--patch" bit we set for default and set "--raw" bit on.
> We can initialize the "output_format" word to 0, and OR in the bits
> for each format option as we process them, and then flip the
> "--patch" bit on if "output_format" word is still 0 after command
> line parsing is done.  This would almost work, except that it would
> make it hard to tell "no command line options" case and "'-s' cleared
> all bits" case apart (the former wants to default to "--patch",
> while the latter wants to stay "no output"), and it probably was the
> reason why we gave an extra NO_OUTPUT bit to the "-s" option.  In
> hindsight, the arrangement certainly made other things harder and
> prone to unnecessary bugs.

That's easy to solve by introducing a DIFF_FORMAT_DEFAULT item, which
would be different from 0.

Then every command can update DIFF_FORMAT_DEFAULT to the desired
default, and if the default is cleared (e.g. `--no-patch`) that would
not happen.
diff mbox series

Patch

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 3674ac48e9..7d5bb65a49 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -29,8 +29,11 @@  endif::git-diff[]
 
 -s::
 --no-patch::
-	Suppress diff output. Useful for commands like `git show` that
-	show the patch by default, or to cancel the effect of `--patch`.
+	Suppress all output from the diff machinery.  Useful for
+	commands like `git show` that show the patch by default to
+	squelch their output, or to cancel the effect of options like
+	`--patch`, `--stat` earlier on the command line in an alias.
+
 endif::git-format-patch[]
 
 ifdef::git-log[]
diff --git a/diff.c b/diff.c
index 648f6717a5..5a2f096683 100644
--- a/diff.c
+++ b/diff.c
@@ -4868,6 +4868,7 @@  static int diff_opt_stat(const struct option *opt, const char *value, int unset)
 	} else
 		BUG("%s should not get here", opt->long_name);
 
+	options->output_format &= ~DIFF_FORMAT_NO_OUTPUT;
 	options->output_format |= DIFF_FORMAT_DIFFSTAT;
 	options->stat_name_width = name_width;
 	options->stat_graph_width = graph_width;
@@ -4887,6 +4888,7 @@  static int parse_dirstat_opt(struct diff_options *options, const char *params)
 	 * The caller knows a dirstat-related option is given from the command
 	 * line; allow it to say "return this_function();"
 	 */
+	options->output_format &= ~DIFF_FORMAT_NO_OUTPUT;
 	options->output_format |= DIFF_FORMAT_DIRSTAT;
 	return 1;
 }
@@ -5086,6 +5088,7 @@  static int diff_opt_compact_summary(const struct option *opt,
 		options->flags.stat_with_summary = 0;
 	} else {
 		options->flags.stat_with_summary = 1;
+		options->output_format &= ~DIFF_FORMAT_NO_OUTPUT;
 		options->output_format |= DIFF_FORMAT_DIFFSTAT;
 	}
 	return 0;
@@ -5404,9 +5407,8 @@  static void prep_parse_options(struct diff_options *options)
 		OPT_BITOP('p', "patch", &options->output_format,
 			  N_("generate patch"),
 			  DIFF_FORMAT_PATCH, DIFF_FORMAT_NO_OUTPUT),
-		OPT_BIT_F('s', "no-patch", &options->output_format,
-			  N_("suppress diff output"),
-			  DIFF_FORMAT_NO_OUTPUT, PARSE_OPT_NONEG),
+		OPT_SET_INT('s', "no-patch", &options->output_format,
+			    N_("suppress diff output"), DIFF_FORMAT_NO_OUTPUT),
 		OPT_BITOP('u', NULL, &options->output_format,
 			  N_("generate patch"),
 			  DIFF_FORMAT_PATCH, DIFF_FORMAT_NO_OUTPUT),
@@ -5415,9 +5417,9 @@  static void prep_parse_options(struct diff_options *options)
 			       PARSE_OPT_NONEG | PARSE_OPT_OPTARG, diff_opt_unified),
 		OPT_BOOL('W', "function-context", &options->flags.funccontext,
 			 N_("generate diffs with <n> lines context")),
-		OPT_BIT_F(0, "raw", &options->output_format,
+		OPT_BITOP(0, "raw", &options->output_format,
 			  N_("generate the diff in raw format"),
-			  DIFF_FORMAT_RAW, PARSE_OPT_NONEG),
+			  DIFF_FORMAT_RAW, DIFF_FORMAT_NO_OUTPUT),
 		OPT_BITOP(0, "patch-with-raw", &options->output_format,
 			  N_("synonym for '-p --raw'"),
 			  DIFF_FORMAT_PATCH | DIFF_FORMAT_RAW,
@@ -5426,12 +5428,12 @@  static void prep_parse_options(struct diff_options *options)
 			  N_("synonym for '-p --stat'"),
 			  DIFF_FORMAT_PATCH | DIFF_FORMAT_DIFFSTAT,
 			  DIFF_FORMAT_NO_OUTPUT),
-		OPT_BIT_F(0, "numstat", &options->output_format,
+		OPT_BITOP(0, "numstat", &options->output_format,
 			  N_("machine friendly --stat"),
-			  DIFF_FORMAT_NUMSTAT, PARSE_OPT_NONEG),
-		OPT_BIT_F(0, "shortstat", &options->output_format,
+			  DIFF_FORMAT_NUMSTAT, DIFF_FORMAT_NO_OUTPUT),
+		OPT_BITOP(0, "shortstat", &options->output_format,
 			  N_("output only the last line of --stat"),
-			  DIFF_FORMAT_SHORTSTAT, PARSE_OPT_NONEG),
+			  DIFF_FORMAT_SHORTSTAT, DIFF_FORMAT_NO_OUTPUT),
 		OPT_CALLBACK_F('X', "dirstat", options, N_("<param1,param2>..."),
 			       N_("output the distribution of relative amount of changes for each sub-directory"),
 			       PARSE_OPT_NONEG | PARSE_OPT_OPTARG,
@@ -5447,9 +5449,9 @@  static void prep_parse_options(struct diff_options *options)
 		OPT_BIT_F(0, "check", &options->output_format,
 			  N_("warn if changes introduce conflict markers or whitespace errors"),
 			  DIFF_FORMAT_CHECKDIFF, PARSE_OPT_NONEG),
-		OPT_BIT_F(0, "summary", &options->output_format,
+		OPT_BITOP(0, "summary", &options->output_format,
 			  N_("condensed summary such as creations, renames and mode changes"),
-			  DIFF_FORMAT_SUMMARY, PARSE_OPT_NONEG),
+			  DIFF_FORMAT_SUMMARY, DIFF_FORMAT_NO_OUTPUT),
 		OPT_BIT_F(0, "name-only", &options->output_format,
 			  N_("show only names of changed files"),
 			  DIFF_FORMAT_NAME, PARSE_OPT_NONEG),
diff --git a/t/t4000-diff-format.sh b/t/t4000-diff-format.sh
index bfcaae390f..762b9d4c60 100755
--- a/t/t4000-diff-format.sh
+++ b/t/t4000-diff-format.sh
@@ -5,6 +5,9 @@ 
 
 test_description='Test built-in diff output engine.
 
+We happen to know that all diff plumbing and diff Porcelain share the
+same command line parser, so testing one should be sufficient; pick
+diff-files as a representative.
 '
 
 TEST_PASSES_SANITIZE_LEAK=true
@@ -16,9 +19,11 @@  Line 2
 line 3'
 cat path0 >path1
 chmod +x path1
+mkdir path2
+>path2/path3
 
 test_expect_success 'update-index --add two files with and without +x.' '
-	git update-index --add path0 path1
+	git update-index --add path0 path1 path2/path3
 '
 
 mv path0 path0-
@@ -91,4 +96,29 @@  test_expect_success 'git diff-files --patch --no-patch does not show the patch'
 	test_must_be_empty err
 '
 
+
+# Smudge path2/path3 so that dirstat has something to show
+date >path2/path3
+
+for format in stat raw numstat shortstat dirstat
+do
+	test_expect_success "--no-patch in 'git diff-files --no-patch --$format' is a no-op" '
+		git diff-files --no-patch "--$format" >actual &&
+		git diff-files "--$format" >expect &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "--no-patch clears all previous ones" '
+		git diff-files --$format -s -p >actual &&
+		git diff-files -p >expect &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "--no-patch in 'git diff --no-patch --$format' is a no-op" '
+		git diff --no-patch "--$format" >actual &&
+		git diff "--$format" >expect &&
+		test_cmp expect actual
+	'
+done
+
 test_done