diff mbox series

RFC: Using '--no-output-indicator-old' to only show new state

Message ID CAHk-=wgh8emJn-+FtxN=m_SCPiP6cGKHU-5ozzV9tWBMxn+xcA@mail.gmail.com (mailing list archive)
State New, archived
Headers show
Series RFC: Using '--no-output-indicator-old' to only show new state | expand

Commit Message

Linus Torvalds March 10, 2022, 7:40 p.m. UTC
So after another round of doing

     git diff | grep -v '^-'

to just show what the end result of a patch is, I decided that there
has to be a better way.

Of course, to normal people, that "better way" is probably some GUI
tool that shows diffs as "before/after" in two different frames next
to each other, but I'm a grumpy old man ("Get off my lawn") and I do
everything but read my email in a standard text-only terminal.

So this RFC is that better way for me.

It's a RFC because maybe people think this is stupid, but even more so
becasue I think it should also go with an actual new and more legible
name - which this tech demonstration does *not* do, because I suspect
people have more opinions on the option name than they have on the
trivial technical parts.

IOW, with this patch, you can do

    git show --no-output-indicator-old

and it shows the diff but without the 'old' lines. That option name is
horrible to write, though, and even with command line completion I
think it migth be worth doing as

   git show --new

or something like that.

When would you want this? The case _I_ tend to use that "grep -v '^-'"
thing is when I re-write a function, and a regular diff really is just
very messy and what I really want to see is just the end result to
verify that "yeah, this looks sane".

In case people care, the immediate (current) cause of this was kernel
commit fe673d3f5bf1 ("mm: gup: make fault_in_safe_writeable() use
fixup_user_fault()"), which really is a fairly messy diff, but what
matters is the much simplified end result of the
fault_in_safe_writeable() function.

But in the commit message, I put a random git commit as an example
instead, just to make it easier to see what's up when you're a git
developer, not a kernel developer.

And hey, there are probably smarter things that could be done: this
doesn't work with word-diffs, for example. Maybe people would want an
equivalent "don't show the deleted word" instead. But this is really
really simple.

And maybe there already was some better way to do this.

Comments?

(And for consistency, this obviously also allows
"--no-output-indicator-new", but I've never found _that_ particularly
useful. But GUI diff tools most certainly also tend to show the old
state on one side, so maybe it's would be reasonable to use as a
"compare the old state vs the new state" in terminals side-by-side. I
just tend to care much more about "what is the end result")

               Linus

Comments

Junio C Hamano March 10, 2022, 8:13 p.m. UTC | #1
Linus Torvalds <torvalds@linux-foundation.org> writes:

> So after another round of doing
>
>      git diff | grep -v '^-'
>
> to just show what the end result of a patch is, I decided that there
> has to be a better way.
>
> Of course, to normal people, that "better way" is probably some GUI
> tool that shows diffs as "before/after" in two different frames next
> to each other, but I'm a grumpy old man ("Get off my lawn") and I do
> everything but read my email in a standard text-only terminal.

Sounds like the "apply --no-add" in the opposite direction ;-) I
would find it handy myself, too, though I tend to read my patches
after applying to my tree so the postimage is usually an invocation
of "less" away for me.

I do not think it is a bad idea to have an option to give only the
postimage and another option to give only the preimage.  It would
also trivially allow people to show the side-by-side diff in GUI.
Linus Torvalds March 10, 2022, 8:40 p.m. UTC | #2
On Thu, Mar 10, 2022 at 12:13 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Sounds like the "apply --no-add" in the opposite direction ;-)

I was thinking more the opposite of "--ours/theirs" when merging, but
yeah, I guess "--no-add" is technically even closer.

> I would find it handy myself, too, though I tend to read my patches
> after applying to my tree so the postimage is usually an invocation
> of "less" away for me.

Obviously just looking at the file itself is always an option, and I
do that too. But I traditionally do that "grep -v" trick as I'm
verifying the patch before sending it out (or before committing)
because it's such a nice way to limit the output just to the changed
parts.

> I do not think it is a bad idea to have an option to give only the
> postimage and another option to give only the preimage.  It would
> also trivially allow people to show the side-by-side diff in GUI.

I suspect people doing GUI's are happy just parsing the '-' and '+'
lines themselves, since they want both sides anyway.

For example, 'gitk' already has that diff/old/new checkbox, that does
exactly what my patch does.

And I doubt anybody wants gitk to re-run 'diff' just because somebody
clicked another option - it's only used to visualize the diff that was
already done differently.

Of course, I might be wrong. I didn't actually look at what 'gitk'
does. Maybe it _does_ re-run diff when you click that thing.

But that gitk behavior - which I also do use - is probably the best
way to explain the feature.  It's just that I also want to get that
"New version" behavior for plain "git diff" on the command line.

I don't know what a good command line option would be, though. I'd
like it to be somehat short, because the whole point of this is to be
a convenience feature.

So "--new/old"? "--pre/post"?

Or it could be something random, and tie it with the existing "-U"
option, where "-U+" would be "positive side only", and "-U5-" would be
"5 context lines, negative side only". Very dense and convenient,
maybe not all that intuitive?

                 Linus
Junio C Hamano March 10, 2022, 9:26 p.m. UTC | #3
Linus Torvalds <torvalds@linux-foundation.org> writes:

> So "--new/old"? "--pre/post"?
>
> Or it could be something random, and tie it with the existing "-U"
> option, where "-U+" would be "positive side only", and "-U5-" would be
> "5 context lines, negative side only". Very dense and convenient,
> maybe not all that intuitive?

I often use -W and the above would give us a natural extension, but
I agree that is a bit too dense and totally unintuitive.  As we use
parse-options for patch output formatting options, my pick would be
"--new-only" vs "--old-only" (if there were existing options that
has new/old in their names, "--preimage-only" and "--postimage-only"
would also work, but that's much longer), and let the option parser
accept unique abbreviations like "--new" and "--old".
Linus Torvalds March 10, 2022, 10:13 p.m. UTC | #4
On Thu, Mar 10, 2022 at 1:26 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> I often use -W and the above would give us a natural extension, but
> I agree that is a bit too dense and totally unintuitive.  As we use
> parse-options for patch output formatting options, my pick would be
> "--new-only" vs "--old-only" (

I was "ok, that's really easy" and did

+               OPT_ALIAS(0, "new-only", "no-output-indicator-old"),
+               OPT_ALIAS(0, "old-only", "no-output-indicator-new"),

but sadly the parse-options alias code isn't quite smart enough.

Doing it with an explicit callback obviously works, but the "unique
abbreviations" part doesn't actually work for me. I think it's due to
PARSE_OPT_KEEP_UNKNOWN making the abbreviated options not work, but I
don't know tha option parsing code well enough.

Here's the stupid patch that "works" but doesn't allow the shortened
version. Maybe somebody can point out what silly thing I did wrong.

               Linus
Linus Torvalds March 11, 2022, 12:01 a.m. UTC | #5
On Thu, Mar 10, 2022 at 2:13 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Here's the stupid patch that "works" but doesn't allow the shortened
> version. Maybe somebody can point out what silly thing I did wrong.

I just created a short alias to do this. Maybe there's some smarter
option, but this seems to work.

I've updated the commit message - I kept the --no-output-indicator-xyz
form since it really logically ends up being exactly that, but I guess
those changes could also be dropped.

Hmm?

              Linus
Ævar Arnfjörð Bjarmason March 11, 2022, 7:48 a.m. UTC | #6
On Thu, Mar 10 2022, Linus Torvalds wrote:

> On Thu, Mar 10, 2022 at 2:13 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:

If it's not too much trouble inline patches like those you send to the
LKML would be preferred :) Re-arranging things a bit to quote them....

> diff --git a/diff.c b/diff.c
> index 2bd5e0d817..f37f0b383a 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -1254,6 +1254,8 @@ static void emit_line_ws_markup(struct diff_options *o,
>  	const char *ws = NULL;
>  	int sign = o->output_indicators[sign_index];
>  
> +	if (!sign)
> +		return;
>  	if (o->ws_error_highlight & ws_rule) {
>  		ws = diff_get_color_opt(o, DIFF_WHITESPACE);
>  		if (!*ws)
> @@ -4986,6 +4988,10 @@ static int diff_opt_char(const struct option *opt,
>  {
>  	char *value = opt->value;
>  
> +	if (unset) {
> +		*value = 0;
> +		return 0;
> +	}
>  	BUG_ON_OPT_NEG(unset);
>  	if (arg[1])
>  		return error(_("%s expects a character, got '%s'"),
> @@ -4994,6 +5000,17 @@ static int diff_opt_char(const struct option *opt,
>  	return 0;
>  }
>  
> +static int diff_opt_no_char(const struct option *opt,
> +			 const char *arg, int unset)
> +{
> +	char *value = opt->value;
> +
> +	BUG_ON_OPT_NEG(unset);
> +	BUG_ON_OPT_ARG(arg);
> +	*value = 0;
> +	return 0;
> +}
> +
>  static int diff_opt_color_moved(const struct option *opt,
>  				const char *arg, int unset)
>  {
> @@ -5476,17 +5493,27 @@ static void prep_parse_options(struct diff_options *options)
>  			       &options->output_indicators[OUTPUT_INDICATOR_NEW],
>  			       N_("<char>"),
>  			       N_("specify the character to indicate a new line instead of '+'"),
> -			       PARSE_OPT_NONEG, diff_opt_char),
> +			       0, diff_opt_char),
>  		OPT_CALLBACK_F(0, "output-indicator-old",
>  			       &options->output_indicators[OUTPUT_INDICATOR_OLD],
>  			       N_("<char>"),
>  			       N_("specify the character to indicate an old line instead of '-'"),
> -			       PARSE_OPT_NONEG, diff_opt_char),
> +			       0, diff_opt_char),
>  		OPT_CALLBACK_F(0, "output-indicator-context",
>  			       &options->output_indicators[OUTPUT_INDICATOR_CONTEXT],
>  			       N_("<char>"),
>  			       N_("specify the character to indicate a context instead of ' '"),
>  			       PARSE_OPT_NONEG, diff_opt_char),
> +		OPT_CALLBACK_F(0, "new-only",
> +			       &options->output_indicators[OUTPUT_INDICATOR_OLD], NULL,
> +			       N_("show only new lines in diff"),
> +			       PARSE_OPT_NONEG | PARSE_OPT_NOARG, diff_opt_no_char),
> +		OPT_CALLBACK_F(0, "old-only",
> +			       &options->output_indicators[OUTPUT_INDICATOR_NEW], NULL,
> +			       N_("show only old lines in diff"),
> +			       PARSE_OPT_NONEG | PARSE_OPT_NOARG, diff_opt_no_char),
> +		OPT_ALIAS(0, "new", "new-only"),
> +		OPT_ALIAS(0, "old", "old-only"),
>  

FWIW the reason...

>>
>> Here's the stupid patch that "works" but doesn't allow the shortened
>> version. Maybe somebody can point out what silly thing I did wrong.
>
> I just created a short alias to do this. Maybe there's some smarter
> option, but this seems to work.

..you needed to do that is because we pass PARSE_OPT_KEEP_UNKNOWN
parse_options() there, which turns off our abbreviation discovery logic,
i.e. where we'll take a --foo, --foob, --fooba if we have a --foobar
option defined.

Looking at it there appears to be no good reason for why it's so
overzelous. If I remove the relevant PARSE_OPT_KEEP_UNKNOWN logic in
parse-options.c our entire test suite passes, except for one obscure
part where "git format-patch --output=x" needs to not abbreviate to "git
format-patch --output-directory=x".

Which, for reasons is something where we do option parsing in two
passes, i.e. we hand the "output" option off to the revision walker.

We really should just teach those callsites to "grab" the revisions.c
options and do the parse in one pass, but in the meantime this is a less
invasive way to have that case work, which makes your code work without
the OPT_ALIAS() hunk:

	diff --git a/builtin/log.c b/builtin/log.c
	index c211d66d1d0..adacc65bc7e 100644
	--- a/builtin/log.c
	+++ b/builtin/log.c
	@@ -1811,7 +1811,8 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix)
	 			    PARSE_OPT_NONEG, subject_prefix_callback),
	 		OPT_CALLBACK_F('o', "output-directory", &output_directory,
	 			    N_("dir"), N_("store resulting files in <dir>"),
	-			    PARSE_OPT_NONEG, output_directory_callback),
	+			    PARSE_OPT_NONEG | PARSE_OPT_NO_ABBREV,
	+			    output_directory_callback),
	 		OPT_CALLBACK_F('k', "keep-subject", &rev, NULL,
	 			    N_("don't strip/add [PATCH]"),
	 			    PARSE_OPT_NOARG | PARSE_OPT_NONEG, keep_callback),
	diff --git a/parse-options.c b/parse-options.c
	index 6e57744fd22..9d0c4694482 100644
	--- a/parse-options.c
	+++ b/parse-options.c
	@@ -332,7 +332,7 @@ static enum parse_opt_result parse_long_opt(
	 			rest = NULL;
	 		if (!rest) {
	 			/* abbreviated? */
	-			if (!(p->flags & PARSE_OPT_KEEP_UNKNOWN) &&
	+			if (!(options->flags & PARSE_OPT_NO_ABBREV) &&
	 			    !strncmp(long_name, arg, arg_end - arg)) {
	 is_abbreviated:
	 				if (abbrev_option &&
	diff --git a/parse-options.h b/parse-options.h
	index 685fccac137..f6372f60edb 100644
	--- a/parse-options.h
	+++ b/parse-options.h
	@@ -48,6 +48,7 @@ enum parse_opt_option_flags {
	 	PARSE_OPT_NOCOMPLETE = 1 << 9,
	 	PARSE_OPT_COMP_ARG = 1 << 10,
	 	PARSE_OPT_CMDMODE = 1 << 11,
	+	PARSE_OPT_NO_ABBREV = 1 << 12,
	 };
	 
	 enum parse_opt_result {

Of course that also makes --new and --old work (even --ne and --ol), but
we generally accept that in other places, so... :)

> I've updated the commit message - I kept the --no-output-indicator-xyz
> form since it really logically ends up being exactly that, but I guess
> those changes could also be dropped.

I see this interacts nicely with the -I option, at least in my testing,
i.e. (with my change applied) this will ignore the first hunk of your
patch:

    --new -p -I'sign|return'

What this doesn't interact "well" with, or perhaps it's what we actually
want is e.g. -G:

    git log -1 -p -G'PARSE_OPT_NONEG, diff_opt_char' --new

If the user is expecting us to search through the displayed hunks this
should not show your change, but it does.

FWIW I have a local patch I've been meaning to submit which extends -G
to allow you to search through the full displayed context, right now we
strip it down to the "+" and "-" lines and regex match on that.

So you could do, in this case:

    git log --pickaxe-patch -G'^-.*diff_opt_char' -1

To search through your removed hunks.

All of which is to say that I think the core semantics you're
implementing here make sense, it's just worth thinking about, and we
should have tests for some of the edge cases for a non-RFC. I.e. it's
not obvious that we have a couple of diff "passes", and -G, -I and these
--new-only and --old-only options search through different versions of
that diff.
Linus Torvalds March 11, 2022, 7:15 p.m. UTC | #7
On Fri, Mar 11, 2022 at 12:43 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> If it's not too much trouble inline patches like those you send to the
> LKML would be preferred :)

Heh. When I send patches to LKML inline I end up doing whitespace
damage to them.

Part of the problem is that gmail - which is what I use for normal
email - makes it quite hard to send patches with the legacy tools I
have when you have set your account up to have all the security
measures (2FA etc).

In fact, I think these days gmail actively blocks any clients it
doesn't trust, even if you were to do the special app key thing (which
I have occasionally done).

Despite that, I find gmail very convenient for my workflow, but  the
two issues I have:

 (a) I'd like to have "inline attachments" for the web interface so
that I could attach a text file that would *not* get whitespace
damaged

 (b) the mobile client should have a "no html" mode (for when I'm on
the road and reply to lkml)

And I know a lot of people inside of google, but I gave up trying to
get the message through to the gmail people (everybody I knew said
"it's a different group and we have our own issues with them")

Yes, yes, I could use other tools - and other email setups - for
sending out patches, but realistically these days I never actually
send out patches any more. These kinds of very occasional git patches
are the exception, not the rule.

So, for example, I could trivially use "torvalds@kernel.org" to send
out patches, but I've tried to avoid having multiple active email
accounts and confusing people with different names etc.

My lkml patches are invariably examples (ie "how about we do it this
way") and then I actually mostly *prefer* that they be
whitespace-damaged so that people won't apply them mindlessly. In
fact, I often explicitly indent the patch to make it really obvious
that "yes, this is a patch, but it's meant to be thought about, not
used mindlessly".

> FWIW the reason... [...]
> ..you needed to do that is because we pass PARSE_OPT_KEEP_UNKNOWN
> parse_options() there, which turns off our abbreviation discovery logic,
> i.e. where we'll take a --foo, --foob, --fooba if we have a --foobar
> option defined.

Yeah, I had gotten to the PARSE_OPT_KEEP_UNKNOWN part, and then
decided I didn't understand why the option parsing code did that, and
I wasn't going to touch it.

I saw your patch to fix it, and it looked sane to me. Thanks.

> We really should just teach those callsites to "grab" the revisions.c
> options and do the parse in one pass,

I was assuming the two-pass thing to avoid doing any callbacks before
deciding the option is unambiguous.

Not that I see why that would be a big deal as long as you just error
out anyway, but maybe some users end up ignoring errors?

You know that code better than I do in any case.

> Of course that also makes --new and --old work (even --ne and --ol), but
> we generally accept that in other places, so... :)

I do wonder if abbreviation should have some limit (ie single-letter
abbreviation sounds very suspect even if they are unique, and even
two-letter ones sound odd), but it probably doesn't matter.

> I see this interacts nicely with the -I option, at least in my testing,
> i.e. (with my change applied) this will ignore the first hunk of your
> patch:
>
>     --new -p -I'sign|return'

So it wasn't really intentional - I saw it _purely_ as a "filter
output" thing, not as a "filter the patch generation".

In fact, I like the part where it often shows patch hunks that have no
actual changes, because the only changes in that hunk were removals.
To me, that's a feature - it shows the end result of the patch in that
place.

But it also means:

> What this doesn't interact "well" with, or perhaps it's what we actually
> want is e.g. -G:

Yeah, exactly because I saw it as purely a "filter the final output of
the patch" thing, it means that things that filter on the *contents*
of the patch aren't affected.

So that was my mental model, but yes, if your mental model is that
"-G" acts on the output, then it's a mis-feature.

IOW, to me, this would be a bigger decision about what kinds of
semantics you want.

Now, the *real* downside of "--new" to me is that it really doesn't
work very well with --word-diff. That's where I think it would really
make sense to still work just fine. Particularly with
"--word-diff=color" I think "--new" (and "--old") would make a lot of
sense, but my patch very much did *not* change the word-diff output
machinery to take the output_indicators into account.

A big reason for that is that while I occasionally use --word-diff
(and find it very useful when I do), I've never touched that code, so
it's entirely unfamiliar to me.

                      Linus
Konstantin Ryabitsev March 11, 2022, 7:42 p.m. UTC | #8
On Fri, Mar 11, 2022 at 11:15:10AM -0800, Linus Torvalds wrote:
> So, for example, I could trivially use "torvalds@kernel.org" to send
> out patches, but I've tried to avoid having multiple active email
> accounts and confusing people with different names etc.

FWIW, you can send mail through mail.kernel.org with your regular
torvalds@linux-foundation.org email. This is what akpm does, and it's kosher
as far as DKIM/DMARC is concerned.

If you'd like to try that out, you can set that up with git-send-email:
https://korg.docs.kernel.org/mail.html#sending-outgoing-mail

Just set your sendemail.from=torvalds@linux-foundation.org instead of
torvalds@kernel.org.

-K
diff mbox series

Patch

From 95e926c38267eb7ec8956ea47db55bf8080fb282 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu, 10 Mar 2022 11:00:43 -0800
Subject: [PATCH 1/2] Allow '--no-output-indicator-{old,new}' to disable diff
 output

This is particularly useful if you want to just see the end result of a
diff, without the original lines that have been removed (or the reverse).

That's a fairly common model in various GUI diff viewers, but it's not
unusual to want to just see "what is the end result of my changes"
without seeing (a) everything that didn't change and (b) the old removed
state.

Example:

    git show --no-output-indicator-old 04bf052eef

to see just the end result of the changes in commit 04bf052eef ("grep:
simplify config parsing and option parsing").

This is a technology presentation, I think it needs a shorter option
name too if people agree that this is useful (this basically replaces my
hacky

    git diff | grep -v '^-'

that I use for the same purpose, but the very long argument name
obviously makes it no more convenient, although the end result is
qualitatively better).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 diff.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index 2bd5e0d817..895951b849 100644
--- a/diff.c
+++ b/diff.c
@@ -1254,6 +1254,8 @@  static void emit_line_ws_markup(struct diff_options *o,
 	const char *ws = NULL;
 	int sign = o->output_indicators[sign_index];
 
+	if (!sign)
+		return;
 	if (o->ws_error_highlight & ws_rule) {
 		ws = diff_get_color_opt(o, DIFF_WHITESPACE);
 		if (!*ws)
@@ -4986,6 +4988,10 @@  static int diff_opt_char(const struct option *opt,
 {
 	char *value = opt->value;
 
+	if (unset) {
+		*value = 0;
+		return 0;
+	}
 	BUG_ON_OPT_NEG(unset);
 	if (arg[1])
 		return error(_("%s expects a character, got '%s'"),
@@ -5476,12 +5482,12 @@  static void prep_parse_options(struct diff_options *options)
 			       &options->output_indicators[OUTPUT_INDICATOR_NEW],
 			       N_("<char>"),
 			       N_("specify the character to indicate a new line instead of '+'"),
-			       PARSE_OPT_NONEG, diff_opt_char),
+			       0, diff_opt_char),
 		OPT_CALLBACK_F(0, "output-indicator-old",
 			       &options->output_indicators[OUTPUT_INDICATOR_OLD],
 			       N_("<char>"),
 			       N_("specify the character to indicate an old line instead of '-'"),
-			       PARSE_OPT_NONEG, diff_opt_char),
+			       0, diff_opt_char),
 		OPT_CALLBACK_F(0, "output-indicator-context",
 			       &options->output_indicators[OUTPUT_INDICATOR_CONTEXT],
 			       N_("<char>"),
-- 
2.35.1.459.gabbef95d73