diff: setup pager only before diff contents truly ready

Message ID pull.1817.git.git.1729370390416.gitgitgadget@gmail.com (mailing list archive)
State New

Commit Message

Philip Yung Oct. 19, 2024, 8:39 p.m. UTC
From: y5c4l3 <y5c4l3@proton.me>

git-diff setups pager at an early stage in cmd_diff; running diff with
invalid options like git diff --invalid will unexpectedly starts a
pager, which causes behavior inconsistency.

The pager setup routine should be moved right before the real diff
contents, in case there is any argv error.

Signed-off-by: y5c4l3 <y5c4l3@proton.me>
---
    diff: setup pager only before diff contents truly ready

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1817%2Fy5c4l3%2Fdiff-invalid-argv-remove-pager-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1817/y5c4l3/diff-invalid-argv-remove-pager-v1
Pull-Request: https://github.com/git/git/pull/1817

 builtin/diff.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)


base-commit: 15030f9556f545b167b1879b877a5d780252dc16

Comments

Kristoffer Haugsbakk Oct. 19, 2024, 8:57 p.m. UTC | #1
On Sat, Oct 19, 2024, at 22:39, Y5 via GitGitGadget wrote:
> From: y5c4l3 <y5c4l3@proton.me>
>
> git-diff setups pager at an early stage in cmd_diff; running diff with
> invalid options like git diff --invalid will unexpectedly starts a

s/starts a/start a/

> pager, which causes behavior inconsistency.
>
> The pager setup routine should be moved right before the real diff
> contents, in case there is any argv error.

*Any* argv error?  Maybe “an argv error”?

“any argv error” looks like there isn’t an agreement on plural/singular.

Jeff King Oct. 19, 2024, 9:19 p.m. UTC | #2
On Sat, Oct 19, 2024 at 08:39:50PM +0000, Y5 via GitGitGadget wrote:

> git-diff setups pager at an early stage in cmd_diff; running diff with
> invalid options like git diff --invalid will unexpectedly starts a
> pager, which causes behavior inconsistency.

I do think the outcome is a little nicer for the user, but I'd hesitate
to call it more inconsistent. Most of the rest of Git is setting up the
pager in git.c, before we even call into the builtin. So any early
errors will likewise go to the pager. E.g., try "git log --foo".

So I dunno. I'm not strictly opposed to making things nicer when we can
do so easily.  But the endgame of this is probably getting rid of
USE_PAGER entirely and asking each builtin to decide when to commit to
using the pager (presumably after option parsing).

And even then, it wouldn't apply to commands implemented as an external
process. And of course we can still die(), etc, after starting the
pager. So it would never be totally consistent.

> Signed-off-by: y5c4l3 <y5c4l3@proton.me>

We usually ask for something approaching a legal name, as this sign-off
is supposed to be certifying the DCO (see the "dco" section in
Documentation/SubmittingPatches).

>  builtin/diff.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)

The patch itself looks plausibly correct. The biggest regression risk
would be missing a spot that needed a new setup_diff_pager() call, and I
suspect we don't have good test coverage here. But going top-down from
builtin_diff(), I don't see any paths you missed.

-Peff

Philip Yung Oct. 21, 2024, 12:11 a.m. UTC | #3
> errors will likewise go to the pager. E.g., try "git log --foo".

Hope that I didn't take it the wrong way, but I don't think `git log --foo`
starts a pager, where the routine `setup_pager()` is put after argv parsing.
(checked by `strace`)

> So I dunno. I'm not strictly opposed to making things nicer when we can
> do so easily. But the endgame of this is probably getting rid of
> USE_PAGER entirely and asking each builtin to decide when to commit to
> using the pager (presumably after option parsing).
> 
> And even then, it wouldn't apply to commands implemented as an external
> process. And of course we can still die(), etc, after starting the
> pager. So it would never be totally consistent.

Setting the example aside, the overall argument is convincing, since there is
currently no design that ensures pager consistency. Still, this is something we
can do right now to make at least our own builtins more consistent.

> We usually ask for something approaching a legal name, as this sign-off
> is supposed to be certifying the DCO (see the "dco" section in
> Documentation/SubmittingPatches).

Sorry if my first GitGitGadget experience bothers the mailing list, thanks for
the reminder. :)

> would be missing a spot that needed a new setup_diff_pager() call, and I
> suspect we don't have good test coverage here.

This is actually my concern as well when I was naively testing the coverage
using GDB, which turned out to be quite tedious. Would you consider it's fine to
add a pager consistency test for builtins, probably in another patch with regard
to `t7006-pager.sh` OR a new test `t7007`?

I'll reword and re-signoff this patch as soon as it looks really fine to you.

Philip Yung
Best Regards

Philip Yung Oct. 21, 2024, 12:17 a.m. UTC | #4
> s/starts a/start a/
>
> Any argv error? Maybe “an argv error”? 
> “any argv error” looks like there isn’t an agreement on plural/singular.
> 
> --
> Kristoffer Haugsbakk

Appreciate your revision, thanks!

Philip Yung
Best Regards

Jeff King Oct. 21, 2024, 7 p.m. UTC | #5
On Mon, Oct 21, 2024 at 12:11:33AM +0000, Philip Yung wrote:

> > errors will likewise go to the pager. E.g., try "git log --foo".
> 
> Hope that I didn't take it the wrong way, but I don't think `git log --foo`
> starts a pager, where the routine `setup_pager()` is put after argv parsing.
> (checked by `strace`)

Hmm, this actually depends on config. If you have pager.log defined,
we'll start it early in git.c, but otherwise not until the setup_pager()
call.

I was mildly surprised that pager.diff would not have the same effect,
even with your patch. But that's because we only handle pager config if
RUN_SETUP is true, which it is not for diff (because we might be doing
an out-of-repo --no-index diff). And the reason for that is mostly
historical, as reading config early interferes with repo setup (though
I'm not even sure that's still the case, as check_pager_config() these days
uses the "early" config mechanism which is supposed to address that).

What a horrid mess of inconsistency and hacks. ;)

Likewise, any builtin that sets USE_PAGER in git.c will turn on the
pager early. So "git shortlog --foo" will go through the pager, as will
range-diff. I was somewhat surprised those are the only two these days.
Looks like 1fda91b511 (Fix 'git log' early pager startup error case,
2010-08-24) dropped many. And I think your patch is the spiritual
successor to that.
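
The early-pager rules described above can be sketched as a toy model. This is not the code in git.c: the flag names are borrowed from this thread, their numeric values are arbitrary, and pager_starts_early() and its pager_config parameter are hypothetical names introduced only for illustration.

```c
#include <assert.h>

/* Flag names mirror those mentioned in the thread; the values are arbitrary. */
#define RUN_SETUP          (1 << 0)
#define USE_PAGER          (1 << 1)
#define DELAY_PAGER_CONFIG (1 << 2)

/*
 * Hypothetical helper, not a Git function.
 * pager_config: -1 if pager.<cmd> is unset, 0 if it turns the pager off,
 * 1 if it turns the pager on.
 * Returns 1 if the pager would start before the builtin itself runs.
 */
static int pager_starts_early(unsigned flags, int pager_config)
{
	int use_pager = -1;

	assert(pager_config >= -1 && pager_config <= 1);

	/* Config is consulted early only for RUN_SETUP commands that did not opt out. */
	if ((flags & RUN_SETUP) && !(flags & DELAY_PAGER_CONFIG))
		use_pager = pager_config;

	/* Without a decision from config, USE_PAGER forces the pager on. */
	if (use_pager == -1 && (flags & USE_PAGER))
		use_pager = 1;

	return use_pager == 1;
}
```

Under this model, a RUN_SETUP command with pager.<cmd> enabled pages early, a command without RUN_SETUP ignores the config at this stage, and a USE_PAGER command pages early even with no config at all.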

So I think in an ideal world we'd:

  - convert those two commands to do the pager setup themselves and
    retire the USE_PAGER flag entirely

  - move configured pager handling down into more commands. So git-log
    should set DELAY_PAGER_CONFIG and then call setup_auto_pager()
    rather than setup_pager(). Ideally DELAY_PAGER_CONFIG would be the
    default, but we can't do that until every builtin makes its own call
    to setup_auto_pager() at the right moment.

  - push calls to setup_pager() (or setup_auto_pager()) as far down
    within commands as possible (right before we start generating
    output). Your patch does that for git-diff, but there may be other
    cases.

  - consistently handle pager config whether RUN_SETUP is set or not.
    That means git-diff should set DELAY_PAGER_CONFIG, since it handles
    the pager itself.

And that would make things more consistent overall, and avoid pushing
early errors into the pager (though of course it would still be possible
to get some errors in the pager if they happen after we start it).
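
The ordering this list aims for (validate arguments first, commit to the pager only right before output) might look like the following minimal sketch. Everything here is hypothetical: run_toy_diff(), start_pager_stub() and the two accepted options are stand-ins for illustration, and the stub only records that it was called instead of launching a real pager.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Records whether the (hypothetical) pager was started. */
static int pager_started;

/* Stand-in for a real pager launch; it only sets a flag. */
static void start_pager_stub(void)
{
	pager_started = 1;
}

/*
 * Toy command: check every argument before touching the pager.
 * Returns 0 on success and 1 on an invalid option.
 */
static int run_toy_diff(int argc, const char **argv)
{
	int i;

	assert(argv != NULL);
	pager_started = 0;

	for (i = 1; i < argc; i++) {
		/* Only two options are accepted in this sketch. */
		if (strcmp(argv[i], "--stat") && strcmp(argv[i], "--cached")) {
			fprintf(stderr, "error: invalid option: %s\n", argv[i]);
			return 1; /* fail before any pager exists */
		}
	}

	/* Arguments are valid, so output is about to start. */
	start_pager_stub();
	puts("(diff output would be generated here)");
	return 0;
}
```

With this shape, an invalid option is reported directly on stderr and the pager is never started, while a valid invocation starts it immediately before the first line of output.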

I don't blame you if you don't want to start down that rabbit hole. :) I
think it would probably be OK to peck away at it incrementally, and your
patch does that.

> > would be missing a spot that needed a new setup_diff_pager() call, and I
> > suspect we don't have good test coverage here.
> 
> This is actually my concern as well when I was naively testing the coverage
> using GDB, which turned out to be quite tedious. Would you consider it's fine to
> add a pager consistency test for builtins, probably in another patch with regard
> to `t7006-pager.sh` OR a new test `t7007`?

TBH, I am not all that worried about adding tests just for your patch.
You'd need to identify all of the possible diff code paths in order to
add tests for them, which is the same thing you had to do to fix the
code paths. I was mostly just commenting that we're not likely to be
able to rely on existing tests to help us here.

It might be worth adding a test that shows off your improved diff
behavior, though I would be OK if it was a representative command and
not exhaustive. I think adding to t7006 should be fine.

If we fixed some of the bits I mentioned above, some of that should
likewise be covered by tests.

-Peff

Taylor Blau Oct. 21, 2024, 7:38 p.m. UTC | #6
On Mon, Oct 21, 2024 at 03:00:45PM -0400, Jeff King wrote:
> So I think in an ideal world we'd:
>
>   - convert those two commands to do the pager setup themselves and
>     retire the USE_PAGER flag entirely
>
>   - move configured pager handling down into more commands. So git-log
>     should set DELAY_PAGER_CONFIG and then call setup_auto_pager()
>     rather than setup_pager(). Ideally DELAY_PAGER_CONFIG would be the
>     default, but we can't do that until every builtin makes its own call
>     to setup_auto_pager() at the right moment.
>
>   - push calls to setup_pager() (or setup_auto_pager()) as far down
>     within commands as possible (right before we start generating
>     output). Your patch does that for git-diff, but there may be other
>     cases.
>
>   - consistently handle pager config whether RUN_SETUP is set or not.
>     That means git-diff should set DELAY_PAGER_CONFIG, since it handles
>     the pager itself.
>
> And that would make things more consistent overall, and avoid pushing
> early errors into the pager (though of course it would still be possible
> to get some errors in the pager if they happen after we start it).
>
> I don't blame you if you don't want to start down that rabbit hole. :) I
> think it would probably be OK to peck away at it incrementally, and your
> patch does that.

Nicely put. I think what's nice about the patch here is that it starts
us down that direction you outlined above, so we'd want it regardless of
how much of the rest of the work Y5 is willing to do.

> > > would be missing a spot that needed a new setup_diff_pager() call, and I
> > > suspect we don't have good test coverage here.
> >
> > This is actually my concern as well when I was naively testing the coverage
> > using GDB, which turned out to be quite tedious. Would you consider it's fine to
> > add a pager consistency test for builtins, probably in another patch with regard
> > to `t7006-pager.sh` OR a new test `t7007`?
>
> TBH, I am not all that worried about adding tests just for your patch.
> You'd need to identify all of the possible diff code paths in order to
> add tests for them, which is the same thing you had to do to fix the
> code paths. I was mostly just commenting that we're not likely to be
> able to rely on existing tests to help us here.
>
> It might be worth adding a test that shows off your improved diff
> behavior, though I would be OK if it was a representative command and
> not exhaustive. I think adding to t7006 should be fine.

Agreed.

Thanks,
Taylor

Patch

diff --git a/builtin/diff.c b/builtin/diff.c
index dca52d4221e..03340173700 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -105,6 +105,7 @@  static void builtin_diff_b_f(struct rev_info *revs,
 		     1, 0,
 		     blob[0]->path ? blob[0]->path : path,
 		     path);
+	setup_diff_pager(&revs->diffopt);
 	diffcore_std(&revs->diffopt);
 	diff_flush(&revs->diffopt);
 }
@@ -129,6 +130,7 @@  static void builtin_diff_blobs(struct rev_info *revs,
 		     &blob[0]->item->oid, &blob[1]->item->oid,
 		     1, 1,
 		     blob_path(blob[0]), blob_path(blob[1]));
+	setup_diff_pager(&revs->diffopt);
 	diffcore_std(&revs->diffopt);
 	diff_flush(&revs->diffopt);
 }
@@ -164,6 +166,7 @@  static void builtin_diff_index(struct rev_info *revs,
 	} else if (repo_read_index(the_repository) < 0) {
 		die_errno("repo_read_cache");
 	}
+	setup_diff_pager(&revs->diffopt);
 	run_diff_index(revs, option);
 }
 
@@ -201,6 +204,7 @@  static void builtin_diff_tree(struct rev_info *revs,
 		oid[swap] = &ent0->item->oid;
 		oid[1 - swap] = &ent1->item->oid;
 	}
+	setup_diff_pager(&revs->diffopt);
 	diff_tree_oid(oid[0], oid[1], "", &revs->diffopt);
 	log_tree_diff_flush(revs);
 }
@@ -227,6 +231,7 @@  static void builtin_diff_combined(struct rev_info *revs,
 		if (i != first_non_parent)
 			oid_array_append(&parents, &ent[i].item->oid);
 	}
+	setup_diff_pager(&revs->diffopt);
 	diff_tree_combined(&ent[first_non_parent].item->oid, &parents, revs);
 	oid_array_clear(&parents);
 }
@@ -283,6 +288,7 @@  static void builtin_diff_files(struct rev_info *revs, int argc, const char **arg
 				    0) < 0) {
 		die_errno("repo_read_index_preload");
 	}
+	setup_diff_pager(&revs->diffopt);
 	run_diff_files(revs, options);
 }
 
@@ -523,8 +529,6 @@  int cmd_diff(int argc,
 	rev.diffopt.flags.recursive = 1;
 	rev.diffopt.rotate_to_strict = 1;
 
-	setup_diff_pager(&rev.diffopt);
-
 	/*
 	 * Do we have --cached and not have a pending object, then
 	 * default to HEAD by hand.  Eek.