pager: die when paging to non-existing command

Message ID	f7106878-5ec5-4fe7-940b-2fb1d9707f7d@gmail.com (mailing list archive)
State	Superseded
Headers	show Received: from mail-wr1-f47.google.com (mail-wr1-f47.google.com [209.85.221.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D8D811A4F1D for <git@vger.kernel.org>; Thu, 20 Jun 2024 17:25:46 +0000 (UTC) Message-ID: <f7106878-5ec5-4fe7-940b-2fb1d9707f7d@gmail.com> Date: Thu, 20 Jun 2024 19:25:43 +0200 Precedence: bulk MIME-Version: 1.0 User-Agent: Mozilla Thunderbird From: =?utf-8?q?Rub=C3=A9n_Justo?= <rjusto@gmail.com> Content-Language: en-US To: Git List <git@vger.kernel.org> Cc: Jeff King <peff@peff.net> Subject: [PATCH] pager: die when paging to non-existing command Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit
Series	pager: die when paging to non-existing command \| expand pager: die when paging to non-existing command

Rubén Justo June 20, 2024, 5:25 p.m. UTC

When trying to execute a non-existent program from GIT_PAGER, we display
an error.  However, we also send the complete text to the terminal
and return a successful exit code.  This can be confusing for the user
and the displayed error could easily become obscured by a lengthy
text.

For example, here the error message would be very far above after
sending 50 MB of text:

    $ GIT_PAGER=non-existent t/test-terminal.perl git log | wc -c
    error: cannot run non-existent: No such file or directory
    50314363

Let's make the error clear by aborting the process and return an error
so that the user can easily correct their mistake.

This will be the result of the change:

    $ GIT_PAGER=non-existent t/test-terminal.perl git log | wc -c
    error: cannot run non-existent: No such file or directory
    fatal: unable to start the pager: 'non-existent'
    0

The behavior change we're introducing in this commit affects two tests
in t7006, which is a good sign regarding test coverage and requires us
to address it.

The first test is 'git skips paging non-existing command'.  This test
comes from f7991f01f2 (t7006: clean up SIGPIPE handling in trace2 tests,
2021-11-21,) where a modification was made to a test that was originally
introduced in c24b7f6736 (pager: test for exit code with and without
SIGPIPE, 2021-02-02).  That original test was, IMHO, in the same
direction we're going in this commit.

At any rate, this test obviously needs to be adjusted to check the new
behavior we are introducing.  Do it.

The second test being affected is: 'non-existent pager doesnt cause
crash', introduced in f917f57f40 (pager: fix crash when pager program
doesn't exist, 2021-11-24).  As its name states, it has the intention of
checking that we don't introduce a regression that produces a crash when
GIT_PAGER points to a nonexistent program.

This test could be considered redundant nowadays, due to us already
having several tests checking implicitly what a non-existent command in
GIT_PAGER produces.  However, let's maintain a good belt-and-suspenders
strategy; adapt it to the new world.

Finally, it's worth noting that we are not changing the behavior if the
command specified in GIT_PAGER is a shell command.  In such cases, it
is:

    $ GIT_PAGER=:\;non-existent t/test-terminal.perl git log
    :;non-existent: 1: non-existent: not found
    died of signal 13 at t/test-terminal.perl line 33.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
---
 pager.c          |  2 +-
 t/t7006-pager.sh | 15 +++------------
 2 files changed, 4 insertions(+), 13 deletions(-)

Junio C Hamano June 20, 2024, 7:04 p.m. UTC | #1

Rubén Justo <rjusto@gmail.com> writes:

> Finally, it's worth noting that we are not changing the behavior if the
> command specified in GIT_PAGER is a shell command.  In such cases, it
> is:
>
>     $ GIT_PAGER=:\;non-existent t/test-terminal.perl git log
>     :;non-existent: 1: non-existent: not found
>     died of signal 13 at t/test-terminal.perl line 33.

IOW, the behaviours between the case where pager is spawned via the
shell and bypassing the shell are different , and the case where the
shell is involved behaves in a way that is easier to realize the
mistake, so change the other case to match.  WHich makes sense.

This seems to be an ancient regression introduced in bfdd9ffd
(Windows: Make the pager work., 2007-12-08), which did not really
affect anybody but MinGW users, but ea27a18c (spawn pager via
run_command interface, 2008-07-22) inherited the "if we failed to
start the pager, just silently return" from it when non-MinGW code
was unified to use the run_command() codepath (the latter is
attributed to Peff, which I presume is the reason why you cc'ed
him?).

> Signed-off-by: Rubén Justo <rjusto@gmail.com>
> ---
>  pager.c          |  2 +-
>  t/t7006-pager.sh | 15 +++------------
>  2 files changed, 4 insertions(+), 13 deletions(-)
>
> diff --git a/pager.c b/pager.c
> index e9e121db69..e4291cd0aa 100644
> --- a/pager.c
> +++ b/pager.c
> @@ -137,7 +137,7 @@ void setup_pager(void)
>  	pager_process.in = -1;
>  	strvec_push(&pager_process.env, "GIT_PAGER_IN_USE");
>  	if (start_command(&pager_process))
> -		return;
> +		die("unable to start the pager: '%s'", pager);

If this error string is not used elsewhere, it probably is a good
idea to "revert" to the original error message lost by ea27a18c,
which was:

		die("unable to execute pager '%s'", pager);

But I do not think of a reason why we want to avoid dying here.

Just in case there is a reason why we should instead silently return
on MinGW, I'll Cc the author of bfdd9ffd, though.

Will queue.  Thanks.

Rubén Justo June 20, 2024, 8:22 p.m. UTC | #2

On Thu, Jun 20, 2024 at 12:04:03PM -0700, Junio C Hamano wrote:

> > +		die("unable to start the pager: '%s'", pager);
> 
> If this error string is not used elsewhere, it probably is a good
> idea to "revert" to the original error message lost by ea27a18c,
> which was:
> 
> 		die("unable to execute pager '%s'", pager);

Makes sense.  Let me know if you need me to reroll.

> Just in case there is a reason why we should instead silently return
> on MinGW, I'll Cc the author of bfdd9ffd, though.

Yup.  I did notice the MINGW conditions in t7006 but, to be honest, I
hadn't thought about this.  Thank you for considering it and seeking
confirmation.

Junio C Hamano June 20, 2024, 9:03 p.m. UTC | #3

Rubén Justo <rjusto@gmail.com> writes:

> On Thu, Jun 20, 2024 at 12:04:03PM -0700, Junio C Hamano wrote:
>
>> > +		die("unable to start the pager: '%s'", pager);
>> 
>> If this error string is not used elsewhere, it probably is a good
>> idea to "revert" to the original error message lost by ea27a18c,
>> which was:
>> 
>> 		die("unable to execute pager '%s'", pager);
>
> Makes sense.  Let me know if you need me to reroll.
>
>> Just in case there is a reason why we should instead silently return
>> on MinGW, I'll Cc the author of bfdd9ffd, though.
>
> Yup.  I did notice the MINGW conditions in t7006 but, to be honest, I
> hadn't thought about this.  Thank you for considering it and seeking
> confirmation.

After these questions are answered satisfactory, can you send an
updated version to conclude the topic?

Thanks.

Johannes Sixt June 20, 2024, 10:17 p.m. UTC | #4

Am 20.06.24 um 21:04 schrieb Junio C Hamano:
> Just in case there is a reason why we should instead silently return
> on MinGW, I'll Cc the author of bfdd9ffd, though.

I don't think there is a reason. IIRC, originally on Windows, failing to
start a pager would still let Git operate normally, just without paged
output. I might have regarded this as better than to fail the operation.

-- Hannes

Junio C Hamano June 20, 2024, 10:35 p.m. UTC | #5

Johannes Sixt <j6t@kdbg.org> writes:

> Am 20.06.24 um 21:04 schrieb Junio C Hamano:
>> Just in case there is a reason why we should instead silently return
>> on MinGW, I'll Cc the author of bfdd9ffd, though.
>
> I don't think there is a reason. IIRC, originally on Windows, failing to
> start a pager would still let Git operate normally, just without paged
> output. I might have regarded this as better than to fail the operation.

The "better keep going than to fail" is what Rubén finds worse, so
both sides are quite understandable.

It is unlikely that real-world users are taking advantage of the
fact.  If they do not want their invocation of Git command paged,
"GIT_PAGER=cat git foo" is just as easy as "GIT_PAGER=no git foo",
and if it was done by mistake to configure a non-working pager
(e.g., configure core.pager to the program xyzzy and then
uninstalling xyzzy without realizing you still have users), fixing
it would be a one-time operation either way (you update core.pager
or you reinstall xyzzy), so I would say that it is better to make
the failure more stand out.

Thanks for a quick response.

Jeff King June 21, 2024, 6:40 a.m. UTC | #6

On Thu, Jun 20, 2024 at 07:25:43PM +0200, Rubén Justo wrote:

> When trying to execute a non-existent program from GIT_PAGER, we display
> an error.  However, we also send the complete text to the terminal
> and return a successful exit code.  This can be confusing for the user
> and the displayed error could easily become obscured by a lengthy
> text.
> 
> For example, here the error message would be very far above after
> sending 50 MB of text:
> 
>     $ GIT_PAGER=non-existent t/test-terminal.perl git log | wc -c
>     error: cannot run non-existent: No such file or directory
>     50314363
> 
> Let's make the error clear by aborting the process and return an error
> so that the user can easily correct their mistake.
> 
> This will be the result of the change:
> 
>     $ GIT_PAGER=non-existent t/test-terminal.perl git log | wc -c
>     error: cannot run non-existent: No such file or directory
>     fatal: unable to start the pager: 'non-existent'
>     0

OK. My initial reaction was "eh, who care? execve() failing is only one
error mode, and we might see all kinds of failure modes from a missing
or broken pager".

But this:

> Finally, it's worth noting that we are not changing the behavior if the
> command specified in GIT_PAGER is a shell command.  In such cases, it
> is:
> 
>     $ GIT_PAGER=:\;non-existent t/test-terminal.perl git log
>     :;non-existent: 1: non-existent: not found
>     died of signal 13 at t/test-terminal.perl line 33.

...shows what happens in those other cases, and you are making things
more consistent. So that seems reasonable to me.

> The behavior change we're introducing in this commit affects two tests
> in t7006, which is a good sign regarding test coverage and requires us
> to address it.
> 
> The first test is 'git skips paging non-existing command'.  This test
> comes from f7991f01f2 (t7006: clean up SIGPIPE handling in trace2 tests,
> 2021-11-21,) where a modification was made to a test that was originally
> introduced in c24b7f6736 (pager: test for exit code with and without
> SIGPIPE, 2021-02-02).  That original test was, IMHO, in the same
> direction we're going in this commit.

Yeah, the point of f7991f01f2 was just to clean up the tests. The
modification was only documenting what Git happened to do for that case
now, and not meant as an endorsement of the behavior. ;) So I have no
problem changing it.

> The second test being affected is: 'non-existent pager doesnt cause
> crash', introduced in f917f57f40 (pager: fix crash when pager program
> doesn't exist, 2021-11-24).  As its name states, it has the intention of
> checking that we don't introduce a regression that produces a crash when
> GIT_PAGER points to a nonexistent program.
> 
> This test could be considered redundant nowadays, due to us already
> having several tests checking implicitly what a non-existent command in
> GIT_PAGER produces.  However, let's maintain a good belt-and-suspenders
> strategy; adapt it to the new world.

OK. I would also be happy to see it go. The crash was about reusing the
pager child_process struct, and no we know that cannot happen. Either we
run the pager or we immediately bail. I think that the code change in
that commit could also be reverted (to always re-init the child
process), but it's probably more defensive to keep it.

-Peff

Jeff King June 21, 2024, 6:51 a.m. UTC | #7

On Thu, Jun 20, 2024 at 03:35:46PM -0700, Junio C Hamano wrote:

> Johannes Sixt <j6t@kdbg.org> writes:
> 
> > Am 20.06.24 um 21:04 schrieb Junio C Hamano:
> >> Just in case there is a reason why we should instead silently return
> >> on MinGW, I'll Cc the author of bfdd9ffd, though.
> >
> > I don't think there is a reason. IIRC, originally on Windows, failing to
> > start a pager would still let Git operate normally, just without paged
> > output. I might have regarded this as better than to fail the operation.
> 
> The "better keep going than to fail" is what Rubén finds worse, so
> both sides are quite understandable.
> 
> It is unlikely that real-world users are taking advantage of the
> fact.  If they do not want their invocation of Git command paged,
> "GIT_PAGER=cat git foo" is just as easy as "GIT_PAGER=no git foo",
> and if it was done by mistake to configure a non-working pager
> (e.g., configure core.pager to the program xyzzy and then
> uninstalling xyzzy without realizing you still have users), fixing
> it would be a one-time operation either way (you update core.pager
> or you reinstall xyzzy), so I would say that it is better to make
> the failure more stand out.

The compelling thing to me is that just about every other failure mode
of the pager will result in a SIGPIPE, so the "be nice with a
non-working pager" trick really only applies to the very narrow case of
execve() failing.

I did assume that a bogus option like:

  # oops, there is no -l option!
  GIT_PAGER='less -l' git log

would be a plausible such misconfiguration, but to my surprise "less"
prints "hey, there is no -l option" and then pages anyway. How helpful. :)

But something like:

  # oops, there is no -X option!
  GIT_PAGER='cat -X' git log

yields just:

  cat: invalid option -- 'X'
  Try 'cat --help' for more information.

with no other output. It's a little confusing if you don't realize that
"cat" is the pager. We obviously don't want to complain about SIGPIPE,
because it's common for the user to simply exit the pager without
reading all of the possible data. It might be nice if we printed some
message when the pager exits non-zero, but I'd worry there might be
false positives, depending on the behavior of various pagers.

-Peff

Phillip Wood June 21, 2024, 11:28 a.m. UTC | #8

On 20/06/2024 20:04, Junio C Hamano wrote:
> Rubén Justo <rjusto@gmail.com> writes:
>> --- a/pager.c
>> +++ b/pager.c
>> @@ -137,7 +137,7 @@ void setup_pager(void)
>>   	pager_process.in = -1;
>>   	strvec_push(&pager_process.env, "GIT_PAGER_IN_USE");
>>   	if (start_command(&pager_process))
>> -		return;
>> +		die("unable to start the pager: '%s'", pager);
> 
> If this error string is not used elsewhere, it probably is a good
> idea to "revert" to the original error message lost by ea27a18c,
> which was:
> 
> 		die("unable to execute pager '%s'", pager);

Either way I think we want to mark the message for translation

Best Wishes

Phillip

Dragan Simic June 21, 2024, 5:11 p.m. UTC | #9

On 2024-06-21 00:35, Junio C Hamano wrote:
> Johannes Sixt <j6t@kdbg.org> writes:
> 
>> Am 20.06.24 um 21:04 schrieb Junio C Hamano:
>>> Just in case there is a reason why we should instead silently return
>>> on MinGW, I'll Cc the author of bfdd9ffd, though.
>> 
>> I don't think there is a reason. IIRC, originally on Windows, failing 
>> to
>> start a pager would still let Git operate normally, just without paged
>> output. I might have regarded this as better than to fail the 
>> operation.
> 
> The "better keep going than to fail" is what Rubén finds worse, so
> both sides are quite understandable.
> 
> It is unlikely that real-world users are taking advantage of the
> fact.  If they do not want their invocation of Git command paged,
> "GIT_PAGER=cat git foo" is just as easy as "GIT_PAGER=no git foo",
> and if it was done by mistake to configure a non-working pager
> (e.g., configure core.pager to the program xyzzy and then
> uninstalling xyzzy without realizing you still have users), fixing
> it would be a one-time operation either way (you update core.pager
> or you reinstall xyzzy), so I would say that it is better to make
> the failure more stand out.

To me, failing when the configured pager cannot be executed is the
way to go.  Basically, if an invalid pager is configured, we're
actually obliged to produce a failure, simply because we have to
follow and apply the configuration strictly.  This also applies
to (partially) invalid configurations.

Rubén Justo June 21, 2024, 9:11 p.m. UTC | #10

On Fri, Jun 21, 2024 at 02:40:20AM -0400, Jeff King wrote:

> > When trying to execute a non-existent program from GIT_PAGER, we display
> > an error.  However, we also send the complete text to the terminal
> > and return a successful exit code.  This can be confusing for the user
> > and the displayed error could easily become obscured by a lengthy
> > text.
> > 
> > For example, here the error message would be very far above after
> > sending 50 MB of text:
> > 
> >     $ GIT_PAGER=non-existent t/test-terminal.perl git log | wc -c
> >     error: cannot run non-existent: No such file or directory
> >     50314363
> > 
> > Let's make the error clear by aborting the process and return an error
> > so that the user can easily correct their mistake.
> > 
> > This will be the result of the change:
> > 
> >     $ GIT_PAGER=non-existent t/test-terminal.perl git log | wc -c
> >     error: cannot run non-existent: No such file or directory
> >     fatal: unable to start the pager: 'non-existent'
> >     0
> 
> OK. My initial reaction was "eh, who care? execve() failing is only one
> error mode, and we might see all kinds of failure modes from a missing
> or broken pager".
> 
> But this:
> 
> > Finally, it's worth noting that we are not changing the behavior if the
> > command specified in GIT_PAGER is a shell command.  In such cases, it
> > is:
> > 
> >     $ GIT_PAGER=:\;non-existent t/test-terminal.perl git log
> >     :;non-existent: 1: non-existent: not found
> >     died of signal 13 at t/test-terminal.perl line 33.
> 
> ...shows what happens in those other cases, and you are making things
> more consistent. So that seems reasonable to me.
> 
> > The behavior change we're introducing in this commit affects two tests
> > in t7006, which is a good sign regarding test coverage and requires us
> > to address it.
> > 
> > The first test is 'git skips paging non-existing command'.  This test
> > comes from f7991f01f2 (t7006: clean up SIGPIPE handling in trace2 tests,
> > 2021-11-21,) where a modification was made to a test that was originally
> > introduced in c24b7f6736 (pager: test for exit code with and without
> > SIGPIPE, 2021-02-02).  That original test was, IMHO, in the same
> > direction we're going in this commit.
> 
> Yeah, the point of f7991f01f2 was just to clean up the tests. The
> modification was only documenting what Git happened to do for that case
> now, and not meant as an endorsement of the behavior. ;) So I have no
> problem changing it.
> 
> > The second test being affected is: 'non-existent pager doesnt cause
> > crash', introduced in f917f57f40 (pager: fix crash when pager program
> > doesn't exist, 2021-11-24).  As its name states, it has the intention of
> > checking that we don't introduce a regression that produces a crash when
> > GIT_PAGER points to a nonexistent program.
> > 
> > This test could be considered redundant nowadays, due to us already
> > having several tests checking implicitly what a non-existent command in
> > GIT_PAGER produces.  However, let's maintain a good belt-and-suspenders
> > strategy; adapt it to the new world.
> 
> OK. I would also be happy to see it go. The crash was about reusing the
> pager child_process struct, and no we know that cannot happen. Either we
> run the pager or we immediately bail. I think that the code change in
> that commit could also be reverted (to always re-init the child
> process), but it's probably more defensive to keep it.

Yeah.  The name is what took most of my attention, I have to admit.  A
test named like "check that it doesn't crash" is defensive. ;)

Let's keep it.

Thanks for your review.

Junio C Hamano June 21, 2024, 11:21 p.m. UTC | #11

Phillip Wood <phillip.wood123@gmail.com> writes:

> On 20/06/2024 20:04, Junio C Hamano wrote:
>> Rubén Justo <rjusto@gmail.com> writes:
>>> --- a/pager.c
>>> +++ b/pager.c
>>> @@ -137,7 +137,7 @@ void setup_pager(void)
>>>   	pager_process.in = -1;
>>>   	strvec_push(&pager_process.env, "GIT_PAGER_IN_USE");
>>>   	if (start_command(&pager_process))
>>> -		return;
>>> +		die("unable to start the pager: '%s'", pager);
>> If this error string is not used elsewhere, it probably is a good
>> idea to "revert" to the original error message lost by ea27a18c,
>> which was:
>> 		die("unable to execute pager '%s'", pager);
>
> Either way I think we want to mark the message for translation

Given that none of the die() message in this file is marked for
localization, I would strongly prefer to see this patch not to do
so.  Possibly as part of a larger clean-up patch series, but not as
"while at it" item for this fix.

Thanks.

Johannes Schindelin June 24, 2024, 7:35 a.m. UTC | #12

Hi Hannes,

On Fri, 21 Jun 2024, Johannes Sixt wrote:

> Am 20.06.24 um 21:04 schrieb Junio C Hamano:
> > Just in case there is a reason why we should instead silently return
> > on MinGW, I'll Cc the author of bfdd9ffd, though.
>
> I don't think there is a reason. IIRC, originally on Windows, failing to
> start a pager would still let Git operate normally, just without paged
> output. I might have regarded this as better than to fail the operation.

I recall regarding this a much better idea back then, too, because it
was quite finicky to convince the MinGW variant of Git to play nice with
the MSys variant of the pager.

In the meantime, things have become a lot more robust and consider it a
net improvement to the change the behavior to _not_ silently continue if
the pager failed to start.

Ciao,
Johannes

pager: die when paging to non-existing command

Commit Message

Comments

Patch