[v3,01/10] built-in add -i/-p: treat SIGPIPE as EOF

Message ID	5e258a8d2bb271433902b2e44c3a30a988bbf512.1578904171.git.gitgitgadget@gmail.com (mailing list archive)

Johannes Schindelin via GitGitGadget Jan. 13, 2020, 8:29 a.m. UTC

From: Johannes Schindelin <johannes.schindelin@gmx.de>

As noticed by Gábor Szeder, if we want to run `git add -p` with
redirected input through `test_must_fail` in the test suite, we must
expect that a SIGPIPE can happen due to `stdin` coming to its end.

The appropriate action here is to ignore that signal and treat it as a
regular end-of-file, otherwise the test will fail. In preparation for
such a test, introduce precisely this handling of SIGPIPE into the
built-in version of `git add -p`.

For good measure, teach the built-in `git add -i` the same trick: it
_also_ runs a loop waiting for input, and can receive a SIGPIPE just the
same (and wants to treat it as end-of-file, too).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 add-interactive.c | 3 +++
 add-patch.c       | 4 ++++
 2 files changed, 7 insertions(+)

SZEDER Gábor Jan. 13, 2020, 5:04 p.m. UTC | #1

On Mon, Jan 13, 2020 at 08:29:22AM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> As noticed by Gábor Szeder, if we want to run `git add -p` with
> redirected input through `test_must_fail` in the test suite, we must
> expect that a SIGPIPE can happen due to `stdin` coming to its end.

I don't think this issue is related to the redirected input: I
modified that flaky test to send "unlimited" data to 'git add's stdin,
i.e.:

  /usr/bin/yes | test_must_fail force_color git add -p

and the test with --stress still failed with SIGPIPE all the same and
just as fast.

After looking into it, the issue seems to be sending data to the
broken diffFilter process.  So in that test the diff is "filtered"
through 'echo too-short', which exits real fast, and doesn't read its
standard input at all (well, apart from e.g. the usual kernel
buffering that might happen on a pipe between the two processes).
Making sure that the diffFilter process reads all the data before
exiting, i.e. changing it to:

  test_config interactive.diffFilter "cat >/dev/null ; echo too-short" &&

made the test reliable, with over 2000 --stress repetitions, and that
with only a single "y" on 'git add's stdin.

Now, merely tweaking the test is clearly insufficient, because we not
only want the test to be reliable, but we want 'git add' to die
gracefully when users out there mess up their configuration.

Ignoring SIGPIPE can surely accomplish that, but I'm not sure about
the scope.  I mean your patch seems to ignore SIGPIPE basically for
almost the whole 'git add -(i|p)' process, but perhaps it should be
limited only to the surroundings of the pipe_command() call running
the diffFilter, and be done as part of the next patch adding the 'if
(diff_filter)' block.

Furthermore, I'm worried that by simply ignoring SIGPIPE we might just
ignore a more fundamental issue in pipe_command(): shouldn't that
function be smart enough not to write() to a fd that has no one on the
other side to read it in the first place?!

So, when the diffFilter process exits unexpectedly early, then the
poll() call in pipe_command() -> pump_io() -> pump_io_round() returns
with success and usually sets 'revents' for the child process' stdin
to 12 (i.e. 'POLLOUT | POLLERR'; gah, how I hate unnamed constants :).
Unfortunately, at that point we don't take any special action on
POLLERR, but call xwrite() to try to write to the dead fd anyway,
which then promptly triggers SIGPIPE.  (This is what usually happens
when stepping through the statements of those functions in a debugger,
and the diffFilter process has all the time in the world to exit.)
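
The 'revents' value can be observed outside of Git with a small
stand-alone sketch.  revents_after_reader_closed() is a made-up name
for illustration; it polls the write end of a pipe whose read end is
already closed.  On Linux the result is expected to include POLLERR
next to POLLOUT; other systems may report this condition differently.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Poll the write end of a fresh pipe after its read end has been
 * closed, and return the revents that poll() reports (-1 on failure).
 */
static int revents_after_reader_closed(void)
{
	int fds[2];
	struct pollfd pfd;
	int ret;

	if (pipe(fds) < 0)
		return -1;
	close(fds[0]);		/* nobody is left to read */

	pfd.fd = fds[1];
	pfd.events = POLLOUT;
	pfd.revents = 0;
	ret = poll(&pfd, 1, 0);	/* timeout 0: just report the current state */
	close(fds[1]);

	return ret < 0 ? -1 : pfd.revents;
}
```

Testing the result against the named constants (e.g. 'r & POLLERR')
avoids having to decode values like 12 by hand.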

We could handle POLLERR with a patch like this:

  --- >8 ---

Subject: run-command: handle POLLERR in pump_io_round() to reduce risk of SIGPIPE

diff --git a/run-command.c b/run-command.c
index 3449db319b..57093f0acc 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1416,25 +1416,31 @@ static int pump_io_round(struct io_pump *slots, int nr, struct pollfd *pfd)
 	if (poll(pfd, pollsize, -1) < 0) {
 		if (errno == EINTR)
 			return 1;
 		die_errno("poll failed");
 	}
 
 	for (i = 0; i < nr; i++) {
 		struct io_pump *io = &slots[i];
 
 		if (io->fd < 0)
 			continue;
 
-		if (!(io->pfd->revents & (POLLOUT|POLLIN|POLLHUP|POLLERR|POLLNVAL)))
+		if (io->pfd->revents & POLLERR) {
+			io->error = ECONNRESET;  /* What should we report to the caller? */
+			close(io->fd);
+			io->fd = -1;
+			continue;
+		}
+		if (!(io->pfd->revents & (POLLOUT|POLLIN|POLLHUP|POLLNVAL)))
 			continue;
 
 		if (io->type == POLLOUT) {
 			ssize_t len = xwrite(io->fd,
 					     io->u.out.buf, io->u.out.len);
 			if (len < 0) {
 				io->error = errno;
 				close(io->fd);
 				io->fd = -1;
 			} else {
 				io->u.out.buf += len;
 				io->u.out.len -= len;

  --- >8 ---

Unfortunately #1, this changes the error 'git add -p' dies with from:

  error: mismatched output from interactive.diffFilter

to:

  error: failed to run 'echo too-short'

It might affect other commands as well, but FWIW the test suite
doesn't catch any.


Unfortunately #2, the above patch doesn't completely eliminate the
SIGPIPE, but only (greatly) reduces its probability.  It is possible
that:

  - poll() returns with success and indicating a writable fd without
    any error, i.e. 'revents = 4'.

  - the bogus diffFilter exits, closing its stdin.

  - 'git add' attempts to xwrite() to the now closed fd, and triggers
    a SIGPIPE right away.
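
The sequence above can be replayed deterministically in a single
process.  The sketch below is only an illustration with a made-up
function name: it polls while the reader still exists, closes the read
end to stand in for the exiting diffFilter, and then writes.  SIGPIPE
is ignored here so that the failure is visible as an errno value
instead of killing the process.

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <unistd.h>

/*
 * Replay the race: poll() reports the fd as writable, the reader goes
 * away, and only then do we write.  Stores the revents seen by poll()
 * in *revents.  Returns the errno of the failed write, 0 if the write
 * unexpectedly succeeded, or -1 if the setup itself failed.
 * Note: leaves SIGPIPE ignored for the rest of the process.
 */
static int write_after_late_close(short *revents)
{
	int fds[2];
	struct pollfd pfd;
	int err = 0;

	signal(SIGPIPE, SIG_IGN);
	if (pipe(fds) < 0)
		return -1;

	pfd.fd = fds[1];
	pfd.events = POLLOUT;
	pfd.revents = 0;
	if (poll(&pfd, 1, 0) < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	*revents = pfd.revents;	/* writable, no error reported yet */

	close(fds[0]);		/* the "filter" exits at this point */

	if (write(fds[1], "x", 1) < 0)
		err = errno;
	close(fds[1]);
	return err;
}
```

No amount of checking 'revents' before the write can close this
window, because the reader may disappear between poll() and write().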

This happens much more rarely, 'GIT_TEST_ADD_I_USE_BUILTIN=1
./t3701-add-interactive.sh -r 39,49 --stress-jobs=<4*nr-of-cores>
--stress' tends to take over 200 repetitions.  The patch below
reproduces it fairly reliably by adding two strategically-placed
sleep()s, with a bit of extra debug output:

  --- >8 ---

diff --git a/add-patch.c b/add-patch.c
index d8dafa8168..0fd017bbd3 100644
--- a/add-patch.c
+++ b/add-patch.c
@@ -421,6 +421,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
 			filter_cp.git_cmd = 0;
 			filter_cp.use_shell = 1;
 			strbuf_reset(&s->buf);
+			fprintf(stderr, "about to run diffFilter\n");
 			if (pipe_command(&filter_cp,
 					 colored->buf, colored->len,
 					 &s->buf, colored->len,
diff --git a/run-command.c b/run-command.c
index 57093f0acc..49ae88a922 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1419,6 +1419,7 @@ static int pump_io_round(struct io_pump *slots, int nr, struct pollfd *pfd)
 		die_errno("poll failed");
 	}
 
+	sleep(2);
 	for (i = 0; i < nr; i++) {
 		struct io_pump *io = &slots[i];
 
@@ -1435,8 +1436,11 @@ static int pump_io_round(struct io_pump *slots, int nr, struct pollfd *pfd)
 			continue;
 
 		if (io->type == POLLOUT) {
-			ssize_t len = xwrite(io->fd,
+			ssize_t len;
+			fprintf(stderr, "attempting to xwrite() %lu bytes to a fd with revents flags 0x%hx\n", io->u.out.len, io->pfd->revents);
+			len = xwrite(io->fd,
 					     io->u.out.buf, io->u.out.len);
+			fprintf(stderr, "after xwrite()\n");
 			if (len < 0) {
 				io->error = errno;
 				close(io->fd);
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 12ee321707..acffc9af37 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -561,7 +561,7 @@ test_expect_success 'detect bogus diffFilter output' '
 	git reset --hard &&
 
 	echo content >test &&
-	test_config interactive.diffFilter "echo too-short" &&
+	test_config interactive.diffFilter "sleep 1 ; echo too-short" &&
 	printf y >y &&
 	test_must_fail force_color git add -p <y
 '

  --- >8 ---

and 'GIT_TEST_ADD_I_USE_BUILTIN=1 ./t3701-add-interactive.sh -r 39,49'
fails with:

  + test_must_fail force_color git add -p
  about to run diffFilter
  attempting to xwrite() 224 bytes to a fd with revents flags 0x4
  test_must_fail: died by signal 13: force_color git add -p

I don't understand why we get SIGPIPE right away instead of some error
that we can act upon (ECONNRESET?).  FWIW, it fails the same way not
only on my box, but on Travis CI's Linux and OSX images as well.

  https://travis-ci.org/szeder/git/jobs/636446843#L2937


Cc'ing Peff for all things SIGPIPE :) who also happens to be the
author of both pipe_command() and that now flaky test.


> The appropriate action here is to ignore that signal and treat it as a
> regular end-of-file, otherwise the test will fail. In preparation for
> such a test, introduce precisely this handling of SIGPIPE into the
> built-in version of `git add -p`.
> 
> For good measure, teach the built-in `git add -i` the same trick: it
> _also_ runs a loop waiting for input, and can receive a SIGPIPE just the
> same (and wants to treat it as end-of-file, too).
> 
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  add-interactive.c | 3 +++
>  add-patch.c       | 4 ++++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/add-interactive.c b/add-interactive.c
> index a5bb14f2f4..3ff8400ea4 100644
> --- a/add-interactive.c
> +++ b/add-interactive.c
> @@ -9,6 +9,7 @@
>  #include "lockfile.h"
>  #include "dir.h"
>  #include "run-command.h"
> +#include "sigchain.h"
>  
>  static void init_color(struct repository *r, struct add_i_state *s,
>  		       const char *slot_name, char *dst,
> @@ -1097,6 +1098,7 @@ int run_add_i(struct repository *r, const struct pathspec *ps)
>  			->util = util;
>  	}
>  
> +	sigchain_push(SIGPIPE, SIG_IGN);
>  	init_add_i_state(&s, r);
>  
>  	/*
> @@ -1149,6 +1151,7 @@ int run_add_i(struct repository *r, const struct pathspec *ps)
>  	strbuf_release(&print_file_item_data.worktree);
>  	strbuf_release(&header);
>  	prefix_item_list_clear(&commands);
> +	sigchain_pop(SIGPIPE);
>  
>  	return res;
>  }
> diff --git a/add-patch.c b/add-patch.c
> index 46c6c183d5..9a3beed72e 100644
> --- a/add-patch.c
> +++ b/add-patch.c
> @@ -6,6 +6,7 @@
>  #include "pathspec.h"
>  #include "color.h"
>  #include "diff.h"
> +#include "sigchain.h"
>  
>  enum prompt_mode_type {
>  	PROMPT_MODE_CHANGE = 0, PROMPT_DELETION, PROMPT_HUNK,
> @@ -1578,6 +1579,7 @@ int run_add_p(struct repository *r, enum add_p_mode mode,
>  	};
>  	size_t i, binary_count = 0;
>  
> +	sigchain_push(SIGPIPE, SIG_IGN);
>  	init_add_i_state(&s.s, r);
>  
>  	if (mode == ADD_P_STASH)
> @@ -1612,6 +1614,7 @@ int run_add_p(struct repository *r, enum add_p_mode mode,
>  	    parse_diff(&s, ps) < 0) {
>  		strbuf_release(&s.plain);
>  		strbuf_release(&s.colored);
> +		sigchain_pop(SIGPIPE);
>  		return -1;
>  	}
>  
> @@ -1630,5 +1633,6 @@ int run_add_p(struct repository *r, enum add_p_mode mode,
>  	strbuf_release(&s.buf);
>  	strbuf_release(&s.plain);
>  	strbuf_release(&s.colored);
> +	sigchain_pop(SIGPIPE);
>  	return 0;
>  }
> -- 
> gitgitgadget
>

Jeff King Jan. 13, 2020, 6:33 p.m. UTC | #2

On Mon, Jan 13, 2020 at 06:04:17PM +0100, SZEDER Gábor wrote:

> After looking into it, the issue seems to be sending data to the
> broken diffFilter process.  So in that test the diff is "filtered"
> through 'echo too-short', which exits real fast, and doesn't read its
> standard input at all (well, apart from e.g. the usual kernel
> buffering that might happen on a pipe between the two processes).
> Making sure that the diffFilter process reads all the data before
> exiting, i.e. changing it to:
> 
>   test_config interactive.diffFilter "cat >/dev/null ; echo too-short" &&
> 
> made the test reliable, with over 2000 --stress repetitions, and that
> with only a single "y" on 'git add's stdin.

Yeah, I agree the test should be changed. What you wrote above was my
first thought, too, but I think "sed 1d" is actually a more realistic
test (and is shorter and one fewer process).

> Now, merely tweaking the test is clearly insufficient, because we not
> only want the test to be reliable, but we want 'git add' to die
> gracefully when users out there mess up their configuration.

I also agree that it would be nice to deal with this for real-world
cases. I suspect it's not something that would come up a lot, though.

> Ignoring SIGPIPE can surely accomplish that, but I'm not sure about
> the scope.  I mean your patch seems to ignore SIGPIPE basically for
> almost the whole 'git add -(i|p)' process, but perhaps it should be
> limited only to the surroundings of the pipe_command() call running
> the diffFilter, and be done as part of the next patch adding the 'if
> (diff_filter)' block.

The scope there is probably OK in practice. In my opinion SIGPIPE is
usually _not_ the behavior we want. If we're carefully checking our
write() return values, then we'd get EPIPE in such an instance and
behave appropriately. And if we're not checking our write() return
values, that's generally a bug that ought to be fixed.

The big exception is when we are writing copious output to stdout (or
the pager) via printf() or similar, and want to die rather than continue
writing output nobody will see. But I don't think git-add really counts
as generating a lot of output, where SIGPIPE could prevent us from doing
useless work (unlike, say, git-log).

> Furthermore, I'm worried that by simply ignoring SIGPIPE we might just
> ignore a more fundamental issue in pipe_command(): shouldn't that
> function be smart enough not to write() to a fd that has no one on the
> other side to read it in the first place?!

Maybe. As you noted below, checking for POLLERR is racy. Seeing that we
"can" write to an fd and doing it to discover what write() returns
(whether error or not) doesn't seem like the worst strategy. If the
caller cares about pipe death, then it needs to be handling SIGPIPE
anyway.

I really wish there was a way to set a handler for SIGPIPE that tells
_which_ descriptor caused it. Because I think logic like "die if it was
fd 1, ignore and let write() return EPIPE otherwise" is the behavior
we'd like. But I don't think there's a portable way to do so.

I've been tempted to say that we should just ignore SIGPIPE everywhere,
and convert even copious-output programs like git-log to just check for
errors (they could probably even just check ferror(stdout) for each
commit we output, if we didn't want to touch every printf call).

-Peff

Johannes Schindelin Jan. 14, 2020, 12:47 p.m. UTC | #3

Hi Gábor,

On Mon, 13 Jan 2020, SZEDER Gábor wrote:

> On Mon, Jan 13, 2020 at 08:29:22AM +0000, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > As noticed by Gábor Szeder, if we want to run `git add -p` with
> > redirected input through `test_must_fail` in the test suite, we must
> > expect that a SIGPIPE can happen due to `stdin` coming to its end.
>
> I don't think this issue is related to the redirected input: I
> modified that flaky test to send "unlimited" data to 'git add's stdin,
> i.e.:
>
>   /usr/bin/yes | test_must_fail force_color git add -p
>
> and the test with --stress still failed with SIGPIPE all the same and
> just as fast.
>
> After looking into it, the issue seems to be sending data to the
> broken diffFilter process.

Ouch. Thank you for investigating. For my education, how did you debug
this? I could not find a way to identify *what* caused that SIGPIPE...

> So in that test the diff is "filtered" through 'echo too-short', which
> exits real fast, and doesn't read its standard input at all (well, apart
> from e.g. the usual kernel buffering that might happen on a pipe between
> the two processes). Making sure that the diffFilter process reads all
> the data before exiting, i.e. changing it to:
>
>   test_config interactive.diffFilter "cat >/dev/null ; echo too-short" &&
>
> made the test reliable, with over 2000 --stress repetitions, and that
> with only a single "y" on 'git add's stdin.

Ah, my diff filter simply ignores the `stdin`... That's easy enough to
fix, and since real-world diff filters probably won't just blatantly
ignore the input, I think it is legitimate to change the test.

> Now, merely tweaking the test is clearly insufficient, because we not
> only want the test to be reliable, but we want 'git add' to die
> gracefully when users out there mess up their configuration.

I think it is sufficient to tweak the test, but I agree that a better
error message might be good when users out there mess up their
configuration.

> Ignoring SIGPIPE can surely accomplish that, but I'm not sure about
> the scope.  I mean your patch seems to ignore SIGPIPE basically for
> almost the whole 'git add -(i|p)' process, but perhaps it should be
> limited only to the surroundings of the pipe_command() call running
> the diffFilter, and be done as part of the next patch adding the 'if
> (diff_filter)' block.

Right. Very heavy-handed, and probably inviting unwanted side effects.

> Furthermore, I'm worried that by simply ignoring SIGPIPE we might just
> ignore a more fundamental issue in pipe_command(): shouldn't that
> function be smart enough not to write() to a fd that has no one on the
> other side to read it in the first place?!
>
> So, when the diffFilter process exits unexpectedly early, then the
> poll() call in pipe_command() -> pump_io() -> pump_io_round() returns
> with success and usually sets 'revents' for the child process' stdin
> to 12 (i.e. 'POLLOUT | POLLERR'; gah, how I hate unnamed constants :).
> Unfortunately, at that point we don't take any special action on
> POLLERR, but call xwrite() to try to write to the dead fd anyway,
> which then promptly triggers SIGPIPE.  (This is what usually happens
> when stepping through the statements of those functions in a debugger,
> and the diffFilter process has all the time in the world to exit.)
>
> We could handle POLLERR with a patch like this:
>
>   --- >8 ---
>
> Subject: run-command: handle POLLERR in pump_io_round() to reduce risk of SIGPIPE
>
> diff --git a/run-command.c b/run-command.c
> index 3449db319b..57093f0acc 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -1416,25 +1416,31 @@ static int pump_io_round(struct io_pump *slots, int nr, struct pollfd *pfd)
>  	if (poll(pfd, pollsize, -1) < 0) {
>  		if (errno == EINTR)
>  			return 1;
>  		die_errno("poll failed");
>  	}
>
>  	for (i = 0; i < nr; i++) {
>  		struct io_pump *io = &slots[i];
>
>  		if (io->fd < 0)
>  			continue;
>
> -		if (!(io->pfd->revents & (POLLOUT|POLLIN|POLLHUP|POLLERR|POLLNVAL)))
> +		if (io->pfd->revents & POLLERR) {
> +			io->error = ECONNRESET;  /* What should we report to the caller? */
> +			close(io->fd);
> +			io->fd = -1;
> +			continue;
> +		}
> +		if (!(io->pfd->revents & (POLLOUT|POLLIN|POLLHUP|POLLNVAL)))
>  			continue;
>
>  		if (io->type == POLLOUT) {
>  			ssize_t len = xwrite(io->fd,
>  					     io->u.out.buf, io->u.out.len);
>  			if (len < 0) {
>  				io->error = errno;
>  				close(io->fd);
>  				io->fd = -1;
>  			} else {
>  				io->u.out.buf += len;
>  				io->u.out.len -= len;
>
>   --- >8 ---
>
> Unfortunately #1, this changes the error 'git add -p' dies with from:
>
>   error: mismatched output from interactive.diffFilter
>
> to:
>
>   error: failed to run 'echo too-short'
>
> It might affect other commands as well, but FWIW the test suite
> doesn't catch any.

Hmm. My first impression is that the error message could be a bit better,
but that it is probably a good thing to have. It would have helped _me_
understand the issue at hand.

> Unfortunately #2, the above patch doesn't completely eliminate the
> SIGPIPE, but only (greatly) reduces its probability.  It is possible
> that:
>
>   - poll() returns with success and indicating a writable fd without
>     any error, i.e. 'revents = 4'.
>
>   - the bogus diffFilter exits, closing its stdin.
>
>   - 'git add' attempts to xwrite() to the now closed fd, and triggers
>     a SIGPIPE right away.
>
> This happens much more rarely, 'GIT_TEST_ADD_I_USE_BUILTIN=1
> ./t3701-add-interactive.sh -r 39,49 --stress-jobs=<4*nr-of-cores>
> --stress' tends to take over 200 repetitions.  The patch below
> reproduces it fairly reliably by adding two strategically-placed
> sleep()s, with a bit of extra debug output:
>
>   --- >8 ---
>
> diff --git a/add-patch.c b/add-patch.c
> index d8dafa8168..0fd017bbd3 100644
> --- a/add-patch.c
> +++ b/add-patch.c
> @@ -421,6 +421,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps)
>  			filter_cp.git_cmd = 0;
>  			filter_cp.use_shell = 1;
>  			strbuf_reset(&s->buf);
> +			fprintf(stderr, "about to run diffFilter\n");
>  			if (pipe_command(&filter_cp,
>  					 colored->buf, colored->len,
>  					 &s->buf, colored->len,
> diff --git a/run-command.c b/run-command.c
> index 57093f0acc..49ae88a922 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -1419,6 +1419,7 @@ static int pump_io_round(struct io_pump *slots, int nr, struct pollfd *pfd)
>  		die_errno("poll failed");
>  	}
>
> +	sleep(2);
>  	for (i = 0; i < nr; i++) {
>  		struct io_pump *io = &slots[i];
>
> @@ -1435,8 +1436,11 @@ static int pump_io_round(struct io_pump *slots, int nr, struct pollfd *pfd)
>  			continue;
>
>  		if (io->type == POLLOUT) {
> -			ssize_t len = xwrite(io->fd,
> +			ssize_t len;
> +			fprintf(stderr, "attempting to xwrite() %lu bytes to a fd with revents flags 0x%hx\n", io->u.out.len, io->pfd->revents);
> +			len = xwrite(io->fd,
>  					     io->u.out.buf, io->u.out.len);
> +			fprintf(stderr, "after xwrite()\n");
>  			if (len < 0) {
>  				io->error = errno;
>  				close(io->fd);
> diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
> index 12ee321707..acffc9af37 100755
> --- a/t/t3701-add-interactive.sh
> +++ b/t/t3701-add-interactive.sh
> @@ -561,7 +561,7 @@ test_expect_success 'detect bogus diffFilter output' '
>  	git reset --hard &&
>
>  	echo content >test &&
> -	test_config interactive.diffFilter "echo too-short" &&
> +	test_config interactive.diffFilter "sleep 1 ; echo too-short" &&
>  	printf y >y &&
>  	test_must_fail force_color git add -p <y
>  '
>
>   --- >8 ---
>
> and 'GIT_TEST_ADD_I_USE_BUILTIN=1 ./t3701-add-interactive.sh -r 39,49'
> fails with:
>
>   + test_must_fail force_color git add -p
>   about to run diffFilter
>   attempting to xwrite() 224 bytes to a fd with revents flags 0x4
>   test_must_fail: died by signal 13: force_color git add -p
>
> I don't understand why we get SIGPIPE right away instead of some error
> that we can act upon (ECONNRESET?).

Isn't it buffered?

In any case, I would take the above-mentioned patch, even if it makes it
"only" less likely to hit `SIGPIPE`.

> FWIW, it fails the same way not only on my box, but on Travis CI's Linux
> and OSX images as well.
>
>   https://travis-ci.org/szeder/git/jobs/636446843#L2937
>
>
> Cc'ing Peff for all things SIGPIPE :) who also happens to be the
> author of both pipe_command() and that now flaky test.

I'll go with Peff's suggestion to use `sed 1d` instead of `echo
too-short`.

Thanks,
Dscho

Junio C Hamano Jan. 15, 2020, 6:32 p.m. UTC | #4

Jeff King <peff@peff.net> writes:

> On Mon, Jan 13, 2020 at 06:04:17PM +0100, SZEDER Gábor wrote:
>
>> After looking into it, the issue seems to be sending data to the
>> broken diffFilter process.  So in that test the diff is "filtered"
>> through 'echo too-short', which exits real fast, and doesn't read its
>> standard input at all (well, apart from e.g. the usual kernel
>> buffering that might happen on a pipe between the two processes).
>> Making sure that the diffFilter process reads all the data before
>> exiting, i.e. changing it to:
>> 
>>   test_config interactive.diffFilter "cat >/dev/null ; echo too-short" &&
>> 
>> made the test reliable, with over 2000 --stress repetitions, and that
>> with only a single "y" on 'git add's stdin.
>
> Yeah, I agree the test should be changed. What you wrote above was my
> first thought, too, but I think "sed 1d" is actually a more realistic
> test (and is shorter and one fewer process).

I am not sure what we are aiming for.  Are we making sure the
command behaves well in the hands of end users, who may write a
script that consumes only early parts of the input that is needed
for its use and stops reading, or are we just aiming to claim "all
our tests pass"?  I was hoping that we would be doing the former,
and I would understand if the suggestion were "sed 1q" for that
exact reason.

IOW, shouldn't we be fixing the part that drives the external
process, so that the test "passes" even with such a "broken" filter?

>> Now, merely tweaking the test is clearly insufficient, because we not
>> only want the test to be reliable, but we want 'git add' to die
>> gracefully when users out there mess up their configuration.

Yes, and I was hoping that we do not have to touch the test if we
did the latter.

> I really wish there was a way to set a handler for SIGPIPE that tells
> _which_ descriptor caused it. Because I think logic like "die if it was
> fd 1, ignore and let write() return EPIPE otherwise" is the behavior
> we'd like. But I don't think there's a portable way to do so.
>
> I've been tempted to say that we should just ignore SIGPIPE everywhere,
> and convert even copious-output programs like git-log to just check for
> errors (they could probably even just check ferror(stdout) for each
> commit we output, if we didn't want to touch every printf call).

Yeah, I share that temptation.

Jeff King Jan. 15, 2020, 7:03 p.m. UTC | #5

On Wed, Jan 15, 2020 at 10:32:59AM -0800, Junio C Hamano wrote:

> >>   test_config interactive.diffFilter "cat >/dev/null ; echo too-short" &&
> >> 
> >> made the test reliable, with over 2000 --stress repetitions, and that
> >> with only a single "y" on 'git add's stdin.
> >
> > Yeah, I agree the test should be changed. What you wrote above was my
> > first thought, too, but I think "sed 1d" is actually a more realistic
> > test (and is shorter and one fewer process).
> 
> I am not sure what we are aiming for.  Are we making sure the
> command behaves well in the hands of end users, who may write a
> script that consumes only early parts of the input that is needed
> for its use and stops reading, or are we just aiming to claim "all
> our tests pass"?  I was hoping that we would be doing the former,
> and I would understand if the suggestion were "sed 1q" for that
> exact reason.
> 
> IOW, shouldn't we be fixing the part that drives the external
> process, so that the test "passes" even with such a "broken" filter?

The original motivation for this test (and the code that fixes it) was
diff-so-fancy, which read all of the input but didn't have a 1:1 line
correspondence in the output (IIRC it condensed some particular lines,
like rename from/to into a single line).

And I think most sane filters would end up reading all of the content.
Or a misconfiguration would cause them to read nothing at all.

So something like "sed 1d" is more representative of a real filter. If
we want to test SIGPIPE, then the current one that reads _nothing_ is
the most torturous. But "sed 1q" is neither realistic (if that's what
we're going for) nor the hardest thing we can throw at the code (if
that's what we want).

> > I've been tempted to say that we should just ignore SIGPIPE everywhere,
> > and convert even copious-output programs like git-log to just check for
> > errors (they could probably even just check ferror(stdout) for each
> > commit we output, if we didn't want to touch every printf call).
> 
> Yeah, I share that temptation.

Hmm. My recollection was that you were more of a fan of SIGPIPE than I
am. But if you agree, then maybe the time has come for action. :)

-Peff

SZEDER Gábor Jan. 17, 2020, 2:32 p.m. UTC | #6

On Mon, Jan 13, 2020 at 06:04:17PM +0100, SZEDER Gábor wrote:
> and 'GIT_TEST_ADD_I_USE_BUILTIN=1 ./t3701-add-interactive.sh -r 39,49'
> fails with:
> 
>   + test_must_fail force_color git add -p
>   about to run diffFilter
>   attempting to xwrite() 224 bytes to a fd with revents flags 0x4
>   test_must_fail: died by signal 13: force_color git add -p
> 
> I don't understand why we get SIGPIPE right away instead of some error
> that we can act upon (ECONNRESET?).

Doh', because it's a pipe, not a socket, that's why.  pipe(7):

  "If all file descriptors referring to the read end of a pipe have
   been closed, then a write(2) will cause a SIGPIPE signal to be
   generated for the calling process."

So ECONNRESET is definitely not the right error to set on POLLERR,
though I'm still not sure what the right one would be (perhaps
EPIPE?).

Jeff King Jan. 17, 2020, 6:58 p.m. UTC | #7

On Fri, Jan 17, 2020 at 03:32:36PM +0100, SZEDER Gábor wrote:

> On Mon, Jan 13, 2020 at 06:04:17PM +0100, SZEDER Gábor wrote:
> > and 'GIT_TEST_ADD_I_USE_BUILTIN=1 ./t3701-add-interactive.sh -r 39,49'
> > fails with:
> > 
> >   + test_must_fail force_color git add -p
> >   about to run diffFilter
> >   attempting to xwrite() 224 bytes to a fd with revents flags 0x4
> >   test_must_fail: died by signal 13: force_color git add -p
> > 
> > I don't understand why we get SIGPIPE right away instead of some error
> > that we can act upon (ECONNRESET?).
> 
> Doh', because it's a pipe, not a socket, that's why.  pipe(7):
> 
>   "If all file descriptors referring to the read end of a pipe have
>    been closed, then a write(2) will cause a SIGPIPE signal to be
>    generated for the calling process."
> 
> So ECONNRESET is definitely not the right error to set on POLLERR,
> though I'm still not sure what the right one would be (perhaps
> EPIPE?).

Yes, if SIGPIPE is ignored, then that write() would produce EPIPE. So if
you're trying to emulate it via POLLERR, that would be accurate. Of
course it could fail for _other_ reasons, and I don't think we'd know
what those are without actually calling write(). Practically speaking,
though, if we know it's a pipe with a valid descriptor then any error is
basically equivalent to EPIPE (we don't care how, but for whatever
reason we couldn't write to the other end).

-Peff
