Message ID | YZrOLy03s5ZWMQ+t@coredump.intra.peff.net (mailing list archive) |
---|---|
State | Accepted |
Commit | f7991f01f2d1bc800a47adcf66a1b29a1f7cf697 |
Headers | show |
Series | t7006: clean up SIGPIPE handling in trace2 tests | expand |
On Sun, Nov 21, 2021 at 05:54:39PM -0500, Jeff King wrote: > That's actually pretty easy to do. So I came up with the patch below, in > the interests of wrapping up this thread. > > I'm not 100% sure this fixes any possible races, as the race Junio > initially reported seemed to be in the "propagated signals from pager" > test, which I don't think has these flaky-SIGPIPE problems. But I think > it's at least correcting some of the confusion. And we can see if it > happens again (I haven't been able to trigger any failures with --stress > myself). > > -- >8 -- > Subject: [PATCH] t7006: clean up SIGPIPE handling in trace2 tests This one is optional on top. It's just cleanup, but IMHO makes it easier to understand what's going on. -- >8 -- Subject: [PATCH] t7006: simplify exit-code checks for sigpipe tests Some tests in t7006 check for a SIGPIPE result by recording $? and comparing it with test_match_signal. Before the previous commit, the command was on the left-hand side of a pipe, and so we had to do some subshell trickery to extract it. But now that this is no longer the case, we can do things much more simply: just run the command directly, using braces to avoid wrecking the &&-chain, and then record $?. We could almost use test_expect_code here, but it doesn't know about test_match_signal. Likewise, for tests which expect success (i.e., not SIGPIPE), we can just put them in the &&-chain as usual. That even lets us get rid of the !MINGW check, since the expectation is the same on both sides. Signed-off-by: Jeff King <peff@peff.net> --- t/t7006-pager.sh | 22 +++++----------------- 1 file changed, 5 insertions(+), 17 deletions(-) diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh index 577e51b47a..851961c798 100755 --- a/t/t7006-pager.sh +++ b/t/t7006-pager.sh @@ -677,7 +677,7 @@ test_expect_success TTY 'git returns SIGPIPE on early pager exit' ' if test_have_prereq !MINGW then - OUT=$( ((test_terminal git log; echo $? 1>&3) >/dev/null) 3>&1 ) && + { test_terminal git log >/dev/null; OUT=$?; } && test_match_signal 13 "$OUT" else test_terminal git log @@ -698,7 +698,7 @@ test_expect_success TTY 'git returns SIGPIPE on early pager non-zero exit' ' if test_have_prereq !MINGW then - OUT=$( ((test_terminal git log; echo $? 1>&3) >/dev/null) 3>&1 ) && + { test_terminal git log >/dev/null; OUT=$?; } && test_match_signal 13 "$OUT" else test_terminal git log @@ -717,13 +717,7 @@ test_expect_success TTY 'git discards pager non-zero exit without SIGPIPE' ' export GIT_TRACE2 && test_when_finished "unset GIT_TRACE2" && - if test_have_prereq !MINGW - then - OUT=$( ((test_terminal git log; echo $? 1>&3) >/dev/null) 3>&1 ) && - test "$OUT" -eq 0 - else - test_terminal git log - fi && + test_terminal git log && grep child_exit trace.normal >child-exits && test_line_count = 1 child-exits && @@ -738,13 +732,7 @@ test_expect_success TTY 'git skips paging nonexisting command' ' export GIT_TRACE2 && test_when_finished "unset GIT_TRACE2" && - if test_have_prereq !MINGW - then - OUT=$( ((test_terminal git log; echo $? 1>&3) >/dev/null) 3>&1 ) && - test "$OUT" -eq 0 - else - test_terminal git log - fi && + test_terminal git log && grep child_exit trace.normal >child-exits && test_line_count = 1 child-exits && @@ -760,7 +748,7 @@ test_expect_success TTY 'git returns SIGPIPE on propagated signals from pager' ' if test_have_prereq !MINGW then - OUT=$( ((test_terminal git log; echo $? 1>&3) >/dev/null) 3>&1 ) && + { test_terminal git log >/dev/null; OUT=$?; } && test_match_signal 13 "$OUT" else test_terminal git log
Jeff King <peff@peff.net> writes: > I'm not 100% sure this fixes any possible races, as the race Junio > initially reported seemed to be in the "propagated signals from pager" > test, which I don't think has these flaky-SIGPIPE problems. But I think > it's at least correcting some of the confusion. And we can see if it > happens again (I haven't been able to trigger any failures with --stress > myself). Applying this (or this and the follow-up) seems to make t7006, which used to be flaky, to consistently fail at test "git returns SIGPIPE on propagated signals from pager" for me ;-)
On Sun, Nov 21, 2021 at 06:17:04PM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > I'm not 100% sure this fixes any possible races, as the race Junio > > initially reported seemed to be in the "propagated signals from pager" > > test, which I don't think has these flaky-SIGPIPE problems. But I think > > it's at least correcting some of the confusion. And we can see if it > > happens again (I haven't been able to trigger any failures with --stress > > myself). > > Applying this (or this and the follow-up) seems to make t7006, which > used to be flaky, to consistently fail at test "git returns SIGPIPE > on propagated signals from pager" for me ;-) Well, I guess it's good that we made things more consistent. :) It is curious that you get failures and I don't, though. I wonder what the difference is. One curiosity is that the test does this: test_config core.pager ">pager-used; test-tool sigchain" While "test-tool sigchain" will die with SIGTERM, it's the shell itself which will waitpid() on it. And so in the end, what Git will generally see is the same as if the shell had done "exit 143". I wonder if the difference is between our shells. I know from previous experience that bash will sometimes directly exec the final command in a "-c" command, as an optimization. I don't get any difference running the test with dash or bash, but that makes sense; the pager command is run internally by Git via "sh -c". Aha, that's it. If I recompile with SHELL_PATH=/bin/bash, then I see a failure. Likewise, if I change the test like this: diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh index 851961c798..a87ef37803 100755 --- a/t/t7006-pager.sh +++ b/t/t7006-pager.sh @@ -741,7 +741,7 @@ test_expect_success TTY 'git skips paging nonexisting command' ' test_expect_success TTY 'git returns SIGPIPE on propagated signals from pager' ' test_when_finished "rm pager-used trace.normal" && - test_config core.pager ">pager-used; test-tool sigchain" && + test_config core.pager ">pager-used; exec test-tool sigchain" && GIT_TRACE2="$(pwd)/trace.normal" && export GIT_TRACE2 && test_when_finished "unset GIT_TRACE2" && then it fails even with dash. And that is, I think, closer to what the test was actually trying to cover (since checking a shell's "exit 143" is really no different than "exit 1", and we checked that earlier). So why is it failing? It looks like trace2 reports this as code "-1" rather than 143. I think that is because the fix from be8fc53e36 (pager: properly log pager exit code when signalled, 2021-02-02) is incomplete. It sets WEXITSTATUS() if the pager exited, but it doesn't handle signal death at all. I think it needs: diff --git a/run-command.c b/run-command.c index f40df01c77..ef9d1d4236 100644 --- a/run-command.c +++ b/run-command.c @@ -555,6 +555,8 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal) if (in_signal) { if (WIFEXITED(status)) code = WEXITSTATUS(status); + else if (WIFSIGNALED(status)) + code = 128 + WTERMSIG(status); /* see comment below */ return code; } -Peff
On Sun, Nov 21, 2021 at 11:51:54PM -0500, Jeff King wrote: > So why is it failing? It looks like trace2 reports this as code "-1" > rather than 143. I think that is because the fix from be8fc53e36 (pager: > properly log pager exit code when signalled, 2021-02-02) is incomplete. > It sets WEXITSTATUS() if the pager exited, but it doesn't handle signal > death at all. I think it needs: > > diff --git a/run-command.c b/run-command.c > index f40df01c77..ef9d1d4236 100644 > --- a/run-command.c > +++ b/run-command.c > @@ -555,6 +555,8 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal) > if (in_signal) { > if (WIFEXITED(status)) > code = WEXITSTATUS(status); > + else if (WIFSIGNALED(status)) > + code = 128 + WTERMSIG(status); /* see comment below */ > return code; > } I'm tempted to say that this would be clearer if the in_signal code path just followed the main logic, like this: diff --git a/run-command.c b/run-command.c index f40df01c77..1f58c17b6c 100644 --- a/run-command.c +++ b/run-command.c @@ -552,20 +552,17 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal) while ((waiting = waitpid(pid, &status, 0)) < 0 && errno == EINTR) ; /* nothing */ - if (in_signal) { - if (WIFEXITED(status)) - code = WEXITSTATUS(status); - return code; - } if (waiting < 0) { failed_errno = errno; - error_errno("waitpid for %s failed", argv0); + if (!in_signal) + error_errno("waitpid for %s failed", argv0); } else if (waiting != pid) { - error("waitpid is confused (%s)", argv0); + if (!in_signal) + error("waitpid is confused (%s)", argv0); } else if (WIFSIGNALED(status)) { code = WTERMSIG(status); - if (code != SIGINT && code != SIGQUIT && code != SIGPIPE) + if (!in_signal && code != SIGINT && code != SIGQUIT && code != SIGPIPE) error("%s died of signal %d", argv0, code); /* * This return value is chosen so that code & 0xff @@ -576,10 +573,12 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal) } else if (WIFEXITED(status)) { code = WEXITSTATUS(status); } else { - error("waitpid is confused (%s)", argv0); + if (!in_signal) + error("waitpid is confused (%s)", argv0); } - clear_child_for_cleanup(pid); + if (!in_signal) + clear_child_for_cleanup(pid); errno = failed_errno; return code; That's a lot more tedious "if (!in_signal)" checks, but: - we don't have to duplicate any of the actual application logic - we'd now cover the extra cases for waitpid failing or returning the wrong pid (previously if waitpid() failed we'd still look at status, which could contain complete garbage!) -Peff
Jeff King <peff@peff.net> writes: > I'm tempted to say that this would be clearer if the in_signal code path > just followed the main logic, like this: > > diff --git a/run-command.c b/run-command.c > index f40df01c77..1f58c17b6c 100644 > --- a/run-command.c > +++ b/run-command.c > @@ -552,20 +552,17 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal) > > while ((waiting = waitpid(pid, &status, 0)) < 0 && errno == EINTR) > ; /* nothing */ > - if (in_signal) { > - if (WIFEXITED(status)) > - code = WEXITSTATUS(status); > - return code; > - } > > if (waiting < 0) { > failed_errno = errno; > - error_errno("waitpid for %s failed", argv0); > + if (!in_signal) > + error_errno("waitpid for %s failed", argv0); > } else if (waiting != pid) { > - error("waitpid is confused (%s)", argv0); > + if (!in_signal) > + error("waitpid is confused (%s)", argv0); > } else if (WIFSIGNALED(status)) { > code = WTERMSIG(status); > - if (code != SIGINT && code != SIGQUIT && code != SIGPIPE) > + if (!in_signal && code != SIGINT && code != SIGQUIT && code != SIGPIPE) > error("%s died of signal %d", argv0, code); > /* > * This return value is chosen so that code & 0xff > @@ -576,10 +573,12 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal) > } else if (WIFEXITED(status)) { > code = WEXITSTATUS(status); > } else { > - error("waitpid is confused (%s)", argv0); > + if (!in_signal) > + error("waitpid is confused (%s)", argv0); > } > > - clear_child_for_cleanup(pid); > + if (!in_signal) > + clear_child_for_cleanup(pid); > > errno = failed_errno; > return code; > > That's a lot more tedious "if (!in_signal)" checks, but: > > - we don't have to duplicate any of the actual application logic > > - we'd now cover the extra cases for waitpid failing or returning the > wrong pid (previously if waitpid() failed we'd still look at status, > which could contain complete garbage!) Yeah, the repeated "if (!in_signal)" look a bit ugly, but fixing that "we only deal with ifexited in in_signal case" to do the right thing would make the code even more annoying and harder to maintain.
Junio C Hamano <gitster@pobox.com> writes: > Jeff King <peff@peff.net> writes: > ... >> That's a lot more tedious "if (!in_signal)" checks, but: >> >> - we don't have to duplicate any of the actual application logic >> >> - we'd now cover the extra cases for waitpid failing or returning the >> wrong pid (previously if waitpid() failed we'd still look at status, >> which could contain complete garbage!) > ... > Yeah, the repeated "if (!in_signal)" look a bit ugly, but fixing > that "we only deal with ifexited in in_signal case" to do the right > thing would make the code even more annoying and harder to maintain. Eh, what I meant was that it would be annoying and harder to maintain, if done inside the single "if (in_signal) { ... }" near the beginning. (IOW, I am completely in agreement with your reasoning above).
On Sun, Nov 21, 2021 at 09:49:32PM -0800, Junio C Hamano wrote: > > That's a lot more tedious "if (!in_signal)" checks, but: > > > > - we don't have to duplicate any of the actual application logic > > > > - we'd now cover the extra cases for waitpid failing or returning the > > wrong pid (previously if waitpid() failed we'd still look at status, > > which could contain complete garbage!) > > Yeah, the repeated "if (!in_signal)" look a bit ugly, but fixing > that "we only deal with ifexited in in_signal case" to do the right > thing would make the code even more annoying and harder to maintain. OK. Let me see if I can clean this up into a full series that actually fixes the race you saw, and breaks down the other fixes a bit more. -Peff
diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh index 0e7cf75435..577e51b47a 100755 --- a/t/t7006-pager.sh +++ b/t/t7006-pager.sh @@ -661,6 +661,13 @@ test_expect_success 'setup trace2' ' export GIT_TRACE2_BRIEF ' +test_expect_success 'setup large log output' ' + perl -e " + print \"this is a long commit message\" x 50000 + " >commit-msg && + git commit --allow-empty -F commit-msg +' + test_expect_success TTY 'git returns SIGPIPE on early pager exit' ' test_when_finished "rm pager-used trace.normal" && test_config core.pager ">pager-used; head -n 1; exit 0" && @@ -670,7 +677,7 @@ test_expect_success TTY 'git returns SIGPIPE on early pager exit' ' if test_have_prereq !MINGW then - OUT=$( ((test_terminal git log; echo $? 1>&3) | :) 3>&1 ) && + OUT=$( ((test_terminal git log; echo $? 1>&3) >/dev/null) 3>&1 ) && test_match_signal 13 "$OUT" else test_terminal git log @@ -691,7 +698,7 @@ test_expect_success TTY 'git returns SIGPIPE on early pager non-zero exit' ' if test_have_prereq !MINGW then - OUT=$( ((test_terminal git log; echo $? 1>&3) | :) 3>&1 ) && + OUT=$( ((test_terminal git log; echo $? 1>&3) >/dev/null) 3>&1 ) && test_match_signal 13 "$OUT" else test_terminal git log @@ -712,7 +719,7 @@ test_expect_success TTY 'git discards pager non-zero exit without SIGPIPE' ' if test_have_prereq !MINGW then - OUT=$( ((test_terminal git log; echo $? 1>&3) | :) 3>&1 ) && + OUT=$( ((test_terminal git log; echo $? 1>&3) >/dev/null) 3>&1 ) && test "$OUT" -eq 0 else test_terminal git log @@ -724,28 +731,7 @@ test_expect_success TTY 'git discards pager non-zero exit without SIGPIPE' ' test_path_is_file pager-used ' -test_expect_success TTY 'git discards nonexisting pager without SIGPIPE' ' - test_when_finished "rm pager-used trace.normal" && - test_config core.pager "wc >pager-used; does-not-exist" && - GIT_TRACE2="$(pwd)/trace.normal" && - export GIT_TRACE2 && - test_when_finished "unset GIT_TRACE2" && - - if test_have_prereq !MINGW - then - OUT=$( ((test_terminal git log; echo $? 1>&3) | :) 3>&1 ) && - test "$OUT" -eq 0 - else - test_terminal git log - fi && - - grep child_exit trace.normal >child-exits && - test_line_count = 1 child-exits && - grep " code:127 " child-exits && - test_path_is_file pager-used -' - -test_expect_success TTY 'git attempts to page to nonexisting pager command, gets SIGPIPE' ' +test_expect_success TTY 'git skips paging nonexisting command' ' test_when_finished "rm trace.normal" && test_config core.pager "does-not-exist" && GIT_TRACE2="$(pwd)/trace.normal" && @@ -754,8 +740,8 @@ test_expect_success TTY 'git attempts to page to nonexisting pager command, gets if test_have_prereq !MINGW then - OUT=$( ((test_terminal git log; echo $? 1>&3) | :) 3>&1 ) && - test_match_signal 13 "$OUT" + OUT=$( ((test_terminal git log; echo $? 1>&3) >/dev/null) 3>&1 ) && + test "$OUT" -eq 0 else test_terminal git log fi && @@ -774,7 +760,7 @@ test_expect_success TTY 'git returns SIGPIPE on propagated signals from pager' ' if test_have_prereq !MINGW then - OUT=$( ((test_terminal git log; echo $? 1>&3) | :) 3>&1 ) && + OUT=$( ((test_terminal git log; echo $? 1>&3) >/dev/null) 3>&1 ) && test_match_signal 13 "$OUT" else test_terminal git log