[02/10] t7422: fix flaky test caused by buffered stdout

Message ID	20250103-b4-pks-ci-fixes-v1-2-a9bb95dff833@pks.im (mailing list archive)
State	Superseded
Headers	show Received: from fout-a3-smtp.messagingengine.com (fout-a3-smtp.messagingengine.com [103.168.172.146]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3050D1FA82F for <git@vger.kernel.org>; Fri, 3 Jan 2025 14:47:08 +0000 (UTC) Feedback-ID: i197146af:Fastmail From: Patrick Steinhardt <ps@pks.im> Date: Fri, 03 Jan 2025 15:46:39 +0100 Subject: [PATCH 02/10] t7422: fix flaky test caused by buffered stdout Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Message-Id: <20250103-b4-pks-ci-fixes-v1-2-a9bb95dff833@pks.im> References: <20250103-b4-pks-ci-fixes-v1-0-a9bb95dff833@pks.im> In-Reply-To: <20250103-b4-pks-ci-fixes-v1-0-a9bb95dff833@pks.im> To: git@vger.kernel.org Cc: X-Mailer: b4 0.14.2
Series	A couple of CI improvements \| expand [00/10] A couple of CI improvements [01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix [02/10] t7422: fix flaky test caused by buffered stdout [03/10] github: adapt containerized jobs to be rootless [04/10] github: convert all Linux jobs to be containerized [05/10] github: simplify computation of the job's distro [06/10] gitlab-ci: remove the "linux-old" job [07/10] gitlab-ci: add linux32 job testing against i386 [08/10] ci: stop special-casing for Ubuntu 16.04 [09/10] ci: use latest Ubuntu release [10/10] ci: remove stale code for Azure Pipelines

Patrick Steinhardt Jan. 3, 2025, 2:46 p.m. UTC

One test in t7422 asserts that `git submodule status --recursive`
properly handles SIGPIPE. This test is flaky though and may sometimes
not see a SIGPIPE at all:

    expecting success of 7422.18 'git submodule status --recursive propagates SIGPIPE':
            { git submodule status --recursive 2>err; echo $?>status; } |
                    grep -q X/S &&
            test_must_be_empty err &&
            test_match_signal 13 "$(cat status)"
    ++ git submodule status --recursive
    ++ grep -q X/S
    ++ echo 0
    ++ test_must_be_empty err
    ++ test 1 -ne 1
    ++ test_path_is_file err
    ++ test 1 -ne 1
    ++ test -f err
    ++ test -s err
    +++ cat status
    ++ test_match_signal 13 0
    ++ test 0 = 141
    ++ test 0 = 269
    ++ return 1
    error: last command exited with $?=1
    not ok 18 - git submodule status --recursive propagates SIGPIPE

The issue is caused by us using grep(1) to terminate the pipe on the
first matching line in the recursing git-submodule(1) process. Standard
streams are typically buffered though, so this condition is racy and may
cause us to terminate the pipe after git-submodule(1) has already
exited, and in that case we wouldn't see the expected signal.

Fix the issue by converting standard streams to be unbuffered. I have
only been able to reproduce this issue a single time after running t7422
with `--stress` after an extended amount of time, so I cannot claim to
be fully certain that this fix is sufficient.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t7422-submodule-output.sh | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Jeff King Jan. 3, 2025, 6:17 p.m. UTC | #1

On Fri, Jan 03, 2025 at 03:46:39PM +0100, Patrick Steinhardt wrote:

> One test in t7422 asserts that `git submodule status --recursive`
> properly handles SIGPIPE. This test is flaky though and may sometimes
> not see a SIGPIPE at all:
> 
>     expecting success of 7422.18 'git submodule status --recursive propagates SIGPIPE':
>             { git submodule status --recursive 2>err; echo $?>status; } |
>                     grep -q X/S &&
>             test_must_be_empty err &&
>             test_match_signal 13 "$(cat status)"

I couldn't reproduce with --stress, but you can trigger it all the time
with:

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index f21e920367..9338c75626 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -168,7 +168,7 @@ done
 
 test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
 	{ git submodule status --recursive 2>err; echo $?>status; } |
-		grep -q X/S &&
+		{ sleep 1 && grep -q X/S; } &&
 	test_must_be_empty err &&
 	test_match_signal 13 "$(cat status)"
 '

The problem is that git-submodule may write all of its output before
grep exits, and it gets stored in the pipe buffer. And then even if grep
exits before reading all of it, it is too late for SIGPIPE, and the data
in the pipe is just discarded by the OS.

So this:

> The issue is caused by us using grep(1) to terminate the pipe on the
> first matching line in the recursing git-submodule(1) process. Standard
> streams are typically buffered though, so this condition is racy and may
> cause us to terminate the pipe after git-submodule(1) has already
> exited, and in that case we wouldn't see the expected signal.
> 
> Fix the issue by converting standard streams to be unbuffered. I have
> only been able to reproduce this issue a single time after running t7422
> with `--stress` after an extended amount of time, so I cannot claim to
> be fully certain that this fix is sufficient.

isn't quite right. Even without input buffering on grep's part, it may
be too slow to read the data. And adding a sleep as above shows that it
still fails with your patch.

The usual way to reliably get SIGPIPE is to make sure the writer
produces enough data to fill the pipe buffer. But it's tricky to get
"submodule status" to produce a lot of data without having a ton of
submodules, which is expensive to set up.

But we can hack around it by stuffing the pipe full with a separate
process. Like this:

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index f21e920367..c4df2629e8 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -167,8 +167,15 @@ do
 done
 
 test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
-	{ git submodule status --recursive 2>err; echo $?>status; } |
-		grep -q X/S &&
+	{
+		# stuff pipe buffer full of input so that submodule status
+		# will require blocking on write; this script will write over
+		# 128kb. It might itself get SIGPIPE, so we must not &&-chain
+		# it directly.
+		{ perl -le "print q{foo} for (1..33000)" || true; } &&
+		git submodule status --recursive 2>err
+		echo $? >status
+	} | { sleep 1 && head -n 1 >/dev/null; } &&
 	test_must_be_empty err &&
 	test_match_signal 13 "$(cat status)"
 '
A few notes:

  - the sleep is still there to demonstrate that it always works, but
    obviously we'd want to remove that

  - I swapped out "grep" for "head". What we are matching is not
    relevant; the important thing is that the reader closes the pipe
    immediately. So I guess in that sense we could probably even just
    pipe to "true" or similar.

  - I tried using test_seq to avoid the inline perl, but it doesn't
    work! The problem is that it's implemented as a shell function. So
    when it gets SIGPIPE, the whole subshell is killed, and we never
    even run git-submodule at all. So it has to be a separate process
    (though I guess it could be test_seq in a subshell).

Anyway, hopefully that gives you enough to play around with.

-Peff

Patrick Steinhardt Jan. 6, 2025, 11:12 a.m. UTC | #2

On Fri, Jan 03, 2025 at 01:17:39PM -0500, Jeff King wrote:
> On Fri, Jan 03, 2025 at 03:46:39PM +0100, Patrick Steinhardt wrote:
> > One test in t7422 asserts that `git submodule status --recursive`
> > properly handles SIGPIPE. This test is flaky though and may sometimes
> > not see a SIGPIPE at all:
> > 
> >     expecting success of 7422.18 'git submodule status --recursive propagates SIGPIPE':
> >             { git submodule status --recursive 2>err; echo $?>status; } |
> >                     grep -q X/S &&
> >             test_must_be_empty err &&
> >             test_match_signal 13 "$(cat status)"
> 
> I couldn't reproduce with --stress, but you can trigger it all the time
> with:
> 
> diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
> index f21e920367..9338c75626 100755
> --- a/t/t7422-submodule-output.sh
> +++ b/t/t7422-submodule-output.sh
> @@ -168,7 +168,7 @@ done
>  
>  test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
>  	{ git submodule status --recursive 2>err; echo $?>status; } |
> -		grep -q X/S &&
> +		{ sleep 1 && grep -q X/S; } &&
>  	test_must_be_empty err &&
>  	test_match_signal 13 "$(cat status)"
>  '
> 
> The problem is that git-submodule may write all of its output before
> grep exits, and it gets stored in the pipe buffer. And then even if grep
> exits before reading all of it, it is too late for SIGPIPE, and the data
> in the pipe is just discarded by the OS.
> 
> So this:
> 
> > The issue is caused by us using grep(1) to terminate the pipe on the
> > first matching line in the recursing git-submodule(1) process. Standard
> > streams are typically buffered though, so this condition is racy and may
> > cause us to terminate the pipe after git-submodule(1) has already
> > exited, and in that case we wouldn't see the expected signal.
> > 
> > Fix the issue by converting standard streams to be unbuffered. I have
> > only been able to reproduce this issue a single time after running t7422
> > with `--stress` after an extended amount of time, so I cannot claim to
> > be fully certain that this fix is sufficient.
> 
> isn't quite right. Even without input buffering on grep's part, it may
> be too slow to read the data. And adding a sleep as above shows that it
> still fails with your patch.

Great. I was hoping to nerd-snipe somebody into helping me out with the
last sentence in my above paragraph :) Happy to see that you bit.

> The usual way to reliably get SIGPIPE is to make sure the writer
> produces enough data to fill the pipe buffer. But it's tricky to get
> "submodule status" to produce a lot of data without having a ton of
> submodules, which is expensive to set up.
> 
> But we can hack around it by stuffing the pipe full with a separate
> process. Like this:
> 
> diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
> index f21e920367..c4df2629e8 100755
> --- a/t/t7422-submodule-output.sh
> +++ b/t/t7422-submodule-output.sh
> @@ -167,8 +167,15 @@ do
>  done
>  
>  test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
> -	{ git submodule status --recursive 2>err; echo $?>status; } |
> -		grep -q X/S &&
> +	{
> +		# stuff pipe buffer full of input so that submodule status
> +		# will require blocking on write; this script will write over
> +		# 128kb. It might itself get SIGPIPE, so we must not &&-chain
> +		# it directly.
> +		{ perl -le "print q{foo} for (1..33000)" || true; } &&
> +		git submodule status --recursive 2>err
> +		echo $? >status
> +	} | { sleep 1 && head -n 1 >/dev/null; } &&
>  	test_must_be_empty err &&
>  	test_match_signal 13 "$(cat status)"
>  '
> A few notes:
> 
>   - the sleep is still there to demonstrate that it always works, but
>     obviously we'd want to remove that

Nice, this indeed lets me reproduce the issue reliably.

>   - I swapped out "grep" for "head". What we are matching is not
>     relevant; the important thing is that the reader closes the pipe
>     immediately. So I guess in that sense we could probably even just
>     pipe to "true" or similar.

I think the grep(1) is relevant though. The test explicitly verifies
that `--recursive` propagates SIGPIPE, so we must make sure that we
trigger the SIGPIPE when the child process produces output, not when the
parent process produces it. That's why we grep for "X/S", where "X" is a
submodule -- it means that we know that it is currently the subprocess
doing its thing.

It also simplifies the code a bit given that the call to Perl doesn't
need `|| true` anymore.

>   - I tried using test_seq to avoid the inline perl, but it doesn't
>     work! The problem is that it's implemented as a shell function. So
>     when it gets SIGPIPE, the whole subshell is killed, and we never
>     even run git-submodule at all. So it has to be a separate process
>     (though I guess it could be test_seq in a subshell).

And that one should also work if we retain the grep. I wonder though
whether we shouldn't prefer to use Perl regardless as it's likely to be
faster when generating all that gibberish. Perl is basically a hard
prerequisite for our tests anyway, so it doesn't really hurt to call it
here.

Patrick

Jeff King Jan. 7, 2025, 2:39 a.m. UTC | #3

On Mon, Jan 06, 2025 at 12:12:22PM +0100, Patrick Steinhardt wrote:

> > isn't quite right. Even without input buffering on grep's part, it may
> > be too slow to read the data. And adding a sleep as above shows that it
> > still fails with your patch.
> 
> Great. I was hoping to nerd-snipe somebody into helping me out with the
> last sentence in my above paragraph :) Happy to see that you bit.

I think I am a sucker for SIGPIPE races.

> >   - I swapped out "grep" for "head". What we are matching is not
> >     relevant; the important thing is that the reader closes the pipe
> >     immediately. So I guess in that sense we could probably even just
> >     pipe to "true" or similar.
> 
> I think the grep(1) is relevant though. The test explicitly verifies
> that `--recursive` propagates SIGPIPE, so we must make sure that we
> trigger the SIGPIPE when the child process produces output, not when the
> parent process produces it. That's why we grep for "X/S", where "X" is a
> submodule -- it means that we know that it is currently the subprocess
> doing its thing.

Hmm, I see what you mean. I don't think we can do that reliably, though,
or that the perl byte-stuffing is actually helping.

As I wrote it, perl always gets SIGPIPE first (because either "head"
exits while it is writing, or it fills up the pipe buffer and blocks,
waiting for head to exit, and then sees the pipe close).

And thus when we run git-submodule, the pipe is reliably closed and
we'll see SIGPIPE.

But with grep, that does not happen. The grep will run through all of
the data from perl (since it does not contain X/S), and there will not
be anything left in the pipe buffer by the time git-submodule starts. So
all of that data did nothing (though it fools the "sleep 1 && grep" from
losing the race because perl will block until grep starts, after the
sleep is finished).

And so we're left with the same race as before. git-submodule writes the
X/S line, grep reads it and then tries to exit while git-submodule is
writing more. And either:

  a. grep may exit immediately, before git-submodule writes any more
     data. In which case git sees SIGPIPE, which is what we want.

  a. git-submodule may write all of its data before grep exits. It will
     not block, because all of the stuff perl put in the buffer is long
     since gone, having been read by grep already. The data goes into
     the pipe buffer, and git-submodule has no idea it is discarded when
     grep exits. The test fails.

It's hard to simulate this one with a sleep, because it requires either
git-submodule to write quickly, or for grep to be slow after reading the
matching line but before exiting.

For the latter you can do:

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index 976f91b0eb..e2961e57dc 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -174,7 +174,7 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE'
 		perl -le "print q{foo} for (1..33000)" &&
 		git submodule status --recursive 2>err
 		echo $?>status
-	} | grep -q X/S &&
+	} | { grep -q X/S && sleep 1; } &&
 	test_must_be_empty err &&
 	test_match_signal 13 "$(cat status)"
 '

on top of your patch, which reliably fails the test. I know that looks
kind of ridiculous and fake, but you can imagine it as that first grep
just taking a long time to call exit() and close the pipe.

It's hard to make git-submodule faster, because its output is really
coming from recursive invocations of itself. But you could imagine a
world where we do the submodule recursion in a single process, buffering
it via stdio, and then write all of the lines at once. And then
git-submodule always wins the race (it issues a single write() syscall
and then exits), and the test fails.

To make the test reliable, you'd need to pause or fill the pipe buffer
_after_ writing X/S via git-submodule but before writing the rest of the
data. Or to perhaps convince git-submodule only to write the recursive
data, and then pre-stuff the pipe as I suggested earlier. But I'm not
sure how to do the latter. Even if we ask for:

  git submodule status --recursive -- X

it will print out the status of "X" before recursing into it to show
X/S, etc, which will give us SIGPIPE in the parent submodule process,
not the recursive one.

For the former, I guess you'd need some hook that runs when we recurse
into the submodule and dumps a bunch of garbage into the pipe buffer.
But I don't think there is any such hook that runs here. Unless perhaps
you abused core.fsmonitor or something, but I don't think that's
portable.

So I don't really see a way to do this robustly.

-Peff

Jeff King Jan. 7, 2025, 2:48 a.m. UTC | #4

On Mon, Jan 06, 2025 at 12:12:22PM +0100, Patrick Steinhardt wrote:

> >   - I tried using test_seq to avoid the inline perl, but it doesn't
> >     work! The problem is that it's implemented as a shell function. So
> >     when it gets SIGPIPE, the whole subshell is killed, and we never
> >     even run git-submodule at all. So it has to be a separate process
> >     (though I guess it could be test_seq in a subshell).
> 
> And that one should also work if we retain the grep. I wonder though
> whether we shouldn't prefer to use Perl regardless as it's likely to be
> faster when generating all that gibberish. Perl is basically a hard
> prerequisite for our tests anyway, so it doesn't really hurt to call it
> here.

I don't think we should pursue this direction any more because we need
to get the SIGPIPE mid-way through the git-submodule command (see the
other message I just sent).

But because it is a basic technique for establishing a reliable SIGPIPE,
and we might end up using it elsewhere, I thought I'd post a slightly
improved version.

The two things I didn't like about what I posted earlier were:

  - the guess at the pipe buffer size. 128k is probably enough in
    practice, but it's not guaranteed.

  - piping to "head" actually made our buffer size guess worse. We know
    that "head" is going to read the first line and then exit. But how
    much more data might it read? It might easily buffer 4k or even 8k,
    leaving the buffer not-quite full.

So I think a simpler and more robust version is just this:

  {
	{ yes || true; } &&
	command_expecting_sigpipe; echo $? >status
  } | true

We'll keep producing data in "yes" until the pipe is closed. So it will
closed before command_expecting_sigpipe even starts, and there is no
race there. And because we're using "true" on the right-hand side of the
pipe, nothing is read at all from the pipe. So there's no guessing about
how much might have been read.

And it works no matter how slow the right-hand side of the pipe is
(e.g., you can add a "sleep 1" there and it still works).

Like I said, this won't help our current situation, but after having
spent a little time on it (before realizing that) I figured it was worth
documenting.

-Peff

Patrick Steinhardt Jan. 7, 2025, 8:47 a.m. UTC | #5

On Mon, Jan 06, 2025 at 09:39:04PM -0500, Jeff King wrote:
> So I don't really see a way to do this robustly.

I think I found a way, which goes back to the inital idea of just
generating heaps of submodules. My current version generates a submodule
"A" with a couple of recursive submodules followed by 2.5k additional
submodules, which overall generates ~150kB of data. This can be done
somewhat efficiently via git-hash-object-object(1) and git-mktree(1),
and things work with a sleep before and after the call to grep(1).

I'm a bit torn though. The required setup is quite complex, and I wonder
whether it is really worth it just to test this edge case. On the other
hand it is there to cover a recent fix in 082caf527e (submodule status:
propagate SIGPIPE, 2024-09-20), so losing the test coverage isn't all
that great, either. And keeping the race is not an option to me, either.

So I'm inclined to go with the below version. WDYT?

Patrick


diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index f21e920367..fbfc60936c 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -167,10 +167,38 @@ do
 done
 
 test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
-	{ git submodule status --recursive 2>err; echo $?>status; } |
-		grep -q X/S &&
-	test_must_be_empty err &&
-	test_match_signal 13 "$(cat status)"
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		git clone . subrepo &&
+
+		COMMIT=$(git rev-parse HEAD) &&
+		for i in $(test_seq 2500)
+		do
+			printf "[submodule \"sm-$i\"]\npath = submodule-path-$i\n" "$i" ||
+			return 1
+		done >gitmodules &&
+		BLOB=$(git hash-object -w --stdin <gitmodules) &&
+
+		printf "100644 blob $BLOB\t.gitmodules\n" >tree &&
+		for i in $(test_seq 2500)
+		do
+			printf "160000 commit $COMMIT\tsubmodule-path-%d\n" "$i" ||
+			return 1
+		done >>tree &&
+		TREE=$(git mktree <tree) &&
+
+		COMMIT=$(git commit-tree "$TREE") &&
+		git reset --hard "$COMMIT" &&
+		GIT_ALLOW_PROTOCOL=file git submodule add "$(pwd)"/../X A &&
+
+		{ git submodule status --recursive 2>err; echo $?>status; } |
+			{ sleep 1 && grep -q A/S && sleep 1; } &&
+		test_must_be_empty err &&
+		test_match_signal 13 "$(cat status)"
+	)
 '
 
 test_done

Patrick Steinhardt Jan. 7, 2025, 8:50 a.m. UTC | #6

On Tue, Jan 07, 2025 at 09:47:44AM +0100, Patrick Steinhardt wrote:
> On Mon, Jan 06, 2025 at 09:39:04PM -0500, Jeff King wrote:
> > So I don't really see a way to do this robustly.
> 
> I think I found a way, which goes back to the inital idea of just
> generating heaps of submodules. My current version generates a submodule
> "A" with a couple of recursive submodules followed by 2.5k additional
> submodules, which overall generates ~150kB of data. This can be done
> somewhat efficiently via git-hash-object-object(1) and git-mktree(1),
> and things work with a sleep before and after the call to grep(1).
> 
> I'm a bit torn though. The required setup is quite complex, and I wonder
> whether it is really worth it just to test this edge case. On the other
> hand it is there to cover a recent fix in 082caf527e (submodule status:
> propagate SIGPIPE, 2024-09-20), so losing the test coverage isn't all
> that great, either. And keeping the race is not an option to me, either.
> 
> So I'm inclined to go with the below version. WDYT?

Gah, this of course needs to be adapted so that it is the submodule that
contains 2.5k recursive submodules. But the idea would still work.

Patrick

Junio C Hamano Jan. 7, 2025, 4:02 p.m. UTC | #7

Jeff King <peff@peff.net> writes:

> So I think a simpler and more robust version is just this:
>
>   {
> 	{ yes || true; } &&
> 	command_expecting_sigpipe; echo $? >status
>   } | true
>
> We'll keep producing data in "yes" until the pipe is closed. So it will
> closed before command_expecting_sigpipe even starts, and there is no
> race there. And because we're using "true" on the right-hand side of the
> pipe, nothing is read at all from the pipe. So there's no guessing about
> how much might have been read.

;-)

> Like I said, this won't help our current situation, but after having
> spent a little time on it (before realizing that) I figured it was worth
> documenting.

It is always fun to ses these clever hacks on the list.

Thanks.

[02/10] t7422: fix flaky test caused by buffered stdout

Commit Message

Comments

Patch