[v4,02/10] t7422: fix flaky test caused by buffered stdout

Message ID 20250110-b4-pks-ci-fixes-v4-2-6e4613446080@pks.im (mailing list archive)
State Accepted
Commit 65f586132bfa21c3e9fe7b2803ef526133a3b269
Series A couple of CI improvements

Commit Message

Patrick Steinhardt Jan. 10, 2025, 11:31 a.m. UTC
One test in t7422 asserts that `git submodule status --recursive`
properly handles SIGPIPE. This test is flaky though and may sometimes
not see a SIGPIPE at all:

    expecting success of 7422.18 'git submodule status --recursive propagates SIGPIPE':
            { git submodule status --recursive 2>err; echo $?>status; } |
                    grep -q X/S &&
            test_must_be_empty err &&
            test_match_signal 13 "$(cat status)"
    ++ git submodule status --recursive
    ++ grep -q X/S
    ++ echo 0
    ++ test_must_be_empty err
    ++ test 1 -ne 1
    ++ test_path_is_file err
    ++ test 1 -ne 1
    ++ test -f err
    ++ test -s err
    +++ cat status
    ++ test_match_signal 13 0
    ++ test 0 = 141
    ++ test 0 = 269
    ++ return 1
    error: last command exited with $?=1
    not ok 18 - git submodule status --recursive propagates SIGPIPE

The issue is caused by a race between git-submodule(1) and grep(1):

  1. git-submodule(1) (or its child process) writes the first X/S line
     we're trying to match.

  2. grep(1) matches the line.

  3a. grep(1) exits, closing the pipe.

  3b. git-submodule(1) (or its child process) writes the rest of its
      lines.

Steps 3a and 3b run concurrently with no ordering guarantee. If 3a
happens first, the remaining writes hit a closed pipe and we get
SIGPIPE. Otherwise we don't, and the test fails.
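
The two orderings can be reproduced outside of the test suite with a
small sketch. Nothing below comes from t7422: `producer` is a
hypothetical stand-in for git-submodule(1), and the sleeps only exist to
force one ordering or the other. The sketch assumes that SIGPIPE has its
default disposition in the calling shell.

```shell
#!/bin/sh
# Hypothetical stand-in for git-submodule(1): print the matching line,
# wait "$1" seconds, then print one more line. The body runs in a
# subshell so that SIGPIPE kills only the producer, not the caller.
producer () (
	printf 'X/S\n'
	sleep "$1"
	printf 'rest\n'
)

tmp=$(mktemp -d)

# 3a before 3b: grep exits on the first line while the producer is still
# sleeping, so the second write hits a pipe without readers.
{ producer 1; echo $? >"$tmp/closed_first"; } | grep -q X/S

# 3b before 3a: the reader starts late, so both lines already sit in the
# pipe by the time grep exits and no write ever fails.
{ producer 0; echo $? >"$tmp/wrote_first"; } | { sleep 1; grep -q X/S; }

closed_first=$(cat "$tmp/closed_first")
wrote_first=$(cat "$tmp/wrote_first")
rm -rf "$tmp"

echo "3a first: producer status $closed_first"
echo "3b first: producer status $wrote_first"
```

In the first pipeline the recorded status should be the shell's encoding
of signal 13, which is what `test_match_signal 13` accepts (141, or 269
on ksh). In the second it should be 0, matching the failing trace above.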

Fix the issue by generating a couple thousand nested submodules and
matching on the first nested submodule. This ensures that the recursive
git-submodule(1) process completely fills its stdout buffer, which makes
subsequent writes block until the downstream consumer of the pipe either
reads more or closes it.
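
The effect of stuffing the pipe can be sketched without any submodules.
The `producer` function below is hypothetical and merely imitates the
shape of the output. The line count and padding are assumptions, chosen
so that the total output (a few megabytes) is far larger than a typical
pipe buffer plus whatever grep(1) reads before it exits.

```shell
#!/bin/sh
# Hypothetical producer: emits 20000 padded lines, far more than the
# pipe is assumed to hold. "%0100d" pads each line with 100 zeros. The
# body runs in a subshell so that SIGPIPE kills only the producer.
producer () (
	i=1
	while [ "$i" -le 20000 ]
	do
		printf 'recursive-submodule-path-%d %0100d\n' "$i" 0 || exit 1
		i=$((i + 1))
	done
)

tmp=$(mktemp -d)

# grep matches the very first line and exits. The producer cannot have
# finished by then because its output does not fit into the pipe, so a
# later write is guaranteed to hit a pipe without readers.
{ producer 2>"$tmp/err"; echo $? >"$tmp/status"; } |
	grep -q recursive-submodule-path-1

status=$(cat "$tmp/status")
err=$(cat "$tmp/err")
rm -rf "$tmp"

echo "producer status: $status"
```

Unlike the two-line case, this needs no sleeps to be deterministic: the
producer has to block on the full pipe until grep(1) goes away.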

To verify that this works as expected one can apply the following patch
to the preimage of this commit, which used to reliably trigger the race:

    diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
    index 3c5177cc30..df6001f8a0 100755
    --- a/t/t7422-submodule-output.sh
    +++ b/t/t7422-submodule-output.sh
    @@ -202,7 +202,7 @@ test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE'
     		cd repo &&
     		GIT_ALLOW_PROTOCOL=file git submodule add "$(pwd)"/../submodule &&
     		{ git submodule status --recursive 2>err; echo $?>status; } |
    -			grep -q recursive-submodule-path-1 &&
    +			{ sleep 1 && grep -q recursive-submodule-path-1 && sleep 1; } &&
     		test_must_be_empty err &&
     		test_match_signal 13 "$(cat status)"
     	)

With the pipe-stuffing workaround the test runs successfully.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t7422-submodule-output.sh | 43 +++++++++++++++++++++++++++++++++++++++----
 1 file changed, 39 insertions(+), 4 deletions(-)

Comments

Christian Couder Jan. 24, 2025, 9:16 a.m. UTC | #1
On Fri, Jan 10, 2025 at 12:32 PM Patrick Steinhardt <ps@pks.im> wrote:

> Fix the issue by generating a couple thousand nested submodules and
> matching on the first nested submodule. This ensures that the recursive
> git-submodule(1) process completely fills its stdout buffer,

The patch looks great to me and I like the previous discussion with
Peff about it. I just want to say that, after reading the discussion
and then this paragraph, I wondered if it would have been possible to
instead have a `test-tool submodule` helper that would behave the same
as `git submodule` except that it would call setvbuf() to reduce the
size of the stdout buffer. This might have allowed a test that didn't
need 2000 nested submodules, and thus might have been faster. No need
to change anything though.

> which makes
> subsequent writes block until the downstream consumer of the pipe either
> reads more or closes it.

Patch

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index f21e9203678b94701281d5339ae8bfe53d5de0ed..023a5cbdc44bac2389fca45cf7017750627c4ce9 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -167,10 +167,45 @@  do
 done
 
 test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
-	{ git submodule status --recursive 2>err; echo $?>status; } |
-		grep -q X/S &&
-	test_must_be_empty err &&
-	test_match_signal 13 "$(cat status)"
+	# The test setup is somewhat involved because triggering a SIGPIPE is
+	# racy with buffered pipes. To avoid the raciness we thus need to make
+	# sure that the subprocess in question fills the buffers completely,
+	# which requires a couple thousand submodules in total.
+	test_when_finished "rm -rf submodule repo" &&
+	git init submodule &&
+	(
+		cd submodule &&
+		test_commit initial &&
+
+		COMMIT=$(git rev-parse HEAD) &&
+		for i in $(test_seq 2000)
+		do
+			printf "[submodule \"sm-$i\"]\npath = recursive-submodule-path-$i\n" "$i" ||
+			return 1
+		done >gitmodules &&
+		BLOB=$(git hash-object -w --stdin <gitmodules) &&
+
+		printf "100644 blob $BLOB\t.gitmodules\n" >tree &&
+		for i in $(test_seq 2000)
+		do
+			printf "160000 commit $COMMIT\trecursive-submodule-path-%d\n" "$i" ||
+			return 1
+		done >>tree &&
+		TREE=$(git mktree <tree) &&
+
+		COMMIT=$(git commit-tree "$TREE") &&
+		git reset --hard "$COMMIT"
+	) &&
+
+	git init repo &&
+	(
+		cd repo &&
+		GIT_ALLOW_PROTOCOL=file git submodule add "$(pwd)"/../submodule &&
+		{ git submodule status --recursive 2>err; echo $?>status; } |
+			grep -q recursive-submodule-path-1 &&
+		test_must_be_empty err &&
+		test_match_signal 13 "$(cat status)"
+	)
 '
 
 test_done