[v2,02/10] t7422: fix flaky test caused by buffered stdout

Message ID	20250106-b4-pks-ci-fixes-v2-2-06ae540771b7@pks.im (mailing list archive)
State	Superseded
Headers	show Received: from fhigh-a2-smtp.messagingengine.com (fhigh-a2-smtp.messagingengine.com [103.168.172.153]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F130E1DD0D6 for <git@vger.kernel.org>; Mon, 6 Jan 2025 11:16:58 +0000 (UTC) Feedback-ID: i197146af:Fastmail From: Patrick Steinhardt <ps@pks.im> Date: Mon, 06 Jan 2025 12:16:51 +0100 Subject: [PATCH v2 02/10] t7422: fix flaky test caused by buffered stdout Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Message-Id: <20250106-b4-pks-ci-fixes-v2-2-06ae540771b7@pks.im> References: <20250106-b4-pks-ci-fixes-v2-0-06ae540771b7@pks.im> In-Reply-To: <20250106-b4-pks-ci-fixes-v2-0-06ae540771b7@pks.im> To: git@vger.kernel.org Cc: Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>
Series	A couple of CI improvements \| expand [v2,00/10] A couple of CI improvements [v2,01/10] t0060: fix EBUSY in MinGW when setting up runtime prefix [v2,02/10] t7422: fix flaky test caused by buffered stdout [v2,03/10] github: adapt containerized jobs to be rootless [v2,04/10] github: convert all Linux jobs to be containerized [v2,05/10] github: simplify computation of the job's distro [v2,06/10] gitlab-ci: remove the "linux-old" job [v2,07/10] gitlab-ci: add linux32 job testing against i386 [v2,08/10] ci: stop special-casing for Ubuntu 16.04 [v2,09/10] ci: use latest Ubuntu release [v2,10/10] ci: remove stale code for Azure Pipelines

Message ID

20250106-b4-pks-ci-fixes-v2-2-06ae540771b7@pks.im (mailing list archive)

State

Superseded

Headers

Feedback-ID: i197146af:Fastmail
From: Patrick Steinhardt <ps@pks.im>
Date: Mon, 06 Jan 2025 12:16:51 +0100
Subject: [PATCH v2 02/10] t7422: fix flaky test caused by buffered stdout
Precedence: bulk
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <20250106-b4-pks-ci-fixes-v2-2-06ae540771b7@pks.im>
References: <20250106-b4-pks-ci-fixes-v2-0-06ae540771b7@pks.im>
In-Reply-To: <20250106-b4-pks-ci-fixes-v2-0-06ae540771b7@pks.im>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>

Series

A couple of CI improvements | expand

Commit Message

Patrick Steinhardt Jan. 6, 2025, 11:16 a.m. UTC

One test in t7422 asserts that `git submodule status --recursive`
properly handles SIGPIPE. This test is flaky though and may sometimes
not see a SIGPIPE at all:

    expecting success of 7422.18 'git submodule status --recursive propagates SIGPIPE':
            { git submodule status --recursive 2>err; echo $?>status; } |
                    grep -q X/S &&
            test_must_be_empty err &&
            test_match_signal 13 "$(cat status)"
    ++ git submodule status --recursive
    ++ grep -q X/S
    ++ echo 0
    ++ test_must_be_empty err
    ++ test 1 -ne 1
    ++ test_path_is_file err
    ++ test 1 -ne 1
    ++ test -f err
    ++ test -s err
    +++ cat status
    ++ test_match_signal 13 0
    ++ test 0 = 141
    ++ test 0 = 269
    ++ return 1
    error: last command exited with $?=1
    not ok 18 - git submodule status --recursive propagates SIGPIPE

The issue is caused by us using grep(1) to terminate the pipe on the
first matching line in the recursing git-submodule(1) process. Standard
streams are typically buffered though, so this condition is racy and may
cause us to terminate the pipe after git-submodule(1) has already
exited, and in that case we wouldn't see the expected signal.

Fix the issue by making the writer fill the pipe buffer before we
execute git-submodule(1). Ideally, it would be git-submodule(1) itself
that does produce all that data, but it would require us to create a
large amount of submodules, which is inefficient. Instead, we use Perl
to print gibberish until the buffer is filled.

To verify that this works as expected one can apply the following patch
to the preimage of this commit, which used to reliably trigger the race:

    diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
    index f21e920367..9338c75626 100755
    --- a/t/t7422-submodule-output.sh
    +++ b/t/t7422-submodule-output.sh
    @@ -168,7 +168,7 @@ done

     test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
            { git submodule status --recursive 2>err; echo $?>status; } |
    -		grep -q X/S &&
    +		{ sleep 1 && grep -q X/S; } &&
            test_must_be_empty err &&
            test_match_signal 13 "$(cat status)"
     '

With the pipe-stuffing workaround the test runs successfully.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 t/t7422-submodule-output.sh | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Comments

Jeff King Jan. 7, 2025, 2:49 a.m. UTC | #1

On Mon, Jan 06, 2025 at 12:16:51PM +0100, Patrick Steinhardt wrote:

> Fix the issue by making the writer fill the pipe buffer before we
> execute git-submodule(1). Ideally, it would be git-submodule(1) itself
> that does produce all that data, but it would require us to create a
> large amount of submodules, which is inefficient. Instead, we use Perl
> to print gibberish until the buffer is filled.
> 
> To verify that this works as expected one can apply the following patch
> to the preimage of this commit, which used to reliably trigger the race:
> 
>     diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
>     index f21e920367..9338c75626 100755
>     --- a/t/t7422-submodule-output.sh
>     +++ b/t/t7422-submodule-output.sh
>     @@ -168,7 +168,7 @@ done
> 
>      test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
>             { git submodule status --recursive 2>err; echo $?>status; } |
>     -		grep -q X/S &&
>     +		{ sleep 1 && grep -q X/S; } &&
>             test_must_be_empty err &&
>             test_match_signal 13 "$(cat status)"
>      '
> 
> With the pipe-stuffing workaround the test runs successfully.

Sadly this isn't enough. The pipe-stuffing solves the race with grep
_starting_ (and thus the extra "sleep"), but the fundamental race we've
seen in practice still remains. See my reply the v1 thread for details.

-Peff

diff --git a/t/t7422-submodule-output.sh b/t/t7422-submodule-output.sh
index f21e9203678b94701281d5339ae8bfe53d5de0ed..976f91b0ebd9d82daee3802a212dd3f4031fe86b 100755
--- a/t/t7422-submodule-output.sh
+++ b/t/t7422-submodule-output.sh
@@ -167,8 +167,14 @@  do
 done
 
 test_expect_success !MINGW 'git submodule status --recursive propagates SIGPIPE' '
-	{ git submodule status --recursive 2>err; echo $?>status; } |
-		grep -q X/S &&
+	{
+		# Stuff pipe buffer full of input so that `git submodule
+		# status` will block on write; this script will write over
+		# 128kb.
+		perl -le "print q{foo} for (1..33000)" &&
+		git submodule status --recursive 2>err
+		echo $?>status
+	} | grep -q X/S &&
 	test_must_be_empty err &&
 	test_match_signal 13 "$(cat status)"
 '

[v2,02/10] t7422: fix flaky test caused by buffered stdout

Commit Message

Comments

Patch