t5562: chunked sleep to avoid lost SIGCHILD

Message ID 20190218205028.32486-1-max@max630.net
State New

Commit Message

Max Kirillov Feb. 18, 2019, 8:50 p.m. UTC
It was found during a stress-test run that a test may hang for 60 seconds.
This supposedly happens because SIGCHLD is received before the sleep has
started.

Fix by sleeping in smaller chunks and checking $exited after each of them.
A lost SIGCHLD then cannot cause a delay longer than 1 second.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Max Kirillov <max@max630.net>
---
Submitting as a proper patch. Note: I believe it is not related to the other
issues discussed in this thread.
 t/t5562/invoke-with-content-length.pl | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
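
The chunked wait described in the commit message can be sketched as a small
standalone Perl script. This is an illustration only, not the code in
t/t5562/invoke-with-content-length.pl: the helper name wait_in_chunks and the
immediately exiting child process are assumptions made for the example.

```perl
use strict;
use warnings;

my $exited = 0;
# Record that the child has exited; installed before fork so it cannot be missed.
$SIG{CHLD} = sub { $exited = 1; };

# Hypothetical helper: sleep in 1-second chunks until the flag behind
# $flag_ref becomes true or $max_seconds chunks have been slept.
# Returns the number of chunks slept.
sub wait_in_chunks {
    my ($flag_ref, $max_seconds) = @_;
    my $counter = 0;
    while (not $$flag_ref and $counter < $max_seconds) {
        sleep 1;    # returns early if a signal arrives during the sleep
        $counter++;
    }
    return $counter;
}

# Hypothetical stand-in for the command under test: a child that exits at once.
my $pid = fork();
die "Cannot fork: $!" unless defined $pid;
exit 0 if $pid == 0;

my $chunks = wait_in_chunks(\$exited, 60);
waitpid($pid, 0);    # reap the child
print "exited=$exited after $chunks chunk(s)\n";
```

Whether SIGCHLD arrives before the loop, between the check and the sleep, or
during the sleep, the flag is re-checked after at most one 1-second chunk.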

Comments

Randall S. Becker Feb. 18, 2019, 8:54 p.m. UTC | #1
On February 18, 2019 15:50, Max Kirillov wrote:
> -sleep 60; # is interrupted by SIGCHLD
> +my $counter = 0;
> +while (not $exited and $counter < 60) {
> +        sleep 1;
> +        $counter = $counter + 1;
> +}
> +
>  if (!$exited) {
>          close($out);
>          die "Command did not exit after reading whole body";

I tried this fix and it made no difference to the hang on NonStop. I do
not think this fixes the root cause as sleep was never an issue and
SIGCHLD was not missed in any test I conducted. Maybe on another
platform it is required.

Max Kirillov Feb. 18, 2019, 8:59 p.m. UTC | #2
On Mon, Feb 18, 2019 at 03:54:27PM -0500, Randall S. Becker wrote:
> I tried this fix and it made no difference to the hang on
> NonStop. I do not think this fixes the root cause as sleep
> was never an issue and SIGCHLD was not missed in any test
> I conducted. Maybe on another platform it is required.

Correct; as I said, it should not be related.

Junio C Hamano Feb. 19, 2019, 6:38 p.m. UTC | #3
Max Kirillov <max@max630.net> writes:

> If was found during stress-test run that a test may hang by 60 seconds.
> It supposedly happens because SIGCHILD was received before sleep has
> started.
>
> Fix by looping by smaller chunks, checking $exited after each of them.
> Then lost SIGCHILD would not cause longer delay than 1 second.
>
> Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
> Signed-off-by: Max Kirillov <max@max630.net>
> ---
> Submitting as proper patch. Note: I believe it does not relate to other issues
> discussed in this thread.
>  t/t5562/invoke-with-content-length.pl | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/t/t5562/invoke-with-content-length.pl b/t/t5562/invoke-with-content-length.pl
> index 0943474af2..257e280e3b 100644
> --- a/t/t5562/invoke-with-content-length.pl
> +++ b/t/t5562/invoke-with-content-length.pl
> @@ -29,7 +29,12 @@
>  }
>  print $out $body_data or die "Cannot write data: $!";
>  
> -sleep 60; # is interrupted by SIGCHLD

Ah, of course.  If SIGCHLD interrupts the sleep and the handler sets
$exited, then we won't go back to sleep.  But if the signal comes before
the sleep starts, we spend the full 60 seconds here before we check $exited.

Makes sense.

> +my $counter = 0;
> +while (not $exited and $counter < 60) {
> +        sleep 1;
> +        $counter = $counter + 1;
> +}
> +
>  if (!$exited) {
>          close($out);
>          die "Command did not exit after reading whole body";

Patch

diff --git a/t/t5562/invoke-with-content-length.pl b/t/t5562/invoke-with-content-length.pl
index 0943474af2..257e280e3b 100644
--- a/t/t5562/invoke-with-content-length.pl
+++ b/t/t5562/invoke-with-content-length.pl
@@ -29,7 +29,12 @@ 
 }
 print $out $body_data or die "Cannot write data: $!";
 
-sleep 60; # is interrupted by SIGCHLD
+my $counter = 0;
+while (not $exited and $counter < 60) {
+        sleep 1;
+        $counter = $counter + 1;
+}
+
 if (!$exited) {
         close($out);
         die "Command did not exit after reading whole body";
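
The race discussed in the thread can be reproduced deterministically in a
short sketch. It makes several assumptions that are not in the patch: SIGUSR1
stands in for SIGCHLD so that no child process is needed, the script sends the
signal to itself before any sleep starts, and the timeout is 2 seconds instead
of 60.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);    # sub-second timestamps; sleep stays the builtin

my $exited = 0;
# SIGUSR1 is an assumed stand-in for SIGCHLD in this demo.
$SIG{USR1} = sub { $exited = 1; };

# Deliver the signal before any sleep has started (the "lost" wakeup).
kill 'USR1', $$;

# Old approach: one long sleep. The signal is already handled, so nothing
# interrupts the sleep and it runs for its full duration.
my $start = time;
sleep 2;
my $single = time - $start;

# Patched approach: check $exited before every 1-second chunk.
$start = time;
my $counter = 0;
while (not $exited and $counter < 2) {
    sleep 1;
    $counter = $counter + 1;
}
my $chunked = time - $start;

printf "single sleep: %.2fs, chunked loop: %.2fs\n", $single, $chunked;
```

On a POSIX system the single sleep should take about the full 2 seconds, while
the chunked loop should return almost immediately because $exited is already
set when the loop condition is first evaluated.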