[v2] tests/tcg: Skip failing ppc64 multi-threaded tests

Message ID 20240725154003.428065-1-npiggin@gmail.com (mailing list archive)
State New, archived

Commit Message

Nicholas Piggin July 25, 2024, 3:40 p.m. UTC
In Gitlab CI, some ppc64 multi-threaded tcg tests crash when run in the
clang-user job with an assertion failure in glibc that seems to
indicate corruption:

  signals: allocatestack.c:223: allocate_stack:
    Assertion `powerof2 (pagesize_m1 + 1)' failed.

Disable these tests for now.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 tests/tcg/ppc64/Makefile.target | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Alex Bennée July 25, 2024, 8:22 p.m. UTC | #1
Nicholas Piggin <npiggin@gmail.com> writes:

> In Gitlab CI, some ppc64 multi-threaded tcg tests crash when run in the
> clang-user job with an assertion failure in glibc that seems to
> indicate corruption:
>
>   signals: allocatestack.c:223: allocate_stack:
>     Assertion `powerof2 (pagesize_m1 + 1)' failed.
>
> Disable these tests for now.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  tests/tcg/ppc64/Makefile.target | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/tests/tcg/ppc64/Makefile.target b/tests/tcg/ppc64/Makefile.target
> index 8c3e4e4038..509a20be2b 100644
> --- a/tests/tcg/ppc64/Makefile.target
> +++ b/tests/tcg/ppc64/Makefile.target
> @@ -11,6 +11,18 @@ config-cc.mak: Makefile
>  
>  -include config-cc.mak
>  
> +# multi-threaded tests are known to fail (e.g., clang-user CI job)
> +# See: https://gitlab.com/qemu-project/qemu/-/issues/2456

Given this is only a problem with clang, can we apply this
workaround only if we detect "clang" in $(CC)?

> +run-signals: signals
> +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> +run-plugin-signals-with-%:
> +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> +
> +run-threadcount: threadcount
> +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> +run-plugin-threadcount-with-%:
> +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> +
>  ifneq ($(CROSS_CC_HAS_POWER8_VECTOR),)
>  PPC64_TESTS=bcdsub non_signalling_xscv
>  endif
Alex Bennée July 25, 2024, 8:29 p.m. UTC | #2
Alex Bennée <alex.bennee@linaro.org> writes:

> Nicholas Piggin <npiggin@gmail.com> writes:
>
>> In Gitlab CI, some ppc64 multi-threaded tcg tests crash when run in the
>> clang-user job with an assertion failure in glibc that seems to
>> indicate corruption:
>>
>>   signals: allocatestack.c:223: allocate_stack:
>>     Assertion `powerof2 (pagesize_m1 + 1)' failed.
>>
>> Disable these tests for now.
>>
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>  tests/tcg/ppc64/Makefile.target | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/tests/tcg/ppc64/Makefile.target b/tests/tcg/ppc64/Makefile.target
>> index 8c3e4e4038..509a20be2b 100644
>> --- a/tests/tcg/ppc64/Makefile.target
>> +++ b/tests/tcg/ppc64/Makefile.target
>> @@ -11,6 +11,18 @@ config-cc.mak: Makefile
>>  
>>  -include config-cc.mak
>>  
>> +# multi-threaded tests are known to fail (e.g., clang-user CI job)
>> +# See: https://gitlab.com/qemu-project/qemu/-/issues/2456
>
> Given this is only a problem with clang, can we apply this
> workaround only if we detect "clang" in $(CC)?

ifeq ($(findstring clang,$(CC)),clang)
...
endif

should do the trick
>
>> +run-signals: signals
>> +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
>> +run-plugin-signals-with-%:
>> +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
>> +
>> +run-threadcount: threadcount
>> +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
>> +run-plugin-threadcount-with-%:
>> +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
>> +
>>  ifneq ($(CROSS_CC_HAS_POWER8_VECTOR),)
>>  PPC64_TESTS=bcdsub non_signalling_xscv
>>  endif
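
A minimal sketch of the conditional suggested above, wrapped around the skip rules from the patch. It is a fragment for tests/tcg/ppc64/Makefile.target, not a stand-alone makefile: it assumes `skip-test` is provided by the including tests/tcg makefiles, as in the patch, and that `$(CC)` names the compiler of interest, which may not hold in this directory.

```makefile
# Sketch only: apply the skips when "clang" appears anywhere in $(CC).
# $(findstring clang,$(CC)) expands to "clang" on a match and to the
# empty string otherwise, so the ifeq is true only on a match.
ifeq ($(findstring clang,$(CC)),clang)
run-signals: signals
	$(call skip-test, $<, "BROKEN (flaky with clang) ")
run-plugin-signals-with-%:
	$(call skip-test, $<, "BROKEN (flaky with clang) ")

run-threadcount: threadcount
	$(call skip-test, $<, "BROKEN (flaky with clang) ")
run-plugin-threadcount-with-%:
	$(call skip-test, $<, "BROKEN (flaky with clang) ")
endif
```

Note that findstring is a plain substring match, so it would also match a compiler path or wrapper name that merely contains "clang".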
Nicholas Piggin July 25, 2024, 10:23 p.m. UTC | #3
On Fri Jul 26, 2024 at 6:29 AM AEST, Alex Bennée wrote:
> Alex Bennée <alex.bennee@linaro.org> writes:
>
> > Nicholas Piggin <npiggin@gmail.com> writes:
> >
> >> In Gitlab CI, some ppc64 multi-threaded tcg tests crash when run in the
> >> clang-user job with an assertion failure in glibc that seems to
> >> indicate corruption:
> >>
> >>   signals: allocatestack.c:223: allocate_stack:
> >>     Assertion `powerof2 (pagesize_m1 + 1)' failed.
> >>
> >> Disable these tests for now.
> >>
> >> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> >> ---
> >>  tests/tcg/ppc64/Makefile.target | 12 ++++++++++++
> >>  1 file changed, 12 insertions(+)
> >>
> >> diff --git a/tests/tcg/ppc64/Makefile.target b/tests/tcg/ppc64/Makefile.target
> >> index 8c3e4e4038..509a20be2b 100644
> >> --- a/tests/tcg/ppc64/Makefile.target
> >> +++ b/tests/tcg/ppc64/Makefile.target
> >> @@ -11,6 +11,18 @@ config-cc.mak: Makefile
> >>  
> >>  -include config-cc.mak
> >>  
> >> +# multi-threaded tests are known to fail (e.g., clang-user CI job)
> >> +# See: https://gitlab.com/qemu-project/qemu/-/issues/2456
> >
> > Given this is only a problem with clang, can we apply this
> > workaround only if we detect "clang" in $(CC)?
>
> ifeq ($(findstring clang,$(CC)),clang)
> ...
> endif
>
> should do the trick

I did try that, but unfortunately CC there is the target CC (ppc64 in
this case), not the host compiler. I'll just send the big hammer to get
CI unstuck, and I'll try to work it out later.

Thanks,
Nick
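
The reply above suggests that `$(CC)` in this makefile is the target (ppc64) compiler rather than the compiler used to build QEMU. A stand-alone demo of why the findstring test then never fires is sketched below. It is not QEMU code: `HOST_CC` and `TARGET_CC` are hypothetical variable names, and the default values are assumptions chosen to mimic a clang-built QEMU with a gcc cross toolchain.

```makefile
# Stand-alone demo, not QEMU code.
# HOST_CC and TARGET_CC are hypothetical names used for illustration.
HOST_CC ?= clang
TARGET_CC ?= powerpc64-linux-gnu-gcc

# Testing the target compiler: no "clang" substring, so no match.
ifeq ($(findstring clang,$(TARGET_CC)),clang)
TARGET_MATCH := yes
else
TARGET_MATCH := no
endif

# Testing the host compiler: this is where "clang" would show up.
ifeq ($(findstring clang,$(HOST_CC)),clang)
HOST_MATCH := yes
else
HOST_MATCH := no
endif

.PHONY: show
show:
	@echo "target CC matches clang: $(TARGET_MATCH)"
	@echo "host CC matches clang:   $(HOST_MATCH)"
```

Saved as demo.mk (an arbitrary name), `make -f demo.mk show` reports "no" for the target compiler and "yes" for the host compiler with the defaults above. A real fix would need whatever variable the tests/tcg build actually exposes for the host compiler, which this sketch does not attempt to identify.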
Thomas Huth July 26, 2024, 9:11 a.m. UTC | #4
On 25/07/2024 17.40, Nicholas Piggin wrote:
> In Gitlab CI, some ppc64 multi-threaded tcg tests crash when run in the
> clang-user job with an assertion failure in glibc that seems to
> indicate corruption:
> 
>    signals: allocatestack.c:223: allocate_stack:
>      Assertion `powerof2 (pagesize_m1 + 1)' failed.
> 
> Disable these tests for now.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   tests/tcg/ppc64/Makefile.target | 12 ++++++++++++
>   1 file changed, 12 insertions(+)
> 
> diff --git a/tests/tcg/ppc64/Makefile.target b/tests/tcg/ppc64/Makefile.target
> index 8c3e4e4038..509a20be2b 100644
> --- a/tests/tcg/ppc64/Makefile.target
> +++ b/tests/tcg/ppc64/Makefile.target
> @@ -11,6 +11,18 @@ config-cc.mak: Makefile
>   
>   -include config-cc.mak
>   
> +# multi-threaded tests are known to fail (e.g., clang-user CI job)
> +# See: https://gitlab.com/qemu-project/qemu/-/issues/2456
> +run-signals: signals
> +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> +run-plugin-signals-with-%:
> +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> +
> +run-threadcount: threadcount
> +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> +run-plugin-threadcount-with-%:
> +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> +
>   ifneq ($(CROSS_CC_HAS_POWER8_VECTOR),)
>   PPC64_TESTS=bcdsub non_signalling_xscv
>   endif

Could you please check whether this is already fixed by Richard's patch:

  https://gitlab.com/qemu-project/qemu/-/commit/8e466dd092469e5ab0f355775c57

?

  Thanks,
   Thomas
Nicholas Piggin July 26, 2024, 12:01 p.m. UTC | #5
On Fri Jul 26, 2024 at 7:11 PM AEST, Thomas Huth wrote:
> On 25/07/2024 17.40, Nicholas Piggin wrote:
> > In Gitlab CI, some ppc64 multi-threaded tcg tests crash when run in the
> > clang-user job with an assertion failure in glibc that seems to
> > indicate corruption:
> > 
> >    signals: allocatestack.c:223: allocate_stack:
> >      Assertion `powerof2 (pagesize_m1 + 1)' failed.
> > 
> > Disable these tests for now.
> > 
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> >   tests/tcg/ppc64/Makefile.target | 12 ++++++++++++
> >   1 file changed, 12 insertions(+)
> > 
> > diff --git a/tests/tcg/ppc64/Makefile.target b/tests/tcg/ppc64/Makefile.target
> > index 8c3e4e4038..509a20be2b 100644
> > --- a/tests/tcg/ppc64/Makefile.target
> > +++ b/tests/tcg/ppc64/Makefile.target
> > @@ -11,6 +11,18 @@ config-cc.mak: Makefile
> >   
> >   -include config-cc.mak
> >   
> > +# multi-threaded tests are known to fail (e.g., clang-user CI job)
> > +# See: https://gitlab.com/qemu-project/qemu/-/issues/2456
> > +run-signals: signals
> > +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> > +run-plugin-signals-with-%:
> > +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> > +
> > +run-threadcount: threadcount
> > +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> > +run-plugin-threadcount-with-%:
> > +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> > +
> >   ifneq ($(CROSS_CC_HAS_POWER8_VECTOR),)
> >   PPC64_TESTS=bcdsub non_signalling_xscv
> >   endif
>
> Could you please check whether this is already fixed by Richard's patch:
>
>   https://gitlab.com/qemu-project/qemu/-/commit/8e466dd092469e5ab0f355775c57
>
> ?

No, it doesn't seem to, unfortunately. Here is the same failure:

https://gitlab.com/npiggin/qemu/-/jobs/7436325582

I did build it with clang, using the same sanitizer flags, on my
local system, and could not reproduce it. So I'm not sure.

I might try running the clang-user job without any sanitizer flags.

I already sent this patch with the ppc pull request; I think we
should just do that for now to get clang-user passing. It is simple
enough to revert once fixed.

Thanks,
Nick
Nicholas Piggin July 26, 2024, 4:12 p.m. UTC | #6
On Fri Jul 26, 2024 at 7:11 PM AEST, Thomas Huth wrote:
> On 25/07/2024 17.40, Nicholas Piggin wrote:
> > In Gitlab CI, some ppc64 multi-threaded tcg tests crash when run in the
> > clang-user job with an assertion failure in glibc that seems to
> > indicate corruption:
> > 
> >    signals: allocatestack.c:223: allocate_stack:
> >      Assertion `powerof2 (pagesize_m1 + 1)' failed.
> > 
> > Disable these tests for now.
> > 
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> >   tests/tcg/ppc64/Makefile.target | 12 ++++++++++++
> >   1 file changed, 12 insertions(+)
> > 
> > diff --git a/tests/tcg/ppc64/Makefile.target b/tests/tcg/ppc64/Makefile.target
> > index 8c3e4e4038..509a20be2b 100644
> > --- a/tests/tcg/ppc64/Makefile.target
> > +++ b/tests/tcg/ppc64/Makefile.target
> > @@ -11,6 +11,18 @@ config-cc.mak: Makefile
> >   
> >   -include config-cc.mak
> >   
> > +# multi-threaded tests are known to fail (e.g., clang-user CI job)
> > +# See: https://gitlab.com/qemu-project/qemu/-/issues/2456
> > +run-signals: signals
> > +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> > +run-plugin-signals-with-%:
> > +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> > +
> > +run-threadcount: threadcount
> > +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> > +run-plugin-threadcount-with-%:
> > +	$(call skip-test, $<, "BROKEN (flaky with clang) ")
> > +
> >   ifneq ($(CROSS_CC_HAS_POWER8_VECTOR),)
> >   PPC64_TESTS=bcdsub non_signalling_xscv
> >   endif
>
> Could you please check whether this is already fixed by Richard's patch:
>
>   https://gitlab.com/qemu-project/qemu/-/commit/8e466dd092469e5ab0f355775c57

Okay, removing the sanitizing entirely fixes it, e.g., with this patch:

https://gitlab.com/npiggin/qemu/-/commit/6160a7dd834b2d0e7bb08f13f709693ffa7c8d06

Result:

https://gitlab.com/npiggin/qemu/-/jobs/7436997610

Thanks,
Nick

Patch

diff --git a/tests/tcg/ppc64/Makefile.target b/tests/tcg/ppc64/Makefile.target
index 8c3e4e4038..509a20be2b 100644
--- a/tests/tcg/ppc64/Makefile.target
+++ b/tests/tcg/ppc64/Makefile.target
@@ -11,6 +11,18 @@  config-cc.mak: Makefile
 
 -include config-cc.mak
 
+# multi-threaded tests are known to fail (e.g., clang-user CI job)
+# See: https://gitlab.com/qemu-project/qemu/-/issues/2456
+run-signals: signals
+	$(call skip-test, $<, "BROKEN (flaky with clang) ")
+run-plugin-signals-with-%:
+	$(call skip-test, $<, "BROKEN (flaky with clang) ")
+
+run-threadcount: threadcount
+	$(call skip-test, $<, "BROKEN (flaky with clang) ")
+run-plugin-threadcount-with-%:
+	$(call skip-test, $<, "BROKEN (flaky with clang) ")
+
 ifneq ($(CROSS_CC_HAS_POWER8_VECTOR),)
 PPC64_TESTS=bcdsub non_signalling_xscv
 endif