
gitlab-ci: Fix the build-cfi-aarch64 and build-cfi-ppc64-s390x jobs

Message ID 20220603124809.70794-1-thuth@redhat.com (mailing list archive)
State New, archived
Series gitlab-ci: Fix the build-cfi-aarch64 and build-cfi-ppc64-s390x jobs

Commit Message

Thomas Huth June 3, 2022, 12:48 p.m. UTC
The job definitions recently got a second "variables:" section by
accident and thus are failing now if one tries to run them. Merge
the two sections into one again to fix the issue.

And while we're at it, bump the timeout here (70 minutes are currently
not enough for the aarch64 job). The jobs are marked as manual anyway,
so if the user starts them, they want to see their result for sure and
then it's annoying if the job times out too early.

Fixes: e312d1fdbb ("gitlab: convert build/container jobs to .base_job_template")
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 I wonder whether we should remove the build-cfi-aarch64 job instead.
 When I tried to run it during the past months, it was always failing
 for me. This time, I tried to bump the timeout while I was at it,
 and it takes longer than 80 minutes here to finish - so I assume
 nobody ever ran this successfully in the last months... Is anybody
 using this job at all? I think if we want to have CFI coverage here,
 it should get replaced by a custom runner job that runs on a more
 beefy machine... (the ppc64-s390x job is fine by the way, it often
 only runs a little bit longer than 60 minutes - I still bumped the
 timeout here, too, just to be on the safe side)

 .gitlab-ci.d/buildtest.yml | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

Comments

Richard Henderson June 3, 2022, 4:17 p.m. UTC | #1
On 6/3/22 05:48, Thomas Huth wrote:
> The job definitions recently got a second "variables:" section by
> accident and thus are failing now if one tries to run them. Merge
> the two sections into one again to fix the issue.
> 
> And while we're at it, bump the timeout here (70 minutes are currently
> not enough for the aarch64 job). The jobs are marked as manual anyway,
> so if the user starts them, they want to see their result for sure and
> then it's annoying if the job times out too early.
> 
> Fixes: e312d1fdbb ("gitlab: convert build/container jobs to .base_job_template")
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>   I wonder whether we should remove the build-cfi-aarch64 job instead.
>   When I tried to run it during the past months, it was always failing
>   for me. This time, I tried to bump the timeout while I was at it,
>   and it takes longer than 80 minutes here to finish - so I assume
>   nobody ever ran this successfully in the last months... Is anybody
>   using this job at all? I think if we want to have CFI coverage here,
>   it should get replaced by a custom runner job that runs on a more
>   beefy machine... (the ppc64-s390x job is fine by the way, it often
>   only runs a little bit longer than 60 minutes - I still bumped the
>   timeout here, too, just to be on the safe side)

Acked-by: Richard Henderson <richard.henderson@linaro.org>

I think it might be useful to extend the timeouts of the other s390x jobs a bit too.
The last couple of failures had the test *nearly* completing. E.g. your most recent PR:

https://gitlab.com/qemu-project/qemu/-/jobs/2544009687

Whether that indicates a speed regression, host load, or simply changes to the
test suite, I don't know.


r~
Thomas Huth June 3, 2022, 4:32 p.m. UTC | #2
On 03/06/2022 18.17, Richard Henderson wrote:
> On 6/3/22 05:48, Thomas Huth wrote:
>> The job definitions recently got a second "variables:" section by
>> accident and thus are failing now if one tries to run them. Merge
>> the two sections into one again to fix the issue.
>>
>> And while we're at it, bump the timeout here (70 minutes are currently
>> not enough for the aarch64 job). The jobs are marked as manual anyway,
>> so if the user starts them, they want to see their result for sure and
>> then it's annoying if the job times out too early.
>>
>> Fixes: e312d1fdbb ("gitlab: convert build/container jobs to .base_job_template")
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>   I wonder whether we should remove the build-cfi-aarch64 job instead.
>>   When I tried to run it during the past months, it was always failing
>>   for me. This time, I tried to bump the timeout while I was at it,
>>   and it takes longer than 80 minutes here to finish - so I assume
>>   nobody ever ran this successfully in the last months... Is anybody
>>   using this job at all? I think if we want to have CFI coverage here,
>>   it should get replaced by a custom runner job that runs on a more
>>   beefy machine... (the ppc64-s390x job is fine by the way, it often
>>   only runs a little bit longer than 60 minutes - I still bumped the
>>   timeout here, too, just to be on the safe side)
> 
> Acked-by: Richard Henderson <richard.henderson@linaro.org>
> 
> I think it might be useful to extend the timeouts of the other s390x jobs
> a bit too. The last couple of failures had the test *nearly* completing.
> E.g. your most recent PR:
> 
> https://gitlab.com/qemu-project/qemu/-/jobs/2544009687

These tests are running on the custom s390x runner machine. I don't have
access to that one, so I have no means of testing changes there.
It would be great if that change could be done by somebody who has
access to that machine... Peter? Christian?

  Thomas
Alex Bennée June 8, 2022, 2:59 p.m. UTC | #3
Thomas Huth <thuth@redhat.com> writes:

> The job definitions recently got a second "variables:" section by
> accident and thus are failing now if one tries to run them. Merge
> the two sections into one again to fix the issue.
>
> And while we're at it, bump the timeout here (70 minutes are currently
> not enough for the aarch64 job). The jobs are marked as manual anyway,
> so if the user starts them, they want to see their result for sure and
> then it's annoying if the job times out too early.
>
> Fixes: e312d1fdbb ("gitlab: convert build/container jobs to .base_job_template")
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  I wonder whether we should remove the build-cfi-aarch64 job instead.
>  When I tried to run it during the past months, it was always failing
>  for me. This time, I tried to bump the timeout while I was at it,
>  and it takes longer than 80 minutes here to finish - so I assume
>  nobody ever ran this successfully in the last months... Is anybody
>  using this job at all? I think if we want to have CFI coverage here,
>  it should get replaced by a custom runner job that runs on a more
>  beefy machine... (the ppc64-s390x job is fine by the way, it often
>  only runs a little bit longer than 60 minutes - I still bumped the
>  timeout here, too, just to be on the safe side)
>
>  .gitlab-ci.d/buildtest.yml | 22 ++++++++++------------
>  1 file changed, 10 insertions(+), 12 deletions(-)
>
> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
> index ecac3ec50c..baaa0ebb87 100644
> --- a/.gitlab-ci.d/buildtest.yml
> +++ b/.gitlab-ci.d/buildtest.yml
> @@ -355,16 +355,15 @@ build-cfi-aarch64:
>        --enable-safe-stack --enable-slirp=git
>      TARGETS: aarch64-softmmu
>      MAKE_CHECK_ARGS: check-build
> -  timeout: 70m
> -  artifacts:
> -    expire_in: 2 days
> -    paths:
> -      - build
> -  variables:
>      # FIXME: This job is often failing, likely due to out-of-memory problems in
>      # the constrained containers of the shared runners. Thus this is marked as
>      # skipped until the situation has been solved.
>      QEMU_JOB_SKIPPED: 1
> +  timeout: 90m
> +  artifacts:
> +    expire_in: 2 days
> +    paths:
> +      - build
>  
>  check-cfi-aarch64:
>    extends: .native_test_job_template
> @@ -396,16 +395,15 @@ build-cfi-ppc64-s390x:
>        --enable-safe-stack --enable-slirp=git
>      TARGETS: ppc64-softmmu s390x-softmmu
>      MAKE_CHECK_ARGS: check-build
> -  timeout: 70m
> -  artifacts:
> -    expire_in: 2 days
> -    paths:
> -      - build
> -  variables:
>      # FIXME: This job is often failing, likely due to out-of-memory problems in
>      # the constrained containers of the shared runners. Thus this is marked as
>      # skipped until the situation has been solved.
>      QEMU_JOB_SKIPPED: 1
> +  timeout: 80m
> +  artifacts:
> +    expire_in: 2 days
> +    paths:
> +      - build
>  
>  check-cfi-ppc64-s390x:
>    extends: .native_test_job_template

Queued to testing/next, thanks.

Patch

diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index ecac3ec50c..baaa0ebb87 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -355,16 +355,15 @@  build-cfi-aarch64:
       --enable-safe-stack --enable-slirp=git
     TARGETS: aarch64-softmmu
     MAKE_CHECK_ARGS: check-build
-  timeout: 70m
-  artifacts:
-    expire_in: 2 days
-    paths:
-      - build
-  variables:
     # FIXME: This job is often failing, likely due to out-of-memory problems in
     # the constrained containers of the shared runners. Thus this is marked as
     # skipped until the situation has been solved.
     QEMU_JOB_SKIPPED: 1
+  timeout: 90m
+  artifacts:
+    expire_in: 2 days
+    paths:
+      - build
 
 check-cfi-aarch64:
   extends: .native_test_job_template
@@ -396,16 +395,15 @@  build-cfi-ppc64-s390x:
       --enable-safe-stack --enable-slirp=git
     TARGETS: ppc64-softmmu s390x-softmmu
     MAKE_CHECK_ARGS: check-build
-  timeout: 70m
-  artifacts:
-    expire_in: 2 days
-    paths:
-      - build
-  variables:
     # FIXME: This job is often failing, likely due to out-of-memory problems in
     # the constrained containers of the shared runners. Thus this is marked as
     # skipped until the situation has been solved.
     QEMU_JOB_SKIPPED: 1
+  timeout: 80m
+  artifacts:
+    expire_in: 2 days
+    paths:
+      - build
 
 check-cfi-ppc64-s390x:
   extends: .native_test_job_template
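
As an aside, the failure mode fixed here (one job mapping carrying two
"variables:" keys) can be caught with a small script before a pipeline is
started. The following is only a minimal sketch and not part of the patch or
of QEMU's CI: the function name find_duplicate_keys and the BROKEN_JOB sample
are made up for illustration, the sample is a reduced version of the pre-patch
job shown in the diff above, and the scanner only understands simple
indentation-based block mappings rather than full YAML.

```python
import re

# Matches a simple block-mapping key such as "  timeout: 70m" or "variables:".
KEY_RE = re.compile(r"^(\s*)([A-Za-z_][\w.-]*):(?:\s|$)")


def find_duplicate_keys(text):
    """Return (line_number, key) pairs for keys repeated within one mapping."""
    scopes = []      # stack of (indent, keys already seen at that indent)
    duplicates = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.lstrip()
        # Skip blank lines, comments and list items / "--option" continuations.
        if not stripped or stripped.startswith(("#", "-")):
            continue
        match = KEY_RE.match(line)
        if match is None:
            continue
        indent, key = len(match.group(1)), match.group(2)
        # Leaving a nested mapping: drop all deeper scopes.
        while scopes and scopes[-1][0] > indent:
            scopes.pop()
        # Entering a new nested mapping: start a fresh scope.
        if not scopes or scopes[-1][0] < indent:
            scopes.append((indent, set()))
        seen = scopes[-1][1]
        if key in seen:
            duplicates.append((lineno, key))
        seen.add(key)
    return duplicates


# Reduced, hypothetical copy of the job layout before this patch.
BROKEN_JOB = """\
build-cfi-aarch64:
  variables:
    TARGETS: aarch64-softmmu
    MAKE_CHECK_ARGS: check-build
  timeout: 70m
  artifacts:
    expire_in: 2 days
    paths:
      - build
  variables:
    QEMU_JOB_SKIPPED: 1
"""

if __name__ == "__main__":
    for lineno, key in find_duplicate_keys(BROKEN_JOB):
        print(f"line {lineno}: duplicate key '{key}'")
```

Run against the sample, this flags the second "variables:" entry. Once
QEMU_JOB_SKIPPED is moved into the first "variables:" section, as the patch
does, the same check reports nothing.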