[PULL,00/12] CI fixes and various clean-ups

Message ID 20241007115027.243425-1-thuth@redhat.com (mailing list archive)
State New, archived

Pull-request

https://gitlab.com/thuth/qemu.git tags/pull-request-2024-10-07

Message

Thomas Huth Oct. 7, 2024, 11:50 a.m. UTC
The following changes since commit b5ab62b3c0050612c7f9b0b4baeb44ebab42775a:

  Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2024-10-04 19:28:37 +0100)

are available in the Git repository at:

  https://gitlab.com/thuth/qemu.git tags/pull-request-2024-10-07

for you to fetch changes up to d841f720c98475c0f67695d99f27794bde69ed6e:

  tests/functional: Bump timeout of some tests (2024-10-07 13:21:41 +0200)

----------------------------------------------------------------
* Mark "gluster" support as deprecated
* Update CI to use macOS 14 instead of 13, and add a macOS 15 job
* Use gitlab mirror for advent calendar test images (seems more stable)
* Bump timeouts of some tests
* Remove CRIS disassembler
* Some m68k and s390x cleanups with regards to load and store APIs

----------------------------------------------------------------
Michael Tokarev (1):
      gitlab-ci/build-oss-fuzz: print FAILED marker in case the test failed and run all tests

Philippe Mathieu-Daudé (8):
      .gitlab-ci.d/cirrus: Drop support for macOS 13 (Ventura)
      .gitlab-ci.d/cirrus: Add manual testing of macOS 15 (Sequoia)
      disas: Remove CRIS disassembler
      hw/m68k: Use explicit big-endian LD/ST API
      target/m68k: Use explicit big-endian LD/ST API
      hw/s390x: Use explicit big-endian LD/ST API
      target/s390x: Replace ldtul_p() -> ldq_p()
      target/s390x: Use explicit big-endian LD/ST API

Thomas Huth (3):
      docs: Mark "gluster" support in QEMU as deprecated
      tests/functional: Switch back to the gitlab URLs for the advent calendar tests
      tests/functional: Bump timeout of some tests

 MAINTAINERS                                        |    5 -
 docs/about/deprecated.rst                          |    9 +
 meson.build                                        |    1 -
 qapi/block-core.json                               |    8 +-
 hw/m68k/bootinfo.h                                 |   28 +-
 include/disas/dis-asm.h                            |    6 -
 include/exec/poison.h                              |    1 -
 block/gluster.c                                    |    2 +
 disas/cris.c                                       | 2863 --------------------
 hw/m68k/mcf5208.c                                  |    2 +-
 hw/m68k/next-cube.c                                |    2 +-
 hw/m68k/q800.c                                     |    4 +-
 hw/s390x/ipl.c                                     |    4 +-
 hw/s390x/s390-pci-inst.c                           |  166 +-
 target/m68k/gdbstub.c                              |    2 +-
 target/m68k/helper.c                               |   10 +-
 target/s390x/gdbstub.c                             |   34 +-
 target/s390x/ioinst.c                              |    2 +-
 .gitlab-ci.d/buildtest.yml                         |    5 +-
 .gitlab-ci.d/cirrus.yml                            |   12 +-
 .../cirrus/{macos-13.vars => macos-15.vars}        |    2 +-
 disas/meson.build                                  |    1 -
 tests/docker/dockerfiles/opensuse-leap.docker      |    2 +-
 tests/functional/meson.build                       |    9 +-
 tests/functional/test_arm_vexpress.py              |    2 +-
 tests/functional/test_m68k_mcf5208evb.py           |    2 +-
 tests/functional/test_or1k_sim.py                  |    2 +-
 tests/functional/test_ppc64_e500.py                |    2 +-
 tests/functional/test_ppc_mac.py                   |    2 +-
 tests/functional/test_sh4_r2d.py                   |    2 +-
 tests/functional/test_sparc_sun4m.py               |    2 +-
 tests/functional/test_xtensa_lx60.py               |    2 +-
 tests/lcitool/libvirt-ci                           |    2 +-
 tests/lcitool/refresh                              |    2 +-
 34 files changed, 173 insertions(+), 3027 deletions(-)
 delete mode 100644 disas/cris.c
 rename .gitlab-ci.d/cirrus/{macos-13.vars => macos-15.vars} (95%)

Comments

Peter Maydell Oct. 7, 2024, 1:43 p.m. UTC | #1
On Mon, 7 Oct 2024 at 12:50, Thomas Huth <thuth@redhat.com> wrote:
>
> The following changes since commit b5ab62b3c0050612c7f9b0b4baeb44ebab42775a:
>
>   Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2024-10-04 19:28:37 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/thuth/qemu.git tags/pull-request-2024-10-07
>
> for you to fetch changes up to d841f720c98475c0f67695d99f27794bde69ed6e:
>
>   tests/functional: Bump timeout of some tests (2024-10-07 13:21:41 +0200)
>
> ----------------------------------------------------------------
> * Mark "gluster" support as deprecated
> * Update CI to use macOS 14 instead of 13, and add a macOS 15 job
> * Use gitlab mirror for advent calendar test images (seems more stable)
> * Bump timeouts of some tests
> * Remove CRIS disassembler
> * Some m68k and s390x cleanups with regards to load and store APIs
>
> ----------------------------------------------------------------

This suggests it's moving back to the gitlab mirror for the
advent calendar tests, but one CI test still failed trying
to access http://www.qemu-advent-calendar.org/2023/download/day13.tar.gz
and getting a 503 from it:

  https://gitlab.com/qemu-project/qemu/-/jobs/8009902301

The clang-system test also hit a couple of timeouts:
  https://gitlab.com/qemu-project/qemu/-/jobs/8009902206

61/109 qemu:qtest+qtest-alpha / qtest-alpha/qmp-cmd-test
  TIMEOUT 60.10s killed by signal 15 SIGTERM
93/109 qemu:qtest+qtest-arm / qtest-arm/qmp-cmd-test
  TIMEOUT 60.04s killed by signal 15 SIGTERM

which are presumably pre-existing intermittents, but I
mention them here just FYI. Some of the other qmp-cmd-test
runs in that job also came close to timing out:

102/109 qemu:qtest+qtest-m68k / qtest-m68k/qmp-cmd-test OK 56.56s 65
subtests passed
105/109 qemu:qtest+qtest-mips64 / qtest-mips64/qmp-cmd-test OK 53.74s
65 subtests passed
106/109 qemu:qtest+qtest-s390x / qtest-s390x/qmp-cmd-test OK 45.48s 65
subtests passed

so maybe we should add it to slow_tests with a 120s
timeout...
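
To check how many runs in a saved job log are close to the limit, something like the following could be used. It is only a sketch: the regular expression is guessed from the excerpts above rather than from a documented meson log format, the function name near_timeout and the 0.75 threshold are made up for illustration, and the 60s default is simply the timeout seen in those excerpts.

```python
import re

# Matches entries such as
#   "102/109 qemu:qtest+qtest-m68k / qtest-m68k/qmp-cmd-test OK 56.56s"
# ASSUMPTION: pattern inferred from the log excerpts quoted in this thread.
RESULT_RE = re.compile(
    r"(?P<index>\d+)/(?P<total>\d+)\s+"
    r"(?P<suite>\S+)\s+/\s+(?P<name>\S+)\s+"
    r"(?P<status>OK|TIMEOUT|FAIL)\s+"
    r"(?P<seconds>\d+(?:\.\d+)?)s"
)


def near_timeout(log_text, timeout=60.0, threshold=0.75):
    """Return (name, status, seconds) for runs at or above threshold * timeout.

    Hypothetical helper: timeout and threshold are illustrative defaults.
    """
    # Collapse whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    hits = []
    for match in RESULT_RE.finditer(flat):
        seconds = float(match.group("seconds"))
        if seconds >= threshold * timeout:
            hits.append((match.group("name"), match.group("status"), seconds))
    return hits
```

Called on the text of a job log, it returns both the runs that were killed at the timeout and the ones that passed with little headroom, which is the set that a longer slow_tests timeout would need to cover.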

thanks
-- PMM

Peter Maydell Oct. 7, 2024, 2:13 p.m. UTC | #2
On Mon, 7 Oct 2024 at 14:43, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Mon, 7 Oct 2024 at 12:50, Thomas Huth <thuth@redhat.com> wrote:
> >
> > The following changes since commit b5ab62b3c0050612c7f9b0b4baeb44ebab42775a:
> >
> >   Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2024-10-04 19:28:37 +0100)
> >
> > are available in the Git repository at:
> >
> >   https://gitlab.com/thuth/qemu.git tags/pull-request-2024-10-07
> >
> > for you to fetch changes up to d841f720c98475c0f67695d99f27794bde69ed6e:
> >
> >   tests/functional: Bump timeout of some tests (2024-10-07 13:21:41 +0200)
> >
> > ----------------------------------------------------------------
> > * Mark "gluster" support as deprecated
> > * Update CI to use macOS 14 instead of 13, and add a macOS 15 job
> > * Use gitlab mirror for advent calendar test images (seems more stable)
> > * Bump timeouts of some tests
> > * Remove CRIS disassembler
> > * Some m68k and s390x cleanups with regards to load and store APIs
> >
> > ----------------------------------------------------------------
>
> This suggests it's moving back to the gitlab mirror for the
> advent calendar tests, but one CI test still failed trying
> to access http://www.qemu-advent-calendar.org/2023/download/day13.tar.gz
> and getting a 503 from it:
>
>   https://gitlab.com/qemu-project/qemu/-/jobs/8009902301

On the rerun it managed to download:
https://gitlab.com/qemu-project/qemu/-/jobs/8011303154

> The clang-system test also hit a couple of timeouts:
>   https://gitlab.com/qemu-project/qemu/-/jobs/8009902206
>
> 61/109 qemu:qtest+qtest-alpha / qtest-alpha/qmp-cmd-test
>   TIMEOUT 60.10s killed by signal 15 SIGTERM
> 93/109 qemu:qtest+qtest-arm / qtest-arm/qmp-cmd-test
>   TIMEOUT 60.04s killed by signal 15 SIGTERM
>
> which are presumably pre-existing intermittents, but I
> mention them here just FYI. Some of the other qmp-cmd-test
> runs in that job also came close to timing out:
>
> 102/109 qemu:qtest+qtest-m68k / qtest-m68k/qmp-cmd-test OK 56.56s 65
> subtests passed
> 105/109 qemu:qtest+qtest-mips64 / qtest-mips64/qmp-cmd-test OK 53.74s
> 65 subtests passed
> 106/109 qemu:qtest+qtest-s390x / qtest-s390x/qmp-cmd-test OK 45.48s 65
> subtests passed
>
> so maybe we should add it to slow_tests with a 120s
> timeout...

As expected, these are all intermittents; on the passing job:

https://gitlab.com/qemu-project/qemu/-/jobs/8011303114

they completed in 19s, 20s, 19s, 19s, 19s. So we're seeing
factor-of-3 variation in job runtime on this k8s runner :-(

Anyway, I've pushed this pullreq; we can look at the above
two things as follow-on fixes.

thanks
-- PMM

Thomas Huth Oct. 7, 2024, 4:41 p.m. UTC | #3
On 07/10/2024 16.13, Peter Maydell wrote:
> On Mon, 7 Oct 2024 at 14:43, Peter Maydell <peter.maydell@linaro.org> wrote:
>>
>> On Mon, 7 Oct 2024 at 12:50, Thomas Huth <thuth@redhat.com> wrote:
>>>
>>> The following changes since commit b5ab62b3c0050612c7f9b0b4baeb44ebab42775a:
>>>
>>>    Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2024-10-04 19:28:37 +0100)
>>>
>>> are available in the Git repository at:
>>>
>>>    https://gitlab.com/thuth/qemu.git tags/pull-request-2024-10-07
>>>
>>> for you to fetch changes up to d841f720c98475c0f67695d99f27794bde69ed6e:
>>>
>>>    tests/functional: Bump timeout of some tests (2024-10-07 13:21:41 +0200)
>>>
>>> ----------------------------------------------------------------
>>> * Mark "gluster" support as deprecated
>>> * Update CI to use macOS 14 instead of 13, and add a macOS 15 job
>>> * Use gitlab mirror for advent calendar test images (seems more stable)
>>> * Bump timeouts of some tests
>>> * Remove CRIS disassembler
>>> * Some m68k and s390x cleanups with regards to load and store APIs
>>>
>>> ----------------------------------------------------------------
>>
>> This suggests it's moving back to the gitlab mirror for the
>> advent calendar tests, but one CI test still failed trying
>> to access http://www.qemu-advent-calendar.org/2023/download/day13.tar.gz
>> and getting a 503 from it:
>>
>>    https://gitlab.com/qemu-project/qemu/-/jobs/8009902301

Yes, that day13.tar.gz is from 2023, which is not included in the mirror on
gitlab (yet). If we continue to see failures with the original site, I can
try to add it to the mirror repository, too.

> On the rerun it managed to download:
> https://gitlab.com/qemu-project/qemu/-/jobs/8011303154
> 
>> The clang-system test also hit a couple of timeouts:
>>    https://gitlab.com/qemu-project/qemu/-/jobs/8009902206
>>
>> 61/109 qemu:qtest+qtest-alpha / qtest-alpha/qmp-cmd-test
>>    TIMEOUT 60.10s killed by signal 15 SIGTERM
>> 93/109 qemu:qtest+qtest-arm / qtest-arm/qmp-cmd-test
>>    TIMEOUT 60.04s killed by signal 15 SIGTERM
>>
>> which are presumably pre-existing intermittents, but I
>> mention them here just FYI.

I had nothing related to arm/alpha or to qtests in my pull
request, so yes, it's likely something pre-existing... maybe something from
the previous pull requests? (Or did you see these in the past already?)

>> Some of the other qmp-cmd-test
>> runs in that job also came close to timing out:
>>
>> 102/109 qemu:qtest+qtest-m68k / qtest-m68k/qmp-cmd-test OK 56.56s 65
>> subtests passed
>> 105/109 qemu:qtest+qtest-mips64 / qtest-mips64/qmp-cmd-test OK 53.74s
>> 65 subtests passed
>> 106/109 qemu:qtest+qtest-s390x / qtest-s390x/qmp-cmd-test OK 45.48s 65
>> subtests passed
>>
>> so maybe we should add it to slow_tests with a 120s
>> timeout...

Ok, m68k and s390x have been touched by this PR ... but still, it's one 
qtest (qmp-cmd-test) that is failing for multiple targets, so it rather 
sounds like we've got a regression in one of the previous PRs?

> As expected, these are all intermittents; on the passing job:
> 
> https://gitlab.com/qemu-project/qemu/-/jobs/8011303114
> 
> they completed in 19s, 20s, 19s, 19s, 19s. So we're seeing
> factor-of-3 variation in job runtime on this k8s runner :-(
> 
> Anyway, I've pushed this pullreq; we can look at the above
> two things as follow-on fixes.

Thanks!

   Thomas

Peter Maydell Oct. 7, 2024, 4:51 p.m. UTC | #4
On Mon, 7 Oct 2024 at 17:41, Thomas Huth <thuth@redhat.com> wrote:
>
> On 07/10/2024 16.13, Peter Maydell wrote:
> >> Some of the other qmp-cmd-test
> >> runs in that job also came close to timing out:
> >>
> >> 102/109 qemu:qtest+qtest-m68k / qtest-m68k/qmp-cmd-test OK 56.56s 65
> >> subtests passed
> >> 105/109 qemu:qtest+qtest-mips64 / qtest-mips64/qmp-cmd-test OK 53.74s
> >> 65 subtests passed
> >> 106/109 qemu:qtest+qtest-s390x / qtest-s390x/qmp-cmd-test OK 45.48s 65
> >> subtests passed
> >>
> >> so maybe we should add it to slow_tests with a 120s
> >> timeout...
>
> Ok, m68k and s390x have been touched by this PR ... but still, it's one
> qtest (qmp-cmd-test) that is failing for multiple targets, so it rather
> sounds like we've got a regression in one of the previous PRs?

I think it's more likely that the k8s runners are just
horrifically inconsistent about speed: they have been
the flaky CI jobs in one way or another at least since
I started doing pullreq handling for this release cycle.

If they reliably ran these jobs in 20s then there would be
no issue, we would have tons of headroom between that and
the 60s timeout. (My local dev box runs them in 13s, and
it's not super high-powered.) If they reliably took 60s
then we'd have fixed up the timeouts already (but that
would imply a very slow CPU).

Our other option would be to use that meson "multiply
all the timeouts by X" feature for the k8s jobs. Of
course if it does go that slowly for the whole job
then we run into the whole-job timeout...
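
As a rough illustration of that trade-off, the arithmetic can be sketched as follows. The helper names, the 1.5x headroom and any job limit passed in are assumptions for illustration, not values from QEMU's CI configuration; the only meson feature assumed is the --timeout-multiplier (-t) option of "meson test".

```python
import math


def required_multiplier(worst_seconds, base_timeout, headroom=1.5):
    """Smallest integer m with base_timeout * m >= worst_seconds * headroom.

    Hypothetical helper: headroom is an illustrative safety factor.
    """
    return max(1, math.ceil(worst_seconds * headroom / base_timeout))


def fits_in_job(total_test_seconds, slowdown, job_timeout_seconds):
    """True if the whole test run, slowed by `slowdown`, fits the job limit."""
    return total_test_seconds * slowdown <= job_timeout_seconds
```

With the 60.10s worst case seen above and a 60s base timeout, required_multiplier(60.10, 60.0) gives 2. fits_in_job() is the second half of the problem: a larger multiplier only helps if the slowed-down run as a whole still finishes within the job's own limit.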

Paolo: do you have any idea why our k8s runner jobs
have such inconsistent performance?

-- PMM