| Message ID | 20241007115027.243425-1-thuth@redhat.com (mailing list archive) |
|---|---|
| State | New, archived |
On Mon, 7 Oct 2024 at 12:50, Thomas Huth <thuth@redhat.com> wrote:
>
> The following changes since commit b5ab62b3c0050612c7f9b0b4baeb44ebab42775a:
>
>   Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2024-10-04 19:28:37 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/thuth/qemu.git tags/pull-request-2024-10-07
>
> for you to fetch changes up to d841f720c98475c0f67695d99f27794bde69ed6e:
>
>   tests/functional: Bump timeout of some tests (2024-10-07 13:21:41 +0200)
>
> ----------------------------------------------------------------
> * Mark "gluster" support as deprecated
> * Update CI to use macOS 14 instead of 13, and add a macOS 15 job
> * Use gitlab mirror for advent calendar test images (seems more stable)
> * Bump timeouts of some tests
> * Remove CRIS disassembler
> * Some m68k and s390x cleanups with regards to load and store APIs
>
> ----------------------------------------------------------------

This suggests it's moving back to the gitlab mirror for the
advent calendar tests, but one CI test still failed trying to access
http://www.qemu-advent-calendar.org/2023/download/day13.tar.gz
and getting a 503 from it:

https://gitlab.com/qemu-project/qemu/-/jobs/8009902301

The clang-system test also hit a couple of timeouts:
https://gitlab.com/qemu-project/qemu/-/jobs/8009902206

 61/109 qemu:qtest+qtest-alpha / qtest-alpha/qmp-cmd-test   TIMEOUT  60.10s  killed by signal 15 SIGTERM
 93/109 qemu:qtest+qtest-arm / qtest-arm/qmp-cmd-test       TIMEOUT  60.04s  killed by signal 15 SIGTERM

which are presumably pre-existing intermittents, but I mention them
here just FYI. Some of the other qmp-cmd-test runs in that job also
came close to timing out:

102/109 qemu:qtest+qtest-m68k / qtest-m68k/qmp-cmd-test      OK  56.56s  65 subtests passed
105/109 qemu:qtest+qtest-mips64 / qtest-mips64/qmp-cmd-test  OK  53.74s  65 subtests passed
106/109 qemu:qtest+qtest-s390x / qtest-s390x/qmp-cmd-test    OK  45.48s  65 subtests passed

so maybe we should add it to slow_tests with a 120s timeout...

thanks
-- PMM
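The "slow_tests" suggestion above most likely refers to the slow_qtests mapping in tests/qtest/meson.build, which overrides meson's default 60s timeout for individual qtest binaries. A minimal sketch of what the proposed bump might look like; the neighbouring entry and the exact values are illustrative, not copied from the tree:

    # tests/qtest/meson.build -- sketch only, surrounding entries elided/illustrative
    slow_qtests = {
      'qom-test'     : 900,   # example of an existing slow-test override
      'qmp-cmd-test' : 120,   # proposed: raise qmp-cmd-test from the default 60s to 120s
    }

The per-target qtest registration then picks the override up with something along the lines of "timeout: slow_qtests.get(test, 60)".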
On Mon, 7 Oct 2024 at 14:43, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Mon, 7 Oct 2024 at 12:50, Thomas Huth <thuth@redhat.com> wrote:
> >
> > The following changes since commit b5ab62b3c0050612c7f9b0b4baeb44ebab42775a:
> >
> >   Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2024-10-04 19:28:37 +0100)
> >
> > are available in the Git repository at:
> >
> >   https://gitlab.com/thuth/qemu.git tags/pull-request-2024-10-07
> >
> > for you to fetch changes up to d841f720c98475c0f67695d99f27794bde69ed6e:
> >
> >   tests/functional: Bump timeout of some tests (2024-10-07 13:21:41 +0200)
> >
> > ----------------------------------------------------------------
> > * Mark "gluster" support as deprecated
> > * Update CI to use macOS 14 instead of 13, and add a macOS 15 job
> > * Use gitlab mirror for advent calendar test images (seems more stable)
> > * Bump timeouts of some tests
> > * Remove CRIS disassembler
> > * Some m68k and s390x cleanups with regards to load and store APIs
> >
> > ----------------------------------------------------------------
>
> This suggests it's moving back to the gitlab mirror for the
> advent calendar tests, but one CI test still failed trying to access
> http://www.qemu-advent-calendar.org/2023/download/day13.tar.gz
> and getting a 503 from it:
>
> https://gitlab.com/qemu-project/qemu/-/jobs/8009902301

On the rerun it managed to download:
https://gitlab.com/qemu-project/qemu/-/jobs/8011303154

> The clang-system test also hit a couple of timeouts:
> https://gitlab.com/qemu-project/qemu/-/jobs/8009902206
>
>  61/109 qemu:qtest+qtest-alpha / qtest-alpha/qmp-cmd-test   TIMEOUT  60.10s  killed by signal 15 SIGTERM
>  93/109 qemu:qtest+qtest-arm / qtest-arm/qmp-cmd-test       TIMEOUT  60.04s  killed by signal 15 SIGTERM
>
> which are presumably pre-existing intermittents, but I mention them
> here just FYI. Some of the other qmp-cmd-test runs in that job also
> came close to timing out:
>
> 102/109 qemu:qtest+qtest-m68k / qtest-m68k/qmp-cmd-test      OK  56.56s  65 subtests passed
> 105/109 qemu:qtest+qtest-mips64 / qtest-mips64/qmp-cmd-test  OK  53.74s  65 subtests passed
> 106/109 qemu:qtest+qtest-s390x / qtest-s390x/qmp-cmd-test    OK  45.48s  65 subtests passed
>
> so maybe we should add it to slow_tests with a 120s timeout...

As expected, these are all intermittents; on the passing job:

https://gitlab.com/qemu-project/qemu/-/jobs/8011303114

they completed in 19s, 20s, 19s, 19s, 19s. So we're seeing
factor-of-3 variation in job runtime on this k8s runner :-(

Anyway, I've pushed this pullreq; we can look at the above
two things as follow-on fixes.

thanks
-- PMM
On 07/10/2024 16.13, Peter Maydell wrote:
> On Mon, 7 Oct 2024 at 14:43, Peter Maydell <peter.maydell@linaro.org> wrote:
>>
>> On Mon, 7 Oct 2024 at 12:50, Thomas Huth <thuth@redhat.com> wrote:
>>>
>>> The following changes since commit b5ab62b3c0050612c7f9b0b4baeb44ebab42775a:
>>>
>>>   Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging (2024-10-04 19:28:37 +0100)
>>>
>>> are available in the Git repository at:
>>>
>>>   https://gitlab.com/thuth/qemu.git tags/pull-request-2024-10-07
>>>
>>> for you to fetch changes up to d841f720c98475c0f67695d99f27794bde69ed6e:
>>>
>>>   tests/functional: Bump timeout of some tests (2024-10-07 13:21:41 +0200)
>>>
>>> ----------------------------------------------------------------
>>> * Mark "gluster" support as deprecated
>>> * Update CI to use macOS 14 instead of 13, and add a macOS 15 job
>>> * Use gitlab mirror for advent calendar test images (seems more stable)
>>> * Bump timeouts of some tests
>>> * Remove CRIS disassembler
>>> * Some m68k and s390x cleanups with regards to load and store APIs
>>>
>>> ----------------------------------------------------------------
>>
>> This suggests it's moving back to the gitlab mirror for the
>> advent calendar tests, but one CI test still failed trying to access
>> http://www.qemu-advent-calendar.org/2023/download/day13.tar.gz
>> and getting a 503 from it:
>>
>> https://gitlab.com/qemu-project/qemu/-/jobs/8009902301

Yes, that day13.tar.gz is from 2023 which is not included in the mirror
on gitlab (yet). If we continue to see failures with the original site,
I can have a try to put it into the mirror repository, too.

> On the rerun it managed to download:
> https://gitlab.com/qemu-project/qemu/-/jobs/8011303154
>
>> The clang-system test also hit a couple of timeouts:
>> https://gitlab.com/qemu-project/qemu/-/jobs/8009902206
>>
>>  61/109 qemu:qtest+qtest-alpha / qtest-alpha/qmp-cmd-test   TIMEOUT  60.10s  killed by signal 15 SIGTERM
>>  93/109 qemu:qtest+qtest-arm / qtest-arm/qmp-cmd-test       TIMEOUT  60.04s  killed by signal 15 SIGTERM
>>
>> which are presumably pre-existing intermittents, but I mention them
>> here just FYI.

I neither had anything related to arm/alpha nor to qtests in my pull
request, so yes, it's likely something pre-existing... maybe something
from the previous pull requests? (or did you see these in the past
already?)

>> Some of the other qmp-cmd-test runs in that job also came close to
>> timing out:
>>
>> 102/109 qemu:qtest+qtest-m68k / qtest-m68k/qmp-cmd-test      OK  56.56s  65 subtests passed
>> 105/109 qemu:qtest+qtest-mips64 / qtest-mips64/qmp-cmd-test  OK  53.74s  65 subtests passed
>> 106/109 qemu:qtest+qtest-s390x / qtest-s390x/qmp-cmd-test    OK  45.48s  65 subtests passed
>>
>> so maybe we should add it to slow_tests with a 120s timeout...

Ok, m68k and s390x have been touched by this PR ... but still, it's one
qtest (qmp-cmd-test) that is failing for multiple targets, so it rather
sounds like we've got a regression in one of the previous PRs?

> As expected, these are all intermittents; on the passing job:
>
> https://gitlab.com/qemu-project/qemu/-/jobs/8011303114
>
> they completed in 19s, 20s, 19s, 19s, 19s. So we're seeing
> factor-of-3 variation in job runtime on this k8s runner :-(
>
> Anyway, I've pushed this pullreq; we can look at the above
> two things as follow-on fixes.

Thanks!
 Thomas
On Mon, 7 Oct 2024 at 17:41, Thomas Huth <thuth@redhat.com> wrote:
>
> On 07/10/2024 16.13, Peter Maydell wrote:
> >> Some of the other qmp-cmd-test runs in that job also came close to
> >> timing out:
> >>
> >> 102/109 qemu:qtest+qtest-m68k / qtest-m68k/qmp-cmd-test      OK  56.56s  65 subtests passed
> >> 105/109 qemu:qtest+qtest-mips64 / qtest-mips64/qmp-cmd-test  OK  53.74s  65 subtests passed
> >> 106/109 qemu:qtest+qtest-s390x / qtest-s390x/qmp-cmd-test    OK  45.48s  65 subtests passed
> >>
> >> so maybe we should add it to slow_tests with a 120s timeout...
>
> Ok, m68k and s390x have been touched by this PR ... but still, it's one
> qtest (qmp-cmd-test) that is failing for multiple targets, so it rather
> sounds like we've got a regression in one of the previous PRs?

I think it's more likely that the k8s runners are just horrifically
inconsistent about speed: they have been the flaky CI jobs in one way
or another at least since I started doing pullreq handling for this
release cycle.

If they reliably ran these jobs in 20s then there would be no issue,
we would have tons of headroom between that and the 60s timeout.
(My local dev box runs them in 13s, and it's not super high-powered.)
If they reliably took 60s then we'd have fixed up the timeouts already
(but that would imply a very slow CPU). Our other option would be to
use that meson "multiply all the timeouts by X" feature for the k8s
jobs. Of course if it does go that slowly for the whole job then we
run into the whole-job timeout...

Paolo: do you have any idea why our k8s runner jobs have such
inconsistent performance ?

-- PMM
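The "multiply all the timeouts by X" feature mentioned above is meson's test timeout multiplier, i.e. the --timeout-multiplier (short form -t) option of "meson test". A rough sketch of an invocation with scaled timeouts; the multiplier value and build directory are illustrative, and wiring this into the k8s CI jobs would depend on how those jobs actually invoke meson test:

    # Scale every per-test timeout by 3 on a slow runner (multiplier value illustrative)
    meson test -C build --timeout-multiplier 3

As noted above, this only stretches the per-test timeouts; a uniformly slow runner can still run into the overall GitLab job timeout.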