Message ID | 20231215070357.10888-1-thuth@redhat.com (mailing list archive) |
---|---|
Series | tests: enable meson test timeouts to improve debuggability |
Thomas Huth <thuth@redhat.com> writes:

> This is a respin of Daniel's series that re-enables the meson test
> runner timeouts. To make sure that we do not get into trouble on
> older systems, I ran all the tests with "make check SPEED=slow -j32"
> on my laptop that has only 16 SMT threads, so each test was running
> quite a bit slower than with a normal "-j$(nproc)" run. I think
> that these timeouts should now work in most cases - if not, we still
> can adjust them easily later.

Queued to testing/next, thanks.

15.12.2023 10:03, Thomas Huth wrote:
> This is a respin of Daniel's series that re-enables the meson test
> runner timeouts. To make sure that we do not get into trouble on
> older systems, I ran all the tests with "make check SPEED=slow -j32"
> on my laptop that has only 16 SMT threads, so each test was running
> quite a bit slower than with a normal "-j$(nproc)" run. I think
> that these timeouts should now work in most cases - if not, we still
> can adjust them easily later.

I'm picking this up for stable branches too, since there we have the same
problems in CI environment. In particular, bios-tables-test almost always
times out, even hitting retry doesn't help. Let's see how it goes..

JFYI.

Thanks,

/mjt

On Tue, Jan 23, 2024 at 07:50:09PM +0300, Michael Tokarev wrote:
> 15.12.2023 10:03, Thomas Huth wrote:
> > This is a respin of Daniel's series that re-enables the meson test
> > runner timeouts. To make sure that we do not get into trouble on
> > older systems, I ran all the tests with "make check SPEED=slow -j32"
> > on my laptop that has only 16 SMT threads, so each test was running
> > quite a bit slower than with a normal "-j$(nproc)" run. I think
> > that these timeouts should now work in most cases - if not, we still
> > can adjust them easily later.
>
> I'm picking this up for stable branches too, since there we have the same
> problems in CI environment. In particular, bios-tables-test almost always
> times out, even hitting retry doesn't help. Let's see how it goes..
>
> JFYI.

There have been a bunch of followups that Thomas has posted since
this series merged that you should pick up too when they merge.

With regards,
Daniel

On 23/01/2024 17.50, Michael Tokarev wrote:
> 15.12.2023 10:03, Thomas Huth wrote:
>> This is a respin of Daniel's series that re-enables the meson test
>> runner timeouts. To make sure that we do not get into trouble on
>> older systems, I ran all the tests with "make check SPEED=slow -j32"
>> on my laptop that has only 16 SMT threads, so each test was running
>> quite a bit slower than with a normal "-j$(nproc)" run. I think
>> that these timeouts should now work in most cases - if not, we still
>> can adjust them easily later.
>
> I'm picking this up for stable branches too, since there we have the same
> problems in CI environment. In particular, bios-tables-test almost always
> times out, even hitting retry doesn't help. Let's see how it goes..

Uh, wait, that does not make too much sense ... if bios-tables-test
already times out *without* the additional meson-based timeouts, then
adding the meson timeouts won't help. bios-tables-test uses the manually
coded timeout from boot_sector_test() that is currently set to 600
seconds. If you hit that timeout, that likely means that something is
really broken in your branch - or is it sometimes still succeeding?

 Thomas

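Thomas's point can be illustrated with a minimal sketch. It assumes a test that polls for a result under its own deadline, run under an optional outer runner timeout; the function names and the 900-second runner value are hypothetical and are not taken from QEMU or meson, only the 600-second figure comes from the message above. The first limit to fire is the smaller of the two, so adding an outer timeout can only lower the effective limit, never extend it.

```python
import time


def poll_with_deadline(check, timeout_s, interval_s=0.01):
    """Call check() repeatedly until it returns True or timeout_s elapses.

    Illustrative stand-in for a timeout coded inside a test; not QEMU code.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval_s)
    return False


def effective_limit(inner_timeout_s, runner_timeout_s=None):
    """Return the limit that fires first.

    runner_timeout_s=None models a run without any outer runner timeout.
    """
    if runner_timeout_s is None:
        return inner_timeout_s
    return min(inner_timeout_s, runner_timeout_s)


if __name__ == "__main__":
    # 600 s is the in-test limit mentioned in the thread; 900 s is a
    # made-up runner value used only for illustration.
    print(effective_limit(600))
    print(effective_limit(600, 900))
```

Under these assumptions, a test that already hits its own 600-second limit is unaffected by a larger outer timeout, which is why the series would not change such a failure.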
23.01.2024 20:47, Thomas Huth:
> On 23/01/2024 17.50, Michael Tokarev wrote:
..
>> I'm picking this up for stable branches too, since there we have the same
>> problems in CI environment. In particular, bios-tables-test almost always
>> times out, even hitting retry doesn't help. Let's see how it goes..
>
> Uh, wait, that does not make too much sense ... if bios-tables-test
> already times out *without* the additional meson-based timeouts, then
> adding the meson timeouts won't help. bios-tables-test uses the manually
> coded timeout from boot_sector_test() that is currently set to 600
> seconds. If you hit that timeout, that likely means that something is
> really broken in your branch - or is it sometimes still succeeding?

I mistyped the test name as I was dealing with bios-tables-test at that
time in another context (unrelated). The actual failing test in this case,
among others, is avocado acpi_smbios_bits, e.g.
https://gitlab.com/qemu-project/qemu/-/jobs/5991505589#L231
which timed out on multiple attempts. In this example it took a bit less
than 65s. A subsequent retry succeeded in 51s:
https://gitlab.com/qemu-project/qemu/-/jobs/5995055845#L212
but this run was at a much later time, apparently when gitlab had less
load, as the whole run was significantly faster.

So this particular failure has nothing to do with this patchset, and
the patchset does not do anything to it.

(I was in a bit of a distracted mode the whole day today due to $ork issues).

Thanks,

/mjt

On Tue, 23 Jan 2024 at 20:52, Michael Tokarev <mjt@tls.msk.ru> wrote:
>
> 23.01.2024 20:47, Thomas Huth:
> > On 23/01/2024 17.50, Michael Tokarev wrote:
> ..
> >> I'm picking this up for stable branches too, since there we have the same
> >> problems in CI environment. In particular, bios-tables-test almost always
> >> times out, even hitting retry doesn't help. Let's see how it goes..
> >
> > Uh, wait, that does not make too much sense ... if bios-tables-test
> > already times out *without* the additional meson-based timeouts, then
> > adding the meson timeouts won't help. bios-tables-test uses the manually
> > coded timeout from boot_sector_test() that is currently set to 600
> > seconds. If you hit that timeout, that likely means that something is
> > really broken in your branch - or is it sometimes still succeeding?
>
> I mistyped the test name as I was dealing with bios-tables-test at that
> time in another context (unrelated). Actual failing test in this case,
> among others, is avocado acpi_smbios_bits, eg
> https://gitlab.com/qemu-project/qemu/-/jobs/5991505589#L231 which timed
> out on multiple attempts. In this example it took a bit less than 65s.

The fix for that flakiness is commit 7ef4c41e91d59
("acpi/tests/avocado/bits: wait for 200 seconds for SHUTDOWN event
from bits VM").

thanks
-- PMM