Message ID | 20200629202053.1223342-1-its@irrelevant.dk (mailing list archive) |
---|---|
Series | hw/block/nvme: handle transient dma errors |
Patchew URL: https://patchew.org/QEMU/20200629202053.1223342-1-its@irrelevant.dk/

Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

--- /tmp/qemu-test/src/tests/qemu-iotests/040.out 2020-06-29 20:12:10.000000000 +0000
+++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 20:58:48.288790818 +0000
@@ -1,3 +1,5 @@
+WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
+WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
...........................................................
----------------------------------------------------------------------
Ran 59 tests
---
Not run: 259
Failures: 040
Failed 1 of 119 iotests
make: *** [check-tests/check-block.sh] Error 1
make: *** Waiting for unfinished jobs....
TEST check-qtest-aarch64: tests/qtest/qos-test
Traceback (most recent call last):
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=da25eaa8bdd04cb783e2c427c6a5aa94', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-98l7koy2/src/docker-src.2020-06-29-16.51.46.20742:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=da25eaa8bdd04cb783e2c427c6a5aa94
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-98l7koy2/src'
make: *** [docker-run-test-quick@centos7] Error 2

real 15m57.590s
user 0m9.240s

The full log is available at
http://patchew.org/logs/20200629202053.1223342-1-its@irrelevant.dk/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

On Jun 29 14:07, no-reply@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/20200629202053.1223342-1-its@irrelevant.dk/
>
>
>
> Hi,
>
> This series failed the docker-quick@centos7 build test. Please find the testing commands and
> their output below. If you have Docker installed, you can probably reproduce it
> locally.
>
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> make docker-image-centos7 V=1 NETWORK=1
> time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
> === TEST SCRIPT END ===
>
> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out 2020-06-29 20:12:10.000000000 +0000
> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 20:58:48.288790818 +0000
> @@ -1,3 +1,5 @@
> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest

Hmm, I can't seem to reproduce this locally and the test succeeded on
the next series[1] that is based on this.

Is this a flaky test? Or a bad test runner? I'm of course worried when
a qcow2 test fails and I touch something else than the nvme device ;)

[1]: https://patchew.org/QEMU/20200629203155.1236860-1-its@irrelevant.dk/

On 6/29/20 11:34 PM, Klaus Jensen wrote:
> On Jun 29 14:07, no-reply@patchew.org wrote:
>> Patchew URL: https://patchew.org/QEMU/20200629202053.1223342-1-its@irrelevant.dk/
>> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out 2020-06-29 20:12:10.000000000 +0000
>> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 20:58:48.288790818 +0000
>> @@ -1,3 +1,5 @@
>> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
>> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest

Kevin, Max, can iotests/040 be affected by this change?

>
>
> Hmm, I can't seem to reproduce this locally and the test succeeded on
> the next series[1] that is based on this.
>
> Is this a flaky test? Or a bad test runner? I'm of course worried when
> a qcow2 test fails and I touch something else than the nvme device ;)
>
>
> [1]: https://patchew.org/QEMU/20200629203155.1236860-1-its@irrelevant.dk/
>

On 01.07.2020 at 14:58, Philippe Mathieu-Daudé wrote:
> On 6/29/20 11:34 PM, Klaus Jensen wrote:
> > On Jun 29 14:07, no-reply@patchew.org wrote:
> >> Patchew URL: https://patchew.org/QEMU/20200629202053.1223342-1-its@irrelevant.dk/
>
> >> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out 2020-06-29 20:12:10.000000000 +0000
> >> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 20:58:48.288790818 +0000
> >> @@ -1,3 +1,5 @@
> >> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
> >> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
>
> Kevin, Max, can iotests/040 be affected by this change?

The diffstat of this series looks like it doesn't touch anything outside
of the nvme emulation, which isn't used by this test, so at least I'd say
it's not the fault of the patch series.

I think test cases use SIGKILL primarily in timeout handlers, so maybe
the test host was overloaded and didn't shut down QEMU in time so it was
killed. There is no actually failing test case:

...........................................................
----------------------------------------------------------------------
Ran 59 tests

You would have 'F' or 'E' for fail/error instead of '.' otherwise.

Kevin

> >
> >
> > Hmm, I can't seem to reproduce this locally and the test succeeded on
> > the next series[1] that is based on this.
> >
> > Is this a flaky test? Or a bad test runner? I'm of course worried when
> > a qcow2 test fails and I touch something else than the nvme device ;)
> >
> >
> > [1]: https://patchew.org/QEMU/20200629203155.1236860-1-its@irrelevant.dk/
> >
>

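The timeout scenario Kevin suspects can be illustrated with a minimal Python sketch. It is not the actual `qemu.machine` code: the child script, the `shutdown` and `describe` helpers, and the 0.5 s timeout are all assumptions made for illustration. The child ignores SIGTERM to stand in for a process that does not shut down in time, so the parent falls back to SIGKILL and sees the death-by-signal return code that a "received signal 9" warning would be derived from.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a process that is too slow to shut down:
# it ignores SIGTERM and then sleeps.
CHILD_SOURCE = """
import signal, time
signal.signal(signal.SIGTERM, signal.SIG_IGN)
print("ready", flush=True)
time.sleep(60)
"""


def shutdown(proc, timeout=0.5):
    """Request a clean exit; fall back to SIGKILL once the timeout expires."""
    proc.terminate()  # polite request (SIGTERM)
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be caught or ignored
        proc.wait()
    return proc.returncode


def describe(returncode):
    """Popen reports death by signal N as a return code of -N."""
    if returncode < 0:
        return f"received signal {-returncode}"
    return f"exited with status {returncode}"


proc = subprocess.Popen([sys.executable, "-c", CHILD_SOURCE],
                        stdout=subprocess.PIPE, text=True)
proc.stdout.readline()  # wait until the child has installed its SIGTERM handler
returncode = shutdown(proc)
proc.stdout.close()
print(describe(returncode))
```

On Linux, where SIGKILL is signal 9, the printed message matches the wording of the warning in the report. The tests themselves can still pass in such a run, since the kill happens during teardown.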
On 7/3/20 9:50 AM, Kevin Wolf wrote:
> On 01.07.2020 at 14:58, Philippe Mathieu-Daudé wrote:
>> On 6/29/20 11:34 PM, Klaus Jensen wrote:
>>> On Jun 29 14:07, no-reply@patchew.org wrote:
>>>> Patchew URL: https://patchew.org/QEMU/20200629202053.1223342-1-its@irrelevant.dk/
>>
>>>> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out 2020-06-29 20:12:10.000000000 +0000
>>>> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 20:58:48.288790818 +0000
>>>> @@ -1,3 +1,5 @@
>>>> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
>>>> +WARNING:qemu.machine:qemu received signal 9: /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults -display none -accel qtest
>>
>> Kevin, Max, can iotests/040 be affected by this change?
>
> The diffstat of this series looks like it doesn't touch anything outside
> of the nvme emulation, which isn't used by this test, so at least I'd say
> it's not the fault of the patch series.
>
> I think test cases use SIGKILL primarily in timeout handlers, so maybe
> the test host was overloaded and didn't shut down QEMU in time so it was
> killed. There is no actually failing test case:
>
> ...........................................................
> ----------------------------------------------------------------------
> Ran 59 tests
>
> You would have 'F' or 'E' for fail/error instead of '.' otherwise.

TIL how to read that line :)

Thanks for your analysis Kevin!

>
> Kevin
>
>>>
>>>
>>> Hmm, I can't seem to reproduce this locally and the test succeeded on
>>> the next series[1] that is based on this.
>>>
>>> Is this a flaky test? Or a bad test runner? I'm of course worried when
>>> a qcow2 test fails and I touch something else than the nvme device ;)
>>>
>>>
>>> [1]: https://patchew.org/QEMU/20200629203155.1236860-1-its@irrelevant.dk/
>>>
>>
>

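For readers who, like Philippe, had not met the progress line before: Python's `unittest` text runner prints one character per test, `.` for a pass, `F` for a failure and `E` for an error. The small sketch below counts those markers. The `summarize` helper and the `..F.E` sample line are made up for illustration and are not part of the iotests harness.

```python
from collections import Counter


def summarize(progress_line):
    """Count outcomes in a unittest progress line ('.' pass, 'F' fail, 'E' error)."""
    counts = Counter(progress_line.strip())
    return {"passed": counts["."], "failed": counts["F"], "errors": counts["E"]}


# The progress line from the report: 59 dots, matching "Ran 59 tests".
clean_run = summarize("." * 59)
# A made-up line for comparison, with one failure and one error.
bad_run = summarize("..F.E")

print(clean_run)
print(bad_run)
```

A line made up only of dots, as in the report, therefore means every test case passed. The `040.out.bad` mismatch came from the two extra WARNING lines, not from a failing test.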
From: Klaus Jensen <k.jensen@samsung.com>

QEMU actually respects that Bus Master Enabling for a PCI device gets
flipped, so in order to successfully pass the block/011 test ("disable PCI
device while doing I/O") the nvme device needs to know if a dma transfer
was successful or not.

Based-on: <20200629195017.1217056-1-its@irrelevant.dk>
("[PATCH 00/17] hw/block/nvme: AIO and address mapping refactoring")

Klaus Jensen (2):
  pci: pass along the return value of dma_memory_rw
  hw/block/nvme: handle dma errors

 hw/block/nvme.c       | 43 ++++++++++++++++++++++++++++++++-----------
 hw/block/trace-events |  2 ++
 include/block/nvme.h  |  2 +-
 include/hw/pci/pci.h  |  3 +--
 4 files changed, 36 insertions(+), 14 deletions(-)
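
The idea in the cover letter can be modelled in a few lines of Python. This is a toy model only, not the QEMU implementation, which is in C. `ToyPciDevice`, `dma_write` and `complete_read` are hypothetical names, and mapping a failed transfer to the NVMe "Data Transfer Error" generic status (04h) is an assumption about how such an error might be reported, not something the cover letter states.

```python
from dataclasses import dataclass, field

NVME_SUCCESS = 0x0000
# Generic command status 04h, "Data Transfer Error", in the NVMe specification.
NVME_DATA_TRANSFER_ERROR = 0x0004


@dataclass
class ToyPciDevice:
    """Hypothetical model of a PCI function with a Bus Master Enable bit."""
    bus_master_enabled: bool = True
    guest_memory: bytearray = field(default_factory=lambda: bytearray(4096))

    def dma_write(self, addr, data):
        """Return 0 on success and nonzero if the transfer is refused."""
        if not self.bus_master_enabled:
            return -1  # bus mastering disabled: the transfer must not happen
        if addr < 0 or addr + len(data) > len(self.guest_memory):
            return -1  # out-of-range access
        self.guest_memory[addr:addr + len(data)] = data
        return 0


def complete_read(dev, addr, payload):
    """Propagate the DMA result into a completion status instead of dropping it."""
    if dev.dma_write(addr, payload) != 0:
        return NVME_DATA_TRANSFER_ERROR
    return NVME_SUCCESS


dev = ToyPciDevice()
status_enabled = complete_read(dev, 0, b"data")

dev.bus_master_enabled = False  # the guest clears Bus Master Enable mid-I/O
status_disabled = complete_read(dev, 0, b"more")

print(hex(status_enabled), hex(status_disabled))
```

If the DMA helper returned nothing, the second command would appear to succeed although no data reached guest memory. Passing the return value along is what lets the device model report the failure.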