Message ID | 20240319085015.3901051-1-shinichiro.kawasaki@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Series | [blktests] nbd/rc: check nbd connection with nbd-client -check command |
Hi Shinichiro

Thanks for the fix, with this change, the issue still can be
reproduced, here is the log:

=======================98
nbd/002 (tests on partition handling for an nbd device) [failed]
runtime 1.436s ... 0.917s
--- tests/nbd/002.out 2024-03-19 04:51:34.051614893 +0100
+++ /root/blktests/results/nodev/nbd/002.out.bad 2024-03-20 07:01:28.769392087 +0100
@@ -1,4 +1,4 @@
 Running nbd/002
 Testing IOCTL path
 Testing the netlink path
-Test complete
+Didn't have partition on the netlink path

dmesg:
[ 737.405376] run blktests nbd/002 at 2024-03-20 07:01:27
[ 738.102997] nbd0: detected capacity change from 0 to 20971520
[ 738.122439] nbd0:
[ 738.157483] block nbd0: NBD_DISCONNECT
[ 738.157742] block nbd0: Disconnected due to user request.
[ 738.158094] block nbd0: shutting down sockets
[ 738.206999] nbd0: detected capacity change from 0 to 20971520
[ 738.208587] nbd0: p1
[ 738.246641] block nbd0: NBD_DISCONNECT
[ 738.246893] block nbd0: Disconnected due to user request.
[ 738.247217] block nbd0: shutting down sockets
[ 738.313979] nbd0: detected capacity change from 0 to 20971520
[ 738.315450] nbd0: p1
[ 738.319949] block nbd0: NBD_DISCONNECT
[ 738.320244] block nbd0: Disconnected due to user request.
[ 738.320535] block nbd0: shutting down sockets
[ 738.321276] blk_print_req_error: 4 callbacks suppressed
[ 738.321280] I/O error, dev nbd0, sector 272 op 0x0:(READ) flags 0x80700 phys_seg 30 prio class 0
[ 738.322466] I/O error, dev nbd0, sector 272 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 738.322901] buffer_io_error: 4 callbacks suppressed
[ 738.322903] Buffer I/O error on dev nbd0, logical block 34, async page read
[ 738.326007] I/O error, dev nbd0, sector 16 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[ 738.326916] I/O error, dev nbd0, sector 16 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[ 738.327381] Buffer I/O error on dev nbd0, logical block 2, async page read

On Tue, Mar 19, 2024 at 4:50 PM Shin'ichiro Kawasaki
<shinichiro.kawasaki@wdc.com> wrote:
>
> _wait_for_nbd_connect() checks nbd connections by checking the existence
> of a debugfs attribute file. However, even when the file exists, nbd
> connections are not fully ready, and the stat command for the nbd device
> file in the test case nbd/002 may fail with unexpected I/O errors.
>
> To avoid the failure, check the nbd connections not only by the debugfs
> attribute file, but also by "nbd-client -check" command.
>
> Link: https://github.com/osandov/blktests/pull/134
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> ---
>  tests/nbd/rc | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/tests/nbd/rc b/tests/nbd/rc
> index 9c1c15b..266befd 100644
> --- a/tests/nbd/rc
> +++ b/tests/nbd/rc
> @@ -43,7 +43,8 @@ _have_nbd_netlink() {
>
>  _wait_for_nbd_connect() {
>  	for ((i = 0; i < 3; i++)); do
> -		if [[ -e /sys/kernel/debug/nbd/nbd0/tasks ]]; then
> +		if [[ -e /sys/kernel/debug/nbd/nbd0/tasks ]] && \
> +			nbd-client -check /dev/nbd0 &> /dev/null; then
>  			return 0
>  		fi
>  		sleep 1
> --
> 2.44.0
>
On Mar 20, 2024 / 14:12, Yi Zhang wrote:
> Hi Shinichiro
>
> Thanks for the fix, with this change, the issue still can be
> reproduced, here is the log:
>
> =======================98
> nbd/002 (tests on partition handling for an nbd device) [failed]
> runtime 1.436s ... 0.917s
> --- tests/nbd/002.out 2024-03-19 04:51:34.051614893 +0100
> +++ /root/blktests/results/nodev/nbd/002.out.bad 2024-03-20
> 07:01:28.769392087 +0100
> @@ -1,4 +1,4 @@
> Running nbd/002
> Testing IOCTL path
> Testing the netlink path
> -Test complete
> +Didn't have partition on the netlink path

Thanks. The patch reduces the failure rate, but it does not fix the bug
completely. Without the patch, the failure happens about once in every two
runs. With the patch, it happens about once in 30 repetitions of the test
case. I will dig in further.
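
One way to put a number on a failure ratio like the one reported in this reply is to repeat the test case and count the failed runs. The following is a minimal sketch, not part of blktests: `count_failures` is a hypothetical helper, and the usage note assumes that `./check` exits with a non-zero status when the test case fails.

```shell
# Hypothetical helper (not part of blktests): run a command a given number
# of times and print how many of the runs exited with a non-zero status.
# Usage: count_failures <runs> <command> [args...]
count_failures() {
	local runs="$1" failures=0 i=0
	shift
	while [ "$i" -lt "$runs" ]; do
		# Discard the command's own output; only the exit status matters here.
		if ! "$@" > /dev/null 2>&1; then
			failures=$((failures + 1))
		fi
		i=$((i + 1))
	done
	echo "$failures"
}
```

Under that assumption, running `count_failures 30 ./check nbd/002` from the blktests directory would print how many of the 30 runs failed.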
diff --git a/tests/nbd/rc b/tests/nbd/rc
index 9c1c15b..266befd 100644
--- a/tests/nbd/rc
+++ b/tests/nbd/rc
@@ -43,7 +43,8 @@ _have_nbd_netlink() {
 
 _wait_for_nbd_connect() {
 	for ((i = 0; i < 3; i++)); do
-		if [[ -e /sys/kernel/debug/nbd/nbd0/tasks ]]; then
+		if [[ -e /sys/kernel/debug/nbd/nbd0/tasks ]] && \
+			nbd-client -check /dev/nbd0 &> /dev/null; then
 			return 0
 		fi
 		sleep 1
_wait_for_nbd_connect() checks nbd connections by checking the existence
of a debugfs attribute file. However, even when the file exists, nbd
connections are not fully ready, and the stat command for the nbd device
file in the test case nbd/002 may fail with unexpected I/O errors.

To avoid the failure, check the nbd connections not only by the debugfs
attribute file, but also by "nbd-client -check" command.

Link: https://github.com/osandov/blktests/pull/134
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
 tests/nbd/rc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
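
The polling pattern described in the commit message can also be written with the retry loop separated from the readiness condition. The following is only a sketch under stated assumptions, not the code in tests/nbd/rc: `_wait_until` and `_nbd_ready` are hypothetical names, and the device-name parameter is an assumption (the patch hard-codes nbd0). It combines the same two conditions as the patch and does not address the remaining failures reported in the thread.

```shell
# Hypothetical helper (not part of blktests): retry a readiness check until
# it succeeds or the retry budget is used up.
# Usage: _wait_until <retries> <interval_seconds> <command> [args...]
_wait_until() {
	local retries="$1" interval="$2" i=0
	shift 2
	while [ "$i" -lt "$retries" ]; do
		if "$@"; then
			return 0
		fi
		sleep "$interval"
		i=$((i + 1))
	done
	return 1
}

# Same two conditions as the patched _wait_for_nbd_connect(): the debugfs
# attribute file exists and "nbd-client -check" reports a connection.
# The device-name parameter is an assumption; the patch hard-codes nbd0.
_nbd_ready() {
	local dev="${1:-nbd0}"
	[ -e "/sys/kernel/debug/nbd/$dev/tasks" ] &&
		nbd-client -check "/dev/$dev" > /dev/null 2>&1
}
```

With these definitions, `_wait_until 3 1 _nbd_ready nbd0` would poll up to three times at one-second intervals, as the patched loop does. Keeping the retry count and interval as arguments makes it easy to try a longer wait without editing the loop.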