diff mbox series

[v3,1/3] blkdebug: fix one shot rule processing

Message ID 1542006398-30037-1-git-send-email-marcolso@amazon.com (mailing list archive)
State New, archived
Headers show
Series [v3,1/3] blkdebug: fix one shot rule processing | expand

Commit Message

Zhijian Li (Fujitsu)" via Nov. 12, 2018, 7:06 a.m. UTC
If 'once' is specified, the rule should execute just once, regardless if
it is supposed to return an error or not. Take the example where you
want the first IO to an LBA to succeed, but subsequent IOs to fail. You
could either use state transitions, or create two rules, one with
error = 0 and once set to true, and one with a non-zero error.

Signed-off-by: Marc Olson <marcolso@amazon.com>
---
 block/blkdebug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Dongli Zhang Nov. 12, 2018, 11:15 a.m. UTC | #1
Hi Marc,

When I play with the v3 patch set, the qemu hangs again and I need to kill it
with "kill -9".

I got below from guest:

[  104.828127] nvme nvme0: I/O 52 QID 1 timeout, aborting
[  104.828470] nvme nvme0: Abort status: 0x4001

nvme abort is not supported by qemu and therefore 0x4001 (NVME_INVALID_OPCODE |
NVME_DNR) is returned. qemu does not restart the device.


Below is the environment:

host: most recent qemu (master branch) with 3 patches applied, on top of Ubuntu
16.04.4 with 4.15.0-36-generic

guest: Ubuntu 16.04.4 with 4.13.0-39-generic.


The image to emulate nvme is a 128MB raw image (ext4). I randomly run "dd
if=/dev/zero of=output bs=1M count=30" on the nvme raw image to hit the
sector="40960" with "inject-delay" enabled as shown below:

[inject-delay]
event = "write_aio"
latency = "9999999999"
sector = "40960"


sudo ./x86_64-softmmu/qemu-system-x86_64 \
-drive file=os.img,format=raw,if=none,id=disk0 \
-device
virtio-blk-pci,drive=disk0,id=device0,num-queues=2,iothread=io1,bootindex=0 \
-object iothread,id=io1 \
-smp 2 -m 2000M -enable-kvm  -vnc :0 -monitor stdio \
-device nvme,drive=nvmedrive,serial=deadbeaf1 \
-drive file=blkdebug:blkdebug.config:nvme.img,format=raw,if=none,id=nvmedrive \
-net nic -net user,hostfwd=tcp::5022-:22

I will debug where it hangs.

Dongli Zhang


On 11/12/2018 03:06 PM, Marc Olson via Qemu-devel wrote:
> If 'once' is specified, the rule should execute just once, regardless if
> it is supposed to return an error or not. Take the example where you
> want the first IO to an LBA to succeed, but subsequent IOs to fail. You
> could either use state transitions, or create two rules, one with
> error = 0 and once set to true, and one with a non-zero error.
> 
> Signed-off-by: Marc Olson <marcolso@amazon.com>
> ---
>  block/blkdebug.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index 0759452..327049b 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -488,7 +488,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes)
>          }
>      }
>  
> -    if (!rule || !rule->options.inject.error) {
> +    if (!rule) {
>          return 0;
>      }
>  
> @@ -500,7 +500,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes)
>          remove_rule(rule);
>      }
>  
> -    if (!immediately) {
> +    if (error && !immediately) {
>          aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
>          qemu_coroutine_yield();
>      }
>
John Snow Nov. 13, 2018, 11 p.m. UTC | #2
On 11/12/18 2:06 AM, Marc Olson via Qemu-devel wrote:
> If 'once' is specified, the rule should execute just once, regardless if
> it is supposed to return an error or not. Take the example where you
> want the first IO to an LBA to succeed, but subsequent IOs to fail. You
> could either use state transitions, or create two rules, one with
> error = 0 and once set to true, and one with a non-zero error.
> 
> Signed-off-by: Marc Olson <marcolso@amazon.com>
> ---
>  block/blkdebug.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index 0759452..327049b 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -488,7 +488,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes)
>          }
>      }
>  
> -    if (!rule || !rule->options.inject.error) {
> +    if (!rule) {
>          return 0;
>      }
>  

This gets rid of the early return so that later we check to see if
'once' was set and remove the rule, regardless of if it did anything or not,

> @@ -500,7 +500,7 @@ static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes)
>          remove_rule(rule);
>      }
>  
> -    if (!immediately) {
> +    if (error && !immediately) {

And then we modify this to only trigger if we have an error to inject.

>          aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
>          qemu_coroutine_yield();
>      }
> 

[down here, we return -error, but that should still be zero.]


This changes the mechanism of 'once' slightly, but only when errno was
set to zero. I'm not sure we make use of that anywhere, so I think this
should be a safe change. Certainly we don't stipulate that we only
respect once if you bothered to set errno to a non-zero value.

I thiink this is probably fine.

Reviewed-by: John Snow <jsnow@redhat.com>
Max Reitz Jan. 11, 2019, 2:36 p.m. UTC | #3
On 12.11.18 08:06, Marc Olson wrote:
> If 'once' is specified, the rule should execute just once, regardless if
> it is supposed to return an error or not. Take the example where you
> want the first IO to an LBA to succeed, but subsequent IOs to fail. You
> could either use state transitions, or create two rules, one with
> error = 0 and once set to true, and one with a non-zero error.
> 
> Signed-off-by: Marc Olson <marcolso@amazon.com>
> ---
>  block/blkdebug.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

I don't quite see the point, because the example in the commit message
clearly should be done with a state transition, but the code change
looks reasonable, so:

Reviewed-by: Max Reitz <mreitz@redhat.com>
diff mbox series

Patch

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 0759452..327049b 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -488,7 +488,7 @@  static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes)
         }
     }
 
-    if (!rule || !rule->options.inject.error) {
+    if (!rule) {
         return 0;
     }
 
@@ -500,7 +500,7 @@  static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes)
         remove_rule(rule);
     }
 
-    if (!immediately) {
+    if (error && !immediately) {
         aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
         qemu_coroutine_yield();
     }