Message ID | 20170214192525.18624-5-eblake@redhat.com (mailing list archive) |
---|---|
State | New, archived |
On 14.02.2017 at 20:25, Eric Blake wrote:
> In order to test the effects of artificial geometry constraints
> on operations like write zero or discard, we first need blkdebug
> to manage these actions. It also allows us to inject errors on
> those operations, just like we can for read/write/flush.
>
> We can also test the contract promised by the block layer; namely,
> if a device has specified limits on alignment or maximum size,
> then those limits must be obeyed (for now, the blkdebug driver
> merely inherits limits from whatever it is wrapping, but the next
> patch will further enhance it to allow specific limit overrides).
>
> This patch intentionally refuses to service requests smaller than
> the requested alignments; this is because an upcoming patch adds
> a qemu-iotest to prove that the block layer is correctly handling
> fragmentation, but the test only works if there is a way to tell
> the difference at artificial alignment boundaries when blkdebug is
> using a larger-than-default alignment. If we let the blkdebug
> layer always defer to the underlying layer, which potentially has
> a smaller granularity, the iotest will be thwarted.
>
> Tested by setting up an NBD server with export 'foo', then invoking:
> $ ./qemu-io
> qemu-io> open -o driver=blkdebug blkdebug::nbd://localhost:10809/foo
> qemu-io> d 0 15M
> qemu-io> w -z 0 15M
>
> Pre-patch, the server never sees the discard (it was silently
> eaten by the block layer); post-patch it is passed across the
> wire. Likewise, pre-patch the write is always passed with
> NBD_WRITE (with 15M of zeroes on the wire), while post-patch
> it can utilize NBD_WRITE_ZEROES (for less traffic).
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
>
> ---
> v5: include 2017 copyright
> v4: correct error injection to respect byte range, tweak formatting
> v3: rebase to byte-based read/write, improve docs on why no
> partial write zero passthrough
> v2: new patch
> ---
>  block/blkdebug.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 86 insertions(+)
>
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index 37094a2..b2d5f7d 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -1,6 +1,7 @@
>  /*
>   * Block protocol for I/O error injection
>   *
> + * Copyright (C) 2016-2017 Red Hat, Inc.
>   * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
>   *
>   * Permission is hereby granted, free of charge, to any person obtaining a copy
> @@ -382,6 +383,11 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
>          goto out;
>      }
>
> +    bs->supported_write_flags = BDRV_REQ_FUA &
> +        bs->file->bs->supported_write_flags;
> +    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
> +        bs->file->bs->supported_zero_flags;
> +
>      /* Set request alignment */
>      align = qemu_opt_get_size(opts, "align", 0);
>      if (align < INT_MAX && is_power_of_2(align)) {
> @@ -511,6 +517,84 @@ static int blkdebug_co_flush(BlockDriverState *bs)
>      return bdrv_co_flush(bs->file->bs);
>  }
>
> +static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
> +                                                  int64_t offset, int count,
> +                                                  BdrvRequestFlags flags)
> +{
> +    BDRVBlkdebugState *s = bs->opaque;
> +    BlkdebugRule *rule = NULL;
> +    uint32_t align = MAX(bs->bl.request_alignment,
> +                         bs->bl.pwrite_zeroes_alignment);
> +
> +    /* Only pass through requests that are larger than requested
> +     * preferred alignment (so that we test the fallback to writes on
> +     * unaligned portions), and check that the block layer never hands
> +     * us anything crossing an alignment boundary. */

"crossing an alignment boundary" isn't really what we're interested in
(and also not what your code checks), but just that things are properly
aligned.

> +    if (count < align) {
> +        return -ENOTSUP;
> +    }
> +    assert(QEMU_IS_ALIGNED(offset, align));
> +    assert(QEMU_IS_ALIGNED(count, align));
> +    if (bs->bl.max_pwrite_zeroes) {
> +        assert(count <= bs->bl.max_pwrite_zeroes);
> +    }
> +
> +    QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
> +        uint64_t inject_offset = rule->options.inject.offset;
> +
> +        if (inject_offset == -1 ||
> +            (inject_offset >= offset && inject_offset < offset + count))
> +        {
> +            break;
> +        }
> +    }
> +
> +    if (rule && rule->options.inject.error) {
> +        return inject_error(bs, rule);
> +    }
> +
> +    return bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
> +}
> +
> +static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs,
> +                                             int64_t offset, int count)
> +{
> +    BDRVBlkdebugState *s = bs->opaque;
> +    BlkdebugRule *rule = NULL;
> +    uint32_t align = bs->bl.pdiscard_alignment;
> +
> +    /* Only pass through requests that are larger than requested
> +     * minimum alignment, and ensure that unaligned requests do not
> +     * cross optimum discard boundaries. */
> +    if (count < bs->bl.request_alignment) {
> +        return -ENOTSUP;
> +    }
> +    assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
> +    assert(QEMU_IS_ALIGNED(count, bs->bl.request_alignment));
> +    if (align && count >= align) {
> +        assert(QEMU_IS_ALIGNED(offset, align));
> +        assert(QEMU_IS_ALIGNED(count, align));

Here, in contrast, I think you really want to do what the comment says
(because the contract is that you get head, bulk and tail of a discard
request separately), but the code fails to do so: We could have
count < align, but still cross an optimum discard alignment boundary if
offset is misaligned, too, and we have two partial accesses (i.e. head
and tail in the same call).

> +    }
> +    if (bs->bl.max_pdiscard) {
> +        assert(count <= bs->bl.max_pdiscard);
> +    }
> +
> +    QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
> +        uint64_t inject_offset = rule->options.inject.offset;
> +
> +        if (inject_offset == -1 ||
> +            (inject_offset >= offset && inject_offset < offset + count))
> +        {
> +            break;
> +        }
> +    }
> +
> +    if (rule && rule->options.inject.error) {
> +        return inject_error(bs, rule);
> +    }

This piece of code is duplicated in each I/O function. Should we
consider factoring it out?

> +    return bdrv_co_pdiscard(bs->file->bs, offset, count);
> +}

Kevin
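To make the second review comment concrete, here is a minimal standalone sketch of the kind of check it seems to ask for. It is not code from the patch or from QEMU: `crosses_alignment_boundary()`, `discard_request_ok()` and `assert_discard_request()` are hypothetical names, and the sketch assumes non-negative offsets and a nonzero request alignment. The point is only that a request with count < align can still straddle a multiple of the optimum discard alignment when its offset is misaligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: true if the byte range
 * [offset, offset + count) touches two different align-sized blocks,
 * i.e. a multiple of align lies strictly inside the range. */
static bool crosses_alignment_boundary(int64_t offset, int64_t count,
                                       uint32_t align)
{
    if (align == 0 || count <= 0) {
        return false;
    }
    /* Compare the block index of the first and the last byte. */
    return offset / align != (offset + count - 1) / align;
}

/* Hypothetical stand-in for the assertions in blkdebug_co_pdiscard():
 * returns true if a request looks like a valid head, bulk or tail. */
static bool discard_request_ok(int64_t offset, int64_t count,
                               uint32_t request_alignment,
                               uint32_t pdiscard_alignment)
{
    /* Assumption: request_alignment is nonzero. */
    if (request_alignment == 0 ||
        offset % request_alignment || count % request_alignment) {
        return false;
    }
    if (pdiscard_alignment == 0) {
        return true;    /* no optimum discard alignment advertised */
    }
    if (count >= pdiscard_alignment) {
        /* Bulk: same condition the patch asserts. */
        return offset % pdiscard_alignment == 0 &&
               count % pdiscard_alignment == 0;
    }
    /* Head or tail: must not contain head and tail in the same call. */
    return !crosses_alignment_boundary(offset, count, pdiscard_alignment);
}

/* Assertion wrapper in the style of the checks in the patch. */
static void assert_discard_request(int64_t offset, int64_t count,
                                   uint32_t request_alignment,
                                   uint32_t pdiscard_alignment)
{
    assert(discard_request_ok(offset, count, request_alignment,
                              pdiscard_alignment));
}
```

With a 512-byte request alignment and a 64 KiB discard alignment, a 1024-byte request at offset 65024 passes the assertions in the patch as posted (count < align skips the stricter check), yet it spans the 65536 boundary; the sketch rejects it.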
diff --git a/block/blkdebug.c b/block/blkdebug.c
index 37094a2..b2d5f7d 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -1,6 +1,7 @@
 /*
  * Block protocol for I/O error injection
  *
+ * Copyright (C) 2016-2017 Red Hat, Inc.
  * Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
@@ -382,6 +383,11 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
         goto out;
     }
 
+    bs->supported_write_flags = BDRV_REQ_FUA &
+        bs->file->bs->supported_write_flags;
+    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+        bs->file->bs->supported_zero_flags;
+
     /* Set request alignment */
     align = qemu_opt_get_size(opts, "align", 0);
     if (align < INT_MAX && is_power_of_2(align)) {
@@ -511,6 +517,84 @@ static int blkdebug_co_flush(BlockDriverState *bs)
     return bdrv_co_flush(bs->file->bs);
 }
 
+static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
+                                                  int64_t offset, int count,
+                                                  BdrvRequestFlags flags)
+{
+    BDRVBlkdebugState *s = bs->opaque;
+    BlkdebugRule *rule = NULL;
+    uint32_t align = MAX(bs->bl.request_alignment,
+                         bs->bl.pwrite_zeroes_alignment);
+
+    /* Only pass through requests that are larger than requested
+     * preferred alignment (so that we test the fallback to writes on
+     * unaligned portions), and check that the block layer never hands
+     * us anything crossing an alignment boundary. */
+    if (count < align) {
+        return -ENOTSUP;
+    }
+    assert(QEMU_IS_ALIGNED(offset, align));
+    assert(QEMU_IS_ALIGNED(count, align));
+    if (bs->bl.max_pwrite_zeroes) {
+        assert(count <= bs->bl.max_pwrite_zeroes);
+    }
+
+    QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
+        uint64_t inject_offset = rule->options.inject.offset;
+
+        if (inject_offset == -1 ||
+            (inject_offset >= offset && inject_offset < offset + count))
+        {
+            break;
+        }
+    }
+
+    if (rule && rule->options.inject.error) {
+        return inject_error(bs, rule);
+    }
+
+    return bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
+}
+
+static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs,
+                                             int64_t offset, int count)
+{
+    BDRVBlkdebugState *s = bs->opaque;
+    BlkdebugRule *rule = NULL;
+    uint32_t align = bs->bl.pdiscard_alignment;
+
+    /* Only pass through requests that are larger than requested
+     * minimum alignment, and ensure that unaligned requests do not
+     * cross optimum discard boundaries. */
+    if (count < bs->bl.request_alignment) {
+        return -ENOTSUP;
+    }
+    assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
+    assert(QEMU_IS_ALIGNED(count, bs->bl.request_alignment));
+    if (align && count >= align) {
+        assert(QEMU_IS_ALIGNED(offset, align));
+        assert(QEMU_IS_ALIGNED(count, align));
+    }
+    if (bs->bl.max_pdiscard) {
+        assert(count <= bs->bl.max_pdiscard);
+    }
+
+    QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
+        uint64_t inject_offset = rule->options.inject.offset;
+
+        if (inject_offset == -1 ||
+            (inject_offset >= offset && inject_offset < offset + count))
+        {
+            break;
+        }
+    }
+
+    if (rule && rule->options.inject.error) {
+        return inject_error(bs, rule);
+    }
+
+    return bdrv_co_pdiscard(bs->file->bs, offset, count);
+}
 
 static void blkdebug_close(BlockDriverState *bs)
 {
@@ -763,6 +847,8 @@ static BlockDriver bdrv_blkdebug = {
     .bdrv_co_preadv         = blkdebug_co_preadv,
     .bdrv_co_pwritev        = blkdebug_co_pwritev,
     .bdrv_co_flush_to_disk  = blkdebug_co_flush,
+    .bdrv_co_pwrite_zeroes  = blkdebug_co_pwrite_zeroes,
+    .bdrv_co_pdiscard       = blkdebug_co_pdiscard,
 
     .bdrv_debug_event           = blkdebug_debug_event,
     .bdrv_debug_breakpoint      = blkdebug_debug_breakpoint,
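The rule-matching loop appears verbatim in both new functions of the diff. As an illustration of what factoring it out might look like, here is a simplified standalone sketch. It does not use QEMU's types: `Rule` is a hypothetical stand-in for `BlkdebugRule`, a plain linked list replaces the `QSIMPLEQ` of active rules, and `rule_check()` is an invented name that returns a negative errno directly instead of calling `inject_error()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for BlkdebugRule; the real struct is larger and
 * is linked through QSIMPLEQ rather than a bare next pointer. */
typedef struct Rule {
    uint64_t inject_offset;   /* (uint64_t)-1 matches any offset */
    int inject_error;         /* errno value to inject, 0 for none */
    struct Rule *next;
} Rule;

/* Hypothetical shared helper: find the first active rule whose offset is
 * a wildcard or falls inside [offset, offset + bytes), and return the
 * negative errno it injects, or 0 if the request should pass through.
 * As in the patch, only the first matching rule is considered. */
static int rule_check(const Rule *active_rules, uint64_t offset,
                      uint64_t bytes)
{
    const Rule *rule;

    /* Guard against wraparound in offset + bytes. */
    assert(offset + bytes >= offset);

    for (rule = active_rules; rule; rule = rule->next) {
        uint64_t inject_offset = rule->inject_offset;

        if (inject_offset == (uint64_t)-1 ||
            (inject_offset >= offset && inject_offset < offset + bytes)) {
            break;
        }
    }

    if (rule && rule->inject_error) {
        return -rule->inject_error;
    }
    return 0;
}
```

Each I/O callback could then reduce to a call such as `err = rule_check(rules, offset, count); if (err) return err;` before forwarding the request. Whether the real helper should take the `BlockDriverState` and call `inject_error()` itself is a design choice this sketch leaves open.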