Message ID | 20200710142149.40962-2-kwolf@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | qemu-img convert: Fix abort with unaligned image size |
On 7/10/20 9:21 AM, Kevin Wolf wrote:
> Unaligned requests will automatically be aligned to bl.request_alignment
> and we don't want to extend requests to access space beyond the end of
> the image, so it's required that the image size is aligned.

Yep, that's what I've already done on nbd images. nbdkit has
'--filter=truncate' which rounds an image size up to alignment by
reading the absent tail as zeros, and permitting writes that rewrite
zero but failing with EIO any write that would attempt to change the
tail. We may eventually want that complexity in qemu's block layer for
ALL drivers (as part of switching the block layer to byte-accurate
sizing everywhere), but that's a LOT more effort. The short term of just
mandating alignment is much easier and still defensible.

>
> With write requests, this could cause assertion failures like this if
> RESIZE permissions weren't requested:
>
> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
>
> This was e.g. triggered by qemu-img converting to a target image with 4k
> request alignment when the image was only aligned to 512 bytes, but not
> to 4k.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
> block.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)

Reviewed-by: Eric Blake <eblake@redhat.com>

>
> diff --git a/block.c b/block.c
> index cc377d7ef3..c635777911 100644
> --- a/block.c
> +++ b/block.c
> @@ -1489,6 +1489,16 @@ static int bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv,
>          return -EINVAL;
>      }
>
> +    /*
> +     * Unaligned requests will automatically be aligned to bl.request_alignment
> +     * and we don't want to extend requests to access space beyond the end of
> +     * the image, so it's required that the image size is aligned.
> +     */
> +    if ((bs->total_sectors * BDRV_SECTOR_SIZE) % bs->bl.request_alignment) {
> +        error_setg(errp, "Image size is not a multiple of request alignment");
> +        return -EINVAL;
> +    }
> +

Do we have any iotest coverage of this new message? (If none of our
existing tests broke, then you should add one...)
On 10.07.20 16:21, Kevin Wolf wrote:
> Unaligned requests will automatically be aligned to bl.request_alignment
> and we don't want to extend requests to access space beyond the end of
> the image, so it's required that the image size is aligned.
>
> With write requests, this could cause assertion failures like this if
> RESIZE permissions weren't requested:
>
> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
>
> This was e.g. triggered by qemu-img converting to a target image with 4k
> request alignment when the image was only aligned to 512 bytes, but not
> to 4k.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
> block.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)

(I think we had some proposal like this before, but I can’t find it,
unfortunately...)

I can’t see how with this patch you could create qcow2 images and then
use them with direct I/O, because AFAICS, qemu-img create doesn’t allow
specifying caching options, so AFAIU you’re stuck with:

$ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M
Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536 compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16

$ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2
qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a multiple of request alignment

(/mnt/tmp is a filesystem on a “losetup -b 4096” device.)

Or you use blockdev-create, that seems to work (because of course you
can set the cache mode on the protocol node when you open it for
formatting). But, well, I think there should be a working qemu-img
create case.

Also, I’m afraid of breaking existing use cases with this patch (just
qemu-img create + using the image with cache=none).

Max
On 13.07.20 13:19, Max Reitz wrote: > On 10.07.20 16:21, Kevin Wolf wrote: >> Unaligned requests will automatically be aligned to bl.request_alignment >> and we don't want to extend requests to access space beyond the end of >> the image, so it's required that the image size is aligned. >> >> With write requests, this could cause assertion failures like this if >> RESIZE permissions weren't requested: >> >> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed. >> >> This was e.g. triggered by qemu-img converting to a target image with 4k >> request alignment when the image was only aligned to 512 bytes, but not >> to 4k. >> >> Signed-off-by: Kevin Wolf <kwolf@redhat.com> >> --- >> block.c | 10 ++++++++++ >> 1 file changed, 10 insertions(+) > > (I think we had some proposal like this before, but I can’t find it, > unfortunately...) (Ah, here it is: https://lists.nongnu.org/archive/html/qemu-devel/2020-03/msg03077.html (Which interestingly teases yet another mysterious “we had a discussion on this before”...)) > I can’t see how with this patch you could create qcow2 images and then > use them with direct I/O, because AFAICS, qemu-img create doesn’t allow > specifying caching options, so AFAIU you’re stuck with: > > $ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M > Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536 > compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16 > > $ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2 > qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a > multiple of request alignment > > (/mnt/tmp is a filesystem on a “losetup -b 4096” device.) > > Or you use blockdev-create, that seems to work (because of course you > can set the cache mode on the protocol node when you open it for > formatting). But, well, I think there should be a working qemu-img > create case. 
> > Also, I’m afraid of breaking existing use cases with this patch (just > qemu-img create + using the image with cache=none). > > Max >
Am 13.07.2020 um 13:19 hat Max Reitz geschrieben: > On 10.07.20 16:21, Kevin Wolf wrote: > > Unaligned requests will automatically be aligned to bl.request_alignment > > and we don't want to extend requests to access space beyond the end of > > the image, so it's required that the image size is aligned. > > > > With write requests, this could cause assertion failures like this if > > RESIZE permissions weren't requested: > > > > qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed. > > > > This was e.g. triggered by qemu-img converting to a target image with 4k > > request alignment when the image was only aligned to 512 bytes, but not > > to 4k. > > > > Signed-off-by: Kevin Wolf <kwolf@redhat.com> > > --- > > block.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > (I think we had some proposal like this before, but I can’t find it, > unfortunately...) > > I can’t see how with this patch you could create qcow2 images and then > use them with direct I/O, because AFAICS, qemu-img create doesn’t allow > specifying caching options, so AFAIU you’re stuck with: > > $ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M > Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536 > compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16 > > $ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2 > qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a > multiple of request alignment > > (/mnt/tmp is a filesystem on a “losetup -b 4096” device.) Hm, that looks like some regrettable collateral damage... Well, you could argue that we should be writing full L1 tables with zero padding instead of just the used part. I thought we had fixed this long ago. But looks like we haven't. But we should still avoid crashing in other cases, so what is the difference between both? Is it just that qcow2 has the RESIZE permission anyway so it doesn't matter? 
If so, maybe attaching to a block node with WRITE, but not RESIZE is what needs to fail when the image size is unaligned? > Or you use blockdev-create, that seems to work (because of course you > can set the cache mode on the protocol node when you open it for > formatting). But, well, I think there should be a working qemu-img > create case. > > Also, I’m afraid of breaking existing use cases with this patch (just > qemu-img create + using the image with cache=none). I think for raw images, failure on start is better than crashing when the VM is running. The qcow2 case needs to be fixed, of course. Either case, I guess patch 2 can already be merged and would solve at least the immediate bug report. Kevin
On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
>
> Unaligned requests will automatically be aligned to bl.request_alignment
> and we don't want to extend requests to access space beyond the end of
> the image, so it's required that the image size is aligned.
>
> With write requests, this could cause assertion failures like this if
> RESIZE permissions weren't requested:
>
> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
>
> This was e.g. triggered by qemu-img converting to a target image with 4k
> request alignment when the image was only aligned to 512 bytes, but not
> to 4k.

Was it on NFS? Shouldn't this be fixed by the next patch then?

>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
> block.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/block.c b/block.c
> index cc377d7ef3..c635777911 100644
> --- a/block.c
> +++ b/block.c
> @@ -1489,6 +1489,16 @@ static int bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv,
>          return -EINVAL;
>      }
>
> +    /*
> +     * Unaligned requests will automatically be aligned to bl.request_alignment
> +     * and we don't want to extend requests to access space beyond the end of
> +     * the image, so it's required that the image size is aligned.
> +     */
> +    if ((bs->total_sectors * BDRV_SECTOR_SIZE) % bs->bl.request_alignment) {
> +        error_setg(errp, "Image size is not a multiple of request alignment");
> +        return -EINVAL;
> +    }
> +
>      assert(bdrv_opt_mem_align(bs) != 0);
>      assert(bdrv_min_mem_align(bs) != 0);
>      assert(is_power_of_2(bs->bl.request_alignment));
> --
> 2.25.4
>
On 13.07.2020 at 18:33, Nir Soffer wrote:
> On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
> >
> > Unaligned requests will automatically be aligned to bl.request_alignment
> > and we don't want to extend requests to access space beyond the end of
> > the image, so it's required that the image size is aligned.
> >
> > With write requests, this could cause assertion failures like this if
> > RESIZE permissions weren't requested:
> >
> > qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> >
> > This was e.g. triggered by qemu-img converting to a target image with 4k
> > request alignment when the image was only aligned to 512 bytes, but not
> > to 4k.
>
> Was it on NFS? Shouldn't this be fixed by the next patch then?

Patch 2 makes the problem go away for NFS because NFS doesn't even
require the 4k alignment. But on storage that legitimately needs 4k
alignment (or possibly other filesystems that are misdetected), you
would still hit the same problem.

Kevin
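To make the failure mode concrete: a request that lies entirely inside the image can end up past EOF once it is widened to the request alignment. The sketch below is a standalone model under assumed numbers (an image of 1 MiB + 512 bytes, 4k alignment); `Request`, `request_within_image` and `align_request` are hypothetical helpers, not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical byte-range request; not a QEMU type. */
typedef struct {
    uint64_t offset;
    uint64_t bytes;
} Request;

/* Byte-granular bounds check that knows nothing about alignment. */
static bool request_within_image(Request r, uint64_t image_size)
{
    return r.offset <= image_size && r.bytes <= image_size - r.offset;
}

/* Widen a request so that both ends fall on an alignment boundary. */
static Request align_request(Request r, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    uint64_t start = r.offset & ~(align - 1);
    uint64_t end = (r.offset + r.bytes + align - 1) & ~(align - 1);

    return (Request){ .offset = start, .bytes = end - start };
}
```

With an image of 1049088 bytes, a 512-byte write at offset 1048576 passes the bounds check, but the same write widened to 4096 bytes ends at 1052672 and no longer fits. With 512-byte alignment the request is unchanged and still fits.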
On 13.07.20 16:29, Kevin Wolf wrote: > Am 13.07.2020 um 13:19 hat Max Reitz geschrieben: >> On 10.07.20 16:21, Kevin Wolf wrote: >>> Unaligned requests will automatically be aligned to bl.request_alignment >>> and we don't want to extend requests to access space beyond the end of >>> the image, so it's required that the image size is aligned. >>> >>> With write requests, this could cause assertion failures like this if >>> RESIZE permissions weren't requested: >>> >>> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed. >>> >>> This was e.g. triggered by qemu-img converting to a target image with 4k >>> request alignment when the image was only aligned to 512 bytes, but not >>> to 4k. >>> >>> Signed-off-by: Kevin Wolf <kwolf@redhat.com> >>> --- >>> block.c | 10 ++++++++++ >>> 1 file changed, 10 insertions(+) >> >> (I think we had some proposal like this before, but I can’t find it, >> unfortunately...) >> >> I can’t see how with this patch you could create qcow2 images and then >> use them with direct I/O, because AFAICS, qemu-img create doesn’t allow >> specifying caching options, so AFAIU you’re stuck with: >> >> $ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M >> Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536 >> compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16 >> >> $ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2 >> qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a >> multiple of request alignment >> >> (/mnt/tmp is a filesystem on a “losetup -b 4096” device.) > > Hm, that looks like some regrettable collateral damage... > > Well, you could argue that we should be writing full L1 tables with zero > padding instead of just the used part. I thought we had fixed this long > ago. But looks like we haven't. That would help for the standard case. 
It wouldn’t when the cluster size is smaller than the request alignment, which, while maybe not important, would still be a shame. > But we should still avoid crashing in other cases, so what is the > difference between both? Is it just that qcow2 has the RESIZE permission > anyway so it doesn't matter? I assume so. > If so, maybe attaching to a block node with WRITE, but not RESIZE is > what needs to fail when the image size is unaligned? That sounds reasonable. The obvious question is what happens when the RESIZE capability is removed. Dropping capabilities may never fail – I suppose we could force-keep the RESIZE capability for such nodes? Or we could immediately align such files to the block size once they are opened (with the RESIZE capability). >> Or you use blockdev-create, that seems to work (because of course you >> can set the cache mode on the protocol node when you open it for >> formatting). But, well, I think there should be a working qemu-img >> create case. >> >> Also, I’m afraid of breaking existing use cases with this patch (just >> qemu-img create + using the image with cache=none). > > I think for raw images, failure on start is better than crashing when > the VM is running. Agreed. > The qcow2 case needs to be fixed, of course. > > Either case, I guess patch 2 can already be merged and would solve at > least the immediate bug report. Also true. Max
Am 14.07.2020 um 11:56 hat Max Reitz geschrieben: > On 13.07.20 16:29, Kevin Wolf wrote: > > Am 13.07.2020 um 13:19 hat Max Reitz geschrieben: > >> On 10.07.20 16:21, Kevin Wolf wrote: > >>> Unaligned requests will automatically be aligned to bl.request_alignment > >>> and we don't want to extend requests to access space beyond the end of > >>> the image, so it's required that the image size is aligned. > >>> > >>> With write requests, this could cause assertion failures like this if > >>> RESIZE permissions weren't requested: > >>> > >>> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed. > >>> > >>> This was e.g. triggered by qemu-img converting to a target image with 4k > >>> request alignment when the image was only aligned to 512 bytes, but not > >>> to 4k. > >>> > >>> Signed-off-by: Kevin Wolf <kwolf@redhat.com> > >>> --- > >>> block.c | 10 ++++++++++ > >>> 1 file changed, 10 insertions(+) > >> > >> (I think we had some proposal like this before, but I can’t find it, > >> unfortunately...) > >> > >> I can’t see how with this patch you could create qcow2 images and then > >> use them with direct I/O, because AFAICS, qemu-img create doesn’t allow > >> specifying caching options, so AFAIU you’re stuck with: > >> > >> $ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M > >> Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536 > >> compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16 > >> > >> $ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2 > >> qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a > >> multiple of request alignment > >> > >> (/mnt/tmp is a filesystem on a “losetup -b 4096” device.) > > > > Hm, that looks like some regrettable collateral damage... > > > > Well, you could argue that we should be writing full L1 tables with zero > > padding instead of just the used part. I thought we had fixed this long > > ago. 
But looks like we haven't. > > That would help for the standard case. It wouldn’t when the cluster > size is smaller than the request alignment, which, while maybe not > important, would still be a shame. I don't think it would be unreasonable to require a cluster size that is a multiple of the logical block size of your host storage if you want to use O_DIRECT. But we have unaligned images in practice, so this is pure theory anyway. > > But we should still avoid crashing in other cases, so what is the > > difference between both? Is it just that qcow2 has the RESIZE permission > > anyway so it doesn't matter? > > I assume so. > > > If so, maybe attaching to a block node with WRITE, but not RESIZE is > > what needs to fail when the image size is unaligned? > > That sounds reasonable. > > The obvious question is what happens when the RESIZE capability is > removed. Dropping capabilities may never fail – I suppose we could > force-keep the RESIZE capability for such nodes? It's not nice, but I think we already have this kind of behaviour for unlocking failures. So yes, that sounds like an option. > Or we could immediately align such files to the block size once they > are opened (with the RESIZE capability). Automatically resizing the image file is obviously harmless for qcow2 images, but it would be a guest-visible change for raw images. It might be better to avoid this. Kevin
On 14.07.20 13:08, Kevin Wolf wrote: > Am 14.07.2020 um 11:56 hat Max Reitz geschrieben: >> On 13.07.20 16:29, Kevin Wolf wrote: >>> Am 13.07.2020 um 13:19 hat Max Reitz geschrieben: >>>> On 10.07.20 16:21, Kevin Wolf wrote: >>>>> Unaligned requests will automatically be aligned to bl.request_alignment >>>>> and we don't want to extend requests to access space beyond the end of >>>>> the image, so it's required that the image size is aligned. >>>>> >>>>> With write requests, this could cause assertion failures like this if >>>>> RESIZE permissions weren't requested: >>>>> >>>>> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed. >>>>> >>>>> This was e.g. triggered by qemu-img converting to a target image with 4k >>>>> request alignment when the image was only aligned to 512 bytes, but not >>>>> to 4k. >>>>> >>>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com> >>>>> --- >>>>> block.c | 10 ++++++++++ >>>>> 1 file changed, 10 insertions(+) >>>> >>>> (I think we had some proposal like this before, but I can’t find it, >>>> unfortunately...) >>>> >>>> I can’t see how with this patch you could create qcow2 images and then >>>> use them with direct I/O, because AFAICS, qemu-img create doesn’t allow >>>> specifying caching options, so AFAIU you’re stuck with: >>>> >>>> $ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M >>>> Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536 >>>> compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16 >>>> >>>> $ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2 >>>> qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a >>>> multiple of request alignment >>>> >>>> (/mnt/tmp is a filesystem on a “losetup -b 4096” device.) >>> >>> Hm, that looks like some regrettable collateral damage... >>> >>> Well, you could argue that we should be writing full L1 tables with zero >>> padding instead of just the used part. 
I thought we had fixed this long >>> ago. But looks like we haven't. >> >> That would help for the standard case. It wouldn’t when the cluster >> size is smaller than the request alignment, which, while maybe not >> important, would still be a shame. > > I don't think it would be unreasonable to require a cluster size that is > a multiple of the logical block size of your host storage if you want to > use O_DIRECT. True. > But we have unaligned images in practice, so this is pure theory anyway. Hm. Maybe it would help to just adjust the error message to instruct the user to resize the image to fit the request alignment? (e.g. “is not a multiple of the request alignment %u (try resizing the image to %llu bytes)”) >>> But we should still avoid crashing in other cases, so what is the >>> difference between both? Is it just that qcow2 has the RESIZE permission >>> anyway so it doesn't matter? >> >> I assume so. >> >>> If so, maybe attaching to a block node with WRITE, but not RESIZE is >>> what needs to fail when the image size is unaligned? >> >> That sounds reasonable. >> >> The obvious question is what happens when the RESIZE capability is >> removed. Dropping capabilities may never fail – I suppose we could >> force-keep the RESIZE capability for such nodes? > > It's not nice, but I think we already have this kind of behaviour for > unlocking failures. So yes, that sounds like an option. > >> Or we could immediately align such files to the block size once they >> are opened (with the RESIZE capability). > > Automatically resizing the image file is obviously harmless for qcow2 > images, but it would be a guest-visible change for raw images. It might > be better to avoid this. Well, it seems to be what already happens if the guest device has taken the RESIZE capability (i.e., whenever there’s no failing assertion). The only difference that appears to me is just that it happens only when writing to the end of the image instead of unconditionally when opening it. Max
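The extended error message suggested above needs the next aligned size. A minimal sketch of that arithmetic, assuming a power-of-two alignment: `round_up_to_alignment` and `format_unaligned_size_hint` are hypothetical helpers, and the sketch uses plain `snprintf` rather than QEMU's `error_setg` so that it stays self-contained.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Round size up to the next multiple of align (align: power of two).
 * Overflow for sizes near UINT64_MAX is not handled in this sketch. */
static uint64_t round_up_to_alignment(uint64_t size, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~((uint64_t)align - 1);
}

/* Returns 0 if size is already aligned and leaves buf untouched;
 * otherwise writes the hint into buf and returns snprintf's result. */
static int format_unaligned_size_hint(char *buf, size_t buflen,
                                      uint64_t size, uint32_t align)
{
    if (size % align == 0) {
        return 0;
    }
    return snprintf(buf, buflen,
                    "Image size is not a multiple of the request alignment "
                    "%" PRIu32 " (try resizing the image to %" PRIu64 " bytes)",
                    align, round_up_to_alignment(size, align));
}
```

For a 1049088-byte image and 4k alignment, the suggested size is 1052672 bytes.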
Am 14.07.2020 um 18:22 hat Max Reitz geschrieben: > On 14.07.20 13:08, Kevin Wolf wrote: > > Am 14.07.2020 um 11:56 hat Max Reitz geschrieben: > >> On 13.07.20 16:29, Kevin Wolf wrote: > >>> Am 13.07.2020 um 13:19 hat Max Reitz geschrieben: > >>>> On 10.07.20 16:21, Kevin Wolf wrote: > >>>>> Unaligned requests will automatically be aligned to bl.request_alignment > >>>>> and we don't want to extend requests to access space beyond the end of > >>>>> the image, so it's required that the image size is aligned. > >>>>> > >>>>> With write requests, this could cause assertion failures like this if > >>>>> RESIZE permissions weren't requested: > >>>>> > >>>>> qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed. > >>>>> > >>>>> This was e.g. triggered by qemu-img converting to a target image with 4k > >>>>> request alignment when the image was only aligned to 512 bytes, but not > >>>>> to 4k. > >>>>> > >>>>> Signed-off-by: Kevin Wolf <kwolf@redhat.com> > >>>>> --- > >>>>> block.c | 10 ++++++++++ > >>>>> 1 file changed, 10 insertions(+) > >>>> > >>>> (I think we had some proposal like this before, but I can’t find it, > >>>> unfortunately...) > >>>> > >>>> I can’t see how with this patch you could create qcow2 images and then > >>>> use them with direct I/O, because AFAICS, qemu-img create doesn’t allow > >>>> specifying caching options, so AFAIU you’re stuck with: > >>>> > >>>> $ ./qemu-img create -f qcow2 /mnt/tmp/foo.qcow2 1M > >>>> Formatting '/mnt/tmp/foo.qcow2', fmt=qcow2 cluster_size=65536 > >>>> compression_type=zlib size=1048576 lazy_refcounts=off refcount_bits=16 > >>>> > >>>> $ sudo ./qemu-io -t none /mnt/tmp/foo.qcow2 > >>>> qemu-io: can't open device /mnt/tmp/foo.qcow2: Image size is not a > >>>> multiple of request alignment > >>>> > >>>> (/mnt/tmp is a filesystem on a “losetup -b 4096” device.) > >>> > >>> Hm, that looks like some regrettable collateral damage... 
> >>> > >>> Well, you could argue that we should be writing full L1 tables with zero > >>> padding instead of just the used part. I thought we had fixed this long > >>> ago. But looks like we haven't. > >> > >> That would help for the standard case. It wouldn’t when the cluster > >> size is smaller than the request alignment, which, while maybe not > >> important, would still be a shame. > > > > I don't think it would be unreasonable to require a cluster size that is > > a multiple of the logical block size of your host storage if you want to > > use O_DIRECT. > > True. > > > But we have unaligned images in practice, so this is pure theory anyway. > > Hm. Maybe it would help to just adjust the error message to instruct > the user to resize the image to fit the request alignment? (e.g. “is > not a multiple of the request alignment %u (try resizing the image to > %llu bytes)”) This would require management tools to automatically do this or we would break any users that don't manually invoke QEMU. I don't think this is a realistic option, especially since "management tools" must probably include all those one-off shell scripts that people use. > >>> But we should still avoid crashing in other cases, so what is the > >>> difference between both? Is it just that qcow2 has the RESIZE permission > >>> anyway so it doesn't matter? > >> > >> I assume so. > >> > >>> If so, maybe attaching to a block node with WRITE, but not RESIZE is > >>> what needs to fail when the image size is unaligned? > >> > >> That sounds reasonable. > >> > >> The obvious question is what happens when the RESIZE capability is > >> removed. Dropping capabilities may never fail – I suppose we could > >> force-keep the RESIZE capability for such nodes? > > > > It's not nice, but I think we already have this kind of behaviour for > > unlocking failures. So yes, that sounds like an option. 
> > > >> Or we could immediately align such files to the block size once they > >> are opened (with the RESIZE capability). > > > > Automatically resizing the image file is obviously harmless for qcow2 > > images, but it would be a guest-visible change for raw images. It might > > be better to avoid this. > > Well, it seems to be what already happens if the guest device has taken > the RESIZE capability (i.e., whenever there’s no failing assertion). > The only difference that appears to me is just that it happens only when > writing to the end of the image instead of unconditionally when opening it. I would have considered this as part of the bug rather than a desirable future behaviour. blk_check_byte_request() tries to catch any request going past EOF, it just doesn't know anything about request_alignment. Kevin
On Mon, Jul 13, 2020 at 7:56 PM Kevin Wolf <kwolf@redhat.com> wrote:
>
> On 13.07.2020 at 18:33, Nir Soffer wrote:
> > On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
> > >
> > > Unaligned requests will automatically be aligned to bl.request_alignment
> > > and we don't want to extend requests to access space beyond the end of
> > > the image, so it's required that the image size is aligned.
> > >
> > > With write requests, this could cause assertion failures like this if
> > > RESIZE permissions weren't requested:
> > >
> > > qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> > >
> > > This was e.g. triggered by qemu-img converting to a target image with 4k
> > > request alignment when the image was only aligned to 512 bytes, but not
> > > to 4k.
> >
> > Was it on NFS? Shouldn't this be fixed by the next patch then?
>
> Patch 2 makes the problem go away for NFS because NFS doesn't even
> require the 4k alignment. But on storage that legitimately needs 4k
> alignment (or possibly other filesystems that are misdetected), you
> would still hit the same problem.

I want to add the oVirt point of view on this. We enforce raw image
alignment of 4k on file based storage, and 128m on block storage, so
our raw images cannot have this issue.

We have an issue with empty qcow2 images that have an unaligned size,
but we don't create such images in normal flows.

Nir
On 15.07.2020 at 15:22, Nir Soffer wrote:
> On Mon, Jul 13, 2020 at 7:56 PM Kevin Wolf <kwolf@redhat.com> wrote:
> >
> > On 13.07.2020 at 18:33, Nir Soffer wrote:
> > > On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
> > > >
> > > > Unaligned requests will automatically be aligned to bl.request_alignment
> > > > and we don't want to extend requests to access space beyond the end of
> > > > the image, so it's required that the image size is aligned.
> > > >
> > > > With write requests, this could cause assertion failures like this if
> > > > RESIZE permissions weren't requested:
> > > >
> > > > qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> > > >
> > > > This was e.g. triggered by qemu-img converting to a target image with 4k
> > > > request alignment when the image was only aligned to 512 bytes, but not
> > > > to 4k.
> > >
> > > Was it on NFS? Shouldn't this be fixed by the next patch then?
> >
> > Patch 2 makes the problem go away for NFS because NFS doesn't even
> > require the 4k alignment. But on storage that legitimately needs 4k
> > alignment (or possibly other filesystems that are misdetected), you
> > would still hit the same problem.
>
> I want to add the oVirt point of view on this. We enforce raw image
> alignment of 4k on file based storage, and 128m on block storage, so
> our raw images cannot have this issue.

Yes, then you won't hit the problem.

> We have an issue with empty qcow2 images that have an unaligned size,
> but we don't create such images in normal flows.

Can you give a reproducer where qcow2 images would be affected?
Generally speaking, the qcow2 driver either takes both WRITE and RESIZE
permissions or neither. So it should just automatically resize the image
as needed instead of crashing.

Kevin
On Wed, Jul 15, 2020 at 04:22:06PM +0300, Nir Soffer wrote:
> On Mon, Jul 13, 2020 at 7:56 PM Kevin Wolf <kwolf@redhat.com> wrote:
> >
> > On 13.07.2020 at 18:33, Nir Soffer wrote:
> > > On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote:
> > > >
> > > > Unaligned requests will automatically be aligned to bl.request_alignment
> > > > and we don't want to extend requests to access space beyond the end of
> > > > the image, so it's required that the image size is aligned.
> > > >
> > > > With write requests, this could cause assertion failures like this if
> > > > RESIZE permissions weren't requested:
> > > >
> > > > qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.
> > > >
> > > > This was e.g. triggered by qemu-img converting to a target image with 4k
> > > > request alignment when the image was only aligned to 512 bytes, but not
> > > > to 4k.
> > >
> > > Was it on NFS? Shouldn't this be fixed by the next patch then?
> >
> > Patch 2 makes the problem go away for NFS because NFS doesn't even
> > require the 4k alignment. But on storage that legitimately needs 4k
> > alignment (or possibly other filesystems that are misdetected), you
> > would still hit the same problem.
>
> I want to add the oVirt point of view on this. We enforce raw image
> alignment of 4k on file based storage, and 128m on block storage, so
> our raw images cannot have this issue.

OpenStack should have a minimum alignment of 1 GB for image sizes, so
this change is also no trouble for it.

Regards,
Daniel
On Wed, Jul 15, 2020 at 4:42 PM Kevin Wolf <kwolf@redhat.com> wrote: > > Am 15.07.2020 um 15:22 hat Nir Soffer geschrieben: > > On Mon, Jul 13, 2020 at 7:56 PM Kevin Wolf <kwolf@redhat.com> wrote: > > > > > > Am 13.07.2020 um 18:33 hat Nir Soffer geschrieben: > > > > On Fri, Jul 10, 2020 at 5:22 PM Kevin Wolf <kwolf@redhat.com> wrote: > > > > > > > > > > Unaligned requests will automatically be aligned to bl.request_alignment > > > > > and we don't want to extend requests to access space beyond the end of > > > > > the image, so it's required that the image size is aligned. > > > > > > > > > > With write requests, this could cause assertion failures like this if > > > > > RESIZE permissions weren't requested: > > > > > > > > > > qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed. > > > > > > > > > > This was e.g. triggered by qemu-img converting to a target image with 4k > > > > > request alignment when the image was only aligned to 512 bytes, but not > > > > > to 4k. > > > > > > > > Was it on NFS? Shouldn't this be fix by the next patch then? > > > > > > Patch 2 makes the problem go away for NFS because NFS doesn't even > > > require the 4k alignment. But on storage that legitimately needs 4k > > > alignment (or possibly other filesystems that are misdetected), you > > > would still hit the same problem. > > > > I want to add oVirt point of view on this. We enforce raw image > > alignment of 4k on file based storage, and 128m on block storage, so > > our raw images cannot have this issue. > > Yes, then you won't hit the problem. > > > We have an issue with empty qcow2 images which are unaligned size, but > > we don't create such images in normal flows. > > Can you give a reproducer where qcow2 images would be affected? > Generally speaking, the qcow2 driver either takes both WRITE and RESIZE > permissions or neither. 
> So it should just automatically resize the image
> as needed instead of crashing.

I think this is a theoretical issue in other programs trying to access
the unaligned images using direct I/O.
diff --git a/block.c b/block.c
index cc377d7ef3..c635777911 100644
--- a/block.c
+++ b/block.c
@@ -1489,6 +1489,16 @@ static int bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv,
         return -EINVAL;
     }
 
+    /*
+     * Unaligned requests will automatically be aligned to bl.request_alignment
+     * and we don't want to extend requests to access space beyond the end of
+     * the image, so it's required that the image size is aligned.
+     */
+    if ((bs->total_sectors * BDRV_SECTOR_SIZE) % bs->bl.request_alignment) {
+        error_setg(errp, "Image size is not a multiple of request alignment");
+        return -EINVAL;
+    }
+
     assert(bdrv_opt_mem_align(bs) != 0);
     assert(bdrv_min_mem_align(bs) != 0);
     assert(is_power_of_2(bs->bl.request_alignment));
Unaligned requests will automatically be aligned to bl.request_alignment
and we don't want to extend requests to access space beyond the end of
the image, so it's required that the image size is aligned.

With write requests, this could cause assertion failures like this if
RESIZE permissions weren't requested:

qemu-img: block/io.c:1910: bdrv_co_write_req_prepare: Assertion `end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE' failed.

This was e.g. triggered by qemu-img converting to a target image with 4k
request alignment when the image was only aligned to 512 bytes, but not
to 4k.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
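The relationship between the quoted assertion and the new open-time check can be illustrated with a small standalone model. This is a sketch under assumptions, not QEMU code: `SECTOR_SIZE` stands in for `BDRV_SECTOR_SIZE`, `PERM_RESIZE` is a placeholder bit for `BLK_PERM_RESIZE`, and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u        /* stand-in for BDRV_SECTOR_SIZE */
#define PERM_RESIZE (1u << 3)   /* placeholder for BLK_PERM_RESIZE */

/* Models the asserted condition: a write may end past the last sector
 * only when the RESIZE permission is held. */
static bool write_passes_assertion(uint64_t offset, uint64_t bytes,
                                   uint64_t total_sectors, unsigned perm)
{
    uint64_t end_sector = (offset + bytes + SECTOR_SIZE - 1) / SECTOR_SIZE;

    return end_sector <= total_sectors || (perm & PERM_RESIZE) != 0;
}

/* Models the check added by the patch: the image size in bytes must be
 * a multiple of the request alignment. */
static bool image_size_is_aligned(uint64_t total_sectors,
                                  uint32_t request_alignment)
{
    assert(request_alignment != 0);
    return (total_sectors * SECTOR_SIZE) % request_alignment == 0;
}
```

In this model, an image of 2049 sectors with 4k request alignment fails `image_size_is_aligned`, and a write already widened to bytes 1048576..1052672 fails `write_passes_assertion` unless the RESIZE bit is set. With 2048 sectors both checks pass for an aligned write that ends at the image end.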