Message ID | 20191120140319.1505-3-kwolf@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | block: Fix resize (extending) of short overlays |
On 11/20/19 8:03 AM, Kevin Wolf wrote:
> When extending the size of an image that has a backing file larger than
> its old size, make sure that the backing file data doesn't become
> visible in the guest, but the added area is properly zeroed out.
>
> The old behaviour made a difference in 'block_resize' (where showing the
> backing file data from an old snapshot rather than zeros is
> questionable) as well as in commit block jobs (both from active and
> intermediate nodes) and HMP 'commit', where committing to a short
> backing file would incorrectly omit writing zeroes for unallocated
> blocks on the top layer after the EOF of the short backing file.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/io.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)

Reviewed-by: Eric Blake <eblake@redhat.com>
20.11.2019 17:03, Kevin Wolf wrote:
> When extending the size of an image that has a backing file larger than
> its old size, make sure that the backing file data doesn't become
> visible in the guest, but the added area is properly zeroed out.
>
> The old behaviour made a difference in 'block_resize' (where showing the
> backing file data from an old snapshot rather than zeros is
> questionable) as well as in commit block jobs (both from active and
> intermediate nodes) and HMP 'commit', where committing to a short
> backing file would incorrectly omit writing zeroes for unallocated
> blocks on the top layer after the EOF of the short backing file.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/io.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>
> diff --git a/block/io.c b/block/io.c
> index 003f4ea38c..8683f7a4bd 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -3382,6 +3382,31 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
>          goto out;
>      }
>
> +    /*
> +     * If the image has a backing file that is large enough that it would
> +     * provide data for the new area, we cannot leave it unallocated because
> +     * then the backing file content would become visible. Instead, zero-fill
> +     * the area where backing file and new area overlap.
> +     */

Should we mention that we still don't care if the user, for some reason,
changes the backing file in the future?

> +    if (new_bytes && bs->backing && prealloc == PREALLOC_MODE_OFF) {
> +        int64_t backing_len;
> +
> +        backing_len = bdrv_getlength(backing_bs(bs));
> +        if (backing_len < 0) {
> +            ret = backing_len;
> +            goto out;
> +        }
> +
> +        if (backing_len > old_size) {
> +            ret = bdrv_co_do_pwrite_zeroes(bs, old_size,
> +                                           MIN(new_bytes, backing_len - old_size),
> +                                           BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP);

Two lines here are over 80 characters.

> +            if (ret < 0) {
> +                goto out;
> +            }
> +        }
> +    }

Should we improve the "off" mode specification in QAPI?

> +
>      ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
>      if (ret < 0) {
>          error_setg_errno(errp, -ret, "Could not refresh total sector count");
>

Hmm. Is it correct to call write_zeroes before refresh_total_sectors?
Note that qcow2_co_pwrite_zeroes relies on bs->total_sectors...
On 20.11.2019 at 15:47, Vladimir Sementsov-Ogievskiy wrote:
> 20.11.2019 17:03, Kevin Wolf wrote:
> > When extending the size of an image that has a backing file larger than
> > its old size, make sure that the backing file data doesn't become
> > visible in the guest, but the added area is properly zeroed out.
> >
> > The old behaviour made a difference in 'block_resize' (where showing the
> > backing file data from an old snapshot rather than zeros is
> > questionable) as well as in commit block jobs (both from active and
> > intermediate nodes) and HMP 'commit', where committing to a short
> > backing file would incorrectly omit writing zeroes for unallocated
> > blocks on the top layer after the EOF of the short backing file.
> >
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block/io.c | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> >
> > diff --git a/block/io.c b/block/io.c
> > index 003f4ea38c..8683f7a4bd 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -3382,6 +3382,31 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
> >          goto out;
> >      }
> >
> > +    /*
> > +     * If the image has a backing file that is large enough that it would
> > +     * provide data for the new area, we cannot leave it unallocated because
> > +     * then the backing file content would become visible. Instead, zero-fill
> > +     * the area where backing file and new area overlap.
> > +     */
>
> Should we mention that we still don't care if the user, for some reason,
> changes the backing file in the future?

This should be obvious, but I can add something.

> > +    if (new_bytes && bs->backing && prealloc == PREALLOC_MODE_OFF) {
> > +        int64_t backing_len;
> > +
> > +        backing_len = bdrv_getlength(backing_bs(bs));
> > +        if (backing_len < 0) {
> > +            ret = backing_len;
> > +            goto out;
> > +        }
> > +
> > +        if (backing_len > old_size) {
> > +            ret = bdrv_co_do_pwrite_zeroes(bs, old_size,
> > +                                           MIN(new_bytes, backing_len - old_size),
> > +                                           BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP);
>
> Two lines here are over 80 characters.

Will fix.

> > +            if (ret < 0) {
> > +                goto out;
> > +            }
> > +        }
> > +    }
>
> Should we improve the "off" mode specification in QAPI?

I don't think we're changing the semantics of "off". We're merely fixing
a bug that happens not to exist with preallocation.

> > +
> >      ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
> >      if (ret < 0) {
> >          error_setg_errno(errp, -ret, "Could not refresh total sector count");
> >
>
> Hmm. Is it correct to call write_zeroes before refresh_total_sectors?
> Note that qcow2_co_pwrite_zeroes relies on bs->total_sectors...

Hm... I placed the code where I did because I didn't want to make the
new area valid before it is zeroed. But you're probably right that we
shouldn't run requests with inconsistent bs->total_sectors, so I'll
switch the order.

Kevin
20.11.2019 17:03, Kevin Wolf wrote:
> When extending the size of an image that has a backing file larger than
> its old size, make sure that the backing file data doesn't become
> visible in the guest, but the added area is properly zeroed out.
>
> The old behaviour made a difference in 'block_resize' (where showing the
> backing file data from an old snapshot rather than zeros is
> questionable) as well as in commit block jobs (both from active and
> intermediate nodes) and HMP 'commit', where committing to a short
> backing file would incorrectly omit writing zeroes for unallocated
> blocks on the top layer after the EOF of the short backing file.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/io.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>
> diff --git a/block/io.c b/block/io.c
> index 003f4ea38c..8683f7a4bd 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -3382,6 +3382,31 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
>          goto out;
>      }
>
> +    /*
> +     * If the image has a backing file that is large enough that it would
> +     * provide data for the new area, we cannot leave it unallocated because
> +     * then the backing file content would become visible. Instead, zero-fill
> +     * the area where backing file and new area overlap.
> +     */
> +    if (new_bytes && bs->backing && prealloc == PREALLOC_MODE_OFF) {
> +        int64_t backing_len;
> +
> +        backing_len = bdrv_getlength(backing_bs(bs));
> +        if (backing_len < 0) {
> +            ret = backing_len;
> +            goto out;
> +        }
> +
> +        if (backing_len > old_size) {
> +            ret = bdrv_co_do_pwrite_zeroes(bs, old_size,
> +                                           MIN(new_bytes, backing_len - old_size),
> +                                           BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP);
> +            if (ret < 0) {
> +                goto out;
> +            }
> +        }
> +    }
> +
>      ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
>      if (ret < 0) {
>          error_setg_errno(errp, -ret, "Could not refresh total sector count");
>

Hmmmm. I think that for commit, we should also zero the truncated area
if !bdrv_has_zero_init_truncate(bs). But we should not do it here, as
it should not be done if we are just resizing the disk.

Which formats are that bad?
On 20.11.2019 at 19:01, Vladimir Sementsov-Ogievskiy wrote:
> 20.11.2019 17:03, Kevin Wolf wrote:
> > When extending the size of an image that has a backing file larger than
> > its old size, make sure that the backing file data doesn't become
> > visible in the guest, but the added area is properly zeroed out.
> >
> > The old behaviour made a difference in 'block_resize' (where showing the
> > backing file data from an old snapshot rather than zeros is
> > questionable) as well as in commit block jobs (both from active and
> > intermediate nodes) and HMP 'commit', where committing to a short
> > backing file would incorrectly omit writing zeroes for unallocated
> > blocks on the top layer after the EOF of the short backing file.
> >
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block/io.c | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> >
> > diff --git a/block/io.c b/block/io.c
> > index 003f4ea38c..8683f7a4bd 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -3382,6 +3382,31 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
> >          goto out;
> >      }
> >
> > +    /*
> > +     * If the image has a backing file that is large enough that it would
> > +     * provide data for the new area, we cannot leave it unallocated because
> > +     * then the backing file content would become visible. Instead, zero-fill
> > +     * the area where backing file and new area overlap.
> > +     */
> > +    if (new_bytes && bs->backing && prealloc == PREALLOC_MODE_OFF) {
> > +        int64_t backing_len;
> > +
> > +        backing_len = bdrv_getlength(backing_bs(bs));
> > +        if (backing_len < 0) {
> > +            ret = backing_len;
> > +            goto out;
> > +        }
> > +
> > +        if (backing_len > old_size) {
> > +            ret = bdrv_co_do_pwrite_zeroes(bs, old_size,
> > +                                           MIN(new_bytes, backing_len - old_size),
> > +                                           BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP);
> > +            if (ret < 0) {
> > +                goto out;
> > +            }
> > +        }
> > +    }
> > +
> >      ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
> >      if (ret < 0) {
> >          error_setg_errno(errp, -ret, "Could not refresh total sector count");
>
> Hmmmm. I think that for commit, we should also zero the truncated area
> if !bdrv_has_zero_init_truncate(bs). But we should not do it here, as
> it should not be done if we are just resizing the disk.

Hm, yes, we need to do this for METADATA and FALLOC preallocation at
least. I think we already guarantee zeros for FULL, don't we?

Resize needs zero init in the opposite case: when you are resizing a
short backing file, the longer overlay still needs to read the same
zeros that it previously read past the EOF of the backing file. This one
actually sounds even nastier to fix than what this series does. :-/

Anyway, maybe instead of the no_fallback parameter I introduced in v3,
what we really want is a need_zero_init parameter that only commit jobs
set for now?

Or actually add a new preallocation mode like you suggested that would
add a zero write and pass OFF to the driver implementations. Then we
wouldn't have to add a new parameter everywhere.

We'd still unconditionally write zeros where it's necessary to allocate
blocks to cover the backing file (and to provide correct data to the
overlay if we ever figure out how to check this condition). I think I've
come to the conclusion that blocking on block_resize is better than
failing.

> Which formats are that bad?

You mean that they don't have zero init? The usual suspect for bad image
formats is raw, but fortunately that doesn't support backing files. So
maybe it's not a problem we would see in practice.

Kevin
diff --git a/block/io.c b/block/io.c
index 003f4ea38c..8683f7a4bd 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3382,6 +3382,31 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
         goto out;
     }
 
+    /*
+     * If the image has a backing file that is large enough that it would
+     * provide data for the new area, we cannot leave it unallocated because
+     * then the backing file content would become visible. Instead, zero-fill
+     * the area where backing file and new area overlap.
+     */
+    if (new_bytes && bs->backing && prealloc == PREALLOC_MODE_OFF) {
+        int64_t backing_len;
+
+        backing_len = bdrv_getlength(backing_bs(bs));
+        if (backing_len < 0) {
+            ret = backing_len;
+            goto out;
+        }
+
+        if (backing_len > old_size) {
+            ret = bdrv_co_do_pwrite_zeroes(bs, old_size,
+                                           MIN(new_bytes, backing_len - old_size),
+                                           BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP);
+            if (ret < 0) {
+                goto out;
+            }
+        }
+    }
+
     ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
     if (ret < 0) {
         error_setg_errno(errp, -ret, "Could not refresh total sector count");
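Editor's sketch, not part of the patch: the arithmetic behind the zero-fill can be isolated in a few lines of standalone C. The names `ZeroRange` and `zero_range_for_extend` are hypothetical and do not exist in QEMU. The sketch also assumes that `new_bytes` in the patch is the number of bytes added by the truncate (new size minus old size, or 0 when not growing), which the hunk itself does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not a QEMU structure. */
typedef struct ZeroRange {
    int64_t offset; /* start of the range to zero (the old image size) */
    int64_t bytes;  /* length of the range; 0 means nothing to zero */
} ZeroRange;

/*
 * Compute which part of a newly added area overlaps the backing file.
 * Mirrors the MIN(new_bytes, backing_len - old_size) expression in the
 * patch, under the assumption stated in the lead-in.
 */
static inline ZeroRange zero_range_for_extend(int64_t old_size,
                                              int64_t new_size,
                                              int64_t backing_len)
{
    ZeroRange r = { old_size, 0 };

    assert(old_size >= 0 && new_size >= 0 && backing_len >= 0);

    /* Shrinking or unchanged size: no new area exists. */
    if (new_size <= old_size) {
        return r;
    }
    /* Backing file ends at or before the old EOF: nothing can leak. */
    if (backing_len <= old_size) {
        return r;
    }

    int64_t new_bytes = new_size - old_size;
    int64_t overlap = backing_len - old_size;

    r.bytes = new_bytes < overlap ? new_bytes : overlap;
    return r;
}
```

For example, growing an overlay from 10 to 30 bytes over a 20-byte backing file yields the range starting at offset 10 with length 10. The remaining 10 bytes lie past the backing file's EOF and need no explicit zeroing for this purpose.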
When extending the size of an image that has a backing file larger than
its old size, make sure that the backing file data doesn't become
visible in the guest, but the added area is properly zeroed out.

The old behaviour made a difference in 'block_resize' (where showing the
backing file data from an old snapshot rather than zeros is
questionable) as well as in commit block jobs (both from active and
intermediate nodes) and HMP 'commit', where committing to a short
backing file would incorrectly omit writing zeroes for unallocated
blocks on the top layer after the EOF of the short backing file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/io.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
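Editor's sketch, not QEMU code: a toy model of a backing chain can make the problem in the commit message concrete. `ToyImage` and `toy_read` are hypothetical names. The model tracks allocation per byte and treats reads past an image's EOF as zeros, which is a simplification of how an overlay reads past the end of a short backing file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy image: per-byte allocation map plus optional backing. */
typedef struct ToyImage {
    const uint8_t *data;            /* image contents */
    const uint8_t *allocated;       /* 1 = allocated here; NULL = all allocated */
    size_t len;                     /* image length in bytes */
    const struct ToyImage *backing; /* backing image, or NULL */
} ToyImage;

/*
 * Read one byte through the chain: allocated bytes come from the image
 * itself, unallocated bytes fall through to the backing image, and
 * anything past EOF (or with no backing image) reads as zero.
 */
static inline uint8_t toy_read(const ToyImage *img, size_t off)
{
    if (img == NULL || off >= img->len) {
        return 0;
    }
    if (img->allocated == NULL || img->allocated[off]) {
        return img->data[off];
    }
    return toy_read(img->backing, off);
}
```

In this model, take a 4-byte backing image holding {1, 2, 3, 4} and an overlay grown from 2 to 4 bytes. If the new bytes are left unallocated, reading offset 3 returns 4, so old backing data becomes visible. If the new bytes are marked allocated and zero, the same read returns 0, which is the behaviour the patch aims for.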