[2/6] block: truncate: Don't make backing file data visible

Message ID 20191120140319.1505-3-kwolf@redhat.com
State New
Series
  • block: Fix resize (extending) of short overlays

Commit Message

Kevin Wolf Nov. 20, 2019, 2:03 p.m. UTC
When extending the size of an image that has a backing file larger than
its old size, make sure that the backing file data doesn't become
visible in the guest, but the added area is properly zeroed out.

The old behaviour made a difference in 'block_resize' (where showing the
backing file data from an old snapshot rather than zeros is
questionable) as well as in commit block jobs (both from active and
intermediate nodes) and HMP 'commit', where committing to a short
backing file would incorrectly omit writing zeroes for unallocated
blocks on the top layer after the EOF of the short backing file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/io.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

Comments

Eric Blake Nov. 20, 2019, 2:20 p.m. UTC | #1
On 11/20/19 8:03 AM, Kevin Wolf wrote:
> When extending the size of an image that has a backing file larger than
> its old size, make sure that the backing file data doesn't become
> visible in the guest, but the added area is properly zeroed out.
> 
> The old behaviour made a difference in 'block_resize' (where showing the
> backing file data from an old snapshot rather than zeros is
> questionable) as well as in commit block jobs (both from active and
> intermediate nodes) and HMP 'commit', where committing to a short
> backing file would incorrectly omit writing zeroes for unallocated
> blocks on the top layer after the EOF of the short backing file.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>   block/io.c | 25 +++++++++++++++++++++++++
>   1 file changed, 25 insertions(+)

Reviewed-by: Eric Blake <eblake@redhat.com>
Vladimir Sementsov-Ogievskiy Nov. 20, 2019, 2:47 p.m. UTC | #2
20.11.2019 17:03, Kevin Wolf wrote:
> When extending the size of an image that has a backing file larger than
> its old size, make sure that the backing file data doesn't become
> visible in the guest, but the added area is properly zeroed out.
> 
> The old behaviour made a difference in 'block_resize' (where showing the
> backing file data from an old snapshot rather than zeros is
> questionable) as well as in commit block jobs (both from active and
> intermediate nodes) and HMP 'commit', where committing to a short
> backing file would incorrectly omit writing zeroes for unallocated
> blocks on the top layer after the EOF of the short backing file.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>   block/io.c | 25 +++++++++++++++++++++++++
>   1 file changed, 25 insertions(+)
> 
> diff --git a/block/io.c b/block/io.c
> index 003f4ea38c..8683f7a4bd 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -3382,6 +3382,31 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
>           goto out;
>       }
>   
> +    /*
> +     * If the image has a backing file that is large enough that it would
> +     * provide data for the new area, we cannot leave it unallocated because
> +     * then the backing file content would become visible. Instead, zero-fill
> +     * the area where backing file and new area overlap.
> +     */

Should we mention that we still don't care if the user, for some reason,
changes the backing file in the future?

> +    if (new_bytes && bs->backing && prealloc == PREALLOC_MODE_OFF) {
> +        int64_t backing_len;
> +
> +        backing_len = bdrv_getlength(backing_bs(bs));
> +        if (backing_len < 0) {
> +            ret = backing_len;
> +            goto out;
> +        }
> +
> +        if (backing_len > old_size) {
> +            ret = bdrv_co_do_pwrite_zeroes(bs, old_size,
> +                                           MIN(new_bytes, backing_len - old_size),
> +                                           BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP);

These two lines are over 80 characters.

> +            if (ret < 0) {
> +                goto out;
> +            }
> +        }
> +    }

Should we improve the specification of the "off" mode in QAPI?

> +
>       ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
>       if (ret < 0) {
>           error_setg_errno(errp, -ret, "Could not refresh total sector count");
> 

Hmm. Is it correct to call write_zeroes before refresh_total_sectors?
Note that qcow2_co_pwrite_zeroes relies on bs->total_sectors...
Kevin Wolf Nov. 20, 2019, 3:12 p.m. UTC | #3
On 20.11.2019 at 15:47, Vladimir Sementsov-Ogievskiy wrote:
> 20.11.2019 17:03, Kevin Wolf wrote:
> > When extending the size of an image that has a backing file larger than
> > its old size, make sure that the backing file data doesn't become
> > visible in the guest, but the added area is properly zeroed out.
> > 
> > The old behaviour made a difference in 'block_resize' (where showing the
> > backing file data from an old snapshot rather than zeros is
> > questionable) as well as in commit block jobs (both from active and
> > intermediate nodes) and HMP 'commit', where committing to a short
> > backing file would incorrectly omit writing zeroes for unallocated
> > blocks on the top layer after the EOF of the short backing file.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >   block/io.c | 25 +++++++++++++++++++++++++
> >   1 file changed, 25 insertions(+)
> > 
> > diff --git a/block/io.c b/block/io.c
> > index 003f4ea38c..8683f7a4bd 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -3382,6 +3382,31 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
> >           goto out;
> >       }
> >   
> > +    /*
> > +     * If the image has a backing file that is large enough that it would
> > +     * provide data for the new area, we cannot leave it unallocated because
> > +     * then the backing file content would become visible. Instead, zero-fill
> > +     * the area where backing file and new area overlap.
> > +     */
> 
> Should we mention that we still don't care if the user, for some reason,
> changes the backing file in the future?

This should be obvious, but I can add something.

> > +    if (new_bytes && bs->backing && prealloc == PREALLOC_MODE_OFF) {
> > +        int64_t backing_len;
> > +
> > +        backing_len = bdrv_getlength(backing_bs(bs));
> > +        if (backing_len < 0) {
> > +            ret = backing_len;
> > +            goto out;
> > +        }
> > +
> > +        if (backing_len > old_size) {
> > +            ret = bdrv_co_do_pwrite_zeroes(bs, old_size,
> > +                                           MIN(new_bytes, backing_len - old_size),
> > +                                           BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP);
> 
> These two lines are over 80 characters.

Will fix.

> > +            if (ret < 0) {
> > +                goto out;
> > +            }
> > +        }
> > +    }
> 
> Should we improve the specification of the "off" mode in QAPI?

I don't think we're changing the semantics of "off". We're merely fixing
a bug that happens not to exist with preallocation.

> > +
> >       ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
> >       if (ret < 0) {
> >           error_setg_errno(errp, -ret, "Could not refresh total sector count");
> > 
> 
> Hmm. Is it correct to call write_zeroes before refresh_total_sectors?
> Note that qcow2_co_pwrite_zeroes relies on bs->total_sectors...

Hm... I placed the code where I did because I didn't want to make the
new area valid before it is zeroed. But you're probably right that
we shouldn't run requests with an inconsistent bs->total_sectors, so
I'll switch the order.

Kevin
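
The ordering concern above can be illustrated with a toy model. Everything in this sketch is an assumption made for illustration: `ToyNode`, `toy_pwrite_zeroes` and `toy_grow` are hypothetical names, and the rule that a zero write past the known size is rejected is a simplification, not a description of how qcow2 actually uses bs->total_sectors.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

enum { TOY_CAPACITY = 64 };

/* Toy stand-in for a block node; this is not QEMU's BlockDriverState. */
typedef struct ToyNode {
    int64_t total_bytes;         /* size the node currently believes it has */
    uint8_t data[TOY_CAPACITY];
} ToyNode;

/* Zero a range, refusing anything beyond the size the node knows about. */
static int toy_pwrite_zeroes(ToyNode *n, int64_t offset, int64_t bytes)
{
    assert(n != NULL);

    if (offset < 0 || bytes < 0 || offset > n->total_bytes ||
        bytes > n->total_bytes - offset) {
        return -EINVAL;
    }
    memset(n->data + offset, 0, (size_t)bytes);
    return 0;
}

/* Grow the node: refresh the size first, then zero the new area. */
static int toy_grow(ToyNode *n, int64_t new_size)
{
    int64_t old_size = n->total_bytes;

    if (new_size < old_size || new_size > TOY_CAPACITY) {
        return -EINVAL;
    }
    n->total_bytes = new_size;
    return toy_pwrite_zeroes(n, old_size, new_size - old_size);
}
```

In this model, zeroing the new range before the size has been refreshed fails because the request lies outside the size the node knows about, whereas `toy_grow` succeeds because it updates the size first. The trade-off mentioned in the reply remains: the new area is briefly valid before it has been zeroed.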
Vladimir Sementsov-Ogievskiy Nov. 20, 2019, 6:01 p.m. UTC | #4
20.11.2019 17:03, Kevin Wolf wrote:
> When extending the size of an image that has a backing file larger than
> its old size, make sure that the backing file data doesn't become
> visible in the guest, but the added area is properly zeroed out.
> 
> The old behaviour made a difference in 'block_resize' (where showing the
> backing file data from an old snapshot rather than zeros is
> questionable) as well as in commit block jobs (both from active and
> intermediate nodes) and HMP 'commit', where committing to a short
> backing file would incorrectly omit writing zeroes for unallocated
> blocks on the top layer after the EOF of the short backing file.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>   block/io.c | 25 +++++++++++++++++++++++++
>   1 file changed, 25 insertions(+)
> 
> diff --git a/block/io.c b/block/io.c
> index 003f4ea38c..8683f7a4bd 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -3382,6 +3382,31 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
>           goto out;
>       }
>   
> +    /*
> +     * If the image has a backing file that is large enough that it would
> +     * provide data for the new area, we cannot leave it unallocated because
> +     * then the backing file content would become visible. Instead, zero-fill
> +     * the area where backing file and new area overlap.
> +     */
> +    if (new_bytes && bs->backing && prealloc == PREALLOC_MODE_OFF) {
> +        int64_t backing_len;
> +
> +        backing_len = bdrv_getlength(backing_bs(bs));
> +        if (backing_len < 0) {
> +            ret = backing_len;
> +            goto out;
> +        }
> +
> +        if (backing_len > old_size) {
> +            ret = bdrv_co_do_pwrite_zeroes(bs, old_size,
> +                                           MIN(new_bytes, backing_len - old_size),
> +                                           BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP);
> +            if (ret < 0) {
> +                goto out;
> +            }
> +        }
> +    }
> +
>       ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
>       if (ret < 0) {
>           error_setg_errno(errp, -ret, "Could not refresh total sector count");
> 


Hmm. I think that for commit, we should also zero the truncated area if
!bdrv_has_zero_init_truncate(bs). But we should not do it here, as it
should not be done if we are just resizing the disk.

Which formats are that bad?
Kevin Wolf Nov. 25, 2019, 11:06 a.m. UTC | #5
On 20.11.2019 at 19:01, Vladimir Sementsov-Ogievskiy wrote:
> 20.11.2019 17:03, Kevin Wolf wrote:
> > When extending the size of an image that has a backing file larger than
> > its old size, make sure that the backing file data doesn't become
> > visible in the guest, but the added area is properly zeroed out.
> > 
> > The old behaviour made a difference in 'block_resize' (where showing the
> > backing file data from an old snapshot rather than zeros is
> > questionable) as well as in commit block jobs (both from active and
> > intermediate nodes) and HMP 'commit', where committing to a short
> > backing file would incorrectly omit writing zeroes for unallocated
> > blocks on the top layer after the EOF of the short backing file.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >   block/io.c | 25 +++++++++++++++++++++++++
> >   1 file changed, 25 insertions(+)
> > 
> > diff --git a/block/io.c b/block/io.c
> > index 003f4ea38c..8683f7a4bd 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -3382,6 +3382,31 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
> >           goto out;
> >       }
> >   
> > +    /*
> > +     * If the image has a backing file that is large enough that it would
> > +     * provide data for the new area, we cannot leave it unallocated because
> > +     * then the backing file content would become visible. Instead, zero-fill
> > +     * the area where backing file and new area overlap.
> > +     */
> > +    if (new_bytes && bs->backing && prealloc == PREALLOC_MODE_OFF) {
> > +        int64_t backing_len;
> > +
> > +        backing_len = bdrv_getlength(backing_bs(bs));
> > +        if (backing_len < 0) {
> > +            ret = backing_len;
> > +            goto out;
> > +        }
> > +
> > +        if (backing_len > old_size) {
> > +            ret = bdrv_co_do_pwrite_zeroes(bs, old_size,
> > +                                           MIN(new_bytes, backing_len - old_size),
> > +                                           BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP);
> > +            if (ret < 0) {
> > +                goto out;
> > +            }
> > +        }
> > +    }
> > +
> >       ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
> >       if (ret < 0) {
> >           error_setg_errno(errp, -ret, "Could not refresh total sector count");
> 
> Hmm. I think that for commit, we should also zero the truncated area
> if !bdrv_has_zero_init_truncate(bs). But we should not do it here, as
> it should not be done if we are just resizing the disk.

Hm, yes, we need to do this for METADATA and FALLOC preallocation at
least. I think we already guarantee zeros for FULL, don't we?

Resize needs zero init in the opposite case: when you are resizing a
short backing file, the longer overlay still needs to read the same
zeros after the EOF of the backing file that it read before. This one
actually sounds even nastier to fix than what this series does. :-/

Anyway, maybe instead of the no_fallback parameter I introduced in v3,
what we really want is a need_zero_init parameter that only commit jobs
set for now? Or actually add a new preallocation mode like you suggested
that would add a zero write and pass OFF to the driver implementations.
Then we wouldn't have to add a new parameter everywhere.

We'd still unconditionally write zeros where it's necessary to allocate
blocks to cover the backing file (and to provide correct data to the
overlay if we ever figure out how to check this condition). I think I've
come to the conclusion that blocking on block_resize is better than
failing.

> Which formats are that bad?

You mean that they don't have zero init? The usual suspect for bad image
formats is raw, but fortunately that doesn't support backing files. So
maybe it's not a problem we would see in practice.

Kevin
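
The need_zero_init idea discussed above could be sketched as follows. This is a hypothetical illustration only: `bytes_to_zero` and both of its boolean parameters are made-up names, `has_zero_init_truncate` merely stands in for the result of a check such as bdrv_has_zero_init_truncate(), and nothing here reflects code that was actually merged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of QEMU: how many bytes starting at
 * old_size would need an explicit zero write when growing to new_size.
 *
 * need_zero_init:         the caller (e.g. a commit job) requires the whole
 *                         new area to read as zeros.
 * has_zero_init_truncate: assumed flag saying that the driver already
 *                         returns zeros for newly added space.
 */
static int64_t bytes_to_zero(int64_t old_size, int64_t new_size,
                             int64_t backing_len, bool need_zero_init,
                             bool has_zero_init_truncate)
{
    assert(old_size >= 0 && new_size >= old_size && backing_len >= 0);

    int64_t new_bytes = new_size - old_size;

    /* The caller insists on zeros and the driver does not promise them. */
    if (need_zero_init && !has_zero_init_truncate) {
        return new_bytes;
    }

    /* Otherwise, unconditionally cover only the backing file overlap. */
    if (backing_len <= old_size) {
        return 0;
    }
    int64_t overlap = backing_len - old_size;
    return new_bytes < overlap ? new_bytes : overlap;
}
```

Note that the backing file overlap is zeroed in every case, matching the statement that zeros are still written unconditionally where blocks must be allocated to cover the backing file.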

Patch

diff --git a/block/io.c b/block/io.c
index 003f4ea38c..8683f7a4bd 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3382,6 +3382,31 @@  int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
         goto out;
     }
 
+    /*
+     * If the image has a backing file that is large enough that it would
+     * provide data for the new area, we cannot leave it unallocated because
+     * then the backing file content would become visible. Instead, zero-fill
+     * the area where backing file and new area overlap.
+     */
+    if (new_bytes && bs->backing && prealloc == PREALLOC_MODE_OFF) {
+        int64_t backing_len;
+
+        backing_len = bdrv_getlength(backing_bs(bs));
+        if (backing_len < 0) {
+            ret = backing_len;
+            goto out;
+        }
+
+        if (backing_len > old_size) {
+            ret = bdrv_co_do_pwrite_zeroes(bs, old_size,
+                                           MIN(new_bytes, backing_len - old_size),
+                                           BDRV_REQ_ZERO_WRITE | BDRV_REQ_MAY_UNMAP);
+            if (ret < 0) {
+                goto out;
+            }
+        }
+    }
+
     ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
     if (ret < 0) {
         error_setg_errno(errp, -ret, "Could not refresh total sector count");