Message ID | 99b45e3beb4a38b17eb50fcde1e09cdefdb99724.1584468723.git.berto@igalia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add subcluster allocation to qcow2 | expand |
On 17.03.20 19:16, Alberto Garcia wrote: > Two changes are needed in this function: > > 1) A full discard deallocates a cluster so we can skip the operation if > it is already unallocated. With extended L2 entries however if any > of the subclusters has the 'all zeroes' bit set then we have to > clear it. > > 2) Setting the QCOW_OFLAG_ZERO bit of the L2 entry is forbidden if an > image has extended L2 entries. Instead, the individual 'all zeroes' > bits must be used. > > Signed-off-by: Alberto Garcia <berto@igalia.com> > --- > block/qcow2-cluster.c | 18 +++++++++++++++--- > 1 file changed, 15 insertions(+), 3 deletions(-) > > diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c > index 746006a117..824c710760 100644 > --- a/block/qcow2-cluster.c > +++ b/block/qcow2-cluster.c > @@ -1790,12 +1790,20 @@ static int discard_in_l2_slice(BlockDriverState *bs, uint64_t offset, > * TODO We might want to use bdrv_block_status(bs) here, but we're > * holding s->lock, so that doesn't work today. > * > - * If full_discard is true, the sector should not read back as zeroes, > + * If full_discard is true, the cluster should not read back as zeroes, > * but rather fall through to the backing file. > */ > switch (qcow2_get_cluster_type(bs, old_l2_entry)) { > case QCOW2_CLUSTER_UNALLOCATED: > - if (full_discard || !bs->backing) { > + if (full_discard) { > + /* If the image has extended L2 entries we can only > + * skip this operation if the L2 bitmap is zero. */ > + uint64_t bitmap = has_subclusters(s) ? > + get_l2_bitmap(s, l2_slice, l2_index + i) : 0; Isn’t this bitmap only valid for standard clusters? In this case, the whole cluster is unallocated, so the bitmap shouldn’t be relevant, AFAIU. Max > + if (bitmap == 0) { > + continue; > + } > + } else if (!bs->backing) { > continue; > } > break;
On Thu 09 Apr 2020 12:05:12 PM CEST, Max Reitz wrote: >> switch (qcow2_get_cluster_type(bs, old_l2_entry)) { >> case QCOW2_CLUSTER_UNALLOCATED: >> - if (full_discard || !bs->backing) { >> + if (full_discard) { >> + /* If the image has extended L2 entries we can only >> + * skip this operation if the L2 bitmap is zero. */ >> + uint64_t bitmap = has_subclusters(s) ? >> + get_l2_bitmap(s, l2_slice, l2_index + i) : 0; > > Isn’t this bitmap only valid for standard clusters? In this case, the > whole cluster is unallocated, so the bitmap shouldn’t be relevant, > AFAIU. I'm not sure if I follow you. An unallocated cluster can still have QCOW_OFLAG_SUB_ZERO set in some of its subclusters. Those read as zeroes and the rest go to the backing file. After a full discard all subclusters should be completely deallocated so those bits should be cleared. If the bitmap is already 0 (the whole cluster is already unallocated) or if the image does not have extended L2 entries (which also means that the whole cluster is already unallocated) then we can skip the discard. Berto
On 10.04.20 14:47, Alberto Garcia wrote: > On Thu 09 Apr 2020 12:05:12 PM CEST, Max Reitz wrote: >>> switch (qcow2_get_cluster_type(bs, old_l2_entry)) { >>> case QCOW2_CLUSTER_UNALLOCATED: >>> - if (full_discard || !bs->backing) { >>> + if (full_discard) { >>> + /* If the image has extended L2 entries we can only >>> + * skip this operation if the L2 bitmap is zero. */ >>> + uint64_t bitmap = has_subclusters(s) ? >>> + get_l2_bitmap(s, l2_slice, l2_index + i) : 0; >> >> Isn’t this bitmap only valid for standard clusters? In this case, the >> whole cluster is unallocated, so the bitmap shouldn’t be relevant, >> AFAIU. > > I'm not sure if I follow you. > > An unallocated cluster can still have QCOW_OFLAG_SUB_ZERO set in some of > its subclusters. Those read as zeroes and the rest go to the backing > file. Hm, right, this is the only way to have non-preallocated zero clusters after all. I suppose I read the spec wrong and assumed somehow that unallocated clusters don’t use “standard cluster descriptors”, so their bitmap usage would be undefined. Don’t know how that happened. > After a full discard all subclusters should be completely deallocated so > those bits should be cleared. > > If the bitmap is already 0 (the whole cluster is already unallocated) or > if the image does not have extended L2 entries (which also means that > the whole cluster is already unallocated) then we can skip the discard. Yep, seems right. Reviewed-by: Max Reitz <mreitz@redhat.com>
17.03.2020 21:16, Alberto Garcia wrote: > Two changes are needed in this function: > > 1) A full discard deallocates a cluster so we can skip the operation if > it is already unallocated. With extended L2 entries however if any > of the subclusters has the 'all zeroes' bit set then we have to > clear it. > > 2) Setting the QCOW_OFLAG_ZERO bit of the L2 entry is forbidden if an > image has extended L2 entries. Instead, the individual 'all zeroes' > bits must be used. > > Signed-off-by: Alberto Garcia <berto@igalia.com> > --- > block/qcow2-cluster.c | 18 +++++++++++++++--- > 1 file changed, 15 insertions(+), 3 deletions(-) > > diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c > index 746006a117..824c710760 100644 > --- a/block/qcow2-cluster.c > +++ b/block/qcow2-cluster.c > @@ -1790,12 +1790,20 @@ static int discard_in_l2_slice(BlockDriverState *bs, uint64_t offset, > * TODO We might want to use bdrv_block_status(bs) here, but we're > * holding s->lock, so that doesn't work today. > * > - * If full_discard is true, the sector should not read back as zeroes, > + * If full_discard is true, the cluster should not read back as zeroes, > * but rather fall through to the backing file. > */ > switch (qcow2_get_cluster_type(bs, old_l2_entry)) { > case QCOW2_CLUSTER_UNALLOCATED: > - if (full_discard || !bs->backing) { > + if (full_discard) { > + /* If the image has extended L2 entries we can only > + * skip this operation if the L2 bitmap is zero. */ > + uint64_t bitmap = has_subclusters(s) ? > + get_l2_bitmap(s, l2_slice, l2_index + i) : 0; > + if (bitmap == 0) { > + continue; > + } > + } else if (!bs->backing) { > continue; > } Hmm, so you do continue if full_discard is false AND bitmap != 0 & !bs->backing, but you do not continue if full_discard is true AND bitmap != 0 & !bs->backing (as you will not go to "else") branch. Seems it's a mistake. I think, correct condition is if (!bs->backing || full_discard && !get_l2_bitmap(s, l2_slice, l2_index + i)) , but, for doing so we also need --- a/block/qcow2.h +++ b/block/qcow2.h @@ -565,6 +565,7 @@ static inline uint64_t get_l2_entry(BDRVQcow2State *s, uint64_t *l2_slice, return be64_to_cpu(l2_slice[idx]); } +/* Return l2-entry bitmap if image has subclusters and 0 otherwise. */ static inline uint64_t get_l2_bitmap(BDRVQcow2State *s, uint64_t *l2_slice, int idx) { @@ -572,7 +573,6 @@ static inline uint64_t get_l2_bitmap(BDRVQcow2State *s, uint64_t *l2_slice, idx *= l2_entry_size(s) / sizeof(uint64_t); return be64_to_cpu(l2_slice[idx + 1]); } else { - /* For convenience only; the caller should ignore this value. */ return 0; } } or if you don't want, keep it explicit if (!bs->backing || full_discard && (!has_subclusters(s) || !get_l2_bitmap(s, l2_slice, l2_index + i))) ===== In case QCOW2_CLUSTER_ZERO_PLAIN, worth assert !has_subclusters(s) or mark image corrupted ? > break; > @@ -1817,7 +1825,11 @@ static int discard_in_l2_slice(BlockDriverState *bs, uint64_t offset, > > /* First remove L2 entries */ > qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice); > - if (!full_discard && s->qcow_version >= 3) { > + if (has_subclusters(s)) { > + set_l2_entry(s, l2_slice, l2_index + i, 0); > + set_l2_bitmap(s, l2_slice, l2_index + i, > + full_discard ? 0 : QCOW_L2_BITMAP_ALL_ZEROES); > + } else if (!full_discard && s->qcow_version >= 3) { > set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO); > } else { > set_l2_entry(s, l2_slice, l2_index + i, 0); >
On Wed 22 Apr 2020 01:35:25 PM CEST, Vladimir Sementsov-Ogievskiy wrote: > 17.03.2020 21:16, Alberto Garcia wrote: >> Two changes are needed in this function: >> >> 1) A full discard deallocates a cluster so we can skip the operation if >> it is already unallocated. With extended L2 entries however if any >> of the subclusters has the 'all zeroes' bit set then we have to >> clear it. >> >> 2) Setting the QCOW_OFLAG_ZERO bit of the L2 entry is forbidden if an >> image has extended L2 entries. Instead, the individual 'all zeroes' >> bits must be used. >> >> Signed-off-by: Alberto Garcia <berto@igalia.com> >> --- >> block/qcow2-cluster.c | 18 +++++++++++++++--- >> 1 file changed, 15 insertions(+), 3 deletions(-) >> >> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c >> index 746006a117..824c710760 100644 >> --- a/block/qcow2-cluster.c >> +++ b/block/qcow2-cluster.c >> @@ -1790,12 +1790,20 @@ static int discard_in_l2_slice(BlockDriverState *bs, uint64_t offset, >> * TODO We might want to use bdrv_block_status(bs) here, but we're >> * holding s->lock, so that doesn't work today. >> * >> - * If full_discard is true, the sector should not read back as zeroes, >> + * If full_discard is true, the cluster should not read back as zeroes, >> * but rather fall through to the backing file. >> */ >> switch (qcow2_get_cluster_type(bs, old_l2_entry)) { >> case QCOW2_CLUSTER_UNALLOCATED: >> - if (full_discard || !bs->backing) { >> + if (full_discard) { >> + /* If the image has extended L2 entries we can only >> + * skip this operation if the L2 bitmap is zero. */ >> + uint64_t bitmap = has_subclusters(s) ? >> + get_l2_bitmap(s, l2_slice, l2_index + i) : 0; >> + if (bitmap == 0) { >> + continue; >> + } >> + } else if (!bs->backing) { >> continue; >> } > > Hmm, so you do continue if full_discard is false AND bitmap != 0 & > !bs->backing, > but you do not continue if full_discard is true AND bitmap != 0 & > !bs->backing (as you will not go to "else") branch. 1. If full_discard is true it means that the entry and the bitmap should always be set to 0, regardless of whether there's a backing file or any other consideration. This is used e.g when shrinking an image, or by qcow2_make_empty(). We can only skip this operation if both the entry and the bitmap are already 0 (the former we know because of QCOW2_CLUSTER_UNALLOCATED). 2. If full_discard is false it means that we must ensure that the cluster reads back as zeroes, but there's no need to clear the bitmap (in fact we must set QCOW_OFLAG_ZERO or QCOW_L2_BITMAP_ALL_ZEROES depending on the type of image). We can skip this operation if there's no backing file and the cluster is already unallocated (because then we know that it already reads as zeroes). One optimization would be to skip the operation also if the image has subclusters and the bitmap is QCOW_L2_BITMAP_ALL_ZEROES, I can do that for the next version. > In case QCOW2_CLUSTER_ZERO_PLAIN, worth assert !has_subclusters(s) or > mark image corrupted ? I think that should be handled directly in qcow2_get_cluster_type(). There's currently an inconsistency now that I think of it: if an image has subclusters and QCOW_OFLAG_ZERO set then qcow2_get_cluster_type() returns QCOW2_CLUSTER_ZERO_* but qcow2_get_subcluster_type() returns QCOW2_SUBCLUSTER_INVALID. Two alternatives: - We add QCOW2_CLUSTER_INVALID so we get an error in both cases. Problem: any function that calls qcow2_get_cluster_type() should be modified to handle that. - We ignore QCOW_OFLAG_ZERO. Simpler, and it would allow us to use that bit in the future if we wanted. Berto
22.04.2020 20:42, Alberto Garcia wrote: > On Wed 22 Apr 2020 01:35:25 PM CEST, Vladimir Sementsov-Ogievskiy wrote: >> 17.03.2020 21:16, Alberto Garcia wrote: >>> Two changes are needed in this function: >>> >>> 1) A full discard deallocates a cluster so we can skip the operation if >>> it is already unallocated. With extended L2 entries however if any >>> of the subclusters has the 'all zeroes' bit set then we have to >>> clear it. >>> >>> 2) Setting the QCOW_OFLAG_ZERO bit of the L2 entry is forbidden if an >>> image has extended L2 entries. Instead, the individual 'all zeroes' >>> bits must be used. >>> >>> Signed-off-by: Alberto Garcia <berto@igalia.com> >>> --- >>> block/qcow2-cluster.c | 18 +++++++++++++++--- >>> 1 file changed, 15 insertions(+), 3 deletions(-) >>> >>> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c >>> index 746006a117..824c710760 100644 >>> --- a/block/qcow2-cluster.c >>> +++ b/block/qcow2-cluster.c >>> @@ -1790,12 +1790,20 @@ static int discard_in_l2_slice(BlockDriverState *bs, uint64_t offset, >>> * TODO We might want to use bdrv_block_status(bs) here, but we're >>> * holding s->lock, so that doesn't work today. >>> * >>> - * If full_discard is true, the sector should not read back as zeroes, >>> + * If full_discard is true, the cluster should not read back as zeroes, >>> * but rather fall through to the backing file. >>> */ >>> switch (qcow2_get_cluster_type(bs, old_l2_entry)) { >>> case QCOW2_CLUSTER_UNALLOCATED: >>> - if (full_discard || !bs->backing) { >>> + if (full_discard) { >>> + /* If the image has extended L2 entries we can only >>> + * skip this operation if the L2 bitmap is zero. */ >>> + uint64_t bitmap = has_subclusters(s) ? >>> + get_l2_bitmap(s, l2_slice, l2_index + i) : 0; >>> + if (bitmap == 0) { >>> + continue; >>> + } >>> + } else if (!bs->backing) { >>> continue; >>> } >> >> Hmm, so you do continue if full_discard is false AND bitmap != 0 & >> !bs->backing, > >> but you do not continue if full_discard is true AND bitmap != 0 & >> !bs->backing (as you will not go to "else") branch. > > 1. If full_discard is true it means that the entry and the bitmap should > always be set to 0, regardless of whether there's a backing file or > any other consideration. > > This is used e.g when shrinking an image, or by qcow2_make_empty(). > > We can only skip this operation if both the entry and the bitmap are > already 0 (the former we know because of QCOW2_CLUSTER_UNALLOCATED). Ah, understand, sorry. I thought that behavior was changed accidentally, but it is for purpose. With old code cluster is already unallocated, but with subclusters we may have some ZERO_PLAIN subclusters. > > 2. If full_discard is false it means that we must ensure that the > cluster reads back as zeroes, but there's no need to clear the bitmap > (in fact we must set QCOW_OFLAG_ZERO or QCOW_L2_BITMAP_ALL_ZEROES > depending on the type of image). > > We can skip this operation if there's no backing file and the cluster > is already unallocated (because then we know that it already reads as > zeroes). > > One optimization would be to skip the operation also if the image has > subclusters and the bitmap is QCOW_L2_BITMAP_ALL_ZEROES, I can do > that for the next version. > >> In case QCOW2_CLUSTER_ZERO_PLAIN, worth assert !has_subclusters(s) or >> mark image corrupted ? > > I think that should be handled directly in qcow2_get_cluster_type(). > > There's currently an inconsistency now that I think of it: if an image > has subclusters and QCOW_OFLAG_ZERO set then qcow2_get_cluster_type() > returns QCOW2_CLUSTER_ZERO_* but qcow2_get_subcluster_type() returns > QCOW2_SUBCLUSTER_INVALID. > > Two alternatives: > > - We add QCOW2_CLUSTER_INVALID so we get an error in both > cases. Problem: any function that calls qcow2_get_cluster_type() > should be modified to handle that. > > - We ignore QCOW_OFLAG_ZERO. Simpler, and it would allow us to use > that bit in the future if we wanted. > Hmm. Actually we don't check other reserved bits. But ZERO bit is risky, we may miss data corruptions during transmission to the qcow2-subclusters world. So I'm for the first variant if it's not too huge.
On Wed 22 Apr 2020 08:09:53 PM CEST, Vladimir Sementsov-Ogievskiy wrote: >> There's currently an inconsistency now that I think of it: if an image >> has subclusters and QCOW_OFLAG_ZERO set then qcow2_get_cluster_type() >> returns QCOW2_CLUSTER_ZERO_* but qcow2_get_subcluster_type() returns >> QCOW2_SUBCLUSTER_INVALID. >> >> Two alternatives: >> >> - We add QCOW2_CLUSTER_INVALID so we get an error in both >> cases. Problem: any function that calls qcow2_get_cluster_type() >> should be modified to handle that. >> >> - We ignore QCOW_OFLAG_ZERO. Simpler, and it would allow us to use >> that bit in the future if we wanted. >> > > Hmm. Actually we don't check other reserved bits. But ZERO bit is > risky, we may miss data corruptions during transmission to the > qcow2-subclusters world. That's the best argument for checking that bit. > So I'm for the first variant if it's not too huge. The other problem is that if we ever want to use that bit for something else then we would need to add an incompatible feature. If we just ignore it now then we may be able to make it a compatible feature. But the chances for that are low I think, and we still have 8 available bits in the L2 entry. Berto
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 746006a117..824c710760 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -1790,12 +1790,20 @@ static int discard_in_l2_slice(BlockDriverState *bs, uint64_t offset, * TODO We might want to use bdrv_block_status(bs) here, but we're * holding s->lock, so that doesn't work today. * - * If full_discard is true, the sector should not read back as zeroes, + * If full_discard is true, the cluster should not read back as zeroes, * but rather fall through to the backing file. */ switch (qcow2_get_cluster_type(bs, old_l2_entry)) { case QCOW2_CLUSTER_UNALLOCATED: - if (full_discard || !bs->backing) { + if (full_discard) { + /* If the image has extended L2 entries we can only + * skip this operation if the L2 bitmap is zero. */ + uint64_t bitmap = has_subclusters(s) ? + get_l2_bitmap(s, l2_slice, l2_index + i) : 0; + if (bitmap == 0) { + continue; + } + } else if (!bs->backing) { continue; } break; @@ -1817,7 +1825,11 @@ static int discard_in_l2_slice(BlockDriverState *bs, uint64_t offset, /* First remove L2 entries */ qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice); - if (!full_discard && s->qcow_version >= 3) { + if (has_subclusters(s)) { + set_l2_entry(s, l2_slice, l2_index + i, 0); + set_l2_bitmap(s, l2_slice, l2_index + i, + full_discard ? 0 : QCOW_L2_BITMAP_ALL_ZEROES); + } else if (!full_discard && s->qcow_version >= 3) { set_l2_entry(s, l2_slice, l2_index + i, QCOW_OFLAG_ZERO); } else { set_l2_entry(s, l2_slice, l2_index + i, 0);
Two changes are needed in this function: 1) A full discard deallocates a cluster so we can skip the operation if it is already unallocated. With extended L2 entries however if any of the subclusters has the 'all zeroes' bit set then we have to clear it. 2) Setting the QCOW_OFLAG_ZERO bit of the L2 entry is forbidden if an image has extended L2 entries. Instead, the individual 'all zeroes' bits must be used. Signed-off-by: Alberto Garcia <berto@igalia.com> --- block/qcow2-cluster.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)