diff mbox

[v10,04/17] qcow2: Optimize zero_single_l2() to minimize L2 churn

Message ID 20170427014626.11553-5-eblake@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Eric Blake April 27, 2017, 1:46 a.m. UTC
Similar to discard_single_l2(), we should try to avoid dirtying
the L2 cache when the cluster we are changing already has the
right characteristics.

Note that by the time we get to zero_single_l2(), BDRV_REQ_MAY_UNMAP
is a requirement to unallocate a cluster (this is because the block
layer clears that flag if discard.* flags during open requested that
we never punch holes - see the conversation around commit 170f4b2e,
https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg07306.html).
Therefore, this patch can only reuse a zero cluster as-is if either
unmapping is not requested, or if the zero cluster was not associated
with an allocation.

Technically, there are some cases where an unallocated cluster
already reads as all zeroes (namely, when there is no backing file
[easy: check bs->backing], or when the backing file also reads as
zeroes [harder: we can't check bdrv_get_block_status since we are
already holding the lock]), where the guest would not immediately see
a difference if we left that cluster unallocated.  But if the user
did not request unmapping, leaving an unallocated cluster is wrong;
and even if the user DID request unmapping, keeping a cluster
unallocated risks a subtle semantic change of guest-visible contents
if a backing file is later added, and it is not worth auditing
whether all internal uses such as mirror properly avoid an unmap
request.  Thus, this patch is intentionally limited to just clusters
that are already marked as zero.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v10: new patch, replacing earlier attempt to use unallocated clusters,
and ditching any optimization of v2 files
---
 block/qcow2-cluster.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

Comments

Max Reitz April 28, 2017, 6 p.m. UTC | #1
On 27.04.2017 03:46, Eric Blake wrote:
> Similar to discard_single_l2(), we should try to avoid dirtying
> the L2 cache when the cluster we are changing already has the
> right characteristics.
> 
> Note that by the time we get to zero_single_l2(), BDRV_REQ_MAY_UNMAP
> is a requirement to unallocate a cluster (this is because the block
> layer clears that flag if discard.* flags during open requested that
> we never punch holes - see the conversation around commit 170f4b2e,
> https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg07306.html).
> Therefore, this patch can only reuse a zero cluster as-is if either
> unmapping is not requested, or if the zero cluster was not associated
> with an allocation.
> 
> Technically, there are some cases where an unallocated cluster
> already reads as all zeroes (namely, when there is no backing file
> [easy: check bs->backing], or when the backing file also reads as
> zeroes [harder: we can't check bdrv_get_block_status since we are
> already holding the lock]), where the guest would not immediately see
> a difference if we left that cluster unallocated.  But if the user
> did not request unmapping, leaving an unallocated cluster is wrong;
> and even if the user DID request unmapping, keeping a cluster
> unallocated risks a subtle semantic change of guest-visible contents
> if a backing file is later added, and it is not worth auditing
> whether all internal uses such as mirror properly avoid an unmap
> request.  Thus, this patch is intentionally limited to just clusters
> that are already marked as zero.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> v10: new patch, replacing earlier attempt to use unallocated clusters,
> and ditching any optimization of v2 files
> ---
>  block/qcow2-cluster.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index db3d937..d542894 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -1614,6 +1614,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
>      int l2_index;
>      int ret;
>      int i;
> +    bool unmap = !!(flags & BDRV_REQ_MAY_UNMAP);
> 
>      ret = get_cluster_table(bs, offset, &l2_table, &l2_index);
>      if (ret < 0) {
> @@ -1629,9 +1630,17 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
> 
>          old_offset = be64_to_cpu(l2_table[l2_index + i]);
> 
> -        /* Update L2 entries */
> +        /*
> +         * Minimize L2 changes if the cluster already reads back as
> +         * zeroes with correct allocation.
> +         */
> +        if (qcow2_get_cluster_type(old_offset) == QCOW2_CLUSTER_ZERO &&
> +            !(unmap && old_offset & L2E_OFFSET_MASK)) {
> +            continue;
> +        }
> +
>          qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
> -        if (old_offset & QCOW_OFLAG_COMPRESSED || flags & BDRV_REQ_MAY_UNMAP) {
> +        if (old_offset & QCOW_OFLAG_COMPRESSED || unmap) {

Works, but I'd like it better to store the cluster type somewhere and
then use it here instead of checking the COMPRESSED flag.

But it works, so it's up to you:

Reviewed-by: Max Reitz <mreitz@redhat.com>

>              l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
>              qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
>          } else {
>
Eric Blake April 28, 2017, 7:11 p.m. UTC | #2
On 04/28/2017 01:00 PM, Max Reitz wrote:
> On 27.04.2017 03:46, Eric Blake wrote:
>> Similar to discard_single_l2(), we should try to avoid dirtying
>> the L2 cache when the cluster we are changing already has the
>> right characteristics.
>>

>> -        /* Update L2 entries */
>> +        /*
>> +         * Minimize L2 changes if the cluster already reads back as
>> +         * zeroes with correct allocation.
>> +         */
>> +        if (qcow2_get_cluster_type(old_offset) == QCOW2_CLUSTER_ZERO &&
>> +            !(unmap && old_offset & L2E_OFFSET_MASK)) {
>> +            continue;
>> +        }
>> +
>>          qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
>> -        if (old_offset & QCOW_OFLAG_COMPRESSED || flags & BDRV_REQ_MAY_UNMAP) {
>> +        if (old_offset & QCOW_OFLAG_COMPRESSED || unmap) {
> 
> Works, but I'd like it better to store the cluster type somewhere and
> then use it here instead of checking the COMPRESSED flag.

Indeed, qcow2_get_cluster_type() checks QCOW_OFLAG_COMPRESSED first -
but I like the idea of storing it in a temporary variable now that we
are using the cluster type in more than one place.

> 
> But it works, so it's up to you:
> 
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> 
>>              l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
>>              qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
>>          } else {
>>
> 
>
diff mbox

Patch

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index db3d937..d542894 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1614,6 +1614,7 @@  static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
     int l2_index;
     int ret;
     int i;
+    bool unmap = !!(flags & BDRV_REQ_MAY_UNMAP);

     ret = get_cluster_table(bs, offset, &l2_table, &l2_index);
     if (ret < 0) {
@@ -1629,9 +1630,17 @@  static int zero_single_l2(BlockDriverState *bs, uint64_t offset,

         old_offset = be64_to_cpu(l2_table[l2_index + i]);

-        /* Update L2 entries */
+        /*
+         * Minimize L2 changes if the cluster already reads back as
+         * zeroes with correct allocation.
+         */
+        if (qcow2_get_cluster_type(old_offset) == QCOW2_CLUSTER_ZERO &&
+            !(unmap && old_offset & L2E_OFFSET_MASK)) {
+            continue;
+        }
+
         qcow2_cache_entry_mark_dirty(bs, s->l2_table_cache, l2_table);
-        if (old_offset & QCOW_OFLAG_COMPRESSED || flags & BDRV_REQ_MAY_UNMAP) {
+        if (old_offset & QCOW_OFLAG_COMPRESSED || unmap) {
             l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
             qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
         } else {