
[1/2] qemu-img: rebase: stop when reaching EOF of old backing file

Message ID 20230523162458.704266-2-andrey.drobyshev@virtuozzo.com (mailing list archive)
State New, archived
Series qemu-img: fix getting stuck in infinite loop on

Commit Message

Andrey Drobyshev May 23, 2023, 4:24 p.m. UTC
When rebasing within one backing chain, if the target image is larger than
the old backing file, bdrv_is_allocated_above() ends up setting *pnum = 0.
As a result, the target offset is never incremented, and we get stuck in an
infinite for loop.  Detect this case and break out of the loop, as there is
no more data to be read and merged from the old backing file.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
---
 qemu-img.c | 7 +++++++
 1 file changed, 7 insertions(+)
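
To illustrate the stall described in the commit message, here is a minimal
sketch of a loop that advances by the length reported for each chunk.  It is
a toy model, not QEMU code: toy_status_len() and toy_rebase_iterations() are
hypothetical names, and the loop shape (offset += n) is an assumption based
on the description above.  The iteration cap exists only so the stuck case
can be observed instead of spinning forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the allocation query: returns how many bytes
 * starting at `offset` can be reported in one go.  At or past the end of
 * the old backing file it returns 0, mirroring the *pnum = 0 case.
 */
static int64_t toy_status_len(int64_t offset, int64_t want, int64_t backing_size)
{
    if (offset >= backing_size) {
        return 0;
    }
    int64_t left = backing_size - offset;
    return want < left ? want : left;
}

/*
 * Walk [0, target_size) in chunks and return the number of iterations.
 * Returns -1 if more than max_iters iterations were needed, which in this
 * model means the offset stopped advancing.  With break_on_zero set, the
 * loop stops at the first zero-length chunk, as the patch does.
 */
static int toy_rebase_iterations(int64_t target_size, int64_t backing_size,
                                 int64_t chunk, bool break_on_zero,
                                 int max_iters)
{
    int iters = 0;
    int64_t n = 0;

    assert(chunk > 0);
    for (int64_t offset = 0; offset < target_size; offset += n) {
        if (++iters > max_iters) {
            return -1; /* no progress: offset += 0 on every pass */
        }
        int64_t left = target_size - offset;
        n = toy_status_len(offset, left < chunk ? left : chunk, backing_size);
        if (n == 0 && break_on_zero) {
            break;
        }
    }
    return iters;
}
```

Calling toy_rebase_iterations() with a 320 KiB target over a 256 KiB old
backing file and 64 KiB chunks hits the cap unless break_on_zero is set.
Note that this only models termination; it says nothing about whether the
resulting image content is correct.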

Comments

Denis V. Lunev May 24, 2023, 8:30 a.m. UTC | #1
On 5/23/23 18:24, Andrey Drobyshev wrote:
> In case when we're rebasing within one backing chain, and when target image
> is larger than old backing file, bdrv_is_allocated_above() ends up setting
> *pnum = 0.  As a result, target offset isn't getting incremented, and we
> get stuck in an infinite for loop.  Let's detect this case and break the
> loop, as there's no more data to be read and merged from the old backing.
>
> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
> ---
>   qemu-img.c | 7 +++++++
>   1 file changed, 7 insertions(+)
>
> diff --git a/qemu-img.c b/qemu-img.c
> index 27f48051b0..55b6ce407c 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -3813,6 +3813,13 @@ static int img_rebase(int argc, char **argv)
>                                    strerror(-ret));
>                       goto out;
>                   }
> +                if (!n) {
> +                    /*
> +                     * We've reached EOF of the old backing, therefore there's
> +                     * no more mergeable data.
> +                     */
> +                    break;
> +                }
>                   if (!ret) {
>                       continue;
>                   }
Nope. It seems that this patch is wrong.

iris ~/tmp/1 $ qemu-img create -f qcow2 base.qcow2 $(( 65 * 4 ))k
Formatting 'base.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=266240 lazy_refcounts=off refcount_bits=16
iris ~/tmp/1 $ qemu-img create -f qcow2 -o backing_file=base.qcow2,backing_fmt=qcow2 inc1.qcow2 $(( 64 * 4 ))
Formatting 'inc1.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=256 backing_file=base.qcow2 backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
iris ~/tmp/1 $ qemu-img create -f qcow2 -o backing_file=inc1.qcow2,backing_fmt=qcow2 inc2.qcow2 $(( 64 * 5 ))k
Formatting 'inc2.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=327680 backing_file=inc1.qcow2 backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
iris ~/tmp/1 $ qemu-io -c "write -P 0xae 256k 4k" base.qcow2
wrote 4096/4096 bytes at offset 262144
4 KiB, 1 ops; 00.01 sec (471.447 KiB/sec and 117.8617 ops/sec)
iris ~/tmp/1 $ qemu-io -c "read -P 0xae 256k 4k" base.qcow2
read 4096/4096 bytes at offset 262144
4 KiB, 1 ops; 00.00 sec (56.076 MiB/sec and 14355.4407 ops/sec)
iris ~/tmp/1 $ qemu-io -c "read -P 0xae 256k 4k" inc2.qcow2
Pattern verification failed at offset 262144, 4096 bytes
read 4096/4096 bytes at offset 262144
4 KiB, 1 ops; 00.00 sec (827.771 MiB/sec and 211909.3028 ops/sec)
iris ~/tmp/1 $ qemu-io -c "read -P 0 256k 4k" inc2.qcow2
read 4096/4096 bytes at offset 262144
4 KiB, 1 ops; 00.00 sec (838.611 MiB/sec and 214684.4139 ops/sec)
iris ~/tmp/1 $


iris ~/tmp/1 $ /home/den/src/qemu/build/qemu-img rebase -f qcow2 -b base.qcow2 -F qcow2 inc2.qcow2
iris ~/tmp/1 $ qemu-io -c "read -P 0 256k 4k" inc2.qcow2
Pattern verification failed at offset 262144, 4096 bytes
read 4096/4096 bytes at offset 262144
4 KiB, 1 ops; 00.00 sec (88.052 MiB/sec and 22541.3069 ops/sec)
iris ~/tmp/1 $

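The problem is the following:
[----0xAE] <- base
[----] <- inc1
[--------] <- inc2

In this case the last 4k of base must read as 0x00 through inc2 (this
offset is not written in inc2 and lies beyond the end of inc1), which is
what the reads before the rebase show.  After the rebase the same range
reads as 0xae.

This means that clusters which are not present in inc2 and lie beyond the
end of inc1 MUST be zeroed when rebasing onto base.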
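
The read semantics behind the diagram can be sketched with a toy backing
chain.  This is an illustrative model only: struct toy_image, toy_read() and
toy_tail_seen_through_inc2() are hypothetical names, one byte stands in for
one unit of the diagram, and the sizes (5, 4 and 8 units) follow the diagram
rather than the exact byte sizes in the transcript.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX 8

/* One image of a toy backing chain; sizes and offsets are in units. */
struct toy_image {
    size_t size;                     /* virtual size in units */
    int allocated[TOY_MAX];          /* 1 if the unit holds its own data */
    uint8_t data[TOY_MAX];           /* one byte stands in for one unit */
    const struct toy_image *backing; /* NULL for the bottom of the chain */
};

/*
 * Content visible at `unit` when reading through `img`.  A unit that is
 * unallocated falls through to the backing image; a unit beyond the end
 * of a layer reads as zero and never reaches the layers below it.
 */
static uint8_t toy_read(const struct toy_image *img, size_t unit)
{
    assert(unit < TOY_MAX);
    for (; img != NULL; img = img->backing) {
        if (unit >= img->size) {
            return 0;
        }
        if (img->allocated[unit]) {
            return img->data[unit];
        }
    }
    return 0;
}

/*
 * Builds base <- inc1 <- inc2 as in the diagram and returns what unit 4
 * (the 0xAE unit of base) looks like through inc2.  If
 * rebased_without_zeroing is non-zero, inc2 is re-pointed straight at base
 * without touching its contents first.
 */
static uint8_t toy_tail_seen_through_inc2(int rebased_without_zeroing)
{
    struct toy_image base = { .size = 5, .backing = NULL };
    struct toy_image inc1 = { .size = 4, .backing = &base };
    struct toy_image inc2 = { .size = 8, .backing = &inc1 };

    base.allocated[4] = 1;
    base.data[4] = 0xAE;
    if (rebased_without_zeroing) {
        inc2.backing = &base;
    }
    return toy_read(&inc2, 4);
}
```

In this model the unit reads as zero while inc1 sits in the chain and as
0xAE once inc2 points directly at base, which matches the change in
guest-visible data seen in the transcript.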

Something like this...

Den
Andrey Drobyshev May 25, 2023, 10:37 a.m. UTC | #2
Thanks for pointing that out.  Simply breaking out of the loop at that
point is indeed not a good idea.  Since the offsets beyond the old backing
size read as zeroes before the rebase, they must still read as zeroes
afterwards.  That means we should explicitly zero the overlay clusters
that lie beyond the old backing size.
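
A rough sketch of that idea on a toy overlay, under stated assumptions: this
is not the v2 patch and not QEMU code.  struct toy_overlay,
toy_zero_beyond_old_backing(), toy_read_over() and toy_rebased_tail() are
hypothetical names, and one byte stands in for one cluster.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX 8

/* Toy overlay image; one byte stands in for one cluster. */
struct toy_overlay {
    size_t size;             /* virtual size in clusters */
    bool allocated[TOY_MAX]; /* true if the overlay holds the cluster */
    uint8_t data[TOY_MAX];
};

/*
 * Hypothetical helper: pin every unallocated overlay cluster at or beyond
 * old_backing_size to explicit zeroes, so that it no longer falls through
 * to whatever the new backing file has at that offset.  Returns the number
 * of clusters that were zeroed.
 */
static size_t toy_zero_beyond_old_backing(struct toy_overlay *ov,
                                          size_t old_backing_size)
{
    size_t zeroed = 0;

    assert(ov->size <= TOY_MAX);
    for (size_t c = old_backing_size; c < ov->size; c++) {
        if (!ov->allocated[c]) {
            ov->allocated[c] = true;
            ov->data[c] = 0;
            zeroed++;
        }
    }
    return zeroed;
}

/* Read cluster c through the overlay placed directly on a new backing. */
static uint8_t toy_read_over(const struct toy_overlay *ov,
                             const uint8_t *backing, size_t backing_size,
                             size_t c)
{
    assert(c < ov->size);
    if (ov->allocated[c]) {
        return ov->data[c];
    }
    return c < backing_size ? backing[c] : 0;
}

/*
 * The scenario from the review: the old backing ends at cluster 4, the new
 * backing (base) has 0xAE in cluster 4.  Returns what cluster 4 reads as
 * through the overlay after zeroing and switching to base.
 */
static uint8_t toy_rebased_tail(void)
{
    const uint8_t base[5] = { 0, 0, 0, 0, 0xAE };
    struct toy_overlay inc2 = { .size = 8 };

    toy_zero_beyond_old_backing(&inc2, 4);
    return toy_read_over(&inc2, base, 5, 4);
}
```

With the tail pinned to zeroes first, cluster 4 keeps reading as zero after
the switch to base.  The model leaves out everything below the old backing
size, where data still has to be compared and merged as before.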

I'll alter the patches accordingly (including the test) and send v2.

Andrey

Patch

diff --git a/qemu-img.c b/qemu-img.c
index 27f48051b0..55b6ce407c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3813,6 +3813,13 @@  static int img_rebase(int argc, char **argv)
                                  strerror(-ret));
                     goto out;
                 }
+                if (!n) {
+                    /*
+                     * We've reached EOF of the old backing, therefore there's
+                     * no more mergeable data.
+                     */
+                    break;
+                }
                 if (!ret) {
                     continue;
                 }