Message ID | 20230523162458.704266-2-andrey.drobyshev@virtuozzo.com (mailing list archive) |
---|---|
State | New, archived |
Series | qemu-img: fix getting stuck in infinite loop on |
On 5/23/23 18:24, Andrey Drobyshev wrote:
> In case when we're rebasing within one backing chain, and when target image
> is larger than old backing file, bdrv_is_allocated_above() ends up setting
> *pnum = 0. As a result, target offset isn't getting incremented, and we
> get stuck in an infinite for loop. Let's detect this case and break the
> loop, as there's no more data to be read and merged from the old backing.
>
> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
> ---
>  qemu-img.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/qemu-img.c b/qemu-img.c
> index 27f48051b0..55b6ce407c 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -3813,6 +3813,13 @@ static int img_rebase(int argc, char **argv)
>                                   strerror(-ret));
>                      goto out;
>                  }
> +                if (!n) {
> +                    /*
> +                     * We've reached EOF of the old backing, therefore there's
> +                     * no more mergeable data.
> +                     */
> +                    break;
> +                }
>                  if (!ret) {
>                      continue;
>                  }

nope. It seems that this is wrong patch.

iris ~/tmp/1 $ qemu-img create -f qcow2 base.qcow2 $(( 65 * 4 ))k
Formatting 'base.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=266240 lazy_refcounts=off refcount_bits=16
iris ~/tmp/1 $ qemu-img create -f qcow2 -o backing_file=base.qcow2,backing_fmt=qcow2 inc1.qcow2 $(( 64 * 4 ))
Formatting 'inc1.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=256 backing_file=base.qcow2 backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
iris ~/tmp/1 $ qemu-img create -f qcow2 -o backing_file=inc1.qcow2,backing_fmt=qcow2 inc2.qcow2 $(( 64 * 5 ))k
Formatting 'inc2.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=327680 backing_file=inc1.qcow2 backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
iris ~/tmp/1 $ qemu-io -c "write -P 0xae 256k 4k" base.qcow2
wrote 4096/4096 bytes at offset 262144
4 KiB, 1 ops; 00.01 sec (471.447 KiB/sec and 117.8617 ops/sec)
iris ~/tmp/1 $ qemu-io -c "read -P 0xae 256k 4k" base.qcow2
read 4096/4096 bytes at offset 262144
4 KiB, 1 ops; 00.00 sec (56.076 MiB/sec and 14355.4407 ops/sec)
iris ~/tmp/1 $ qemu-io -c "read -P 0xae 256k 4k" inc2.qcow2
Pattern verification failed at offset 262144, 4096 bytes
read 4096/4096 bytes at offset 262144
4 KiB, 1 ops; 00.00 sec (827.771 MiB/sec and 211909.3028 ops/sec)
iris ~/tmp/1 $ qemu-io -c "read -P 0 256k 4k" inc2.qcow2
read 4096/4096 bytes at offset 262144
4 KiB, 1 ops; 00.00 sec (838.611 MiB/sec and 214684.4139 ops/sec)
iris ~/tmp/1 $

iris ~/tmp/1 $ /home/den/src/qemu/build/qemu-img rebase -f qcow2 -b base.qcow2 -F qcow2 inc2.qcow2
iris ~/tmp/1 $ qemu-io -c "read -P 0 256k 4k" inc2.qcow2
Pattern verification failed at offset 262144, 4096 bytes
read 4096/4096 bytes at offset 262144
4 KiB, 1 ops; 00.00 sec (88.052 MiB/sec and 22541.3069 ops/sec)
iris ~/tmp/1 $

the problem is the following:
[----0xAE] <- base
[----]     <- inc1
[--------] <- inc2

In this case last 4k of base should be read as 0x00 (this offset is not
written in inc2 and beyond end of the inc1).

This means that all non-present clusters in inc2 MUST be zeroed
during rebasing to base.

Something like this...

Den
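The read semantics behind the diagram above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `struct layer` and `chain_read()` are hypothetical names invented for illustration, each byte stands in for a cluster, and none of this is QEMU code. It only mirrors the behaviour seen in the transcript, where an offset past the end of an intermediate image reads as zero instead of falling through to the image below it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of one image in a backing chain (not a QEMU type). */
struct layer {
    const uint8_t *data;         /* payload, `size` bytes                  */
    const uint8_t *allocated;    /* nonzero where this layer holds data    */
    size_t size;                 /* virtual size of this layer             */
    const struct layer *backing; /* next image down the chain, or NULL     */
};

/* Read one byte through the chain, starting at the top layer. */
uint8_t chain_read(const struct layer *top, size_t offset)
{
    for (const struct layer *l = top; l != NULL; l = l->backing) {
        if (offset >= l->size) {
            /* Past the end of this layer: reads as zero, lookup stops. */
            return 0;
        }
        if (l->allocated[offset]) {
            return l->data[offset];
        }
    }
    return 0; /* unallocated everywhere in the chain */
}
```

With base = 8 bytes (0xAE at offset 4), inc1 = 4 bytes and inc2 = 8 bytes, `chain_read(&inc2, 4)` gives 0x00 while inc2 is backed by inc1, and 0xAE once inc2 points straight at base. That is the content change the transcript shows after the rebase.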
On 5/24/23 11:30, Denis V. Lunev wrote:
> On 5/23/23 18:24, Andrey Drobyshev wrote:
>> In case when we're rebasing within one backing chain, and when target image
>> is larger than old backing file, bdrv_is_allocated_above() ends up setting
>> *pnum = 0. As a result, target offset isn't getting incremented, and we
>> get stuck in an infinite for loop. Let's detect this case and break the
>> loop, as there's no more data to be read and merged from the old backing.
>>
>> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
>> ---
>>  qemu-img.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/qemu-img.c b/qemu-img.c
>> index 27f48051b0..55b6ce407c 100644
>> --- a/qemu-img.c
>> +++ b/qemu-img.c
>> @@ -3813,6 +3813,13 @@ static int img_rebase(int argc, char **argv)
>>                                   strerror(-ret));
>>                      goto out;
>>                  }
>> +                if (!n) {
>> +                    /*
>> +                     * We've reached EOF of the old backing, therefore there's
>> +                     * no more mergeable data.
>> +                     */
>> +                    break;
>> +                }
>>                  if (!ret) {
>>                      continue;
>>                  }
> nope. It seems that this is wrong patch.
>
> iris ~/tmp/1 $ qemu-img create -f qcow2 base.qcow2 $(( 65 * 4 ))k
> Formatting 'base.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off
> compression_type=zlib size=266240 lazy_refcounts=off refcount_bits=16
> iris ~/tmp/1 $ qemu-img create -f qcow2 -o
> backing_file=base.qcow2,backing_fmt=qcow2 inc1.qcow2 $(( 64 * 4 ))
> Formatting 'inc1.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off
> compression_type=zlib size=256 backing_file=base.qcow2 backing_fmt=qcow2
> lazy_refcounts=off refcount_bits=16
> iris ~/tmp/1 $ qemu-img create -f qcow2 -o
> backing_file=inc1.qcow2,backing_fmt=qcow2 inc2.qcow2 $(( 64 * 5 ))k
> Formatting 'inc2.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off
> compression_type=zlib size=327680 backing_file=inc1.qcow2
> backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
> iris ~/tmp/1 $ qemu-io -c "write -P 0xae 256k 4k" base.qcow2
> wrote 4096/4096 bytes at offset 262144
> 4 KiB, 1 ops; 00.01 sec (471.447 KiB/sec and 117.8617 ops/sec)
> iris ~/tmp/1 $ qemu-io -c "read -P 0xae 256k 4k" base.qcow2
> read 4096/4096 bytes at offset 262144
> 4 KiB, 1 ops; 00.00 sec (56.076 MiB/sec and 14355.4407 ops/sec)
> iris ~/tmp/1 $ qemu-io -c "read -P 0xae 256k 4k" inc2.qcow2
> Pattern verification failed at offset 262144, 4096 bytes
> read 4096/4096 bytes at offset 262144
> 4 KiB, 1 ops; 00.00 sec (827.771 MiB/sec and 211909.3028 ops/sec)
> iris ~/tmp/1 $ qemu-io -c "read -P 0 256k 4k" inc2.qcow2
> read 4096/4096 bytes at offset 262144
> 4 KiB, 1 ops; 00.00 sec (838.611 MiB/sec and 214684.4139 ops/sec)
> iris ~/tmp/1 $
>
>
> iris ~/tmp/1 $ /home/den/src/qemu/build/qemu-img rebase -f qcow2 -b
> base.qcow2 -F qcow2 inc2.qcow2
> iris ~/tmp/1 $ qemu-io -c "read -P 0 256k 4k" inc2.qcow2
> Pattern verification failed at offset 262144, 4096 bytes
> read 4096/4096 bytes at offset 262144
> 4 KiB, 1 ops; 00.00 sec (88.052 MiB/sec and 22541.3069 ops/sec)
> iris ~/tmp/1 $
>
> the problem is the following:
> [----0xAE] <- base
> [----]     <- inc1
> [--------] <- inc2
>
> In this case last 4k of base should be read as 0x00 (this offset is not
> written in inc2 and beyond end of the inc1).
>
> This means that all non-present clusters in inc2 MUST be zeroed
> during rebasing to base.
>
> Something like this...
>
> Den

Thanks for pointing that out, it seems indeed that breaking the loop
just yet isn't a good idea. Since the offsets beyond the old backing
size were read as zeroes before the rebase, this should remain the case
afterwards as well. Which means we should explicitly zero the overlay
clusters which go beyond the old backing size. I'll alter the patches
accordingly (including the test) and send v2.

Andrey
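The direction described in the reply above (explicitly zeroing overlay clusters that lie beyond the old backing size) might look roughly like the following toy sketch. It is an assumption-laden illustration, not the v2 patch: `zero_tail()` is a hypothetical helper, bytes stand in for clusters, and the real change would go through QEMU's block layer rather than plain arrays.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): for every offset in
 * [old_backing_size, overlay_size) that the overlay does not hold itself,
 * store an explicit zero so the read no longer falls through to the new
 * backing image. Returns the number of positions that were zeroed.
 */
size_t zero_tail(uint8_t *data, uint8_t *allocated,
                 size_t overlay_size, size_t old_backing_size)
{
    size_t zeroed = 0;

    for (size_t off = old_backing_size; off < overlay_size; off++) {
        if (!allocated[off]) {
            data[off] = 0;      /* content the guest saw before the rebase */
            allocated[off] = 1; /* now owned by the overlay itself         */
            zeroed++;
        }
    }
    return zeroed;
}
```

Positions the overlay already holds are left untouched, so only the range that previously read as zero through the shorter old backing is pinned to zero.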
diff --git a/qemu-img.c b/qemu-img.c
index 27f48051b0..55b6ce407c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3813,6 +3813,13 @@ static int img_rebase(int argc, char **argv)
                                  strerror(-ret));
                     goto out;
                 }
+                if (!n) {
+                    /*
+                     * We've reached EOF of the old backing, therefore there's
+                     * no more mergeable data.
+                     */
+                    break;
+                }
                 if (!ret) {
                     continue;
                 }
In case when we're rebasing within one backing chain, and when target image
is larger than old backing file, bdrv_is_allocated_above() ends up setting
*pnum = 0. As a result, target offset isn't getting incremented, and we
get stuck in an infinite for loop. Let's detect this case and break the
loop, as there's no more data to be read and merged from the old backing.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
---
 qemu-img.c | 7 +++++++
 1 file changed, 7 insertions(+)
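The stall described in the commit message can be reproduced in a toy model of a `for (offset = 0; offset < size; offset += n)` loop. This is a sketch under assumptions: `toy_pnum()` is a hypothetical stand-in for the `*pnum` value reported by bdrv_is_allocated_above(), `walk()` is invented for illustration, and instead of actually spinning forever the model returns -1 at the point where the offset would stop advancing.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the *pnum result: how many of the requested
 * bytes lie inside the old backing file. Yields 0 at or past its end.
 */
static int64_t toy_pnum(int64_t backing_size, int64_t offset, int64_t bytes)
{
    if (offset >= backing_size) {
        return 0;
    }
    int64_t left = backing_size - offset;
    return bytes < left ? bytes : left;
}

/*
 * Walk the target image in chunks. Returns the number of chunks processed,
 * or -1 where a zero-length step would leave the offset stuck forever.
 * With break_on_zero set, the loop ends early as in the patch above.
 */
int64_t walk(int64_t size, int64_t backing_size, int64_t chunk,
             int break_on_zero)
{
    int64_t steps = 0;
    int64_t n = 0;

    for (int64_t offset = 0; offset < size; offset += n) {
        n = (size - offset < chunk) ? size - offset : chunk;
        n = toy_pnum(backing_size, offset, n);
        if (n == 0) {
            if (break_on_zero) {
                break;
            }
            return -1; /* offset += 0: the original loop never terminates */
        }
        steps++;
    }
    return steps;
}
```

For a target of 8 units over an old backing of 4 units, `walk(8, 4, 2, 0)` reports the stall, while `walk(8, 4, 2, 1)` stops after the two chunks covered by the old backing. The model covers loop termination only and says nothing about whether the remaining range still reads correctly after the rebase.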