diff mbox

[1/2] qemu-img: make sure contain the consecutive number of zero bytes

Message ID 1492957997-28587-1-git-send-email-lidongchen@tencent.com (mailing list archive)
State New, archived
Headers show

Commit Message

858585 jemmy April 23, 2017, 2:33 p.m. UTC
From: Lidong Chen <lidongchen@tencent.com>

is_allocated_sectors_min don't guarantee to contain the
consecutive number of zero bytes. this patch fixes this bug.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
---
 qemu-img.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Comments

Eric Blake April 24, 2017, 2:43 p.m. UTC | #1
On 04/23/2017 09:33 AM, jemmy858585@gmail.com wrote:
> From: Lidong Chen <lidongchen@tencent.com>
> 
> is_allocated_sectors_min don't guarantee to contain the
> consecutive number of zero bytes. this patch fixes this bug.

This message was sent without an 'In-Reply-To' header pointing to a 0/2
cover letter.  When sending a series, please always thread things to a
cover letter; you may find 'git config format.coverletter auto' to be
helpful.

> 
> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> ---
>  qemu-img.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index b220cf7..df6d165 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -1060,9 +1060,9 @@ static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum)
>  }
>  
>  /*
> - * Like is_allocated_sectors, but if the buffer starts with a used sector,
> - * up to 'min' consecutive sectors containing zeros are ignored. This avoids
> - * breaking up write requests for only small sparse areas.
> + * Like is_allocated_sectors, but up to 'min' consecutive sectors
> + * containing zeros are ignored. This avoids breaking up write requests
> + * for only small sparse areas.
>   */
>  static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
>      int min)
> @@ -1071,11 +1071,12 @@ static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
>      int num_checked, num_used;
>  
>      if (n < min) {
> -        min = n;
> +        *pnum = n;
> +        return 1;
>      }
>  
>      ret = is_allocated_sectors(buf, n, pnum);
> -    if (!ret) {
> +    if (!ret && *pnum >= min) {

I seem to recall past attempts to try and patch this function, which
were then turned down, although I haven't scrubbed the archives for a
quick URL to point to. I'm worried that there are more subtleties here
than what you realize.
858585 jemmy April 25, 2017, 1:50 a.m. UTC | #2
On Mon, Apr 24, 2017 at 10:43 PM, Eric Blake <eblake@redhat.com> wrote:
> On 04/23/2017 09:33 AM, jemmy858585@gmail.com wrote:
>> From: Lidong Chen <lidongchen@tencent.com>
>>
>> is_allocated_sectors_min don't guarantee to contain the
>> consecutive number of zero bytes. this patch fixes this bug.
>
> This message was sent without an 'In-Reply-To' header pointing to a 0/2
> cover letter.  When sending a series, please always thread things to a
> cover letter; you may find 'git config format.coverletter auto' to be
> helpful.

Thanks for your kind advises.

>
>>
>> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
>> ---
>>  qemu-img.c | 11 ++++++-----
>>  1 file changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/qemu-img.c b/qemu-img.c
>> index b220cf7..df6d165 100644
>> --- a/qemu-img.c
>> +++ b/qemu-img.c
>> @@ -1060,9 +1060,9 @@ static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum)
>>  }
>>
>>  /*
>> - * Like is_allocated_sectors, but if the buffer starts with a used sector,
>> - * up to 'min' consecutive sectors containing zeros are ignored. This avoids
>> - * breaking up write requests for only small sparse areas.
>> + * Like is_allocated_sectors, but up to 'min' consecutive sectors
>> + * containing zeros are ignored. This avoids breaking up write requests
>> + * for only small sparse areas.
>>   */
>>  static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
>>      int min)
>> @@ -1071,11 +1071,12 @@ static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
>>      int num_checked, num_used;
>>
>>      if (n < min) {
>> -        min = n;
>> +        *pnum = n;
>> +        return 1;
>>      }
>>
>>      ret = is_allocated_sectors(buf, n, pnum);
>> -    if (!ret) {
>> +    if (!ret && *pnum >= min) {
>
> I seem to recall past attempts to try and patch this function, which
> were then turned down, although I haven't scrubbed the archives for a
> quick URL to point to. I'm worried that there are more subtleties here
> than what you realize.

Hi Eric:
Do you mean this URL?
https://lists.gnu.org/archive/html/qemu-block/2017-01/msg00306.html

But I think the code is not consistent with qemu-img --help.
qemu-img --help
  '-S' indicates the consecutive number of bytes (defaults to 4k) that must
       contain only zeros for qemu-img to create a sparse image during
       conversion. If the number of bytes is 0, the source will not be
scanned for
       unallocated or zero sectors, and the destination image will always be
       fully allocated.

another reason:
if s->has_zero_init is 1(the qcow2 image which have backing_file), the empty
space at the beginning of the buffer still need write and invoke
blk_co_pwrite_zeroes.
and split a single write operation into two just because there is small empty
space at the beginning.

Thanks.

>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
>
Eric Blake April 25, 2017, 7:20 p.m. UTC | #3
On 04/24/2017 08:50 PM, 858585 jemmy wrote:
> On Mon, Apr 24, 2017 at 10:43 PM, Eric Blake <eblake@redhat.com> wrote:
>> On 04/23/2017 09:33 AM, jemmy858585@gmail.com wrote:
>>> From: Lidong Chen <lidongchen@tencent.com>
>>>
>>> is_allocated_sectors_min don't guarantee to contain the
>>> consecutive number of zero bytes. this patch fixes this bug.
>>
>> This message was sent without an 'In-Reply-To' header pointing to a 0/2
>> cover letter.  When sending a series, please always thread things to a
>> cover letter; you may find 'git config format.coverletter auto' to be
>> helpful.
> 
> Thanks for your kind advises.
> 

>>
>> I seem to recall past attempts to try and patch this function, which
>> were then turned down, although I haven't scrubbed the archives for a
>> quick URL to point to. I'm worried that there are more subtleties here
>> than what you realize.
> 
> Hi Eric:
> Do you mean this URL?
> https://lists.gnu.org/archive/html/qemu-block/2017-01/msg00306.html

Yes, that's probably the one.

> 
> But I think the code is not consistent with qemu-img --help.
> qemu-img --help
>   '-S' indicates the consecutive number of bytes (defaults to 4k) that must
>        contain only zeros for qemu-img to create a sparse image during
>        conversion. If the number of bytes is 0, the source will not be
> scanned for
>        unallocated or zero sectors, and the destination image will always be
>        fully allocated.

If you still think this patch is needed, the best way to convince me of
it is accompany your patch by a qemu-iotests enhancement that covers the
change in behavior (running the test pre-patch would show that we are
broken without the patch, and having the test ensures we can't later
regress).  That's a lot more work than the vague two lines of the commit
message body.
Max Reitz April 25, 2017, 8:01 p.m. UTC | #4
On 25.04.2017 03:50, 858585 jemmy wrote:
> On Mon, Apr 24, 2017 at 10:43 PM, Eric Blake <eblake@redhat.com> wrote:
>> On 04/23/2017 09:33 AM, jemmy858585@gmail.com wrote:
>>> From: Lidong Chen <lidongchen@tencent.com>
>>>
>>> is_allocated_sectors_min don't guarantee to contain the
>>> consecutive number of zero bytes. this patch fixes this bug.
>>
>> This message was sent without an 'In-Reply-To' header pointing to a 0/2
>> cover letter.  When sending a series, please always thread things to a
>> cover letter; you may find 'git config format.coverletter auto' to be
>> helpful.
> 
> Thanks for your kind advises.
> 
>>
>>>
>>> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
>>> ---
>>>  qemu-img.c | 11 ++++++-----
>>>  1 file changed, 6 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/qemu-img.c b/qemu-img.c
>>> index b220cf7..df6d165 100644
>>> --- a/qemu-img.c
>>> +++ b/qemu-img.c
>>> @@ -1060,9 +1060,9 @@ static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum)
>>>  }
>>>
>>>  /*
>>> - * Like is_allocated_sectors, but if the buffer starts with a used sector,
>>> - * up to 'min' consecutive sectors containing zeros are ignored. This avoids
>>> - * breaking up write requests for only small sparse areas.
>>> + * Like is_allocated_sectors, but up to 'min' consecutive sectors
>>> + * containing zeros are ignored. This avoids breaking up write requests
>>> + * for only small sparse areas.
>>>   */
>>>  static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
>>>      int min)
>>> @@ -1071,11 +1071,12 @@ static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
>>>      int num_checked, num_used;
>>>
>>>      if (n < min) {
>>> -        min = n;
>>> +        *pnum = n;
>>> +        return 1;
>>>      }
>>>
>>>      ret = is_allocated_sectors(buf, n, pnum);
>>> -    if (!ret) {
>>> +    if (!ret && *pnum >= min) {
>>
>> I seem to recall past attempts to try and patch this function, which
>> were then turned down, although I haven't scrubbed the archives for a
>> quick URL to point to. I'm worried that there are more subtleties here
>> than what you realize.
> 
> Hi Eric:
> Do you mean this URL?
> https://lists.gnu.org/archive/html/qemu-block/2017-01/msg00306.html
> 
> But I think the code is not consistent with qemu-img --help.
> qemu-img --help
>   '-S' indicates the consecutive number of bytes (defaults to 4k) that must
>        contain only zeros for qemu-img to create a sparse image during
>        conversion. If the number of bytes is 0, the source will not be
> scanned for
>        unallocated or zero sectors, and the destination image will always be
>        fully allocated.
> 
> another reason:
> if s->has_zero_init is 1(the qcow2 image which have backing_file), the empty
> space at the beginning of the buffer still need write and invoke
> blk_co_pwrite_zeroes.

Right, that is indeed a reason that we may want to behave differently in
these cases. But in other cases this is less efficient.

> and split a single write operation into two just because there is small empty
> space at the beginning.
And then there's the fact that, in my opinion, this is actually a
problem at qcow2 level. Some people are working on improving this (see
e.g. Berto's subcluster RFC, which would allow us to implement
bdrv_co_pwrite_zeroes() below cluster granularity).

All in all, I don't think you can generically circumvent this issue here
for all block drivers. The rule we'd have to implement here is: If some
part of a cluster (to be written) is zero and another is not, we should
write the whole cluster. If a whole cluster is zero, we should issue a
write-zeroes request. But that would mean to check where some buffer
passes a cluster boundary and so on, and on top of this all of it is
qcow2-specific; so there goes the genericity...

Max
diff mbox

Patch

diff --git a/qemu-img.c b/qemu-img.c
index b220cf7..df6d165 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1060,9 +1060,9 @@  static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum)
 }
 
 /*
- * Like is_allocated_sectors, but if the buffer starts with a used sector,
- * up to 'min' consecutive sectors containing zeros are ignored. This avoids
- * breaking up write requests for only small sparse areas.
+ * Like is_allocated_sectors, but up to 'min' consecutive sectors
+ * containing zeros are ignored. This avoids breaking up write requests
+ * for only small sparse areas.
  */
 static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
     int min)
@@ -1071,11 +1071,12 @@  static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
     int num_checked, num_used;
 
     if (n < min) {
-        min = n;
+        *pnum = n;
+        return 1;
     }
 
     ret = is_allocated_sectors(buf, n, pnum);
-    if (!ret) {
+    if (!ret && *pnum >= min) {
         return ret;
     }