[1/1] mirror: restore offset after zeroing out the image
diff mbox

Message ID 1485771730-19849-1-git-send-email-den@openvz.org
State New
Headers show

Commit Message

Denis V. Lunev Jan. 30, 2017, 10:22 a.m. UTC
If explicit zeroing out before mirroring is required for the target image,
it moves the block job offset counter to EOF, then offset and len counters
count the image size twice.

There is no harm but confusing stats (e.g. for 1G image the completion
counter starts from 1G and increases to 2G)

The patch fixed that problem by resetting the offset counter.

Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
---
 block/mirror.c             | 3 +++
 tests/qemu-iotests/094.out | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

Comments

Eric Blake Jan. 30, 2017, 5:16 p.m. UTC | #1
On 01/30/2017 04:22 AM, Denis V. Lunev wrote:
> If explicit zeroing out before mirroring is required for the target image,
> it moves the block job offset counter to EOF, then offset and len counters
> count the image size twice.
> 
> There is no harm but confusing stats (e.g. for 1G image the completion
> counter starts from 1G and increases to 2G)
> 
> The patch fixed that problem by resetting the offset counter.

Counters are explicitly documented NOT tied to disk length; they are
merely estimates of proportional completion.  I'm not sure if this makes
the numbers jump backwards from the observer's viewpoint, but if you can
ever spot 1g/1g right before rewinding to 0g/1g (where pre-patch could
see 1g/2g), then that is a reason to not take this patch.  On the other
hand, your argument that the pre-patch behavior progressing towards 2g
has a very fast progression from 0-1g/2g, and then a much slower
1g-2g/2g, which makes the estimate of percent completion skewed, while a
newer progression of 0-1g/1g is more realistic, may have some merit.

I'm not sold on this patch yet, but stronger arguments in the commit
message may sway me.

> +++ b/tests/qemu-iotests/094.out
> @@ -3,9 +3,9 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>  Formatting 'TEST_DIR/source.IMGFMT', fmt=IMGFMT size=67108864
>  {"return": {}}
>  {"return": {}}
> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 67108864, "offset": 67108864, "speed": 0, "type": "mirror"}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
>  {"return": {}}

This part of the change is scary - a ready event showing 0/0 HAS been
known to confuse libvirt in the past.  Qemu should NEVER advertise a
ready event with 0/0, it should at least be 1/1 (because of the number
of clients that have workarounds to deal with older qemu behavior on 0/0
and which might misbehave if we ever issue that again).

> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 67108864, "offset": 67108864, "speed": 0, "type": "mirror"}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}

So NACK to the patch as currently written, but not necessarily to the
idea if you can give better progress numbers and never reach the state
of BLOCK_JOB_READY and BLOCK_JOB_COMPLETED with a 0/0 length/offset.
Denis V. Lunev Jan. 30, 2017, 5:33 p.m. UTC | #2
On 01/30/2017 08:16 PM, Eric Blake wrote:
> On 01/30/2017 04:22 AM, Denis V. Lunev wrote:
>> If explicit zeroing out before mirroring is required for the target image,
>> it moves the block job offset counter to EOF, then offset and len counters
>> count the image size twice.
>>
>> There is no harm but confusing stats (e.g. for 1G image the completion
>> counter starts from 1G and increases to 2G)
>>
>> The patch fixed that problem by resetting the offset counter.
> Counters are explicitly documented NOT tied to disk length; they are
> merely estimates of proportional completion.  I'm not sure if this makes
> the numbers jump backwards from the observer's viewpoint, but if you can
> ever spot 1g/1g right before rewinding to 0g/1g (where pre-patch could
> see 1g/2g), then that is a reason to not take this patch.  On the other
> hand, your argument that the pre-patch behavior progressing towards 2g
> has a very fast progression from 0-1g/2g, and then a much slower
> 1g-2g/2g, which makes the estimate of percent completion skewed, while a
> newer progression of 0-1g/1g is more realistic, may have some merit.
>
> I'm not sold on this patch yet, but stronger arguments in the commit
> message may sway me.
>
>> +++ b/tests/qemu-iotests/094.out
>> @@ -3,9 +3,9 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>>  Formatting 'TEST_DIR/source.IMGFMT', fmt=IMGFMT size=67108864
>>  {"return": {}}
>>  {"return": {}}
>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 67108864, "offset": 67108864, "speed": 0, "type": "mirror"}}
>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
>>  {"return": {}}
> This part of the change is scary - a ready event showing 0/0 HAS been
> known to confuse libvirt in the past.  Qemu should NEVER advertise a
> ready event with 0/0, it should at least be 1/1 (because of the number
> of clients that have workarounds to deal with older qemu behavior on 0/0
> and which might misbehave if we ever issue that again).
>
>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 67108864, "offset": 67108864, "speed": 0, "type": "mirror"}}
>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
> So NACK to the patch as currently written, but not necessarily to the
> idea if you can give better progress numbers and never reach the state
> of BLOCK_JOB_READY and BLOCK_JOB_COMPLETED with a 0/0 length/offset.
>
ok. fair enough. Thank you for the review.

Will it be better to (somehow) skip progressing below using
some condition during mirror_dirty_init() stage?

static void mirror_iteration_done(MirrorOp *op, int ret)
{
    .....

    if (ret >= 0) {
        if (s->cow_bitmap) {
            bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
        }
        s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE;
<---- specifically this progressing
    }

Den
Anton Nefedov Jan. 30, 2017, 6:10 p.m. UTC | #3
On 01/30/2017 08:33 PM, Denis V. Lunev wrote:
> On 01/30/2017 08:16 PM, Eric Blake wrote:
>> On 01/30/2017 04:22 AM, Denis V. Lunev wrote:
>>> If explicit zeroing out before mirroring is required for the target image,
>>> it moves the block job offset counter to EOF, then offset and len counters
>>> count the image size twice.
>>>
>>> There is no harm but confusing stats (e.g. for 1G image the completion
>>> counter starts from 1G and increases to 2G)
>>>
>>> The patch fixed that problem by resetting the offset counter.
>> Counters are explicitly documented NOT tied to disk length; they are
>> merely estimates of proportional completion.  I'm not sure if this makes
>> the numbers jump backwards from the observer's viewpoint, but if you can
>> ever spot 1g/1g right before rewinding to 0g/1g (where pre-patch could
>> see 1g/2g), then that is a reason to not take this patch.

FWIW you'll actually see 1g/0g first (if you happen to catch that tiny 
phase :)

>> On the other
>> hand, your argument that the pre-patch behavior progressing towards 2g
>> has a very fast progression from 0-1g/2g, and then a much slower
>> 1g-2g/2g, which makes the estimate of percent completion skewed, while a
>> newer progression of 0-1g/1g is more realistic, may have some merit.
>>
>> I'm not sold on this patch yet, but stronger arguments in the commit
>> message may sway me.
>>
>>> +++ b/tests/qemu-iotests/094.out
>>> @@ -3,9 +3,9 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>>>  Formatting 'TEST_DIR/source.IMGFMT', fmt=IMGFMT size=67108864
>>>  {"return": {}}
>>>  {"return": {}}
>>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 67108864, "offset": 67108864, "speed": 0, "type": "mirror"}}
>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
>>>  {"return": {}}
>> This part of the change is scary - a ready event showing 0/0 HAS been
>> known to confuse libvirt in the past.  Qemu should NEVER advertise a
>> ready event with 0/0, it should at least be 1/1 (because of the number
>> of clients that have workarounds to deal with older qemu behavior on 0/0
>> and which might misbehave if we ever issue that again).

I think this 094.out modification is not needed and actually breaks the 
test; apologies

>>
>>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 67108864, "offset": 67108864, "speed": 0, "type": "mirror"}}
>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
>> So NACK to the patch as currently written, but not necessarily to the
>> idea if you can give better progress numbers and never reach the state
>> of BLOCK_JOB_READY and BLOCK_JOB_COMPLETED with a 0/0 length/offset.
>>
> ok. fair enough. Thank you for the review.
>
> Will it be better to (somehow) skip progressing below using
> some condition during mirror_dirty_init() stage?
>
> static void mirror_iteration_done(MirrorOp *op, int ret)
> {
>     .....
>
>     if (ret >= 0) {
>         if (s->cow_bitmap) {
>             bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
>         }
>         s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE;
> <---- specifically this progressing
>     }
>
Isn't this going to look quite too invasive for such cause

Patch
diff mbox

diff --git a/block/mirror.c b/block/mirror.c
index 301ba92..94915e8 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -587,6 +587,9 @@  static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
         }
 
         mirror_wait_for_all_io(s);
+
+        /* offset has moved to EOF, restore it */
+        s->common.offset = 0;
     }
 
     /* First part, loop on the sectors and initialize the dirty bitmap.  */
diff --git a/tests/qemu-iotests/094.out b/tests/qemu-iotests/094.out
index b66dc07..522a20c 100644
--- a/tests/qemu-iotests/094.out
+++ b/tests/qemu-iotests/094.out
@@ -3,9 +3,9 @@  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
 Formatting 'TEST_DIR/source.IMGFMT', fmt=IMGFMT size=67108864
 {"return": {}}
 {"return": {}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 67108864, "offset": 67108864, "speed": 0, "type": "mirror"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "src", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
 {"return": {}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 67108864, "offset": 67108864, "speed": 0, "type": "mirror"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 0, "offset": 0, "speed": 0, "type": "mirror"}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN"}
 *** done