diff mbox series

[4/5] block, migration: add bdrv_flush_vmstate helper

Message ID 20200616162035.29857-5-den@openvz.org (mailing list archive)
State New, archived
Headers show
Series block: seriously improve savevm performance | expand

Commit Message

Denis V. Lunev June 16, 2020, 4:20 p.m. UTC
Right now bdrv_fclose() is just calling bdrv_flush().

The problem is that migration code is working inefficently from black
layer terms and are frequently called for very small pieces of not
properly aligned data. Block layer is capable to work this way, but
this is very slow.

This patch is a preparation for the introduction of the intermediate
buffer at block driver state. It would be beneficial to separate
conventional bdrv_flush() from closing QEMU file from migration code.

The patch also forces bdrv_flush_vmstate() operation inside
synchronous blk_save_vmstate() operation. This helper is used from
qemu-io only.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Fam Zheng <fam@euphon.net>
CC: Juan Quintela <quintela@redhat.com>
CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
CC: Denis Plotnikov <dplotnikov@virtuozzo.com>
---
 block/block-backend.c |  6 +++++-
 block/io.c            | 15 +++++++++++++++
 include/block/block.h |  3 +++
 migration/savevm.c    |  3 +++
 4 files changed, 26 insertions(+), 1 deletion(-)

Comments

Eric Blake June 16, 2020, 9:11 p.m. UTC | #1
On 6/16/20 11:20 AM, Denis V. Lunev wrote:
> Right now bdrv_fclose() is just calling bdrv_flush().
> 
> The problem is that migration code is working inefficently from black

inefficiently, block

> layer terms and are frequently called for very small pieces of not
> properly aligned data. Block layer is capable to work this way, but

s/not properly aligned/unaligned/

> this is very slow.
> 
> This patch is a preparation for the introduction of the intermediate
> buffer at block driver state. It would be beneficial to separate
> conventional bdrv_flush() from closing QEMU file from migration code.
> 
> The patch also forces bdrv_flush_vmstate() operation inside
> synchronous blk_save_vmstate() operation. This helper is used from
> qemu-io only.
> 

> +++ b/block/block-backend.c
> @@ -2177,16 +2177,20 @@ int blk_truncate(BlockBackend *blk, int64_t offset, bool exact,
>   int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
>                        int64_t pos, int size)
>   {
> -    int ret;
> +    int ret, ret2;
>   
>       if (!blk_is_available(blk)) {
>           return -ENOMEDIUM;
>       }
>   
>       ret = bdrv_save_vmstate(blk_bs(blk), buf, pos, size);
> +    ret2 = bdrv_flush_vmstate(blk_bs(blk));

Do you really want to be attempting bdrv_flush_vmstate() even after 
bdrv_save_vmstate() failed?  Better might be...

>       if (ret < 0) {
>           return ret;
>       }

...attempting it here, at which point it looks like the only reason you 
need ret2 is to preserve ret long enough...

> +    if (ret2 < 0) {
> +        return ret2;
> +    }
>   
>       if (ret == size && !blk->enable_write_cache) {

...for this check.  But a quick look at bdrv_save_vmstate() says this 
check is dead: the function can only return negative error or exactly 
size, and we already filtered out negative error above.
Denis V. Lunev June 16, 2020, 9:29 p.m. UTC | #2
On 6/17/20 12:11 AM, Eric Blake wrote:
> On 6/16/20 11:20 AM, Denis V. Lunev wrote:
>> Right now bdrv_fclose() is just calling bdrv_flush().
>>
>> The problem is that migration code is working inefficently from black
>
> inefficiently, block
>
>> layer terms and are frequently called for very small pieces of not
>> properly aligned data. Block layer is capable to work this way, but
>
> s/not properly aligned/unaligned/
>
>> this is very slow.
>>
>> This patch is a preparation for the introduction of the intermediate
>> buffer at block driver state. It would be beneficial to separate
>> conventional bdrv_flush() from closing QEMU file from migration code.
>>
>> The patch also forces bdrv_flush_vmstate() operation inside
>> synchronous blk_save_vmstate() operation. This helper is used from
>> qemu-io only.
>>
>
>> +++ b/block/block-backend.c
>> @@ -2177,16 +2177,20 @@ int blk_truncate(BlockBackend *blk, int64_t
>> offset, bool exact,
>>   int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
>>                        int64_t pos, int size)
>>   {
>> -    int ret;
>> +    int ret, ret2;
>>         if (!blk_is_available(blk)) {
>>           return -ENOMEDIUM;
>>       }
>>         ret = bdrv_save_vmstate(blk_bs(blk), buf, pos, size);
>> +    ret2 = bdrv_flush_vmstate(blk_bs(blk));
>
> Do you really want to be attempting bdrv_flush_vmstate() even after
> bdrv_save_vmstate() failed?  Better might be...
>
>>       if (ret < 0) {
>>           return ret;
>>       }
>
> ...attempting it here, at which point it looks like the only reason
> you need ret2 is to preserve ret long enough...
no, I would like to be sure that intermediate state is always cleared at
the end.
In the next patch I am going to put intermediate buffer to
BlockDriverState and thus it has to be removed in any case.

May be flush would be a bad name... clean or finalize?

>
>> +    if (ret2 < 0) {
>> +        return ret2;
>> +    }
>>         if (ret == size && !blk->enable_write_cache) {
>
> ...for this check.  But a quick look at bdrv_save_vmstate() says this
> check is dead: the function can only return negative error or exactly
> size, and we already filtered out negative error above.
>
>
hmm. you are right. good point. this check is dead.
Worth to remove :) I'll add this as a separate patch.

Den
Vladimir Sementsov-Ogievskiy June 18, 2020, 10:02 a.m. UTC | #3
16.06.2020 19:20, Denis V. Lunev wrote:
> Right now bdrv_fclose() is just calling bdrv_flush().
> 
> The problem is that migration code is working inefficently from black
> layer terms and are frequently called for very small pieces of not
> properly aligned data. Block layer is capable to work this way, but
> this is very slow.
> 
> This patch is a preparation for the introduction of the intermediate
> buffer at block driver state. It would be beneficial to separate
> conventional bdrv_flush() from closing QEMU file from migration code.
> 
> The patch also forces bdrv_flush_vmstate() operation inside
> synchronous blk_save_vmstate() operation. This helper is used from
> qemu-io only.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Max Reitz <mreitz@redhat.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Fam Zheng <fam@euphon.net>
> CC: Juan Quintela <quintela@redhat.com>
> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> CC: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> CC: Denis Plotnikov <dplotnikov@virtuozzo.com>
> ---
>   block/block-backend.c |  6 +++++-
>   block/io.c            | 15 +++++++++++++++
>   include/block/block.h |  3 +++
>   migration/savevm.c    |  3 +++
>   4 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 6936b25c83..3afa0ff7d5 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -2177,16 +2177,20 @@ int blk_truncate(BlockBackend *blk, int64_t offset, bool exact,
>   int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
>                        int64_t pos, int size)
>   {
> -    int ret;
> +    int ret, ret2;
>   
>       if (!blk_is_available(blk)) {
>           return -ENOMEDIUM;
>       }
>   
>       ret = bdrv_save_vmstate(blk_bs(blk), buf, pos, size);
> +    ret2 = bdrv_flush_vmstate(blk_bs(blk));
>       if (ret < 0) {
>           return ret;
>       }
> +    if (ret2 < 0) {
> +        return ret2;
> +    }
>   
>       if (ret == size && !blk->enable_write_cache) {
>           ret = bdrv_flush(blk_bs(blk));
> diff --git a/block/io.c b/block/io.c
> index df8f2a98d4..8718df4ea8 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2724,6 +2724,21 @@ int bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
>       return bdrv_rw_vmstate(bs, qiov, pos, true);
>   }
>   
> +static int coroutine_fn bdrv_co_flush_vmstate(BlockDriverState *bs)
> +{
> +    return 0;
> +}
> +
> +static int coroutine_fn bdrv_flush_vmstate_co_entry(void *opaque)
> +{
> +    return bdrv_co_flush_vmstate(opaque);
> +}
> +
> +int bdrv_flush_vmstate(BlockDriverState *bs)
> +{
> +    return bdrv_run_co(bs, bdrv_flush_vmstate_co_entry, bs);
> +}
> +
>   /**************************************************************/
>   /* async I/Os */
>   
> diff --git a/include/block/block.h b/include/block/block.h
> index 25e299605e..e532bcffc8 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -572,6 +572,9 @@ int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
>   
>   int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
>                         int64_t pos, int size);
> +/* bdrv_flush_vmstate() is mandatory to commit vmstate changes if
> +   bdrv_save_vmstate() was ever called */
> +int bdrv_flush_vmstate(BlockDriverState *bs);
>   
>   void bdrv_img_create(const char *filename, const char *fmt,
>                        const char *base_filename, const char *base_fmt,
> diff --git a/migration/savevm.c b/migration/savevm.c
> index da3dead4e9..d984ce7aa1 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -150,6 +150,9 @@ static ssize_t block_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
>   
>   static int bdrv_fclose(void *opaque, Error **errp)
>   {
> +    int err = bdrv_flush_vmstate(opaque);
> +    if (err < 0)
> +        return err;

Qemu style wants braces anyway..

>       return bdrv_flush(opaque);
>   }

I'm not sure, that we want this on read path, neither, why we already do normal flush on read path. Anyway, it should be no-op on read path and won't hurt.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Vladimir Sementsov-Ogievskiy June 18, 2020, 10:03 a.m. UTC | #4
17.06.2020 00:29, Denis V. Lunev wrote:
>>>        if (ret < 0) {
>>>            return ret;
>>>        }
>> ...attempting it here, at which point it looks like the only reason
>> you need ret2 is to preserve ret long enough...
> no, I would like to be sure that intermediate state is always cleared at
> the end.
> In the next patch I am going to put intermediate buffer to
> BlockDriverState and thus it has to be removed in any case.
> 
> May be flush would be a bad name... clean or finalize?
> 

finalize sounds good for me
diff mbox series

Patch

diff --git a/block/block-backend.c b/block/block-backend.c
index 6936b25c83..3afa0ff7d5 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2177,16 +2177,20 @@  int blk_truncate(BlockBackend *blk, int64_t offset, bool exact,
 int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
                      int64_t pos, int size)
 {
-    int ret;
+    int ret, ret2;
 
     if (!blk_is_available(blk)) {
         return -ENOMEDIUM;
     }
 
     ret = bdrv_save_vmstate(blk_bs(blk), buf, pos, size);
+    ret2 = bdrv_flush_vmstate(blk_bs(blk));
     if (ret < 0) {
         return ret;
     }
+    if (ret2 < 0) {
+        return ret2;
+    }
 
     if (ret == size && !blk->enable_write_cache) {
         ret = bdrv_flush(blk_bs(blk));
diff --git a/block/io.c b/block/io.c
index df8f2a98d4..8718df4ea8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2724,6 +2724,21 @@  int bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos)
     return bdrv_rw_vmstate(bs, qiov, pos, true);
 }
 
+static int coroutine_fn bdrv_co_flush_vmstate(BlockDriverState *bs)
+{
+    return 0;
+}
+
+static int coroutine_fn bdrv_flush_vmstate_co_entry(void *opaque)
+{
+    return bdrv_co_flush_vmstate(opaque);
+}
+
+int bdrv_flush_vmstate(BlockDriverState *bs)
+{
+    return bdrv_run_co(bs, bdrv_flush_vmstate_co_entry, bs);
+}
+
 /**************************************************************/
 /* async I/Os */
 
diff --git a/include/block/block.h b/include/block/block.h
index 25e299605e..e532bcffc8 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -572,6 +572,9 @@  int bdrv_save_vmstate(BlockDriverState *bs, const uint8_t *buf,
 
 int bdrv_load_vmstate(BlockDriverState *bs, uint8_t *buf,
                       int64_t pos, int size);
+/* bdrv_flush_vmstate() is mandatory to commit vmstate changes if
+   bdrv_save_vmstate() was ever called */
+int bdrv_flush_vmstate(BlockDriverState *bs);
 
 void bdrv_img_create(const char *filename, const char *fmt,
                      const char *base_filename, const char *base_fmt,
diff --git a/migration/savevm.c b/migration/savevm.c
index da3dead4e9..d984ce7aa1 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -150,6 +150,9 @@  static ssize_t block_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
 
 static int bdrv_fclose(void *opaque, Error **errp)
 {
+    int err = bdrv_flush_vmstate(opaque);
+    if (err < 0)
+        return err;
     return bdrv_flush(opaque);
 }