diff mbox

[1/1] mirror: double performance of the bulk stage if the disc is full

Message ID 1468316175-11522-1-git-send-email-den@openvz.org (mailing list archive)
State New, archived
Headers show

Commit Message

Denis V. Lunev July 12, 2016, 9:36 a.m. UTC
From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Mirror can do up to 16 in-flight requests, but actually on full copy
(the whole source disk is non-zero) in-flight is always 1. This happens
as the request is not limited in size: the data occupies maximum available
capacity of s->buf.

The patch limits the size of the request to some artificial constant
(1 Mb here), which is not that big or small. This effectively enables
back parallelism in mirror code as it was designed.

The result is important: the time to migrate 10 Gb disk is reduced from
~350 sec to 170 sec.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Fam Zheng <famz@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Jeff Cody <jcody@redhat.com>
CC: Eric Blake <eblake@redhat.com>
---
 block/mirror.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Paolo Bonzini July 12, 2016, 10:21 a.m. UTC | #1
On 12/07/2016 11:36, Denis V. Lunev wrote:
> From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> Mirror can do up to 16 in-flight requests, but actually on full copy
> (the whole source disk is non-zero) in-flight is always 1. This happens
> as the request is not limited in size: the data occupies maximum available
> capacity of s->buf.
> 
> The patch limits the size of the request to some artificial constant
> (1 Mb here), which is not that big or small. This effectively enables
> back parallelism in mirror code as it was designed.
> 
> The result is important: the time to migrate 10 Gb disk is reduced from
> ~350 sec to 170 sec.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Fam Zheng <famz@redhat.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Max Reitz <mreitz@redhat.com>
> CC: Jeff Cody <jcody@redhat.com>
> CC: Eric Blake <eblake@redhat.com>
> ---
>  block/mirror.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 4fe127e..53d3bcd 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -23,7 +23,9 @@
>  
>  #define SLICE_TIME    100000000ULL /* ns */
>  #define MAX_IN_FLIGHT 16
> -#define DEFAULT_MIRROR_BUF_SIZE   (10 << 20)
> +#define MAX_IO_SECTORS ((1 << 20) >> BDRV_SECTOR_BITS) /* 1 Mb */
> +#define DEFAULT_MIRROR_BUF_SIZE \
> +    (MAX_IN_FLIGHT * MAX_IO_SECTORS * BDRV_SECTOR_SIZE)
>  
>  /* The mirroring buffer is a list of granularity-sized chunks.
>   * Free chunks are organized in a list.
> @@ -387,7 +389,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
>                                            nb_chunks * sectors_per_chunk,
>                                            &io_sectors, &file);
>          if (ret < 0) {
> -            io_sectors = nb_chunks * sectors_per_chunk;
> +            io_sectors = MIN(nb_chunks * sectors_per_chunk, MAX_IO_SECTORS);
> +        } else if (ret & BDRV_BLOCK_DATA) {
> +            io_sectors = MIN(io_sectors, MAX_IO_SECTORS);
>          }
>  
>          io_sectors -= io_sectors % sectors_per_chunk;
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Thanks!
Kevin Wolf July 12, 2016, 1:51 p.m. UTC | #2
Am 12.07.2016 um 11:36 hat Denis V. Lunev geschrieben:
> From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> Mirror can do up to 16 in-flight requests, but actually on full copy
> (the whole source disk is non-zero) in-flight is always 1. This happens
> as the request is not limited in size: the data occupies maximum available
> capacity of s->buf.
> 
> The patch limits the size of the request to some artificial constant
> (1 Mb here), which is not that big or small. This effectively enables
> back parallelism in mirror code as it was designed.
> 
> The result is important: the time to migrate 10 Gb disk is reduced from
> ~350 sec to 170 sec.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Fam Zheng <famz@redhat.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Max Reitz <mreitz@redhat.com>
> CC: Jeff Cody <jcody@redhat.com>
> CC: Eric Blake <eblake@redhat.com>
> ---
>  block/mirror.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 4fe127e..53d3bcd 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -23,7 +23,9 @@
>  
>  #define SLICE_TIME    100000000ULL /* ns */
>  #define MAX_IN_FLIGHT 16
> -#define DEFAULT_MIRROR_BUF_SIZE   (10 << 20)
> +#define MAX_IO_SECTORS ((1 << 20) >> BDRV_SECTOR_BITS) /* 1 Mb */
> +#define DEFAULT_MIRROR_BUF_SIZE \
> +    (MAX_IN_FLIGHT * MAX_IO_SECTORS * BDRV_SECTOR_SIZE)
>  
>  /* The mirroring buffer is a list of granularity-sized chunks.
>   * Free chunks are organized in a list.
> @@ -387,7 +389,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
>                                            nb_chunks * sectors_per_chunk,
>                                            &io_sectors, &file);
>          if (ret < 0) {
> -            io_sectors = nb_chunks * sectors_per_chunk;
> +            io_sectors = MIN(nb_chunks * sectors_per_chunk, MAX_IO_SECTORS);
> +        } else if (ret & BDRV_BLOCK_DATA) {
> +            io_sectors = MIN(io_sectors, MAX_IO_SECTORS);
>          }

Would it make sense to consider the actual buffer size? If we have
s->buf_size / 16 > 1 MB, then this is wasting buffer space.

On the other hand, there is probably a minimum size where using a single
larger buffer performs better than two concurrent small ones. Which size
this is, is hard to tell, though. If we assume that 1 MB is a good
default (should we do some more testing to find the sweet spot?), we
could write this as:

  io_sectors = MIN(io_sectors,
                   MAX((s->buf_size / BDRV_SECTOR_SIZE) / MAX_IN_FLIGHT,
                       MAX_IO_SECTORS))

Kevin
Vladimir Sementsov-Ogievskiy July 13, 2016, 8 a.m. UTC | #3
On 12.07.2016 16:51, Kevin Wolf wrote:
> Am 12.07.2016 um 11:36 hat Denis V. Lunev geschrieben:
>> From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>
>> Mirror can do up to 16 in-flight requests, but actually on full copy
>> (the whole source disk is non-zero) in-flight is always 1. This happens
>> as the request is not limited in size: the data occupies maximum available
>> capacity of s->buf.
>>
>> The patch limits the size of the request to some artificial constant
>> (1 Mb here), which is not that big or small. This effectively enables
>> back parallelism in mirror code as it was designed.
>>
>> The result is important: the time to migrate 10 Gb disk is reduced from
>> ~350 sec to 170 sec.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Fam Zheng <famz@redhat.com>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Max Reitz <mreitz@redhat.com>
>> CC: Jeff Cody <jcody@redhat.com>
>> CC: Eric Blake <eblake@redhat.com>
>> ---
>>   block/mirror.c | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/mirror.c b/block/mirror.c
>> index 4fe127e..53d3bcd 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -23,7 +23,9 @@
>>   
>>   #define SLICE_TIME    100000000ULL /* ns */
>>   #define MAX_IN_FLIGHT 16
>> -#define DEFAULT_MIRROR_BUF_SIZE   (10 << 20)
>> +#define MAX_IO_SECTORS ((1 << 20) >> BDRV_SECTOR_BITS) /* 1 Mb */
>> +#define DEFAULT_MIRROR_BUF_SIZE \
>> +    (MAX_IN_FLIGHT * MAX_IO_SECTORS * BDRV_SECTOR_SIZE)
>>   
>>   /* The mirroring buffer is a list of granularity-sized chunks.
>>    * Free chunks are organized in a list.
>> @@ -387,7 +389,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
>>                                             nb_chunks * sectors_per_chunk,
>>                                             &io_sectors, &file);
>>           if (ret < 0) {
>> -            io_sectors = nb_chunks * sectors_per_chunk;
>> +            io_sectors = MIN(nb_chunks * sectors_per_chunk, MAX_IO_SECTORS);
>> +        } else if (ret & BDRV_BLOCK_DATA) {
>> +            io_sectors = MIN(io_sectors, MAX_IO_SECTORS);
>>           }
> Would it make sense to consider the actual buffer size? If we have
> s->buf_size / 16 > 1 MB, then this is wasting buffer space.
>
> On the other hand, there is probably a minimum size where using a single
> larger buffer performs better than two concurrent small ones. Which size
> this is, is hard to tell, though. If we assume that 1 MB is a good
> default (should we do some more testing to find the sweet spot?), we
> could write this as:
>
>    io_sectors = MIN(io_sectors,
>                     MAX((s->buf_size / BDRV_SECTOR_SIZE) / MAX_IN_FLIGHT,
>                         MAX_IO_SECTORS))
>
> Kevin

Ok, thanks, will resend.
diff mbox

Patch

diff --git a/block/mirror.c b/block/mirror.c
index 4fe127e..53d3bcd 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -23,7 +23,9 @@ 
 
 #define SLICE_TIME    100000000ULL /* ns */
 #define MAX_IN_FLIGHT 16
-#define DEFAULT_MIRROR_BUF_SIZE   (10 << 20)
+#define MAX_IO_SECTORS ((1 << 20) >> BDRV_SECTOR_BITS) /* 1 Mb */
+#define DEFAULT_MIRROR_BUF_SIZE \
+    (MAX_IN_FLIGHT * MAX_IO_SECTORS * BDRV_SECTOR_SIZE)
 
 /* The mirroring buffer is a list of granularity-sized chunks.
  * Free chunks are organized in a list.
@@ -387,7 +389,9 @@  static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
                                           nb_chunks * sectors_per_chunk,
                                           &io_sectors, &file);
         if (ret < 0) {
-            io_sectors = nb_chunks * sectors_per_chunk;
+            io_sectors = MIN(nb_chunks * sectors_per_chunk, MAX_IO_SECTORS);
+        } else if (ret & BDRV_BLOCK_DATA) {
+            io_sectors = MIN(io_sectors, MAX_IO_SECTORS);
         }
 
         io_sectors -= io_sectors % sectors_per_chunk;