diff mbox

[08/12] migration: do not flush_compressed_data at the end of each iteration

Message ID 20180604095520.8563-9-xiaoguangrong@tencent.com (mailing list archive)
State New, archived
Headers show

Commit Message

Xiao Guangrong June 4, 2018, 9:55 a.m. UTC
From: Xiao Guangrong <xiaoguangrong@tencent.com>

flush_compressed_data() needs to wait for all compression threads to
finish their work; after that, all threads are idle until the migration
feeds new requests to them. Reducing the number of calls to it improves
throughput and uses CPU resources more effectively.

We do not need to flush all threads at the end of each iteration; the
data can be kept locally until the memory block changes.

Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
---
 migration/ram.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Dr. David Alan Gilbert July 13, 2018, 6:01 p.m. UTC | #1
* guangrong.xiao@gmail.com (guangrong.xiao@gmail.com) wrote:
> From: Xiao Guangrong <xiaoguangrong@tencent.com>
> 
> flush_compressed_data() needs to wait all compression threads to
> finish their work, after that all threads are free until the
> migration feed new request to them, reducing its call can improve
> the throughput and use CPU resource more effectively
> 
> We do not need to flush all threads at the end of iteration, the
> data can be kept locally until the memory block is changed
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
> ---
>  migration/ram.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index f9a8646520..0a38c1c61e 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1994,6 +1994,7 @@ static void ram_save_cleanup(void *opaque)
>      }
>  
>      xbzrle_cleanup();
> +    flush_compressed_data(*rsp);
>      compress_threads_save_cleanup();
>      ram_state_cleanup(rsp);
>  }

I'm not sure why this change corresponds to the other removal.
We should already have sent all remaining data in ram_save_complete()'s
call to flush_compressed_data - so what is this one for?

> @@ -2690,7 +2691,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>          }
>          i++;
>      }
> -    flush_compressed_data(rs);
>      rcu_read_unlock();

Hmm - are we sure there are no other cases that depend on the ordering of
all of the compressed data being sent out between threads?
I think the one I'd most worry about is the case where:

  iteration one:
     thread 1: Save compressed page 'n'

  iteration two:
     thread 2: Save compressed page 'n'

What guarantees that the version of page 'n' from thread 2 is the one
the destination ends up with, without this flush?

Dave

>      /*
> -- 
> 2.14.4
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Xiao Guangrong July 18, 2018, 8:44 a.m. UTC | #2
On 07/14/2018 02:01 AM, Dr. David Alan Gilbert wrote:
> * guangrong.xiao@gmail.com (guangrong.xiao@gmail.com) wrote:
>> From: Xiao Guangrong <xiaoguangrong@tencent.com>
>>
>> flush_compressed_data() needs to wait all compression threads to
>> finish their work, after that all threads are free until the
>> migration feed new request to them, reducing its call can improve
>> the throughput and use CPU resource more effectively
>>
>> We do not need to flush all threads at the end of iteration, the
>> data can be kept locally until the memory block is changed
>>
>> Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
>> ---
>>   migration/ram.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index f9a8646520..0a38c1c61e 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1994,6 +1994,7 @@ static void ram_save_cleanup(void *opaque)
>>       }
>>   
>>       xbzrle_cleanup();
>> +    flush_compressed_data(*rsp);
>>       compress_threads_save_cleanup();
>>       ram_state_cleanup(rsp);
>>   }
> 
> I'm not sure why this change corresponds to the other removal.
> We should already have sent all remaining data in ram_save_complete()'s
> call to flush_compressed_data - so what is this one for?
> 

This is for the error path: if any error occurs during live migration, there is
no chance to call ram_save_complete(). After moving to the lockless
multithreaded model, we assert that all requests have been handled before
destroying the worker threads.

>> @@ -2690,7 +2691,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>>           }
>>           i++;
>>       }
>> -    flush_compressed_data(rs);
>>       rcu_read_unlock();
> 
> Hmm - are we sure there's no other cases that depend on ordering of all
> of the compressed data being sent out between threads?

Err, I tried to think it over carefully, but still missed the case you mentioned. :(
Anyway, calling flush_compressed_data() every 50ms hurts us too much.

> I think the one I'd most worry about is the case where:
> 
>    iteration one:
>       thread 1: Save compressed page 'n'
> 
>    iteration two:
>       thread 2: Save compressed page 'n'
> 
> What guarantees that the version of page 'n'
> from thread 2 reaches the destination first without
> this flush?
> 

Hmm... you are right, I missed this case. So how about avoiding it by doing this
check at the beginning of ram_save_iterate():

if (ram_counters.dirty_sync_count != rs->dirty_sync_count) {
     flush_compressed_data(rs);
     rs->dirty_sync_count = ram_counters.dirty_sync_count;
}

Patch

diff --git a/migration/ram.c b/migration/ram.c
index f9a8646520..0a38c1c61e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1994,6 +1994,7 @@  static void ram_save_cleanup(void *opaque)
     }
 
     xbzrle_cleanup();
+    flush_compressed_data(*rsp);
     compress_threads_save_cleanup();
     ram_state_cleanup(rsp);
 }
@@ -2690,7 +2691,6 @@  static int ram_save_iterate(QEMUFile *f, void *opaque)
         }
         i++;
     }
-    flush_compressed_data(rs);
     rcu_read_unlock();
 
     /*