[2/2] migration: move bdrv_invalidate_cache_all out of coroutine context

Message ID 1455012999-26858-2-git-send-email-den@openvz.org
State New

Commit Message

Denis V. Lunev Feb. 9, 2016, 10:16 a.m. UTC
There is a possibility to hit an assert in qcow2_get_specific_info that
s->qcow_version is undefined. This happens when the VM is starting from
a suspended state, i.e. it processes an incoming migration, and at the
same time 'info block' is called.

The problem is that qcow2_invalidate_cache closes the image and
memsets BDRVQcowState in the middle.

The patch moves processing of bdrv_invalidate_cache_all out of
coroutine context for postcopy migration to avoid that. This function
is called with the following stack:
  process_incoming_migration_co
  qemu_loadvm_state
  qemu_loadvm_state_main
  loadvm_process_command
  loadvm_postcopy_handle_run

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
---
Actually this patch is compile-tested only. I do not know how to start
post-copy migration. The previous patch was tested using 'virsh managedsave'.

 migration/savevm.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)
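
The deferral described in the commit message can be illustrated with a toy
model. This is only a sketch under stated assumptions: it does not use any
QEMU API, and every name in it (toy_bh_schedule, toy_main_loop_iteration,
toy_coroutine_enter, toy_handle_run, toy_invalidate_cache_all) is
hypothetical. It shows a handler that runs in "coroutine context" queueing
the heavy operation, which then executes later from the main loop instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names are QEMU APIs. */
typedef void ToyBHFunc(void *opaque);

typedef struct ToyBH {
    ToyBHFunc *cb;
    void *opaque;
} ToyBH;

#define TOY_MAX_BH 8

static ToyBH toy_queue[TOY_MAX_BH];
static size_t toy_queue_len;
static bool toy_in_coroutine;      /* true while a "coroutine" is running */
static int toy_invalidate_calls;   /* how often the heavy operation ran */
static bool toy_ran_in_coroutine;  /* context seen by the heavy operation */

/* Queue a callback; it does not run until the next main loop iteration. */
static void toy_bh_schedule(ToyBHFunc *cb, void *opaque)
{
    assert(toy_queue_len < TOY_MAX_BH);
    toy_queue[toy_queue_len].cb = cb;
    toy_queue[toy_queue_len].opaque = opaque;
    toy_queue_len++;
}

/* One main loop iteration: run the queued callbacks outside any coroutine. */
static void toy_main_loop_iteration(void)
{
    ToyBH pending[TOY_MAX_BH];
    size_t i, n = toy_queue_len;

    assert(!toy_in_coroutine);
    /* Snapshot and reset first, so callbacks may schedule new work. */
    memcpy(pending, toy_queue, n * sizeof(pending[0]));
    toy_queue_len = 0;
    for (i = 0; i < n; i++) {
        pending[i].cb(pending[i].opaque);
    }
}

/* Stand-in for the operation that must not run in coroutine context. */
static void toy_invalidate_cache_all(void *opaque)
{
    (void)opaque;
    toy_invalidate_calls++;
    toy_ran_in_coroutine = toy_in_coroutine;
}

/* Stand-in for the command handler: it only schedules the work. */
static int toy_handle_run(void)
{
    toy_bh_schedule(toy_invalidate_cache_all, NULL);
    return 0;
}

/* Run fn as if it were the body of the incoming-migration coroutine. */
static int toy_coroutine_enter(int (*fn)(void))
{
    int ret;

    toy_in_coroutine = true;
    ret = fn();
    toy_in_coroutine = false;
    return ret;
}
```

Calling toy_coroutine_enter(toy_handle_run) returns before the heavy
operation has run; it runs on the next toy_main_loop_iteration(), outside
the coroutine. The toy queue stores callbacks by value, so it sidesteps the
question of who frees a one-shot bottom half.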

Comments

Eric Blake Feb. 9, 2016, 3:28 p.m. UTC | #1
On 02/09/2016 03:16 AM, Denis V. Lunev wrote:
> There is a possibility to hit assert qcow2_get_specific_info that

s/hit assert/hit an assert in/


> s->qcow_version is undefined. This happens when VM in starting from
> suspended state, i.e. it processes incoming migration, and in the same
> time 'info block' is called.
> 
> The problem is that in the qcow2_invalidate_cache closes and the image
> and memsets BDRVQcowState in the middle.

Same grammar suggestions as in 1/2.

> 
> The patch moves out processing of bdrv_invalidate_cache_all out of

s/moves out/moves/

> coroutine context for postcopy migration to avoid that. This function
> is called with the following stack:
>   process_incoming_migration_co
>   qemu_loadvm_state
>   qemu_loadvm_state_main
>   loadvm_process_command
>   loadvm_postcopy_handle_run
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> CC: Amit Shah <amit.shah@redhat.com>
> ---
> Actually this patch is compile-tested only. I do not know how to start
> post-copy migration. Previous patch was tested using 'virst managedsave'

Not part of the patch, but s/virst/virsh/

Again, I'll let the migration experts do the actual review.

Patch

diff --git a/migration/savevm.c b/migration/savevm.c
index 94f2894..8415fd9 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1496,18 +1496,10 @@  static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
     return 0;
 }
 
-/* After all discards we can start running and asking for pages */
-static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
+static void loadvm_postcopy_handle_run_bh(void *opaque)
 {
-    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_RUNNING);
     Error *local_err = NULL;
 
-    trace_loadvm_postcopy_handle_run();
-    if (ps != POSTCOPY_INCOMING_LISTENING) {
-        error_report("CMD_POSTCOPY_RUN in wrong postcopy state (%d)", ps);
-        return -1;
-    }
-
     /* TODO we should move all of this lot into postcopy_ram.c or a shared code
      * in migration.c
      */
@@ -1519,7 +1511,6 @@  static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
     bdrv_invalidate_cache_all(&local_err);
     if (local_err) {
         error_report_err(local_err);
-        return -1;
     }
 
     trace_loadvm_postcopy_handle_run_cpu_sync();
@@ -1534,6 +1525,22 @@  static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
         /* leave it paused and let management decide when to start the CPU */
         runstate_set(RUN_STATE_PAUSED);
     }
+}
+
+/* After all discards we can start running and asking for pages */
+static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
+{
+    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_RUNNING);
+    QEMUBH *bh;
+
+    trace_loadvm_postcopy_handle_run();
+    if (ps != POSTCOPY_INCOMING_LISTENING) {
+        error_report("CMD_POSTCOPY_RUN in wrong postcopy state (%d)", ps);
+        return -1;
+    }
+
+    bh = qemu_bh_new(loadvm_postcopy_handle_run_bh, NULL);
+    qemu_bh_schedule(bh);
 
     /* We need to finish reading the stream from the package
      * and also stop reading anything more from the stream that loaded the