[v2,for-2.11] migration, xen: Fix block image lock issue on live migration

Message ID 20171116151419.694-1-anthony.perard@citrix.com (mailing list archive)
State New, archived

Commit Message

Anthony PERARD Nov. 16, 2017, 3:14 p.m. UTC
When doing a live migration of a Xen guest with libxl, the images for
block devices are locked by the original QEMU process. This prevents
the QEMU at the destination from taking the lock, and the migration fails.

From QEMU's point of view, once the RAM of a domain is migrated, there are
two QMP commands, "stop" then "xen-save-devices-state", at which point a
new QEMU is spawned at the destination.

Release the locks in "xen-save-devices-state" so the destination can take
them, if it's a live migration.

This patch adds the "live" parameter to "xen-save-devices-state", which
defaults to true so that older versions of libxenlight can work with newer
versions of QEMU.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---
Changes in V2:
- add the live parameter

CC: Kevin Wolf <kwolf@redhat.com>

also CCing libxl maintainers:
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 migration/savevm.c  | 23 ++++++++++++++++++++++-
 qapi/migration.json |  6 +++++-
 2 files changed, 27 insertions(+), 2 deletions(-)
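
The behaviour described in the commit message can be modelled as a small standalone C sketch. This is not QEMU code: the function name should_release_image_locks() and its save_ret argument are hypothetical, and the sketch only mirrors the decision the patch makes (an omitted "live" argument counts as true, and locks are released only after a successful save of a guest that was already stopped).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the decision made in "xen-save-devices-state".
 * has_live / live follow the QAPI convention for an optional argument;
 * saved_vm_running is the run state sampled before the save;
 * save_ret is the result of writing the device state (negative on error).
 */
bool should_release_image_locks(bool has_live, bool live,
                                bool saved_vm_running, int save_ret)
{
    if (!has_live) {
        /* Older tool stacks never send "live": treat it as true. */
        live = true;
    }
    if (save_ret < 0) {
        /* Saving failed: keep the locks, the source still owns the images. */
        return false;
    }
    /* Release only for a live migration of a guest that was already stopped. */
    return live && !saved_vm_running;
}

/* Walk through the cases discussed in the commit message. */
void example_cases(void)
{
    /* Old libxl: "stop" was issued, no "live" argument, save succeeded. */
    assert(should_release_image_locks(false, false, false, 0));
    /* Caller explicitly says this is not a live migration. */
    assert(!should_release_image_locks(true, false, false, 0));
    /* Guest was still running when the command arrived. */
    assert(!should_release_image_locks(true, true, true, 0));
    /* Writing the device state failed. */
    assert(!should_release_image_locks(true, true, false, -1));
}
```

In the patch itself the equivalent condition is `live && !saved_vm_running`, evaluated in the branch taken when the save did not fail.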

Comments

Kevin Wolf Nov. 17, 2017, 12:41 p.m. UTC | #1
On 16.11.2017 at 16:14, Anthony PERARD wrote:
> When doing a live migration of a Xen guest with libxl, the images for
> block devices are locked by the original QEMU process. This prevents
> the QEMU at the destination from taking the lock, and the migration fails.
> 
> From QEMU's point of view, once the RAM of a domain is migrated, there are
> two QMP commands, "stop" then "xen-save-devices-state", at which point a
> new QEMU is spawned at the destination.
> 
> Release the locks in "xen-save-devices-state" so the destination can take
> them, if it's a live migration.
> 
> This patch adds the "live" parameter to "xen-save-devices-state", which
> defaults to true so that older versions of libxenlight can work with newer
> versions of QEMU.
> 
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> ---
> Changes in V2:
> - add the live parameter
> 
> CC: Kevin Wolf <kwolf@redhat.com>

Makes sense to me.

Kevin
Dr. David Alan Gilbert Nov. 20, 2017, 2:52 p.m. UTC | #2
* Anthony PERARD (anthony.perard@citrix.com) wrote:
> When doing a live migration of a Xen guest with libxl, the images for
> block devices are locked by the original QEMU process. This prevents
> the QEMU at the destination from taking the lock, and the migration fails.
> 
> From QEMU's point of view, once the RAM of a domain is migrated, there are
> two QMP commands, "stop" then "xen-save-devices-state", at which point a
> new QEMU is spawned at the destination.
> 
> Release the locks in "xen-save-devices-state" so the destination can take
> them, if it's a live migration.
> 
> This patch adds the "live" parameter to "xen-save-devices-state", which
> defaults to true so that older versions of libxenlight can work with newer
> versions of QEMU.
> 
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
> Changes in V2:
> - add the live parameter
> 
> CC: Kevin Wolf <kwolf@redhat.com>
> 
> also CCing libxl maintainers:
> CC: Ian Jackson <ian.jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  migration/savevm.c  | 23 ++++++++++++++++++++++-
>  qapi/migration.json |  6 +++++-
>  2 files changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 4a88228614..7bc4e23e65 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2242,13 +2242,20 @@ int save_snapshot(const char *name, Error **errp)
>      return ret;
>  }
>  
> -void qmp_xen_save_devices_state(const char *filename, Error **errp)
> +void qmp_xen_save_devices_state(const char *filename, bool has_live, bool live,
> +                                Error **errp)
>  {
>      QEMUFile *f;
>      QIOChannelFile *ioc;
>      int saved_vm_running;
>      int ret;
>  
> +    if (!has_live) {
> +        /* live defaults to true so that old versions of the Xen tool stack can
> +         * still complete a live migration */
> +        live = true;
> +    }
> +
>      saved_vm_running = runstate_is_running();
>      vm_stop(RUN_STATE_SAVE_VM);
>      global_state_store_running();
> @@ -2263,6 +2270,20 @@ void qmp_xen_save_devices_state(const char *filename, Error **errp)
>      qemu_fclose(f);
>      if (ret < 0) {
>          error_setg(errp, QERR_IO_ERROR);
> +    } else {
> +        /* libxl calls the QMP command "stop" before calling
> +         * "xen-save-devices-state" and in case of migration failure, libxl
> +         * would call "cont".
> +         * So call bdrv_inactivate_all (release locks) here to let the other
> +         * side of the migration take control of the images.
> +         */
> +        if (live && !saved_vm_running) {
> +            ret = bdrv_inactivate_all();
> +            if (ret) {
> +                error_setg(errp, "%s: bdrv_inactivate_all() failed (%d)",
> +                           __func__, ret);
> +            }
> +        }
>      }
>  
>   the_end:
> diff --git a/qapi/migration.json b/qapi/migration.json
> index bbc4671ded..03f57c9616 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -1075,6 +1075,9 @@
>  # data. See xen-save-devices-state.txt for a description of the binary
>  # format.
>  #
> +# @live: Optional argument to ask QEMU to treat this command as part of a live
> +# migration. Defaults to true. (since 2.11)
> +#
>  # Returns: Nothing on success
>  #
>  # Since: 1.1
> @@ -1086,7 +1089,8 @@
>  # <- { "return": {} }
>  #
>  ##
> -{ 'command': 'xen-save-devices-state', 'data': {'filename': 'str'} }
> +{ 'command': 'xen-save-devices-state',
> +  'data': {'filename': 'str', '*live':'bool' } }
>  
>  ##
>  # @xen-set-replication:
> -- 
> Anthony PERARD
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Juan Quintela Nov. 21, 2017, 6:41 p.m. UTC | #3
Anthony PERARD <anthony.perard@citrix.com> wrote:
> When doing a live migration of a Xen guest with libxl, the images for
> block devices are locked by the original QEMU process. This prevents
> the QEMU at the destination from taking the lock, and the migration fails.
>
> From QEMU's point of view, once the RAM of a domain is migrated, there are
> two QMP commands, "stop" then "xen-save-devices-state", at which point a
> new QEMU is spawned at the destination.
>
> Release the locks in "xen-save-devices-state" so the destination can take
> them, if it's a live migration.
>
> This patch adds the "live" parameter to "xen-save-devices-state", which
> defaults to true so that older versions of libxenlight can work with newer
> versions of QEMU.
>
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> ---
> Changes in V2:
> - add the live parameter
>
> CC: Kevin Wolf <kwolf@redhat.com>
>
> also CCing libxl maintainers:
> CC: Ian Jackson <ian.jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>
queued
Patch

diff --git a/migration/savevm.c b/migration/savevm.c
index 4a88228614..7bc4e23e65 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2242,13 +2242,20 @@  int save_snapshot(const char *name, Error **errp)
     return ret;
 }
 
-void qmp_xen_save_devices_state(const char *filename, Error **errp)
+void qmp_xen_save_devices_state(const char *filename, bool has_live, bool live,
+                                Error **errp)
 {
     QEMUFile *f;
     QIOChannelFile *ioc;
     int saved_vm_running;
     int ret;
 
+    if (!has_live) {
+        /* live defaults to true so that old versions of the Xen tool stack can
+         * still complete a live migration */
+        live = true;
+    }
+
     saved_vm_running = runstate_is_running();
     vm_stop(RUN_STATE_SAVE_VM);
     global_state_store_running();
@@ -2263,6 +2270,20 @@  void qmp_xen_save_devices_state(const char *filename, Error **errp)
     qemu_fclose(f);
     if (ret < 0) {
         error_setg(errp, QERR_IO_ERROR);
+    } else {
+        /* libxl calls the QMP command "stop" before calling
+         * "xen-save-devices-state" and in case of migration failure, libxl
+         * would call "cont".
+         * So call bdrv_inactivate_all (release locks) here to let the other
+         * side of the migration take control of the images.
+         */
+        if (live && !saved_vm_running) {
+            ret = bdrv_inactivate_all();
+            if (ret) {
+                error_setg(errp, "%s: bdrv_inactivate_all() failed (%d)",
+                           __func__, ret);
+            }
+        }
     }
 
  the_end:
diff --git a/qapi/migration.json b/qapi/migration.json
index bbc4671ded..03f57c9616 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1075,6 +1075,9 @@ 
 # data. See xen-save-devices-state.txt for a description of the binary
 # format.
 #
+# @live: Optional argument to ask QEMU to treat this command as part of a live
+# migration. Defaults to true. (since 2.11)
+#
 # Returns: Nothing on success
 #
 # Since: 1.1
@@ -1086,7 +1089,8 @@ 
 # <- { "return": {} }
 #
 ##
-{ 'command': 'xen-save-devices-state', 'data': {'filename': 'str'} }
+{ 'command': 'xen-save-devices-state',
+  'data': {'filename': 'str', '*live':'bool' } }
 
 ##
 # @xen-set-replication:
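
To illustrate the schema change from the caller's side, here is a minimal C sketch of how a tool stack might build the QMP message with or without the new optional argument. The helper format_save_cmd() and the path "/tmp/save" are hypothetical and are not taken from libxl. The sketch also assumes the filename contains no characters that need JSON escaping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write a "xen-save-devices-state" QMP command into buf.
 * When has_live is false the "live" member is omitted entirely, which is what
 * a tool stack predating this patch would send.
 * Returns the snprintf() result (length that was, or would have been, written).
 */
int format_save_cmd(char *buf, size_t len, const char *filename,
                    bool has_live, bool live)
{
    assert(buf != NULL && filename != NULL);

    if (has_live) {
        return snprintf(buf, len,
                        "{ \"execute\": \"xen-save-devices-state\", "
                        "\"arguments\": { \"filename\": \"%s\", \"live\": %s } }",
                        filename, live ? "true" : "false");
    }
    return snprintf(buf, len,
                    "{ \"execute\": \"xen-save-devices-state\", "
                    "\"arguments\": { \"filename\": \"%s\" } }",
                    filename);
}

/* Build both variants: a non-live save, and the legacy form without "live". */
void example_commands(void)
{
    char buf[256];

    int n = format_save_cmd(buf, sizeof(buf), "/tmp/save", true, false);
    assert(n > 0 && (size_t)n < sizeof(buf));
    assert(strstr(buf, "\"live\": false") != NULL);

    n = format_save_cmd(buf, sizeof(buf), "/tmp/save", false, false);
    assert(n > 0 && (size_t)n < sizeof(buf));
    assert(strstr(buf, "\"live\"") == NULL);
}
```

Because "live" is declared as '*live' in the schema, both messages are accepted. With this patch, the second form is treated as if "live" were true.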