From: Olaf Hering
To: xen-devel@lists.xenproject.org
Cc: Anthony PERARD, Wei Liu, Olaf Hering, Ian Jackson, Roger Pau Monné
Date: Tue, 14 May 2019 10:05:58 +0200
Message-Id: <20190514080558.14540-1-olaf@aepfle.de>
Subject: [Xen-devel] [PATCH v5] libxl: fix migration of PV and PVH domUs with and without qemu
X-Patchwork-Submitter: Olaf Hering
X-Patchwork-Id: 10942429

If a domU has a qemu-xen instance attached, it is required to call
qemu's "xen-save-devices-state" method.
Without it, the receiving side of a PV or PVH migration may be unable
to lock the image:

xen be: qdisk-51712: xen be: qdisk-51712: error: Failed to get "write" lock
error: Failed to get "write" lock
xen be: qdisk-51712: xen be: qdisk-51712: initialise() failed
initialise() failed

To fix this bug, libxl__domain_suspend_device_model() and
libxl__domain_resume_device_model() have to be called not only for HVM,
but also if the active device_model is QEMU_XEN.

Unfortunately, libxl__domain_build_info_setdefault() used to hardcode
b_info->device_model_version to QEMU_XEN if it did not know any better.
As a result, libxl__device_model_version_running() returns incorrect
values. This breaks domUs without a device_model: libxl__qmp_stop()
would wait 10 seconds in qmp_open() for a qemu that will never appear.
During this long timeframe the domU remains paused on the sending side,
and network connections may be dropped as a result.

Once this bug is fixed as well, by just removing the assumption that
every domU has a QEMU_XEN, there is no code left that actually
initialises b_info->device_model_version.

There is a helper function libxl__need_xenpv_qemu(), which is used in
various places to decide if a device_model has to be spawned. This
function cannot be used as is just to fill device_model_version,
because store_libxl_entry() was already called earlier.

Introduce LIBXL_DEVICE_MODEL_VERSION_NONE for PV and PVH domUs that
have no need for a device_model, to make the state explicit. Indicate
this new state via a LIBXL_HAVE macro in libxl.h.

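The suspend/resume condition described above can be sketched in isolation.
This is a minimal stand-alone illustration, not libxl code: the enums only
mirror the libxl names, and needs_dm_save() is a hypothetical helper
standing in for the check this patch adds to the suspend and resume paths.

```c
#include <stdbool.h>

/* Stand-ins that mirror the libxl names; these are not the libxl types. */
enum sketch_domain_type {
    SKETCH_DOMAIN_TYPE_HVM,
    SKETCH_DOMAIN_TYPE_PV,
    SKETCH_DOMAIN_TYPE_PVH
};

enum sketch_dm_version {
    SKETCH_DM_UNKNOWN = 0,
    SKETCH_DM_QEMU_XEN_TRADITIONAL = 1,
    SKETCH_DM_QEMU_XEN = 2,
    SKETCH_DM_NONE = 3
};

/*
 * Hypothetical helper: true if the device model state has to be saved
 * on suspend and restored on resume. HVM always qualifies; PV and PVH
 * qualify only if the running device model is QEMU_XEN.
 */
bool needs_dm_save(enum sketch_domain_type type,
                   enum sketch_dm_version running)
{
    return type == SKETCH_DOMAIN_TYPE_HVM || running == SKETCH_DM_QEMU_XEN;
}
```

With this shape, a PV domU with qdisk backends (QEMU_XEN running) takes the
device model path, while a PVH domU with NONE skips it and never waits for
a qemu that does not exist.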
v05:
- move initialization of device_model_version to extra commit
- return error from libxl__need_xenpv_qemu
- add LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE
v04:
- make sure device_model_stubdomain is initialized
v03:
- rearrange code to make sure device_model_version is initialized before
  store_libxl_entry() is called
v02:
- update wording in a comment
- remove stale goto in domcreate_launch_dm
- initialize ret in libxl__need_xenpv_qemu

Signed-off-by: Olaf Hering
Cc: Roger Pau Monné
Cc: Anthony PERARD
Reviewed-by: Roger Pau Monné
---
 tools/libxl/libxl.h             |  7 +++++++
 tools/libxl/libxl_create.c      | 17 ++++++++++++++---
 tools/libxl/libxl_dom_suspend.c |  8 ++++++--
 tools/libxl/libxl_types.idl     |  1 +
 4 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 482499a6c0..e0f6916b66 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1182,6 +1182,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, const libxl_mac *src);
  */
 #define LIBXL_HAVE_PVCALLS 1
 
+/*
+ * LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE
+ *
+ * If this is defined, libxl will only run a device-model if required.
+ */
+#define LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE 1
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 3f0431cc84..64336b0d29 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -47,9 +47,20 @@ int libxl__domain_set_device_model(libxl__gc *gc, libxl_domain_config *d_config)
         }
         break;
     default:
-        b_info->device_model_version =
-            LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
-        break;
+        ret = libxl__need_xenpv_qemu(gc, d_config);
+        switch (ret) {
+        case 1:
+            d_config->b_info.device_model_version =
+                LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN;
+            break;
+        case 0:
+            d_config->b_info.device_model_version =
+                LIBXL_DEVICE_MODEL_VERSION_NONE;
+            break;
+        default:
+            LOGE(ERROR, "Unable to determine QEMU requisite");
+            return ret;
+        }
     }
 
     if (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index d1af3a6573..c492fe5dd1 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -379,7 +379,9 @@ static void domain_suspend_common_guest_suspended(libxl__egc *egc,
     libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
     libxl__ev_time_deregister(gc, &dsps->guest_timeout);
 
-    if (dsps->type == LIBXL_DOMAIN_TYPE_HVM) {
+    if (dsps->type == LIBXL_DOMAIN_TYPE_HVM ||
+        libxl__device_model_version_running(gc, dsps->domid) ==
+        LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
         dsps->callback_device_model_done = domain_suspend_common_done;
         libxl__domain_suspend_device_model(egc, dsps); /* must be last */
         return;
@@ -459,7 +461,9 @@ int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
         goto out;
     }
 
-    if (type == LIBXL_DOMAIN_TYPE_HVM) {
+    if (type == LIBXL_DOMAIN_TYPE_HVM ||
+        libxl__device_model_version_running(gc, domid) ==
+        LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
         rc = libxl__domain_resume_device_model(gc, domid);
         if (rc) {
             LOGD(ERROR, domid, "failed to resume device model:%d", rc);
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index cb4702fd7a..75bde095bc 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -106,6 +106,7 @@ libxl_device_model_version = Enumeration("device_model_version", [
     (0, "UNKNOWN"),
     (1, "QEMU_XEN_TRADITIONAL"), # Historical qemu-xen device model (qemu-dm)
     (2, "QEMU_XEN"),             # Upstream based qemu-xen device model
+    (3, "NONE"),
     ])
 
 libxl_console_type = Enumeration("console_type", [
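
The default branch added to libxl__domain_set_device_model() maps the
result of libxl__need_xenpv_qemu() onto a device model version. The mapping
can be sketched as a stand-alone function. This is an illustration under
assumptions, not libxl code: the enum is a local stand-in, and
pick_pv_dm_version() is a hypothetical name. It assumes, as the switch in
the patch does, that 1 means qemu is required, 0 means it is not, and any
other value is an error code.

```c
/* Local stand-in for the relevant libxl_device_model_version values. */
enum pv_dm_version {
    PV_DM_UNKNOWN = 0,
    PV_DM_QEMU_XEN = 2,
    PV_DM_NONE = 3
};

/*
 * Hypothetical helper: translate the "is a qemu needed?" answer into a
 * device model version. Returns 0 and stores the choice in *out on
 * success; otherwise leaves *out alone and hands the error code back.
 */
int pick_pv_dm_version(int need_qemu, enum pv_dm_version *out)
{
    switch (need_qemu) {
    case 1:
        *out = PV_DM_QEMU_XEN;
        return 0;
    case 0:
        *out = PV_DM_NONE;
        return 0;
    default:
        return need_qemu;
    }
}
```

Passing the error through, rather than falling back to QEMU_XEN, keeps the
old "every domU has a qemu" assumption from returning on failure paths.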