From patchwork Thu Feb 13 10:53:25 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paul Durrant
X-Patchwork-Id: 11380155
From: Paul Durrant
Date: Thu, 13 Feb 2020 10:53:25 +0000
Message-ID: <20200213105325.3022-3-pdurrant@amazon.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200213105325.3022-1-pdurrant@amazon.com>
References: <20200213105325.3022-1-pdurrant@amazon.com>
Subject: [Xen-devel] [PATCH v5 2/2] docs/designs: Add a design document for migration of xenstore data
List-Id: Xen developer discussion
Cc: Stefano Stabellini, Julien Grall, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Paul Durrant, Ian Jackson, Jan Beulich

This patch proposes extra migration data and xenstore protocol
extensions to support non-cooperative live migration of guests.

Signed-off-by: Paul Durrant
---
Cc: Andrew Cooper
Cc: George Dunlap
Cc: Ian Jackson
Cc: Jan Beulich
Cc: Julien Grall
Cc: Konrad Rzeszutek Wilk
Cc: Stefano Stabellini
Cc: Wei Liu

v5:
 - Add QUIESCE
 - Make the semantics of <index> in GET_DOMAIN_WATCHES clearer

v4:
 - Drop the restrictions on special paths

v3:
 - New in v3
---
 docs/designs/xenstore-migration.md | 136 +++++++++++++++++++++++++++++
 1 file changed, 136 insertions(+)
 create mode 100644 docs/designs/xenstore-migration.md

diff --git a/docs/designs/xenstore-migration.md b/docs/designs/xenstore-migration.md
new file mode 100644
index 0000000000..5cfe2d9a7d
--- /dev/null
+++ b/docs/designs/xenstore-migration.md
@@ -0,0 +1,136 @@
+# Xenstore Migration
+
+## Background
+
+The design for *Non-Cooperative Migration of Guests*[1] explains that extra
+save records are required in the migration stream to allow a guest running
+PV drivers to be migrated without its co-operation. Moreover, the save
+records must include details of registered xenstore watches as well as
+content; information that cannot currently be recovered from `xenstored`,
+and hence some extension to the xenstore protocol[2] will also be required.
+
+The *libxenlight Domain Image Format* specification[3] already defines a
+record type `EMULATOR_XENSTORE_DATA` but this is not suitable for
+transferring xenstore data pertaining to the domain directly as it is
+specified such that keys are relative to the path
+`/local/domain/$dm_domid/device-model/$domid`. Thus it is necessary to
+define at least one new save record type.
+
+## Proposal
+
+### New Save Record
+
+A new mandatory record type should be defined within the libxenlight Domain
+Image Format:
+
+`0x00000007: DOMAIN_XENSTORE_DATA`
+
+The format of each of these new records should be as follows:
+
+
+```
+   0     1     2     3      4     5     6     7  octet
++------------------------+------------------------+
+| type                   | record specific data   |
++------------------------+                        |
+...
++-------------------------------------------------+
+```
+
+
+| Field | Description |
+|---|---|
+| `type` | 0x00000000: invalid |
+| | 0x00000001: node data |
+| | 0x00000002: watch data |
+| | 0x00000003 - 0xFFFFFFFF: reserved for future use |
+
+
+where data is always in the form of a NUL separated and terminated tuple
+as follows:
+
+
+**node data**
+
+
+`<path>|<value>|<perm-as-string>|`
+
+
+`<path>` is considered relative to the domain path `/local/domain/$domid`
+and hence must not begin with `/`.
+`<path>` and `<value>` should be suitable to formulate a `WRITE` operation
+to the receiving xenstore and `<perm-as-string>` should be similarly suitable
+to formulate a subsequent `SET_PERMS` operation.
+
+**watch data**
+
+
+`<path>|<token>|`
+
+`<path>` again is considered relative and, together with `<token>`, should
+be suitable to formulate an `ADD_DOMAIN_WATCHES` operation (see below).
+
+
+### Protocol Extension
+
+Before xenstore state is migrated it is necessary to wait for any pending
+reads, writes, watch registrations etc. to complete, and also to make sure
+that xenstored does not start processing any new requests (so that new
+requests remain pending on the shared ring for subsequent processing on the
+new host). Hence the following operation is needed:
+
+```
+QUIESCE                 <domid>|
+
+Complete processing of any request issued by the specified domain, and
+do not process any further requests from the shared ring.
+```
+
+The `WATCH` operation does not allow specification of a `<domid>`; it is
+assumed that the watch pertains to the domain that owns the shared ring
+over which the operation is passed. Hence, for the tool-stack to be able
+to register a watch on behalf of a domain, a new operation is needed:
+
+```
+ADD_DOMAIN_WATCHES      <domid>|<watch>|+
+
+Adds watches on behalf of the specified domain.
+
+<watch> is a NUL separated tuple of <path>|<token>. The semantics of this
+operation are identical to the domain issuing WATCH <path>|<token>| for
+each <watch>.
+```
+
+The watch information for a domain also needs to be extracted from the
+sending xenstored, so the following operation is also needed:
+
+```
+GET_DOMAIN_WATCHES      <domid>|<index>   <gencnt>|<watch>|*
+
+Gets the list of watches that are currently registered for the domain.
+
+<watch> is a NUL separated tuple of <path>|<token>. The sub-list returned
+will start at <index> items into the overall list of watches and may
+be truncated (at a <watch> boundary) such that the returned data fits
+within XENSTORE_PAYLOAD_MAX.
+
+If <index> is beyond the end of the overall list then the returned
+sub-list will be empty. If the value of <gencnt> changes then it
+indicates that the overall watch list has changed and thus it may be
+necessary to re-issue the operation for previous values of <index>.
+```
+
+It may also be desirable to state in the protocol specification that
+the `INTRODUCE` operation should not clear the `<gfn>` specified such that
+a `RELEASE` operation followed by an `INTRODUCE` operation form an
+idempotent pair. The current implementation of *C xenstored* does clear it
+(in the `domain_conn_reset()` function) but this could be dropped as this
+behaviour is not currently specified and the page will always be zeroed
+for a newly created domain.
+
+
+* * *
+
+[1] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/designs/non-cooperative-migration.md
+[2] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/misc/xenstore.txt
+[3] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/specs/libxl-migration-stream.pandoc
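
Illustration only, not part of the patch: the paging rule described for
GET_DOMAIN_WATCHES (walk the list by start index, stop at an empty
sub-list, and treat a changed generation count as a sign that the list
has changed) could be sketched on the tool-stack side roughly as follows.
The `fetch` callable is hypothetical and stands in for whatever transport
issues the request; it is assumed to take a domid and a start index and to
return the generation count together with the (possibly truncated or
empty) sub-list of path/token pairs. Restarting from index 0 on any change
is one simple reading of "re-issue the operation for previous values", and
the retry limit is likewise an assumption.

```python
from typing import Callable, List, Tuple

Watch = Tuple[str, str]  # (path, token)

# Hypothetical transport: fetch(domid, index) -> (gencnt, sub-list of watches)
Fetch = Callable[[int, int], Tuple[int, List[Watch]]]


def get_all_watches(fetch: Fetch, domid: int, max_restarts: int = 8) -> List[Watch]:
    """Collect a domain's full watch list, restarting if gencnt changes."""
    for _ in range(max_restarts):
        watches: List[Watch] = []
        gencnt, chunk = fetch(domid, 0)
        while chunk:
            watches.extend(chunk)
            # Ask for the next sub-list, starting after what we already hold.
            new_gencnt, chunk = fetch(domid, len(watches))
            if new_gencnt != gencnt:
                break  # list changed mid-walk: discard and start again
        else:
            # Reached an empty sub-list with a stable gencnt: list is complete.
            return watches
    raise RuntimeError("watch list kept changing; giving up")
```

If the domain has already been quiesced, the list should not change during
the walk, so the loop would normally finish on its first pass.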