From patchwork Mon Aug 5 09:32:44 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Gowans, James"
X-Patchwork-Id: 13753335
From: James Gowans
CC: James Gowans, Sean Christopherson, Paolo Bonzini, Alexander Viro,
 Steve Sistare, Christian Brauner, Jan Kara, Anthony Yznaga,
 Mike Rapoport, Andrew Morton, Jason Gunthorpe, Usama Arif,
 Alexander Graf, David Woodhouse, Paul Durrant, Nicolas Saenz Julienne
Subject: [PATCH 09/10] guestmemfs: Add documentation and usage instructions
Date: Mon, 5 Aug 2024 11:32:44 +0200
Message-ID: <20240805093245.889357-10-jgowans@amazon.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240805093245.889357-1-jgowans@amazon.com>
References: <20240805093245.889357-1-jgowans@amazon.com>

Describe the motivation for guestmemfs, the functionality it provides,
how to compile it in, how to use it as a source of guest memory, how to
persist it across kexec, and how to save/restore a VM.

Signed-off-by: James Gowans
---
 Documentation/filesystems/guestmemfs.rst | 87 ++++++++++++++++++++++++
 1 file changed, 87 insertions(+)
 create mode 100644 Documentation/filesystems/guestmemfs.rst

diff --git a/Documentation/filesystems/guestmemfs.rst b/Documentation/filesystems/guestmemfs.rst
new file mode 100644
index 000000000000..d6ce0d194cc8
--- /dev/null
+++ b/Documentation/filesystems/guestmemfs.rst
@@ -0,0 +1,87 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================================================
+Guestmemfs - Persistent in-memory guest RAM filesystem
+======================================================
+
+Overview
+========
+
+Guestmemfs is an in-memory filesystem designed specifically for live update of
+virtual machines: it is a source of guest VM memory that persists across kexec.
+
+Live update of a hypervisor means pausing running VMs, serialising state,
+kexec-ing into a new hypervisor image, re-hydrating the KVM guests and resuming
+them. To achieve this, guest memory must be preserved across kexec.
+
+Additionally, guestmemfs provides:
+
+- secret hiding for guest memory: the physical memory allocated for guestmemfs
+  is carved out of the direct map early in boot.
+- struct page overhead elimination: guestmemfs memory is not allocated by the
+  buddy allocator and does not have associated struct pages.
+- huge page mappings: allocations are done at PMD size, which improves TLB
+  performance (work in progress).
+
+Compilation
+===========
+
+Guestmemfs is enabled via CONFIG_GUESTMEMFS_FS.
+
+Persistence across kexec is enabled via CONFIG_KEXEC_KHO.
+
+Usage
+=====
+
+On first boot (cold boot), allocate a large contiguous chunk of memory for
+guestmemfs via a kernel cmdline argument, e.g.:
+``guestmemfs=10G``.
+
+Mount guestmemfs:
+mount -t guestmemfs guestmemfs /mnt/guestmemfs/
+
+Create and truncate a file which will be used for guest RAM:
+
+touch /mnt/guestmemfs/guest-ram
+truncate -s 500M /mnt/guestmemfs/guest-ram
+
+Boot a VM with this as the RAM source and the live update option enabled:
+
+qemu-system-x86_64 ... \
+  -object memory-backend-file,id=pc.ram,size=100M,mem-path=/mnt/guestmemfs/guest-ram,share=yes,prealloc=off \
+  -migrate-mode-enable cpr-reboot \
+  ...
+
+Suspend the guest and save the state via the QEMU monitor:
+
+migrate_set_parameter mode cpr-reboot
+migrate file:/qemu.sav
+
+Activate KHO to serialise guestmemfs metadata and then kexec to the new
+hypervisor image:
+
+echo 1 > /sys/kernel/kho/active
+kexec -s -l --reuse-cmdline
+kexec -e
+
+After the kexec completes, remount guestmemfs (or have it added to fstab).
+Restart QEMU in live update restore mode:
+
+qemu-system-x86_64 ... \
+  -object memory-backend-file,id=pc.ram,size=100M,mem-path=/mnt/guestmemfs/guest-ram,share=yes,prealloc=off \
+  -migrate-mode-enable cpr-reboot \
+  -incoming defer \
+  ...
+
+Finally, restore the VM state and resume it via the QEMU console:
+
+migrate_incoming file:/qemu.sav
+
+Future Work
+===========
+- NUMA awareness and multi-mount point support
+- Actually creating PMD-level mappings in page tables
+- guest_memfd-style interface for confidential computing
+- Supporting PUD-level allocations and mappings
+- MCE handling
+- Persisting IOMMU pgtables to allow DMA to guestmemfs during kexec
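
As a rough illustration of the "PMD size" and "struct page overhead" points in
the overview, the sizing arithmetic can be sketched in shell. This is not part
of the patch. It assumes x86_64 values that the documentation does not state:
4 KiB base pages, a 2 MiB PMD and a 64-byte struct page. It also assumes that
file sizes are rounded up to whole PMD-sized blocks. The helper names
pmd_blocks and struct_page_mib are hypothetical.

```shell
#!/bin/sh
# Sizing sketch for guestmemfs. All constants below are assumptions for
# x86_64, not values taken from the patch.
PMD_MIB=2              # assumed PMD size: 2 MiB
PAGE_KIB=4             # assumed base page size: 4 KiB
STRUCT_PAGE_BYTES=64   # assumed sizeof(struct page)

# pmd_blocks SIZE_MIB
# Number of PMD-sized blocks needed to back SIZE_MIB, rounding up to a
# whole block (the rounding behaviour is an assumption).
pmd_blocks() {
    echo $(( ($1 + PMD_MIB - 1) / PMD_MIB ))
}

# struct_page_mib SIZE_MIB
# MiB of struct page metadata that SIZE_MIB of ordinary page-backed memory
# would need, i.e. the overhead avoided when no struct pages exist.
struct_page_mib() {
    echo $(( $1 * 1024 / PAGE_KIB * STRUCT_PAGE_BYTES / 1024 / 1024 ))
}

pmd_blocks 500          # 500M guest-ram file: prints 250
pmd_blocks 10240        # guestmemfs=10G pool: prints 5120
struct_page_mib 10240   # struct page metadata for 10G: prints 160
```

Under these assumptions the 500M guest-ram file from the usage steps occupies
250 PMD-sized blocks, and a 10G pool avoids roughly 160 MiB of struct page
metadata.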