From patchwork Mon Aug 5 09:32:36 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Gowans, James"
X-Patchwork-Id: 13753326
From: James Gowans
CC: James Gowans, Sean Christopherson, Paolo Bonzini, Alexander Viro,
 Steve Sistare, Christian Brauner, Jan Kara, Anthony Yznaga, Mike Rapoport,
 Andrew Morton, Jason Gunthorpe, Usama Arif, Alexander Graf,
 David Woodhouse, Paul Durrant, Nicolas Saenz Julienne
Subject: [PATCH 01/10] guestmemfs: Introduce filesystem skeleton
Date: Mon, 5 Aug 2024 11:32:36 +0200
Message-ID: <20240805093245.889357-2-jgowans@amazon.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240805093245.889357-1-jgowans@amazon.com>
References: <20240805093245.889357-1-jgowans@amazon.com>

Add an in-memory filesystem: guestmemfs.
Memory is donated to guestmemfs by carving it out of the normal System
RAM range with the memmap= cmdline parameter and then giving that same
physical range to guestmemfs with the guestmemfs= cmdline parameter.

A new filesystem is added; so far it doesn't do much except persist a
super block at the start of the donated memory and allow itself to be
mounted.

A hook is added to x86 mm init to reserve the memory very early on via
the memblock allocator. There is probably a better arch-independent
place to do this...

Signed-off-by: James Gowans
---
 arch/x86/mm/init_64.c      |   2 +
 fs/Kconfig                 |   1 +
 fs/Makefile                |   1 +
 fs/guestmemfs/Kconfig      |  11 ++++
 fs/guestmemfs/Makefile     |   6 ++
 fs/guestmemfs/guestmemfs.c | 116 +++++++++++++++++++++++++++++++++++++
 fs/guestmemfs/guestmemfs.h |   9 +++
 include/linux/guestmemfs.h |  16 +++++
 8 files changed, 162 insertions(+)
 create mode 100644 fs/guestmemfs/Kconfig
 create mode 100644 fs/guestmemfs/Makefile
 create mode 100644 fs/guestmemfs/guestmemfs.c
 create mode 100644 fs/guestmemfs/guestmemfs.h
 create mode 100644 include/linux/guestmemfs.h

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 8932ba8f5cdd..39fcf017c90c 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1331,6 +1332,7 @@ static void __init preallocate_vmalloc_pages(void)
 
 void __init mem_init(void)
 {
+	guestmemfs_reserve_mem();
 	pci_iommu_alloc();
 
 	/* clear_bss() already clear the empty_zero_page */
diff --git a/fs/Kconfig b/fs/Kconfig
index a46b0cbc4d8f..727359901da8 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -321,6 +321,7 @@ source "fs/befs/Kconfig"
 source "fs/bfs/Kconfig"
 source "fs/efs/Kconfig"
 source "fs/jffs2/Kconfig"
+source "fs/guestmemfs/Kconfig"
 # UBIFS File system configuration
 source "fs/ubifs/Kconfig"
 source "fs/cramfs/Kconfig"
diff --git a/fs/Makefile b/fs/Makefile
index 6ecc9b0a53f2..044524b17d63 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -129,3 +129,4 @@
 obj-$(CONFIG_EFIVAR_FS) += efivarfs/
 obj-$(CONFIG_EROFS_FS) += erofs/
 obj-$(CONFIG_VBOXSF_FS) += vboxsf/
 obj-$(CONFIG_ZONEFS_FS) += zonefs/
+obj-$(CONFIG_GUESTMEMFS_FS) += guestmemfs/
diff --git a/fs/guestmemfs/Kconfig b/fs/guestmemfs/Kconfig
new file mode 100644
index 000000000000..d87fca4822cb
--- /dev/null
+++ b/fs/guestmemfs/Kconfig
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config GUESTMEMFS_FS
+	bool "Persistent Guest memory filesystem (guestmemfs)"
+	help
+	  An in-memory filesystem on top of reserved memory specified via
+	  guestmemfs= cmdline argument. Used for storing kernel state and
+	  userspace memory which is preserved across kexec to support
+	  live update of a hypervisor when running guest virtual machines.
+	  Select this if you need the ability to persist memory for guest VMs
+	  across kexec to do live update.
diff --git a/fs/guestmemfs/Makefile b/fs/guestmemfs/Makefile
new file mode 100644
index 000000000000..6dc820a9d4fe
--- /dev/null
+++ b/fs/guestmemfs/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Makefile for persistent kernel filesystem
+#
+
+obj-y += guestmemfs.o
diff --git a/fs/guestmemfs/guestmemfs.c b/fs/guestmemfs/guestmemfs.c
new file mode 100644
index 000000000000..3aaada1b8df6
--- /dev/null
+++ b/fs/guestmemfs/guestmemfs.c
@@ -0,0 +1,116 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "guestmemfs.h"
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+static phys_addr_t guestmemfs_base, guestmemfs_size;
+struct guestmemfs_sb *psb;
+
+static int statfs(struct dentry *root, struct kstatfs *buf)
+{
+	simple_statfs(root, buf);
+	buf->f_bsize = PMD_SIZE;
+	buf->f_blocks = guestmemfs_size / PMD_SIZE;
+	buf->f_bfree = buf->f_bavail = buf->f_blocks;
+	return 0;
+}
+
+static const struct super_operations guestmemfs_super_ops = {
+	.statfs = statfs,
+};
+
+static int guestmemfs_fill_super(struct super_block *sb, struct fs_context *fc)
+{
+	struct inode *inode;
+	struct dentry *dentry;
+
+	psb = kzalloc(sizeof(*psb), GFP_KERNEL);
+	/*
+	 * Keep a reference to the persistent super block in the
+	 * ephemeral super block.
+	 */
+	sb->s_fs_info = psb;
+	sb->s_op = &guestmemfs_super_ops;
+
+	inode = new_inode(sb);
+	if (!inode)
+		return -ENOMEM;
+
+	inode->i_ino = 1;
+	inode->i_mode = S_IFDIR;
+	inode->i_op = &simple_dir_inode_operations;
+	inode->i_fop = &simple_dir_operations;
+	simple_inode_init_ts(inode);
+	/* directory inodes start off with i_nlink == 2 (for "." entry) */
+	inc_nlink(inode);
+
+	dentry = d_make_root(inode);
+	if (!dentry)
+		return -ENOMEM;
+	sb->s_root = dentry;
+
+	return 0;
+}
+
+static int guestmemfs_get_tree(struct fs_context *fc)
+{
+	return get_tree_nodev(fc, guestmemfs_fill_super);
+}
+
+static const struct fs_context_operations guestmemfs_context_ops = {
+	.get_tree = guestmemfs_get_tree,
+};
+
+static int guestmemfs_init_fs_context(struct fs_context *const fc)
+{
+	fc->ops = &guestmemfs_context_ops;
+	return 0;
+}
+
+static struct file_system_type guestmemfs_fs_type = {
+	.owner = THIS_MODULE,
+	.name = "guestmemfs",
+	.init_fs_context = guestmemfs_init_fs_context,
+	.kill_sb = kill_litter_super,
+	.fs_flags = FS_USERNS_MOUNT,
+};
+
+static int __init guestmemfs_init(void)
+{
+	int ret;
+
+	ret = register_filesystem(&guestmemfs_fs_type);
+	return ret;
+}
+
+/**
+ * Format: guestmemfs=:
+ * Just like: memmap=nn[KMG]!ss[KMG]
+ */
+static int __init parse_guestmemfs_extents(char *p)
+{
+	guestmemfs_size = memparse(p, &p);
+	return 0;
+}
+
+early_param("guestmemfs", parse_guestmemfs_extents);
+
+void __init guestmemfs_reserve_mem(void)
+{
+	guestmemfs_base = memblock_phys_alloc(guestmemfs_size, 4 << 10);
+	if (guestmemfs_base) {
+		memblock_reserved_mark_noinit(guestmemfs_base, guestmemfs_size);
+		memblock_mark_nomap(guestmemfs_base, guestmemfs_size);
+	} else {
+		pr_warn("Failed to alloc %llu bytes for guestmemfs\n", guestmemfs_size);
+	}
+}
+
+MODULE_ALIAS_FS("guestmemfs");
+module_init(guestmemfs_init);
diff --git a/fs/guestmemfs/guestmemfs.h b/fs/guestmemfs/guestmemfs.h
new file mode 100644
index 000000000000..37d8cf630e0a
--- /dev/null
+++ b/fs/guestmemfs/guestmemfs.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#define pr_fmt(fmt) "guestmemfs: " KBUILD_MODNAME ": " fmt
+
+#include
+
+struct guestmemfs_sb {
+	/* Will be populated soon... */
+};
diff --git a/include/linux/guestmemfs.h b/include/linux/guestmemfs.h
new file mode 100644
index 000000000000..60e769c8e533
--- /dev/null
+++ b/include/linux/guestmemfs.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: MIT */
+
+#ifndef _LINUX_GUESTMEMFS_H
+#define _LINUX_GUESTMEMFS_H
+
+/*
+ * Carves out chunks of memory from memblocks for guestmemfs.
+ * Must be called in early boot before memblocks are freed.
+ */
+# ifdef CONFIG_GUESTMEMFS_FS
+void guestmemfs_reserve_mem(void);
+#else
+void guestmemfs_reserve_mem(void) { }
+#endif
+
+#endif

From patchwork Mon Aug 5 09:32:37 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Gowans, James"
X-Patchwork-Id: 13753328
From: James Gowans
CC: James Gowans, Sean Christopherson, Paolo Bonzini, Alexander Viro,
 Steve Sistare, Christian Brauner, Jan Kara, Anthony Yznaga, Mike Rapoport,
 Andrew Morton, Jason Gunthorpe, Usama Arif, Alexander Graf,
 David Woodhouse, Paul Durrant, Nicolas Saenz Julienne
Subject: [PATCH 02/10] guestmemfs: add inode store, files and dirs
Date: Mon, 5 Aug 2024 11:32:37 +0200
Message-ID: <20240805093245.889357-3-jgowans@amazon.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240805093245.889357-1-jgowans@amazon.com>
References: <20240805093245.889357-1-jgowans@amazon.com>

Here inodes are added to the filesystem: inodes for both regular files
and directories. This involves supporting the callbacks to create inodes
in a directory, as well as being able to list the contents of a
directory and look up an inode by name.

The inode store is implemented as a 2 MiB page which is an array of
struct guestmemfs_inode. The reason to have a large allocation and put
them all in a big flat array is to make persistence easy: when it's time
to introduce persistence to the filesystem it will need to persist this
one big chunk of inodes across kexec using KHO.

Free inodes in the page form a slab-type structure: the first free inode
points to the next free inode, and so on. The super block points to the
first free inode, so allocating involves popping the head, and freeing
an inode involves pushing a new head.

Directories point to the first inode in the directory via a child_ino
reference. Subsequent inodes within the same directory are pointed to
via a sibling_ino member, essentially forming a linked list of inodes
within the directory.

Looking up an inode in a directory involves traversing the sibling_ino
linked list until one with a matching name is found.
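
For readers who want to experiment with the layout described above outside
the kernel, the following is a minimal userspace sketch, not the code in
this patch. It assumes a tiny fixed-size store and omits the spinlock; all
names (sketch_store, sketch_inode, store_init, alloc_ino, free_ino,
dir_create, dir_lookup, dir_unlink) and the constants NAME_LEN and
NUM_INODES are hypothetical and chosen only for illustration. Counting the
root directory as an allocated inode is also a choice made for the sketch.

```c
#include <stdio.h>
#include <string.h>

#define NAME_LEN   32	/* hypothetical; the patch uses GUESTMEMFS_FILENAME_LEN */
#define NUM_INODES 8	/* hypothetical; the patch fills a 2 MiB page */

struct sketch_inode {
	unsigned long sibling_ino;	/* next free inode, or next entry in the same dir */
	unsigned long child_ino;	/* first entry of a directory, 0 if empty */
	char name[NAME_LEN];
};

struct sketch_store {
	unsigned long next_free_ino;	/* head of the free list, 0 when exhausted */
	unsigned long allocated;
	struct sketch_inode inodes[NUM_INODES];
};

/* Inode numbers are 1-based; 0 means "none". */
struct sketch_inode *get_inode(struct sketch_store *s, unsigned long ino)
{
	return &s->inodes[ino - 1];
}

/* Chain every inode to the next one, then reserve inode 1 as the root dir. */
void store_init(struct sketch_store *s)
{
	memset(s, 0, sizeof(*s));
	for (unsigned long ino = 2; ino <= NUM_INODES; ino++)
		get_inode(s, ino - 1)->sibling_ino = ino;
	get_inode(s, 1)->sibling_ino = 0;
	s->next_free_ino = 2;
	s->allocated = 1;
}

/* Pop the head of the free list; returns 0 if the store is full. */
unsigned long alloc_ino(struct sketch_store *s)
{
	unsigned long ino = s->next_free_ino;

	if (!ino)
		return 0;
	s->next_free_ino = get_inode(s, ino)->sibling_ino;
	memset(get_inode(s, ino), 0, sizeof(struct sketch_inode));
	s->allocated++;
	return ino;
}

/* Zero the inode and push it as the new head of the free list. */
void free_ino(struct sketch_store *s, unsigned long ino)
{
	struct sketch_inode *in = get_inode(s, ino);

	memset(in, 0, sizeof(*in));
	in->sibling_ino = s->next_free_ino;
	s->next_free_ino = ino;
	s->allocated--;
}

/* Allocate an inode and link it at the front of the directory's list. */
unsigned long dir_create(struct sketch_store *s, unsigned long dir, const char *name)
{
	unsigned long ino = alloc_ino(s);

	if (!ino)
		return 0;
	snprintf(get_inode(s, ino)->name, NAME_LEN, "%s", name);
	get_inode(s, ino)->sibling_ino = get_inode(s, dir)->child_ino;
	get_inode(s, dir)->child_ino = ino;
	return ino;
}

/* Walk the sibling chain until a matching name is found; 0 if absent. */
unsigned long dir_lookup(struct sketch_store *s, unsigned long dir, const char *name)
{
	unsigned long ino;

	for (ino = get_inode(s, dir)->child_ino; ino; ino = get_inode(s, ino)->sibling_ino)
		if (!strcmp(get_inode(s, ino)->name, name))
			return ino;
	return 0;
}

/* Find the link that points at the target, splice it out, free the inode. */
int dir_unlink(struct sketch_store *s, unsigned long dir, unsigned long target)
{
	unsigned long *link = &get_inode(s, dir)->child_ino;

	while (*link && *link != target)
		link = &get_inode(s, *link)->sibling_ino;
	if (!*link)
		return -1;
	*link = get_inode(s, target)->sibling_ino;
	free_ino(s, target);
	return 0;
}
```

Because a freed inode becomes the new head of the free list, the next
create reuses that same inode number. The sketch walks a pointer to the
link being followed in dir_unlink, which avoids a special case for the
first entry of a directory; that is a design choice of the sketch only.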

Filesystem stats are updated to account for total and allocated inodes.

Signed-off-by: James Gowans
---
 fs/guestmemfs/Makefile     |   2 +-
 fs/guestmemfs/dir.c        |  43 ++++++++++
 fs/guestmemfs/guestmemfs.c |  21 ++++-
 fs/guestmemfs/guestmemfs.h |  36 +++++++-
 fs/guestmemfs/inode.c      | 164 +++++++++++++++++++++++++++++++++++++
 5 files changed, 260 insertions(+), 6 deletions(-)
 create mode 100644 fs/guestmemfs/dir.c
 create mode 100644 fs/guestmemfs/inode.c

diff --git a/fs/guestmemfs/Makefile b/fs/guestmemfs/Makefile
index 6dc820a9d4fe..804997799ce8 100644
--- a/fs/guestmemfs/Makefile
+++ b/fs/guestmemfs/Makefile
@@ -3,4 +3,4 @@
 # Makefile for persistent kernel filesystem
 #
 
-obj-y += guestmemfs.o
+obj-y += guestmemfs.o inode.o dir.o
diff --git a/fs/guestmemfs/dir.c b/fs/guestmemfs/dir.c
new file mode 100644
index 000000000000..4acd81421c85
--- /dev/null
+++ b/fs/guestmemfs/dir.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "guestmemfs.h"
+
+static int guestmemfs_dir_iterate(struct file *dir, struct dir_context *ctx)
+{
+	struct guestmemfs_inode *guestmemfs_inode;
+	struct super_block *sb = dir->f_inode->i_sb;
+
+	/* Indication from previous invoke that there's no more to iterate. */
+	if (ctx->pos == -1)
+		return 0;
+
+	if (!dir_emit_dots(dir, ctx))
+		return 0;
+
+	/*
+	 * Just emitted this dir; go to dir contents. Use pos to smuggle
+	 * the next inode number to emit across iterations.
+	 * -1 indicates no valid inode. Can't use 0 because first loop has pos=0
+	 */
+	if (ctx->pos == 2) {
+		ctx->pos = guestmemfs_get_persisted_inode(sb, dir->f_inode->i_ino)->child_ino;
+		/* Empty dir case. */
+		if (ctx->pos == 0)
+			ctx->pos = -1;
+	}
+
+	while (ctx->pos > 1) {
+		guestmemfs_inode = guestmemfs_get_persisted_inode(sb, ctx->pos);
+		dir_emit(ctx, guestmemfs_inode->filename, GUESTMEMFS_FILENAME_LEN,
+				ctx->pos, DT_UNKNOWN);
+		ctx->pos = guestmemfs_inode->sibling_ino;
+		if (!ctx->pos)
+			ctx->pos = -1;
+	}
+	return 0;
+}
+
+const struct file_operations guestmemfs_dir_fops = {
+	.owner = THIS_MODULE,
+	.iterate_shared = guestmemfs_dir_iterate,
+};
diff --git a/fs/guestmemfs/guestmemfs.c b/fs/guestmemfs/guestmemfs.c
index 3aaada1b8df6..21cb3490a2bd 100644
--- a/fs/guestmemfs/guestmemfs.c
+++ b/fs/guestmemfs/guestmemfs.c
@@ -18,6 +18,9 @@ static int statfs(struct dentry *root, struct kstatfs *buf)
 	buf->f_bsize = PMD_SIZE;
 	buf->f_blocks = guestmemfs_size / PMD_SIZE;
 	buf->f_bfree = buf->f_bavail = buf->f_blocks;
+	buf->f_files = PMD_SIZE / sizeof(struct guestmemfs_inode);
+	buf->f_ffree = buf->f_files -
+		GUESTMEMFS_PSB(root->d_sb)->allocated_inodes;
 	return 0;
 }
 
@@ -31,24 +34,34 @@ static int guestmemfs_fill_super(struct super_block *sb, struct fs_context *fc)
 	struct dentry *dentry;
 
 	psb = kzalloc(sizeof(*psb), GFP_KERNEL);
+	psb->inodes = kzalloc(2 << 20, GFP_KERNEL);
+	if (!psb->inodes)
+		return -ENOMEM;
+
 	/*
 	 * Keep a reference to the persistent super block in the
 	 * ephemeral super block.
 	 */
 	sb->s_fs_info = psb;
+	spin_lock_init(&psb->allocation_lock);
+	guestmemfs_initialise_inode_store(sb);
+	guestmemfs_get_persisted_inode(sb, 1)->flags = GUESTMEMFS_INODE_FLAG_DIR;
+	strscpy(guestmemfs_get_persisted_inode(sb, 1)->filename, ".",
+			GUESTMEMFS_FILENAME_LEN);
+	psb->next_free_ino = 2;
+
 	sb->s_op = &guestmemfs_super_ops;
 
-	inode = new_inode(sb);
+	inode = guestmemfs_inode_get(sb, 1);
 	if (!inode)
 		return -ENOMEM;
 
-	inode->i_ino = 1;
 	inode->i_mode = S_IFDIR;
-	inode->i_op = &simple_dir_inode_operations;
-	inode->i_fop = &simple_dir_operations;
+	inode->i_fop = &guestmemfs_dir_fops;
 	simple_inode_init_ts(inode);
 	/* directory inodes start off with i_nlink == 2 (for "." entry) */
 	inc_nlink(inode);
+	inode_init_owner(&nop_mnt_idmap, inode, NULL, inode->i_mode);
 
 	dentry = d_make_root(inode);
 	if (!dentry)
diff --git a/fs/guestmemfs/guestmemfs.h b/fs/guestmemfs/guestmemfs.h
index 37d8cf630e0a..3a2954d1beec 100644
--- a/fs/guestmemfs/guestmemfs.h
+++ b/fs/guestmemfs/guestmemfs.h
@@ -3,7 +3,41 @@
 #define pr_fmt(fmt) "guestmemfs: " KBUILD_MODNAME ": " fmt
 
 #include
+#include
+
+#define GUESTMEMFS_FILENAME_LEN 255
+#define GUESTMEMFS_PSB(sb) ((struct guestmemfs_sb *)sb->s_fs_info)
 
 struct guestmemfs_sb {
-	/* Will be populated soon... */
+	/* Inode number */
+	unsigned long next_free_ino;
+	unsigned long allocated_inodes;
+	struct guestmemfs_inode *inodes;
+	spinlock_t allocation_lock;
+};
+
+// If neither of these are set the inode is not in use.
+#define GUESTMEMFS_INODE_FLAG_FILE (1 << 0)
+#define GUESTMEMFS_INODE_FLAG_DIR (1 << 1)
+struct guestmemfs_inode {
+	int flags;
+	/*
+	 * Points to next inode in the same directory, or
+	 * 0 if last file in directory.
+	 */
+	unsigned long sibling_ino;
+	/*
+	 * If this inode is a directory, this points to the
+	 * first inode *in* that directory.
+	 */
+	unsigned long child_ino;
+	char filename[GUESTMEMFS_FILENAME_LEN];
+	void *mappings;
+	int num_mappings;
 };
+
+void guestmemfs_initialise_inode_store(struct super_block *sb);
+struct inode *guestmemfs_inode_get(struct super_block *sb, unsigned long ino);
+struct guestmemfs_inode *guestmemfs_get_persisted_inode(struct super_block *sb, int ino);
+
+extern const struct file_operations guestmemfs_dir_fops;
diff --git a/fs/guestmemfs/inode.c b/fs/guestmemfs/inode.c
new file mode 100644
index 000000000000..2360c3a4857d
--- /dev/null
+++ b/fs/guestmemfs/inode.c
@@ -0,0 +1,164 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "guestmemfs.h"
+#include
+
+const struct inode_operations guestmemfs_dir_inode_operations;
+
+struct guestmemfs_inode *guestmemfs_get_persisted_inode(struct super_block *sb, int ino)
+{
+	/*
+	 * Inode index starts at 1, so -1 to get memory index.
+	 */
+	return GUESTMEMFS_PSB(sb)->inodes + ino - 1;
+}
+
+struct inode *guestmemfs_inode_get(struct super_block *sb, unsigned long ino)
+{
+	struct inode *inode = iget_locked(sb, ino);
+
+	/* If this inode is cached it is already populated; just return */
+	if (!(inode->i_state & I_NEW))
+		return inode;
+	inode->i_op = &guestmemfs_dir_inode_operations;
+	inode->i_sb = sb;
+	inode->i_mode = S_IFREG;
+	unlock_new_inode(inode);
+	return inode;
+}
+
+static unsigned long guestmemfs_allocate_inode(struct super_block *sb)
+{
+
+	unsigned long next_free_ino = -ENOMEM;
+	struct guestmemfs_sb *psb = GUESTMEMFS_PSB(sb);
+
+	spin_lock(&psb->allocation_lock);
+	next_free_ino = psb->next_free_ino;
+	psb->allocated_inodes += 1;
+	if (!next_free_ino)
+		goto out;
+	psb->next_free_ino =
+		guestmemfs_get_persisted_inode(sb, next_free_ino)->sibling_ino;
+out:
+	spin_unlock(&psb->allocation_lock);
+	return next_free_ino;
+}
+
+/*
+ * Zeroes the inode and makes it the head of the free list.
+ */
+static void guestmemfs_free_inode(struct super_block *sb, unsigned long ino)
+{
+	struct guestmemfs_sb *psb = GUESTMEMFS_PSB(sb);
+	struct guestmemfs_inode *inode = guestmemfs_get_persisted_inode(sb, ino);
+
+	spin_lock(&psb->allocation_lock);
+	memset(inode, 0, sizeof(struct guestmemfs_inode));
+	inode->sibling_ino = psb->next_free_ino;
+	psb->next_free_ino = ino;
+	psb->allocated_inodes -= 1;
+	spin_unlock(&psb->allocation_lock);
+}
+
+/*
+ * Sets all inodes as free and points each free inode to the next one.
+ */
+void guestmemfs_initialise_inode_store(struct super_block *sb)
+{
+	/* Inode store is a PMD sized (ie: 2 MiB) page */
+	memset(guestmemfs_get_persisted_inode(sb, 1), 0, PMD_SIZE);
+	/* Point each inode for the next one; linked-list initialisation. */
+	for (unsigned long ino = 2; ino * sizeof(struct guestmemfs_inode) < PMD_SIZE; ino++)
+		guestmemfs_get_persisted_inode(sb, ino - 1)->sibling_ino = ino;
+}
+
+static int guestmemfs_create(struct mnt_idmap *id, struct inode *dir,
+		struct dentry *dentry, umode_t mode, bool excl)
+{
+	unsigned long free_inode;
+	struct guestmemfs_inode *guestmemfs_inode;
+	struct inode *vfs_inode;
+
+	free_inode = guestmemfs_allocate_inode(dir->i_sb);
+	if (free_inode <= 0)
+		return -ENOMEM;
+
+	guestmemfs_inode = guestmemfs_get_persisted_inode(dir->i_sb, free_inode);
+	guestmemfs_inode->sibling_ino =
+		guestmemfs_get_persisted_inode(dir->i_sb, dir->i_ino)->child_ino;
+	guestmemfs_get_persisted_inode(dir->i_sb, dir->i_ino)->child_ino = free_inode;
+	strscpy(guestmemfs_inode->filename, dentry->d_name.name, GUESTMEMFS_FILENAME_LEN);
+	guestmemfs_inode->flags = GUESTMEMFS_INODE_FLAG_FILE;
+	/* TODO: make dynamic */
+	guestmemfs_inode->mappings = kzalloc(PAGE_SIZE, GFP_KERNEL);
+
+	vfs_inode = guestmemfs_inode_get(dir->i_sb, free_inode);
+	d_instantiate(dentry, vfs_inode);
+	return 0;
+}
+
+static struct dentry *guestmemfs_lookup(struct inode *dir,
+		struct dentry *dentry,
+		unsigned int flags)
+{
+	struct guestmemfs_inode *guestmemfs_inode;
+	unsigned long ino;
+
+	guestmemfs_inode = guestmemfs_get_persisted_inode(dir->i_sb, dir->i_ino);
+	ino = guestmemfs_inode->child_ino;
+	while (ino) {
+		guestmemfs_inode = guestmemfs_get_persisted_inode(dir->i_sb, ino);
+		if (!strncmp(guestmemfs_inode->filename,
+			     dentry->d_name.name,
+			     GUESTMEMFS_FILENAME_LEN)) {
+			d_add(dentry, guestmemfs_inode_get(dir->i_sb, ino));
+			break;
+		}
+		ino = guestmemfs_inode->sibling_ino;
+	}
+	return NULL;
+}
+
+static int guestmemfs_unlink(struct inode *dir, struct dentry *dentry)
+{
+	unsigned long ino;
+	struct guestmemfs_inode *inode;
+
+	ino = guestmemfs_get_persisted_inode(dir->i_sb, dir->i_ino)->child_ino;
+
+	/* Special case for first file in dir */
+	if (ino == dentry->d_inode->i_ino) {
+		guestmemfs_get_persisted_inode(dir->i_sb, dir->i_ino)->child_ino =
+			guestmemfs_get_persisted_inode(dir->i_sb,
+					dentry->d_inode->i_ino)->sibling_ino;
+		guestmemfs_free_inode(dir->i_sb, ino);
+		return 0;
+	}
+
+	/*
+	 * Although we know exactly the inode to free, because we maintain only
+	 * a singly linked list we need to scan for it to find the previous
+	 * element so it's "next" pointer can be updated.
+ */ + while (ino) { + inode = guestmemfs_get_persisted_inode(dir->i_sb, ino); + /* We've found the one pointing to the one we want to delete */ + if (inode->sibling_ino == dentry->d_inode->i_ino) { + inode->sibling_ino = + guestmemfs_get_persisted_inode(dir->i_sb, + dentry->d_inode->i_ino)->sibling_ino; + guestmemfs_free_inode(dir->i_sb, dentry->d_inode->i_ino); + break; + } + ino = guestmemfs_get_persisted_inode(dir->i_sb, ino)->sibling_ino; + } + + return 0; +} + +const struct inode_operations guestmemfs_dir_inode_operations = { + .create = guestmemfs_create, + .lookup = guestmemfs_lookup, + .unlink = guestmemfs_unlink, +}; From patchwork Mon Aug 5 09:32:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Gowans, James" X-Patchwork-Id: 13753329 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 650C1C3DA4A for ; Mon, 5 Aug 2024 09:34:17 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EFDD26B008C; Mon, 5 Aug 2024 05:34:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EABB46B0092; Mon, 5 Aug 2024 05:34:16 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D729F6B0093; Mon, 5 Aug 2024 05:34:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id B7D7F6B008C for ; Mon, 5 Aug 2024 05:34:16 -0400 (EDT) Received: from smtpin01.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 610DB81D09 for ; Mon, 5 Aug 2024 09:34:16 +0000 (UTC) X-FDA: 82417680912.01.EFEF178 Received: from smtp-fw-80008.amazon.com (smtp-fw-80008.amazon.com [99.78.197.219]) by 
imf15.hostedemail.com (Postfix) with ESMTP id 4AD3FA0013 for ; Mon, 5 Aug 2024 09:34:14 +0000 (UTC) Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=amazon.com header.s=amazon201209 header.b=beTU4qxc; spf=pass (imf15.hostedemail.com: domain of "prvs=940e15008=jgowans@amazon.com" designates 99.78.197.219 as permitted sender) smtp.mailfrom="prvs=940e15008=jgowans@amazon.com"; dmarc=pass (policy=quarantine) header.from=amazon.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1722850392; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=mf4bGqkjJGcuD7WLWM/6xATCnP8HxmWYZb3Uae5eqbU=; b=Xhpx3Tv0E5ZCkytGpiZmP1oVbwEQtMfU90VOi+Xc0PKRT/TJ4Dg1nPVUdr2yaTIfPQFZYc YDm5bxOquiv5/RGUJfUAVQf73UDhUnU7aMtjYCnRbmR93IFjsGPHUGrt/PBt/Fzfenv5gu so4XaL0xXiVOJvWeTWCgr9GYX0CRqnw= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1722850392; a=rsa-sha256; cv=none; b=Ioo/1XDhxhpXhQzxdK04w4+TMB3O3XlWeRhVMkRBEnn7NprYYtXIQg3FkzSJ1jK37b+nGD 5DZrq2BmWAfGni2aFFK8MC0NQKz3Lc0IrK8CEpmpKHY2c/BLa2EtvNBchDGAvewSRSXpc7 kL+F3B9aF0enedWUhoRaVJoAxulEui0= ARC-Authentication-Results: i=1; imf15.hostedemail.com; dkim=pass header.d=amazon.com header.s=amazon201209 header.b=beTU4qxc; spf=pass (imf15.hostedemail.com: domain of "prvs=940e15008=jgowans@amazon.com" designates 99.78.197.219 as permitted sender) smtp.mailfrom="prvs=940e15008=jgowans@amazon.com"; dmarc=pass (policy=quarantine) header.from=amazon.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1722850454; x=1754386454; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mf4bGqkjJGcuD7WLWM/6xATCnP8HxmWYZb3Uae5eqbU=; 
b=beTU4qxcmHHToJLaMKTze7TrqVcBEFY5+QegBM6/tXgl1QraGY6Ced0I LAYjuEap0z0zchyJ080rZ1tx0PCjzBFjviNw7GtWiRh0sUipwFQzwDRC9 gex2rcSMFhIocBWdoUjSfzhz3R5/JnaxoAUarSo7rKBMHEKGpJHvi/rES w=; X-IronPort-AV: E=Sophos;i="6.09,264,1716249600"; d="scan'208";a="112401062" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-80008.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Aug 2024 09:34:11 +0000 Received: from EX19MTAEUA001.ant.amazon.com [10.0.43.254:30354] by smtpin.naws.eu-west-1.prod.farcaster.email.amazon.dev [10.0.25.233:2525] with esmtp (Farcaster) id bfd7aaac-eb93-420b-99d8-e287bbc4fa9e; Mon, 5 Aug 2024 09:34:10 +0000 (UTC) X-Farcaster-Flow-ID: bfd7aaac-eb93-420b-99d8-e287bbc4fa9e Received: from EX19D014EUC004.ant.amazon.com (10.252.51.182) by EX19MTAEUA001.ant.amazon.com (10.252.50.192) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 5 Aug 2024 09:34:08 +0000 Received: from u5d18b891348c5b.ant.amazon.com (10.146.13.113) by EX19D014EUC004.ant.amazon.com (10.252.51.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 5 Aug 2024 09:33:59 +0000 From: James Gowans To: CC: James Gowans , Sean Christopherson , Paolo Bonzini , Alexander Viro , Steve Sistare , Christian Brauner , Jan Kara , "Anthony Yznaga" , Mike Rapoport , "Andrew Morton" , , Jason Gunthorpe , , Usama Arif , , Alexander Graf , David Woodhouse , Paul Durrant , Nicolas Saenz Julienne Subject: [PATCH 03/10] guestmemfs: add persistent data block allocator Date: Mon, 5 Aug 2024 11:32:38 +0200 Message-ID: <20240805093245.889357-4-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240805093245.889357-1-jgowans@amazon.com> References: <20240805093245.889357-1-jgowans@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.146.13.113] X-ClientProxiedBy: 
EX19D044UWB004.ant.amazon.com (10.13.139.134) To EX19D014EUC004.ant.amazon.com (10.252.51.182) X-Stat-Signature: jfiof9m551q8h6t45fderzmo1tt5d7ph X-Rspamd-Queue-Id: 4AD3FA0013 X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1722850454-84536 X-HE-Meta: U2FsdGVkX19btmAHlV9KxRcs5ViW/PIOnIjGp2u9ebr5sfmKQbXdfItM7zORUIBqOTuNTsh+MREf9z4trCGng+swke0awmGoMDEMn9sE/seFbSYuNa25xlVB26zSkTQoAqgwCxra5pQHPLVJ3IkyWV79FvSFXD/fxgytd6IJUVw51l8mKSH3dBg+Lp1Pvi5sjP3t+WGyiqYjTgzMLwlpUpe0UoMVE5pkk6PySnOot9FbBAa7VY/Uzm1MeP2OBpmmfL35hU+TGM7Rqr88Aa/H3exOEoZgJPal3JcFFhlrTqBWwc2wFYj2L8v1V/kBFrwqdosTelVNsFZQcZDJrd3d3FXwLKAgQfwbyIVsKGdHzEvVThk9QLZ02wQAVJlg/UAKSj5/PiYxg0NB8Efuuk+SNpUDALeeu1N4N7pAZdpQnzuI+NG7nBcm6qhVU4KKfhJ73q6uH4OLoc/kK3qxeDQMCK8PdGHrTQCL/ASjgoBcAyVv4QBE98wXkLsgC7yuzGenJLk4zQcTlD/BxYbJS9jpu2cvk76R/QMF892oBknKhpI6yoOarWms7Eywuylp0HNmPQ37G7hKKxTHkpGoDUf/jMO03pbgwjL3OvxRaVRY7SERA8QNz6LwheaEGqNk1yRgZdIMzJzQTycsAj2U8FpU5Ti97UHm08J8g8phMS38HJnM9Iw0aRngWC7nrjuCYXqH6oFgMAMpMR+6eNA+ic+DfhVK4z/P/StyXxvkliOPkhGvAshpd2TASO+4Xe5drIGT3DLlJ8iAoAmOCeS5l4AmLcPuy4rW03/QeCsxXTMwFiGiZwdhJJaly7cSrtJ7Lk+6NoEkQRjWTrj8VtOyXD7F38kriLZLBzoD6AEv0sqs5R9sqLa0Smxijk1+If4qSMnwc4QlLh2TYamFgsgos4A5K1vNU7QJgjnrpgPiqGpgwqphwnocuT9MK4/3NslBbjIZzugBH5NFxB5IVQd0c+Y EbszFsHd /GyaquY9BRvTTgpCfeQpTdGwRcq06rxQTXjeK8/jrSiqXpLtcWahbLozi2ovAuodAIkQc8tRVNvnNDtCFarO4AjMTGpRXkuqRricA63wqxthXDa+q/bkwoJBFC9LwwpS4cHlnA8x2LeGRJGEwzHISnEeRNTz4ZeRii0/iTEHzCIUSOoY1+MkizNvYvq9RXpcqbRSlDvfSd1MMyPxnNwO5Vc1eC8eI6pGCMwITB0km++0d6+PgePA+EN97buxRhY9mzCo0XM1yGGdP59c7Z8PVtvdPZQaRy9gXPmZIw8RHuhHujNp9jkijwxRB8jQJYpiDsNyOt4ZC+XwdpcZ7VYqPDDwLADqgElSd+GODSQtG3PwMkPtk7RPdBBfm2QRJPpLL0LIbf/fpWASHeU3GmH8T1br1XXampeyEsoukI+eoFPgPZeAz/mjgGqmq2WP9+WiroFX78ajGmKBFZUe+rU96efXlQ1uPXC22uIoo88W6OUcjM6quMEci89eSUhoN6LXp+MyWx+KK0i48BV+VUa9hN5QrJ5uZM/2y+6RYYOxCMRrxdqZyo5Xf26gaar3KlUXDDF/VjfpqlYXhxc0= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: 
List-Subscribe: List-Unsubscribe: In order to assign backing data memory to files there needs to be the ability to allocate blocks of data from the large contiguous reserved memory block of filesystem memory. Here an allocator is added to serve that purpose. For now it's a simple bitmap allocator: each bit corresponds to a 2 MiB chunk in the filesystem data block. On initialisation the bitmap is allocated for a fixed size (TODO: make this dynamic based on filesystem memory size). Allocating a block involves finding and setting the next free bit. Allocations will be done in the next commit which adds support for truncating files. It's quite limiting having a fixed size bitmap, and we perhaps want to look at making this a dynamic and potentially large allocation early in boot using the memblock allocator. It may also turn out that a simple bitmap is too limiting and something with more metadata is needed. Signed-off-by: James Gowans --- fs/guestmemfs/Makefile | 2 +- fs/guestmemfs/allocator.c | 40 ++++++++++++++++++++++++++++++++++++++ fs/guestmemfs/guestmemfs.c | 4 ++++ fs/guestmemfs/guestmemfs.h | 3 +++ 4 files changed, 48 insertions(+), 1 deletion(-) create mode 100644 fs/guestmemfs/allocator.c diff --git a/fs/guestmemfs/Makefile b/fs/guestmemfs/Makefile index 804997799ce8..b357073a60f3 100644 --- a/fs/guestmemfs/Makefile +++ b/fs/guestmemfs/Makefile @@ -3,4 +3,4 @@ # Makefile for persistent kernel filesystem # -obj-y += guestmemfs.o inode.o dir.o +obj-y += guestmemfs.o inode.o dir.o allocator.o diff --git a/fs/guestmemfs/allocator.c b/fs/guestmemfs/allocator.c new file mode 100644 index 000000000000..3da14d11b60f --- /dev/null +++ b/fs/guestmemfs/allocator.c @@ -0,0 +1,40 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "guestmemfs.h" + +/** + * For allocating blocks from the guestmemfs filesystem. 
+ */ + +static void *guestmemfs_allocations_bitmap(struct super_block *sb) +{ + return GUESTMEMFS_PSB(sb)->allocator_bitmap; +} + +void guestmemfs_zero_allocations(struct super_block *sb) +{ + memset(guestmemfs_allocations_bitmap(sb), 0, (1 << 20)); +} + +/* + * Allocs one 2 MiB block, and returns the block index. + * Index is 2 MiB chunk index. + * Negative error code if unable to alloc. + */ +long guestmemfs_alloc_block(struct super_block *sb) +{ + unsigned long free_bit; + void *allocations_mem = guestmemfs_allocations_bitmap(sb); + + free_bit = bitmap_find_next_zero_area(allocations_mem, + (1 << 20), /* Size */ + 0, /* Start */ + 1, /* Number of zeroed bits to look for */ + 0); /* Alignment mask - none required. */ + + if (free_bit >= PMD_SIZE / 2) + return -ENOMEM; + + bitmap_set(allocations_mem, free_bit, 1); + return free_bit; +} diff --git a/fs/guestmemfs/guestmemfs.c b/fs/guestmemfs/guestmemfs.c index 21cb3490a2bd..c45c796c497a 100644 --- a/fs/guestmemfs/guestmemfs.c +++ b/fs/guestmemfs/guestmemfs.c @@ -37,6 +37,9 @@ static int guestmemfs_fill_super(struct super_block *sb, struct fs_context *fc) psb->inodes = kzalloc(2 << 20, GFP_KERNEL); if (!psb->inodes) return -ENOMEM; + psb->allocator_bitmap = kzalloc(1 << 20, GFP_KERNEL); + if (!psb->allocator_bitmap) + return -ENOMEM; /* * Keep a reference to the persistent super block in the @@ -45,6 +48,7 @@ static int guestmemfs_fill_super(struct super_block *sb, struct fs_context *fc) sb->s_fs_info = psb; spin_lock_init(&psb->allocation_lock); guestmemfs_initialise_inode_store(sb); + guestmemfs_zero_allocations(sb); guestmemfs_get_persisted_inode(sb, 1)->flags = GUESTMEMFS_INODE_FLAG_DIR; strscpy(guestmemfs_get_persisted_inode(sb, 1)->filename, ".", GUESTMEMFS_FILENAME_LEN); diff --git a/fs/guestmemfs/guestmemfs.h b/fs/guestmemfs/guestmemfs.h index 3a2954d1beec..af9832390be3 100644 --- a/fs/guestmemfs/guestmemfs.h +++ b/fs/guestmemfs/guestmemfs.h @@ -13,6 +13,7 @@ struct guestmemfs_sb { unsigned long 
next_free_ino; unsigned long allocated_inodes; struct guestmemfs_inode *inodes; + void *allocator_bitmap; spinlock_t allocation_lock; }; @@ -37,6 +38,8 @@ struct guestmemfs_inode { }; void guestmemfs_initialise_inode_store(struct super_block *sb); +void guestmemfs_zero_allocations(struct super_block *sb); +long guestmemfs_alloc_block(struct super_block *sb); struct inode *guestmemfs_inode_get(struct super_block *sb, unsigned long ino); struct guestmemfs_inode *guestmemfs_get_persisted_inode(struct super_block *sb, int ino); From patchwork Mon Aug 5 09:32:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Gowans, James" X-Patchwork-Id: 13753330 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 69DB2C3DA4A for ; Mon, 5 Aug 2024 09:34:33 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EFB436B0093; Mon, 5 Aug 2024 05:34:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EAED46B0095; Mon, 5 Aug 2024 05:34:32 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D4F276B0096; Mon, 5 Aug 2024 05:34:32 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id B59F76B0093 for ; Mon, 5 Aug 2024 05:34:32 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 37CB7161C6D for ; Mon, 5 Aug 2024 09:34:32 +0000 (UTC) X-FDA: 82417681584.14.8CDB010 Received: from smtp-fw-9102.amazon.com (smtp-fw-9102.amazon.com [207.171.184.29]) by imf16.hostedemail.com (Postfix) with ESMTP id 13BBB180010 for ; Mon, 5 Aug 2024 09:34:29 +0000 (UTC) 
Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=amazon.com header.s=amazon201209 header.b=v4hYxFIO; spf=pass (imf16.hostedemail.com: domain of "prvs=940e15008=jgowans@amazon.com" designates 207.171.184.29 as permitted sender) smtp.mailfrom="prvs=940e15008=jgowans@amazon.com"; dmarc=pass (policy=quarantine) header.from=amazon.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1722850408; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=rBGc29GJqkanga86wc+CDruFBMWuAWTKAOMI6eWst9M=; b=zkAgQOj00b54sz0wBvDT3ZS8VTA6EYeX9mqpbHgiGcKsCUK5h0Cf4OIr8PWdWl3ap4bup/ YEJ40oStYATPxberrYb8McdRSJGUvG8ZkznpbN1ZfRVxGxJrGDY/8G1iNvSqj6Vut48O2p qWJfkk+GkMq50hUnLYnvSUZkBipc+ic= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1722850408; a=rsa-sha256; cv=none; b=MCeBEdpkSsmSs7FNWUUKMG6ceNvVy3gRzFy35U7GI1tDARmTLc5LusPNefWhxDBkBi4Ky9 K8edObV4NyhOvkJyHqzdagtRZtncS380ib3/YwUzNkhvkGN169ZGjqgRe+jZrdiIRIquUi 4fTw6PwROwi20JTP/6GnVrAxP0r/+Kg= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=amazon.com header.s=amazon201209 header.b=v4hYxFIO; spf=pass (imf16.hostedemail.com: domain of "prvs=940e15008=jgowans@amazon.com" designates 207.171.184.29 as permitted sender) smtp.mailfrom="prvs=940e15008=jgowans@amazon.com"; dmarc=pass (policy=quarantine) header.from=amazon.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1722850471; x=1754386471; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=rBGc29GJqkanga86wc+CDruFBMWuAWTKAOMI6eWst9M=; b=v4hYxFIOOn8QcARdOynxpYFVXg9WnxPMiM4F6TMDuEgPr9LGnHWLXWV4 2S+vTN1XmW32pRDcl3jtnqEu7RuxwR5gALvuVcPAaVAcmk63grZKxqOpO 
j7hAgl5FIcwRZtgVSyAfwpCWb2IrsJmaXhsZk/BMEsdJNkPiKwuay3dOI 8=; X-IronPort-AV: E=Sophos;i="6.09,264,1716249600"; d="scan'208";a="440962193" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-east-1.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-9102.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Aug 2024 09:34:20 +0000 Received: from EX19MTAEUB001.ant.amazon.com [10.0.10.100:16945] by smtpin.naws.eu-west-1.prod.farcaster.email.amazon.dev [10.0.35.73:2525] with esmtp (Farcaster) id bb9ee3fa-f148-4792-ad18-0aaa1262bc00; Mon, 5 Aug 2024 09:34:18 +0000 (UTC) X-Farcaster-Flow-ID: bb9ee3fa-f148-4792-ad18-0aaa1262bc00 Received: from EX19D014EUC004.ant.amazon.com (10.252.51.182) by EX19MTAEUB001.ant.amazon.com (10.252.51.26) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 5 Aug 2024 09:34:18 +0000 Received: from u5d18b891348c5b.ant.amazon.com (10.146.13.113) by EX19D014EUC004.ant.amazon.com (10.252.51.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 5 Aug 2024 09:34:09 +0000 From: James Gowans To: CC: James Gowans , Sean Christopherson , Paolo Bonzini , Alexander Viro , Steve Sistare , Christian Brauner , Jan Kara , "Anthony Yznaga" , Mike Rapoport , "Andrew Morton" , , Jason Gunthorpe , , Usama Arif , , Alexander Graf , David Woodhouse , Paul Durrant , Nicolas Saenz Julienne Subject: [PATCH 04/10] guestmemfs: support file truncation Date: Mon, 5 Aug 2024 11:32:39 +0200 Message-ID: <20240805093245.889357-5-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240805093245.889357-1-jgowans@amazon.com> References: <20240805093245.889357-1-jgowans@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.146.13.113] X-ClientProxiedBy: EX19D044UWB004.ant.amazon.com (10.13.139.134) To EX19D014EUC004.ant.amazon.com (10.252.51.182) X-Stat-Signature: oznf4hzkknck6xjfm3ozhbquqyxsretw 
X-Rspamd-Queue-Id: 13BBB180010 X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1722850469-862343 X-HE-Meta: U2FsdGVkX18lE6NqDcaoQz9vJnb5U9ycH63O7h0iIg+kRJuFTX3SCbwIMflgwZ4+cCOTaHoXHkfzHYBVRaKoghhv7gi75ByUE8s8cujNo50riLYkntgIBUORKt62DICgD9tn+OcoYoYf5x30Vjoi8bxFnK1vL/cIv0+K8J23tJkfcSfAGOdu+/XHRu6qHFZLSk0HbLetJRRkkDJSJMCDMsvqgm11hVBtugXBYKgqIov8gGYSVDFBQJi1GzcxuPg/+ntJoIX6k/PBSfhsx9396dDjjAxb3GzsHjygPlmYTM4OHQUf218eNFKs/TNolCQz7aSXsHwYxHzfnKeI2rIfoIOiYyXphyKsqEOrIwX3Dev1RFAWBic+x/lZbUlSbapymucmz66Ol1keSsegJlnXpvQZIgRkSU0Xsozc83L3OlMSE9/XfwMcGoKkeZv1Sqdp927DoQpT6yrjKjpSngjxTIvOcKhjOGF4/ZxACbkKja5+tqHfvXRNnj/J54s+APxz8KHWfBBF3ntbbRZVF+M3pkwrV+7hN4S45cUzcJWT4hedfDSHQOKm1ifLIGcoK1QYMOR6G3Qeb+UhpxT2ggXnxWBXEIn9acALkZywlbD0dS+pBVjQv1z7agO0x6YkhghW7qTG1CLTCR5M3NalMdYkBGoZS0ggDsR7SCuXEbfLlbbmZy0L7Jf+sBVocQq7BoeJzM5lHpZwPKcFvFFq/Ahodn5f5aY/c5SJaHyO3cCTzOye/P/vGiP9lGzpN1/XJGi81t3mJpAHlMaD7XiQLYircHarn7g8ckq5QzOVWe0o7JawOXbxxOMgcmb4bt+jgIzlE/cZC2l/h/ZfTZntZfyh6pZCotjl/JgX9YAm2uDpOlwqulh04X/wr8kr9eD9Nvp39r5HbXUJj8A7A27QgKDbQ3iQa9OVjRnRDdIYMqmWVjUorZRfOnfVwcz0vHVUkkFe/udepNobkvjqDSIYvb2 z/L6pqLC uZ5NkGNeVjqV4Y8fIolDynH08WOCAty+G/KuEtp0Q6/jDwdXA9LqRCWOW0T8WUmscVX9dn3aZiu0QH+9j8SgGvtrm2iIJ4Qbo59ze+JnbZ8rxZXRIo2kntbY5iJW+ywy+DaaBATm6z3u4rGrOJAZBsIRxfMT4ZwOGoac8IVRsQPdM2v7i7eu33dtavXycwP4D639shSkhSIXNdzJ8bKd7U1LKV1RgcjMJ1HtG/CkE2sbkQt8Rapoh3DKQLViEExmCDF9xI6AoCqljDQZEZ3xK/U8TVxdnug7QrpE1O90BfdTiTOsYRpCus3LAPqqD1LExdFr6aWoW6lqmXR4MPLKH+cDrZDLURQTQ/dXh3cw8P7JAgo38IZ8EKKydLXpYUSF9UlAh7280SoXmgCt+xbNYMjgPfpaS+dkag0M9rOZhYPP/MmJQEbiiWpcNaWjtNtGpgdoVRw3ymdOQZEIPvM7R2nfXl/nU3oI+hEQTavA+Wi37Anghbob/CMv3H1knjAV2mFceukju9N5eTPyqj2xE7JReWTqdo6LXed01cI3i0wSomYdAjJu/9o/bWWlhsiVLWs7Pg3Cgu8b+V0o= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: In a previous commit a block allocator was added. 
Now use that block allocator to allocate blocks for files when ftruncate is run on them. To do that an inode_operations is added on the file inodes with a setattr callback handling the ATTR_SIZE attribute. When this is invoked pages are allocated, the indexes of which are put into a mappings block. The mappings block is an array with the index being the file offset block and the value at that index being the guestmemfs block backing that file offset. Signed-off-by: James Gowans --- fs/guestmemfs/Makefile | 2 +- fs/guestmemfs/file.c | 52 ++++++++++++++++++++++++++++++++++++++ fs/guestmemfs/guestmemfs.h | 2 ++ fs/guestmemfs/inode.c | 25 +++++++++++++++--- 4 files changed, 77 insertions(+), 4 deletions(-) create mode 100644 fs/guestmemfs/file.c diff --git a/fs/guestmemfs/Makefile b/fs/guestmemfs/Makefile index b357073a60f3..e93e43ba274b 100644 --- a/fs/guestmemfs/Makefile +++ b/fs/guestmemfs/Makefile @@ -3,4 +3,4 @@ # Makefile for persistent kernel filesystem # -obj-y += guestmemfs.o inode.o dir.o allocator.o +obj-y += guestmemfs.o inode.o dir.o allocator.o file.o diff --git a/fs/guestmemfs/file.c b/fs/guestmemfs/file.c new file mode 100644 index 000000000000..618c93b12196 --- /dev/null +++ b/fs/guestmemfs/file.c @@ -0,0 +1,52 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "guestmemfs.h" + +static int truncate(struct inode *inode, loff_t newsize) +{ + long free_block; + struct guestmemfs_inode *guestmemfs_inode; + unsigned long *mappings; + + guestmemfs_inode = guestmemfs_get_persisted_inode(inode->i_sb, inode->i_ino); + mappings = guestmemfs_inode->mappings; + i_size_write(inode, newsize); + for (int block_idx = 0; block_idx * PMD_SIZE < newsize; ++block_idx) { + free_block = guestmemfs_alloc_block(inode->i_sb); + if (free_block < 0) + /* TODO: roll back allocations. 
*/ + return -ENOMEM; + *(mappings + block_idx) = free_block; + ++guestmemfs_inode->num_mappings; + } + return 0; +} + +static int inode_setattr(struct mnt_idmap *idmap, struct dentry *dentry, struct iattr *iattr) +{ + struct inode *inode = dentry->d_inode; + int error; + + error = setattr_prepare(idmap, dentry, iattr); + if (error) + return error; + + if (iattr->ia_valid & ATTR_SIZE) { + error = truncate(inode, iattr->ia_size); + if (error) + return error; + } + setattr_copy(idmap, inode, iattr); + mark_inode_dirty(inode); + return 0; +} + +const struct inode_operations guestmemfs_file_inode_operations = { + .setattr = inode_setattr, + .getattr = simple_getattr, +}; + +const struct file_operations guestmemfs_file_fops = { + .owner = THIS_MODULE, + .iterate_shared = NULL, +}; diff --git a/fs/guestmemfs/guestmemfs.h b/fs/guestmemfs/guestmemfs.h index af9832390be3..7ea03ac8ecca 100644 --- a/fs/guestmemfs/guestmemfs.h +++ b/fs/guestmemfs/guestmemfs.h @@ -44,3 +44,5 @@ struct inode *guestmemfs_inode_get(struct super_block *sb, unsigned long ino); struct guestmemfs_inode *guestmemfs_get_persisted_inode(struct super_block *sb, int ino); extern const struct file_operations guestmemfs_dir_fops; +extern const struct file_operations guestmemfs_file_fops; +extern const struct inode_operations guestmemfs_file_inode_operations; diff --git a/fs/guestmemfs/inode.c b/fs/guestmemfs/inode.c index 2360c3a4857d..61f70441d82c 100644 --- a/fs/guestmemfs/inode.c +++ b/fs/guestmemfs/inode.c @@ -15,14 +15,28 @@ struct guestmemfs_inode *guestmemfs_get_persisted_inode(struct super_block *sb, struct inode *guestmemfs_inode_get(struct super_block *sb, unsigned long ino) { + struct guestmemfs_inode *guestmemfs_inode; struct inode *inode = iget_locked(sb, ino); /* If this inode is cached it is already populated; just return */ if (!(inode->i_state & I_NEW)) return inode; - inode->i_op = &guestmemfs_dir_inode_operations; + guestmemfs_inode = guestmemfs_get_persisted_inode(sb, ino); inode->i_sb = 
sb; - inode->i_mode = S_IFREG; + + if (guestmemfs_inode->flags & GUESTMEMFS_INODE_FLAG_DIR) { + inode->i_op = &guestmemfs_dir_inode_operations; + inode->i_mode = S_IFDIR; + } else { + inode->i_op = &guestmemfs_file_inode_operations; + inode->i_mode = S_IFREG; + inode->i_fop = &guestmemfs_file_fops; + inode->i_size = guestmemfs_inode->num_mappings * PMD_SIZE; + } + + set_nlink(inode, 1); + + /* Switch based on file type */ unlock_new_inode(inode); return inode; } @@ -103,6 +117,7 @@ static struct dentry *guestmemfs_lookup(struct inode *dir, unsigned int flags) { struct guestmemfs_inode *guestmemfs_inode; + struct inode *vfs_inode; unsigned long ino; guestmemfs_inode = guestmemfs_get_persisted_inode(dir->i_sb, dir->i_ino); @@ -112,7 +127,10 @@ static struct dentry *guestmemfs_lookup(struct inode *dir, if (!strncmp(guestmemfs_inode->filename, dentry->d_name.name, GUESTMEMFS_FILENAME_LEN)) { - d_add(dentry, guestmemfs_inode_get(dir->i_sb, ino)); + vfs_inode = guestmemfs_inode_get(dir->i_sb, ino); + mark_inode_dirty(dir); + inode_update_timestamps(vfs_inode, S_ATIME); + d_add(dentry, vfs_inode); break; } ino = guestmemfs_inode->sibling_ino; @@ -162,3 +180,4 @@ const struct inode_operations guestmemfs_dir_inode_operations = { .lookup = guestmemfs_lookup, .unlink = guestmemfs_unlink, }; + From patchwork Mon Aug 5 09:32:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Gowans, James" X-Patchwork-Id: 13753331 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6CA17C3DA7F for ; Mon, 5 Aug 2024 09:34:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9A58B6B0095; Mon, 5 Aug 2024 05:34:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 92D5C6B0096; Mon, 5 Aug 2024 05:34:33 -0400 (EDT) X-Delivered-To: 
int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7815B6B0098; Mon, 5 Aug 2024 05:34:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 5C4E16B0095 for ; Mon, 5 Aug 2024 05:34:33 -0400 (EDT) Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 05556A8AF8 for ; Mon, 5 Aug 2024 09:34:33 +0000 (UTC) X-FDA: 82417681626.25.41B07FF Received: from smtp-fw-52002.amazon.com (smtp-fw-52002.amazon.com [52.119.213.150]) by imf26.hostedemail.com (Postfix) with ESMTP id EE24214000D for ; Mon, 5 Aug 2024 09:34:30 +0000 (UTC) Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=amazon.com header.s=amazon201209 header.b=TKnw5uIk; spf=pass (imf26.hostedemail.com: domain of "prvs=940e15008=jgowans@amazon.com" designates 52.119.213.150 as permitted sender) smtp.mailfrom="prvs=940e15008=jgowans@amazon.com"; dmarc=pass (policy=quarantine) header.from=amazon.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1722850440; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=QLzX8kQ4Me7VkWkGGUOEVumIPTgCaBccV0Vp9jQZvU0=; b=5wB+/wr7M6hFDd2GF86t7QauSWxGvUBbn9hrumWWO4v3XzEe+rykTZc6K/s67WGl3+HofB MGZUbAnmmcI5yR6JQP/egvHCM9sQe9Q4/QEdJRWpkpfPr0Dgyzx0yhGFWnMlcMxbbMBnSK h5W4bDxpc85QEaU2MFmwbLfyrw1+FGc= ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=pass header.d=amazon.com header.s=amazon201209 header.b=TKnw5uIk; spf=pass (imf26.hostedemail.com: domain of "prvs=940e15008=jgowans@amazon.com" designates 52.119.213.150 as permitted sender) 
smtp.mailfrom="prvs=940e15008=jgowans@amazon.com"; dmarc=pass (policy=quarantine) header.from=amazon.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1722850440; a=rsa-sha256; cv=none; b=Z51LudRc2uCzzFcQmpz/UXQ4yU3GmFHawmKICb8C/RCHLgzaX3l9Tov8sWdaHcV64kl7q8 BudX5rHno96v1YN5xS3j1YOONilpQXAYJNiLCCOaeYQxkD8wByupZbU1JcA3o3KKuOS5o/ owqcOf84x5dxXF+WNN+EkyVNyaJGjLg= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1722850471; x=1754386471; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QLzX8kQ4Me7VkWkGGUOEVumIPTgCaBccV0Vp9jQZvU0=; b=TKnw5uIkWkfndoeMCMsx5vw6hULdccDEnvHuP0BOnvIEVFcjdZySe+hD PXjDoQ/MhHZuPzGGfG1LZx5GaZllA4FXABFsNdDaYFyhYVLzjGOWYTj+k 8OC/XrAWvYpAhgis8PccOnrl2txeb4C5y0FMnqsQdL48f5K0V+1EkJi/a c=; X-IronPort-AV: E=Sophos;i="6.09,264,1716249600"; d="scan'208";a="650673665" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-east-1.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-52002.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Aug 2024 09:34:29 +0000 Received: from EX19MTAEUB002.ant.amazon.com [10.0.17.79:60629] by smtpin.naws.eu-west-1.prod.farcaster.email.amazon.dev [10.0.35.73:2525] with esmtp (Farcaster) id d3e48c8d-1f8d-494a-8e05-dd641a46ffbf; Mon, 5 Aug 2024 09:34:28 +0000 (UTC) X-Farcaster-Flow-ID: d3e48c8d-1f8d-494a-8e05-dd641a46ffbf Received: from EX19D014EUC004.ant.amazon.com (10.252.51.182) by EX19MTAEUB002.ant.amazon.com (10.252.51.79) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 5 Aug 2024 09:34:27 +0000 Received: from u5d18b891348c5b.ant.amazon.com (10.146.13.113) by EX19D014EUC004.ant.amazon.com (10.252.51.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 5 Aug 2024 09:34:18 +0000 From: James Gowans To: CC: James Gowans , Sean Christopherson 
, Paolo Bonzini , Alexander Viro , Steve Sistare , Christian Brauner , Jan Kara , "Anthony Yznaga" , Mike Rapoport , "Andrew Morton" , , Jason Gunthorpe , , Usama Arif , , Alexander Graf , David Woodhouse , Paul Durrant , Nicolas Saenz Julienne Subject: [PATCH 05/10] guestmemfs: add file mmap callback Date: Mon, 5 Aug 2024 11:32:40 +0200 Message-ID: <20240805093245.889357-6-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240805093245.889357-1-jgowans@amazon.com> References: <20240805093245.889357-1-jgowans@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.146.13.113] X-ClientProxiedBy: EX19D044UWB004.ant.amazon.com (10.13.139.134) To EX19D014EUC004.ant.amazon.com (10.252.51.182) X-Rspamd-Server: rspam03 X-Rspam-User: X-Rspamd-Queue-Id: EE24214000D X-Stat-Signature: r9cbzraz8ycy6ci935z7pq3rp9a3etkk X-HE-Tag: 1722850470-95087 X-HE-Meta: U2FsdGVkX185wP6oU4k9bBadHsrKtP/BLDFsy8oaocQb7lZpJ/UUDunwpGbr+QrL8T/XGPRpbrA32dCV+6Q+JwdFXcNwg4pGwxA+1ZHgbbCs/7fIFrLTPRALiLdpWUJOllkjROXh/ygrQS5vmJHWL3OoZu/CJksxOQ9Tp3/Opm+LRU35i/xxmeKzpq8Y/pVefHh5IohSYjxMOSPbbJupv6DP1HgudZvkYs/x0EBsdNGTfCwRTXtJH75EfD46JiseNxAzQImVWBPQipBfObYfZRR0OfRbu5isQhIHzWSHhe+aNzf7u5DGWhcl05ehgJEllohq/pXXb3fabQ7zyMKoLDevLLiwLZmZgLWCNeMzayvg7gN7gSkmG4wpfKMSdt1IVjWNdp9KmKU94XMkbTQAWJ7wsgZCa8f/inHH6zW+7a7+My1zkrM3vi0XAFgZH3GixQcPnbRdapj2OKxrKGJpWmu0JXJHMcYL0ZR0pUFiwoYhdyHK7fi7/qIx5OZH1uzsSDqyolHb99aG2kh3vn8IPbN0HB5U6zBLqO0D8VCEPhDvrQk6Mdgxp+iCPs89v5wUD7PkG2XHxAG98r/ZZMFpjbnqD7su82VfOwk4Kkafp9HkERLTmfGyMEuwCxvZ0hlgVe080VREamFc4UenRIymh/xo5zY/WgZU4zott4vCnLZie7QD698NRsYjNTHqPpuq4Lqz/ySufzRNnNETbkQ1puwpOo229qQ21w6HmzBGhvFES3YZ2Kt5Fz0ny0yiHYMa21vps4SwnsQal3MV/zXFWp6I3gkJC97dI69cuLEZru9PpmFoDFytJCjWBqIuDASsImAQtwb2UmQ6f3ryRT2eM9O8Vch5MuFGwSttAj4RZVWq9w7VDOG+Hq2JmHWMCY4cyGrsNwyOj9BpTDltVpzNDY5eAZhc5X74wUVQnpHIpGmT/jp+T7KtPK7uIlX4ccyiaXfEKBcpCeH9gQRRVx2 2Fwme9aA 
tiZMKEXTX82tbp33rMuF5zxJhtLBTZZBE629CJX6KjrlmRFUdWZwO1WDHELt11cA17Aqh31OZgAK1f98gBfhenDILdr98eN43jHfJJWt/+Du319yblvgfNk2zsCyKWXjE6LpkjbXve03pHSamORYCX+XjV0lGxWqYQb/QOEiirWsXr4nE1QLnoobE//wfrx/sUwRhU3pEt9VB5yhAOUO/BJEYo6DLGCVpEAP8QYBKkcksXw0WGjwtvPb9L56J3BMDFff0WGrJuwuSPG36Oeo+3s4wVFzLJ6OkYoMqB7op1K2RxAVdy8CgRYrU0QGOUIxbDSEASFWcETE5X3D9DuhK7Fjv6raLjtq3zvIpMikBmRwd0ah+3Z2yfbNsIbu/bnobGP+P0dWFabbx3avjZS9hq+oiwh1YR9hn4WSjdYRzzQvFdAuB0g7NYd/BWd2YRt3+WxJVfz47V+aglB756sXwohpN/Vk2USNNKWKD8ZNO1D64WzYDEfANwp53CcSklYWn9X5rDaj69233fFbY/e/1pNJp+9RFDE6HLeUnEd7QxF6/7MidgEIvHFUBznzoMJBXZ1xf8rOyiiUs/SE3/kbgV8eZLMux9uDhzz+9OxxrNaRo5q8= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Make the file data usable to userspace by adding mmap. That's all that QEMU needs for guest RAM, so that's all we bother implementing for now. When mmaping the file the VMA is marked as PFNMAP to indicate that there are no struct pages for the memory in this VMA. remap_pfn_range() is used to actually populate the page tables. All PTEs are pre-faulted into the pgtables at mmap time so that the pgtables are usable when this virtual address range is given to VFIO's MAP_DMA. 
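For readers following the address arithmetic behind the pre-faulting described above, here is a rough userspace sketch; it is not part of this patch. It assumes 4 KiB base pages and 2 MiB blocks (so 512 page frames per block, matching the "* 512" in the mmap code), and every sketch_* name is hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration only: 4 KiB base pages, 2 MiB (PMD-sized) blocks. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_SIZE (UINT64_C(1) << 21)
#define SKETCH_PFNS_PER_BLOCK (SKETCH_PMD_SIZE >> SKETCH_PAGE_SHIFT) /* 512 */

/*
 * Index into the per-file mappings array for a user virtual address:
 * the number of whole 2 MiB blocks between the VMA start and the address.
 */
static uint64_t sketch_file_block(uint64_t vm_start, uint64_t addr)
{
	assert(addr >= vm_start);
	return (addr - vm_start) / SKETCH_PMD_SIZE;
}

/*
 * First page frame number of an allocated block: the PFN of the filesystem's
 * physical base plus 512 frames for every block that precedes it.
 */
static uint64_t sketch_block_to_pfn(uint64_t base_phys, uint64_t mapped_block)
{
	return (base_phys >> SKETCH_PAGE_SHIFT) +
	       mapped_block * SKETCH_PFNS_PER_BLOCK;
}
```

With a base of 0x100000000 (4 GiB), block 3 would start at PFN 0x100000 + 1536. The mmap callback walks the VMA in 2 MiB steps, looks up the backing block for each step in the mappings array, and hands the resulting PFN to remap_pfn_range().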

Signed-off-by: James Gowans
---
 fs/guestmemfs/file.c       | 43 +++++++++++++++++++++++++++++++++++++-
 fs/guestmemfs/guestmemfs.c |  2 +-
 fs/guestmemfs/guestmemfs.h |  3 +++
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/fs/guestmemfs/file.c b/fs/guestmemfs/file.c
index 618c93b12196..b1a52abcde65 100644
--- a/fs/guestmemfs/file.c
+++ b/fs/guestmemfs/file.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
 #include "guestmemfs.h"
+#include
 
 static int truncate(struct inode *inode, loff_t newsize)
 {
@@ -41,6 +42,46 @@ static int inode_setattr(struct mnt_idmap *idmap, struct dentry *dentry, struct
 	return 0;
 }
 
+/*
+ * To be able to use PFNMAP VMAs for VFIO DMA mapping we need the page tables
+ * populated with mappings. Pre-fault everything.
+ */
+static int mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	int rc;
+	unsigned long *mappings_block;
+	struct guestmemfs_inode *guestmemfs_inode;
+
+	guestmemfs_inode = guestmemfs_get_persisted_inode(filp->f_inode->i_sb,
+			filp->f_inode->i_ino);
+
+	mappings_block = guestmemfs_inode->mappings;
+
+	/* remap_pfn_range() will mark the range VM_IO */
+	for (unsigned long vma_addr_offset = vma->vm_start;
+			vma_addr_offset < vma->vm_end;
+			vma_addr_offset += PMD_SIZE) {
+		int block, mapped_block;
+		unsigned long map_size = min(PMD_SIZE, vma->vm_end - vma_addr_offset);
+
+		block = (vma_addr_offset - vma->vm_start) / PMD_SIZE;
+		mapped_block = *(mappings_block + block);
+		/*
+		 * It's wrong to use remap_pfn_range; this will install PTE-level entries.
+		 * The whole point of 2 MiB allocs is to improve TLB perf!
+		 * We should use something like mm/huge_memory.c#insert_pfn_pmd
+		 * but that is currently static.
+		 * TODO: figure out the best way to install PMDs.
+		 */
+		rc = remap_pfn_range(vma,
+				vma_addr_offset,
+				(guestmemfs_base >> PAGE_SHIFT) + (mapped_block * 512),
+				map_size,
+				vma->vm_page_prot);
+	}
+	return 0;
+}
+
 const struct inode_operations guestmemfs_file_inode_operations = {
 	.setattr = inode_setattr,
 	.getattr = simple_getattr,
@@ -48,5 +89,5 @@ const struct inode_operations guestmemfs_file_inode_operations = {
 
 const struct file_operations guestmemfs_file_fops = {
 	.owner = THIS_MODULE,
-	.iterate_shared = NULL,
+	.mmap = mmap,
 };
diff --git a/fs/guestmemfs/guestmemfs.c b/fs/guestmemfs/guestmemfs.c
index c45c796c497a..38f20ad25286 100644
--- a/fs/guestmemfs/guestmemfs.c
+++ b/fs/guestmemfs/guestmemfs.c
@@ -9,7 +9,7 @@
 #include
 #include
 
-static phys_addr_t guestmemfs_base, guestmemfs_size;
+phys_addr_t guestmemfs_base, guestmemfs_size;
 struct guestmemfs_sb *psb;
 
 static int statfs(struct dentry *root, struct kstatfs *buf)
diff --git a/fs/guestmemfs/guestmemfs.h b/fs/guestmemfs/guestmemfs.h
index 7ea03ac8ecca..0f2788ce740e 100644
--- a/fs/guestmemfs/guestmemfs.h
+++ b/fs/guestmemfs/guestmemfs.h
@@ -8,6 +8,9 @@
 #define GUESTMEMFS_FILENAME_LEN 255
 #define GUESTMEMFS_PSB(sb) ((struct guestmemfs_sb *)sb->s_fs_info)
 
+/* Units of bytes */
+extern phys_addr_t guestmemfs_base, guestmemfs_size;
+
 struct guestmemfs_sb {
 	/* Inode number */
 	unsigned long next_free_ino;

From patchwork Mon Aug 5 09:32:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Gowans, James"
X-Patchwork-Id: 13753332
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 83050C3DA4A for ; Mon, 5 Aug 2024 09:35:25 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id 22DDB6B0099; Mon, 5 Aug 2024 05:35:25 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40) id 202A16B009A; Mon, 5 Aug 2024 05:35:25
-0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0C9966B009B; Mon, 5 Aug 2024 05:35:25 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id E38946B0099 for ; Mon, 5 Aug 2024 05:35:24 -0400 (EDT) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 5CFA7141C90 for ; Mon, 5 Aug 2024 09:35:24 +0000 (UTC) X-FDA: 82417683768.30.F3BFF64 Received: from smtp-fw-80007.amazon.com (smtp-fw-80007.amazon.com [99.78.197.218]) by imf15.hostedemail.com (Postfix) with ESMTP id 4F2D4A0009 for ; Mon, 5 Aug 2024 09:35:22 +0000 (UTC) Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=amazon.com header.s=amazon201209 header.b=BZaZ630C; spf=pass (imf15.hostedemail.com: domain of "prvs=940e15008=jgowans@amazon.com" designates 99.78.197.218 as permitted sender) smtp.mailfrom="prvs=940e15008=jgowans@amazon.com"; dmarc=pass (policy=quarantine) header.from=amazon.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1722850460; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Wges4MT0n6I9CyO780W2I1CcNRjpe5PQ75LHC+blwyY=; b=qTFYje2JvMFp/JZG708aeApuAGfoaGJilWIpwhuXYZK4KhgGhJYuFcTjRHxIVRNe9TMXwv Yj/L1wCCAjWMprX4Hw5XSP9gvmpSfCecLRH6fiTS47uKKgeOqVEqNzmpr+P4Y5PepvSnCW bpRXEbUasbf0KQbxAws3TYyehARrpG4= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1722850460; a=rsa-sha256; cv=none; b=Y5/ymXE5SoJ+/LJXLEs1cF4ntlU9JmXa2B1w+wGXcHL6R3p3xGbnIUcLyIlZXwpM6hmmYA IJ5ZaxA+4l4eoLLpvg7p2xzHU0Ioxf0EFT3Ksycwn5vXODXVnE86xcI0dPld5kNu5CAtnD s43CvUFOh6LvsMuYIHGJAmz6t1TErh0= 
ARC-Authentication-Results: i=1; imf15.hostedemail.com; dkim=pass header.d=amazon.com header.s=amazon201209 header.b=BZaZ630C; spf=pass (imf15.hostedemail.com: domain of "prvs=940e15008=jgowans@amazon.com" designates 99.78.197.218 as permitted sender) smtp.mailfrom="prvs=940e15008=jgowans@amazon.com"; dmarc=pass (policy=quarantine) header.from=amazon.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1722850522; x=1754386522; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Wges4MT0n6I9CyO780W2I1CcNRjpe5PQ75LHC+blwyY=; b=BZaZ630Czllo7tAoiVOI93WyGbuSy0ixVXym56SPOhUnTDUtsCCWRwaL KBFL1+H3FhMKGYhSOaIpQt+VKzy/5T8LDu9lrkawq2NJQt+M49ghByb7o BLVImtlqykFutxqfqo4hb832aAsumYILUioDCfrZlDkl/E83Ub+XURFGe M=; X-IronPort-AV: E=Sophos;i="6.09,264,1716249600"; d="scan'208";a="318022324" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.210]) by smtp-border-fw-80007.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Aug 2024 09:35:09 +0000 Received: from EX19MTAEUC002.ant.amazon.com [10.0.43.254:19549] by smtpin.naws.eu-west-1.prod.farcaster.email.amazon.dev [10.0.14.223:2525] with esmtp (Farcaster) id 73411eaf-fd8d-4133-a679-10dbf0f4e67a; Mon, 5 Aug 2024 09:35:08 +0000 (UTC) X-Farcaster-Flow-ID: 73411eaf-fd8d-4133-a679-10dbf0f4e67a Received: from EX19D014EUC004.ant.amazon.com (10.252.51.182) by EX19MTAEUC002.ant.amazon.com (10.252.51.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 5 Aug 2024 09:35:08 +0000 Received: from u5d18b891348c5b.ant.amazon.com (10.146.13.113) by EX19D014EUC004.ant.amazon.com (10.252.51.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 5 Aug 2024 09:34:58 +0000 From: James Gowans To: CC: James Gowans , Sean Christopherson , 
Paolo Bonzini , Alexander Viro , Steve Sistare , Christian Brauner , Jan Kara , "Anthony Yznaga" , Mike Rapoport , "Andrew Morton" , , Jason Gunthorpe , , Usama Arif , , Alexander Graf , David Woodhouse , Paul Durrant , Nicolas Saenz Julienne Subject: [PATCH 06/10] kexec/kho: Add addr flag to not initialise memory Date: Mon, 5 Aug 2024 11:32:41 +0200 Message-ID: <20240805093245.889357-7-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240805093245.889357-1-jgowans@amazon.com> References: <20240805093245.889357-1-jgowans@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.146.13.113] X-ClientProxiedBy: EX19D046UWB002.ant.amazon.com (10.13.139.181) To EX19D014EUC004.ant.amazon.com (10.252.51.182) X-Stat-Signature: qffrcttsciqj4afmzygpd1xy5uctanh5 X-Rspamd-Queue-Id: 4F2D4A0009 X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1722850522-913054 X-HE-Meta: U2FsdGVkX1/DAQ0mLEBwUD1Y4jL5ZA8Foju5PvupsrfFxd/HRirPXduSqTVefiloYqd94KhQywo88WOAIsXEvfn0AdyhoVGaSbx4eeTKAj/0EJUcmqPJm/KqYRzOXi+Ql/7wjY0NbYowkCM512RhPFfdgX5bnCXxaUopkzX+jrhNIUzDRazn3rb6MtG8hg0mgaE3PSXd+bGc5QSUjwRAvbVtatvNNFQJ5b6aqhmgiB2GMQ2YkXMkP4EMiJC6uNwtF1DHkUOGOIiGl2Dp89w3+yOwzEymtvPKXFYjWfRvLbpsxN9Hc02VBll9H05pp33SRXC50D6B4CKkUtijNRe8ngOfGtz2viexWVAxOdZ7XLwUM1nehLTtAn9GUV9m+x1c6VLs1F+PBX6grRRw8QjNr3eVdla/GSEc5IAsdPAHx+WPAYBh+WWMAFqe+dKL7C7kmcgZXxJ+ORqq4uDZ6I/ZJPGacQAIh4/8ooOBl9E5yIees6/4wYZC4/gFizJIYIUnY50BD8OQr3/co9veD+OZhor5445yDy2OPtddmlrOB8rvm35uUpfYvwGfH2HwMu+jhEp6MLyEGQh3TsV/NxcTvxToHrGIkY3Ky4kxfzhCrIjl8272AmFh5MB6d7tt6GwAh7FCGCoAlFLKREoMlN577GGeRWxC8/qqPTt7qA+PL6Tt59PzLldev9YLR1pNKNyYYF9upf5IyLB96EbBs+PxCFlEWXb81NH0475paks260qtGXZ6YjbPOLShCegwySlQA7hKnwl27Jn1+XveZLuzEVrKDESEzdF/R/GJvDtPdVRpKC2es2mpttN4vIGqD12SKYwosur3KdL16V+Pbl6LVNL4M7Jhd48ucrWdB42Vv5g2r43yzALw4t+xAPr2y0JuOlBMoTHl6g4cCSILf604DkXvFXa7R9uyDFGPqEX0WBdnZl9B5p44fuyn/YicmNsVr7Z5RnNg+0Morsiu/xw 1QpC0RLU 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:
List-Subscribe:
List-Unsubscribe:

Smuggle a flag in the low bit of the address field. If set, the memory
region being reserved via KHO is marked as noinit in memblock, so it will
not get struct pages, will not be given to the buddy allocator and will
not be part of the direct map.

This allows drivers to pass memory ranges which the driver has allocated
itself from memblock, independent of the kernel's mm and struct page based
memory management.

Signed-off-by: James Gowans
---
 include/uapi/linux/kexec.h |  6 ++++++
 kernel/kexec_kho_in.c      | 12 +++++++++++-
 kernel/kexec_kho_out.c     |  4 ++++
 3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
index ad9e95b88b34..1c031a261c2c 100644
--- a/include/uapi/linux/kexec.h
+++ b/include/uapi/linux/kexec.h
@@ -52,6 +52,12 @@
 
 /* KHO passes an array of kho_mem as "mem cache" to the new kernel */
 struct kho_mem {
+	/*
+	 * Use the last bits for flags; addrs should be at least word
+	 * aligned.
+	 */
+#define KHO_MEM_ADDR_FLAG_NOINIT	BIT(0)
+#define KHO_MEM_ADDR_FLAG_MASK		(BIT(1) - 1)
 	__u64 addr;
 	__u64 len;
 };
diff --git a/kernel/kexec_kho_in.c b/kernel/kexec_kho_in.c
index 5f8e0d9f9e12..943d9483b009 100644
--- a/kernel/kexec_kho_in.c
+++ b/kernel/kexec_kho_in.c
@@ -75,6 +75,11 @@ __init void kho_populate_refcount(void)
 	 */
 	for (offset = 0; offset < mem_len; offset += sizeof(struct kho_mem)) {
 		struct kho_mem *mem = mem_virt + offset;
+
+		/* No struct pages for this region; nothing to claim. */
+		if (mem->addr & KHO_MEM_ADDR_FLAG_NOINIT)
+			continue;
+
 		u64 start_pfn = PFN_DOWN(mem->addr);
 		u64 end_pfn = PFN_UP(mem->addr + mem->len);
 		u64 pfn;
@@ -183,8 +188,13 @@ void __init kho_reserve_previous_mem(void)
 	/* Then populate all preserved memory areas as reserved */
 	for (off = 0; off < mem_len; off += sizeof(struct kho_mem)) {
 		struct kho_mem *mem = mem_virt + off;
+		__u64 addr = mem->addr & ~KHO_MEM_ADDR_FLAG_MASK;
 
-		memblock_reserve(mem->addr, mem->len);
+		memblock_reserve(addr, mem->len);
+		if (mem->addr & KHO_MEM_ADDR_FLAG_NOINIT) {
+			memblock_reserved_mark_noinit(addr, mem->len);
+			memblock_mark_nomap(addr, mem->len);
+		}
 	}
 
 	/* Unreserve the mem cache - we don't need it from here on */
diff --git a/kernel/kexec_kho_out.c b/kernel/kexec_kho_out.c
index 2cf5755f5e4a..4d9da501c5dc 100644
--- a/kernel/kexec_kho_out.c
+++ b/kernel/kexec_kho_out.c
@@ -175,6 +175,10 @@ static int kho_alloc_mem_cache(struct kimage *image, void *fdt)
 		const struct kho_mem *mem = &mems[i];
 		ulong mstart = PAGE_ALIGN_DOWN(mem->addr);
 		ulong mend = PAGE_ALIGN(mem->addr + mem->len);
+
+		/* Re-apply flags lost during round down. */
+		mstart |= mem->addr & KHO_MEM_ADDR_FLAG_MASK;
+
 		struct kho_mem cmem = {
 			.addr = mstart,
 			.len = (mend - mstart),

From patchwork Mon Aug 5 09:32:42 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Gowans, James"
X-Patchwork-Id: 13753333
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E7045C52D70 for ; Mon, 5 Aug 2024 09:35:26 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id 7D2926B009A; Mon, 5 Aug 2024 05:35:26 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40) id 735266B009B; Mon, 5 Aug 2024 05:35:26 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 537D06B009C; Mon, 5 Aug 2024 05:35:26 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 2E8A66B009A for ; Mon, 5 Aug 2024 05:35:26 -0400 (EDT)
Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id E2B7881CB0 for ; Mon, 5 Aug 2024 09:35:25 +0000 (UTC)
X-FDA: 82417683810.12.42CEF73
Received: from smtp-fw-2101.amazon.com (smtp-fw-2101.amazon.com [72.21.196.25]) by imf13.hostedemail.com (Postfix) with ESMTP id DF3BF2000F for ; Mon, 5 Aug 2024 09:35:23 +0000 (UTC)
Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=amazon.com header.s=amazon201209 header.b=l0N4dnJ9; spf=pass (imf13.hostedemail.com: domain of "prvs=940e15008=jgowans@amazon.com" designates 72.21.196.25 as permitted sender) smtp.mailfrom="prvs=940e15008=jgowans@amazon.com"; dmarc=pass (policy=quarantine) header.from=amazon.com
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com;
s=arc-20220608; t=1722850442; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=GwcyuHUonPineq63SNtpYz+107twfbb1IfXA3IZYCMA=; b=Xs7wahmhkLSq8fu7MpYX5b2e+HlsbVx5YKtmKbsyklp8KnyPpuqaRmM1O0HmQOaRyalOxQ ukROgJbhrh5GbaTdzdDu/pd+fWJOdoCFaZruliMXpqfZ7RqBpOg8FUrOL+8Nrw0EmdeKMs jBskWaOBi8W7pFTYODL6gZg5WFvlvGI= ARC-Authentication-Results: i=1; imf13.hostedemail.com; dkim=pass header.d=amazon.com header.s=amazon201209 header.b=l0N4dnJ9; spf=pass (imf13.hostedemail.com: domain of "prvs=940e15008=jgowans@amazon.com" designates 72.21.196.25 as permitted sender) smtp.mailfrom="prvs=940e15008=jgowans@amazon.com"; dmarc=pass (policy=quarantine) header.from=amazon.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1722850442; a=rsa-sha256; cv=none; b=GLe6Fy7Kt3KDVm5hUhsjTEcLWCST62yJAhXNNyS/DpIveeU9FewD33TCbysNZBDtuxEAZW EKFIYISF9GQj/yhGMh1DgvlbDW9QuCh7gTy3pumMQWCAHgk70aZ0Aj4IAVG2IXxfWtKEVz 4PgxqQglPoaxNWP+WM4mozrzgLBCQ/M= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1722850524; x=1754386524; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GwcyuHUonPineq63SNtpYz+107twfbb1IfXA3IZYCMA=; b=l0N4dnJ9A+R7uRnHMR5icwiwQLh0cAexhbGXt4mlcADbD2ueg0LDFuSY FZyUv3jjNy6obk/7pwD44imNYq0gQnELfHqNbOL92oJmeMxUXIkRAZnRb qLdahAPa0AFjkm54bQMPbb7wvAhOam8CQS5BN2Z+E5R1bOMbsYtAC7c4A Y=; X-IronPort-AV: E=Sophos;i="6.09,264,1716249600"; d="scan'208";a="419323041" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO smtpout.prod.us-east-1.prod.farcaster.email.amazon.dev) ([10.43.8.6]) by smtp-border-fw-2101.iad2.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Aug 2024 09:35:20 +0000 Received: from EX19MTAEUC001.ant.amazon.com [10.0.17.79:14005] by 
smtpin.naws.eu-west-1.prod.farcaster.email.amazon.dev [10.0.35.73:2525] with esmtp (Farcaster) id e982917a-a97c-45f2-b4e4-5a00217a4f1e; Mon, 5 Aug 2024 09:35:18 +0000 (UTC) X-Farcaster-Flow-ID: e982917a-a97c-45f2-b4e4-5a00217a4f1e Received: from EX19D014EUC004.ant.amazon.com (10.252.51.182) by EX19MTAEUC001.ant.amazon.com (10.252.51.155) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 5 Aug 2024 09:35:18 +0000 Received: from u5d18b891348c5b.ant.amazon.com (10.146.13.113) by EX19D014EUC004.ant.amazon.com (10.252.51.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 5 Aug 2024 09:35:08 +0000 From: James Gowans To: CC: James Gowans , Sean Christopherson , Paolo Bonzini , Alexander Viro , Steve Sistare , Christian Brauner , Jan Kara , "Anthony Yznaga" , Mike Rapoport , "Andrew Morton" , , Jason Gunthorpe , , Usama Arif , , Alexander Graf , David Woodhouse , Paul Durrant , Nicolas Saenz Julienne Subject: [PATCH 07/10] guestmemfs: Persist filesystem metadata via KHO Date: Mon, 5 Aug 2024 11:32:42 +0200 Message-ID: <20240805093245.889357-8-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240805093245.889357-1-jgowans@amazon.com> References: <20240805093245.889357-1-jgowans@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.146.13.113] X-ClientProxiedBy: EX19D046UWB002.ant.amazon.com (10.13.139.181) To EX19D014EUC004.ant.amazon.com (10.252.51.182) X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: DF3BF2000F X-Stat-Signature: ubqctyse1t35o3dzeap5s1531bxa1j47 X-Rspam-User: X-HE-Tag: 1722850523-985984 X-HE-Meta: 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:
List-Subscribe:
List-Unsubscribe:

Filesystem metadata consists of: physical memory extents, superblock,
inodes block and allocation bitmap. Here serialisation and
deserialisation of all of these is done via the KHO framework.

A serialisation callback is added which is run when KHO activate is
triggered. This creates the device tree blob for the metadata and marks
the memory as persistent via struct kho_mem(s).
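
As an illustration of the struct kho_mem entries mentioned above (not part
of the patch): each entry is an (addr, len) pair, and earlier in this
series the low bit of addr is used to carry a no-init flag. A minimal
userspace sketch of packing and unpacking such an entry follows. struct
sketch_kho_mem and the sketch_* helpers are hypothetical stand-ins; the
flag values mirror KHO_MEM_ADDR_FLAG_NOINIT and KHO_MEM_ADDR_FLAG_MASK
from patch 06.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for struct kho_mem: a physical address and a length. */
struct sketch_kho_mem {
	uint64_t addr;
	uint64_t len;
};

#define SKETCH_FLAG_NOINIT	0x1ULL	/* mirrors BIT(0) */
#define SKETCH_FLAG_MASK	0x1ULL	/* mirrors BIT(1) - 1 */

/* Build an entry, optionally tagging it as "do not initialise". */
static struct sketch_kho_mem sketch_pack(uint64_t phys, uint64_t len, int noinit)
{
	struct sketch_kho_mem mem;

	/* The flag bits must be free, i.e. the address must be aligned. */
	assert((phys & SKETCH_FLAG_MASK) == 0);
	mem.addr = phys | (noinit ? SKETCH_FLAG_NOINIT : 0);
	mem.len = len;
	return mem;
}

/* Recover the real physical address by masking off the flag bits. */
static uint64_t sketch_addr(const struct sketch_kho_mem *mem)
{
	return mem->addr & ~SKETCH_FLAG_MASK;
}

/* Report whether the entry carries the no-init flag. */
static int sketch_is_noinit(const struct sketch_kho_mem *mem)
{
	return (mem->addr & SKETCH_FLAG_NOINIT) != 0;
}
```

A consumer would always go through the masking helper before handing the
address to anything that expects a plain physical address, which is what
the restore path in this series does with ~KHO_MEM_ADDR_FLAG_MASK.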
When the filesystem is mounted it attempts to re-hydrate metadata from KHO. Only if this fails (first boot, for example) then it allocates fresh metadata pages. The privatet data struct is switched from holding a reference to the persistent superblock to now referencing the regular struct super_block. This is necessary for the serialisation code. Better would be to be able to define callback private data, if that were possible. Signed-off-by: James Gowans --- fs/guestmemfs/Makefile | 2 + fs/guestmemfs/guestmemfs.c | 72 ++++++--- fs/guestmemfs/guestmemfs.h | 8 + fs/guestmemfs/serialise.c | 296 +++++++++++++++++++++++++++++++++++++ 4 files changed, 355 insertions(+), 23 deletions(-) create mode 100644 fs/guestmemfs/serialise.c diff --git a/fs/guestmemfs/Makefile b/fs/guestmemfs/Makefile index e93e43ba274b..8b95cac34564 100644 --- a/fs/guestmemfs/Makefile +++ b/fs/guestmemfs/Makefile @@ -4,3 +4,5 @@ # obj-y += guestmemfs.o inode.o dir.o allocator.o file.o + +obj-$(CONFIG_KEXEC_KHO) += serialise.o diff --git a/fs/guestmemfs/guestmemfs.c b/fs/guestmemfs/guestmemfs.c index 38f20ad25286..cf47e5100504 100644 --- a/fs/guestmemfs/guestmemfs.c +++ b/fs/guestmemfs/guestmemfs.c @@ -3,6 +3,7 @@ #include "guestmemfs.h" #include #include +#include #include #include #include @@ -10,7 +11,7 @@ #include phys_addr_t guestmemfs_base, guestmemfs_size; -struct guestmemfs_sb *psb; +struct super_block *guestmemfs_sb; static int statfs(struct dentry *root, struct kstatfs *buf) { @@ -33,26 +34,39 @@ static int guestmemfs_fill_super(struct super_block *sb, struct fs_context *fc) struct inode *inode; struct dentry *dentry; - psb = kzalloc(sizeof(*psb), GFP_KERNEL); - psb->inodes = kzalloc(2 << 20, GFP_KERNEL); - if (!psb->inodes) - return -ENOMEM; - psb->allocator_bitmap = kzalloc(1 << 20, GFP_KERNEL); - if (!psb->allocator_bitmap) - return -ENOMEM; - /* * Keep a reference to the persistent super block in the * ephemeral super block. 
*/ - sb->s_fs_info = psb; - spin_lock_init(&psb->allocation_lock); - guestmemfs_initialise_inode_store(sb); - guestmemfs_zero_allocations(sb); - guestmemfs_get_persisted_inode(sb, 1)->flags = GUESTMEMFS_INODE_FLAG_DIR; - strscpy(guestmemfs_get_persisted_inode(sb, 1)->filename, ".", - GUESTMEMFS_FILENAME_LEN); - psb->next_free_ino = 2; + sb->s_fs_info = guestmemfs_restore_from_kho(); + + if (GUESTMEMFS_PSB(sb)) { + pr_info("Restored super block from KHO\n"); + } else { + struct guestmemfs_sb *psb; + + pr_info("Did not restore from KHO - allocating free\n"); + psb = kzalloc(sizeof(*psb), GFP_KERNEL); + psb->inodes = kzalloc(2 << 20, GFP_KERNEL); + if (!psb->inodes) + return -ENOMEM; + psb->allocator_bitmap = kzalloc(1 << 20, GFP_KERNEL); + if (!psb->allocator_bitmap) + return -ENOMEM; + sb->s_fs_info = psb; + spin_lock_init(&psb->allocation_lock); + guestmemfs_initialise_inode_store(sb); + guestmemfs_zero_allocations(sb); + guestmemfs_get_persisted_inode(sb, 1)->flags = GUESTMEMFS_INODE_FLAG_DIR; + strscpy(guestmemfs_get_persisted_inode(sb, 1)->filename, ".", + GUESTMEMFS_FILENAME_LEN); + GUESTMEMFS_PSB(sb)->next_free_ino = 2; + } + /* + * Keep a reference to this sb; the serialise callback needs it + * and has no oher way to get it. 
+ */ + guestmemfs_sb = sb; sb->s_op = &guestmemfs_super_ops; @@ -98,11 +112,18 @@ static struct file_system_type guestmemfs_fs_type = { .fs_flags = FS_USERNS_MOUNT, }; + +static struct notifier_block trace_kho_nb = { + .notifier_call = guestmemfs_serialise_to_kho, +}; + static int __init guestmemfs_init(void) { int ret; ret = register_filesystem(&guestmemfs_fs_type); + if (IS_ENABLED(CONFIG_FTRACE_KHO)) + register_kho_notifier(&trace_kho_nb); return ret; } @@ -120,13 +141,18 @@ early_param("guestmemfs", parse_guestmemfs_extents); void __init guestmemfs_reserve_mem(void) { - guestmemfs_base = memblock_phys_alloc(guestmemfs_size, 4 << 10); - if (guestmemfs_base) { - memblock_reserved_mark_noinit(guestmemfs_base, guestmemfs_size); - memblock_mark_nomap(guestmemfs_base, guestmemfs_size); - } else { - pr_warn("Failed to alloc %llu bytes for guestmemfs\n", guestmemfs_size); + if (guestmemfs_size) { + guestmemfs_base = memblock_phys_alloc(guestmemfs_size, 4 << 10); + + if (guestmemfs_base) { + memblock_reserved_mark_noinit(guestmemfs_base, guestmemfs_size); + memblock_mark_nomap(guestmemfs_base, guestmemfs_size); + pr_debug("guestmemfs reserved base=%llu from memblocks\n", guestmemfs_base); + } else { + pr_warn("Failed to alloc %llu bytes for guestmemfs\n", guestmemfs_size); + } } + } MODULE_ALIAS_FS("guestmemfs"); diff --git a/fs/guestmemfs/guestmemfs.h b/fs/guestmemfs/guestmemfs.h index 0f2788ce740e..263d995b75ed 100644 --- a/fs/guestmemfs/guestmemfs.h +++ b/fs/guestmemfs/guestmemfs.h @@ -10,11 +10,14 @@ /* Units of bytes */ extern phys_addr_t guestmemfs_base, guestmemfs_size; +extern struct super_block *guestmemfs_sb; struct guestmemfs_sb { /* Inode number */ unsigned long next_free_ino; unsigned long allocated_inodes; + + /* Ephemeral fields - must be updated on deserialise */ struct guestmemfs_inode *inodes; void *allocator_bitmap; spinlock_t allocation_lock; @@ -46,6 +49,11 @@ long guestmemfs_alloc_block(struct super_block *sb); struct inode 
*guestmemfs_inode_get(struct super_block *sb, unsigned long ino); struct guestmemfs_inode *guestmemfs_get_persisted_inode(struct super_block *sb, int ino); +int guestmemfs_serialise_to_kho(struct notifier_block *self, + unsigned long cmd, + void *v); +struct guestmemfs_sb *guestmemfs_restore_from_kho(void); + extern const struct file_operations guestmemfs_dir_fops; extern const struct file_operations guestmemfs_file_fops; extern const struct inode_operations guestmemfs_file_inode_operations; diff --git a/fs/guestmemfs/serialise.c b/fs/guestmemfs/serialise.c new file mode 100644 index 000000000000..eb70d496a3eb --- /dev/null +++ b/fs/guestmemfs/serialise.c @@ -0,0 +1,296 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "guestmemfs.h" +#include +#include + +/* + * Responsible for serialisation and deserialisation of filesystem metadata + * to and from KHO to survive kexec. The deserialisation logic needs to mirror + * serialisation, so putting them in the same file. + * + * The format of the device tree structure is: + * + * /guestmemfs + * compatible = "guestmemfs-v1" + * fs_mem { + * mem = [ ... ] + * }; + * superblock { + * mem = [ + * persistent super block, + * inodes, + * allocator_bitmap, + * }; + * mappings_block { + * mem = [ ... ] + * }; + * // For every mappings_block mem, which inode it belongs to. + * mappings_to_inode { + * num_inodes, + * mem = [ ... 
], + * } + */ + +static int serialise_superblock(struct super_block *sb, void *fdt) +{ + struct kho_mem mem[3]; + int err = 0; + struct guestmemfs_sb *psb = sb->s_fs_info; + + err |= fdt_begin_node(fdt, "superblock"); + + mem[0].addr = virt_to_phys(psb); + mem[0].len = sizeof(*psb); + + mem[1].addr = virt_to_phys(psb->inodes); + mem[1].len = 2 << 20; + + mem[2].addr = virt_to_phys(psb->allocator_bitmap); + mem[2].len = 1 << 20; + + err |= fdt_property(fdt, "mem", &mem, sizeof(mem)); + err |= fdt_end_node(fdt); + + return err; +} + +static int serialise_mappings_blocks(struct super_block *sb, void *fdt) +{ + struct kho_mem *mappings_mems; + struct kho_mem mappings_to_inode_mem; + struct guestmemfs_sb *psb = sb->s_fs_info; + int inode_idx; + size_t num_inodes = PMD_SIZE / sizeof(struct guestmemfs_inode); + struct guestmemfs_inode *inode; + int err = 0; + int *mappings_to_inode; + int mappings_to_inode_idx = 0; + + mappings_to_inode = kzalloc(PAGE_SIZE, GFP_KERNEL); + + mappings_mems = kcalloc(psb->allocated_inodes, sizeof(struct kho_mem), GFP_KERNEL); + + for (inode_idx = 1; inode_idx < num_inodes; ++inode_idx) { + inode = guestmemfs_get_persisted_inode(sb, inode_idx); + if (inode->flags & GUESTMEMFS_INODE_FLAG_FILE) { + mappings_mems[mappings_to_inode_idx].addr = virt_to_phys(inode->mappings); + mappings_mems[mappings_to_inode_idx].len = PAGE_SIZE; + mappings_to_inode[mappings_to_inode_idx] = inode_idx; + mappings_to_inode_idx++; + } + } + + err |= fdt_begin_node(fdt, "mappings_blocks"); + err |= fdt_property(fdt, "mem", mappings_mems, + sizeof(struct kho_mem) * mappings_to_inode_idx); + err |= fdt_end_node(fdt); + + + err |= fdt_begin_node(fdt, "mappings_to_inode"); + mappings_to_inode_mem.addr = virt_to_phys(mappings_to_inode); + mappings_to_inode_mem.len = PAGE_SIZE; + err |= fdt_property(fdt, "mem", &mappings_to_inode_mem, + sizeof(mappings_to_inode_mem)); + err |= fdt_property(fdt, "num_inodes", &psb->allocated_inodes, + sizeof(psb->allocated_inodes)); + + err 
|= fdt_end_node(fdt); + + return err; +} + +int guestmemfs_serialise_to_kho(struct notifier_block *self, + unsigned long cmd, + void *v) +{ + static const char compatible[] = "guestmemfs-v1"; + struct kho_mem mem; + void *fdt = v; + int err = 0; + + switch (cmd) { + case KEXEC_KHO_ABORT: + /* No rollback action needed. */ + return NOTIFY_DONE; + case KEXEC_KHO_DUMP: + /* Handled below */ + break; + default: + return NOTIFY_BAD; + } + + err |= fdt_begin_node(fdt, "guestmemfs"); + err |= fdt_property(fdt, "compatible", compatible, sizeof(compatible)); + + err |= fdt_begin_node(fdt, "fs_mem"); + mem.addr = guestmemfs_base | KHO_MEM_ADDR_FLAG_NOINIT; + mem.len = guestmemfs_size; + err |= fdt_property(fdt, "mem", &mem, sizeof(mem)); + err |= fdt_end_node(fdt); + + err |= serialise_superblock(guestmemfs_sb, fdt); + err |= serialise_mappings_blocks(guestmemfs_sb, fdt); + + err |= fdt_end_node(fdt); + + pr_info("Serialised extends [0x%llx + 0x%llx] via KHO: %i\n", + guestmemfs_base, guestmemfs_size, err); + + return err; +} + +static struct guestmemfs_sb *deserialise_superblock(const void *fdt, int root_off) +{ + const struct kho_mem *mem; + int mem_len; + struct guestmemfs_sb *old_sb; + int off; + + off = fdt_subnode_offset(fdt, root_off, "superblock"); + mem = fdt_getprop(fdt, off, "mem", &mem_len); + + if (mem_len != 3 * sizeof(struct kho_mem)) { + pr_err("Incorrect mem_len; got %i\n", mem_len); + return NULL; + } + + old_sb = kho_claim_mem(mem); + old_sb->inodes = kho_claim_mem(mem + 1); + old_sb->allocator_bitmap = kho_claim_mem(mem + 2); + + return old_sb; +} + +static int deserialise_mappings_blocks(const void *fdt, int root_off, + struct guestmemfs_sb *sb) +{ + int off; + int len = 0; + const unsigned long *num_inodes; + const struct kho_mem *mappings_to_inode_mem; + int *mappings_to_inode; + int mappings_block; + const struct kho_mem *mappings_blocks_mems; + + /* + * Array of struct kho_mem - one for each persisted mappings + * blocks. 
+ */ + off = fdt_subnode_offset(fdt, root_off, "mappings_blocks"); + mappings_blocks_mems = fdt_getprop(fdt, off, "mem", &len); + + /* + * Array specifying which inode a specific index into the + * mappings_blocks kho_mem array corresponds to. num_inodes + * indicates the size of the array which is the number of mappings + * blocks which need to be restored. + */ + off = fdt_subnode_offset(fdt, root_off, "mappings_to_inode"); + if (off < 0) { + pr_warn("No fs_mem available in KHO\n"); + return -EINVAL; + } + num_inodes = fdt_getprop(fdt, off, "num_inodes", &len); + if (len != sizeof(num_inodes)) { + pr_warn("Invalid num_inodes len: %i\n", len); + return -EINVAL; + } + mappings_to_inode_mem = fdt_getprop(fdt, off, "mem", &len); + if (len != sizeof(*mappings_to_inode_mem)) { + pr_warn("Invalid mappings_to_inode_mem len: %i\n", len); + return -EINVAL; + } + mappings_to_inode = kho_claim_mem(mappings_to_inode_mem); + + /* + * Re-assigned the mappings block to the inodes. Indexes into + * mappings_to_inode specifies which inode to assign each mappings + * block to. 
+ */ + for (mappings_block = 0; mappings_block < *num_inodes; ++mappings_block) { + int inode = mappings_to_inode[mappings_block]; + + sb->inodes[inode].mappings = kho_claim_mem(&mappings_blocks_mems[mappings_block]); + } + + return 0; +} + +static int deserialise_fs_mem(const void *fdt, int root_off) +{ + int err = 0; + /* Offset into the KHO DT */ + int off; + int len = 0; + const struct kho_mem *mem; + + off = fdt_subnode_offset(fdt, root_off, "fs_mem"); + if (off < 0) { + pr_info("No fs_mem available in KHO\n"); + return -EINVAL; + } + + mem = fdt_getprop(fdt, off, "mem", &len); + if (mem && len == sizeof(*mem)) { + guestmemfs_base = mem->addr & ~KHO_MEM_ADDR_FLAG_MASK; + guestmemfs_size = mem->len; + } else { + pr_err("KHO did not contain a guestmemfs base address and size\n"); + return -EINVAL; + } + + pr_info("Reclaimed [%llx + %llx] via KHO\n", guestmemfs_base, guestmemfs_size); + if (err) { + pr_err("Unable to reserve [0x%llx + 0x%llx] from memblock: %i\n", + guestmemfs_base, guestmemfs_size, err); + return err; + } + return 0; +} +struct guestmemfs_sb *guestmemfs_restore_from_kho(void) +{ + const void *fdt = kho_get_fdt(); + struct guestmemfs_sb *old_sb; + int err; + /* Offset into the KHO DT */ + int off; + + if (!fdt) { + pr_err("Unable to get KHO DT after KHO boot?\n"); + return NULL; + } + + off = fdt_path_offset(fdt, "/guestmemfs"); + pr_info("guestmemfs offset: %i\n", off); + + if (off < 0) { + pr_info("No guestmemfs data available in KHO\n"); + return NULL; + } + err = fdt_node_check_compatible(fdt, off, "guestmemfs-v1"); + if (err) { + pr_err("Existing KHO superblock format is not compatible with this kernel\n"); + return NULL; + } + + old_sb = deserialise_superblock(fdt, off); + if (!old_sb) { + pr_warn("Failed to restore superblock\n"); + return NULL; + } + + err = deserialise_mappings_blocks(fdt, off, old_sb); + if (err) { + pr_warn("Failed to restore mappings blocks\n"); + return NULL; + } + + err = deserialise_fs_mem(fdt, off); + if (err) { + 
pr_warn("Failed to restore filesystem memory extents\n"); + return NULL; + } + + return old_sb; +}
From patchwork Mon Aug 5 09:32:43 2024 X-Patchwork-Submitter: "Gowans, James" X-Patchwork-Id: 13753334 From: James Gowans To: CC: James Gowans , Sean Christopherson , Paolo Bonzini , Alexander Viro , Steve Sistare , Christian Brauner , Jan Kara , "Anthony Yznaga" , Mike Rapoport , "Andrew Morton" , , Jason Gunthorpe , , Usama Arif , , Alexander Graf , David Woodhouse , Paul Durrant , Nicolas Saenz Julienne Subject: [PATCH 08/10] guestmemfs: Block modifications when serialised Date: Mon, 5 Aug 2024 11:32:43 +0200 Message-ID: <20240805093245.889357-9-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240805093245.889357-1-jgowans@amazon.com> References: <20240805093245.889357-1-jgowans@amazon.com> MIME-Version: 1.0
Once the memory regions for inodes, mappings and allocations have been serialised, further modifications would break the serialised data; it would no longer be valid.
Return an error code if attempting to create new files or allocate data for files once serialised. Signed-off-by: James Gowans --- fs/guestmemfs/file.c | 19 ++++++++++++++++--- fs/guestmemfs/guestmemfs.c | 1 + fs/guestmemfs/guestmemfs.h | 1 + fs/guestmemfs/inode.c | 6 ++++++ fs/guestmemfs/serialise.c | 8 +++++++- 5 files changed, 31 insertions(+), 4 deletions(-) diff --git a/fs/guestmemfs/file.c b/fs/guestmemfs/file.c index b1a52abcde65..8707a9d3ad90 100644 --- a/fs/guestmemfs/file.c +++ b/fs/guestmemfs/file.c @@ -8,19 +8,32 @@ static int truncate(struct inode *inode, loff_t newsize) unsigned long free_block; struct guestmemfs_inode *guestmemfs_inode; unsigned long *mappings; + int rc = 0; + struct guestmemfs_sb *psb = GUESTMEMFS_PSB(inode->i_sb); + + spin_lock(&psb->allocation_lock); + + if (psb->serialised) { + rc = -EBUSY; + goto out; + } guestmemfs_inode = guestmemfs_get_persisted_inode(inode->i_sb, inode->i_ino); mappings = guestmemfs_inode->mappings; i_size_write(inode, newsize); for (int block_idx = 0; block_idx * PMD_SIZE < newsize; ++block_idx) { free_block = guestmemfs_alloc_block(inode->i_sb); - if (free_block < 0) + if (free_block < 0) { /* TODO: roll back allocations. 
*/ - return -ENOMEM; + rc = -ENOMEM; + goto out; + } *(mappings + block_idx) = free_block; ++guestmemfs_inode->num_mappings; } - return 0; +out: + spin_unlock(&psb->allocation_lock); + return rc; } static int inode_setattr(struct mnt_idmap *idmap, struct dentry *dentry, struct iattr *iattr) diff --git a/fs/guestmemfs/guestmemfs.c b/fs/guestmemfs/guestmemfs.c index cf47e5100504..d854033bfb7e 100644 --- a/fs/guestmemfs/guestmemfs.c +++ b/fs/guestmemfs/guestmemfs.c @@ -42,6 +42,7 @@ static int guestmemfs_fill_super(struct super_block *sb, struct fs_context *fc) if (GUESTMEMFS_PSB(sb)) { pr_info("Restored super block from KHO\n"); + GUESTMEMFS_PSB(sb)->serialised = 0; } else { struct guestmemfs_sb *psb; diff --git a/fs/guestmemfs/guestmemfs.h b/fs/guestmemfs/guestmemfs.h index 263d995b75ed..91cc06ae45a5 100644 --- a/fs/guestmemfs/guestmemfs.h +++ b/fs/guestmemfs/guestmemfs.h @@ -21,6 +21,7 @@ struct guestmemfs_sb { struct guestmemfs_inode *inodes; void *allocator_bitmap; spinlock_t allocation_lock; + bool serialised; }; // If neither of these are set the inode is not in use. diff --git a/fs/guestmemfs/inode.c b/fs/guestmemfs/inode.c index 61f70441d82c..d521b35d4992 100644 --- a/fs/guestmemfs/inode.c +++ b/fs/guestmemfs/inode.c @@ -48,6 +48,12 @@ static unsigned long guestmemfs_allocate_inode(struct super_block *sb) struct guestmemfs_sb *psb = GUESTMEMFS_PSB(sb); spin_lock(&psb->allocation_lock); + + if (psb->serialised) { + spin_unlock(&psb->allocation_lock); + return -EBUSY; + } + next_free_ino = psb->next_free_ino; psb->allocated_inodes += 1; if (!next_free_ino) diff --git a/fs/guestmemfs/serialise.c b/fs/guestmemfs/serialise.c index eb70d496a3eb..347eb8049a71 100644 --- a/fs/guestmemfs/serialise.c +++ b/fs/guestmemfs/serialise.c @@ -111,7 +111,7 @@ int guestmemfs_serialise_to_kho(struct notifier_block *self, switch (cmd) { case KEXEC_KHO_ABORT: - /* No rollback action needed. 
*/ + GUESTMEMFS_PSB(guestmemfs_sb)->serialised = 0; return NOTIFY_DONE; case KEXEC_KHO_DUMP: /* Handled below */ @@ -120,6 +120,7 @@ int guestmemfs_serialise_to_kho(struct notifier_block *self, return NOTIFY_BAD; } + spin_lock(&GUESTMEMFS_PSB(guestmemfs_sb)->allocation_lock); err |= fdt_begin_node(fdt, "guestmemfs"); err |= fdt_property(fdt, "compatible", compatible, sizeof(compatible)); @@ -134,6 +135,11 @@ int guestmemfs_serialise_to_kho(struct notifier_block *self, err |= fdt_end_node(fdt); + if (!err) + GUESTMEMFS_PSB(guestmemfs_sb)->serialised = 1; + + spin_unlock(&GUESTMEMFS_PSB(guestmemfs_sb)->allocation_lock); + pr_info("Serialised extends [0x%llx + 0x%llx] via KHO: %i\n", guestmemfs_base, guestmemfs_size, err);
From patchwork Mon Aug 5 09:32:44 2024 X-Patchwork-Submitter: "Gowans, James" X-Patchwork-Id: 13753335 From: James Gowans To: CC: James Gowans , Sean Christopherson , Paolo Bonzini , Alexander Viro , Steve Sistare , Christian Brauner , Jan Kara , "Anthony Yznaga" , Mike Rapoport , "Andrew Morton" , , Jason Gunthorpe , , Usama Arif , , Alexander Graf , David Woodhouse , Paul Durrant , Nicolas Saenz Julienne Subject: [PATCH 09/10] guestmemfs: Add documentation and usage instructions Date: Mon, 5 Aug 2024 11:32:44 +0200 Message-ID: <20240805093245.889357-10-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240805093245.889357-1-jgowans@amazon.com> References: <20240805093245.889357-1-jgowans@amazon.com> MIME-Version: 1.0
Describe the motivation for guestmemfs, the functionality it provides, how to compile it in, how to use it as a source of guest memory, how to persist it across kexec and save/restore a VM. Signed-off-by: James Gowans --- Documentation/filesystems/guestmemfs.rst | 87 ++++++++++++++++++++++++ 1 file changed, 87 insertions(+) create mode 100644 Documentation/filesystems/guestmemfs.rst diff --git a/Documentation/filesystems/guestmemfs.rst b/Documentation/filesystems/guestmemfs.rst new file mode 100644 index 000000000000..d6ce0d194cc8 --- /dev/null +++ b/Documentation/filesystems/guestmemfs.rst @@ -0,0 +1,87 @@ +.. SPDX-License-Identifier: GPL-2.0 + +====================================================== +Guestmemfs - Persistent in-memory guest RAM filesystem +====================================================== + +Overview +======== + +Guestmemfs is an in-memory filesystem designed specifically for live update of +virtual machines: it provides a source of guest VM memory which persists +across kexec. + +Live update of a hypervisor refers to the act of pausing running VMs, serialising +state, kexec-ing into a new hypervisor image, re-hydrating the KVM guests and +resuming them. To achieve this, guest memory must be preserved across kexec. + +Additionally, guestmemfs provides: +- secret hiding for guest memory: the physical memory allocated for guestmemfs + is carved out of the direct map early in boot. +- struct page overhead elimination: guestmemfs memory is not allocated by the + buddy allocator and does not have associated struct pages. +- huge page mappings: allocations are done at PMD size and this improves TLB + performance (work in progress). 
+ +Compilation +=========== + +Guestmemfs is enabled via CONFIG_GUESTMEMFS_FS. + +Persistence across kexec is enabled via CONFIG_KEXEC_KHO. + +Usage +===== + +On first boot (cold boot), allocate a large contiguous chunk of memory for +guestmemfs via a kernel cmdline argument, e.g.: +`guestmemfs=10G`. + +Mount guestmemfs: +mount -t guestmemfs guestmemfs /mnt/guestmemfs/ + +Create and truncate a file which will be used for guest RAM: + +touch /mnt/guestmemfs/guest-ram +truncate -s 500M /mnt/guestmemfs/guest-ram + +Boot a VM with this as the RAM source and the live update option enabled: + +qemu-system-x86_64 ... \ + -object memory-backend-file,id=pc.ram,size=100M,mem-path=/mnt/guestmemfs/guest-ram,share=yes,prealloc=off \ + -migrate-mode-enable cpr-reboot \ + ... + +Suspend the guest and save the state via QEMU monitor: + +migrate_set_parameter mode cpr-reboot +migrate file:/qemu.sav + +Activate KHO to serialise guestmemfs metadata and then kexec to the new +hypervisor image: + +echo 1 > /sys/kernel/kho/active +kexec -s -l --reuse-cmdline +kexec -e + +After the kexec completes, remount guestmemfs (or have it added to fstab). +Re-start QEMU in live update restore mode: + +qemu-system-x86_64 ... \ + -object memory-backend-file,id=pc.ram,size=100M,mem-path=/mnt/guestmemfs/guest-ram,share=yes,prealloc=off \ + -migrate-mode-enable cpr-reboot \ + -incoming defer \ + ... 
+ +Finally restore the VM state and resume it via QEMU console: + +migrate_incoming file:/qemu.sav + +Future Work +=========== +- NUMA awareness and multi-mount point support +- Actually creating PMD-level mappings in page tables +- guest_memfd style interface for confidential computing +- supporting PUD-level allocations and mappings +- MCE handling +- Persisting IOMMU pgtables to allow DMA to guestmemfs during kexec
From patchwork Mon Aug 5 09:32:45 2024 X-Patchwork-Submitter: "Gowans, James" X-Patchwork-Id: 13753336 From: James Gowans To: CC: James Gowans , Sean Christopherson , Paolo Bonzini , Alexander Viro , Steve Sistare , Christian Brauner , Jan Kara , "Anthony Yznaga" , Mike Rapoport , "Andrew Morton" , , Jason Gunthorpe , , Usama Arif , , Alexander Graf , David Woodhouse , Paul Durrant , Nicolas Saenz Julienne Subject: [PATCH 10/10] MAINTAINERS: Add maintainers for guestmemfs Date: Mon, 5 Aug 2024 11:32:45 +0200 Message-ID: <20240805093245.889357-11-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240805093245.889357-1-jgowans@amazon.com> References: <20240805093245.889357-1-jgowans@amazon.com> MIME-Version: 1.0
Signed-off-by: James Gowans --- MAINTAINERS | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git
a/MAINTAINERS b/MAINTAINERS index 1028eceb59ca..e9c841bb18ba 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9412,6 +9412,14 @@ S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/pablo/gtp.git F: drivers/net/gtp.c +GUESTMEMFS +M: James Gowans +M: Alex Graf +L: linux-fsdevel@vger.kernel.org +S: Maintained +F: Documentation/filesystems/guestmemfs.rst +F: fs/guestmemfs/ + GUID PARTITION TABLE (GPT) M: Davidlohr Bueso L: linux-efi@vger.kernel.org
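
Note on patch 08 of this series: the "block modifications when serialised" rule can be tried out in userspace with a small sketch. This is not code from the series. The names toy_sb, toy_sb_init, toy_alloc_block, toy_serialise and toy_abort are hypothetical, and a pthread mutex stands in for the kernel spinlock. It only illustrates the idea that one lock covers both the allocator state and the serialised flag, so an allocation cannot slip in between serialisation and kexec.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct guestmemfs_sb; not the kernel structure. */
struct toy_sb {
	pthread_mutex_t allocation_lock;	/* plays the role of the spinlock */
	bool serialised;			/* set once metadata is handed over */
	unsigned long next_free_block;
	unsigned long nr_blocks;
};

static void toy_sb_init(struct toy_sb *sb, unsigned long nr_blocks)
{
	pthread_mutex_init(&sb->allocation_lock, NULL);
	sb->serialised = false;
	sb->next_free_block = 0;
	sb->nr_blocks = nr_blocks;
}

/* Returns a block index, -EBUSY once serialised, or -ENOMEM when full. */
static long toy_alloc_block(struct toy_sb *sb)
{
	long rc;

	pthread_mutex_lock(&sb->allocation_lock);
	if (sb->serialised)
		rc = -EBUSY;
	else if (sb->next_free_block >= sb->nr_blocks)
		rc = -ENOMEM;
	else
		rc = (long)sb->next_free_block++;
	pthread_mutex_unlock(&sb->allocation_lock);

	return rc;
}

/* Analogue of the KEXEC_KHO_DUMP path: freeze under the allocation lock. */
static void toy_serialise(struct toy_sb *sb)
{
	pthread_mutex_lock(&sb->allocation_lock);
	sb->serialised = true;
	pthread_mutex_unlock(&sb->allocation_lock);
}

/* Analogue of the KEXEC_KHO_ABORT path: allow modifications again. */
static void toy_abort(struct toy_sb *sb)
{
	pthread_mutex_lock(&sb->allocation_lock);
	sb->serialised = false;
	pthread_mutex_unlock(&sb->allocation_lock);
}
```

Two choices in the sketch differ from the patch and are assumptions rather than recommendations: the allocator returns a signed long so that negative errno values can be tested directly, and the abort path takes the lock before clearing the flag. Whether the kernel code wants either is a matter for review.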