From patchwork Fri Jan 17 16:29:53 2025
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13943554
Date: Fri, 17 Jan 2025 16:29:53 +0000
In-Reply-To: <20250117163001.2326672-1-tabba@google.com>
Mime-Version: 1.0
References: <20250117163001.2326672-1-tabba@google.com>
X-Mailer: git-send-email 2.48.0.rc2.279.g1de40edade-goog
Message-ID: <20250117163001.2326672-8-tabba@google.com>
Subject: [RFC PATCH v5 07/15] KVM: guest_memfd: Allow host to mmap guest_memfd() pages when shared
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au, anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk, brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org, xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com, jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com, yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com, mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com, steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com, roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com, rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com, hughd@google.com, jthoughton@google.com, tabba@google.com
Add support for mmap() and fault() for guest_memfd in the host. The
ability to fault in a guest page is contingent on that page being
shared with the host.

The guest_memfd PRIVATE memory attribute is not used for two reasons.
First, it reflects the userspace expectation for that memory location,
and can therefore be toggled by userspace. Second, although each
guest_memfd file has a 1:1 binding with a KVM instance, the plan is to
allow multiple files per inode, e.g. to allow intra-host migration to
a new KVM instance without destroying the guest_memfd.

The mapping is restricted to memory explicitly shared with the host.
KVM checks that the host doesn't have any mappings of private memory
by inspecting the folio's refcount. To avoid races between paths that
check mappability and paths that check whether the host has any
mappings (via the refcount), the folio lock is held while either check
is performed.

This new feature is gated behind a new configuration option,
CONFIG_KVM_GMEM_MAPPABLE.

Co-developed-by: Ackerley Tng
Signed-off-by: Ackerley Tng
Co-developed-by: Elliot Berman
Signed-off-by: Elliot Berman
Signed-off-by: Fuad Tabba
---
The functions kvm_gmem_is_mapped(), kvm_gmem_set_mappable(), and
kvm_gmem_clear_mappable() are not used in this patch series. They are
intended to be used in future patches [*], which check and toggle
mappability when the guest shares/unshares pages with the host.
[*] https://android-kvm.googlesource.com/linux/+/refs/heads/tabba/guestmem-6.13-v5-pkvm
---
 virt/kvm/Kconfig       |  4 ++
 virt/kvm/guest_memfd.c | 87 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 91 insertions(+)

diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 54e959e7d68f..59400fd8f539 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -124,3 +124,7 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
 config HAVE_KVM_ARCH_GMEM_INVALIDATE
 	bool
 	depends on KVM_PRIVATE_MEM
+
+config KVM_GMEM_MAPPABLE
+	select KVM_PRIVATE_MEM
+	bool
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 722afd9f8742..159ffa17f562 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -671,9 +671,88 @@ bool kvm_slot_gmem_is_guest_mappable(struct kvm_memory_slot *slot, gfn_t gfn)
 	return gmem_is_guest_mappable(inode, pgoff);
 }
+
+static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
+{
+	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct folio *folio;
+	vm_fault_t ret = VM_FAULT_LOCKED;
+
+	filemap_invalidate_lock_shared(inode->i_mapping);
+
+	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
+	if (IS_ERR(folio)) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_filemap;
+	}
+
+	if (folio_test_hwpoison(folio)) {
+		ret = VM_FAULT_HWPOISON;
+		goto out_folio;
+	}
+
+	if (!gmem_is_mappable(inode, vmf->pgoff)) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	if (WARN_ON_ONCE(folio_test_guestmem(folio))) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	if (!folio_test_uptodate(folio)) {
+		unsigned long nr_pages = folio_nr_pages(folio);
+		unsigned long i;
+
+		for (i = 0; i < nr_pages; i++)
+			clear_highpage(folio_page(folio, i));
+
+		folio_mark_uptodate(folio);
+	}
+
+	vmf->page = folio_file_page(folio, vmf->pgoff);
+
+out_folio:
+	if (ret != VM_FAULT_LOCKED) {
+		folio_unlock(folio);
+		folio_put(folio);
+	}
+
+out_filemap:
+	filemap_invalidate_unlock_shared(inode->i_mapping);
+
+	return ret;
+}
+
+static const struct vm_operations_struct kvm_gmem_vm_ops = {
+	.fault = kvm_gmem_fault,
+};
+
+static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) !=
+	    (VM_SHARED | VM_MAYSHARE)) {
+		return -EINVAL;
+	}
+
+	file_accessed(file);
+	vm_flags_set(vma, VM_DONTDUMP);
+	vma->vm_ops = &kvm_gmem_vm_ops;
+
+	return 0;
+}
+#else
+static int gmem_set_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+#define kvm_gmem_mmap NULL
 #endif /* CONFIG_KVM_GMEM_MAPPABLE */

 static struct file_operations kvm_gmem_fops = {
+	.mmap = kvm_gmem_mmap,
 	.open = generic_file_open,
 	.release = kvm_gmem_release,
 	.fallocate = kvm_gmem_fallocate,
@@ -860,6 +939,14 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_gmem;
 	}

+	if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE)) {
+		err = gmem_set_mappable(file_inode(file), 0, size >> PAGE_SHIFT);
+		if (err) {
+			fput(file);
+			goto err_gmem;
+		}
+	}
+
 	kvm_get_kvm(kvm);
 	gmem->kvm = kvm;
 	xa_init(&gmem->bindings);
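
For context (not part of the patch itself): below is a minimal userspace
sketch of how a VMM might exercise this once CONFIG_KVM_GMEM_MAPPABLE is
enabled. It relies on the existing KVM_CREATE_GUEST_MEMFD ioctl from
upstream guest_memfd support; the map_guest_memfd() helper name and the
error handling are illustrative only, and faulting a page that is not
shared with the host would raise SIGBUS per kvm_gmem_fault() above.

/*
 * Illustrative sketch only: create a guest_memfd for an existing VM fd
 * and map it into the VMM. Assumes CONFIG_KVM_GMEM_MAPPABLE and that the
 * pages touched through the mapping are shared with the host.
 */
#include <linux/kvm.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

static void *map_guest_memfd(int vm_fd, uint64_t size)
{
	struct kvm_create_guest_memfd gmem = {
		.size = size,
		.flags = 0,
	};
	int gmem_fd;
	void *mem;

	/* Returns a new guest_memfd file descriptor on success. */
	gmem_fd = ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &gmem);
	if (gmem_fd < 0) {
		perror("KVM_CREATE_GUEST_MEMFD");
		return NULL;
	}

	/* kvm_gmem_mmap() rejects anything other than a shared mapping. */
	mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, gmem_fd, 0);
	if (mem == MAP_FAILED) {
		perror("mmap");
		return NULL;
	}

	return mem;
}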