From patchwork Tue Sep 10 16:30:28 2024
X-Patchwork-Submitter: Patrick Roy
X-Patchwork-Id: 13798882
From: Patrick Roy <roypat@amazon.co.uk>
Subject: [RFC PATCH v2 02/10] kvm: gmem: Add KVM_GMEM_GET_PFN_SHARED
Date: Tue, 10 Sep 2024 17:30:28 +0100
Message-ID: <20240910163038.1298452-3-roypat@amazon.co.uk>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20240910163038.1298452-1-roypat@amazon.co.uk>
References: <20240910163038.1298452-1-roypat@amazon.co.uk>
MIME-Version: 1.0
If `KVM_GMEM_NO_DIRECT_MAP` is set, all gmem folios are removed from
the direct map immediately after allocation. Add a flag to
kvm_gmem_get_folio to override this behavior, and expose it via
`kvm_gmem_get_pfn`. Only allow this flag to be set if KVM can actually
access gmem (currently only if the vm type is KVM_X86_SW_PROTECTED_VM).
KVM_GMEM_GET_PFN_SHARED defers the direct map removal for newly
allocated folios until kvm_gmem_put_shared_pfn is called. For existing
folios, the direct map entry is temporarily restored until
kvm_gmem_put_shared_pfn is called. The folio lock must be held the
entire time the folio is present in the direct map, to prevent races
with concurrent calls to kvm_gmem_folio_set_private that might remove
direct map entries while the folios are being accessed by KVM. As this
is currently not possible (kvm_gmem_get_pfn always unlocks the folio),
the next patch will introduce a KVM_GMEM_GET_PFN_LOCKED flag.

Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
---
 arch/x86/kvm/mmu/mmu.c   |  2 +-
 include/linux/kvm_host.h | 12 +++++++++--
 virt/kvm/guest_memfd.c   | 46 +++++++++++++++++++++++++++++++---------
 3 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 901be9e420a4c..cb2f111f2cce0 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4349,7 +4349,7 @@ static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu,
 	}
 
 	r = kvm_gmem_get_pfn(vcpu->kvm, fault->slot, fault->gfn, &fault->pfn,
-			     &max_order);
+			     &max_order, 0);
 	if (r) {
 		kvm_mmu_prepare_memory_fault_exit(vcpu, fault);
 		return r;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 689e8be873a75..8a2975674de4b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2432,17 +2432,25 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
 }
 #endif /* CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES */
 
+#define KVM_GMEM_GET_PFN_SHARED		BIT(0)
+#define KVM_GMEM_GET_PFN_PREPARE	BIT(31) /* internal */
+
 #ifdef CONFIG_KVM_PRIVATE_MEM
 int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
-		     gfn_t gfn, kvm_pfn_t *pfn, int *max_order);
+		     gfn_t gfn, kvm_pfn_t *pfn, int *max_order, unsigned long flags);
+int kvm_gmem_put_shared_pfn(kvm_pfn_t pfn);
 #else
 static inline int kvm_gmem_get_pfn(struct kvm *kvm,
 				   struct kvm_memory_slot *slot, gfn_t gfn,
-				   kvm_pfn_t *pfn, int *max_order)
+				   kvm_pfn_t *pfn, int *max_order, int flags)
 {
 	KVM_BUG_ON(1, kvm);
 	return -EIO;
 }
+static inline int kvm_gmem_put_shared_pfn(kvm_pfn_t pfn)
+{
+	return -EIO;
+}
 #endif /* CONFIG_KVM_PRIVATE_MEM */
 
 #ifdef CONFIG_HAVE_KVM_GMEM_PREPARE
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 2ed27992206f3..492b04f4e5c18 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -55,6 +55,11 @@ static bool kvm_gmem_test_no_direct_map(struct inode *inode)
 	return ((unsigned long)inode->i_private & KVM_GMEM_NO_DIRECT_MAP) == KVM_GMEM_NO_DIRECT_MAP;
 }
 
+static bool kvm_gmem_test_accessible(struct kvm *kvm)
+{
+	return kvm->arch.vm_type == KVM_X86_SW_PROTECTED_VM;
+}
+
 static int kvm_gmem_folio_set_private(struct folio *folio)
 {
 	unsigned long start, npages, i;
@@ -110,10 +115,11 @@ static int kvm_gmem_folio_clear_private(struct folio *folio)
 	return r;
 }
 
-static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index, bool prepare)
+static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index, unsigned long flags)
 {
 	int r;
 	struct folio *folio;
+	bool share = flags & KVM_GMEM_GET_PFN_SHARED;
 
 	/* TODO: Support huge pages. */
 	folio = filemap_grab_folio(inode->i_mapping, index);
@@ -139,7 +145,7 @@ static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index, bool
 		folio_mark_uptodate(folio);
 	}
 
-	if (prepare) {
+	if (flags & KVM_GMEM_GET_PFN_PREPARE) {
 		r = kvm_gmem_prepare_folio(inode, index, folio);
 		if (r < 0)
 			goto out_err;
@@ -148,12 +154,15 @@ static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index, bool
 	if (!kvm_gmem_test_no_direct_map(inode))
 		goto out;
 
-	if (!folio_test_private(folio)) {
+	if (folio_test_private(folio) && share) {
+		r = kvm_gmem_folio_clear_private(folio);
+	} else if (!folio_test_private(folio) && !share) {
 		r = kvm_gmem_folio_set_private(folio);
-		if (r)
-			goto out_err;
 	}
 
+	if (r)
+		goto out_err;
+
 out:
 	/*
 	 * Ignore accessed, referenced, and dirty flags. The memory is
@@ -264,7 +273,7 @@ static long kvm_gmem_allocate(struct inode *inode, loff_t offset, loff_t len)
 			break;
 		}
 
-		folio = kvm_gmem_get_folio(inode, index, true);
+		folio = kvm_gmem_get_folio(inode, index, KVM_GMEM_GET_PFN_PREPARE);
 		if (IS_ERR(folio)) {
 			r = PTR_ERR(folio);
 			break;
@@ -624,7 +633,7 @@ void kvm_gmem_unbind(struct kvm_memory_slot *slot)
 }
 
 static int __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
-			      gfn_t gfn, kvm_pfn_t *pfn, int *max_order, bool prepare)
+			      gfn_t gfn, kvm_pfn_t *pfn, int *max_order, unsigned long flags)
 {
 	pgoff_t index = gfn - slot->base_gfn + slot->gmem.pgoff;
 	struct kvm_gmem *gmem = file->private_data;
@@ -643,7 +652,7 @@ static int __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
 		return -EIO;
 	}
 
-	folio = kvm_gmem_get_folio(file_inode(file), index, prepare);
+	folio = kvm_gmem_get_folio(file_inode(file), index, flags);
 	if (IS_ERR(folio))
 		return PTR_ERR(folio);
 
@@ -667,20 +676,37 @@ static int __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
 }
 
 int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
-		     gfn_t gfn, kvm_pfn_t *pfn, int *max_order)
+		     gfn_t gfn, kvm_pfn_t *pfn, int *max_order, unsigned long flags)
 {
 	struct file *file = kvm_gmem_get_file(slot);
 	int r;
+	int valid_flags = KVM_GMEM_GET_PFN_SHARED;
+
+	if ((flags & valid_flags) != flags)
+		return -EINVAL;
+
+	if ((flags & KVM_GMEM_GET_PFN_SHARED) && !kvm_gmem_test_accessible(kvm))
+		return -EPERM;
 
 	if (!file)
 		return -EFAULT;
 
-	r = __kvm_gmem_get_pfn(file, slot, gfn, pfn, max_order, true);
+	r = __kvm_gmem_get_pfn(file, slot, gfn, pfn, max_order, flags | KVM_GMEM_GET_PFN_PREPARE);
 	fput(file);
 	return r;
 }
 EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn);
 
+int kvm_gmem_put_shared_pfn(kvm_pfn_t pfn)
+{
+	struct folio *folio = pfn_folio(pfn);
+
+	if (!kvm_gmem_test_no_direct_map(folio_inode(folio)))
+		return 0;
+
+	return kvm_gmem_folio_set_private(folio);
+}
+EXPORT_SYMBOL_GPL(kvm_gmem_put_shared_pfn);
+
 long kvm_gmem_populate(struct kvm *kvm, gfn_t start_gfn, void __user *src, long npages,
 		       kvm_gmem_populate_cb post_populate, void *opaque)
 {