From patchwork Fri Dec 13 16:48:09 2024
From: Fuad Tabba <tabba@google.com>
Date: Fri, 13 Dec 2024 16:48:09 +0000
Subject: [RFC PATCH v4 13/14] KVM: arm64: Handle guest_memfd()-backed guest page faults
Message-ID: <20241213164811.2006197-14-tabba@google.com>
In-Reply-To: <20241213164811.2006197-1-tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Add arm64 support for resolving guest page faults on guest_memfd()-backed
memslots.
This support is not contingent on pKVM or other confidential computing
support, and works in both VHE and nVHE modes. Without confidential
computing, this support is useful for testing and debugging. In the
future, it might also be useful should a user want to use guest_memfd()
for all code, whether it's for a protected guest or not.

For now, the fault granule is restricted to PAGE_SIZE.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c | 111 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 109 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 342a9bd3848f..1c4b3871967c 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1434,6 +1434,107 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_MTE_ALLOWED;
 }
 
+static int guest_memfd_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
+			     struct kvm_memory_slot *memslot, bool fault_is_perm)
+{
+	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
+	bool exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
+	bool logging_active = memslot_is_logging(memslot);
+	struct kvm_pgtable *pgt = vcpu->arch.hw_mmu->pgt;
+	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
+	bool write_fault = kvm_is_write_fault(vcpu);
+	struct mm_struct *mm = current->mm;
+	gfn_t gfn = gpa_to_gfn(fault_ipa);
+	struct kvm *kvm = vcpu->kvm;
+	struct page *page;
+	kvm_pfn_t pfn;
+	int ret;
+
+	/* For now, guest_memfd() only supports PAGE_SIZE granules. */
+	if (WARN_ON_ONCE(fault_is_perm &&
+			 kvm_vcpu_trap_get_perm_fault_granule(vcpu) != PAGE_SIZE)) {
+		return -EFAULT;
+	}
+
+	VM_BUG_ON(write_fault && exec_fault);
+
+	if (fault_is_perm && !write_fault && !exec_fault) {
+		kvm_err("Unexpected L2 read permission error\n");
+		return -EFAULT;
+	}
+
+	/*
+	 * Permission faults just need to update the existing leaf entry,
+	 * and so normally don't require allocations from the memcache. The
+	 * only exception to this is when dirty logging is enabled at runtime
+	 * and a write fault needs to collapse a block entry into a table.
+	 */
+	if (!fault_is_perm || (logging_active && write_fault)) {
+		ret = kvm_mmu_topup_memory_cache(memcache,
+						 kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu));
+		if (ret)
+			return ret;
+	}
+
+	/*
+	 * Holds the folio lock until mapped in the guest and its refcount is
+	 * stable, to avoid races with paths that check if the folio is mapped
+	 * by the host.
+	 */
+	ret = kvm_gmem_get_pfn_locked(kvm, memslot, gfn, &pfn, &page, NULL);
+	if (ret)
+		return ret;
+
+	if (!kvm_slot_gmem_is_guest_mappable(memslot, gfn)) {
+		ret = -EAGAIN;
+		goto unlock_page;
+	}
+
+	/*
+	 * Once it's faulted in, a guest_memfd() page will stay in memory.
+	 * Therefore, count it as locked.
+	 */
+	if (!fault_is_perm) {
+		ret = account_locked_vm(mm, 1, true);
+		if (ret)
+			goto unlock_page;
+	}
+
+	read_lock(&kvm->mmu_lock);
+	if (write_fault)
+		prot |= KVM_PGTABLE_PROT_W;
+
+	if (exec_fault)
+		prot |= KVM_PGTABLE_PROT_X;
+
+	if (cpus_have_final_cap(ARM64_HAS_CACHE_DIC))
+		prot |= KVM_PGTABLE_PROT_X;
+
+	/*
+	 * Under the premise of getting a FSC_PERM fault, we just need to relax
+	 * permissions.
+	 */
+	if (fault_is_perm)
+		ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot);
+	else
+		ret = kvm_pgtable_stage2_map(pgt, fault_ipa, PAGE_SIZE,
+					     __pfn_to_phys(pfn), prot,
+					     memcache,
+					     KVM_PGTABLE_WALK_HANDLE_FAULT |
+					     KVM_PGTABLE_WALK_SHARED);
+
+	kvm_release_faultin_page(kvm, page, !!ret, write_fault);
+	read_unlock(&kvm->mmu_lock);
+
+	if (ret && !fault_is_perm)
+		account_locked_vm(mm, 1, false);
+unlock_page:
+	unlock_page(page);
+	put_page(page);
+
+	return ret != -EAGAIN ? ret : 0;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_s2_trans *nested,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1900,8 +2001,14 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 		goto out_unlock;
 	}
 
-	ret = user_mem_abort(vcpu, fault_ipa, nested, memslot, hva,
-			     esr_fsc_is_permission_fault(esr));
+	if (kvm_slot_can_be_private(memslot)) {
+		ret = guest_memfd_abort(vcpu, fault_ipa, memslot,
+					esr_fsc_is_permission_fault(esr));
+	} else {
+		ret = user_mem_abort(vcpu, fault_ipa, nested, memslot, hva,
+				     esr_fsc_is_permission_fault(esr));
+	}
+
 	if (ret == 0)
 		ret = 1;
 out: