From patchwork Wed Jan 22 15:27:35 2025
X-Patchwork-Submitter: Fuad Tabba <tabba@google.com>
X-Patchwork-Id: 13947434
Date: Wed, 22 Jan 2025 15:27:35 +0000
In-Reply-To: <20250122152738.1173160-1-tabba@google.com>
Mime-Version: 1.0
References: <20250122152738.1173160-1-tabba@google.com>
X-Mailer: git-send-email 2.48.0.rc2.279.g1de40edade-goog
Message-ID: <20250122152738.1173160-7-tabba@google.com>
Subject: [RFC PATCH v1 6/9] KVM: arm64: Handle guest_memfd()-backed guest page faults
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
 xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com, mic@digikod.net,
 vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
 mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
 wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, catalin.marinas@arm.com,
 james.morse@arm.com, yuzenghui@huawei.com, oliver.upton@linux.dev,
 maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
 jthoughton@google.com, tabba@google.com

Add arm64 support for resolving guest page faults on guest_memfd()
backed memslots.
This support is not contingent on pKVM, or other confidential computing
support, and works in both VHE and nVHE modes.

Without confidential computing, this support is useful for testing and
debugging. In the future, it might also be useful should a user want to
use guest_memfd() for all code, whether it's for a protected guest or
not.

For now, the fault granule is restricted to PAGE_SIZE.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c     | 86 ++++++++++++++++++++++++++++------------
 include/linux/kvm_host.h |  5 +++
 virt/kvm/kvm_main.c      |  5 ---
 3 files changed, 66 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 9b1921c1a1a0..adf23618e2a0 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1434,6 +1434,39 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_MTE_ALLOWED;
 }
 
+static kvm_pfn_t faultin_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
+			     gfn_t gfn, bool write_fault, bool *writable,
+			     struct page **page, bool is_private)
+{
+	kvm_pfn_t pfn;
+	int ret;
+
+	if (!is_private)
+		return __kvm_faultin_pfn(slot, gfn, write_fault ? FOLL_WRITE : 0,
+					 writable, page);
+
+	*writable = false;
+
+	if (WARN_ON_ONCE(write_fault && memslot_is_readonly(slot)))
+		return KVM_PFN_ERR_NOSLOT_MASK;
+
+	ret = kvm_gmem_get_pfn(kvm, slot, gfn, &pfn, page, NULL);
+	if (!ret) {
+		*writable = write_fault;
+		return pfn;
+	}
+
+	if (ret == -EHWPOISON)
+		return KVM_PFN_ERR_HWPOISON;
+
+	return KVM_PFN_ERR_NOSLOT_MASK;
+}
+
+static bool is_private_mem(struct kvm *kvm, struct kvm_memory_slot *memslot, phys_addr_t ipa)
+{
+	return kvm_arch_has_private_mem(kvm) && kvm_slot_can_be_private(memslot) &&
+	       (kvm_mem_is_private(kvm, ipa >> PAGE_SHIFT) || !memslot->userspace_addr);
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_s2_trans *nested,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1441,24 +1474,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 {
 	int ret = 0;
 	bool write_fault, writable;
-	bool exec_fault, mte_allowed;
+	bool exec_fault, mte_allowed = false;
 	bool device = false, vfio_allow_any_uc = false;
 	unsigned long mmu_seq;
 	phys_addr_t ipa = fault_ipa;
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
-	struct vm_area_struct *vma;
+	struct vm_area_struct *vma = NULL;
 	short vma_shift;
 	gfn_t gfn;
 	kvm_pfn_t pfn;
 	bool logging_active = memslot_is_logging(memslot);
-	bool force_pte = logging_active;
-	long vma_pagesize, fault_granule;
+	bool is_private = is_private_mem(kvm, memslot, fault_ipa);
+	bool force_pte = logging_active || is_private;
+	long vma_pagesize, fault_granule = PAGE_SIZE;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
 	struct page *page;
 
-	if (fault_is_perm)
+	if (fault_is_perm && !is_private)
 		fault_granule = kvm_vcpu_trap_get_perm_fault_granule(vcpu);
 	write_fault = kvm_is_write_fault(vcpu);
 	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
@@ -1482,24 +1516,30 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return ret;
 	}
 
+	mmap_read_lock(current->mm);
+
 	/*
 	 * Let's check if we will get back a huge page backed by hugetlbfs, or
 	 * get block mapping for device MMIO region.
 	 */
-	mmap_read_lock(current->mm);
-	vma = vma_lookup(current->mm, hva);
-	if (unlikely(!vma)) {
-		kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
-		mmap_read_unlock(current->mm);
-		return -EFAULT;
-	}
+	if (!is_private) {
+		vma = vma_lookup(current->mm, hva);
+		if (unlikely(!vma)) {
+			kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
+			mmap_read_unlock(current->mm);
+			return -EFAULT;
+		}
 
-	/*
-	 * logging_active is guaranteed to never be true for VM_PFNMAP
-	 * memslots.
-	 */
-	if (WARN_ON_ONCE(logging_active && (vma->vm_flags & VM_PFNMAP)))
-		return -EFAULT;
+		/*
+		 * logging_active is guaranteed to never be true for VM_PFNMAP
+		 * memslots.
+		 */
+		if (WARN_ON_ONCE(logging_active && (vma->vm_flags & VM_PFNMAP)))
+			return -EFAULT;
+
+		vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
+		mte_allowed = kvm_vma_mte_allowed(vma);
+	}
 
 	if (force_pte)
 		vma_shift = PAGE_SHIFT;
@@ -1570,17 +1610,14 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	}
 
 	gfn = ipa >> PAGE_SHIFT;
-	mte_allowed = kvm_vma_mte_allowed(vma);
-
-	vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
 
 	/* Don't use the VMA after the unlock -- it may have vanished */
 	vma = NULL;
 
 	/*
 	 * Read mmu_invalidate_seq so that KVM can detect if the results of
-	 * vma_lookup() or __kvm_faultin_pfn() become stale prior to
-	 * acquiring kvm->mmu_lock.
+	 * vma_lookup() or faultin_pfn() become stale prior to acquiring
+	 * kvm->mmu_lock.
 	 *
 	 * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
 	 * with the smp_wmb() in kvm_mmu_invalidate_end().
@@ -1588,8 +1625,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	mmu_seq = vcpu->kvm->mmu_invalidate_seq;
 	mmap_read_unlock(current->mm);
 
-	pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
-				&writable, &page);
+	pfn = faultin_pfn(kvm, memslot, gfn, write_fault, &writable, &page, is_private);
 	if (pfn == KVM_PFN_ERR_HWPOISON) {
 		kvm_send_hwpoison_signal(hva, vma_shift);
 		return 0;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ebca0ab4c5e2..f059958b98fd 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1864,6 +1864,11 @@ static inline int memslot_id(struct kvm *kvm, gfn_t gfn)
 	return gfn_to_memslot(kvm, gfn)->id;
 }
 
+static inline bool memslot_is_readonly(const struct kvm_memory_slot *slot)
+{
+	return slot->flags & KVM_MEM_READONLY;
+}
+
 static inline gfn_t hva_to_gfn_memslot(unsigned long hva,
 				       struct kvm_memory_slot *slot)
 {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9cd6690b7955..10c3168db473 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2622,11 +2622,6 @@ unsigned long kvm_host_page_size(struct kvm_vcpu *vcpu, gfn_t gfn)
 	return size;
 }
 
-static bool memslot_is_readonly(const struct kvm_memory_slot *slot)
-{
-	return slot->flags & KVM_MEM_READONLY;
-}
-
 static unsigned long __gfn_to_hva_many(const struct kvm_memory_slot *slot, gfn_t gfn,
 				       gfn_t *nr_pages, bool write)
 {