From patchwork Tue Jul 5 14:26:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Catalin Marinas X-Patchwork-Id: 12906636 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 430BBC433EF for ; Tue, 5 Jul 2022 14:27:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=eH43xhOHYuoDV/w6Zz8viVIt1VmA1HjlBJibguq3XhU=; b=VwhjMxAVddCX8J jcYj66wRT07CtOrx/3ssm9ej4Mv8gTdnu4J5t2OxxymKcgfaPhBNRUt+hvWQ+reqNdGwZdFWBx4CB mjXVA8TQuuhvr7Lc6cfNU1ODBXmKLUHXEfan9naOoMK7OuFvrY1Po7O7bPTPMvNlQnMFER2CtiK9S hCvQVjiwVQOAytB97rTU5vGKaSZc8zOhZ+Oj0l/iH7N72mWi3A1pcJkHIRmKZZTs3E2+4DfGhnUws qmHDv4BpFA2rcSz26k/Mg9lHiIBYn6wKAYN7xB0/nmKmCXMPiMQ+Z9rOihWOhhI0ap4BIbSzSIiK4 zUUVGG97NvRzq5kLvqyQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1o8jVe-000npX-KU; Tue, 05 Jul 2022 14:26:38 +0000 Received: from dfw.source.kernel.org ([139.178.84.217]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1o8jVS-000nln-Lh for linux-arm-kernel@lists.infradead.org; Tue, 05 Jul 2022 14:26:29 +0000 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B226861828; Tue, 5 Jul 2022 14:26:25 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B091CC341CF; Tue, 5 Jul 2022 14:26:23 +0000 (UTC) From: Catalin Marinas To: Will Deacon , Marc Zyngier , Steven Price , Peter Collingbourne Cc: Vincenzo Frascino , linux-arm-kernel@lists.infradead.org Subject: [PATCH 1/4] arm64: mte: Fix/clarify the PG_mte_tagged semantics Date: Tue, 5 Jul 2022 15:26:16 +0100 Message-Id: <20220705142619.4135905-2-catalin.marinas@arm.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220705142619.4135905-1-catalin.marinas@arm.com> References: <20220705142619.4135905-1-catalin.marinas@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220705_072626_833935_7B6EF3AE X-CRM114-Status: GOOD ( 31.96 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Currently the PG_mte_tagged page flag mostly means the page contains valid tags and it should be set after the tags have been cleared or restored. However, in mte_sync_tags() it is set before setting the tags to avoid, in theory, a race with concurrent mprotect(PROT_MTE) for shared pages. However, a concurrent mprotect(PROT_MTE) with a copy on write in another thread can cause the new page to have stale tags. Similarly, tag reading via ptrace() can read stale tags of the PG_mte_tagged flag is set before actually clearing/restoring the tags. Fix the PG_mte_tagged semantics so that it is only set after the tags have been cleared or restored. This is safe for swap restoring into a MAP_SHARED or CoW page since the core code takes the page lock. Add two functions to test and set the PG_mte_tagged flag with acquire and release semantics. The downside is that concurrent mprotect(PROT_MTE) on a MAP_SHARED page may cause tag loss. This is already the case for KVM guests if a VMM changes the page protection while the guest triggers a user_mem_abort(). Signed-off-by: Catalin Marinas Cc: Will Deacon Cc: Marc Zyngier Cc: Steven Price Cc: Peter Collingbourne --- arch/arm64/include/asm/mte.h | 30 ++++++++++++++++++++++++++++++ arch/arm64/include/asm/pgtable.h | 2 +- arch/arm64/kernel/cpufeature.c | 4 +++- arch/arm64/kernel/elfcore.c | 2 +- arch/arm64/kernel/hibernate.c | 2 +- arch/arm64/kernel/mte.c | 12 +++++++----- arch/arm64/kvm/guest.c | 4 ++-- arch/arm64/kvm/mmu.c | 4 ++-- arch/arm64/mm/copypage.c | 4 ++-- arch/arm64/mm/fault.c | 2 +- arch/arm64/mm/mteswap.c | 2 +- 11 files changed, 51 insertions(+), 17 deletions(-) diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h index aa523591a44e..c69218c56980 100644 --- a/arch/arm64/include/asm/mte.h +++ b/arch/arm64/include/asm/mte.h @@ -37,6 +37,29 @@ void mte_free_tag_storage(char *storage); /* track which pages have valid allocation tags */ #define PG_mte_tagged PG_arch_2 +static inline void set_page_mte_tagged(struct page *page) +{ + /* + * Ensure that the tags written prior to this function are visible + * before the page flags update. + */ + smp_wmb(); + set_bit(PG_mte_tagged, &page->flags); +} + +static inline bool page_mte_tagged(struct page *page) +{ + bool ret = test_bit(PG_mte_tagged, &page->flags); + + /* + * If the page is tagged, ensure ordering with a likely subsequent + * read of the tags. + */ + if (ret) + smp_rmb(); + return ret; +} + void mte_zero_clear_page_tags(void *addr); void mte_sync_tags(pte_t old_pte, pte_t pte); void mte_copy_page_tags(void *kto, const void *kfrom); @@ -54,6 +77,13 @@ size_t mte_probe_user_range(const char __user *uaddr, size_t size); /* unused if !CONFIG_ARM64_MTE, silence the compiler */ #define PG_mte_tagged 0 +static inline set_page_mte_tagged(struct page *page) +{ +} +static inline bool page_mte_tagged(struct page *page) +{ + return false; +} static inline void mte_zero_clear_page_tags(void *addr) { } diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 0b6632f18364..08823669db0a 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1034,7 +1034,7 @@ static inline void arch_swap_invalidate_area(int type) static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio) { if (system_supports_mte() && mte_restore_tags(entry, &folio->page)) - set_bit(PG_mte_tagged, &folio->flags); + set_page_mte_tagged(&folio->page); } #endif /* CONFIG_ARM64_MTE */ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 8d88433de81d..4478e5774580 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1964,8 +1964,10 @@ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap) * Clear the tags in the zero page. This needs to be done via the * linear map which has the Tagged attribute. */ - if (!test_and_set_bit(PG_mte_tagged, &ZERO_PAGE(0)->flags)) + if (!page_mte_tagged(ZERO_PAGE(0))) { mte_clear_page_tags(lm_alias(empty_zero_page)); + set_page_mte_tagged(ZERO_PAGE(0)); + } kasan_init_hw_tags_cpu(); } diff --git a/arch/arm64/kernel/elfcore.c b/arch/arm64/kernel/elfcore.c index 98d67444a5b6..f91bb1572d22 100644 --- a/arch/arm64/kernel/elfcore.c +++ b/arch/arm64/kernel/elfcore.c @@ -47,7 +47,7 @@ static int mte_dump_tag_range(struct coredump_params *cprm, * Pages mapped in user space as !pte_access_permitted() (e.g. * PROT_EXEC only) may not have the PG_mte_tagged flag set. */ - if (!test_bit(PG_mte_tagged, &page->flags)) { + if (!page_mte_tagged(page)) { put_page(page); dump_skip(cprm, MTE_PAGE_TAG_STORAGE); continue; diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c index 2e248342476e..018ae7a25737 100644 --- a/arch/arm64/kernel/hibernate.c +++ b/arch/arm64/kernel/hibernate.c @@ -271,7 +271,7 @@ static int swsusp_mte_save_tags(void) if (!page) continue; - if (!test_bit(PG_mte_tagged, &page->flags)) + if (!page_mte_tagged(page)) continue; ret = save_tags(page, pfn); diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index f6b00743c399..67e82ce4c285 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -41,8 +41,10 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte, if (check_swap && is_swap_pte(old_pte)) { swp_entry_t entry = pte_to_swp_entry(old_pte); - if (!non_swap_entry(entry) && mte_restore_tags(entry, page)) + if (!non_swap_entry(entry) && mte_restore_tags(entry, page)) { + set_page_mte_tagged(page); return; + } } if (!pte_is_tagged) @@ -58,6 +60,7 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte, */ smp_wmb(); mte_clear_page_tags(page_address(page)); + set_page_mte_tagged(page); } void mte_sync_tags(pte_t old_pte, pte_t pte) @@ -73,7 +76,7 @@ void mte_sync_tags(pte_t old_pte, pte_t pte) /* if PG_mte_tagged is set, tags have already been initialised */ for (i = 0; i < nr_pages; i++, page++) { - if (!test_and_set_bit(PG_mte_tagged, &page->flags)) + if (!page_mte_tagged(page)) mte_sync_page_tags(page, old_pte, check_swap, pte_is_tagged); } @@ -100,8 +103,7 @@ int memcmp_pages(struct page *page1, struct page *page2) * pages is tagged, set_pte_at() may zero or change the tags of the * other page via mte_sync_tags(). */ - if (test_bit(PG_mte_tagged, &page1->flags) || - test_bit(PG_mte_tagged, &page2->flags)) + if (page_mte_tagged(page1) || page_mte_tagged(page2)) return addr1 != addr2; return ret; @@ -407,7 +409,7 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr, put_page(page); break; } - WARN_ON_ONCE(!test_bit(PG_mte_tagged, &page->flags)); + WARN_ON_ONCE(!page_mte_tagged(page)); /* limit access to the end of the page */ offset = offset_in_page(addr); diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 8c607199cad1..3b04e69006b4 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -1058,7 +1058,7 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, maddr = page_address(page); if (!write) { - if (test_bit(PG_mte_tagged, &page->flags)) + if (page_mte_tagged(page)) num_tags = mte_copy_tags_to_user(tags, maddr, MTE_GRANULES_PER_PAGE); else @@ -1075,7 +1075,7 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, * completed fully */ if (num_tags == MTE_GRANULES_PER_PAGE) - set_bit(PG_mte_tagged, &page->flags); + set_page_mte_tagged(page); kvm_release_pfn_dirty(pfn); } diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index f5651a05b6a8..9cfa516452e1 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1075,9 +1075,9 @@ static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn, return -EFAULT; for (i = 0; i < nr_pages; i++, page++) { - if (!test_bit(PG_mte_tagged, &page->flags)) { + if (!page_mte_tagged(page)) { mte_clear_page_tags(page_address(page)); - set_bit(PG_mte_tagged, &page->flags); + set_page_mte_tagged(page); } } diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c index 0dea80bf6de4..f36d796f1bce 100644 --- a/arch/arm64/mm/copypage.c +++ b/arch/arm64/mm/copypage.c @@ -21,8 +21,7 @@ void copy_highpage(struct page *to, struct page *from) copy_page(kto, kfrom); - if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) { - set_bit(PG_mte_tagged, &to->flags); + if (system_supports_mte() && page_mte_tagged(from)) { page_kasan_tag_reset(to); /* * We need smp_wmb() in between setting the flags and clearing the @@ -33,6 +32,7 @@ void copy_highpage(struct page *to, struct page *from) */ smp_wmb(); mte_copy_page_tags(kto, kfrom); + set_page_mte_tagged(to); } } EXPORT_SYMBOL(copy_highpage); diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index c5e11768e5c1..147fe28d3fbe 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -928,5 +928,5 @@ void tag_clear_highpage(struct page *page) { mte_zero_clear_page_tags(page_address(page)); page_kasan_tag_reset(page); - set_bit(PG_mte_tagged, &page->flags); + set_page_mte_tagged(page); } diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c index a9e50e930484..9d3a8cf388fc 100644 --- a/arch/arm64/mm/mteswap.c +++ b/arch/arm64/mm/mteswap.c @@ -24,7 +24,7 @@ int mte_save_tags(struct page *page) { void *tag_storage, *ret; - if (!test_bit(PG_mte_tagged, &page->flags)) + if (!page_mte_tagged(page)) return 0; tag_storage = mte_allocate_tag_storage(); From patchwork Tue Jul 5 14:26:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Catalin Marinas X-Patchwork-Id: 12906638 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4CA86C433EF for ; Tue, 5 Jul 2022 14:27:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=J2fptepxB0gg/vIgAx30ikHMtzhdupp+EnDdcUgpe9U=; b=VBI4tFVdlEOC+L I/8TGGZIn/B1moapAXAJmc/S34h1/81bitcYah0FfWMoC9AaIV2g0WPV20gPmWoTrR+ygExz3XqpZ 0tv9Tzng0QH1Fxf2HFoa4G4unrozTg4cChR2aYmvJ/nrhpaYeKhdCFT6PZqCaktD+ezpSokDWUXXj lI4hblrbTXqPMEre+YyGHTvuSPf0Jma1GPLel+jljrb2fDVf1Xf0Xo/Ih5PSyQmQOqiNxQbWcIT/t 75ImZZXlCuPEY67PDYeluYFxnuHSAaRnp4ZXFf1Rcvu9c9BhhriAsaw3txXw77fuHXW/zoIzagGul GZN6lCEQXSE2X2Tgf5AA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1o8jVv-000ntc-Fv; Tue, 05 Jul 2022 14:26:55 +0000 Received: from ams.source.kernel.org ([145.40.68.75]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1o8jVX-000nnL-0k for linux-arm-kernel@lists.infradead.org; Tue, 05 Jul 2022 14:26:32 +0000 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 262DFB817EE; Tue, 5 Jul 2022 14:26:28 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 84F2DC341CE; Tue, 5 Jul 2022 14:26:25 +0000 (UTC) From: Catalin Marinas To: Will Deacon , Marc Zyngier , Steven Price , Peter Collingbourne Cc: Vincenzo Frascino , linux-arm-kernel@lists.infradead.org Subject: [PATCH 2/4] KVM: arm64: Simplify the sanitise_mte_tags() logic Date: Tue, 5 Jul 2022 15:26:17 +0100 Message-Id: <20220705142619.4135905-3-catalin.marinas@arm.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220705142619.4135905-1-catalin.marinas@arm.com> References: <20220705142619.4135905-1-catalin.marinas@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220705_072631_414068_D00F0AA3 X-CRM114-Status: GOOD ( 22.31 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Currently sanitise_mte_tags() checks if it's an online page before attempting to sanitise the tags. Such detection should be done in the caller via the VM_MTE_ALLOWED vma flag. Since kvm_set_spte_gfn() does not have the vma, leave the page unmapped if not already tagged. Tag initialisation will be done on a subsequent access fault in user_mem_abort(). Signed-off-by: Catalin Marinas Cc: Will Deacon Cc: Marc Zyngier Cc: Steven Price Cc: Peter Collingbourne --- arch/arm64/kvm/mmu.c | 38 ++++++++++++++------------------------ 1 file changed, 14 insertions(+), 24 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 9cfa516452e1..35850f17ae08 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1056,23 +1056,14 @@ static int get_vma_page_shift(struct vm_area_struct *vma, unsigned long hva) * - mmap_lock protects between a VM faulting a page in and the VMM performing * an mprotect() to add VM_MTE */ -static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn, - unsigned long size) +static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn, + unsigned long size) { unsigned long i, nr_pages = size >> PAGE_SHIFT; struct page *page; if (!kvm_has_mte(kvm)) - return 0; - - /* - * pfn_to_online_page() is used to reject ZONE_DEVICE pages - * that may not support tags. - */ - page = pfn_to_online_page(pfn); - - if (!page) - return -EFAULT; + return; for (i = 0; i < nr_pages; i++, page++) { if (!page_mte_tagged(page)) { @@ -1080,8 +1071,6 @@ static int sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn, set_page_mte_tagged(page); } } - - return 0; } static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, @@ -1092,7 +1081,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, bool write_fault, writable, force_pte = false; bool exec_fault; bool device = false; - bool shared; unsigned long mmu_seq; struct kvm *kvm = vcpu->kvm; struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache; @@ -1142,8 +1130,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, vma_shift = get_vma_page_shift(vma, hva); } - shared = (vma->vm_flags & VM_SHARED); - switch (vma_shift) { #ifndef __PAGETABLE_PMD_FOLDED case PUD_SHIFT: @@ -1264,12 +1250,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (fault_status != FSC_PERM && !device && kvm_has_mte(kvm)) { /* Check the VMM hasn't introduced a new VM_SHARED VMA */ - if (!shared) - ret = sanitise_mte_tags(kvm, pfn, vma_pagesize); - else + if ((vma->vm_flags & VM_MTE_ALLOWED) && + !(vma->vm_flags & VM_SHARED)) { + sanitise_mte_tags(kvm, pfn, vma_pagesize); + } else { ret = -EFAULT; - if (ret) goto out_unlock; + } } if (writable) @@ -1491,15 +1478,18 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { kvm_pfn_t pfn = pte_pfn(range->pte); - int ret; if (!kvm->arch.mmu.pgt) return false; WARN_ON(range->end - range->start != 1); - ret = sanitise_mte_tags(kvm, pfn, PAGE_SIZE); - if (ret) + /* + * If the page isn't tagged, defer to user_mem_abort() for sanitising + * the MTE tags. The S2 pte should have been unmapped by + * mmu_notifier_invalidate_range_end(). + */ + if (kvm_has_mte(kvm) && !page_mte_tagged(pfn_to_page(pfn))) return false; /* From patchwork Tue Jul 5 14:26:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Catalin Marinas X-Patchwork-Id: 12906637 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6EC34C43334 for ; Tue, 5 Jul 2022 14:27:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=eDvXzhcAFaboh55bWMO0LpfIZzZv5Mo33rbqkfZxO6k=; b=2C4EcLJeX2nwbz jjzdqLFw55rJ0zJQiv0xVqC3J0y76X3GO6j3gomgWGstvcym3XcIueFl8i/V0P8j/UobxW+f7dVFx p9C+YouH7XUgRqBKE3NGj13ebtcpP7kuu6kPOHFEXBF8H7wctEuc4LQ5NZuMzFZ3zlT3dei0PxbsD wvngECASk/EKQLVljrsAlzoRVVaMVYyxhMeaKl08J/0cP2iSidEHnNPF2dCibZVb/Jlk1lvD+14B6 g5OiVju+urNddWKbmMieEzA/u8x9BQSJu7JfbCuYz/b6wTYGnrplEX7AqbgdSNyFUpkCvt9+oaDFq bb1tDkWol2D6CRU9Ch/Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1o8jVm-000nrF-R9; Tue, 05 Jul 2022 14:26:46 +0000 Received: from ams.source.kernel.org ([145.40.68.75]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1o8jVX-000nnN-EL for linux-arm-kernel@lists.infradead.org; Tue, 05 Jul 2022 14:26:32 +0000 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 10D5BB817F2; Tue, 5 Jul 2022 14:26:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 58761C341CA; Tue, 5 Jul 2022 14:26:27 +0000 (UTC) From: Catalin Marinas To: Will Deacon , Marc Zyngier , Steven Price , Peter Collingbourne Cc: Vincenzo Frascino , linux-arm-kernel@lists.infradead.org Subject: [PATCH 3/4] mm: Add PG_arch_3 page flag Date: Tue, 5 Jul 2022 15:26:18 +0100 Message-Id: <20220705142619.4135905-4-catalin.marinas@arm.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220705142619.4135905-1-catalin.marinas@arm.com> References: <20220705142619.4135905-1-catalin.marinas@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220705_072631_675565_9F464347 X-CRM114-Status: GOOD ( 13.48 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org From: Peter Collingbourne As with PG_arch_2, this flag is only allowed on 64-bit architectures due to the shortage of bits available. It will be used by the arm64 MTE code in subsequent patches. Signed-off-by: Peter Collingbourne Cc: Will Deacon Cc: Marc Zyngier Cc: Steven Price [catalin.marinas@arm.com: added flag preserving in __split_huge_page_tail()] Signed-off-by: Catalin Marinas --- fs/proc/page.c | 1 + include/linux/page-flags.h | 1 + include/trace/events/mmflags.h | 7 ++++--- mm/huge_memory.c | 1 + 4 files changed, 7 insertions(+), 3 deletions(-) diff --git a/fs/proc/page.c b/fs/proc/page.c index a2873a617ae8..438b8aa7249d 100644 --- a/fs/proc/page.c +++ b/fs/proc/page.c @@ -220,6 +220,7 @@ u64 stable_page_flags(struct page *page) u |= kpf_copy_bit(k, KPF_ARCH, PG_arch_1); #ifdef CONFIG_64BIT u |= kpf_copy_bit(k, KPF_ARCH_2, PG_arch_2); + u |= kpf_copy_bit(k, KPF_ARCH_2, PG_arch_3); #endif return u; diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index e66f7aa3191d..447cdd4b1311 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -134,6 +134,7 @@ enum pageflags { #endif #ifdef CONFIG_64BIT PG_arch_2, + PG_arch_3, #endif #ifdef CONFIG_KASAN_HW_TAGS PG_skip_kasan_poison, diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h index e87cb2b80ed3..ecf7ae0de3d8 100644 --- a/include/trace/events/mmflags.h +++ b/include/trace/events/mmflags.h @@ -92,9 +92,9 @@ #endif #ifdef CONFIG_64BIT -#define IF_HAVE_PG_ARCH_2(flag,string) ,{1UL << flag, string} +#define IF_HAVE_PG_ARCH_2_3(flag,string) ,{1UL << flag, string} #else -#define IF_HAVE_PG_ARCH_2(flag,string) +#define IF_HAVE_PG_ARCH_2_3(flag,string) #endif #ifdef CONFIG_KASAN_HW_TAGS @@ -130,7 +130,8 @@ IF_HAVE_PG_UNCACHED(PG_uncached, "uncached" ) \ IF_HAVE_PG_HWPOISON(PG_hwpoison, "hwpoison" ) \ IF_HAVE_PG_IDLE(PG_young, "young" ) \ IF_HAVE_PG_IDLE(PG_idle, "idle" ) \ -IF_HAVE_PG_ARCH_2(PG_arch_2, "arch_2" ) \ +IF_HAVE_PG_ARCH_2_3(PG_arch_2, "arch_2" ) \ +IF_HAVE_PG_ARCH_2_3(PG_arch_3, "arch_3" ) \ IF_HAVE_PG_SKIP_KASAN_POISON(PG_skip_kasan_poison, "skip_kasan_poison") #define show_page_flags(flags) \ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 834f288b3769..e8fda2c28256 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2369,6 +2369,7 @@ static void __split_huge_page_tail(struct page *head, int tail, (1L << PG_unevictable) | #ifdef CONFIG_64BIT (1L << PG_arch_2) | + (1L << PG_arch_3) | #endif (1L << PG_dirty))); From patchwork Tue Jul 5 14:26:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Catalin Marinas X-Patchwork-Id: 12906639 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 54907C433EF for ; Tue, 5 Jul 2022 14:28:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=nAK+ZADfxXrr9ujRW086Xi8fJb0k4iaTy6M4fuhwowI=; b=1hxR+R6128yMLZ /Fd/BlvqqWjWG9o1Pl5Hos/JLZuKgm198rJsb73/gS2c9oWCfMyLpuc5z87NqnwVKtC2lack7umi0 JMDSK/pHW2J90wMUYA5RpzG0rYrc6lxZShXLOOmM5ngqcUweYpVJt7P4vfYXANqlaWqUZCdhhwtHE B4izagGU/CFzTQppkQ2q5kDldTMi+/ZRGLFoypQmCdy8aMOObXubPt/lNXIy3uNTQkCvAiOR9oa8g +9syAjx1egr5YkwEEy5l7VlAstTigtgKG14mZonx1mwQVhBfxflYvURbCpPDa9zn2DtdswXJJ6nZC OPkoBOQymO5RG6hHe+Jw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1o8jW6-000nyU-8l; Tue, 05 Jul 2022 14:27:06 +0000 Received: from dfw.source.kernel.org ([2604:1380:4641:c500::1]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1o8jVX-000nna-Mm for linux-arm-kernel@lists.infradead.org; Tue, 05 Jul 2022 14:26:34 +0000 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2B9216198B; Tue, 5 Jul 2022 14:26:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2C3CAC341CF; Tue, 5 Jul 2022 14:26:29 +0000 (UTC) From: Catalin Marinas To: Will Deacon , Marc Zyngier , Steven Price , Peter Collingbourne Cc: Vincenzo Frascino , linux-arm-kernel@lists.infradead.org Subject: [PATCH 4/4] arm64: mte: Lock a page for MTE tag initialisation Date: Tue, 5 Jul 2022 15:26:19 +0100 Message-Id: <20220705142619.4135905-5-catalin.marinas@arm.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220705142619.4135905-1-catalin.marinas@arm.com> References: <20220705142619.4135905-1-catalin.marinas@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220705_072631_878470_53077700 X-CRM114-Status: GOOD ( 27.95 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Initialising the tags and setting PG_mte_tagged flag for a page can race between multiple set_pte_at() on shared pages or setting the stage 2 pte via user_mem_abort(). Introduce a new PG_mte_lock flag as PG_arch_3 and set it before attempting page initialisation. Given that PG_mte_tagged is never cleared for a page, consider setting this flag to mean page unlocked and wait on this bit with acquire semantics if the page is locked: - try_page_mte_tagging() - lock the page for tagging, return true if it can be tagged, false if already tagged. No acquire semantics if it returns true (PG_mte_tagged not set) as there is no serialisation with a previous set_page_mte_tagged(). - set_page_mte_tagged() - set PG_mte_tagged with release semantics. The two-bit locking is based on Peter Colingbourne's idea. Signed-off-by: Catalin Marinas Cc: Will Deacon Cc: Marc Zyngier Cc: Steven Price Cc: Peter Collingbourne --- arch/arm64/include/asm/mte.h | 32 ++++++++++++++++++++++++++++++++ arch/arm64/include/asm/pgtable.h | 1 + arch/arm64/kernel/cpufeature.c | 2 +- arch/arm64/kernel/mte.c | 7 +++++-- arch/arm64/kvm/guest.c | 16 ++++++++++------ arch/arm64/kvm/mmu.c | 2 +- arch/arm64/mm/copypage.c | 2 ++ arch/arm64/mm/fault.c | 2 ++ arch/arm64/mm/mteswap.c | 3 +++ 9 files changed, 57 insertions(+), 10 deletions(-) diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h index c69218c56980..29712fc9df8c 100644 --- a/arch/arm64/include/asm/mte.h +++ b/arch/arm64/include/asm/mte.h @@ -36,6 +36,8 @@ void mte_free_tag_storage(char *storage); /* track which pages have valid allocation tags */ #define PG_mte_tagged PG_arch_2 +/* simple lock to avoid multiple threads tagging the same page */ +#define PG_mte_lock PG_arch_3 static inline void set_page_mte_tagged(struct page *page) { @@ -60,6 +62,32 @@ static inline bool page_mte_tagged(struct page *page) return ret; } +/* + * Lock the page for tagging and return 'true' if the page can be tagged, + * 'false' if already tagged. PG_mte_tagged is never cleared and therefore the + * locking only happens once for page initialisation. + * + * The page MTE lock state: + * + * Locked: PG_mte_lock && !PG_mte_tagged + * Unlocked: !PG_mte_lock || PG_mte_tagged + * + * Acquire semantics only if the page is tagged (returning 'false'). + */ +static inline bool try_page_mte_tagging(struct page *page) +{ + if (!test_and_set_bit(PG_mte_lock, &page->flags)) + return !page_mte_tagged(page); + + /* + * The tags are being initialised, wait for the PG_mte_tagged flag to + * be set. + */ + smp_cond_load_acquire(&page->flags, VAL & (1UL << PG_mte_tagged)); + + return false; +} + void mte_zero_clear_page_tags(void *addr); void mte_sync_tags(pte_t old_pte, pte_t pte); void mte_copy_page_tags(void *kto, const void *kfrom); @@ -84,6 +112,10 @@ static inline bool page_mte_tagged(struct page *page) { return false; } +static inline bool try_page_mte_tagging(struct page *page) +{ + return false; +} static inline void mte_zero_clear_page_tags(void *addr) { } diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 08823669db0a..ce2dc72f64f4 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1033,6 +1033,7 @@ static inline void arch_swap_invalidate_area(int type) #define __HAVE_ARCH_SWAP_RESTORE static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio) { + /* mte_restore_tags() takes the PG_mte_lock */ if (system_supports_mte() && mte_restore_tags(entry, &folio->page)) set_page_mte_tagged(&folio->page); } diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 4478e5774580..c07dd7916517 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -1964,7 +1964,7 @@ static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap) * Clear the tags in the zero page. This needs to be done via the * linear map which has the Tagged attribute. */ - if (!page_mte_tagged(ZERO_PAGE(0))) { + if (try_page_mte_tagging(ZERO_PAGE(0))) { mte_clear_page_tags(lm_alias(empty_zero_page)); set_page_mte_tagged(ZERO_PAGE(0)); } diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index 67e82ce4c285..54d284a1e0a7 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -41,6 +41,7 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte, if (check_swap && is_swap_pte(old_pte)) { swp_entry_t entry = pte_to_swp_entry(old_pte); + /* mte_restore_tags() takes the PG_mte_lock */ if (!non_swap_entry(entry) && mte_restore_tags(entry, page)) { set_page_mte_tagged(page); return; @@ -59,8 +60,10 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte, * the new page->flags are visible before the tags were updated. */ smp_wmb(); - mte_clear_page_tags(page_address(page)); - set_page_mte_tagged(page); + if (try_page_mte_tagging(page)) { + mte_clear_page_tags(page_address(page)); + set_page_mte_tagged(page); + } } void mte_sync_tags(pte_t old_pte, pte_t pte) diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 3b04e69006b4..059b38e7a9e8 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -1067,15 +1067,19 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm, clear_user(tags, MTE_GRANULES_PER_PAGE); kvm_release_pfn_clean(pfn); } else { + /* + * Only locking to serialise with a concurrent + * set_pte_at() in the VMM but still overriding the + * tags, hence ignoring the return value. + */ + try_page_mte_tagging(page); num_tags = mte_copy_tags_from_user(maddr, tags, MTE_GRANULES_PER_PAGE); - /* - * Set the flag after checking the write - * completed fully - */ - if (num_tags == MTE_GRANULES_PER_PAGE) - set_page_mte_tagged(page); + /* uaccess failed, don't leave stale tags */ + if (num_tags != MTE_GRANULES_PER_PAGE) + mte_clear_page_tags(page); + set_page_mte_tagged(page); kvm_release_pfn_dirty(pfn); } diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 35850f17ae08..fdd46089f260 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1066,7 +1066,7 @@ static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn, return; for (i = 0; i < nr_pages; i++, page++) { - if (!page_mte_tagged(page)) { + if (try_page_mte_tagging(page)) { mte_clear_page_tags(page_address(page)); set_page_mte_tagged(page); } diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c index f36d796f1bce..9c73bc020894 100644 --- a/arch/arm64/mm/copypage.c +++ b/arch/arm64/mm/copypage.c @@ -31,6 +31,8 @@ void copy_highpage(struct page *to, struct page *from) * the new page->flags are visible before the tags were updated. */ smp_wmb(); + /* It's a new page, shouldn't have been tagged yet */ + WARN_ON_ONCE(!try_page_mte_tagging(to)); mte_copy_page_tags(kto, kfrom); set_page_mte_tagged(to); } diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 147fe28d3fbe..bd28d6bd9286 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -926,6 +926,8 @@ struct page *alloc_zeroed_user_highpage_movable(struct vm_area_struct *vma, void tag_clear_highpage(struct page *page) { + /* Newly allocated page, shouldn't have been tagged yet */ + WARN_ON_ONCE(!try_page_mte_tagging(page)); mte_zero_clear_page_tags(page_address(page)); page_kasan_tag_reset(page); set_page_mte_tagged(page); diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c index 9d3a8cf388fc..aec76a4423e9 100644 --- a/arch/arm64/mm/mteswap.c +++ b/arch/arm64/mm/mteswap.c @@ -62,6 +62,9 @@ bool mte_restore_tags(swp_entry_t entry, struct page *page) * the new page->flags are visible before the tags were updated. */ smp_wmb(); + /* racing tag restoring? */ + if (!try_page_mte_tagging(page)) + return false; mte_restore_page_tags(page_address(page), tags); return true;