From patchwork Thu Jan 25 16:42:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531288 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 00305C47422 for ; Thu, 25 Jan 2024 16:43:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=+IvUEs/v04qJ1rZzPCn5ep+z+42HTj9RB4EdScZ4zIc=; b=K5DQByDVRa/4cw bVhf/3Dj82wZ7PRYkLlMJRVHfGrrPeqiH40+KwX7vlTvjLH89y65E2o2FI8peglbSFJoaomh7E/Vn YMy2uYyvbjyxKVSEF9OoHaIwUhJFNP9+e9DP18br22HqtPm38S3LZbUlO+mH+Z3U5ak/8Ui/KEZES xFLS4ajoYYpahgrhB0rrKXkcJVSo2Yc0PWgy44sT+r0JuOgcGihBFEAkQjOosGkhzRrrqlxNVR4PV xIJYFP0DsDZ6eBT6yyC4eqWvmXdPDEXg+KxYYHwXizZmLgHdOh7Yw9bMprGtsaFqVTSEGxHUfvVH2 kk5bRZ/F+skUd+q0I0iQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2ox-00000000s5A-3QlP; Thu, 25 Jan 2024 16:43:19 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2oZ-00000000rnI-3EuL for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:42:58 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 46F74FEC; Thu, 25 
Jan 2024 08:43:37 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 52C263F5A1; Thu, 25 Jan 2024 08:42:47 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 01/35] mm: page_alloc: Add gfp_flags parameter to arch_alloc_page() Date: Thu, 25 Jan 2024 16:42:22 +0000 Message-Id: <20240125164256.4147-2-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084255_964421_61C2E9ED X-CRM114-Status: UNSURE ( 9.41 ) X-CRM114-Notice: Please train this message. 
X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Extend the usefulness of arch_alloc_page() by adding the gfp_flags parameter. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * New patch. arch/s390/include/asm/page.h | 2 +- arch/s390/mm/page-states.c | 2 +- include/linux/gfp.h | 2 +- mm/page_alloc.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h index 73b9c3bf377f..859f0958c574 100644 --- a/arch/s390/include/asm/page.h +++ b/arch/s390/include/asm/page.h @@ -163,7 +163,7 @@ static inline int page_reset_referenced(unsigned long addr) struct page; void arch_free_page(struct page *page, int order); -void arch_alloc_page(struct page *page, int order); +void arch_alloc_page(struct page *page, int order, gfp_t gfp_flags); static inline int devmem_is_allowed(unsigned long pfn) { diff --git a/arch/s390/mm/page-states.c b/arch/s390/mm/page-states.c index 01f9b39e65f5..b986c8b158e3 100644 --- a/arch/s390/mm/page-states.c +++ b/arch/s390/mm/page-states.c @@ -21,7 +21,7 @@ void arch_free_page(struct page *page, int order) __set_page_unused(page_to_virt(page), 1UL << order); } -void arch_alloc_page(struct page *page, int order) +void arch_alloc_page(struct page *page, int order, gfp_t gfp_flags) { if (!cmma_flag) return; diff --git a/include/linux/gfp.h b/include/linux/gfp.h index de292a007138..9e8aa3d144db 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -172,7 +172,7 @@ static inline struct zonelist *node_zonelist(int nid, gfp_t flags) static inline void arch_free_page(struct page *page, int order) { } #endif #ifndef HAVE_ARCH_ALLOC_PAGE -static inline void arch_alloc_page(struct page *page, int order) { } +static inline void 
arch_alloc_page(struct page *page, int order, gfp_t gfp_flags) { } #endif struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid, diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 150d4f23b010..2c140abe5ee6 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1485,7 +1485,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order, set_page_private(page, 0); set_page_refcounted(page); - arch_alloc_page(page, order); + arch_alloc_page(page, order, gfp_flags); debug_pagealloc_map_pages(page, 1 << order); /* From patchwork Thu Jan 25 16:42:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531291 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C882CC47258 for ; Thu, 25 Jan 2024 16:43:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=0FbszY1V/LU0JDYoIddYUFzWuLi2TEXiRd8zVCresB0=; b=RA5pBL2v2koSzl K9vCBa5x6heSvDHByrS1EqpHlRNZscB8BdHQJPRHH/cgZOoWYXluqqlYLfvmw+SPrBgdHOlFSqUHb 8UXklzvas9LVHViRnP86xPXd3s3kttMYujj2J0Ie5zx3rRXsekFAsdkFxmOGXwFPUMgp6+UZydGuu MH0WoRsD0aLI/07K0tKOmfPJ/IFLphqVi6ClzwUsB9Kxec4EZPIG7YXVnigOwbLt8LaNlb9gQjZzt SFABgpNa4So6xPEippAt4Js+vK35rYuq9yw3/1QuIfRds8DIfJgf5F3pgZQi436KJz8qDfLmAkGb5 0UeEUfLalPBTJAJZ1ZVw==; Received: 
from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2p2-00000000s9P-2tZr; Thu, 25 Jan 2024 16:43:24 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2od-00000000rqC-2Rmu for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:43:02 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 13EB31476; Thu, 25 Jan 2024 08:43:43 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 219223F5A1; Thu, 25 Jan 2024 08:42:52 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 02/35] mm: page_alloc: Add an arch hook early in free_pages_prepare() Date: Thu, 25 Jan 2024 16:42:23 +0000 Message-Id: <20240125164256.4147-3-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 
X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084259_750845_B6E4C0DF X-CRM114-Status: GOOD ( 12.90 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The arm64 MTE code uses the PG_arch_2 page flag, which it renames to PG_mte_tagged, to track if a page has been mapped with tagging enabled. That flag is cleared by free_pages_prepare() by doing: page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; When tag storage management is added, tag storage will be reserved for a page if and only if the page is mapped as tagged (the page flag PG_mte_tagged is set). When a page is freed, likewise, the code will have to look at the page flags to determine if the page has tag storage reserved, which should also be freed. For this purpose, add an arch_free_pages_prepare() hook that is called before the page flags are cleared. The function arch_free_page() has also been considered for this purpose, but it is called after the flags are cleared. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * Expanded commit message (David Hildenbrand). include/linux/pgtable.h | 4 ++++ mm/page_alloc.c | 1 + 2 files changed, 5 insertions(+) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index f6d0e3513948..6d98d5fdd697 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -901,6 +901,10 @@ static inline void arch_do_swap_page(struct mm_struct *mm, } #endif +#ifndef __HAVE_ARCH_FREE_PAGES_PREPARE +static inline void arch_free_pages_prepare(struct page *page, int order) { } +#endif + #ifndef __HAVE_ARCH_UNMAP_ONE /* * Some architectures support metadata associated with a page.
When a diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 2c140abe5ee6..27282a1c82fe 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1092,6 +1092,7 @@ static __always_inline bool free_pages_prepare(struct page *page, trace_mm_page_free(page, order); kmsan_free_page(page, order); + arch_free_pages_prepare(page, order); if (memcg_kmem_online() && PageMemcgKmem(page)) __memcg_kmem_uncharge_page(page, order); From patchwork Thu Jan 25 16:42:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531290 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E897FC4828A for ; Thu, 25 Jan 2024 16:43:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=43EJ4ksG92xRScLObdLk9fJm+PjnDI7qB7FJ9b1ZIU4=; b=ol5zDrRKfRxjdt daLb4XqAimd1w/T7p1fiJGa5heG9pEfbQqthcl8qfdY8KuomBSijGyEUZq9xlf8hNUBd/zp63tWKZ V9Ld4MgOW0ajtZtwmO1L3Rc/2bLddZ7cea4APLuYqT5ZTaHbjNxZzep8wDOCKN3OKIi4Sb4UOQXHF IrKQJ4cdx0IGnq/uNuT14K7owe1o9co1P06KFfw/5foeRuJBlcuT5Zg5tjdvop10xZbLuYiJZJY83 ERDhYvnqhh0HTdqC8LWOO/RJxEgcf/sZ5WdmN07n3bOrHj8dMLbgsQNXDF6Z2b2RZHqiaHVs1yXFx SO5aMtrYUkrk97vH1uug==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 
1rT2p5-00000000sCa-3AGU; Thu, 25 Jan 2024 16:43:27 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2ok-00000000rvH-3i1v for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:43:08 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D2B4A1477; Thu, 25 Jan 2024 08:43:48 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id E1DA43F5A1; Thu, 25 Jan 2024 08:42:58 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 03/35] mm: page_alloc: Add an arch hook to filter MIGRATE_CMA allocations Date: Thu, 25 Jan 2024 16:42:24 +0000 Message-Id: <20240125164256.4147-4-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084306_991871_3F402C31 
X-CRM114-Status: GOOD ( 10.67 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org As an architecture might have specific requirements around the allocation of CMA pages, add an arch hook that can disable allocations from MIGRATE_CMA, if the allocation was otherwise allowed. This will be used by arm64, which will put tag storage pages on the MIGRATE_CMA list, and tag storage pages cannot be tagged. The filter will be used to deny using MIGRATE_CMA for __GFP_TAGGED allocations. Signed-off-by: Alexandru Elisei --- include/linux/pgtable.h | 7 +++++++ mm/page_alloc.c | 3 ++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 6d98d5fdd697..c5ddec6b5305 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -905,6 +905,13 @@ static inline void arch_do_swap_page(struct mm_struct *mm, static inline void arch_free_pages_prepare(struct page *page, int order) { } #endif +#ifndef __HAVE_ARCH_ALLOC_CMA +static inline bool arch_alloc_cma(gfp_t gfp) +{ + return true; +} +#endif + #ifndef __HAVE_ARCH_UNMAP_ONE /* * Some architectures support metadata associated with a page. 
When a diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 27282a1c82fe..a96d47a6393e 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -3157,7 +3157,8 @@ static inline unsigned int gfp_to_alloc_flags_cma(gfp_t gfp_mask, unsigned int alloc_flags) { #ifdef CONFIG_CMA - if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE) + if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE && + arch_alloc_cma(gfp_mask)) alloc_flags |= ALLOC_CMA; #endif return alloc_flags; From patchwork Thu Jan 25 16:42:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531293 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1CCD1C47258 for ; Thu, 25 Jan 2024 16:43:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=pDpVCBf9Kb7THevOj+iis4cgx/rHdeElC+LH6D4apHo=; b=wn9SbYLBL7bGnd ME8YJI+z1dtE03cbKmI662+aaU7GCXKn2yU9XnmgqZJ0fmlJFBQtp4wQDj25xcI8ftS+zPrCstWba nxJd5Agu1JM+A27WK1fDHaanFoHmClPEDzcFyUsw1kwY2WBVp7PK6YKQljEfLs7YXO5dI51YxKSep cdHZfAOhnunhkUX+TLb6z6n4+ZVuBZLke6IFpS0TOTuX7c+RzyMlEsnrhK2mC9Z+dt5Plj+WDWnun Usq6adtN6AAVkUEEduZAd3R3ln1aF+vhVA0UfZO/jsbF+AirrXQO/YButPF7d2cMnxyJwFDw+fc8m WkJW7/i6La/7yIGRLs7w==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp 
(Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2p9-00000000sFx-0F8m; Thu, 25 Jan 2024 16:43:31 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2op-00000000rxy-230C for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:43:13 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B2D5A1480; Thu, 25 Jan 2024 08:43:54 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id ADAA33F5A1; Thu, 25 Jan 2024 08:43:04 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 04/35] mm: page_alloc: Partially revert "mm: page_alloc: remove stale CMA guard code" Date: Thu, 25 Jan 2024 16:42:25 +0000 Message-Id: <20240125164256.4147-5-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 
X-CRM114-CacheID: sfid-20240125_084311_656599_F57BBC41 X-CRM114-Status: GOOD ( 11.51 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The patch f945116e4e19 ("mm: page_alloc: remove stale CMA guard code") removed the CMA filter when allocating from the MIGRATE_MOVABLE pcp list because CMA is always allowed when __GFP_MOVABLE is set. With the introduction of the arch_alloc_cma() function, the above is not true anymore, so bring back the filter. This is a partial revert because the stale comment remains removed. Signed-off-by: Alexandru Elisei --- mm/page_alloc.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index a96d47a6393e..0fa34bcfb1af 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2897,10 +2897,17 @@ struct page *rmqueue(struct zone *preferred_zone, WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1)); if (likely(pcp_allowed_order(order))) { - page = rmqueue_pcplist(preferred_zone, zone, order, - migratetype, alloc_flags); - if (likely(page)) - goto out; + /* + * MIGRATE_MOVABLE pcplist could have the pages on CMA area and + * we need to skip it when CMA area isn't allowed.
+ */ + if (!IS_ENABLED(CONFIG_CMA) || alloc_flags & ALLOC_CMA || + migratetype != MIGRATE_MOVABLE) { + page = rmqueue_pcplist(preferred_zone, zone, order, + migratetype, alloc_flags); + if (likely(page)) + goto out; + } } page = rmqueue_buddy(preferred_zone, zone, order, alloc_flags, From patchwork Thu Jan 25 16:42:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531398 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5F488C47422 for ; Thu, 25 Jan 2024 16:45:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=B+YVjY3koNj/Qxls83XfRjAEqAqLCaDC/qHqtVw5Pos=; b=A/usBwvGIopw8E q8uvvz6brE8UxMuwRYNVqQEDSZbsz4k7hY/Jf6scvWBEfnhkIyYiVfi7OeSfMESyLhxlqI1amFoS6 cuDfZPpZPlgy56gwO9HtDKS6bRPBN7TwJCIU4HUsMw34fKTKy9zReIwKi9G7CjxoKpB6QIlkWcT9G yBfa/pS98G44kOPb1iNfIQdunbjBAVVicVnm9LqaLLa28j0BZu4OwujupnhTlaFkf2wr8Lo+DJcmQ P7fYyMyWzz/gPgkoJLOp4crSZrRNmT2800VEEgmrXGuQJVhB3W3sgkk8nHTrYch48LxWjTQuedlFu T8IEelA22c1O6EK0c1fg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qw-00000000tfG-2Wx8; Thu, 25 Jan 2024 16:45:22 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with 
esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2ov-00000000s1X-1K4B for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:43:19 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6546F14BF; Thu, 25 Jan 2024 08:44:00 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 776B83F5A1; Thu, 25 Jan 2024 08:43:10 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 05/35] mm: cma: Don't append newline when generating CMA area name Date: Thu, 25 Jan 2024 16:42:26 +0000 Message-Id: <20240125164256.4147-6-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084317_620248_F8324C9E X-CRM114-Status: UNSURE ( 9.77 ) X-CRM114-Notice: Please train this message. 
X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org cma->name is displayed in several CMA messages. When the name is generated by the CMA code, don't append a newline to avoid breaking the text across two lines. Signed-off-by: Alexandru Elisei Reviewed-by: Anshuman Khandual --- Changes since rfc v2: * New patch. This is a fix, and can be merged independently of the other patches. mm/cma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/cma.c b/mm/cma.c index 7c09c47e530b..f49c95f8ee37 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -204,7 +204,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size, if (name) snprintf(cma->name, CMA_MAX_NAME, name); else - snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count); + snprintf(cma->name, CMA_MAX_NAME, "cma%d", cma_area_count); cma->base_pfn = PFN_DOWN(base); cma->count = size >> PAGE_SHIFT; From patchwork Thu Jan 25 16:42:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531393 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4C9ABC4828C for ; Thu, 25 Jan 2024 16:45:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: 
List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=JA4BAVu2kmkadpMR1NyjpKqgYsxk6rvdgV3huWT1HGQ=; b=boyQOpqp2xWURJ 1BcMK7UIXvMEPfP4GMDSCSxhi3v2V3oB9OHELdiiMD137hcEmZXnk+tplfZnc49hScCE+Gl4zt+mb 7+/anhzIdGlj0K1heLrfba6I3bs+/ghK7aM3NUwdmrAb2ib6pnE4xxBKtzPmYHl4ExgBaE3Mrr907 pRmSPuXSf5OMO2PP0QkZdhjt7/ZXNrYAIlOyAQZr4kUGVJZRkRK9XToFHBXwxjawcz9FPc5r8oEg8 /wxnD42eGb+sDMCvZgLXBY1Ajj7+K54q99rER5ObmyQtiD7EBnqQ7qjqT/LEkc0ALKzhrjIM/Wiz6 tJfh6L+STl65YlvCqnxA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qY-00000000tHS-3vSY; Thu, 25 Jan 2024 16:44:58 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2p0-00000000s7M-3T5w for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:43:25 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2BBFC150C; Thu, 25 Jan 2024 08:44:06 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 3BDCC3F5A1; Thu, 25 Jan 2024 08:43:16 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, 
david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 06/35] mm: cma: Make CMA_ALLOC_SUCCESS/FAIL count the number of pages Date: Thu, 25 Jan 2024 16:42:27 +0000 Message-Id: <20240125164256.4147-7-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084323_069073_9369CE38 X-CRM114-Status: GOOD ( 12.45 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The CMA_ALLOC_SUCCESS and CMA_ALLOC_FAIL events are incremented by one after each successful or failed cma_alloc() function call, respectively. This is done even though cma_alloc() can allocate an arbitrary number of CMA pages. When looking at /proc/vmstat, the number of successful (or failed) cma_alloc() calls says little about how many CMA pages were allocated via cma_alloc() versus via the page allocator (regular allocation request or PCP lists refill). This can also be rather confusing to a user who isn't familiar with the code, since the unit of measurement for nr_free_cma is the number of pages, but cma_alloc_success and cma_alloc_fail count the number of cma_alloc() function calls.
Let's make this consistent, and arguably more useful, by having CMA_ALLOC_SUCCESS count the number of successfully allocated CMA pages, and CMA_ALLOC_FAIL count the number of pages the cma_alloc() failed to allocate. For users that wish to track the number of cma_alloc() calls, there are tracepoints for that already implemented. Signed-off-by: Alexandru Elisei --- mm/cma.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/cma.c b/mm/cma.c index f49c95f8ee37..dbf7fe8cb1bd 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -517,10 +517,10 @@ struct page *cma_alloc(struct cma *cma, unsigned long count, pr_debug("%s(): returned %p\n", __func__, page); out: if (page) { - count_vm_event(CMA_ALLOC_SUCCESS); + count_vm_events(CMA_ALLOC_SUCCESS, count); cma_sysfs_account_success_pages(cma, count); } else { - count_vm_event(CMA_ALLOC_FAIL); + count_vm_events(CMA_ALLOC_FAIL, count); if (cma) cma_sysfs_account_fail_pages(cma, count); } From patchwork Thu Jan 25 16:42:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531389 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 961C7C47258 for ; Thu, 25 Jan 2024 16:45:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; 
bh=8DVwAmrumDJ3PXS6Xxo4J3LvKFT0uBJwpsWTwztkwgw=; b=rlFDSqul4f5U+5 5WxgDAWIo0WbHMyCrPwWQd7sIb8PBCxmnmNHYoDy3qeeqLkCnlyqTrrTSW4IQYantpX+mku9fVEkG OgCOlt0RaMKzilkufpm0X8JtqEOYxX36sp5spadKnBEFb4dwnGrmreXnOnUzVjebfeByH2SvdIpPT nWPZ16xXimKkryvbFRCiO3xPA6EfRl4uPvbx3qqI2P0NMfQiEISIqkPExUgdAcv+P97HM+dSHivre P0zhlMOntFdDqPBnBoA1vJ5+Bq+W2wr3x6/CpTdEP/xi5IGXp1pYUVfsyRoeXapy3AUAZtDk31cXz owxskLby4Ggp2iKd9qrg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qZ-00000000tHv-2ZJq; Thu, 25 Jan 2024 16:44:59 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2p6-00000000sDI-392g for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:43:35 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E9A0F1515; Thu, 25 Jan 2024 08:44:11 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 053093F5A1; Thu, 25 Jan 2024 08:43:21 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, 
linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org
Subject: [PATCH RFC v3 07/35] mm: cma: Add CMA_RELEASE_{SUCCESS,FAIL} events
Date: Thu, 25 Jan 2024 16:42:28 +0000
Message-Id: <20240125164256.4147-8-alexandru.elisei@arm.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com>
References: <20240125164256.4147-1-alexandru.elisei@arm.com>
MIME-Version: 1.0
X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3
X-CRM114-CacheID: sfid-20240125_084329_265504_94A914C5
X-CRM114-Status: UNSURE ( 9.92 )
X-CRM114-Notice: Please train this message.
X-BeenThere: linux-arm-kernel@lists.infradead.org
X-Mailman-Version: 2.1.34
Precedence: list
Sender: "linux-arm-kernel"
Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

Similar to the two events that relate to CMA allocations, add the
CMA_RELEASE_SUCCESS and CMA_RELEASE_FAIL events, which count the number of
CMA pages that cma_release() freed or failed to free.

Signed-off-by: Alexandru Elisei
---

Changes since rfc v2:

* New patch.

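Not part of the patch: a minimal userspace sketch of how the CMA counters
could be read once this change is applied. It assumes the /proc/vmstat
names that this patch adds to vmstat_text[] (cma_release_success,
cma_release_fail) next to the existing cma_alloc_success and
cma_alloc_fail. vmstat_lookup() is a hypothetical helper, and it works on
a buffer so that it can be tried without a CMA-enabled kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find one counter in a buffer laid out like
 * /proc/vmstat ("name value\n" per line).
 * Returns the value, or -1 if the name is not present.
 */
static long long vmstat_lookup(const char *buf, const char *name)
{
	size_t len = strlen(name);
	const char *p = buf;
	long long val;

	assert(buf && len > 0);

	while (*p) {
		/* Require a full-name match followed by the separator. */
		if (!strncmp(p, name, len) && p[len] == ' ')
			return sscanf(p + len, "%lld", &val) == 1 ? val : -1;

		/* Skip to the start of the next line, if any. */
		p = strchr(p, '\n');
		if (!p)
			break;
		p++;
	}
	return -1;
}
```

Reading /proc/vmstat into a buffer with fread() and calling
vmstat_lookup(buf, "cma_release_success") would then report released
pages rather than the number of cma_release() calls.
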
include/linux/vm_event_item.h | 2 ++ mm/cma.c | 6 +++++- mm/vmstat.c | 2 ++ 3 files changed, 9 insertions(+), 1 deletion(-) diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h index 747943bc8cc2..aba5c5bf8127 100644 --- a/include/linux/vm_event_item.h +++ b/include/linux/vm_event_item.h @@ -83,6 +83,8 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT, #ifdef CONFIG_CMA CMA_ALLOC_SUCCESS, CMA_ALLOC_FAIL, + CMA_RELEASE_SUCCESS, + CMA_RELEASE_FAIL, #endif UNEVICTABLE_PGCULLED, /* culled to noreclaim list */ UNEVICTABLE_PGSCANNED, /* scanned for reclaimability */ diff --git a/mm/cma.c b/mm/cma.c index dbf7fe8cb1bd..543bb6b3be8e 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -562,8 +562,10 @@ bool cma_release(struct cma *cma, const struct page *pages, { unsigned long pfn; - if (!cma_pages_valid(cma, pages, count)) + if (!cma_pages_valid(cma, pages, count)) { + count_vm_events(CMA_RELEASE_FAIL, count); return false; + } pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count); @@ -575,6 +577,8 @@ bool cma_release(struct cma *cma, const struct page *pages, cma_clear_bitmap(cma, pfn, count); trace_cma_release(cma->name, pfn, pages, count); + count_vm_events(CMA_RELEASE_SUCCESS, count); + return true; } diff --git a/mm/vmstat.c b/mm/vmstat.c index db79935e4a54..eebfd5c6c723 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1340,6 +1340,8 @@ const char * const vmstat_text[] = { #ifdef CONFIG_CMA "cma_alloc_success", "cma_alloc_fail", + "cma_release_success", + "cma_release_fail", #endif "unevictable_pgs_culled", "unevictable_pgs_scanned", From patchwork Thu Jan 25 16:42:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531392 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 
with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E7AF6C48260 for ; Thu, 25 Jan 2024 16:45:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=YCMZ648exT9LDX2Jjkda8FiKFd4p8u7Bt6Oq9xmducI=; b=p0gs7bDmMDQG/6 8o8l3iPZwHB5jKCmhwtVH5gf5DhKFhZD2hAnQG9QRjCHe3zctGZEfJMgkyv8qtxz9PiE6LBCxLAa7 ybunnHy8xq71pinv34pKsSZdotOZpMqWHJ200canR79iRWeP6F1OwyguxfdQJ9pSX4LdgBIqkBq1a YroQpryXKbwHqq+BEpsZ2bZarWPjaeze68T6QzIUjxyWS2DUsP4mrx3ipBTnHD4ppZGGdVr2PgmIc ZWliBNVb38cN1I1lciLKc9fVbCkoz9Xo/7/ELCsjZS15P3knkw/KPYwgos4l6+aXJF3+apIqJZ8OK 3JBvGJY8BJipQUhVPQhg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qa-00000000tIh-1r83; Thu, 25 Jan 2024 16:45:00 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2pC-00000000sIt-2gyX for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:43:45 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B3E191516; Thu, 25 Jan 2024 08:44:17 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C41E03F5A1; Thu, 25 Jan 2024 08:43:27 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, 
akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 08/35] mm: cma: Introduce cma_alloc_range() Date: Thu, 25 Jan 2024 16:42:29 +0000 Message-Id: <20240125164256.4147-9-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084335_484268_892E7553 X-CRM114-Status: GOOD ( 19.33 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Today, cma_alloc() is used to allocate a contiguous memory region. The function allows the caller to specify the number of pages to allocate, but not the starting address. cma_alloc() will walk over the entire CMA region trying to allocate the first available range of the specified size. Introduce cma_alloc_range(), which makes CMA more versatile by allowing the caller to specify a particular range in the CMA region, defined by the start pfn and the size. 
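
As an illustration only, the bookkeeping side of such a range request can
be modelled in userspace. The sketch below is a simplification under
stated assumptions, not the kernel implementation: struct cma_model and
model_reserve_range() are hypothetical names, the bitmap is a plain bool
array, and the page migration done by alloc_contig_range() is left out.
It only shows the range, alignment and busy checks that a caller-supplied
start pfn makes necessary.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_MAX_BITS 64

/* Hypothetical stand-in for the fields of struct cma that matter here. */
struct cma_model {
	unsigned long base_pfn;		/* first pfn of the area */
	unsigned long count;		/* number of pages in the area */
	unsigned int order_per_bit;	/* one bit covers 1 << order pages */
	bool bitmap[MODEL_MAX_BITS];	/* true means reserved */
};

/* Reserve [start, start + count), or fail and leave the bitmap as is. */
static int model_reserve_range(struct cma_model *cma, unsigned long start,
			       unsigned long count)
{
	unsigned long granule, first, nbits, i;

	assert(cma && (cma->count >> cma->order_per_bit) <= MODEL_MAX_BITS);
	granule = 1UL << cma->order_per_bit;

	/* The range must be non-empty and lie inside the area. */
	if (!count || start < cma->base_pfn ||
	    start + count > cma->base_pfn + cma->count)
		return -EINVAL;

	/* Start and size must both fall on a bitmap-bit boundary. */
	if ((start | count) & (granule - 1))
		return -EINVAL;

	first = (start - cma->base_pfn) >> cma->order_per_bit;
	nbits = count >> cma->order_per_bit;

	/* Any bit already set means part of the range is taken. */
	for (i = first; i < first + nbits; i++)
		if (cma->bitmap[i])
			return -EEXIST;

	for (i = first; i < first + nbits; i++)
		cma->bitmap[i] = true;

	return 0;
}
```

Unlike a first-fit search, the model never looks for an alternative
position: either the exact range requested is free, or the call fails.
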
arm64 will make use of this function once tag storage management is
implemented: cma_alloc_range() will be used to reserve the tag storage
associated with a tagged page.

Signed-off-by: Alexandru Elisei
---

Changes since rfc v2:

* New patch.

 include/linux/cma.h        |  2 +
 include/trace/events/cma.h | 59 ++++++++++++++++++++++++++
 mm/cma.c                   | 86 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 147 insertions(+)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index 63873b93deaa..e32559da6942 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -50,6 +50,8 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 					struct cma **res_cma);
 extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
 			      bool no_warn);
+extern int cma_alloc_range(struct cma *cma, unsigned long start, unsigned long count,
+			   unsigned tries, gfp_t gfp);
 extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned long count);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);

diff --git a/include/trace/events/cma.h b/include/trace/events/cma.h
index 25103e67737c..a89af313a572 100644
--- a/include/trace/events/cma.h
+++ b/include/trace/events/cma.h
@@ -36,6 +36,65 @@ TRACE_EVENT(cma_release,
 		  __entry->count)
 );

+TRACE_EVENT(cma_alloc_range_start,
+
+	TP_PROTO(const char *name, unsigned long start, unsigned long count,
+		 unsigned tries),
+
+	TP_ARGS(name, start, count, tries),
+
+	TP_STRUCT__entry(
+		__string(name, name)
+		__field(unsigned long, start)
+		__field(unsigned long, count)
+		__field(unsigned, tries)
+	),
+
+	TP_fast_assign(
+		__assign_str(name, name);
+		__entry->start = start;
+		__entry->count = count;
+		__entry->tries = tries;
+	),
+
+	TP_printk("name=%s start=%lx count=%lu tries=%u",
+		  __get_str(name),
+		  __entry->start,
+		  __entry->count,
+		  __entry->tries)
+);
+
+TRACE_EVENT(cma_alloc_range_finish,
+
+	TP_PROTO(const char *name, unsigned long start, unsigned long count,
+		 unsigned attempts, int err),
+
+	TP_ARGS(name, start, count, attempts, err),
+
+	TP_STRUCT__entry(
+		__string(name, name)
+		__field(unsigned long, start)
+		__field(unsigned long, count)
+		__field(unsigned, attempts)
+		__field(int, err)
+	),
+
+	TP_fast_assign(
+		__assign_str(name, name);
+		__entry->start = start;
+		__entry->count = count;
+		__entry->attempts = attempts;
+		__entry->err = err;
+	),
+
+	TP_printk("name=%s start=%lx count=%lu attempts=%u err=%d",
+		  __get_str(name),
+		  __entry->start,
+		  __entry->count,
+		  __entry->attempts,
+		  __entry->err)
+);
+
 TRACE_EVENT(cma_alloc_start,

 	TP_PROTO(const char *name, unsigned long count, unsigned int align),
diff --git a/mm/cma.c b/mm/cma.c
index 543bb6b3be8e..4a0f68b9443b 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -416,6 +416,92 @@ static void cma_debug_show_areas(struct cma *cma)
 static inline void cma_debug_show_areas(struct cma *cma) { }
 #endif

+/**
+ * cma_alloc_range() - allocate pages in a specific range
+ * @cma: Contiguous memory region for which the allocation is performed.
+ * @start: Starting pfn of the allocation.
+ * @count: Requested number of pages
+ * @tries: Number of tries if the range is busy
+ * @gfp: GFP mask passed to alloc_contig_range()
+ *
+ * This function allocates part of contiguous memory from a specific contiguous
+ * memory area, from the specified starting address. The 'start' pfn and the
+ * 'count' number of pages must be aligned to the CMA bitmap order per bit.
+ */ +int cma_alloc_range(struct cma *cma, unsigned long start, unsigned long count, + unsigned tries, gfp_t gfp) +{ + unsigned long bitmap_maxno, bitmap_no, bitmap_start, bitmap_count; + unsigned long i = 0; + struct page *page; + int err = -EINVAL; + + if (!cma || !cma->count || !cma->bitmap) + goto out_stats; + + trace_cma_alloc_range_start(cma->name, start, count, tries); + + if (!count || start < cma->base_pfn || + start + count > cma->base_pfn + cma->count) + goto out_stats; + + if (!IS_ALIGNED(start | count, 1 << cma->order_per_bit)) + goto out_stats; + + bitmap_start = (start - cma->base_pfn) >> cma->order_per_bit; + bitmap_maxno = cma_bitmap_maxno(cma); + bitmap_count = cma_bitmap_pages_to_bits(cma, count); + + spin_lock_irq(&cma->lock); + bitmap_no = bitmap_find_next_zero_area(cma->bitmap, bitmap_maxno, + bitmap_start, bitmap_count, 0); + if (bitmap_no != bitmap_start) { + spin_unlock_irq(&cma->lock); + err = -EEXIST; + goto out_stats; + } + bitmap_set(cma->bitmap, bitmap_start, bitmap_count); + spin_unlock_irq(&cma->lock); + + for (i = 0; i < tries; i++) { + mutex_lock(&cma_mutex); + err = alloc_contig_range(start, start + count, MIGRATE_CMA, gfp); + mutex_unlock(&cma_mutex); + + if (err != -EBUSY) + break; + } + + if (err) { + cma_clear_bitmap(cma, start, count); + } else { + page = pfn_to_page(start); + + /* + * CMA can allocate multiple page blocks, which results in + * different blocks being marked with different tags. Reset the + * tags to ignore those page blocks. 
+		 */
+		for (i = 0; i < count; i++)
+			page_kasan_tag_reset(nth_page(page, i));
+	}
+
+out_stats:
+	trace_cma_alloc_range_finish(cma ? cma->name : "(null)", start, count, i, err);
+
+	if (err) {
+		count_vm_events(CMA_ALLOC_FAIL, count);
+		if (cma)
+			cma_sysfs_account_fail_pages(cma, count);
+	} else {
+		count_vm_events(CMA_ALLOC_SUCCESS, count);
+		cma_sysfs_account_success_pages(cma, count);
+	}
+
+	return err;
+}
+
+
 /**
  * cma_alloc() - allocate pages from contiguous area
  * @cma: Contiguous memory region for which the allocation is performed.

From patchwork Thu Jan 25 16:42:30 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexandru Elisei
X-Patchwork-Id: 13531390
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D6E75C47258 for ; Thu, 25 Jan 2024 16:45:15 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=sO8Gx1Fus67YtJs5sBUHE0GYPqaQqmUqw3KycaU923U=; b=q/lzkmPhVbCjo1 JIn/6KhIewryBEff3ywBu9I7L2sJPqsRTMzfOnuLVLBgfjC1uGREOJbtbDoy3nbVXJUs7fMCqeeq1 2FZsi64cJe7FJCnt+Jxk+2E4cdwcJOJKM5kySAEkgpjZ0r+DutmIeclUfiIBKdTuXdmSzsiBwUjGF gOLcaFHcW9NWZd0QAsX1Eh3WufhfSAcFctVesA6g+sjKrW/vg83k919FvFgZAByhr1xRfy4bTeOqj 425ZM0FMqTuOigNrvyUHidrgobGTj5mOUUXK4r0Ok3piyuIrj8pZtyaub2eKlcOE84EfMnbG8Tlyk H/y4yTVSwEwOgkzDzEfg==;
Received: from localhost
([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qb-00000000tJP-165G; Thu, 25 Jan 2024 16:45:01 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2pI-00000000sPC-26Jw for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:43:52 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7CBA5153B; Thu, 25 Jan 2024 08:44:23 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 8DF503F5A1; Thu, 25 Jan 2024 08:43:33 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 09/35] mm: cma: Introduce cma_remove_mem() Date: Thu, 25 Jan 2024 16:42:30 +0000 Message-Id: <20240125164256.4147-10-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 
0.8.0 (BSD) ) MR-646709E3
X-CRM114-CacheID: sfid-20240125_084341_330765_6029E648
X-CRM114-Status: GOOD ( 19.00 )
X-BeenThere: linux-arm-kernel@lists.infradead.org
X-Mailman-Version: 2.1.34
Precedence: list
Sender: "linux-arm-kernel"
Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

Memory is added to CMA with cma_declare_contiguous_nid() and
cma_init_reserved_mem(). This memory is then put on the MIGRATE_CMA list in
cma_init_reserved_areas(), where the page allocator can make use of it.

If a device manages multiple CMA areas, and there's an error when one of
the areas is added to CMA, there is no mechanism for the device to prevent
the rest of the areas, which were added before the error occurred, from
being later added to the MIGRATE_CMA list.

Add cma_remove_mem(), which removes a previously reserved CMA area so that
the page allocator cannot use it.

Signed-off-by: Alexandru Elisei
---

Changes since rfc v2:

* New patch.

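Not part of the patch: a small userspace model of the removal semantics
described above, offered as a sketch under assumptions. struct area_model,
model_remove() and model_activate_all() are hypothetical names; the only
behaviour taken from the diff below is that removal sets ->count to 0,
subtracts the pages from the running total, clears the caller's pointer,
and that activation later skips any area whose count is 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reserved CMA area. */
struct area_model {
	unsigned long count;	/* 0 means the area was removed */
	int activated;		/* set once the area is handed over */
};

/* Models totalcma_pages: pages reserved across all areas. */
static unsigned long model_total_pages;

/* Remove a reserved area before activation and clear the caller's handle. */
static void model_remove(struct area_model **res)
{
	struct area_model *area;

	if (!res || !*res)
		return;

	area = *res;
	if (!area->count)
		return;

	assert(model_total_pages >= area->count);
	model_total_pages -= area->count;
	area->count = 0;
	*res = NULL;
}

/* Activate every area that still has pages; return how many were activated. */
static int model_activate_all(struct area_model *areas, int nr)
{
	int i, activated = 0;

	for (i = 0; i < nr; i++) {
		/* Area was removed. */
		if (!areas[i].count)
			continue;
		areas[i].activated = 1;
		activated++;
	}
	return activated;
}
```

The slot in the array is not reused in this model; it is only skipped,
which matches the "Region was removed" check in the diff.
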
include/linux/cma.h | 1 + mm/cma.c | 30 +++++++++++++++++++++++++++++- 2 files changed, 30 insertions(+), 1 deletion(-) diff --git a/include/linux/cma.h b/include/linux/cma.h index e32559da6942..787cbec1702e 100644 --- a/include/linux/cma.h +++ b/include/linux/cma.h @@ -48,6 +48,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size, unsigned int order_per_bit, const char *name, struct cma **res_cma); +extern void cma_remove_mem(struct cma **res_cma); extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align, bool no_warn); extern int cma_alloc_range(struct cma *cma, unsigned long start, unsigned long count, diff --git a/mm/cma.c b/mm/cma.c index 4a0f68b9443b..2881bab12b01 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -147,8 +147,12 @@ static int __init cma_init_reserved_areas(void) { int i; - for (i = 0; i < cma_area_count; i++) + for (i = 0; i < cma_area_count; i++) { + /* Region was removed. */ + if (!cma_areas[i].count) + continue; cma_activate_area(&cma_areas[i]); + } return 0; } @@ -216,6 +220,30 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size, return 0; } +/** + * cma_remove_mem() - remove cma area + * @res_cma: Pointer to the cma region. + * + * This function removes a cma region created with cma_init_reserved_mem(). The + * ->count is set to 0. 
+ */ +void __init cma_remove_mem(struct cma **res_cma) +{ + struct cma *cma; + + if (WARN_ON_ONCE(!res_cma || !(*res_cma))) + return; + + cma = *res_cma; + if (WARN_ON_ONCE(!cma->count)) + return; + + totalcma_pages -= cma->count; + cma->count = 0; + + *res_cma = NULL; +} + /** * cma_declare_contiguous_nid() - reserve custom contiguous area * @base: Base address of the reserved area optional, use 0 for any From patchwork Thu Jan 25 16:42:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531400 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B2032C47422 for ; Thu, 25 Jan 2024 16:45:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=L7pWWoTR5amHZ+veK/lt0k4WhVReLcyA58mLz9xzTGU=; b=4owLIkN33PQbgq Kvz8auLCfvhX5mWLFWISx812zm7rmxJZBLLZU/pG0vjRFkwXsJGnpclLYZVepLQt/NDOQA0CqGJM1 Gb5v7Ybl5KiY+f/ab56BSTLPl6OwA6rGMW45i3S0+1NnpYDILH8X8j1zhBuKQSk7/xKb9PHxYShbN 823Ixegq+eMxRNdmeEieYm/wldGUsJ3Pf2fmX7+MH3hqupRNx1PSt/PT+Pl28phMfu8ZA+/HPokx/ ijbqB7siTLiyAgCnhCoIf5R/dWkOCjqw6+cGRDbnQsdvJk7aEP79b5ioGnw8r6xLr1yZL+DpYRWd/ E7kS3j1Lo7dI0F7o/ZOQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 
1rT2r3-00000000tmj-1U6i; Thu, 25 Jan 2024 16:45:29 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2q5-00000000t0G-3Zqw for linux-arm-kernel@bombadil.infradead.org; Thu, 25 Jan 2024 16:44:30 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Transfer-Encoding:MIME-Version :References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=BrAYirD1C9ZlW+xHPq+DxNqIyz5EwyKAGNdhCWfnghE=; b=FtiofUXOUXAxL/7RWPpw+CZNhR b2xGoqoP5k6FimZw5/ZxW7HjseZl1phsglnh9h1FhuacsYWIKnew3Q6AqpDHVOmquqqX3bYWxofFc 8CGVjZ0aQI3s9XNtCz7k+ahjP2ERcjzoz20nDR7DFBYQ7nYkpJk6TRBuzGsVkkoLz7ptz4qyhaPmh VBjEG4gAf9VPumXRi0G1A/MojhAeiXmxrsR5R99SlMs6WzVjuSGHh6O0jk3DfGAuJZalUjifp1luS r9xd5kSEhD3qDNj00LrzLPUmyQlukPN3tqNTLnTgl8RgnrNTc3fJxyy6G3M+DbDnUokabMAMWEXwN ICpzlvGg==; Received: from foss.arm.com ([217.140.110.172]) by desiato.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2pa-00000005UTR-1fcb for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:44:20 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4954D152B; Thu, 25 Jan 2024 08:44:29 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 5543F3F8A4; Thu, 25 Jan 2024 08:43:39 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, 
mhiramat@kernel.org, rppt@kernel.org, hughd@google.com
Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org
Subject: [PATCH RFC v3 10/35] mm: cma: Fast track allocating memory when the pages are free
Date: Thu, 25 Jan 2024 16:42:31 +0000
Message-Id: <20240125164256.4147-11-alexandru.elisei@arm.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com>
References: <20240125164256.4147-1-alexandru.elisei@arm.com>
MIME-Version: 1.0
X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3
X-CRM114-CacheID: sfid-20240125_164404_414020_DAC7A8EC
X-CRM114-Status: GOOD ( 22.30 )
X-BeenThere: linux-arm-kernel@lists.infradead.org
X-Mailman-Version: 2.1.34
Precedence: list
Sender: "linux-arm-kernel"
Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

If the pages to be allocated are free, take them directly off the buddy
allocator instead of going through alloc_contig_range(), which avoids
costly calls to lru_cache_disable().

Only allocations whose size matches the CMA region's bitmap granularity
(1 << order_per_bit pages) are considered, to avoid taking the zone
spinlock for too long.

Signed-off-by: Alexandru Elisei
---

Changes since rfc v2:

* New patch. Reworked from the rfc v2 patch #26 ("arm64: mte: Fast track
reserving tag storage when the block is free") (David Hildenbrand).

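Not part of the patch: a userspace sketch of the all-or-nothing behaviour
the fast path is meant to have, under stated assumptions. The free-page
array and the model_*() helpers are hypothetical stand-ins for the buddy
allocator, take_page_off_buddy() and put_page_back_buddy(). The model only
shows the size check, the per-page take, and the rollback when a page in
the range turns out to be busy.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_NR_PAGES 16

/* true means the page is free in the (modelled) buddy allocator. */
static bool model_page_free[MODEL_NR_PAGES];

static void model_set_all_free(void)
{
	unsigned long i;

	for (i = 0; i < MODEL_NR_PAGES; i++)
		model_page_free[i] = true;
}

/* Take one page if it is free; report whether that worked. */
static bool model_take_page(unsigned long pfn)
{
	if (!model_page_free[pfn])
		return false;
	model_page_free[pfn] = false;
	return true;
}

static void model_put_page(unsigned long pfn)
{
	model_page_free[pfn] = true;
}

/* Take every page in [start, end), or none of them. */
static int model_fastpath(unsigned long start, unsigned long end,
			  unsigned int order_per_bit)
{
	unsigned long i, j;

	assert(start < end && end <= MODEL_NR_PAGES);

	/* Only ranges of exactly one bitmap granule are fast-tracked. */
	if (end - start != 1UL << order_per_bit)
		return -EINVAL;

	for (i = start; i < end; i++)
		if (!model_take_page(i))
			break;

	/* The loop ran to completion only if every page was taken. */
	if (i == end)
		return 0;

	/* Roll back the pages taken before the busy one. */
	for (j = start; j < i; j++)
		model_put_page(j);

	return -EBUSY;
}
```

A caller would fall back to the slow path on any non-zero return, as the
diff does with alloc_contig_range().
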
include/linux/page-flags.h | 15 ++++++++++++-- mm/Kconfig | 5 +++++ mm/cma.c | 42 ++++++++++++++++++++++++++++++++++---- mm/memory-failure.c | 8 ++++---- mm/page_alloc.c | 23 ++++++++++++--------- 5 files changed, 73 insertions(+), 20 deletions(-) diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 735cddc13d20..b7237bce7446 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -575,11 +575,22 @@ TESTSCFLAG(HWPoison, hwpoison, PF_ANY) #define MAGIC_HWPOISON 0x48575053U /* HWPS */ extern void SetPageHWPoisonTakenOff(struct page *page); extern void ClearPageHWPoisonTakenOff(struct page *page); -extern bool take_page_off_buddy(struct page *page); -extern bool put_page_back_buddy(struct page *page); +extern bool PageHWPoisonTakenOff(struct page *page); #else PAGEFLAG_FALSE(HWPoison, hwpoison) +TESTSCFLAG_FALSE(HWPoison, hwpoison) #define __PG_HWPOISON 0 +static inline void SetPageHWPoisonTakenOff(struct page *page) { } +static inline void ClearPageHWPoisonTakenOff(struct page *page) { } +static inline bool PageHWPoisonTakenOff(struct page *page) +{ + return false; +} +#endif + +#ifdef CONFIG_WANTS_TAKE_PAGE_OFF_BUDDY +extern bool take_page_off_buddy(struct page *page, bool poison); +extern bool put_page_back_buddy(struct page *page, bool unpoison); #endif #if defined(CONFIG_PAGE_IDLE_FLAG) && defined(CONFIG_64BIT) diff --git a/mm/Kconfig b/mm/Kconfig index ffc3a2ba3a8c..341cf53898db 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -745,12 +745,16 @@ config DEFAULT_MMAP_MIN_ADDR config ARCH_SUPPORTS_MEMORY_FAILURE bool +config WANTS_TAKE_PAGE_OFF_BUDDY + bool + config MEMORY_FAILURE depends on MMU depends on ARCH_SUPPORTS_MEMORY_FAILURE bool "Enable recovery from hardware memory errors" select MEMORY_ISOLATION select RAS + select WANTS_TAKE_PAGE_OFF_BUDDY help Enables code to recover from some memory failures on systems with MCA recovery. 
This allows a system to continue running
@@ -891,6 +895,7 @@ config CMA
 	depends on MMU
 	select MIGRATION
 	select MEMORY_ISOLATION
+	select WANTS_TAKE_PAGE_OFF_BUDDY
 	help
 	  This enables the Contiguous Memory Allocator which allows other
 	  subsystems to allocate big physically-contiguous blocks of memory.
diff --git a/mm/cma.c b/mm/cma.c
index 2881bab12b01..15663f95d77b 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -444,6 +444,34 @@ static void cma_debug_show_areas(struct cma *cma)
 static inline void cma_debug_show_areas(struct cma *cma) { }
 #endif

+/* Called with the cma mutex held. */
+static int cma_alloc_pages_fastpath(struct cma *cma, unsigned long start,
+				    unsigned long end)
+{
+	bool success = false;
+	unsigned long i, j;
+
+	/* Avoid contention on the zone lock. */
+	if (end - start != 1UL << cma->order_per_bit)
+		return -EINVAL;
+
+	for (i = start; i < end; i++) {
+		if (!is_free_buddy_page(pfn_to_page(i)))
+			break;
+		success = take_page_off_buddy(pfn_to_page(i), false);
+		if (!success)
+			break;
+	}
+
+	if (i == end)
+		return 0;
+
+	for (j = start; j < i; j++)
+		put_page_back_buddy(pfn_to_page(j), false);
+
+	return -EBUSY;
+}
+
 /**
  * cma_alloc_range() - allocate pages in a specific range
  * @cma: Contiguous memory region for which the allocation is performed.
@@ -493,7 +521,11 @@ int cma_alloc_range(struct cma *cma, unsigned long start, unsigned long count,

 	for (i = 0; i < tries; i++) {
 		mutex_lock(&cma_mutex);
-		err = alloc_contig_range(start, start + count, MIGRATE_CMA, gfp);
+		err = cma_alloc_pages_fastpath(cma, start, start + count);
+		if (err) {
+			err = alloc_contig_range(start, start + count,
+						 MIGRATE_CMA, gfp);
+		}
 		mutex_unlock(&cma_mutex);

 		if (err != -EBUSY)
@@ -529,7 +561,6 @@ int cma_alloc_range(struct cma *cma, unsigned long start, unsigned long count,
 	return err;
 }

-
 /**
  * cma_alloc() - allocate pages from contiguous area
  * @cma: Contiguous memory region for which the allocation is performed.
@@ -589,8 +620,11 @@ struct page *cma_alloc(struct cma *cma, unsigned long count, pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit); mutex_lock(&cma_mutex); - ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, - GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0)); + ret = cma_alloc_pages_fastpath(cma, pfn, pfn + count); + if (ret) { + ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, + GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0)); + } mutex_unlock(&cma_mutex); if (ret == 0) { page = pfn_to_page(pfn); diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 4f9b61f4a668..b87b533a9871 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -157,7 +157,7 @@ static int __page_handle_poison(struct page *page) zone_pcp_disable(page_zone(page)); ret = dissolve_free_huge_page(page); if (!ret) - ret = take_page_off_buddy(page); + ret = take_page_off_buddy(page, true); zone_pcp_enable(page_zone(page)); return ret; @@ -1353,7 +1353,7 @@ static int page_action(struct page_state *ps, struct page *p, return action_result(pfn, ps->type, result); } -static inline bool PageHWPoisonTakenOff(struct page *page) +bool PageHWPoisonTakenOff(struct page *page) { return PageHWPoison(page) && page_private(page) == MAGIC_HWPOISON; } @@ -2247,7 +2247,7 @@ int memory_failure(unsigned long pfn, int flags) res = get_hwpoison_page(p, flags); if (!res) { if (is_free_buddy_page(p)) { - if (take_page_off_buddy(p)) { + if (take_page_off_buddy(p, true)) { page_ref_inc(p); res = MF_RECOVERED; } else { @@ -2578,7 +2578,7 @@ int unpoison_memory(unsigned long pfn) ret = folio_test_clear_hwpoison(folio) ? 0 : -EBUSY; } else if (ghp < 0) { if (ghp == -EHWPOISON) { - ret = put_page_back_buddy(p) ? 0 : -EBUSY; + ret = put_page_back_buddy(p, true) ? 
0 : -EBUSY; } else { ret = ghp; unpoison_pr_info("Unpoison: failed to grab page %#lx\n", diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 0fa34bcfb1af..502ee3eb8583 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6655,7 +6655,7 @@ bool is_free_buddy_page(struct page *page) } EXPORT_SYMBOL(is_free_buddy_page); -#ifdef CONFIG_MEMORY_FAILURE +#ifdef CONFIG_WANTS_TAKE_PAGE_OFF_BUDDY /* * Break down a higher-order page in sub-pages, and keep our target out of * buddy allocator. @@ -6687,9 +6687,9 @@ static void break_down_buddy_pages(struct zone *zone, struct page *page, } /* - * Take a page that will be marked as poisoned off the buddy allocator. + * Take a page off the buddy allocator, and optionally mark it as poisoned. */ -bool take_page_off_buddy(struct page *page) +bool take_page_off_buddy(struct page *page, bool poison) { struct zone *zone = page_zone(page); unsigned long pfn = page_to_pfn(page); @@ -6710,7 +6710,8 @@ bool take_page_off_buddy(struct page *page) del_page_from_free_list(page_head, zone, page_order); break_down_buddy_pages(zone, page_head, page, 0, page_order, migratetype); - SetPageHWPoisonTakenOff(page); + if (poison) + SetPageHWPoisonTakenOff(page); if (!is_migrate_isolate(migratetype)) __mod_zone_freepage_state(zone, -1, migratetype); ret = true; @@ -6724,9 +6725,10 @@ bool take_page_off_buddy(struct page *page) } /* - * Cancel takeoff done by take_page_off_buddy(). + * Cancel takeoff done by take_page_off_buddy(), and optionally unpoison the + * page. 
*/ -bool put_page_back_buddy(struct page *page) +bool put_page_back_buddy(struct page *page, bool unpoison) { struct zone *zone = page_zone(page); unsigned long pfn = page_to_pfn(page); @@ -6736,17 +6738,18 @@ bool put_page_back_buddy(struct page *page) spin_lock_irqsave(&zone->lock, flags); if (put_page_testzero(page)) { - ClearPageHWPoisonTakenOff(page); + VM_WARN_ON_ONCE(PageHWPoisonTakenOff(page) && !unpoison); + if (unpoison) + ClearPageHWPoisonTakenOff(page); __free_one_page(page, pfn, zone, 0, migratetype, FPI_NONE); - if (TestClearPageHWPoison(page)) { + if (!unpoison || (unpoison && TestClearPageHWPoison(page))) ret = true; - } } spin_unlock_irqrestore(&zone->lock, flags); return ret; } -#endif +#endif /* CONFIG_WANTS_TAKE_PAGE_OFF_BUDDY */ #ifdef CONFIG_ZONE_DMA bool has_managed_dma(void) From patchwork Thu Jan 25 16:42:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531394 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9BC22C4828A for ; Thu, 25 Jan 2024 16:45:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=I1SDkHOEXhhIyFBU3BBnhEeZXrCp+hSWM7USj3WJXCc=; b=NJcrew08esW94B VlF3Yohd9ZpohVyHRksgdRUYzSzt2AIo32I/YHqTqlhVsmfq7MmLHh7w/trQ1avyw/TXUhflhWVjY 
NDDmNpTgx89recOiATKc7lVkVes+BhUsEuA9r8Qw9Z3SCego9uLj/BH7AQ3Unk6RAPvAUv1e9BlMJ I/nUr9OwWqu9jk9h2z9PcFkYqLBw6EJVi3+nOMc6TUDFXQ2h0FivRxeLMxVIOGWOANBTCx6YtmYq4 NXWFO/kugG+rT10xylzqxhBVk9VH6DzvBnw9enrNypWHyD3AZ4E2W+CD4Ez4M3lyGsBCDxRvMrGzT p/itLIkVaXhl+MhUmjKw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qc-00000000tKs-3Tgy; Thu, 25 Jan 2024 16:45:02 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2pT-00000000sYN-3KFK for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:43:58 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 113601570; Thu, 25 Jan 2024 08:44:35 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 240823F5A1; Thu, 25 Jan 2024 08:43:45 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 11/35] mm: Allow an arch to hook into folio allocation when VMA is known Date: Thu, 
25 Jan 2024 16:42:32 +0000 Message-Id: <20240125164256.4147-12-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084352_227373_1495C6EC X-CRM114-Status: GOOD ( 22.43 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

arm64 uses VM_HIGH_ARCH_0 and VM_HIGH_ARCH_1 for enabling MTE for a VMA. When VM_HIGH_ARCH_0, which arm64 renames to VM_MTE, is set for a VMA, and the gfp flag __GFP_ZERO is present, the __GFP_ZEROTAGS gfp flag also gets set in vma_alloc_zeroed_movable_folio().

Expand this to be more generic by adding an arch hook that modifies the gfp flags for an allocation when the VMA is known. Note that __GFP_ZEROTAGS is ignored by the page allocator unless __GFP_ZERO is also set; from that point of view, the current behaviour is unchanged, even though the arm64 flag is set in more places.

When arm64 gains support for reusing the tag storage for data allocations, the use of the __GFP_ZEROTAGS flag will be expanded to instruct the page allocator to try to reserve the corresponding tag storage for the pages being allocated.

The flags returned by arch_calc_vma_gfp() are or'ed with the flags set by the caller. This keeps an architecture from modifying the flags already set by the core memory management code, and is similar to how do_mmap() -> calc_vm_flag_bits() -> arch_calc_vm_flag_bits() has been implemented. This can be revisited in the future if there's a need to do so.
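For illustration only (not part of the patch): a minimal userspace sketch of the or'ing rule described above. Everything prefixed with model_ or MODEL_ is an assumption made for the sketch, including the flag values and the struct layout; none of it is a kernel definition.

```c
/* Userspace model only: types, flag values and struct layout are placeholders. */
#include <assert.h>
#include <stdint.h>

typedef uint32_t gfp_t;

#define MODEL_GFP_ZERO      ((gfp_t)1u << 0)  /* stands in for __GFP_ZERO */
#define MODEL_GFP_ZEROTAGS  ((gfp_t)1u << 1)  /* stands in for __GFP_ZEROTAGS */
#define MODEL_GFP_MOVABLE   ((gfp_t)1u << 2)  /* stands in for GFP_HIGHUSER_MOVABLE */
#define MODEL_VM_MTE        (1ul << 0)        /* stands in for VM_MTE */

struct model_vma {
	unsigned long vm_flags;
};

/* Arch hook: returns only the extra flags it wants added. */
static gfp_t model_arch_calc_vma_gfp(const struct model_vma *vma, gfp_t gfp)
{
	(void)gfp; /* visible to the hook, unused in this model */
	if (vma->vm_flags & MODEL_VM_MTE)
		return MODEL_GFP_ZEROTAGS;
	return 0;
}

/* Caller side: the hook's result is or'ed in, so caller flags are never cleared. */
static gfp_t model_effective_gfp(const struct model_vma *vma, gfp_t gfp)
{
	return gfp | model_arch_calc_vma_gfp(vma, gfp);
}

/* Usage: an MTE VMA gains the extra flag, a plain VMA is left unchanged. */
static void model_gfp_selfcheck(void)
{
	struct model_vma mte = { .vm_flags = MODEL_VM_MTE };
	struct model_vma plain = { .vm_flags = 0 };
	gfp_t base = MODEL_GFP_MOVABLE | MODEL_GFP_ZERO;

	assert(model_effective_gfp(&mte, base) == (base | MODEL_GFP_ZEROTAGS));
	assert(model_effective_gfp(&plain, base) == base);
}
```

Because the result is only or'ed in, a hook written this way can add flags but cannot remove any that the caller set.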
Signed-off-by: Alexandru Elisei --- arch/arm64/include/asm/page.h | 5 ++--- arch/arm64/include/asm/pgtable.h | 3 +++ arch/arm64/mm/fault.c | 19 ++++++------------- include/linux/pgtable.h | 7 +++++++ mm/mempolicy.c | 1 + mm/shmem.c | 5 ++++- 6 files changed, 23 insertions(+), 17 deletions(-) diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h index 2312e6ee595f..88bab032a493 100644 --- a/arch/arm64/include/asm/page.h +++ b/arch/arm64/include/asm/page.h @@ -29,9 +29,8 @@ void copy_user_highpage(struct page *to, struct page *from, void copy_highpage(struct page *to, struct page *from); #define __HAVE_ARCH_COPY_HIGHPAGE -struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, - unsigned long vaddr); -#define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio +#define vma_alloc_zeroed_movable_folio(vma, vaddr) \ + vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false) void tag_clear_highpage(struct page *to); #define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 79ce70fbb751..08f0904dbfc2 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1071,6 +1071,9 @@ static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio) #endif /* CONFIG_ARM64_MTE */ +#define __HAVE_ARCH_CALC_VMA_GFP +gfp_t arch_calc_vma_gfp(struct vm_area_struct *vma, gfp_t gfp); + /* * On AArch64, the cache coherency is handled via the set_pte_at() function. */ diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 55f6455a8284..4d3f0a870ad8 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -937,22 +937,15 @@ void do_debug_exception(unsigned long addr_if_watchpoint, unsigned long esr, NOKPROBE_SYMBOL(do_debug_exception); /* - * Used during anonymous page fault handling. 
+ * If this is called during anonymous page fault handling, and the page is + * mapped with PROT_MTE, initialise the tags at the point of tag zeroing as this + * is usually faster than separate DC ZVA and STGM. */ -struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, - unsigned long vaddr) +gfp_t arch_calc_vma_gfp(struct vm_area_struct *vma, gfp_t gfp) { - gfp_t flags = GFP_HIGHUSER_MOVABLE | __GFP_ZERO; - - /* - * If the page is mapped with PROT_MTE, initialise the tags at the - * point of allocation and page zeroing as this is usually faster than - * separate DC ZVA and STGM. - */ if (vma->vm_flags & VM_MTE) - flags |= __GFP_ZEROTAGS; - - return vma_alloc_folio(flags, 0, vma, vaddr, false); + return __GFP_ZEROTAGS; + return 0; } void tag_clear_highpage(struct page *page) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index c5ddec6b5305..98f81ca08cbe 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -901,6 +901,13 @@ static inline void arch_do_swap_page(struct mm_struct *mm, } #endif +#ifndef __HAVE_ARCH_CALC_VMA_GFP +static inline gfp_t arch_calc_vma_gfp(struct vm_area_struct *vma, gfp_t gfp) +{ + return 0; +} +#endif + #ifndef __HAVE_ARCH_FREE_PAGES_PREPARE static inline void arch_free_pages_prepare(struct page *page, int order) { } #endif diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 10a590ee1c89..f7ef52760b32 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -2168,6 +2168,7 @@ struct folio *vma_alloc_folio(gfp_t gfp, int order, struct vm_area_struct *vma, pgoff_t ilx; struct page *page; + gfp |= arch_calc_vma_gfp(vma, gfp); pol = get_vma_policy(vma, addr, order, &ilx); page = alloc_pages_mpol(gfp | __GFP_COMP, order, pol, ilx, numa_node_id()); diff --git a/mm/shmem.c b/mm/shmem.c index d7c84ff62186..14427e9982f9 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1585,7 +1585,7 @@ static struct folio *shmem_swapin_cluster(swp_entry_t swap, gfp_t gfp, */ static gfp_t limit_gfp_mask(gfp_t huge_gfp, gfp_t 
limit_gfp) { - gfp_t allowflags = __GFP_IO | __GFP_FS | __GFP_RECLAIM; + gfp_t allowflags = __GFP_IO | __GFP_FS | __GFP_RECLAIM | __GFP_ZEROTAGS; gfp_t denyflags = __GFP_NOWARN | __GFP_NORETRY; gfp_t zoneflags = limit_gfp & GFP_ZONEMASK; gfp_t result = huge_gfp & ~(allowflags | GFP_ZONEMASK); @@ -2038,6 +2038,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, gfp_t huge_gfp; huge_gfp = vma_thp_gfp_mask(vma); + huge_gfp |= arch_calc_vma_gfp(vma, huge_gfp); huge_gfp = limit_gfp_mask(huge_gfp, gfp); folio = shmem_alloc_and_add_folio(huge_gfp, inode, index, fault_mm, true); @@ -2214,6 +2215,8 @@ static vm_fault_t shmem_fault(struct vm_fault *vmf) vm_fault_t ret = 0; int err; + gfp |= arch_calc_vma_gfp(vmf->vma, gfp); + /* * Trinity finds that probing a hole which tmpfs is punching can * prevent the hole-punch from ever completing: noted in i_private. From patchwork Thu Jan 25 16:42:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531391 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 58C77C48286 for ; Thu, 25 Jan 2024 16:45:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=C5T1RJ9Ln+g0eZoKt7ykBMveoJwzuz6h96p9u9TAUH8=; b=NRxLUuw847iPgn 
0WTwwMf3TdprbFQrt2PmbnAzm5kyoFxixXClXeYd4MvlzA4/OWfmIT6NtCVJj/rPa78VdC1NUG07J bP0nZzAdwel75ITSMbQdkcVNwf2jgNGLxNlTOlpz01VwJmJj8799nTwAp4CJe6wWoRAB6Km+ACVV3 VKdRVeYqk3fw7pCUrjtQlmmS0SX34Y/8vhREfxx2gonL6mPHDSFWtl2BT8ALRXD97/y4T8HXfisrP uyqbn9LeA6bZVzTi64iZCCdER3V0Wr5VesxHlNZl8bCcxUdyPtKNgzzWvzREB74369jysH6VFw25K Iu7az/pTVDLW4U16wz1Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qe-00000000tLp-00l1; Thu, 25 Jan 2024 16:45:04 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2pZ-00000000sch-3WD3 for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:44:07 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id CDDEF1576; Thu, 25 Jan 2024 08:44:40 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id DE15E3F5A1; Thu, 25 Jan 2024 08:43:50 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 12/35] 
mm: Call arch_swap_prepare_to_restore() before arch_swap_restore() Date: Thu, 25 Jan 2024 16:42:33 +0000 Message-Id: <20240125164256.4147-13-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084358_265167_7DC4BC72 X-CRM114-Status: GOOD ( 13.47 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

arm64 uses arch_swap_restore() to restore saved tags before the page is swapped in, and it is called in atomic context (with the ptl held).

Introduce arch_swap_prepare_to_restore(), which allows an architecture to perform extra work during swap-in, outside of the critical section. arm64 will use this to allocate a buffer in memory in which to temporarily save tags if tag storage is not available for the page being swapped in.
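For illustration only (not part of the patch): a minimal userspace sketch of the ordering this hook is meant to allow. The work that may allocate runs first, and the restore step, which models the critical section, only uses what was prepared. All model_ names, the buffer size and the int return convention are assumptions made for the sketch.

```c
/* Userspace model only: no real locking, names and sizes are placeholders. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_TAG_BYTES 128 /* placeholder size for the saved tags */

struct model_folio {
	bool has_tag_storage;   /* is tag storage already reserved? */
	unsigned char *tag_buf; /* temporary buffer, allocated in prepare */
	bool tags_restored;
};

/* May allocate, so it must run before the lock is taken. Returns 0 on success. */
static int model_swap_prepare_to_restore(struct model_folio *folio)
{
	if (folio->has_tag_storage)
		return 0;
	folio->tag_buf = malloc(MODEL_TAG_BYTES);
	return folio->tag_buf ? 0 : -1;
}

/* Models the critical section: no allocation, only uses what was prepared. */
static void model_swap_restore(struct model_folio *folio,
			       const unsigned char *saved)
{
	if (!folio->has_tag_storage)
		memcpy(folio->tag_buf, saved, MODEL_TAG_BYTES);
	folio->tags_restored = true;
}

static int model_swap_in(struct model_folio *folio, const unsigned char *saved)
{
	int ret = model_swap_prepare_to_restore(folio);

	if (ret)
		return ret; /* back out before the lock would be taken */
	/* the page table lock would be taken here */
	model_swap_restore(folio, saved);
	/* and released here */
	return 0;
}

/* Usage: a folio without tag storage gets a temporary buffer holding the tags. */
static void model_swap_in_selfcheck(void)
{
	unsigned char saved[MODEL_TAG_BYTES];
	struct model_folio folio = { .has_tag_storage = false };

	memset(saved, 0xAB, sizeof(saved));
	assert(model_swap_in(&folio, saved) == 0);
	assert(folio.tags_restored && folio.tag_buf[0] == 0xAB);
	free(folio.tag_buf);
}
```

If the prepare step fails, the model returns before the restore step runs, which mirrors the early exits added at the call sites in this patch.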
Signed-off-by: Alexandru Elisei --- include/linux/pgtable.h | 7 +++++++ mm/memory.c | 4 ++++ mm/shmem.c | 9 +++++++++ mm/swapfile.c | 5 +++++ 4 files changed, 25 insertions(+) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 98f81ca08cbe..2d0f04042f62 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -959,6 +959,13 @@ static inline void arch_swap_invalidate_area(int type) } #endif +#ifndef __HAVE_ARCH_SWAP_PREPARE_TO_RESTORE +static inline vm_fault_t arch_swap_prepare_to_restore(swp_entry_t entry, struct folio *folio) +{ + return 0; +} +#endif + #ifndef __HAVE_ARCH_SWAP_RESTORE static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio) { diff --git a/mm/memory.c b/mm/memory.c index 7e1f4849463a..8a421e168b57 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3975,6 +3975,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) folio_throttle_swaprate(folio, GFP_KERNEL); + ret = arch_swap_prepare_to_restore(entry, folio); + if (ret) + goto out_page; + /* * Back out if somebody else already faulted in this pte. */ diff --git a/mm/shmem.c b/mm/shmem.c index 14427e9982f9..621fabc3b8c6 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1855,6 +1855,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index, struct swap_info_struct *si; struct folio *folio = NULL; swp_entry_t swap; + vm_fault_t ret; int error; VM_BUG_ON(!*foliop || !xa_is_value(*foliop)); @@ -1903,6 +1904,14 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index, } folio_wait_writeback(folio); + ret = arch_swap_prepare_to_restore(swap, folio); + if (ret) { + if (fault_type) + *fault_type = ret; + error = -EINVAL; + goto unlock; + } + /* * Some architectures may have to restore extra metadata to the * folio after reading from swap. 
diff --git a/mm/swapfile.c b/mm/swapfile.c index 556ff7347d5f..49425598f778 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1785,6 +1785,11 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd, goto setpte; } + if (arch_swap_prepare_to_restore(entry, folio)) { + ret = -EINVAL; + goto out; + } + /* * Some architectures may have to restore extra metadata to the page * when reading from swap. This metadata may be indexed by swap entry From patchwork Thu Jan 25 16:42:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531395 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D40BEC47258 for ; Thu, 25 Jan 2024 16:45:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=pWS3mRdXgraXbYXoqq4rwtBjrVykeCBwiYl+Cng3bR8=; b=VJDz2jLxP44bDF xPe+MHUux+ltB7tJ26GAswCf82yitYb+q2e5Zk6heeYPOlIKTMPoAoXZtiS3kTEFp5e/4KK/uOON3 gvbriEFOwTkOsQPuP/xN4RWaaCpHULmkUlaJXYXkMxaIZOuBGnOSwyMS4qu1sudp9/66KtOXoujE1 bQfgSxluoOLVXXSpgoSL+mBS9lY/3sr12kHxHpGLelRXk0Mk065grwWa1fyoLwOMJvqfuwcUp4q8x +anJsQaQCUNeNZ+ZU1uwjvKkDxUXyIOhK65CoVnJYtSNZSZM7mS6OYBZ1Nq4Tf383CZ90m9R2eac8 jIo2zuUFx6+VNZPPlblQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 
4.97.1 #2 (Red Hat Linux)) id 1rT2qg-00000000tNY-1BRI; Thu, 25 Jan 2024 16:45:06 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2pf-00000000sh6-2pnS for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:44:10 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B7E751596; Thu, 25 Jan 2024 08:44:46 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id A76463F5A1; Thu, 25 Jan 2024 08:43:56 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 13/35] mm: memory: Introduce fault-on-access mechanism for pages Date: Thu, 25 Jan 2024 16:42:34 +0000 Message-Id: <20240125164256.4147-14-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: 
sfid-20240125_084404_025357_672845CE X-CRM114-Status: GOOD ( 27.77 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

Introduce a mechanism that allows an architecture to trigger a page fault, and add the infrastructure to handle that fault accordingly. To make use of this, an arch is expected to mark the table entry as PAGE_NONE (which will cause a fault the next time it is accessed) and to implement an arch-specific method (like a software bit) for recognizing that the fault needs to be handled by the arch code.

arm64 will use this approach to reserve tag storage for pages which are mapped in an MTE-enabled VMA, but for which the storage needed to store tags isn't reserved (for example, because of an mprotect(PROT_MTE) call on a VMA with existing pages).

Signed-off-by: Alexandru Elisei
---

Changes since rfc v2:

* New patch. Split from patch #19 ("mm: mprotect: Introduce PAGE_FAULT_ON_ACCESS for mprotect(PROT_MTE)") (David Hildenbrand).
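For illustration only (not part of the patch): a minimal userspace sketch of how a PAGE_NONE-style entry combined with a software bit lets the generic handler tell an arch fault-on-access apart from a NUMA hinting fault. The bit positions, the model_ names and the enum are assumptions made for the sketch.

```c
/* Userspace model only: bit positions and names are placeholders. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t model_pte_t;

#define MODEL_PTE_PROTNONE (1ull << 0) /* entry faults on the next access */
#define MODEL_PTE_SW_FOA   (1ull << 1) /* hypothetical arch software bit */

enum model_handler {
	MODEL_NOT_PROTNONE,         /* not handled by the protnone path */
	MODEL_NUMA_HINT,            /* regular NUMA hinting fault */
	MODEL_ARCH_FAULT_ON_ACCESS, /* handed over to the arch code */
};

static bool model_pte_protnone(model_pte_t pte)
{
	return (pte & MODEL_PTE_PROTNONE) != 0;
}

/* The arch recognises its own faults by the software bit on a protnone entry. */
static bool model_arch_fault_on_access_pte(model_pte_t pte)
{
	model_pte_t mask = MODEL_PTE_PROTNONE | MODEL_PTE_SW_FOA;

	return (pte & mask) == mask;
}

/* Dispatch order modelled on handle_pte_protnone(): arch check comes first. */
static enum model_handler model_handle_pte_protnone(model_pte_t pte)
{
	if (!model_pte_protnone(pte))
		return MODEL_NOT_PROTNONE;
	if (model_arch_fault_on_access_pte(pte))
		return MODEL_ARCH_FAULT_ON_ACCESS;
	return MODEL_NUMA_HINT;
}

/* Usage: the software bit alone decides which handler a protnone entry gets. */
static void model_protnone_selfcheck(void)
{
	assert(model_handle_pte_protnone(MODEL_PTE_PROTNONE) == MODEL_NUMA_HINT);
	assert(model_handle_pte_protnone(MODEL_PTE_PROTNONE | MODEL_PTE_SW_FOA) ==
	       MODEL_ARCH_FAULT_ON_ACCESS);
}
```

In this model the software bit has no effect unless the entry is also marked protnone, since only protnone entries reach the handler.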
include/linux/huge_mm.h | 4 ++-- include/linux/pgtable.h | 47 +++++++++++++++++++++++++++++++++++-- mm/Kconfig | 3 +++ mm/huge_memory.c | 36 +++++++++++++++++++++-------- mm/memory.c | 51 ++++++++++++++++++++++++++--------------- 5 files changed, 109 insertions(+), 32 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 5adb86af35fc..4678a0a5e6a8 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -346,7 +346,7 @@ struct page *follow_devmap_pmd(struct vm_area_struct *vma, unsigned long addr, struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr, pud_t *pud, int flags, struct dev_pagemap **pgmap); -vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf); +vm_fault_t handle_huge_pmd_protnone(struct vm_fault *vmf); extern struct page *huge_zero_page; extern unsigned long huge_zero_pfn; @@ -476,7 +476,7 @@ static inline spinlock_t *pud_trans_huge_lock(pud_t *pud, return NULL; } -static inline vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf) +static inline vm_fault_t handle_huge_pmd_protnone(struct vm_fault *vmf) { return 0; } diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 2d0f04042f62..81a21be855a2 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1455,7 +1455,7 @@ static inline int pud_trans_unstable(pud_t *pud) return 0; } -#ifndef CONFIG_NUMA_BALANCING +#if !defined(CONFIG_NUMA_BALANCING) && !defined(CONFIG_ARCH_HAS_FAULT_ON_ACCESS) /* * In an inaccessible (PROT_NONE) VMA, pte_protnone() may indicate "yes". 
It is * perfectly valid to indicate "no" in that case, which is why our default @@ -1477,7 +1477,50 @@ static inline int pmd_protnone(pmd_t pmd) { return 0; } -#endif /* CONFIG_NUMA_BALANCING */ +#endif /* !CONFIG_NUMA_BALANCING && !CONFIG_ARCH_HAS_FAULT_ON_ACCESS */ + +#ifndef CONFIG_ARCH_HAS_FAULT_ON_ACCESS +static inline bool arch_fault_on_access_pte(pte_t pte) +{ + return false; +} + +static inline bool arch_fault_on_access_pmd(pmd_t pmd) +{ + return false; +} + +/* + * The function is called with the fault lock held and an elevated reference on + * the folio. + * + * Rules that an arch implementation of the function must follow: + * + * 1. The function must return with the elevated reference dropped. + * + * 2. If the return value contains VM_FAULT_RETRY or VM_FAULT_COMPLETED then: + * + * - if FAULT_FLAG_RETRY_NOWAIT is not set, the function must return with the + * correct fault lock released, which can be accomplished with + * release_fault_lock(vmf). Note that release_fault_lock() doesn't check if + * FAULT_FLAG_RETRY_NOWAIT is set before releasing the mmap_lock. + * + * - if FAULT_FLAG_RETRY_NOWAIT is set, then the function must not release the + * mmap_lock. The flag should be set only if the mmap_lock is held. + * + * 3. If the return value contains neither of the above, the function must not + * release the fault lock; the generic fault handler will take care of releasing + * the correct lock. 
+ */ +static inline vm_fault_t arch_handle_folio_fault_on_access(struct folio *folio, + struct vm_fault *vmf, + bool *map_pte) +{ + *map_pte = false; + + return VM_FAULT_SIGBUS; +} +#endif #endif /* CONFIG_MMU */ diff --git a/mm/Kconfig b/mm/Kconfig index 341cf53898db..153df67221f1 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1006,6 +1006,9 @@ config IDLE_PAGE_TRACKING config ARCH_HAS_CACHE_LINE_SIZE bool +config ARCH_HAS_FAULT_ON_ACCESS + bool + config ARCH_HAS_CURRENT_STACK_POINTER bool help diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 94ef5c02b459..2bad63a7ec16 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1698,7 +1698,7 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma, } /* NUMA hinting page fault entry point for trans huge pmds */ -vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf) +vm_fault_t handle_huge_pmd_protnone(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; pmd_t oldpmd = vmf->orig_pmd; @@ -1708,6 +1708,7 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf) int nid = NUMA_NO_NODE; int target_nid, last_cpupid = (-1 & LAST_CPUPID_MASK); bool migrated = false, writable = false; + vm_fault_t ret; int flags = 0; vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); @@ -1731,6 +1732,20 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf) if (!folio) goto out_map; + folio_get(folio); + vma_set_access_pid_bit(vma); + + if (arch_fault_on_access_pmd(oldpmd)) { + bool map_pte = false; + + spin_unlock(vmf->ptl); + ret = arch_handle_folio_fault_on_access(folio, vmf, &map_pte); + if (ret || !map_pte) + return ret; + writable = false; + goto out_lock_and_map; + } + /* See similar comment in do_numa_page for explanation */ if (!writable) flags |= TNF_NO_GROUP; @@ -1755,15 +1770,18 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf) if (migrated) { flags |= TNF_MIGRATED; nid = target_nid; - } else { - flags |= TNF_MIGRATE_FAIL; - vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); - if (unlikely(!pmd_same(oldpmd, 
*vmf->pmd))) { - spin_unlock(vmf->ptl); - goto out; - } - goto out_map; + goto out; + } + + flags |= TNF_MIGRATE_FAIL; + +out_lock_and_map: + vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); + if (unlikely(!pmd_same(oldpmd, *vmf->pmd))) { + spin_unlock(vmf->ptl); + goto out; } + goto out_map; out: if (nid != NUMA_NO_NODE) diff --git a/mm/memory.c b/mm/memory.c index 8a421e168b57..110fe2224277 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4886,11 +4886,6 @@ static vm_fault_t do_fault(struct vm_fault *vmf) int numa_migrate_prep(struct folio *folio, struct vm_area_struct *vma, unsigned long addr, int page_nid, int *flags) { - folio_get(folio); - - /* Record the current PID acceesing VMA */ - vma_set_access_pid_bit(vma); - count_vm_numa_event(NUMA_HINT_FAULTS); if (page_nid == numa_node_id()) { count_vm_numa_event(NUMA_HINT_FAULTS_LOCAL); @@ -4900,13 +4895,14 @@ int numa_migrate_prep(struct folio *folio, struct vm_area_struct *vma, return mpol_misplaced(folio, vma, addr); } -static vm_fault_t do_numa_page(struct vm_fault *vmf) +static vm_fault_t handle_pte_protnone(struct vm_fault *vmf) { struct vm_area_struct *vma = vmf->vma; struct folio *folio = NULL; int nid = NUMA_NO_NODE; bool writable = false; int last_cpupid; + vm_fault_t ret; int target_nid; pte_t pte, old_pte; int flags = 0; @@ -4939,6 +4935,20 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) if (!folio || folio_is_zone_device(folio)) goto out_map; + folio_get(folio); + /* Record the current PID acceesing VMA */ + vma_set_access_pid_bit(vma); + + if (arch_fault_on_access_pte(old_pte)) { + bool map_pte = false; + + pte_unmap_unlock(vmf->pte, vmf->ptl); + ret = arch_handle_folio_fault_on_access(folio, vmf, &map_pte); + if (ret || !map_pte) + return ret; + goto out_lock_and_map; + } + /* TODO: handle PTE-mapped THP */ if (folio_test_large(folio)) goto out_map; @@ -4983,18 +4993,21 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf) if (migrate_misplaced_folio(folio, vma, target_nid)) { nid = target_nid; 
flags |= TNF_MIGRATED; - } else { - flags |= TNF_MIGRATE_FAIL; - vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, - vmf->address, &vmf->ptl); - if (unlikely(!vmf->pte)) - goto out; - if (unlikely(!pte_same(ptep_get(vmf->pte), vmf->orig_pte))) { - pte_unmap_unlock(vmf->pte, vmf->ptl); - goto out; - } - goto out_map; + goto out; + } + + flags |= TNF_MIGRATE_FAIL; + +out_lock_and_map: + vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, + vmf->address, &vmf->ptl); + if (unlikely(!vmf->pte)) + goto out; + if (unlikely(!pte_same(ptep_get(vmf->pte), vmf->orig_pte))) { + pte_unmap_unlock(vmf->pte, vmf->ptl); + goto out; } + goto out_map; out: if (nid != NUMA_NO_NODE) @@ -5151,7 +5164,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) return do_swap_page(vmf); if (pte_protnone(vmf->orig_pte) && vma_is_accessible(vmf->vma)) - return do_numa_page(vmf); + return handle_pte_protnone(vmf); spin_lock(vmf->ptl); entry = vmf->orig_pte; @@ -5272,7 +5285,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma, } if (pmd_trans_huge(vmf.orig_pmd) || pmd_devmap(vmf.orig_pmd)) { if (pmd_protnone(vmf.orig_pmd) && vma_is_accessible(vma)) - return do_huge_pmd_numa_page(&vmf); + return handle_huge_pmd_protnone(&vmf); if ((flags & (FAULT_FLAG_WRITE|FAULT_FLAG_UNSHARE)) && !pmd_write(vmf.orig_pmd)) { From patchwork Thu Jan 25 16:42:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531396 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 868AFC48260 for ; Thu, 25 Jan 2024 16:45:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; 
d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=8ScM0bZlBQMEdIh7JI3uG9cZS/stJ61fprbFcNxQYno=; b=NGxBUKyCDwavV/ WTCnVZmTP+8x134nVRb/okmpctI1D/WIXkiX4+eapoNN2i1gouT5zC2VJaRMP7lEOtkFPpLzEm1l3 O/EXnR3h4wKszS5YBKvQLAHIcAy1sc30xtwqqP0wcZz9fiEPzInfsFlEUn74VFveGbUHMFaoYJ7qN WqO0LlFU/Ji4s9ddaeZjwxPxqbIXYsG1yvzHpiQOehIOj3q/J/ZMWkEgJQMyLeW7FScbcxPPlYUT9 +i/4NgxFwDWi3U43J7s1UGBJJvafMNXM65IJ+NTZ4KYcYOyJbYmTsly6bjRA2gNvCzOL4FGzprYXy nsM9XoIiAakioHVmOAFw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qi-00000000tQF-2Jfa; Thu, 25 Jan 2024 16:45:08 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2pl-00000000sl2-0vLI for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:44:14 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7FAFC15A1; Thu, 25 Jan 2024 08:44:52 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 8E70C3F5A1; Thu, 25 Jan 2024 08:44:02 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, 
rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 14/35] of: fdt: Return the region size in of_flat_dt_translate_address() Date: Thu, 25 Jan 2024 16:42:35 +0000 Message-Id: <20240125164256.4147-15-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084409_509031_D986DD8D X-CRM114-Status: GOOD ( 14.16 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Alongside the base address, arm64 will also need to know the size of a tag storage region. Teach of_flat_dt_translate_address() to parse and return the size. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * New patch, suggested by Rob Herring. 
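Reviewer note: a minimal userspace sketch of how the region size sits after the address cells in a "reg" entry, which is what the new out_size parameter exposes. The helper names cell_be32(), read_cells() and reg_region_size() are hypothetical stand-ins for this illustration; they only mirror the shape of of_read_number(reg + na, ns) in the patch and are not kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one big-endian 32-bit cell from raw bytes. */
static uint32_t cell_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Fold 'ncells' consecutive 32-bit cells into one 64-bit number. */
static uint64_t read_cells(const uint8_t *reg, int ncells)
{
	uint64_t r = 0;

	while (ncells-- > 0) {
		r = (r << 32) | cell_be32(reg);
		reg += 4;
	}
	return r;
}

/*
 * The size follows the 'na' address cells of the entry. Unlike the
 * address, it is read as-is and is not translated through parent buses.
 */
static uint64_t reg_region_size(const uint8_t *reg, int na, int ns)
{
	return read_cells(reg + (size_t)na * 4, ns);
}
```

With two address cells and two size cells, a "reg" entry of <0x8 0x0 0x0 0x4000000> would decode to base 0x800000000 and size 0x4000000. Callers that only need the address keep passing NULL for out_size, as the sh2 and earlycon call sites in this patch do.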
arch/sh/kernel/cpu/sh2/probe.c | 2 +- drivers/of/fdt_address.c | 12 +++++++++--- drivers/tty/serial/earlycon.c | 2 +- include/linux/of_fdt.h | 2 +- 4 files changed, 12 insertions(+), 6 deletions(-) diff --git a/arch/sh/kernel/cpu/sh2/probe.c b/arch/sh/kernel/cpu/sh2/probe.c index 70a07f4f2142..fa8904e8f390 100644 --- a/arch/sh/kernel/cpu/sh2/probe.c +++ b/arch/sh/kernel/cpu/sh2/probe.c @@ -21,7 +21,7 @@ static int __init scan_cache(unsigned long node, const char *uname, if (!of_flat_dt_is_compatible(node, "jcore,cache")) return 0; - j2_ccr_base = ioremap(of_flat_dt_translate_address(node), 4); + j2_ccr_base = ioremap(of_flat_dt_translate_address(node, NULL), 4); return 1; } diff --git a/drivers/of/fdt_address.c b/drivers/of/fdt_address.c index 1dc15ab78b10..4c077778d710 100644 --- a/drivers/of/fdt_address.c +++ b/drivers/of/fdt_address.c @@ -160,7 +160,8 @@ static int __init fdt_translate_one(const void *blob, int parent, * that can be mapped to a cpu physical address). This is not really specified * that way, but this is traditionally the way IBM at least do things */ -static u64 __init fdt_translate_address(const void *blob, int node_offset) +static u64 __init fdt_translate_address(const void *blob, int node_offset, + u64 *out_size) { int parent, len; const struct of_bus *bus, *pbus; @@ -193,6 +194,9 @@ static u64 __init fdt_translate_address(const void *blob, int node_offset) goto bail; } memcpy(addr, reg, na * 4); + /* The size of the region doesn't need translating. 
*/ + if (out_size) + *out_size = of_read_number(reg + na, ns); pr_debug("bus (na=%d, ns=%d) on %s\n", na, ns, fdt_get_name(blob, parent, NULL)); @@ -242,8 +246,10 @@ static u64 __init fdt_translate_address(const void *blob, int node_offset) /** * of_flat_dt_translate_address - translate DT addr into CPU phys addr * @node: node in the flat blob + * @out_size: size of the region, can be NULL if not needed + * @return: the address, OF_BAD_ADDR in case of error */ -u64 __init of_flat_dt_translate_address(unsigned long node) +u64 __init of_flat_dt_translate_address(unsigned long node, u64 *out_size) { - return fdt_translate_address(initial_boot_params, node); + return fdt_translate_address(initial_boot_params, node, out_size); } diff --git a/drivers/tty/serial/earlycon.c b/drivers/tty/serial/earlycon.c index a5fbb6ed38ae..e941cf786232 100644 --- a/drivers/tty/serial/earlycon.c +++ b/drivers/tty/serial/earlycon.c @@ -265,7 +265,7 @@ int __init of_setup_earlycon(const struct earlycon_id *match, spin_lock_init(&port->lock); port->iotype = UPIO_MEM; - addr = of_flat_dt_translate_address(node); + addr = of_flat_dt_translate_address(node, NULL); if (addr == OF_BAD_ADDR) { pr_warn("[%s] bad address\n", match->name); return -ENXIO; diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h index d69ad5bb1eb1..0e26f8c3b10e 100644 --- a/include/linux/of_fdt.h +++ b/include/linux/of_fdt.h @@ -36,7 +36,7 @@ extern char __dtb_start[]; extern char __dtb_end[]; /* Other Prototypes */ -extern u64 of_flat_dt_translate_address(unsigned long node); +extern u64 of_flat_dt_translate_address(unsigned long node, u64 *out_size); extern void of_fdt_limit_memory(int limit); #endif /* CONFIG_OF_FLATTREE */ From patchwork Thu Jan 25 16:42:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531397 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8CFD4C47258 for ; Thu, 25 Jan 2024 16:45:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=oFUFZL9uX7f5GDOQIrILOy9vKVsVbv6yzvq4AaJS46U=; b=VRPEtaZHsha96h PZQeD+w+MhzKPeNgqNhlRQz/R5/v2zAbEsYkddAOYC0kPJcMon+eIIxVFpULRAn9FFz4Z01LCmZEI qx6IDJKBApQnNXAzDr0ChTj/p3CHFzSGN+gtUHMFQqW63di96SHyTMAxkp443F4h6VDfgCx4y6gPa iQSGNq+1XOylxLQu4tlvrOL75uS9cnro590+OhjEMwawuH1vBbD6j/eL/9/1xR1CoQIirG4yrj6P1 +QmcufwkohthP3Zrrebyd62uHHivMLOSEPm6+VlSrQyakI0uiL+ayF5QZnox4HG4LJwwYnuSYyTvv pwcKGuPTCfoQppUazEtw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qm-00000000tUE-3HTr; Thu, 25 Jan 2024 16:45:12 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2pr-00000000sl2-14lh for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:44:21 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 48AE015BF; Thu, 25 Jan 2024 08:44:58 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 5846F3F5A1; Thu, 25 Jan 2024 08:44:08 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, 
will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 15/35] of: fdt: Add of_flat_read_u32() Date: Thu, 25 Jan 2024 16:42:36 +0000 Message-Id: <20240125164256.4147-16-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084415_555840_0FB7504E X-CRM114-Status: GOOD ( 10.55 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Add the function of_flat_read_u32() to return the value of a property as an u32. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * New patch, suggested by Rob Herring. 
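Reviewer note: a minimal userspace sketch of the contract described above (0 on success, -EINVAL when the property is missing, value converted from big-endian). The struct prop table, find_prop() and read_u32_prop() are hypothetical stand-ins for the flattened device tree and of_get_flat_dt_prop(); the property name in the usage note is made up as well.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a device tree property holding one cell. */
struct prop {
	const char *name;
	uint8_t value[4];	/* stored big-endian, as in the FDT blob */
};

/* Look a property up by name; returns NULL when it does not exist. */
static const uint8_t *find_prop(const struct prop *props, size_t n,
				const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(props[i].name, name) == 0)
			return props[i].value;
	return NULL;
}

/* Same contract as the helper in this patch: *out is only written on success. */
static int read_u32_prop(const struct prop *props, size_t n,
			 const char *name, uint32_t *out)
{
	const uint8_t *v = find_prop(props, n, name);

	if (!v)
		return -EINVAL;

	*out = ((uint32_t)v[0] << 24) | ((uint32_t)v[1] << 16) |
	       ((uint32_t)v[2] << 8) | (uint32_t)v[3];
	return 0;
}
```

For a table containing { "block-size", { 0x00, 0x00, 0x10, 0x00 } }, reading "block-size" would yield 0x1000. Note that the helper in the patch passes NULL as the size argument of of_get_flat_dt_prop(), so it does not appear to check that the property is at least four bytes long; callers that need that guarantee may have to validate the length themselves.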
drivers/of/fdt.c | 21 +++++++++++++++++++++ include/linux/of_fdt.h | 2 ++ 2 files changed, 23 insertions(+) diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index bf502ba8da95..dfcd79fd5fd9 100644 --- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -755,6 +755,27 @@ const void *__init of_get_flat_dt_prop(unsigned long node, const char *name, return fdt_getprop(initial_boot_params, node, name, size); } +/* + * of_flat_read_u32 - Return the value of the given property as an u32. + * + * @node: device node from which the property value is to be read + * @propname: name of the property + * @out_value: the value of the property + * @return: 0 on success, -EINVAL if property does not exist + */ +int __init of_flat_read_u32(unsigned long node, const char *propname, + u32 *out_value) +{ + const __be32 *reg; + + reg = of_get_flat_dt_prop(node, propname, NULL); + if (!reg) + return -EINVAL; + + *out_value = be32_to_cpup(reg); + return 0; +} + /** * of_fdt_is_compatible - Return true if given node from the given blob has * compat in its compatible list diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h index 0e26f8c3b10e..d7901699061b 100644 --- a/include/linux/of_fdt.h +++ b/include/linux/of_fdt.h @@ -57,6 +57,8 @@ extern const void *of_get_flat_dt_prop(unsigned long node, const char *name, extern int of_flat_dt_is_compatible(unsigned long node, const char *name); extern unsigned long of_get_flat_dt_root(void); extern uint32_t of_get_flat_dt_phandle(unsigned long node); +extern int of_flat_read_u32(unsigned long node, const char *propname, + u32 *out_value); extern int early_init_dt_scan_chosen(char *cmdline); extern int early_init_dt_scan_memory(void); From patchwork Thu Jan 25 16:42:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531399 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org 
Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 60B85C47258 for ; Thu, 25 Jan 2024 16:45:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=3LKABELQSDI1Smn7+o6YmZunEyNWvdsjI0gqQJnrTl8=; b=FTwPttVJDZWCqg 0JrY/JWbZSeyn1LZx32QovDrukIZsKg+nNsEnVoi9pGE+w033bnKmv6y/t6flPRcS+HFXqDmDycs0 n2VaL5Ac9+uFw8AsqYjnQ1poj+8cev3Dcd0i4xOSc7cMULyV6Q3s0FQ/O1f1Z/Gh8FZDJovKtFe2Q xSBniPh2Jmf+DPuRjdOhlQT243l9yMZNlwa5WtCRHozfKzHPwryYcGztXR7UN6SxGbyMAHs1OnPWn t5sc8t4QSzlSWBOs9YiMtTAzWwm+C5+YJHi0FV2+L5RC7Gaz/8TS92QifGJPJ+CxN3w3ceIEQmI53 bYs8ZPRxsBAi/7GtRX8Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2r0-00000000tjr-2v2q; Thu, 25 Jan 2024 16:45:26 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2pw-00000000suD-363l for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:44:24 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1168115DB; Thu, 25 Jan 2024 08:45:04 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 211FF3F5A1; Thu, 25 Jan 2024 08:44:13 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, 
maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 16/35] KVM: arm64: Don't deny VM_PFNMAP VMAs when kvm_has_mte() Date: Thu, 25 Jan 2024 16:42:37 +0000 Message-Id: <20240125164256.4147-17-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084421_156059_07145F27 X-CRM114-Status: GOOD ( 12.83 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

According to ARM DDI 0487J.a, page D10-5976, a memory location which doesn't have the Normal memory attribute is considered Untagged, and accesses are Tag Unchecked. Tag reads from an Untagged address return 0b0000, and writes are ignored. Linux uses VM_PFNMAP VMAs to represent device memory, and doesn't set the VM_MTE_ALLOWED flag for these VMAs.
In user_mem_abort(), KVM requires that all VMAs that back guest memory must allow tagging (VM_MTE_ALLOWED flag set), except for VMAs that represent device memory. When a memslot is created or changed, KVM enforces a different behaviour: **all** VMAs that intersect the memslot must allow tagging, even those that represent device memory. This is too restrictive, and can lead to inconsistent behaviour: a VM_PFNMAP VMA that is present when a memslot is created causes KVM_SET_USER_MEMORY_REGION to fail, but if such a VMA is created after the memslot has been created, the virtual machine will run without errors. Change kvm_arch_prepare_memory_region() to allow VM_PFNMAP VMAs when the VM has the MTE capability enabled. Signed-off-by: Alexandru Elisei --- Changes from rfc v2: * New patch. It's a fix, and can be taken independently of the series. arch/arm64/kvm/mmu.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index d14504821b79..b7517c4a19c4 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -2028,17 +2028,15 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm, if (!vma) break; - if (kvm_has_mte(kvm) && !kvm_vma_mte_allowed(vma)) { - ret = -EINVAL; - break; - } - if (vma->vm_flags & VM_PFNMAP) { /* IO region dirty page logging not allowed */ if (new->flags & KVM_MEM_LOG_DIRTY_PAGES) { ret = -EINVAL; break; } + } else if (kvm_has_mte(kvm) && !kvm_vma_mte_allowed(vma)) { + ret = -EINVAL; + break; } hva = min(reg_end, vma->vm_end); } while (hva < reg_end); From patchwork Thu Jan 25 16:42:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531401 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 51C6DC47422 for ; Thu, 25 Jan 2024 16:45:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=yhk5wJkP2sKgm+AfhB4vK38W1Lsj6RRzs+zskYVS/a0=; b=t5OjOAXzf1DBX/ oFPbkSb6hq0uiXVQSU7yzw3vDYp4MS9hgEzXlGjttW2WZdsealBuH/WrZEz2EsqLza8aCYQ+28GHv NWTChqSQg/LKBAgx7wMF4R692wbg0r2sqGsGFIDPPnoT0QjjpwAUFuZ7Nzccw1inomJ60lfvqEgy4 L/73NLAY/ZOoW4/Uk23hUzql2cyX5EBmzGCoqetqZQKSlnmD225spiljo0nxqUKFK9T+ytrAbwAhg 7/tP1xSZ2luQIpDZJd+0YGvJhQJ7mKby3STkZG97Yc2GxMVUFnGSVi8BZSRlfxyywsogk/TFDQkni ViI0qf5sv8+YQT9ol0tg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2r7-00000000tql-1KYE; Thu, 25 Jan 2024 16:45:33 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2q2-00000000syT-2jfg for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:44:30 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id CF4481650; Thu, 25 Jan 2024 08:45:09 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id DFFF33F5A1; Thu, 25 Jan 2024 08:44:19 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, 
akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 17/35] arm64: mte: Rework naming for tag manipulation functions Date: Thu, 25 Jan 2024 16:42:38 +0000 Message-Id: <20240125164256.4147-18-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084426_986319_02DE260C X-CRM114-Status: GOOD ( 26.57 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

The tag save/restore/copy functions could be more explicit about where the tags are coming from and where they are being copied to. Rename the functions to make it easier to understand what they are doing:

- Rename the mte_clear_page_tags() 'addr' parameter to 'page_addr', to match the other functions that take a page address as parameter.
- Rename mte_save/restore_tags() to mte_save/restore_page_tags_by_swp_entry() to make it clear that the tags are saved in a collection indexed by swp_entry (this will become important when they are also saved in a collection indexed by page pfn). The same applies to mte_invalidate_tags{,_area}_by_swp_entry().

- Rename mte_save/restore_page_tags() to mte_copy_page_tags_to_buf() and mte_copy_page_tags_from_buf() to make it clear where the tags are saved to and restored from: a previously allocated memory buffer, not an xarray as is the case when the tags are saved for swapping. Rename the action to 'copy' instead of 'save'/'restore' to match the copy from user functions, which also copy tags to memory.

- Rename mte_allocate/free_tag_storage() to mte_allocate/free_tag_buf() to make it clear the functions have nothing to do with the memory where the corresponding tags for a page live. Change the parameter type for mte_free_tag_buf() to void *, to match the return value of mte_allocate_tag_buf(), and because that memory is opaque and is not meant to be directly dereferenced.

In the name of consistency, rename local variables from tag_storage to tags. Give a similar treatment to the hibernation code that saves and restores the tags for all tagged pages.

In the same spirit, rename MTE_PAGE_TAG_STORAGE to MTE_PAGE_TAG_STORAGE_SIZE to make it clear that it relates to the size of the memory needed to save the tags for a page. Opportunistically rename MTE_TAG_SIZE to MTE_TAG_SIZE_BITS to make it clear it is measured in bits, not in bytes like the rest of the size definitions in the same header file.
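As a worked illustration of why the bits/bytes distinction matters, the sketch below reproduces the arithmetic behind the renamed macros in userspace. It assumes 4 KiB pages (PAGE_SIZE_ASSUMED is a made-up name; the kernel uses PAGE_SIZE), and genmask64(), mte_tag_mask() and mte_tag_of() are hypothetical helpers standing in for GENMASK() and the tag extraction; none of them are part of this patch.

```c
#include <stdint.h>

#define PAGE_SIZE_ASSUMED		4096UL	/* assumption: 4 KiB pages */
#define MTE_GRANULE_SIZE		16UL	/* bytes covered by one tag */
#define MTE_GRANULES_PER_PAGE		(PAGE_SIZE_ASSUMED / MTE_GRANULE_SIZE)
#define MTE_TAG_SHIFT			56
#define MTE_TAG_SIZE_BITS		4	/* bits, not bytes */
/* Bytes needed to hold every tag of one page: two 4-bit tags per byte. */
#define MTE_PAGE_TAG_STORAGE_SIZE	(MTE_GRANULES_PER_PAGE * MTE_TAG_SIZE_BITS / 8)

/* Hypothetical userspace equivalent of GENMASK(high, low) for 64 bits. */
static uint64_t genmask64(unsigned int high, unsigned int low)
{
	return (~UINT64_C(0) >> (63 - high)) & (~UINT64_C(0) << low);
}

/* Mask covering the logical tag bits of a pointer (bits 59:56). */
static uint64_t mte_tag_mask(void)
{
	return genmask64(MTE_TAG_SHIFT + MTE_TAG_SIZE_BITS - 1, MTE_TAG_SHIFT);
}

/* Extract the 4-bit tag from a 64-bit pointer value. */
static unsigned int mte_tag_of(uint64_t ptr)
{
	return (unsigned int)((ptr & mte_tag_mask()) >> MTE_TAG_SHIFT);
}
```

Under the 4 KiB assumption a page has 4096 / 16 = 256 granules, so the buffer returned by mte_allocate_tag_buf() would be 256 * 4 / 8 = 128 bytes. Reading MTE_TAG_SIZE as bytes instead of bits would overestimate that by a factor of eight, which is the confusion the rename is meant to prevent.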
Signed-off-by: Alexandru Elisei --- arch/arm64/include/asm/mte-def.h | 16 +++++----- arch/arm64/include/asm/mte.h | 23 +++++++++------ arch/arm64/include/asm/pgtable.h | 8 ++--- arch/arm64/kernel/elfcore.c | 14 ++++----- arch/arm64/kernel/hibernate.c | 46 ++++++++++++++--------------- arch/arm64/lib/mte.S | 18 ++++++------ arch/arm64/mm/mteswap.c | 50 ++++++++++++++++---------------- 7 files changed, 90 insertions(+), 85 deletions(-) diff --git a/arch/arm64/include/asm/mte-def.h b/arch/arm64/include/asm/mte-def.h index 14ee86b019c2..eb0d76a6bdcf 100644 --- a/arch/arm64/include/asm/mte-def.h +++ b/arch/arm64/include/asm/mte-def.h @@ -5,14 +5,14 @@ #ifndef __ASM_MTE_DEF_H #define __ASM_MTE_DEF_H -#define MTE_GRANULE_SIZE UL(16) -#define MTE_GRANULE_MASK (~(MTE_GRANULE_SIZE - 1)) -#define MTE_GRANULES_PER_PAGE (PAGE_SIZE / MTE_GRANULE_SIZE) -#define MTE_TAG_SHIFT 56 -#define MTE_TAG_SIZE 4 -#define MTE_TAG_MASK GENMASK((MTE_TAG_SHIFT + (MTE_TAG_SIZE - 1)), MTE_TAG_SHIFT) -#define MTE_PAGE_TAG_STORAGE (MTE_GRANULES_PER_PAGE * MTE_TAG_SIZE / 8) +#define MTE_GRANULE_SIZE UL(16) +#define MTE_GRANULE_MASK (~(MTE_GRANULE_SIZE - 1)) +#define MTE_GRANULES_PER_PAGE (PAGE_SIZE / MTE_GRANULE_SIZE) +#define MTE_TAG_SHIFT 56 +#define MTE_TAG_SIZE_BITS 4 +#define MTE_TAG_MASK GENMASK((MTE_TAG_SHIFT + (MTE_TAG_SIZE_BITS - 1)), MTE_TAG_SHIFT) +#define MTE_PAGE_TAG_STORAGE_SIZE (MTE_GRANULES_PER_PAGE * MTE_TAG_SIZE_BITS / 8) -#define __MTE_PREAMBLE ARM64_ASM_PREAMBLE ".arch_extension memtag\n" +#define __MTE_PREAMBLE ARM64_ASM_PREAMBLE ".arch_extension memtag\n" #endif /* __ASM_MTE_DEF_H */ diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h index 91fbd5c8a391..8034695b3dd7 100644 --- a/arch/arm64/include/asm/mte.h +++ b/arch/arm64/include/asm/mte.h @@ -18,19 +18,24 @@ #include -void mte_clear_page_tags(void *addr); +void mte_clear_page_tags(void *page_addr); + unsigned long mte_copy_tags_from_user(void *to, const void __user *from, unsigned long n); unsigned long 
mte_copy_tags_to_user(void __user *to, void *from, unsigned long n); -int mte_save_tags(struct page *page); -void mte_save_page_tags(const void *page_addr, void *tag_storage); -void mte_restore_tags(swp_entry_t entry, struct page *page); -void mte_restore_page_tags(void *page_addr, const void *tag_storage); -void mte_invalidate_tags(int type, pgoff_t offset); -void mte_invalidate_tags_area(int type); -void *mte_allocate_tag_storage(void); -void mte_free_tag_storage(char *storage); + +int mte_save_page_tags_by_swp_entry(struct page *page); +void mte_restore_page_tags_by_swp_entry(swp_entry_t entry, struct page *page); + +void mte_copy_page_tags_to_buf(const void *page_addr, void *to); +void mte_copy_page_tags_from_buf(void *page_addr, const void *from); + +void mte_invalidate_tags_by_swp_entry(int type, pgoff_t offset); +void mte_invalidate_tags_area_by_swp_entry(int type); + +void *mte_allocate_tag_buf(void); +void mte_free_tag_buf(void *buf); #ifdef CONFIG_ARM64_MTE diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 08f0904dbfc2..2499cc4fa4f2 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1045,7 +1045,7 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma, static inline int arch_prepare_to_swap(struct page *page) { if (system_supports_mte()) - return mte_save_tags(page); + return mte_save_page_tags_by_swp_entry(page); return 0; } @@ -1053,20 +1053,20 @@ static inline int arch_prepare_to_swap(struct page *page) static inline void arch_swap_invalidate_page(int type, pgoff_t offset) { if (system_supports_mte()) - mte_invalidate_tags(type, offset); + mte_invalidate_tags_by_swp_entry(type, offset); } static inline void arch_swap_invalidate_area(int type) { if (system_supports_mte()) - mte_invalidate_tags_area(type); + mte_invalidate_tags_area_by_swp_entry(type); } #define __HAVE_ARCH_SWAP_RESTORE static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio) { if 
(system_supports_mte()) - mte_restore_tags(entry, &folio->page); + mte_restore_page_tags_by_swp_entry(entry, &folio->page); } #endif /* CONFIG_ARM64_MTE */ diff --git a/arch/arm64/kernel/elfcore.c b/arch/arm64/kernel/elfcore.c index 2e94d20c4ac7..e9ae00dacad8 100644 --- a/arch/arm64/kernel/elfcore.c +++ b/arch/arm64/kernel/elfcore.c @@ -17,7 +17,7 @@ static unsigned long mte_vma_tag_dump_size(struct core_vma_metadata *m) { - return (m->dump_size >> PAGE_SHIFT) * MTE_PAGE_TAG_STORAGE; + return (m->dump_size >> PAGE_SHIFT) * MTE_PAGE_TAG_STORAGE_SIZE; } /* Derived from dump_user_range(); start/end must be page-aligned */ @@ -38,7 +38,7 @@ static int mte_dump_tag_range(struct coredump_params *cprm, * have been all zeros. */ if (!page) { - dump_skip(cprm, MTE_PAGE_TAG_STORAGE); + dump_skip(cprm, MTE_PAGE_TAG_STORAGE_SIZE); continue; } @@ -48,12 +48,12 @@ static int mte_dump_tag_range(struct coredump_params *cprm, */ if (!page_mte_tagged(page)) { put_page(page); - dump_skip(cprm, MTE_PAGE_TAG_STORAGE); + dump_skip(cprm, MTE_PAGE_TAG_STORAGE_SIZE); continue; } if (!tags) { - tags = mte_allocate_tag_storage(); + tags = mte_allocate_tag_buf(); if (!tags) { put_page(page); ret = 0; @@ -61,16 +61,16 @@ static int mte_dump_tag_range(struct coredump_params *cprm, } } - mte_save_page_tags(page_address(page), tags); + mte_copy_page_tags_to_buf(page_address(page), tags); put_page(page); - if (!dump_emit(cprm, tags, MTE_PAGE_TAG_STORAGE)) { + if (!dump_emit(cprm, tags, MTE_PAGE_TAG_STORAGE_SIZE)) { ret = 0; break; } } if (tags) - mte_free_tag_storage(tags); + mte_free_tag_buf(tags); return ret; } diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c index 02870beb271e..a3b0e7b32457 100644 --- a/arch/arm64/kernel/hibernate.c +++ b/arch/arm64/kernel/hibernate.c @@ -215,41 +215,41 @@ static int create_safe_exec_page(void *src_start, size_t length, #ifdef CONFIG_ARM64_MTE -static DEFINE_XARRAY(mte_pages); +static DEFINE_XARRAY(tags_by_pfn); -static int 
save_tags(struct page *page, unsigned long pfn) +static int save_page_tags_by_pfn(struct page *page, unsigned long pfn) { - void *tag_storage, *ret; + void *tags, *ret; - tag_storage = mte_allocate_tag_storage(); - if (!tag_storage) + tags = mte_allocate_tag_buf(); + if (!tags) return -ENOMEM; - mte_save_page_tags(page_address(page), tag_storage); + mte_copy_page_tags_to_buf(page_address(page), tags); - ret = xa_store(&mte_pages, pfn, tag_storage, GFP_KERNEL); + ret = xa_store(&tags_by_pfn, pfn, tags, GFP_KERNEL); if (WARN(xa_is_err(ret), "Failed to store MTE tags")) { - mte_free_tag_storage(tag_storage); + mte_free_tag_buf(tags); return xa_err(ret); } else if (WARN(ret, "swsusp: %s: Duplicate entry", __func__)) { - mte_free_tag_storage(ret); + mte_free_tag_buf(ret); } return 0; } -static void swsusp_mte_free_storage(void) +static void swsusp_mte_free_tags(void) { - XA_STATE(xa_state, &mte_pages, 0); + XA_STATE(xa_state, &tags_by_pfn, 0); void *tags; - xa_lock(&mte_pages); + xa_lock(&tags_by_pfn); xas_for_each(&xa_state, tags, ULONG_MAX) { - mte_free_tag_storage(tags); + mte_free_tag_buf(tags); } - xa_unlock(&mte_pages); + xa_unlock(&tags_by_pfn); - xa_destroy(&mte_pages); + xa_destroy(&tags_by_pfn); } static int swsusp_mte_save_tags(void) @@ -273,9 +273,9 @@ static int swsusp_mte_save_tags(void) if (!page_mte_tagged(page)) continue; - ret = save_tags(page, pfn); + ret = save_page_tags_by_pfn(page, pfn); if (ret) { - swsusp_mte_free_storage(); + swsusp_mte_free_tags(); goto out; } @@ -290,25 +290,25 @@ static int swsusp_mte_save_tags(void) static void swsusp_mte_restore_tags(void) { - XA_STATE(xa_state, &mte_pages, 0); + XA_STATE(xa_state, &tags_by_pfn, 0); int n = 0; void *tags; - xa_lock(&mte_pages); + xa_lock(&tags_by_pfn); xas_for_each(&xa_state, tags, ULONG_MAX) { unsigned long pfn = xa_state.xa_index; struct page *page = pfn_to_online_page(pfn); - mte_restore_page_tags(page_address(page), tags); + mte_copy_page_tags_from_buf(page_address(page), tags); - 
mte_free_tag_storage(tags); + mte_free_tag_buf(tags); n++; } - xa_unlock(&mte_pages); + xa_unlock(&tags_by_pfn); pr_info("Restored %d MTE pages\n", n); - xa_destroy(&mte_pages); + xa_destroy(&tags_by_pfn); } #else /* CONFIG_ARM64_MTE */ diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S index 5018ac03b6bf..9f623e9da09f 100644 --- a/arch/arm64/lib/mte.S +++ b/arch/arm64/lib/mte.S @@ -119,7 +119,7 @@ SYM_FUNC_START(mte_copy_tags_to_user) cbz x2, 2f 1: ldg x4, [x1] - ubfx x4, x4, #MTE_TAG_SHIFT, #MTE_TAG_SIZE + ubfx x4, x4, #MTE_TAG_SHIFT, #MTE_TAG_SIZE_BITS USER(2f, sttrb w4, [x0]) add x0, x0, #1 add x1, x1, #MTE_GRANULE_SIZE @@ -132,11 +132,11 @@ USER(2f, sttrb w4, [x0]) SYM_FUNC_END(mte_copy_tags_to_user) /* - * Save the tags in a page + * Copy the tags in a page to a buffer * x0 - page address - * x1 - tag storage, MTE_PAGE_TAG_STORAGE bytes + * x1 - memory buffer, MTE_PAGE_TAG_STORAGE_SIZE bytes */ -SYM_FUNC_START(mte_save_page_tags) +SYM_FUNC_START(mte_copy_page_tags_to_buf) multitag_transfer_size x7, x5 1: mov x2, #0 @@ -153,14 +153,14 @@ SYM_FUNC_START(mte_save_page_tags) b.ne 1b ret -SYM_FUNC_END(mte_save_page_tags) +SYM_FUNC_END(mte_copy_page_tags_to_buf) /* - * Restore the tags in a page + * Restore the tags in a page from a buffer * x0 - page address - * x1 - tag storage, MTE_PAGE_TAG_STORAGE bytes + * x1 - memory buffer, MTE_PAGE_TAG_STORAGE_SIZE bytes */ -SYM_FUNC_START(mte_restore_page_tags) +SYM_FUNC_START(mte_copy_page_tags_from_buf) multitag_transfer_size x7, x5 1: ldr x2, [x1], #8 @@ -174,4 +174,4 @@ SYM_FUNC_START(mte_restore_page_tags) b.ne 1b ret -SYM_FUNC_END(mte_restore_page_tags) +SYM_FUNC_END(mte_copy_page_tags_from_buf) diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c index a31833e3ddc5..2a43746b803f 100644 --- a/arch/arm64/mm/mteswap.c +++ b/arch/arm64/mm/mteswap.c @@ -7,79 +7,79 @@ #include #include -static DEFINE_XARRAY(mte_pages); +static DEFINE_XARRAY(tags_by_swp_entry); -void *mte_allocate_tag_storage(void) +void 
*mte_allocate_tag_buf(void) { /* tags granule is 16 bytes, 2 tags stored per byte */ - return kmalloc(MTE_PAGE_TAG_STORAGE, GFP_KERNEL); + return kmalloc(MTE_PAGE_TAG_STORAGE_SIZE, GFP_KERNEL); } -void mte_free_tag_storage(char *storage) +void mte_free_tag_buf(void *buf) { - kfree(storage); + kfree(buf); } -int mte_save_tags(struct page *page) +int mte_save_page_tags_by_swp_entry(struct page *page) { - void *tag_storage, *ret; + void *tags, *ret; if (!page_mte_tagged(page)) return 0; - tag_storage = mte_allocate_tag_storage(); - if (!tag_storage) + tags = mte_allocate_tag_buf(); + if (!tags) return -ENOMEM; - mte_save_page_tags(page_address(page), tag_storage); + mte_copy_page_tags_to_buf(page_address(page), tags); /* lookup the swap entry.val from the page */ - ret = xa_store(&mte_pages, page_swap_entry(page).val, tag_storage, + ret = xa_store(&tags_by_swp_entry, page_swap_entry(page).val, tags, GFP_KERNEL); if (WARN(xa_is_err(ret), "Failed to store MTE tags")) { - mte_free_tag_storage(tag_storage); + mte_free_tag_buf(tags); return xa_err(ret); } else if (ret) { /* Entry is being replaced, free the old entry */ - mte_free_tag_storage(ret); + mte_free_tag_buf(ret); } return 0; } -void mte_restore_tags(swp_entry_t entry, struct page *page) +void mte_restore_page_tags_by_swp_entry(swp_entry_t entry, struct page *page) { - void *tags = xa_load(&mte_pages, entry.val); + void *tags = xa_load(&tags_by_swp_entry, entry.val); if (!tags) return; if (try_page_mte_tagging(page)) { - mte_restore_page_tags(page_address(page), tags); + mte_copy_page_tags_from_buf(page_address(page), tags); set_page_mte_tagged(page); } } -void mte_invalidate_tags(int type, pgoff_t offset) +void mte_invalidate_tags_by_swp_entry(int type, pgoff_t offset) { swp_entry_t entry = swp_entry(type, offset); - void *tags = xa_erase(&mte_pages, entry.val); + void *tags = xa_erase(&tags_by_swp_entry, entry.val); - mte_free_tag_storage(tags); + mte_free_tag_buf(tags); } -void mte_invalidate_tags_area(int 
type) +void mte_invalidate_tags_area_by_swp_entry(int type) { swp_entry_t entry = swp_entry(type, 0); swp_entry_t last_entry = swp_entry(type + 1, 0); void *tags; - XA_STATE(xa_state, &mte_pages, entry.val); + XA_STATE(xa_state, &tags_by_swp_entry, entry.val); - xa_lock(&mte_pages); + xa_lock(&tags_by_swp_entry); xas_for_each(&xa_state, tags, last_entry.val - 1) { - __xa_erase(&mte_pages, xa_state.xa_index); - mte_free_tag_storage(tags); + __xa_erase(&tags_by_swp_entry, xa_state.xa_index); + mte_free_tag_buf(tags); } - xa_unlock(&mte_pages); + xa_unlock(&tags_by_swp_entry); } From patchwork Thu Jan 25 16:42:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531402 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C9832C48260 for ; Thu, 25 Jan 2024 16:45:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=W7+gIMsFgrJejpl7si0nqhJOqEW6n0X3fwkMgEIX5d4=; b=kauHH+Jn6bvDrA AGkzMfd7m1FmWGWAB489iOtU74r8rlBMxFrMcGdmuJtyB/wqUn+Tf+stLeCRvbIck6ihXup2RnWFU enewn+m5qyQMzs8ZfXE31eFULY8SFLqZGot9L4jT4pBIA0l6++FknA1e+xH3RkKR9E+1c+OvEpTDM KS6FMFJ/cjiSeycwljSSw8fF+17neY/z5oWwEuVH2+ZDPp0NuD8bSMQ1wZhadl7erH+cxJbgdN/JH teNdGmVCE38SpCcSeC40aaPF+k/3n4c0C9uOUwVamFJu3/mo98H204EAfGVA4pZQGQbYJcyKfY/QG 
0ELveRzjxVP+gj8vigLQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2rC-00000000twE-1h2N; Thu, 25 Jan 2024 16:45:38 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2q8-00000000t2D-0u4i for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:44:33 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 97188165C; Thu, 25 Jan 2024 08:45:15 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id A9C433F5A1; Thu, 25 Jan 2024 08:44:25 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 18/35] arm64: mte: Rename __GFP_ZEROTAGS to __GFP_TAGGED Date: Thu, 25 Jan 2024 16:42:39 +0000 Message-Id: <20240125164256.4147-19-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> 
MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084432_413311_7BDA7FA4 X-CRM114-Status: GOOD ( 18.13 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org __GFP_ZEROTAGS is used to instruct the page allocator to zero the tags at the same time as the physical frame is zeroed. The name can be slightly misleading, because it doesn't mean that the code will zero the tags unconditionally, but that the tags will be zeroed if and only if the physical frame is also zeroed (either __GFP_ZERO is set or init_on_alloc is 1). Rename it to __GFP_TAGGED, in preparation for it to be used by the page allocator to recognize when an allocation is tagged (has metadata). Signed-off-by: Alexandru Elisei --- arch/arm64/mm/fault.c | 2 +- include/linux/gfp_types.h | 6 +++--- include/trace/events/mmflags.h | 2 +- mm/page_alloc.c | 2 +- mm/shmem.c | 2 +- 5 files changed, 7 insertions(+), 7 deletions(-) diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 4d3f0a870ad8..c022e473c17c 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -944,7 +944,7 @@ NOKPROBE_SYMBOL(do_debug_exception); gfp_t arch_calc_vma_gfp(struct vm_area_struct *vma, gfp_t gfp) { if (vma->vm_flags & VM_MTE) - return __GFP_ZEROTAGS; + return __GFP_TAGGED; return 0; } diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h index 1b6053da8754..f638353ebdc7 100644 --- a/include/linux/gfp_types.h +++ b/include/linux/gfp_types.h @@ -45,7 +45,7 @@ typedef unsigned int __bitwise gfp_t; #define ___GFP_HARDWALL 0x100000u #define ___GFP_THISNODE 0x200000u #define ___GFP_ACCOUNT 0x400000u -#define ___GFP_ZEROTAGS 0x800000u +#define ___GFP_TAGGED 0x800000u #ifdef CONFIG_KASAN_HW_TAGS 
#define ___GFP_SKIP_ZERO 0x1000000u #define ___GFP_SKIP_KASAN 0x2000000u @@ -226,7 +226,7 @@ typedef unsigned int __bitwise gfp_t; * * %__GFP_ZERO returns a zeroed page on success. * - * %__GFP_ZEROTAGS zeroes memory tags at allocation time if the memory itself + * %__GFP_TAGGED zeroes memory tags at allocation time if the memory itself * is being zeroed (either via __GFP_ZERO or via init_on_alloc, provided that * __GFP_SKIP_ZERO is not set). This flag is intended for optimization: setting * memory tags at the same time as zeroing memory has minimal additional @@ -241,7 +241,7 @@ typedef unsigned int __bitwise gfp_t; #define __GFP_NOWARN ((__force gfp_t)___GFP_NOWARN) #define __GFP_COMP ((__force gfp_t)___GFP_COMP) #define __GFP_ZERO ((__force gfp_t)___GFP_ZERO) -#define __GFP_ZEROTAGS ((__force gfp_t)___GFP_ZEROTAGS) +#define __GFP_TAGGED ((__force gfp_t)___GFP_TAGGED) #define __GFP_SKIP_ZERO ((__force gfp_t)___GFP_SKIP_ZERO) #define __GFP_SKIP_KASAN ((__force gfp_t)___GFP_SKIP_KASAN) diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h index d801409b33cf..6ca0d5ed46c0 100644 --- a/include/trace/events/mmflags.h +++ b/include/trace/events/mmflags.h @@ -50,7 +50,7 @@ gfpflag_string(__GFP_RECLAIM), \ gfpflag_string(__GFP_DIRECT_RECLAIM), \ gfpflag_string(__GFP_KSWAPD_RECLAIM), \ - gfpflag_string(__GFP_ZEROTAGS) + gfpflag_string(__GFP_TAGGED) #ifdef CONFIG_KASAN_HW_TAGS #define __def_gfpflag_names_kasan , \ diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 502ee3eb8583..0a0118612a13 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1480,7 +1480,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order, { bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) && !should_skip_init(gfp_flags); - bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS); + bool zero_tags = init && (gfp_flags & __GFP_TAGGED); int i; set_page_private(page, 0); diff --git a/mm/shmem.c b/mm/shmem.c index 621fabc3b8c6..3e28357b0a40 100644 
--- a/mm/shmem.c +++ b/mm/shmem.c @@ -1585,7 +1585,7 @@ static struct folio *shmem_swapin_cluster(swp_entry_t swap, gfp_t gfp, */ static gfp_t limit_gfp_mask(gfp_t huge_gfp, gfp_t limit_gfp) { - gfp_t allowflags = __GFP_IO | __GFP_FS | __GFP_RECLAIM | __GFP_ZEROTAGS; + gfp_t allowflags = __GFP_IO | __GFP_FS | __GFP_RECLAIM | __GFP_TAGGED; gfp_t denyflags = __GFP_NOWARN | __GFP_NORETRY; gfp_t zoneflags = limit_gfp & GFP_ZONEMASK; gfp_t result = huge_gfp & ~(allowflags | GFP_ZONEMASK); From patchwork Thu Jan 25 16:42:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531403 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E70A1C47422 for ; Thu, 25 Jan 2024 16:45:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=hjJPwJfPW7GLdFoiNrY2Xq1H5X1opphNkiaKOtPX7kk=; b=p3GmeNwYLyNzOa M1/RezBAqLtc2M5wbRiUWAAz2eyuDWHsP1OYQVjKRtj33vPwdC2XFG3p2bKn6Oj2u/DbCj471Yu9x Kj0FYAB8UqPfcphIp/9fP4rGLYhLHRmWzMRdQHF9pT6sXsP1wC0cX5stqRmziAhoWiEOZatcyOTue CioItARpMz9lWjY8silXLpN0CkQfoPp4rkWI2dfKmWGAGO7vKuQYoFAl/cpEVkMKiETHE1UhxrwW9 3h60ZPV2kYuMQaAfr/Z1Kv9W7BxorTtkq7D9e295sasxZS2DYkC86HSFvD9L/T00Hbook7P4rQWuG tZVPTV3Tm4Qw7oCIsqSw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by 
bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2rH-00000000u1E-0eFs; Thu, 25 Jan 2024 16:45:43 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qE-00000000t6e-3dhZ for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:44:40 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 607B01682; Thu, 25 Jan 2024 08:45:21 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 711BE3F5A1; Thu, 25 Jan 2024 08:44:31 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 19/35] arm64: mte: Discover tag storage memory Date: Thu, 25 Jan 2024 16:42:40 +0000 Message-Id: <20240125164256.4147-20-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 
X-CRM114-CacheID: sfid-20240125_084439_099765_417003D4 X-CRM114-Status: GOOD ( 33.18 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Allow the kernel to read the base address, size, block size and associated memory node for tag storage from the device tree blob. A tag storage region is the smallest contiguous memory region that holds all the tags for the associated contiguous region of taggable memory. For example, for 32GB of contiguous tagged memory the corresponding tag storage region is exactly 1GB of contiguous memory, not two adjacent 512MB tag storage regions, nor one 2GB tag storage region. Tag storage is described as reserved memory; future patches will teach the kernel how to make use of it for data (non-tagged) allocations. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * Reworked from rfc v2 patch #11 ("arm64: mte: Reserve tag storage memory"). * Added device tree schema (Rob Herring) * Tag storage memory is now described in the "reserved-memory" node (Rob Herring). 
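For illustration only (not part of the patch): the 32GB to 1GB figure in the commit message follows from MTE keeping one 4-bit tag per 16-byte granule, i.e. one byte of tag storage for every 32 bytes of tagged memory. A minimal sketch of that arithmetic, where MEM_BYTES_PER_TAG_BYTE and tag_storage_bytes() are hypothetical names and only MTE_GRANULE_SIZE and MTE_TAG_SIZE_BITS come from the series:

```c
#include <stdint.h>

/* Values as used by arm64 MTE: 16-byte granules, 4 bits of tag per granule. */
#define MTE_GRANULE_SIZE	16
#define MTE_TAG_SIZE_BITS	4

/* Hypothetical helper macro: bytes of tagged memory covered by one byte of tags (32). */
#define MEM_BYTES_PER_TAG_BYTE	(MTE_GRANULE_SIZE * 8 / MTE_TAG_SIZE_BITS)

/* Hypothetical helper: size of the tag storage needed for a tagged memory region. */
static uint64_t tag_storage_bytes(uint64_t tagged_mem_bytes)
{
	return tagged_mem_bytes / MEM_BYTES_PER_TAG_BYTE;
}
```

Under the same rule, the 0x4000000 byte (64MB) region in the binding example would correspond to 2GB of tagged memory.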
.../reserved-memory/arm,mte-tag-storage.yaml | 78 +++++++++ arch/arm64/Kconfig | 12 ++ arch/arm64/include/asm/mte_tag_storage.h | 16 ++ arch/arm64/kernel/Makefile | 1 + arch/arm64/kernel/mte_tag_storage.c | 158 ++++++++++++++++++ arch/arm64/mm/init.c | 3 + 6 files changed, 268 insertions(+) create mode 100644 Documentation/devicetree/bindings/reserved-memory/arm,mte-tag-storage.yaml create mode 100644 arch/arm64/include/asm/mte_tag_storage.h create mode 100644 arch/arm64/kernel/mte_tag_storage.c diff --git a/Documentation/devicetree/bindings/reserved-memory/arm,mte-tag-storage.yaml b/Documentation/devicetree/bindings/reserved-memory/arm,mte-tag-storage.yaml new file mode 100644 index 000000000000..a99aaa1e8b6e --- /dev/null +++ b/Documentation/devicetree/bindings/reserved-memory/arm,mte-tag-storage.yaml @@ -0,0 +1,78 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/reserved-memory/arm,mte-tag-storage.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Tag storage memory for Memory Tagging Extension + +description: | + Description of the tag storage memory region that Linux can use to store + data when the associated memory is not tagged. + + The reserved memory described by the node must also be described by a + standalone 'memory' node. + +maintainers: + - Alexandru Elisei + +allOf: + - $ref: reserved-memory.yaml + +properties: + compatible: + const: arm,mte-tag-storage + + reg: + description: | + Specifies the memory region that MTE uses for tag storage. The size of the + region must be equal to the size needed to store all the tags for the + associated tagged memory. + + block-size: + description: | + Specifies the minimum multiple of 4K bytes of tag storage where all the + tags stored in the block correspond to a contiguous memory region. This + is needed for platforms where the memory controller interleaves tag + writes to memory. 
+ + For example, if the memory controller interleaves tag writes for 256KB + of contiguous memory across 8KB of tag storage (2-way interleave), then + the correct value for 'block-size' is 0x2000. + + This value is a hardware property, independent of the selected kernel page + size. + $ref: /schemas/types.yaml#/definitions/uint32 + + tagged-memory: + description: | + Specifies the memory node, as a phandle, for which all the tags are + stored in the tag storage region. + + The memory node must describe one contiguous memory region (i.e., the + 'ranges' property of the memory node must have exactly one entry). + $ref: /schemas/types.yaml#/definitions/phandle + +unevaluatedProperties: false + +required: + - compatible + - reg + - block-size + - tagged-memory + - reusable + +examples: + - | + reserved-memory { + #address-cells = <2>; + #size-cells = <2>; + + tags0: tag-storage@8f8000000 { + compatible = "arm,mte-tag-storage"; + reg = <0x08 0xf8000000 0x00 0x4000000>; + block-size = <0x1000>; + tagged-memory = <&memory0>; + reusable; + }; + }; diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index aa7c1d435139..92d97930b56e 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -2082,6 +2082,18 @@ config ARM64_MTE Documentation/arch/arm64/memory-tagging-extension.rst. +if ARM64_MTE +config ARM64_MTE_TAG_STORAGE + bool + help + Adds support for dynamic management of the memory used by the hardware + for storing MTE tags. This memory, unlike normal memory, cannot be + tagged. When it is used to store tags for another memory location it + cannot be used for any type of allocation. 
+ + If unsure, say N +endif # ARM64_MTE + endmenu # "ARMv8.5 architectural features" menu "ARMv8.7 architectural features" diff --git a/arch/arm64/include/asm/mte_tag_storage.h b/arch/arm64/include/asm/mte_tag_storage.h new file mode 100644 index 000000000000..3c2cd29e053e --- /dev/null +++ b/arch/arm64/include/asm/mte_tag_storage.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2023 ARM Ltd. + */ +#ifndef __ASM_MTE_TAG_STORAGE_H +#define __ASM_MTE_TAG_STORAGE_H + +#ifdef CONFIG_ARM64_MTE_TAG_STORAGE +void mte_init_tag_storage(void); +#else +static inline void mte_init_tag_storage(void) +{ +} +#endif /* CONFIG_ARM64_MTE_TAG_STORAGE */ + +#endif /* __ASM_MTE_TAG_STORAGE_H */ diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile index e5d03a7039b4..89c28b538908 100644 --- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -70,6 +70,7 @@ obj-$(CONFIG_CRASH_CORE) += crash_core.o obj-$(CONFIG_ARM_SDE_INTERFACE) += sdei.o obj-$(CONFIG_ARM64_PTR_AUTH) += pointer_auth.o obj-$(CONFIG_ARM64_MTE) += mte.o +obj-$(CONFIG_ARM64_MTE_TAG_STORAGE) += mte_tag_storage.o obj-y += vdso-wrap.o obj-$(CONFIG_COMPAT_VDSO) += vdso32-wrap.o obj-$(CONFIG_UNWIND_PATCH_PAC_INTO_SCS) += patch-scs.o diff --git a/arch/arm64/kernel/mte_tag_storage.c b/arch/arm64/kernel/mte_tag_storage.c new file mode 100644 index 000000000000..2f32265d8ad8 --- /dev/null +++ b/arch/arm64/kernel/mte_tag_storage.c @@ -0,0 +1,158 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Support for dynamic tag storage. + * + * Copyright (C) 2023 ARM Ltd. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +struct tag_region { + struct range mem_range; /* Memory associated with the tag storage, in PFNs. */ + struct range tag_range; /* Tag storage memory, in PFNs. */ + u32 block_size_pages; /* Tag block size, in pages. */ + phandle mem_phandle; /* phandle for the associated memory node. 
*/ +}; + +#define MAX_TAG_REGIONS 32 + +static struct tag_region tag_regions[MAX_TAG_REGIONS]; +static int num_tag_regions; + +static u32 __init get_block_size_pages(u32 block_size_bytes) +{ + u32 a = PAGE_SIZE; + u32 b = block_size_bytes; + u32 r; + + /* Find greatest common divisor using the Euclidean algorithm. */ + do { + r = a % b; + a = b; + b = r; + } while (b != 0); + + return PHYS_PFN(PAGE_SIZE * block_size_bytes / a); +} + +int __init tag_storage_probe(struct reserved_mem *rmem) +{ + struct tag_region *region; + u32 block_size_bytes; + int ret; + + if (num_tag_regions == MAX_TAG_REGIONS) { + pr_err("Exceeded maximum number of tag storage regions"); + goto out_err; + } + + region = &tag_regions[num_tag_regions]; + region->tag_range.start = PHYS_PFN(rmem->base); + region->tag_range.end = PHYS_PFN(rmem->base + rmem->size - 1); + + ret = of_flat_read_u32(rmem->fdt_node, "block-size", &block_size_bytes); + if (ret || block_size_bytes == 0) { + pr_err("Invalid or missing 'block-size' property"); + goto out_err; + } + + region->block_size_pages = get_block_size_pages(block_size_bytes); + if (range_len(&region->tag_range) % region->block_size_pages != 0) { + pr_err("Tag storage region size 0x%llx pages is not a multiple of block size 0x%x pages", + range_len(&region->tag_range), region->block_size_pages); + goto out_err; + } + + ret = of_flat_read_u32(rmem->fdt_node, "tagged-memory", &region->mem_phandle); + if (ret) { + pr_err("Invalid or missing 'tagged-memory' property"); + goto out_err; + } + + num_tag_regions++; + return 0; + +out_err: + num_tag_regions = 0; + return -EINVAL; +} +RESERVEDMEM_OF_DECLARE(tag_storage, "arm,mte-tag-storage", tag_storage_probe); + +static int __init mte_find_tagged_memory_regions(void) +{ + struct device_node *mem_dev; + struct tag_region *region; + struct range *mem_range; + const __be32 *reg; + u64 addr, size; + int i; + + for (i = 0; i < num_tag_regions; i++) { + region = &tag_regions[i]; + mem_range = &region->mem_range; + + mem_dev = 
of_find_node_by_phandle(region->mem_phandle); + if (!mem_dev) { + pr_err("Cannot find tagged memory node"); + goto out; + } + + reg = of_get_property(mem_dev, "reg", NULL); + if (!reg) { + pr_err("Invalid tagged memory node"); + goto out_put_mem; + } + + addr = of_translate_address(mem_dev, reg); + if (addr == OF_BAD_ADDR) { + pr_err("Invalid memory address"); + goto out_put_mem; + } + + size = of_read_number(reg + of_n_addr_cells(mem_dev), of_n_size_cells(mem_dev)); + if (!size) { + pr_err("Invalid memory size"); + goto out_put_mem; + } + + mem_range->start = PHYS_PFN(addr); + mem_range->end = PHYS_PFN(addr + size - 1); + + of_node_put(mem_dev); + } + + return 0; + +out_put_mem: + of_node_put(mem_dev); +out: + return -EINVAL; +} + +void __init mte_init_tag_storage(void) +{ + int ret; + + if (num_tag_regions == 0) + return; + + ret = mte_find_tagged_memory_regions(); + if (ret) + goto out_disabled; + + return; + +out_disabled: + num_tag_regions = 0; + pr_info("MTE tag storage region management disabled"); +} diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 74c1db8ce271..2ccc0c294a13 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -39,6 +39,7 @@ #include #include #include +#include #include #include #include @@ -386,6 +387,8 @@ void __init mem_init(void) /* this will put all unused low memory onto the freelists */ memblock_free_all(); + mte_init_tag_storage(); + /* * Check boundaries twice: Some fundamental inconsistencies can be * detected at build time already. 
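A note on get_block_size_pages() in the patch above, for illustration only: the function returns the least common multiple of PAGE_SIZE and 'block-size', expressed in pages, so that tag storage can be managed in whole pages for any kernel page size. A minimal userspace sketch of the same computation, with hypothetical helper names (gcd_u64(), block_size_in_pages()) and 64-bit arithmetic assumed so that the intermediate product cannot overflow for 32-bit inputs:

```c
#include <stdint.h>

/* Greatest common divisor, Euclidean algorithm. */
static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
	while (b != 0) {
		uint64_t r = a % b;

		a = b;
		b = r;
	}
	return a;
}

/*
 * Hypothetical helper: smallest number of pages that is also a whole number
 * of tag blocks, i.e. lcm(page_size, block_size_bytes) / page_size.
 * Both arguments must be non-zero (the probe function rejects a zero
 * 'block-size' before calling get_block_size_pages()).
 */
static uint64_t block_size_in_pages(uint64_t page_size, uint64_t block_size_bytes)
{
	uint64_t lcm = page_size / gcd_u64(page_size, block_size_bytes) * block_size_bytes;

	return lcm / page_size;
}
```

With 4K pages, the 0x2000 'block-size' from the binding description gives 2 pages; with 16K pages, a 0x1000 block rounds up to 1 page.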
From patchwork Thu Jan 25 16:42:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531404 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 65F13C47258 for ; Thu, 25 Jan 2024 16:46:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=18uzq84p+ZWLn+HheDr6Nc9do2BLdjkdGLE2ZR0PnZc=; b=NfnPDmWpdMaqVj IspbWdLawv2R9aGPW2kYlBI4T72fRalb5OvyrI2Pg0DxMcAZ0Hh+uVbBVpmlt06wUHuIrLRlkBPIk bfmQNCN2u0G0k3CT7hm+yVmDzerBxLZ7z7C2w2hZJTQMD3s/LIt4TQdHfaUWpe1QM9H5NqHyY29WU 4TGEPQtt2n13XsfndJjExSyly+GtTsr8VX4suvtD0aF0toikqiDPm8d+yY6RTqLv/IXLWMZArdvYL 1P11X2fZlNZjmoRN3QpTHFSu1SJDO6bexLiuiM+cFj7g5tFTqLutoDvPu6pgcdxLEPcIBVrB44eFF OjYItC0KjLeSqBNwyHKw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2rN-00000000u6v-1LpN; Thu, 25 Jan 2024 16:45:49 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qJ-00000000t9r-3F7v for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:44:45 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 294E21684; Thu, 25 
Jan 2024 08:45:27 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 3AF663F5A1; Thu, 25 Jan 2024 08:44:37 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 20/35] arm64: mte: Add tag storage memory to CMA Date: Thu, 25 Jan 2024 16:42:41 +0000 Message-Id: <20240125164256.4147-21-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084443_975334_B0B01D98 X-CRM114-Status: GOOD ( 27.59 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Add the MTE tag storage pages to CMA, which allows the page allocator to manage them like regular pages. 
The CMA migratetype lends the tag storage pages some very desirable properties: * They cannot be long-term pinned, meaning they should always be migratable. * The pages can be allocated explicitly by using their PFN (with alloc_cma_range()) when they are needed to store tags. Signed-off-by: Alexandru Elisei --- Changes since v2: * Reworked from rfc v2 patch #12 ("arm64: mte: Add tag storage pages to the MIGRATE_CMA migratetype"). * Tag storage memory is now added to the cma_areas array and will be managed like a regular CMA region (David Hildenbrand). * If a tag storage region spans multiple zones, CMA won't be able to activate the region. Split such regions into multiple tag storage regions (Hyesoo Yu). arch/arm64/Kconfig | 1 + arch/arm64/kernel/mte_tag_storage.c | 150 +++++++++++++++++++++++++++- 2 files changed, 150 insertions(+), 1 deletion(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 92d97930b56e..6f65e9005dc9 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -2085,6 +2085,7 @@ config ARM64_MTE if ARM64_MTE config ARM64_MTE_TAG_STORAGE bool + select CMA help Adds support for dynamic management of the memory used by the hardware for storing MTE tags. This memory, unlike normal memory, cannot be diff --git a/arch/arm64/kernel/mte_tag_storage.c b/arch/arm64/kernel/mte_tag_storage.c index 2f32265d8ad8..90b157132efa 100644 --- a/arch/arm64/kernel/mte_tag_storage.c +++ b/arch/arm64/kernel/mte_tag_storage.c @@ -5,6 +5,8 @@ * Copyright (C) 2023 ARM Ltd. */ +#include +#include #include #include #include @@ -22,6 +24,7 @@ struct tag_region { struct range tag_range; /* Tag storage memory, in PFNs. */ u32 block_size_pages; /* Tag block size, in pages. */ phandle mem_phandle; /* phandle for the associated memory node. 
*/ + struct cma *cma; /* CMA cookie */ }; #define MAX_TAG_REGIONS 32 @@ -139,9 +142,88 @@ static int __init mte_find_tagged_memory_regions(void) return -EINVAL; } +static void __init mte_split_tag_region(struct tag_region *region, unsigned long last_tag_pfn) +{ + struct tag_region *new_region; + unsigned long last_mem_pfn; + + new_region = &tag_regions[num_tag_regions]; + last_mem_pfn = region->mem_range.start + (last_tag_pfn - region->tag_range.start) * 32; + + new_region->mem_range.start = last_mem_pfn + 1; + new_region->mem_range.end = region->mem_range.end; + region->mem_range.end = last_mem_pfn; + + new_region->tag_range.start = last_tag_pfn + 1; + new_region->tag_range.end = region->tag_range.end; + region->tag_range.end = last_tag_pfn; + + new_region->block_size_pages = region->block_size_pages; + + num_tag_regions++; +} + +/* + * Split any tag region that spans multiple zones - CMA will fail if that + * happens. + */ +static int __init mte_split_tag_regions(void) +{ + struct tag_region *region; + struct range *tag_range; + struct zone *zone; + unsigned long pfn; + int i; + + for (i = 0; i < num_tag_regions; i++) { + region = &tag_regions[i]; + tag_range = &region->tag_range; + zone = page_zone(pfn_to_page(tag_range->start)); + + for (pfn = tag_range->start + 1; pfn <= tag_range->end; pfn++) { + if (page_zone(pfn_to_page(pfn)) == zone) + continue; + + if (WARN_ON_ONCE(pfn % region->block_size_pages)) + goto out_err; + + if (num_tag_regions == MAX_TAG_REGIONS) + goto out_err; + + mte_split_tag_region(&tag_regions[i], pfn - 1); + /* Move on to the next region. 
*/ + break; + } + } + + return 0; + +out_err: + pr_err("Error splitting tag storage region 0x%llx-0x%llx spanning multiple zones", + PFN_PHYS(tag_range->start), PFN_PHYS(tag_range->end + 1) - 1); + return -EINVAL; +} + void __init mte_init_tag_storage(void) { - int ret; + unsigned long long mem_end; + struct tag_region *region; + unsigned long pfn, order; + u64 start, end; + int i, j, ret; + + /* + * Tag storage memory requires that tag storage pages in use for data + * are always migratable when they need to be repurposed to store tags. + * If ARCH_KEEP_MEMBLOCK is enabled, kexec will not scan reserved + * memblocks when trying to find a suitable location for the kernel + * image. This means that kexec will not use tag storage pages for + * copying the kernel, and the pages will remain migratable. + * + * Add the check in case arm64 stops selecting ARCH_KEEP_MEMBLOCK by + * default. + */ + BUILD_BUG_ON(!IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)); if (num_tag_regions == 0) return; @@ -150,6 +232,72 @@ void __init mte_init_tag_storage(void) if (ret) goto out_disabled; + mem_end = PHYS_PFN(memblock_end_of_DRAM()); + + /* + * MTE is disabled, tag storage pages can be used like any other pages. + * The only restriction is that the pages cannot be used by kexec + * because the memory remains marked as reserved in the memblock + * allocator. + */ + if (!system_supports_mte()) { + for (i = 0; i < num_tag_regions; i++) { + start = tag_regions[i].tag_range.start; + end = tag_regions[i].tag_range.end; + + /* end is inclusive, mem_end is not */ + if (end >= mem_end) + end = mem_end - 1; + if (end < start) + continue; + for (pfn = start; pfn <= end; pfn++) + free_reserved_page(pfn_to_page(pfn)); + } + goto out_disabled; + } + + /* + * Check that tag storage is addressable by the kernel. + * cma_init_reserved_mem(), unlike cma_declare_contiguous_nid(), doesn't + * perform this check. 
+ */ + for (i = 0; i < num_tag_regions; i++) { + start = tag_regions[i].tag_range.start; + end = tag_regions[i].tag_range.end; + + if (end >= mem_end) { + pr_err("Tag region 0x%llx-0x%llx outside addressable memory", + PFN_PHYS(start), PFN_PHYS(end + 1) - 1); + goto out_disabled; + } + } + + ret = mte_split_tag_regions(); + if (ret) + goto out_disabled; + + for (i = 0; i < num_tag_regions; i++) { + region = &tag_regions[i]; + + /* Tag storage pages are managed in block_size_pages chunks. */ + if (is_power_of_2(region->block_size_pages)) + order = ilog2(region->block_size_pages); + else + order = 0; + + ret = cma_init_reserved_mem(PFN_PHYS(region->tag_range.start), + PFN_PHYS(range_len(&region->tag_range)), + order, NULL, &region->cma); + if (ret) { + for (j = 0; j < i; j++) + cma_remove_mem(&tag_regions[j].cma); + goto out_disabled; + } + + /* Keep pages reserved if activation fails. */ + cma_reserve_pages_on_error(region->cma); + } + return; out_disabled: From patchwork Thu Jan 25 16:42:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531482 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 82FC5C47258 for ; Thu, 25 Jan 2024 17:49:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; 
bh=sC8JmadykEqQNsicIIpK9lpCMfCYt+jKC1VtRtbNoF0=; b=qfmy4uf7YZvihC ddxJF5hcOLsv4haAoj7gELxD7Dkh/77+i6+UnIRHaDRJs/VCOuuFV9UKo+naEebWsJSj6hp4rXcta vj5jfkYWqxl7uu4PIQGXRRulv4KxvMfKioNKOrWw7VEutE3lm2HwMWxcw45IF5x1nRo9v5kv1Y6Uj L+vwDYVqTxLt/P7J+owKAp3KeFHqSgcSnzlTVIV6L6Xj73NM3S6qt6kiqV4fYy9efGKJuz+wCUwyn XIkFUisQ7qe0gBgq6DxqB9Ac1TIaabX5GngNc96UVTFgPjDG4MdwEbRXLDxXKpMY6ceo6W4edB2D5 XQds+Dy1HjLWy/BXPjVQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT3rG-00000001DDZ-3MYh; Thu, 25 Jan 2024 17:49:46 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qQ-00000000tDE-3Ett for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:44:52 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E894C1688; Thu, 25 Jan 2024 08:45:32 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 03CA73F5A1; Thu, 25 Jan 2024 08:44:42 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, 
linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 21/35] arm64: mte: Disable dynamic tag storage management if HW KASAN is enabled Date: Thu, 25 Jan 2024 16:42:42 +0000 Message-Id: <20240125164256.4147-22-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084450_890922_9342D8EB X-CRM114-Status: GOOD ( 10.48 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org To be able to reserve the tag storage associated with a tagged page requires that the tag storage can be migrated, if it's in use for data. The kernel allocates pages in non-preemptible contexts, which makes migration impossible. The only user of tagged pages in the kernel is HW KASAN, so don't use tag storage pages if HW KASAN is enabled. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * Expanded commit message (David Hildenbrand) arch/arm64/kernel/mte_tag_storage.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/arch/arm64/kernel/mte_tag_storage.c b/arch/arm64/kernel/mte_tag_storage.c index 90b157132efa..9a1a8a45171e 100644 --- a/arch/arm64/kernel/mte_tag_storage.c +++ b/arch/arm64/kernel/mte_tag_storage.c @@ -256,6 +256,16 @@ void __init mte_init_tag_storage(void) goto out_disabled; } + /* + * The kernel allocates memory in non-preemptible contexts, which makes + * migration impossible when reserving the associated tag storage. The + * only in-kernel user of tagged pages is HW KASAN. 
+ */ + if (kasan_hw_tags_enabled()) { + pr_info("KASAN HW tags incompatible with MTE tag storage management"); + goto out_disabled; + } + /* * Check that tag storage is addressable by the kernel. * cma_init_reserved_mem(), unlike cma_declare_contiguous_nid(), doesn't From patchwork Thu Jan 25 16:42:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531483 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9D9BCC48260 for ; Thu, 25 Jan 2024 17:49:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=JV/eEWFxBNg7JjuXT8XO3iRSfDhB6T9tyAE/Gde5iQI=; b=Wwn90rLHjEVlEE CoAB2PaDsHJD7fcuGfEgAeKdNOA/tmJcNuXysmr+3srgGx63uQK4OimfBfBWl/jYwjmQqHqDdFGxK It5BIMu3YIS0fpTh9GYm68BFc8U/qitmUEJKlYXL7epjgAbMAcx+Kf7Hpee+ENR2Fw5WwUd24fgAR ZqW2zOIW7CqFrrHE0dT5dfzt89bbHfms1GIBRAoqoYiLJ5FnUZs2E9Syg8D8H8+TnHIB8pEVjy9Le MV6EWa1W7stMfyYFhgqfQCtYBq1Wm8nyF8nvbjKSf+bNVIj/kfQ5jbZT29i11a3phmbvDNMosPVs+ 3IDZCkIXA7fkkWEQemKQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT3rH-00000001DDz-1ivn; Thu, 25 Jan 2024 17:49:47 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 
#2 (Red Hat Linux)) id 1rT2qV-00000000tFM-14t2 for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:44:56 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B1083168F; Thu, 25 Jan 2024 08:45:38 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C149A3F5A1; Thu, 25 Jan 2024 08:44:48 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 22/35] arm64: mte: Enable tag storage if CMA areas have been activated Date: Thu, 25 Jan 2024 16:42:43 +0000 Message-Id: <20240125164256.4147-23-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084455_448214_B47DC661 X-CRM114-Status: GOOD ( 18.02 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Before enabling MTE tag storage management, make sure that the CMA areas have been successfully activated. If a CMA area fails activation, the pages are kept as reserved. Reserved pages are never used by the page allocator. If this happens, the kernel would have to manage tag storage only for some of the memory, but not for all memory, and that would make the code unreasonably complicated. Choose to disable tag storage management altogether if a CMA area fails to be activated. Signed-off-by: Alexandru Elisei --- Changes since v2: * New patch. arch/arm64/include/asm/mte_tag_storage.h | 12 ++++++ arch/arm64/kernel/mte_tag_storage.c | 50 ++++++++++++++++++++++++ 2 files changed, 62 insertions(+) diff --git a/arch/arm64/include/asm/mte_tag_storage.h b/arch/arm64/include/asm/mte_tag_storage.h index 3c2cd29e053e..7b3f6bff8e6f 100644 --- a/arch/arm64/include/asm/mte_tag_storage.h +++ b/arch/arm64/include/asm/mte_tag_storage.h @@ -6,8 +6,20 @@ #define __ASM_MTE_TAG_STORAGE_H #ifdef CONFIG_ARM64_MTE_TAG_STORAGE + +DECLARE_STATIC_KEY_FALSE(tag_storage_enabled_key); + +static inline bool tag_storage_enabled(void) +{ + return static_branch_likely(&tag_storage_enabled_key); +} + void mte_init_tag_storage(void); #else +static inline bool tag_storage_enabled(void) +{ + return false; +} static inline void mte_init_tag_storage(void) { } diff --git a/arch/arm64/kernel/mte_tag_storage.c b/arch/arm64/kernel/mte_tag_storage.c index 9a1a8a45171e..d58c68b4a849 100644 --- a/arch/arm64/kernel/mte_tag_storage.c +++ b/arch/arm64/kernel/mte_tag_storage.c @@ -19,6 +19,8 @@ #include +__ro_after_init DEFINE_STATIC_KEY_FALSE(tag_storage_enabled_key); + struct tag_region { struct range mem_range; /* Memory associated with the tag storage, in PFNs. */ struct range tag_range; /* Tag storage memory, in PFNs. 
*/ @@ -314,3 +316,51 @@ void __init mte_init_tag_storage(void) num_tag_regions = 0; pr_info("MTE tag storage region management disabled"); } + +static int __init mte_enable_tag_storage(void) +{ + struct range *tag_range; + struct cma *cma; + int i, ret; + + if (num_tag_regions == 0) + return 0; + + for (i = 0; i < num_tag_regions; i++) { + tag_range = &tag_regions[i].tag_range; + cma = tag_regions[i].cma; + /* + * CMA will keep the pages as reserved when the region fails + * activation. + */ + if (PageReserved(pfn_to_page(tag_range->start))) + goto out_disabled; + } + + static_branch_enable(&tag_storage_enabled_key); + pr_info("MTE tag storage region management enabled"); + + return 0; + +out_disabled: + for (i = 0; i < num_tag_regions; i++) { + tag_range = &tag_regions[i].tag_range; + cma = tag_regions[i].cma; + + if (PageReserved(pfn_to_page(tag_range->start))) + continue; + + /* Try really hard to reserve the tag storage. */ + ret = cma_alloc(cma, range_len(tag_range), 8, true); + /* + * Tag storage is still in use for data, memory and/or tag + * corruption will ensue. 
+ */ + WARN_ON_ONCE(ret); + } + num_tag_regions = 0; + pr_info("MTE tag storage region management disabled"); + + return -EINVAL; +} +arch_initcall(mte_enable_tag_storage); From patchwork Thu Jan 25 16:42:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531407 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CE95BC47422 for ; Thu, 25 Jan 2024 16:48:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=OdD3z+xaPGogjYCjhq/3JwHOmsUqgWb/IdThGK6MIdE=; b=KhOBDMdHLG0fiT wQAHUl74M8c82dJ1OulgwbdeK+TSnmQAI78PceSr57Q+5dDruisz3vlTo4TnfXSxL+/rJkk3DvDUX Usww05k2V5wtltcG7AMIqZFn9taMSzCgBTKRtJVsQVJR7ExcpzzsDkIATMA2c2rtnITUKy97PefnR mveFCRkot1r+T6N9qQGsINhVGa148Y6spmGFFZ5YtWka0Y9jTuCrkEjW9s2qeEXRh/RPQStigDQUg FPCojNOeWpD1W30vpV47UtZgit8Nb4beFEMFSHbXl0odtRmiNMEQPvmrKC8prOsNFtF2RAeAP9u7F 95a55/EYvUBRyauBMaXQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2tc-00000000vIx-15Bi; Thu, 25 Jan 2024 16:48:08 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qb-00000000tIc-0TLC for linux-arm-kernel@lists.infradead.org; Thu, 25 
Jan 2024 16:45:06 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7B1D41691; Thu, 25 Jan 2024 08:45:44 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 8C0543F5A1; Thu, 25 Jan 2024 08:44:54 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 23/35] arm64: mte: Try to reserve tag storage in arch_alloc_page() Date: Thu, 25 Jan 2024 16:42:44 +0000 Message-Id: <20240125164256.4147-24-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084501_588936_F1E027B9 X-CRM114-Status: GOOD ( 29.64 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: 
linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Reserve tag storage for a page that is being allocated as tagged. This is a best effort approach, and failing to reserve tag storage is allowed. When all the associated tagged pages have been freed, return the tag storage pages back to the page allocator, where they can be used again for data allocations. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * Based on rfc v2 patch #16 ("arm64: mte: Manage tag storage on page allocation"). * Fixed calculation of the number of associated tag storage blocks (Hyesoo Yu). * Tag storage is reserved in arch_alloc_page() instead of arch_prep_new_page(). arch/arm64/include/asm/mte.h | 16 +- arch/arm64/include/asm/mte_tag_storage.h | 31 +++ arch/arm64/include/asm/page.h | 5 + arch/arm64/include/asm/pgtable.h | 19 ++ arch/arm64/kernel/mte_tag_storage.c | 234 +++++++++++++++++++++++ arch/arm64/mm/fault.c | 7 + fs/proc/page.c | 1 + include/linux/kernel-page-flags.h | 1 + include/linux/page-flags.h | 1 + include/trace/events/mmflags.h | 3 +- mm/huge_memory.c | 1 + 11 files changed, 316 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h index 8034695b3dd7..6457b7899207 100644 --- a/arch/arm64/include/asm/mte.h +++ b/arch/arm64/include/asm/mte.h @@ -40,12 +40,24 @@ void mte_free_tag_buf(void *buf); #ifdef CONFIG_ARM64_MTE /* track which pages have valid allocation tags */ -#define PG_mte_tagged PG_arch_2 +#define PG_mte_tagged PG_arch_2 /* simple lock to avoid multiple threads tagging the same page */ -#define PG_mte_lock PG_arch_3 +#define PG_mte_lock PG_arch_3 +/* Track if a tagged page has tag storage reserved */ +#define PG_tag_storage_reserved PG_arch_4 + +#ifdef CONFIG_ARM64_MTE_TAG_STORAGE +DECLARE_STATIC_KEY_FALSE(tag_storage_enabled_key); +extern bool page_tag_storage_reserved(struct page *page); +#endif static inline void set_page_mte_tagged(struct page *page) { +#ifdef 
CONFIG_ARM64_MTE_TAG_STORAGE + /* Open code tag_storage_enabled() */ + WARN_ON_ONCE(static_branch_likely(&tag_storage_enabled_key) && + !page_tag_storage_reserved(page)); +#endif /* * Ensure that the tags written prior to this function are visible * before the page flags update. diff --git a/arch/arm64/include/asm/mte_tag_storage.h b/arch/arm64/include/asm/mte_tag_storage.h index 7b3f6bff8e6f..09f1318d924e 100644 --- a/arch/arm64/include/asm/mte_tag_storage.h +++ b/arch/arm64/include/asm/mte_tag_storage.h @@ -5,6 +5,12 @@ #ifndef __ASM_MTE_TAG_STORAGE_H #define __ASM_MTE_TAG_STORAGE_H +#ifndef __ASSEMBLY__ + +#include + +#include + #ifdef CONFIG_ARM64_MTE_TAG_STORAGE DECLARE_STATIC_KEY_FALSE(tag_storage_enabled_key); @@ -15,6 +21,15 @@ static inline bool tag_storage_enabled(void) } void mte_init_tag_storage(void); + +static inline bool alloc_requires_tag_storage(gfp_t gfp) +{ + return gfp & __GFP_TAGGED; +} +int reserve_tag_storage(struct page *page, int order, gfp_t gfp); +void free_tag_storage(struct page *page, int order); + +bool page_tag_storage_reserved(struct page *page); #else static inline bool tag_storage_enabled(void) { @@ -23,6 +38,22 @@ static inline bool tag_storage_enabled(void) static inline void mte_init_tag_storage(void) { } +static inline bool alloc_requires_tag_storage(gfp_t gfp) +{ + return false; +} +static inline int reserve_tag_storage(struct page *page, int order, gfp_t gfp) +{ + return 0; +} +static inline void free_tag_storage(struct page *page, int order) +{ +} +static inline bool page_tag_storage_reserved(struct page *page) +{ + return true; +} #endif /* CONFIG_ARM64_MTE_TAG_STORAGE */ +#endif /* !__ASSEMBLY__ */ #endif /* __ASM_MTE_TAG_STORAGE_H */ diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h index 88bab032a493..3a656492f34a 100644 --- a/arch/arm64/include/asm/page.h +++ b/arch/arm64/include/asm/page.h @@ -35,6 +35,11 @@ void copy_highpage(struct page *to, struct page *from); void 
tag_clear_highpage(struct page *to); #define __HAVE_ARCH_TAG_CLEAR_HIGHPAGE +#ifdef CONFIG_ARM64_MTE_TAG_STORAGE +void arch_alloc_page(struct page *, int order, gfp_t gfp); +#define HAVE_ARCH_ALLOC_PAGE +#endif + #define clear_user_page(page, vaddr, pg) clear_page(page) #define copy_user_page(to, from, vaddr, pg) copy_page(to, from) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 2499cc4fa4f2..f30466199a9b 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -10,6 +10,7 @@ #include #include +#include #include #include #include @@ -1069,6 +1070,24 @@ static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio) mte_restore_page_tags_by_swp_entry(entry, &folio->page); } +#ifdef CONFIG_ARM64_MTE_TAG_STORAGE + +#define __HAVE_ARCH_FREE_PAGES_PREPARE +static inline void arch_free_pages_prepare(struct page *page, int order) +{ + if (tag_storage_enabled() && page_mte_tagged(page)) + free_tag_storage(page, order); +} + +#define __HAVE_ARCH_ALLOC_CMA +static inline bool arch_alloc_cma(gfp_t gfp_mask) +{ + if (tag_storage_enabled() && alloc_requires_tag_storage(gfp_mask)) + return false; + return true; +} + +#endif /* CONFIG_ARM64_MTE_TAG_STORAGE */ #endif /* CONFIG_ARM64_MTE */ #define __HAVE_ARCH_CALC_VMA_GFP diff --git a/arch/arm64/kernel/mte_tag_storage.c b/arch/arm64/kernel/mte_tag_storage.c index d58c68b4a849..762c7c803a70 100644 --- a/arch/arm64/kernel/mte_tag_storage.c +++ b/arch/arm64/kernel/mte_tag_storage.c @@ -34,6 +34,31 @@ struct tag_region { static struct tag_region tag_regions[MAX_TAG_REGIONS]; static int num_tag_regions; +/* + * A note on locking. Reserving tag storage takes the tag_blocks_lock mutex, + * because alloc_contig_range() might sleep. + * + * Freeing tag storage takes the xa_lock spinlock with interrupts disabled + * because pages can be freed from non-preemptible contexts, including from an + * interrupt handler. 
+ * + * Because tag storage can be freed from interrupt contexts, the xarray is + * defined with the XA_FLAGS_LOCK_IRQ flag to disable interrupts when calling + * xa_store(). This is done to prevent a deadlock with free_tag_storage() being + * called from an interrupt raised before xa_store() releases the xa_lock. + * + * All of the above means that reserve_tag_storage() cannot run concurrently + * with itself (no concurrent insertions), but it can run at the same time as + * free_tag_storage(). The first thing that reserve_tag_storage() does after + * taking the mutex is increase the refcount on all present tag storage blocks + * with the xa_lock held, to serialize against freeing the blocks. This is an + * optimization to avoid taking and releasing the xa_lock after each iteration + * if the refcount operation was moved inside the loop, where it would have had + * to be executed for each block. + */ +static DEFINE_XARRAY_FLAGS(tag_blocks_reserved, XA_FLAGS_LOCK_IRQ); +static DEFINE_MUTEX(tag_blocks_lock); + static u32 __init get_block_size_pages(u32 block_size_bytes) { u32 a = PAGE_SIZE; @@ -364,3 +389,212 @@ static int __init mte_enable_tag_storage(void) return -EINVAL; } arch_initcall(mte_enable_tag_storage); + +static void page_set_tag_storage_reserved(struct page *page, int order) +{ + int i; + + for (i = 0; i < (1 << order); i++) + set_bit(PG_tag_storage_reserved, &(page + i)->flags); +} + +static void block_ref_add(unsigned long block, struct tag_region *region, int order) +{ + int count; + + count = min(1u << order, 32 * region->block_size_pages); + page_ref_add(pfn_to_page(block), count); +} + +static int block_ref_sub_return(unsigned long block, struct tag_region *region, int order) +{ + int count; + + count = min(1u << order, 32 * region->block_size_pages); + return page_ref_sub_return(pfn_to_page(block), count); +} + +static bool tag_storage_block_is_reserved(unsigned long block) +{ + return xa_load(&tag_blocks_reserved, block) != NULL; +} + +static 
int tag_storage_reserve_block(unsigned long block, struct tag_region *region, int order) +{ + int ret; + + ret = xa_err(xa_store(&tag_blocks_reserved, block, pfn_to_page(block), GFP_KERNEL)); + if (!ret) + block_ref_add(block, region, order); + + return ret; +} + +static int order_to_num_blocks(int order, u32 block_size_pages) +{ + int num_tag_storage_pages = max((1 << order) / 32, 1); + + return DIV_ROUND_UP(num_tag_storage_pages, block_size_pages); +} + +static int tag_storage_find_block_in_region(struct page *page, unsigned long *blockp, + struct tag_region *region) +{ + struct range *tag_range = &region->tag_range; + struct range *mem_range = &region->mem_range; + u64 page_pfn = page_to_pfn(page); + u64 block, block_offset; + + if (!(mem_range->start <= page_pfn && page_pfn <= mem_range->end)) + return -ERANGE; + + block_offset = (page_pfn - mem_range->start) / 32; + block = tag_range->start + rounddown(block_offset, region->block_size_pages); + + if (block + region->block_size_pages - 1 > tag_range->end) { + pr_err("Block 0x%llx-0x%llx is outside tag region 0x%llx-0x%llx\n", + PFN_PHYS(block), PFN_PHYS(block + region->block_size_pages + 1) - 1, + PFN_PHYS(tag_range->start), PFN_PHYS(tag_range->end + 1) - 1); + return -ERANGE; + } + *blockp = block; + + return 0; + +} + +static int tag_storage_find_block(struct page *page, unsigned long *block, + struct tag_region **region) +{ + int i, ret; + + for (i = 0; i < num_tag_regions; i++) { + ret = tag_storage_find_block_in_region(page, block, &tag_regions[i]); + if (ret == 0) { + *region = &tag_regions[i]; + return 0; + } + } + + return -EINVAL; +} + +bool page_tag_storage_reserved(struct page *page) +{ + return test_bit(PG_tag_storage_reserved, &page->flags); +} + +int reserve_tag_storage(struct page *page, int order, gfp_t gfp) +{ + unsigned long start_block, end_block; + struct tag_region *region; + unsigned long block; + unsigned long flags; + int ret = 0; + + VM_WARN_ON_ONCE(!preemptible()); + + if 
(page_tag_storage_reserved(page)) + return 0; + + /* + * __alloc_contig_migrate_range() ignores gfp when allocating the + * destination page for migration. Regardless, massage gfp flags and + * remove __GFP_TAGGED to avoid recursion in case gfp stops being + * ignored. + */ + gfp &= ~__GFP_TAGGED; + if (!(gfp & __GFP_NORETRY)) + gfp |= __GFP_RETRY_MAYFAIL; + + ret = tag_storage_find_block(page, &start_block, &region); + if (WARN_ONCE(ret, "Missing tag storage block for pfn 0x%lx", page_to_pfn(page))) + return -EINVAL; + end_block = start_block + order_to_num_blocks(order, region->block_size_pages); + + mutex_lock(&tag_blocks_lock); + + /* Check again, this time with the lock held. */ + if (page_tag_storage_reserved(page)) + goto out_unlock; + + /* Make sure existing entries are not freed from under our feet. */ + xa_lock_irqsave(&tag_blocks_reserved, flags); + for (block = start_block; block < end_block; block += region->block_size_pages) { + if (tag_storage_block_is_reserved(block)) + block_ref_add(block, region, order); + } + xa_unlock_irqrestore(&tag_blocks_reserved, flags); + + for (block = start_block; block < end_block; block += region->block_size_pages) { + /* Refcount incremented above. */ + if (tag_storage_block_is_reserved(block)) + continue; + + ret = cma_alloc_range(region->cma, block, region->block_size_pages, 3, gfp); + /* Should never happen. 
*/ + VM_WARN_ON_ONCE(ret == -EEXIST); + if (ret) + goto out_error; + + ret = tag_storage_reserve_block(block, region, order); + if (ret) { + cma_release(region->cma, pfn_to_page(block), region->block_size_pages); + goto out_error; + } + } + + page_set_tag_storage_reserved(page, order); +out_unlock: + mutex_unlock(&tag_blocks_lock); + + return 0; + +out_error: + xa_lock_irqsave(&tag_blocks_reserved, flags); + for (block = start_block; block < end_block; block += region->block_size_pages) { + if (tag_storage_block_is_reserved(block) && + block_ref_sub_return(block, region, order) == 1) { + __xa_erase(&tag_blocks_reserved, block); + cma_release(region->cma, pfn_to_page(block), region->block_size_pages); + } + } + xa_unlock_irqrestore(&tag_blocks_reserved, flags); + + mutex_unlock(&tag_blocks_lock); + + return ret; +} + +void free_tag_storage(struct page *page, int order) +{ + unsigned long block, start_block, end_block; + struct tag_region *region; + unsigned long flags; + int ret; + + ret = tag_storage_find_block(page, &start_block, &region); + if (WARN_ONCE(ret, "Missing tag storage block for pfn 0x%lx", page_to_pfn(page))) + return; + + end_block = start_block + order_to_num_blocks(order, region->block_size_pages); + + xa_lock_irqsave(&tag_blocks_reserved, flags); + for (block = start_block; block < end_block; block += region->block_size_pages) { + if (WARN_ONCE(!tag_storage_block_is_reserved(block), + "Block 0x%lx is not reserved for pfn 0x%lx", block, page_to_pfn(page))) + continue; + + if (block_ref_sub_return(block, region, order) == 1) { + __xa_erase(&tag_blocks_reserved, block); + cma_release(region->cma, pfn_to_page(block), region->block_size_pages); + } + } + xa_unlock_irqrestore(&tag_blocks_reserved, flags); +} + +void arch_alloc_page(struct page *page, int order, gfp_t gfp) +{ + if (tag_storage_enabled() && alloc_requires_tag_storage(gfp)) + reserve_tag_storage(page, order, gfp); +} diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 
c022e473c17c..1ffaeccecda2 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -37,6 +37,7 @@ #include #include #include +#include #include #include #include @@ -950,6 +951,12 @@ gfp_t arch_calc_vma_gfp(struct vm_area_struct *vma, gfp_t gfp) void tag_clear_highpage(struct page *page) { + if (tag_storage_enabled() && !page_tag_storage_reserved(page)) { + /* Don't zero the tags if tag storage is not reserved */ + clear_page(page_address(page)); + return; + } + /* Newly allocated page, shouldn't have been tagged yet */ WARN_ON_ONCE(!try_page_mte_tagging(page)); mte_zero_clear_page_tags(page_address(page)); diff --git a/fs/proc/page.c b/fs/proc/page.c index 195b077c0fac..e7eb584a9234 100644 --- a/fs/proc/page.c +++ b/fs/proc/page.c @@ -221,6 +221,7 @@ u64 stable_page_flags(struct page *page) #ifdef CONFIG_ARCH_USES_PG_ARCH_X u |= kpf_copy_bit(k, KPF_ARCH_2, PG_arch_2); u |= kpf_copy_bit(k, KPF_ARCH_3, PG_arch_3); + u |= kpf_copy_bit(k, KPF_ARCH_4, PG_arch_4); #endif return u; diff --git a/include/linux/kernel-page-flags.h b/include/linux/kernel-page-flags.h index 859f4b0c1b2b..4a0d719ffdd4 100644 --- a/include/linux/kernel-page-flags.h +++ b/include/linux/kernel-page-flags.h @@ -19,5 +19,6 @@ #define KPF_SOFTDIRTY 40 #define KPF_ARCH_2 41 #define KPF_ARCH_3 42 +#define KPF_ARCH_4 43 #endif /* LINUX_KERNEL_PAGE_FLAGS_H */ diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index b7237bce7446..03f03e6d735e 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -135,6 +135,7 @@ enum pageflags { #ifdef CONFIG_ARCH_USES_PG_ARCH_X PG_arch_2, PG_arch_3, + PG_arch_4, #endif __NR_PAGEFLAGS, diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h index 6ca0d5ed46c0..ba962fd10a2c 100644 --- a/include/trace/events/mmflags.h +++ b/include/trace/events/mmflags.h @@ -125,7 +125,8 @@ IF_HAVE_PG_HWPOISON(hwpoison) \ IF_HAVE_PG_IDLE(idle) \ IF_HAVE_PG_IDLE(young) \ IF_HAVE_PG_ARCH_X(arch_2) \ 
-IF_HAVE_PG_ARCH_X(arch_3) +IF_HAVE_PG_ARCH_X(arch_3) \ +IF_HAVE_PG_ARCH_X(arch_4) #define show_page_flags(flags) \ (flags) ? __print_flags(flags, "|", \ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 2bad63a7ec16..47932539cc50 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2804,6 +2804,7 @@ static void __split_huge_page_tail(struct folio *folio, int tail, #ifdef CONFIG_ARCH_USES_PG_ARCH_X (1L << PG_arch_2) | (1L << PG_arch_3) | + (1L << PG_arch_4) | #endif (1L << PG_dirty) | LRU_GEN_MASK | LRU_REFS_MASK)); From patchwork Thu Jan 25 16:42:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531405 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 93AECC47258 for ; Thu, 25 Jan 2024 16:48:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=xqw6UhbljTdYIfvJhPBPnryFZpUvrglws8HX5E8wg1w=; b=kEf9G14CGl9/RU ClLNX9Q+cN+MIpNqEJl1ueFcSZCLqe6P1qmzlNxoewDvAJtadTMkevhBn1uFiTrcFzt1i2SzpTmI9 DH7f7HpoFO6ObwSBQ96CWnZdcdECzBUgi3UJbPxo62aqX91+7XPG8NflGQR6MR5RzCP4vAgwHsnuV WyfhKAVReYPJSVERQFFymD3zGS6aGwcnRJFfXei/SmcyYBID3cViHOMw8VoTXX4IqGT/NBNCM7Ivg vEZJVyan5CluvgOqqylELGk6UDkOMCH+bHzPaTv7vpBxtJcGXx0CcK0wvgFqW9j26C1K9E0LnqJHU 9GqZr+wwx9Dt2fQj2KlQ==; Received: from localhost 
([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2td-00000000vJd-0Jiu; Thu, 25 Jan 2024 16:48:09 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qg-00000000tNa-3VXT for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:45:13 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 43F741692; Thu, 25 Jan 2024 08:45:50 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 54CF73F5A1; Thu, 25 Jan 2024 08:45:00 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 24/35] arm64: mte: Perform CMOs for tag blocks Date: Thu, 25 Jan 2024 16:42:45 +0000 Message-Id: <20240125164256.4147-25-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 
0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084507_093749_239A434A X-CRM114-Status: GOOD ( 13.72 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Make sure the contents of the tag storage block are not corrupted by performing: 1. A tag dcache inval when the associated tagged pages are freed, to avoid dirty tag cache lines being evicted and corrupting the tag storage block when it's being used to store data. 2. A data cache inval when the tag storage block is being reserved, to ensure that no dirty data cache lines are present, which would trigger a writeback that could corrupt the tags stored in the block. Signed-off-by: Alexandru Elisei --- arch/arm64/include/asm/assembler.h | 10 ++++++++++ arch/arm64/include/asm/mte_tag_storage.h | 2 ++ arch/arm64/kernel/mte_tag_storage.c | 11 +++++++++++ arch/arm64/lib/mte.S | 16 ++++++++++++++++ 4 files changed, 39 insertions(+) diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index 513787e43329..65fe88cce72b 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -310,6 +310,16 @@ alternative_cb_end lsl \reg, \reg, \tmp // actual cache line size .endm +/* + * tcache_line_size - get the safe tag cache line size across all CPUs + */ + .macro tcache_line_size, reg, tmp + read_ctr \tmp + ubfm \tmp, \tmp, #32, #37 // tag cache line size encoding + mov \reg, #4 // bytes per word + lsl \reg, \reg, \tmp // actual tag cache line size + .endm + /* * raw_icache_line_size - get the minimum I-cache line size on this CPU * from the CTR register. 
diff --git a/arch/arm64/include/asm/mte_tag_storage.h b/arch/arm64/include/asm/mte_tag_storage.h index 09f1318d924e..423b19e0cc46 100644 --- a/arch/arm64/include/asm/mte_tag_storage.h +++ b/arch/arm64/include/asm/mte_tag_storage.h @@ -11,6 +11,8 @@ #include +extern void dcache_inval_tags_poc(unsigned long start, unsigned long end); + #ifdef CONFIG_ARM64_MTE_TAG_STORAGE DECLARE_STATIC_KEY_FALSE(tag_storage_enabled_key); diff --git a/arch/arm64/kernel/mte_tag_storage.c b/arch/arm64/kernel/mte_tag_storage.c index 762c7c803a70..8c347f4855e4 100644 --- a/arch/arm64/kernel/mte_tag_storage.c +++ b/arch/arm64/kernel/mte_tag_storage.c @@ -17,6 +17,7 @@ #include #include +#include #include __ro_after_init DEFINE_STATIC_KEY_FALSE(tag_storage_enabled_key); @@ -421,8 +422,13 @@ static bool tag_storage_block_is_reserved(unsigned long block) static int tag_storage_reserve_block(unsigned long block, struct tag_region *region, int order) { + unsigned long block_va; int ret; + block_va = (unsigned long)page_to_virt(pfn_to_page(block)); + /* Avoid writeback of dirty data cache lines corrupting tags. */ + dcache_inval_poc(block_va, block_va + region->block_size_pages * PAGE_SIZE); + ret = xa_err(xa_store(&tag_blocks_reserved, block, pfn_to_page(block), GFP_KERNEL)); if (!ret) block_ref_add(block, region, order); @@ -570,6 +576,7 @@ void free_tag_storage(struct page *page, int order) { unsigned long block, start_block, end_block; struct tag_region *region; + unsigned long page_va; unsigned long flags; int ret; @@ -577,6 +584,10 @@ void free_tag_storage(struct page *page, int order) if (WARN_ONCE(ret, "Missing tag storage block for pfn 0x%lx", page_to_pfn(page))) return; + page_va = (unsigned long)page_to_virt(page); + /* Avoid writeback of dirty tag cache lines corrupting data. 
*/ + dcache_inval_tags_poc(page_va, page_va + (PAGE_SIZE << order)); + end_block = start_block + order_to_num_blocks(order, region->block_size_pages); xa_lock_irqsave(&tag_blocks_reserved, flags); diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S index 9f623e9da09f..bc02b4e95062 100644 --- a/arch/arm64/lib/mte.S +++ b/arch/arm64/lib/mte.S @@ -175,3 +175,19 @@ SYM_FUNC_START(mte_copy_page_tags_from_buf) ret SYM_FUNC_END(mte_copy_page_tags_from_buf) + +/* + * dcache_inval_tags_poc(start, end) + * + * Ensure that any tags in the D-cache for the interval [start, end) + * are invalidated to PoC. + * + * - start - virtual start address of region + * - end - virtual end address of region + */ +SYM_FUNC_START(__pi_dcache_inval_tags_poc) + tcache_line_size x2, x3 + dcache_by_myline_op igvac, sy, x0, x1, x2, x3 + ret +SYM_FUNC_END(__pi_dcache_inval_tags_poc) +SYM_FUNC_ALIAS(dcache_inval_tags_poc, __pi_dcache_inval_tags_poc) From patchwork Thu Jan 25 16:42:46 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531406 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4372DC48260 for ; Thu, 25 Jan 2024 16:48:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; 
bh=fHn+6M8eUzhvW/TNiAfUzJlDXHbwJlxR/MIKgt91kbw=; b=k4cLXeApc5zTIV HkEa6rObRocxHRC9gX7zFYSEFEjZu7aHfZpUeBNGQDCXxL9+aP9vBP0EqHj/LrWeH27RL6LoR/j74 QD3PthsOpNc5jBb5VPMeCxA7siRzam4h2tI9EjI+YpqZ5vHY6lgbaV45rU7XLnTVyHyvVvVUVLqac /q8+DRGuJv1ggkGTzaWdtHsKG09f2amG0AxIGpmg9YHqpR7CkhiAYMM/dqlRtNmr9uV3g2pc880K/ PyPhe4uZUyG75IJwzmbRaH0qcl9x8sUBB9mCO/b3BG2UuFATHQYZqb/6Y2jOQ46sCw+pB+ayqFvzf jWYcT9603D+iPQTGWtuw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2td-00000000vK5-3vJs; Thu, 25 Jan 2024 16:48:09 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qm-00000000tUK-2zvu for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:45:17 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 08EF7169C; Thu, 25 Jan 2024 08:45:56 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 1C34E3F5A1; Thu, 25 Jan 2024 08:45:05 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, 
linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 25/35] arm64: mte: Reserve tag block for the zero page Date: Thu, 25 Jan 2024 16:42:46 +0000 Message-Id: <20240125164256.4147-26-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084513_309780_7EB9C9DE X-CRM114-Status: GOOD ( 12.31 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On arm64, when a page is mapped as tagged, its tags are zeroed for two reasons: * To prevent leakage of tags to userspace. * To allow userspace to access the contents of the page without having to set the tags explicitly (bits 59:56 of a userspace pointer are zero, which corresponds to tag 0b0000). The zero page receives special treatment, as the tags for the zero page are zeroed when the MTE feature is being enabled. This is done for performance reasons - the tags are zeroed once, instead of every time the page is mapped. When the tags for the zero page are zeroed, tag storage is not yet enabled. Reserve tag storage for the page immediately after tag storage management becomes enabled. Note that zeroing tags before tag storage management is enabled is safe to do because the tag storage pages are reserved at that point. 
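For illustration only, the relationship between a pointer and its logical tag mentioned above (bits 59:56) can be sketched in plain userspace C. This is not part of the patch: the helper names ptr_logical_tag() and ptr_set_logical_tag() are hypothetical, and the code only does the bit arithmetic, it does not touch any MTE hardware state.

```c
#include <stdint.h>

/* Assumption for this sketch: the logical tag occupies address bits 59:56. */
#define TAG_SHIFT	56
#define TAG_MASK	0xfULL

/* Hypothetical helper: extract the 4-bit logical tag from an address. */
static inline unsigned int ptr_logical_tag(uint64_t addr)
{
	return (unsigned int)((addr >> TAG_SHIFT) & TAG_MASK);
}

/* Hypothetical helper: return the address with its logical tag replaced. */
static inline uint64_t ptr_set_logical_tag(uint64_t addr, unsigned int tag)
{
	addr &= ~(TAG_MASK << TAG_SHIFT);
	return addr | (((uint64_t)tag & TAG_MASK) << TAG_SHIFT);
}
```

An ordinary userspace pointer has those four bits clear, so its logical tag is 0b0000 and it matches a page whose allocation tags were zeroed.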
Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * Expanded commit message (David Hildenbrand) arch/arm64/kernel/mte_tag_storage.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/kernel/mte_tag_storage.c b/arch/arm64/kernel/mte_tag_storage.c index 8c347f4855e4..1c8469781870 100644 --- a/arch/arm64/kernel/mte_tag_storage.c +++ b/arch/arm64/kernel/mte_tag_storage.c @@ -363,6 +363,8 @@ static int __init mte_enable_tag_storage(void) goto out_disabled; } + reserve_tag_storage(ZERO_PAGE(0), 0, GFP_HIGHUSER); + static_branch_enable(&tag_storage_enabled_key); pr_info("MTE tag storage region management enabled"); From patchwork Thu Jan 25 16:42:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531410 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C6607C47258 for ; Thu, 25 Jan 2024 16:48:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=x5wi7Olj+NF17pDq9VJaBvCQsYdiEPpTB+qIT/n2K6Q=; b=g0zOh1te96gZhg QeWrPtzefTJcVYAxGTIBHsSsu0IvtwhU5ID7H2F94jH7Oz56hh+aQebPkRkV2eoGYJDfhevCIovKu 6x/dqMtsijw6w+xlNM7ysB3+xCqprbOQh0kh7vCtXCBbBo7RzFZDQqdBmPJAFo+YClxY8Tyrrlq/L 4j+gHRmliBUNyEL3CFufO4HhT3yO0OURtsGyO/8PxEaAikfgunNMaMGIwcRdyTKkMk+hmdnz9iREJ 
Ub+liqfkUZIrZ43KzAhgqsvH2e/jQ+QNZ1664FlDk1vgtfwvPXPFvbKGD6U9aZMLuOdBJwnbfJm01 Sb31AAgCfUywshPzJmQg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2th-00000000vMU-1MNw; Thu, 25 Jan 2024 16:48:13 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qs-00000000taf-2AvF for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:45:29 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id BD7CA169E; Thu, 25 Jan 2024 08:46:01 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id D5BA33F5A1; Thu, 25 Jan 2024 08:45:11 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 26/35] arm64: mte: Use fault-on-access to reserve missing tag storage Date: Thu, 25 Jan 2024 16:42:47 +0000 Message-Id: <20240125164256.4147-27-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: 
<20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084519_497716_46924F2D X-CRM114-Status: GOOD ( 30.10 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org There are three situations in which a page that is to be mapped as tagged doesn't have the corresponding tag storage reserved: * reserve_tag_storage() failed. * The allocation didn't specify __GFP_TAGGED (this can happen during migration, for example). * The page was mapped in a non-MTE enabled VMA, then an mprotect(PROT_MTE) enabled MTE. If a page that is about to be mapped as tagged doesn't have tag storage reserved, map it with the PAGE_FAULT_ON_ACCESS protection to trigger a fault the next time it is accessed, and then reserve tag storage when the fault is handled. If tag storage cannot be reserved, then the page is migrated out of the VMA. Tag storage pages (which cannot be tagged) mapped in an MTE-enabled VMA will be handled in a subsequent patch. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * New patch, loosely based on the arm64 code from the rfc v2 patch #19 ("mm: mprotect: Introduce PAGE_FAULT_ON_ACCESS for mprotect(PROT_MTE)") * All the common code has been moved back to the arch independent function handle_{huge_pmd,pte}_protnone() (David Hildenbrand). * Page is migrated if tag storage cannot be reserved after exhausting all attempts (Hyesoo Yu). * Moved folio_isolate_lru() declaration and struct migration_target_control to headers in include/linux (Peter Collingbourne). 
arch/arm64/Kconfig | 1 + arch/arm64/include/asm/mte.h | 4 +- arch/arm64/include/asm/mte_tag_storage.h | 3 + arch/arm64/include/asm/pgtable-prot.h | 2 + arch/arm64/include/asm/pgtable.h | 44 ++++++++--- arch/arm64/kernel/mte.c | 11 ++- arch/arm64/mm/fault.c | 98 ++++++++++++++++++++++++ include/linux/memcontrol.h | 2 + include/linux/migrate.h | 8 +- include/linux/migrate_mode.h | 1 + mm/internal.h | 6 -- 11 files changed, 156 insertions(+), 24 deletions(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 6f65e9005dc9..088e30fc6d12 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -2085,6 +2085,7 @@ config ARM64_MTE if ARM64_MTE config ARM64_MTE_TAG_STORAGE bool + select ARCH_HAS_FAULT_ON_ACCESS select CONFIG_CMA help Adds support for dynamic management of the memory used by the hardware diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h index 6457b7899207..70dc2e409070 100644 --- a/arch/arm64/include/asm/mte.h +++ b/arch/arm64/include/asm/mte.h @@ -107,7 +107,7 @@ static inline bool try_page_mte_tagging(struct page *page) } void mte_zero_clear_page_tags(void *addr); -void mte_sync_tags(pte_t pte, unsigned int nr_pages); +void mte_sync_tags(pte_t *pteval, unsigned int nr_pages); void mte_copy_page_tags(void *kto, const void *kfrom); void mte_thread_init_user(void); void mte_thread_switch(struct task_struct *next); @@ -139,7 +139,7 @@ static inline bool try_page_mte_tagging(struct page *page) static inline void mte_zero_clear_page_tags(void *addr) { } -static inline void mte_sync_tags(pte_t pte, unsigned int nr_pages) +static inline void mte_sync_tags(pte_t *pteval, unsigned int nr_pages) { } static inline void mte_copy_page_tags(void *kto, const void *kfrom) diff --git a/arch/arm64/include/asm/mte_tag_storage.h b/arch/arm64/include/asm/mte_tag_storage.h index 423b19e0cc46..6d0f6ffcfdd6 100644 --- a/arch/arm64/include/asm/mte_tag_storage.h +++ b/arch/arm64/include/asm/mte_tag_storage.h @@ -32,6 +32,9 @@ int 
reserve_tag_storage(struct page *page, int order, gfp_t gfp); void free_tag_storage(struct page *page, int order); bool page_tag_storage_reserved(struct page *page); + +vm_fault_t handle_folio_missing_tag_storage(struct folio *folio, struct vm_fault *vmf, + bool *map_pte); #else static inline bool tag_storage_enabled(void) { diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h index 483dbfa39c4c..1820e29244f8 100644 --- a/arch/arm64/include/asm/pgtable-prot.h +++ b/arch/arm64/include/asm/pgtable-prot.h @@ -19,6 +19,7 @@ #define PTE_SPECIAL (_AT(pteval_t, 1) << 56) #define PTE_DEVMAP (_AT(pteval_t, 1) << 57) #define PTE_PROT_NONE (_AT(pteval_t, 1) << 58) /* only when !PTE_VALID */ +#define PTE_TAG_STORAGE_NONE (_AT(pteval_t, 1) << 60) /* only when PTE_PROT_NONE */ /* * This bit indicates that the entry is present i.e. pmd_page() @@ -96,6 +97,7 @@ extern bool arm64_use_ng_mappings; }) #define PAGE_NONE __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN) +#define PAGE_FAULT_ON_ACCESS __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_TAG_STORAGE_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN) /* shared+writable pages are clean by default, hence PTE_RDONLY|PTE_WRITE */ #define PAGE_SHARED __pgprot(_PAGE_SHARED) #define PAGE_SHARED_EXEC __pgprot(_PAGE_SHARED_EXEC) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index f30466199a9b..0174e292f890 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -326,10 +326,10 @@ static inline void __check_safe_pte_update(struct mm_struct *mm, pte_t *ptep, __func__, pte_val(old_pte), pte_val(pte)); } -static inline void __sync_cache_and_tags(pte_t pte, unsigned int nr_pages) +static inline void __sync_cache_and_tags(pte_t *pteval, unsigned int nr_pages) { - if (pte_present(pte) && pte_user_exec(pte) && !pte_special(pte)) - __sync_icache_dcache(pte); + if 
(pte_present(*pteval) && pte_user_exec(*pteval) && !pte_special(*pteval)) + __sync_icache_dcache(*pteval); /* * If the PTE would provide user space access to the tags associated @@ -337,9 +337,9 @@ static inline void __sync_cache_and_tags(pte_t pte, unsigned int nr_pages) * pte_access_permitted() returns false for exec only mappings, they * don't expose tags (instruction fetches don't check tags). */ - if (system_supports_mte() && pte_access_permitted(pte, false) && - !pte_special(pte) && pte_tagged(pte)) - mte_sync_tags(pte, nr_pages); + if (system_supports_mte() && pte_access_permitted(*pteval, false) && + !pte_special(*pteval) && pte_tagged(*pteval)) + mte_sync_tags(pteval, nr_pages); } static inline void set_ptes(struct mm_struct *mm, @@ -347,7 +347,7 @@ static inline void set_ptes(struct mm_struct *mm, pte_t *ptep, pte_t pte, unsigned int nr) { page_table_check_ptes_set(mm, ptep, pte, nr); - __sync_cache_and_tags(pte, nr); + __sync_cache_and_tags(&pte, nr); for (;;) { __check_safe_pte_update(mm, ptep, pte); @@ -444,7 +444,7 @@ static inline pgprot_t pte_pgprot(pte_t pte) return __pgprot(pte_val(pfn_pte(pfn, __pgprot(0))) ^ pte_val(pte)); } -#ifdef CONFIG_NUMA_BALANCING +#if defined(CONFIG_NUMA_BALANCING) || defined(CONFIG_ARCH_HAS_FAULT_ON_ACCESS) /* * See the comment in include/linux/pgtable.h */ @@ -459,6 +459,28 @@ static inline int pmd_protnone(pmd_t pmd) } #endif +#ifdef CONFIG_ARCH_HAS_FAULT_ON_ACCESS +static inline bool arch_fault_on_access_pte(pte_t pte) +{ + return pte_protnone(pte) && (pte_val(pte) & PTE_TAG_STORAGE_NONE); +} + +static inline bool arch_fault_on_access_pmd(pmd_t pmd) +{ + return arch_fault_on_access_pte(pmd_pte(pmd)); +} + +static inline vm_fault_t arch_handle_folio_fault_on_access(struct folio *folio, + struct vm_fault *vmf, + bool *map_pte) +{ + if (tag_storage_enabled()) + return handle_folio_missing_tag_storage(folio, vmf, map_pte); + + return VM_FAULT_SIGBUS; +} +#endif /* CONFIG_ARCH_HAS_FAULT_ON_ACCESS */ + #define 
pmd_present_invalid(pmd) (!!(pmd_val(pmd) & PMD_PRESENT_INVALID)) static inline int pmd_present(pmd_t pmd) @@ -533,7 +555,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long __always_unused addr, pte_t *ptep, pte_t pte, unsigned int nr) { - __sync_cache_and_tags(pte, nr); + __sync_cache_and_tags(&pte, nr); __check_safe_pte_update(mm, ptep, pte); set_pte(ptep, pte); } @@ -828,8 +850,8 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) * in MAIR_EL1. The mask below has to include PTE_ATTRINDX_MASK. */ const pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY | - PTE_PROT_NONE | PTE_VALID | PTE_WRITE | PTE_GP | - PTE_ATTRINDX_MASK; + PTE_PROT_NONE | PTE_TAG_STORAGE_NONE | PTE_VALID | + PTE_WRITE | PTE_GP | PTE_ATTRINDX_MASK; /* preserve the hardware dirty information */ if (pte_hw_dirty(pte)) pte = set_pte_bit(pte, __pgprot(PTE_DIRTY)); diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index a41ef3213e1e..faf09da3400a 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -35,13 +35,18 @@ DEFINE_STATIC_KEY_FALSE(mte_async_or_asymm_mode); EXPORT_SYMBOL_GPL(mte_async_or_asymm_mode); #endif -void mte_sync_tags(pte_t pte, unsigned int nr_pages) +void mte_sync_tags(pte_t *pteval, unsigned int nr_pages) { - struct page *page = pte_page(pte); + struct page *page = pte_page(*pteval); unsigned int i; - /* if PG_mte_tagged is set, tags have already been initialised */ for (i = 0; i < nr_pages; i++, page++) { + if (tag_storage_enabled() && !page_tag_storage_reserved(page)) { + *pteval = pte_modify(*pteval, PAGE_FAULT_ON_ACCESS); + continue; + } + + /* if PG_mte_tagged is set, tags have already been initialised */ if (try_page_mte_tagging(page)) { mte_clear_page_tags(page_address(page)); set_page_mte_tagged(page); diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 1ffaeccecda2..1db3adb6499f 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -12,6 +12,8 @@ #include #include #include 
+#include +#include #include #include #include @@ -19,6 +21,7 @@ #include #include #include +#include #include #include #include @@ -962,3 +965,98 @@ void tag_clear_highpage(struct page *page) mte_zero_clear_page_tags(page_address(page)); set_page_mte_tagged(page); } + +#ifdef CONFIG_ARM64_MTE_TAG_STORAGE + +#define MR_TAG_STORAGE MR_ARCH_1 + +/* + * Called with an elevated reference on the folio. + * Returns with the elevated reference dropped. + */ +static int replace_folio_with_tagged(struct folio *folio) +{ + struct migration_target_control mtc = { + .nid = NUMA_NO_NODE, + .gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_TAGGED, + }; + LIST_HEAD(foliolist); + int ret, tries; + + lru_cache_disable(); + + if (!folio_isolate_lru(folio)) { + lru_cache_enable(); + folio_put(folio); + return -EAGAIN; + } + + /* Isolate just grabbed another reference, drop ours. */ + folio_put(folio); + list_add_tail(&folio->lru, &foliolist); + + tries = 3; + while (tries--) { + ret = migrate_pages(&foliolist, alloc_migration_target, NULL, (unsigned long)&mtc, + MIGRATE_SYNC, MR_TAG_STORAGE, NULL); + if (ret != -EBUSY) + break; + } + + if (ret != 0) + putback_movable_pages(&foliolist); + + lru_cache_enable(); + + return ret; +} + +vm_fault_t handle_folio_missing_tag_storage(struct folio *folio, struct vm_fault *vmf, + bool *map_pte) +{ + struct vm_area_struct *vma = vmf->vma; + int ret = 0; + + *map_pte = false; + + /* + * This should never happen, once a VMA has been marked as tagged, that + * cannot be changed. + */ + if (WARN_ON_ONCE(!(vma->vm_flags & VM_MTE))) + goto out_map; + + /* + * The folio is probably being isolated for migration, replay the fault + * to give time for the entry to be replaced by a migration pte. + */ + if (unlikely(is_migrate_isolate_page(folio_page(folio, 0)))) + goto out_retry; + + ret = reserve_tag_storage(folio_page(folio, 0), folio_order(folio), GFP_HIGHUSER_MOVABLE); + if (ret) { + /* replace_folio_with_tagged() is expensive, try to avoid it. 
*/ + if (fault_flag_allow_retry_first(vmf->flags)) + goto out_retry; + + replace_folio_with_tagged(folio); + return 0; + } + +out_map: + folio_put(folio); + *map_pte = true; + return 0; + +out_retry: + folio_put(folio); + if (fault_flag_allow_retry_first(vmf->flags)) { + /* Flag set by GUP. */ + if (!(vmf->flags & FAULT_FLAG_RETRY_NOWAIT)) + release_fault_lock(vmf); + return VM_FAULT_RETRY; + } + /* Replay the fault. */ + return 0; +} +#endif diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 20ff87f8e001..9c0b559f54f5 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -1633,6 +1633,8 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order, } #endif /* CONFIG_MEMCG */ +bool folio_isolate_lru(struct folio *folio); + static inline void __inc_lruvec_kmem_state(void *p, enum node_stat_item idx) { __mod_lruvec_kmem_state(p, idx, 1); diff --git a/include/linux/migrate.h b/include/linux/migrate.h index 2ce13e8a309b..f954e19bd9d1 100644 --- a/include/linux/migrate.h +++ b/include/linux/migrate.h @@ -10,8 +10,6 @@ typedef struct folio *new_folio_t(struct folio *folio, unsigned long private); typedef void free_folio_t(struct folio *folio, unsigned long private); -struct migration_target_control; - /* * Return values from addresss_space_operations.migratepage(): * - negative errno on page migration failure; @@ -57,6 +55,12 @@ struct movable_operations { void (*putback_page)(struct page *); }; +struct migration_target_control { + int nid; /* preferred node id */ + nodemask_t *nmask; + gfp_t gfp_mask; +}; + /* Defined in mm/debug.c: */ extern const char *migrate_reason_names[MR_TYPES]; diff --git a/include/linux/migrate_mode.h b/include/linux/migrate_mode.h index f37cc03f9369..c6c5c7726d26 100644 --- a/include/linux/migrate_mode.h +++ b/include/linux/migrate_mode.h @@ -29,6 +29,7 @@ enum migrate_reason { MR_CONTIG_RANGE, MR_LONGTERM_PIN, MR_DEMOTION, + MR_ARCH_1, MR_TYPES }; diff --git a/mm/internal.h 
b/mm/internal.h index f309a010d50f..cb76cf0928f5 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -952,12 +952,6 @@ static inline bool is_migrate_highatomic_page(struct page *page) void setup_zone_pageset(struct zone *zone); -struct migration_target_control { - int nid; /* preferred node id */ - nodemask_t *nmask; - gfp_t gfp_mask; -}; - /* * mm/filemap.c */ From patchwork Thu Jan 25 16:42:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531408 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DB996C47258 for ; Thu, 25 Jan 2024 16:48:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=hjq6siVA5FJotvstbD7LTanGW2zvMPVtdrVvrEazIgM=; b=FR09+TciFBLmWa 6YRXTtDOYYZ2JqwjwWW4JWStvcMwbY9EzyEHqgrE8C4+n7ivuAyhdTDK9MvhxtDtfLrl8HE0U/RMa DJjV7jYEUuR/rUP1qfLdUbmN5aepcxoweFjjApPXxeZiIQnZ9nvyu6Dq9C5aTNTSpeGYfbqjjEHmf vZzu1IP1/ILJMIOU8IHlPH7PWAV8xiW5wRpJBw644ha6F2ma5/G82CyNy8cQ/E2aRRFs+Hv6iuaXt x3ucfp47jIsS+i412meuMPsK5Dg7jCi/T2THJQeUS5UPhKpfM4jda4ItFj9ItaFhbpCUd08AF48QN QRfV/ue6qXu1+FhNUsRA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2tg-00000000vLV-0Ggy; Thu, 25 Jan 2024 16:48:12 +0000 
Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2qy-00000000tgR-1pxt for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:45:28 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7EA3516A3; Thu, 25 Jan 2024 08:46:07 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 963B83F5A1; Thu, 25 Jan 2024 08:45:17 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 27/35] arm64: mte: Handle tag storage pages mapped in an MTE VMA Date: Thu, 25 Jan 2024 16:42:48 +0000 Message-Id: <20240125164256.4147-28-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084524_955477_C775032B X-CRM114-Status: GOOD ( 14.76 ) X-BeenThere: 
linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Tag storage pages cannot be tagged. When such a page is mapped in an MTE-enabled VMA, migrate it out directly and don't try to reserve tag storage for it. Signed-off-by: Alexandru Elisei --- arch/arm64/include/asm/mte_tag_storage.h | 1 + arch/arm64/kernel/mte_tag_storage.c | 15 +++++++++++++++ arch/arm64/mm/fault.c | 11 +++++++++-- 3 files changed, 25 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/mte_tag_storage.h b/arch/arm64/include/asm/mte_tag_storage.h index 6d0f6ffcfdd6..50bdae94cf71 100644 --- a/arch/arm64/include/asm/mte_tag_storage.h +++ b/arch/arm64/include/asm/mte_tag_storage.h @@ -32,6 +32,7 @@ int reserve_tag_storage(struct page *page, int order, gfp_t gfp); void free_tag_storage(struct page *page, int order); bool page_tag_storage_reserved(struct page *page); +bool page_is_tag_storage(struct page *page); vm_fault_t handle_folio_missing_tag_storage(struct folio *folio, struct vm_fault *vmf, bool *map_pte); diff --git a/arch/arm64/kernel/mte_tag_storage.c b/arch/arm64/kernel/mte_tag_storage.c index 1c8469781870..afe2bb754879 100644 --- a/arch/arm64/kernel/mte_tag_storage.c +++ b/arch/arm64/kernel/mte_tag_storage.c @@ -492,6 +492,21 @@ bool page_tag_storage_reserved(struct page *page) return test_bit(PG_tag_storage_reserved, &page->flags); } +bool page_is_tag_storage(struct page *page) +{ + unsigned long pfn = page_to_pfn(page); + struct range *tag_range; + int i; + + for (i = 0; i < num_tag_regions; i++) { + tag_range = &tag_regions[i].tag_range; + if (tag_range->start <= pfn && pfn <= tag_range->end) + return true; + } + + return false; +} + int reserve_tag_storage(struct page *page, int order, gfp_t gfp) { unsigned long start_block, end_block; diff --git
a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 1db3adb6499f..01450ab91a87 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -1014,6 +1014,7 @@ static int replace_folio_with_tagged(struct folio *folio) vm_fault_t handle_folio_missing_tag_storage(struct folio *folio, struct vm_fault *vmf, bool *map_pte) { + bool is_tag_storage = page_is_tag_storage(folio_page(folio, 0)); struct vm_area_struct *vma = vmf->vma; int ret = 0; @@ -1033,12 +1034,18 @@ vm_fault_t handle_folio_missing_tag_storage(struct folio *folio, struct vm_fault if (unlikely(is_migrate_isolate_page(folio_page(folio, 0)))) goto out_retry; - ret = reserve_tag_storage(folio_page(folio, 0), folio_order(folio), GFP_HIGHUSER_MOVABLE); - if (ret) { + if (!is_tag_storage) { + ret = reserve_tag_storage(folio_page(folio, 0), folio_order(folio), + GFP_HIGHUSER_MOVABLE); + if (!ret) + goto out_map; + /* replace_folio_with_tagged() is expensive, try to avoid it. */ if (fault_flag_allow_retry_first(vmf->flags)) goto out_retry; + } + if (ret || is_tag_storage) { replace_folio_with_tagged(folio); return 0; } From patchwork Thu Jan 25 16:42:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531409 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 78E9EC48260 for ; Thu, 25 Jan 2024 16:48:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: 
Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=em76lMD4yDDA15vc3uBDVyKy2wp/SQrwj9gPt4xfFoo=; b=HlixxJvEW4ZL+2 FhbZ2UPu1OAJartJsXS7m1PrGFQtrSVPez0X3etPWKZRXuT2GS6htthe7FztiytCEsrwehVbntazS 54bancbC2GPGxr+iNGuC7/p4hg/reGYYBoFZ1d+2LzlhbH1Qu23FrS2fGd6s3ZECSByZemP2p3bIk sggFGh87pNNQRfGO2R5yFhRfECwkIMxAQL24RJM2sCKeKwA7y20Py/qhSDzADF2uRbT315j19uuUU IQY9uXRH03aBKqt75ICPos01sY+Dm/DtIowJqqAT3ok+mnGK4vAlplgeyLVThvrsbSBPDlApHvYOW W4YyK3PKlaXJFMb16rtg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2ti-00000000vNo-3lOH; Thu, 25 Jan 2024 16:48:14 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2r3-00000000tmp-45TF for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:45:42 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4406316F2; Thu, 25 Jan 2024 08:46:13 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 59B4A3F5A1; Thu, 25 Jan 2024 08:45:23 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, 
linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 28/35] arm64: mte: swap: Handle tag restoring when missing tag storage Date: Thu, 25 Jan 2024 16:42:49 +0000 Message-Id: <20240125164256.4147-29-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084530_489873_F85429F1 X-CRM114-Status: GOOD ( 22.65 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Linux restores tags when a page is swapped in and there are tags associated with the swap entry which the new page will replace. The saved tags are restored even if the page will not be mapped as tagged, to protect against cases where the page is shared between different VMAs, and is tagged in some, but untagged in others. By using this approach, the process can still access the correct tags following an mprotect(PROT_MTE) on the non-MTE enabled VMA. But this poses a challenge for managing tag storage: in the scenario above, when a new page is allocated to be swapped in for the process where it will be mapped as untagged, the corresponding tag storage block is not reserved. mte_restore_page_tags_by_swp_entry(), when it restores the saved tags, will overwrite data in the tag storage block associated with the new page, leading to data corruption if the block is in use by a process. 
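To make the hazard above concrete, here is a toy user-space model, not kernel code. Every name in it (demo_page, demo_restore_tags, demo_reserve_storage, DEMO_TAG_BYTES) is hypothetical and exists only for illustration. It assumes a single page whose tag storage block holds unrelated data until the block is reserved, and shows one way to avoid clobbering that data: park the tags in a side buffer and write them out only when the block is reserved.

```c
#include <stdbool.h>
#include <string.h>

#define DEMO_TAG_BYTES 8 /* arbitrary size chosen for the toy model */

/* Hypothetical model of one page and the block that would hold its tags. */
struct demo_page {
	bool storage_reserved;                 /* block currently holds tags for this page */
	bool has_pending;                      /* tags are parked in the side buffer */
	unsigned char pending[DEMO_TAG_BYTES]; /* side buffer for deferred tags */
	unsigned char *block;                  /* holds tags once reserved, other data before */
};

/* Swap-in path: write tags only if the block is reserved, otherwise park them. */
static void demo_restore_tags(struct demo_page *p, const unsigned char *tags)
{
	if (p->storage_reserved) {
		memcpy(p->block, tags, DEMO_TAG_BYTES);
		return;
	}
	/* Writing to p->block here would overwrite somebody else's data. */
	memcpy(p->pending, tags, DEMO_TAG_BYTES);
	p->has_pending = true;
}

/*
 * Reservation path: flush parked tags, then mark the block reserved.
 * A real implementation would first move any data out of the block;
 * that step is omitted from this sketch.
 */
static void demo_reserve_storage(struct demo_page *p)
{
	if (p->has_pending) {
		memcpy(p->block, p->pending, DEMO_TAG_BYTES);
		p->has_pending = false;
	}
	p->storage_reserved = true;
}
```

Calling demo_restore_tags() on an unreserved page leaves the block untouched; the tags only land in it after demo_reserve_storage() runs.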
Get around this issue by saving the tags in a new xarray, this time indexed by the page pfn, and then restoring them when tag storage is reserved for the page. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * Restore saved tags **before** setting the PG_tag_storage_reserved bit to eliminate a brief window of opportunity where userspace can access uninitialized tags (Peter Collingbourne). arch/arm64/include/asm/mte_tag_storage.h | 8 ++ arch/arm64/include/asm/pgtable.h | 11 +++ arch/arm64/kernel/mte_tag_storage.c | 12 ++- arch/arm64/mm/mteswap.c | 110 +++++++++++++++++++++++ 4 files changed, 140 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/mte_tag_storage.h b/arch/arm64/include/asm/mte_tag_storage.h index 50bdae94cf71..40590a8c3748 100644 --- a/arch/arm64/include/asm/mte_tag_storage.h +++ b/arch/arm64/include/asm/mte_tag_storage.h @@ -36,6 +36,14 @@ bool page_is_tag_storage(struct page *page); vm_fault_t handle_folio_missing_tag_storage(struct folio *folio, struct vm_fault *vmf, bool *map_pte); +vm_fault_t mte_try_transfer_swap_tags(swp_entry_t entry, struct page *page); + +void tags_by_pfn_lock(void); +void tags_by_pfn_unlock(void); + +void *mte_erase_tags_for_pfn(unsigned long pfn); +bool mte_save_tags_for_pfn(void *tags, unsigned long pfn); +void mte_restore_tags_for_pfn(unsigned long start_pfn, int order); #else static inline bool tag_storage_enabled(void) { diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 0174e292f890..87ae59436162 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1085,6 +1085,17 @@ static inline void arch_swap_invalidate_area(int type) mte_invalidate_tags_area_by_swp_entry(type); } +#ifdef CONFIG_ARM64_MTE_TAG_STORAGE +#define __HAVE_ARCH_SWAP_PREPARE_TO_RESTORE +static inline vm_fault_t arch_swap_prepare_to_restore(swp_entry_t entry, + struct folio *folio) +{ + if (tag_storage_enabled()) + return mte_try_transfer_swap_tags(entry, 
&folio->page); + return 0; +} +#endif + #define __HAVE_ARCH_SWAP_RESTORE static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio) { diff --git a/arch/arm64/kernel/mte_tag_storage.c b/arch/arm64/kernel/mte_tag_storage.c index afe2bb754879..ac7b9c9c585c 100644 --- a/arch/arm64/kernel/mte_tag_storage.c +++ b/arch/arm64/kernel/mte_tag_storage.c @@ -567,6 +567,7 @@ int reserve_tag_storage(struct page *page, int order, gfp_t gfp) } } + mte_restore_tags_for_pfn(page_to_pfn(page), order); page_set_tag_storage_reserved(page, order); out_unlock: mutex_unlock(&tag_blocks_lock); @@ -595,7 +596,8 @@ void free_tag_storage(struct page *page, int order) struct tag_region *region; unsigned long page_va; unsigned long flags; - int ret; + void *tags; + int i, ret; ret = tag_storage_find_block(page, &start_block, &region); if (WARN_ONCE(ret, "Missing tag storage block for pfn 0x%lx", page_to_pfn(page))) @@ -605,6 +607,14 @@ void free_tag_storage(struct page *page, int order) /* Avoid writeback of dirty tag cache lines corrupting data. 
*/ dcache_inval_tags_poc(page_va, page_va + (PAGE_SIZE << order)); + tags_by_pfn_lock(); + for (i = 0; i < (1 << order); i++) { + tags = mte_erase_tags_for_pfn(page_to_pfn(page + i)); + if (unlikely(tags)) + mte_free_tag_buf(tags); + } + tags_by_pfn_unlock(); + end_block = start_block + order_to_num_blocks(order, region->block_size_pages); xa_lock_irqsave(&tag_blocks_reserved, flags); diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c index 2a43746b803f..e11495fa3c18 100644 --- a/arch/arm64/mm/mteswap.c +++ b/arch/arm64/mm/mteswap.c @@ -20,6 +20,112 @@ void mte_free_tag_buf(void *buf) kfree(buf); } +#ifdef CONFIG_ARM64_MTE_TAG_STORAGE +static DEFINE_XARRAY(tags_by_pfn); + +void tags_by_pfn_lock(void) +{ + xa_lock(&tags_by_pfn); +} + +void tags_by_pfn_unlock(void) +{ + xa_unlock(&tags_by_pfn); +} + +void *mte_erase_tags_for_pfn(unsigned long pfn) +{ + return __xa_erase(&tags_by_pfn, pfn); +} + +bool mte_save_tags_for_pfn(void *tags, unsigned long pfn) +{ + void *entry; + int ret; + + ret = xa_reserve(&tags_by_pfn, pfn, GFP_KERNEL); + if (ret) + return true; + + tags_by_pfn_lock(); + + if (page_tag_storage_reserved(pfn_to_page(pfn))) { + xa_release(&tags_by_pfn, pfn); + tags_by_pfn_unlock(); + return false; + } + + entry = __xa_store(&tags_by_pfn, pfn, tags, GFP_ATOMIC); + if (xa_is_err(entry)) { + xa_release(&tags_by_pfn, pfn); + goto out_unlock; + } else if (entry) { + mte_free_tag_buf(entry); + } + +out_unlock: + tags_by_pfn_unlock(); + return true; +} + +void mte_restore_tags_for_pfn(unsigned long start_pfn, int order) +{ + struct page *page = pfn_to_page(start_pfn); + unsigned long pfn; + void *tags; + + tags_by_pfn_lock(); + + for (pfn = start_pfn; pfn < start_pfn + (1 << order); pfn++, page++) { + tags = mte_erase_tags_for_pfn(pfn); + if (unlikely(tags)) { + /* + * Mark the page as tagged so mte_sync_tags() doesn't + * clear the tags. 
+ */ + WARN_ON_ONCE(!try_page_mte_tagging(page)); + mte_copy_page_tags_from_buf(page_address(page), tags); + set_page_mte_tagged(page); + mte_free_tag_buf(tags); + } + } + + tags_by_pfn_unlock(); +} + +/* + * Note on locking: swap in/out is done with the folio locked, which eliminates + * races with mte_save/restore_page_tags_by_swp_entry. + */ +vm_fault_t mte_try_transfer_swap_tags(swp_entry_t entry, struct page *page) +{ + void *swap_tags, *pfn_tags; + bool saved; + + /* + * mte_restore_page_tags_by_swp_entry() will take care of copying the + * tags over. + */ + if (likely(page_mte_tagged(page) || page_tag_storage_reserved(page))) + return 0; + + swap_tags = xa_load(&tags_by_swp_entry, entry.val); + if (!swap_tags) + return 0; + + pfn_tags = mte_allocate_tag_buf(); + if (!pfn_tags) + return VM_FAULT_OOM; + + memcpy(pfn_tags, swap_tags, MTE_PAGE_TAG_STORAGE_SIZE); + saved = mte_save_tags_for_pfn(pfn_tags, page_to_pfn(page)); + if (!saved) + mte_free_tag_buf(pfn_tags); + + return 0; +} +#endif + int mte_save_page_tags_by_swp_entry(struct page *page) { void *tags, *ret; @@ -54,6 +160,10 @@ void mte_restore_page_tags_by_swp_entry(swp_entry_t entry, struct page *page) if (!tags) return; + /* Tags will be restored when tag storage is reserved. 
*/ + if (tag_storage_enabled() && unlikely(!page_tag_storage_reserved(page))) + return; + if (try_page_mte_tagging(page)) { mte_copy_page_tags_from_buf(page_address(page), tags); set_page_mte_tagged(page); From patchwork Thu Jan 25 16:42:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531411 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 530ECC48260 for ; Thu, 25 Jan 2024 16:48:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=L0F9GwEfI/+kjzorkIIDK8auG3mecHN+RrOmzsJyo3M=; b=c2RAwiVGGCG0UI tqUcYvFVNpcfQyJRgAuxUSC+QCM90y7NIKsmCqeHhT0Y6sh0P/f1R3gX75V0i8cKbBAU1Iq1XTji/ PPaYoTaLOiAwViFtX+tqjhI7FJYpG+l3TM8bSGcxhcuOusHqRxYCtxP2ddhA782sgqK83dgr+DyrM 8LaC9AKhEBqMh2OUvnfDh5C8+sA+wYZAPCGMqR3vifhH/vpL0xlDjZ0q7u+8rcOlRtTry5Mxunaqq jsH86+ieM1mhgPz0P3RGAGJwfYEldkPvZUM5my4j1y/pBYSu3PuVa4bJ6SPp+SedE5WH78fY7oglJ rwMRU5id9i/c3ccs+JJA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2tk-00000000vPE-0RYj; Thu, 25 Jan 2024 16:48:16 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2r9-00000000ttB-41LQ for 
linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:45:47 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0743516F3; Thu, 25 Jan 2024 08:46:19 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 1E2EF3F5A1; Thu, 25 Jan 2024 08:45:28 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 29/35] arm64: mte: copypage: Handle tag restoring when missing tag storage Date: Thu, 25 Jan 2024 16:42:50 +0000 Message-Id: <20240125164256.4147-30-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084536_714941_817746FC X-CRM114-Status: GOOD ( 13.58 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: 
, Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org There are several situations where copy_highpage() can end up copying tags to a page which doesn't have its tag storage reserved. One situation involves migration racing with mprotect(PROT_MTE): the VMA is initially untagged, migration starts and the destination page is allocated as untagged, then mprotect(PROT_MTE) changes the VMA to tagged and userspace accesses the source page, thus making it tagged. The migration code then calls copy_highpage(), which will copy the tags from the source page (now tagged) to the destination page (allocated as untagged). Yet another situation can happen during THP collapse. The huge page that will replace the HPAGE_PMD_NR contiguous mapped pages is allocated with __GFP_TAGGED not set. copy_highpage() will copy the tags from the pages being replaced to the huge page which doesn't have tag storage reserved. The situation gets even more complicated when the replacement huge page is a tag storage page. The tag storage huge page will be migrated after a fault on access, but the tags from the original pages must be copied over to the huge page that will be replacing the tag storage huge page. 
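The cases above reduce to a small decision table. The following user-space sketch is an illustration only: the enum, struct and function names are hypothetical and do not appear in the patch. It mirrors, as far as can be read from the posted diff, the branch structure of try_transfer_saved_tags(), and ignores allocation failures.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; none of these exist in the kernel. */
enum tag_action {
	TAG_ACTION_NONE,            /* nothing saved and source untagged: nothing to do */
	TAG_ACTION_COPY_DIRECT,     /* source tagged, destination reserved: regular copy path */
	TAG_ACTION_SAVE_FOR_DEST,   /* park the tags in a buffer keyed by the destination pfn */
	TAG_ACTION_RESTORE_TO_DEST, /* write previously parked tags into the destination */
};

struct copy_state {
	bool src_tagged;           /* source page currently has tags */
	bool src_has_saved_tags;   /* tags are parked for the source pfn */
	bool dst_storage_reserved; /* destination has its tag storage reserved */
};

static enum tag_action choose_tag_action(struct copy_state s)
{
	if (s.src_tagged)
		return s.dst_storage_reserved ? TAG_ACTION_COPY_DIRECT
					      : TAG_ACTION_SAVE_FOR_DEST;

	if (!s.src_has_saved_tags)
		return TAG_ACTION_NONE;

	return s.dst_storage_reserved ? TAG_ACTION_RESTORE_TO_DEST
				      : TAG_ACTION_SAVE_FOR_DEST;
}
```

The migration race described above corresponds to a tagged source with an unreserved destination, which selects TAG_ACTION_SAVE_FOR_DEST.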
Signed-off-by: Alexandru Elisei --- arch/arm64/mm/copypage.c | 56 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 56 insertions(+) diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c index a7bb20055ce0..e991ccb43fb7 100644 --- a/arch/arm64/mm/copypage.c +++ b/arch/arm64/mm/copypage.c @@ -13,6 +13,59 @@ #include #include #include +#include + +#ifdef CONFIG_ARM64_MTE_TAG_STORAGE +static inline bool try_transfer_saved_tags(struct page *from, struct page *to) +{ + void *tags; + bool saved; + + VM_WARN_ON_ONCE(!preemptible()); + + if (page_mte_tagged(from)) { + if (page_tag_storage_reserved(to)) + return false; + + tags = mte_allocate_tag_buf(); + if (WARN_ON(!tags)) + return true; + + mte_copy_page_tags_to_buf(page_address(from), tags); + saved = mte_save_tags_for_pfn(tags, page_to_pfn(to)); + if (!saved) + mte_free_tag_buf(tags); + + return saved; + } + + tags_by_pfn_lock(); + tags = mte_erase_tags_for_pfn(page_to_pfn(from)); + tags_by_pfn_unlock(); + + if (likely(!tags)) + return false; + + if (page_tag_storage_reserved(to)) { + WARN_ON_ONCE(!try_page_mte_tagging(to)); + mte_copy_page_tags_from_buf(page_address(to), tags); + set_page_mte_tagged(to); + mte_free_tag_buf(tags); + return true; + } + + saved = mte_save_tags_for_pfn(tags, page_to_pfn(to)); + if (!saved) + mte_free_tag_buf(tags); + + return saved; +} +#else +static inline bool try_transfer_saved_tags(struct page *from, struct page *to) +{ + return false; +} +#endif void copy_highpage(struct page *to, struct page *from) { @@ -24,6 +77,9 @@ void copy_highpage(struct page *to, struct page *from) if (kasan_hw_tags_enabled()) page_kasan_tag_reset(to); + if (tag_storage_enabled() && try_transfer_saved_tags(from, to)) + return; + if (system_supports_mte() && page_mte_tagged(from)) { /* It's a new page, shouldn't have been tagged yet */ WARN_ON_ONCE(!try_page_mte_tagging(to)); From patchwork Thu Jan 25 16:42:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531413 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C49E2C47422 for ; Thu, 25 Jan 2024 16:48:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=zEMd1K7Hn3Twt/o58bQ/yG/65QWKqCfvFRO+LpCuhF0=; b=tKJEG7Vw0JAVQ4 u5QND15zzCicxtu7nsKCklz+PYUVzd0tt6j83PepVJD5CxVAIKProJhG2CEMWCfDFQRhfZpSPfbuZ fIChW2zNCZF//fiHPacU5dl2eFw2bJGTXnJ62yw/fu51HQ7UlqreDpNNr7Jr70M0qgaGGOSwa1PSe Ww26LhHDKRZxsxRUBQN1aLPPwIMX7h2M7l9dH5IOBKRUEMYmlxH7wkp2VJka2lDUS2wEYawEt9NR3 WMFBgFkRMnQyhz8jYbu6t9GgMxf283nmk0MoEiywvA3LIT9VmM6k9JAl2LADY4ble8vifV2MeGNaa XKRghFbFDxRp5I4Sv6Zg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2to-00000000vSr-0rtm; Thu, 25 Jan 2024 16:48:20 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2rF-00000000tzY-3R36 for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:45:51 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 09C361713; Thu, 25 Jan 2024 08:46:25 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by 
usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id D3D8E3F5A1; Thu, 25 Jan 2024 08:45:34 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 30/35] arm64: mte: ptrace: Handle pages with missing tag storage Date: Thu, 25 Jan 2024 16:42:51 +0000 Message-Id: <20240125164256.4147-31-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084542_453995_5D5ABA09 X-CRM114-Status: GOOD ( 13.51 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org A page can end up mapped in an MTE-enabled VMA without the corresponding tag storage block reserved. 
Tag accesses made by ptrace in this case can lead to the wrong tags being read or memory corruption for the process that is using the tag storage memory as data. Reserve tag storage by treating ptrace accesses like a fault. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * New patch, issue reported by Peter Collingbourne. arch/arm64/kernel/mte.c | 26 ++++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index faf09da3400a..b1fa02dad4fd 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -412,10 +412,13 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr, while (len) { struct vm_area_struct *vma; unsigned long tags, offset; + unsigned int fault_flags; + struct page *page; + vm_fault_t ret; void *maddr; - struct page *page = get_user_page_vma_remote(mm, addr, - gup_flags, &vma); +get_page: + page = get_user_page_vma_remote(mm, addr, gup_flags, &vma); if (IS_ERR(page)) { err = PTR_ERR(page); break; @@ -433,6 +436,25 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr, put_page(page); break; } + + if (tag_storage_enabled() && !page_tag_storage_reserved(page)) { + fault_flags = FAULT_FLAG_DEFAULT | \ + FAULT_FLAG_USER | \ + FAULT_FLAG_REMOTE | \ + FAULT_FLAG_ALLOW_RETRY | \ + FAULT_FLAG_RETRY_NOWAIT; + if (write) + fault_flags |= FAULT_FLAG_WRITE; + + put_page(page); + ret = handle_mm_fault(vma, addr, fault_flags, NULL); + if (ret & VM_FAULT_ERROR) { + err = -EFAULT; + break; + } + goto get_page; + } + WARN_ON_ONCE(!page_mte_tagged(page)); /* limit access to the end of the page */ From patchwork Thu Jan 25 16:42:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531412 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from 
bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 277A3C47422 for ; Thu, 25 Jan 2024 16:48:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=dF5pIrOaoJEu2u5OQF9dABBfMWZTg4i8Uc4QqbPyVqQ=; b=vH6dZEOkMZlyPg IUGZP765aZ5ACY07CQJE7iKKQCzN3Hg+Vu1Nk6YLxpV7P168QgCudEBDPj9T5VxaDdElGEh0zVyVI 1TKkYFDaGPCIBYSk9r/8gk4kL0BcTgdin7fApH64lnvf7NC4aj639GJc3KV03wjqRklNWP8A+fHMF t58Dg0Q/ahcPZa+fuBxjv5vR4KP1cS5e48Hxl02jDEyFJaw7ZzwqhAQrgWX5voUpocmgPYf2rdbDB c+AEqWIFOaT/ftzXmcgyk9s0g51QiiwQKqokMkSTDBPR/WR6HrtHZb95uw/fwMs9Wjrs1FbddZtxw wKDFd1UGK6T9lSdyMRLg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2tl-00000000vQN-1W4t; Thu, 25 Jan 2024 16:48:17 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2rP-00000000u9n-2MN1 for linux-arm-kernel@bombadil.infradead.org; Thu, 25 Jan 2024 16:45:51 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Transfer-Encoding:MIME-Version :References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=xkwftJVeqouLGSxjRxX2qTSTkNnfTRKYrouvXmLFb0c=; b=npqcdvFrvFgfhalNXANNHFFSvd JcPpFlXFz39rp3rhRRPaALxcmUzUcEOZcFKltjcTXMi2vrKKu+uI1/i/azWKOc1VG1dqTeBtPUTCI 
Up1SSN5CXjYK0bfxa9Jl0gpyICVxPJhzL9GGpcKj6QNM0bjOXmRd7LBWTJ509EU3CPOG17l0oCbQF /ClZLz0s2uvbuHQ8AfRHP81LJzNFlh/aGN+oPQCaP7yaGBcj6rFOSjkZBmNA1NjMzl3pl1JdFsrN+ 5QDtC/OC9iIbW4972UnECWKtJ3evtOAKkLDWRly3PD5A6O/I5X+MNHS+yRwycBteJXmLzN0aiSr+g FTaoIj9w==; Received: from foss.arm.com ([217.140.110.172]) by desiato.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2rL-00000005UiM-29wo for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:45:50 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8549C1756; Thu, 25 Jan 2024 08:46:30 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 945353F8A4; Thu, 25 Jan 2024 08:45:40 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 31/35] khugepaged: arm64: Don't collapse MTE enabled VMAs Date: Thu, 25 Jan 2024 16:42:52 +0000 Message-Id: <20240125164256.4147-32-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: 
<20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_164547_953043_95B54C7E X-CRM114-Status: GOOD ( 12.42 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org copy_user_highpage() will do memory allocation if there are saved tags for the destination page, and the page is missing tag storage. After commit a349d72fd9ef ("mm/pgtable: add rcu_read_lock() and rcu_read_unlock()s"), collapse_huge_page() calls __collapse_huge_page_copy() -> .. -> copy_user_highpage() with the RCU lock held, which means that copy_user_highpage() can only allocate memory using GFP_ATOMIC or equivalent. Get around this by refusing to collapse pages into a transparent huge page if the VMA is MTE-enabled. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * New patch. I think an agreement on whether copy*_user_highpage() should be always allowed to sleep, or should not be allowed, would be useful. 
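For readers less familiar with the hook style used in this patch, the following is a minimal userspace sketch of the override pattern, not kernel code. It assumes a stand-in struct vma_sketch and a made-up VM_MTE_SKETCH bit in place of struct vm_area_struct and VM_MTE, and the wrapper vma_revalidate_sketch() is hypothetical. Only the macro dance around arch_hugepage_vma_revalidate follows the diff below.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for struct vm_area_struct and VM_MTE. */
#define VM_MTE_SKETCH (1UL << 0)

struct vma_sketch {
	unsigned long vm_flags;
};

/* "Arch" side: provide the hook, then advertise it with a same-named macro. */
static bool arch_hugepage_vma_revalidate(const struct vma_sketch *vma,
					 unsigned long address)
{
	(void)address;	/* unused, as in the patch */
	return !(vma->vm_flags & VM_MTE_SKETCH);
}
#define arch_hugepage_vma_revalidate arch_hugepage_vma_revalidate

/* "Generic" side: fall back to always-true when no arch hook is defined. */
#ifndef arch_hugepage_vma_revalidate
#define arch_hugepage_vma_revalidate(vma, address) 1
#endif

/* Hypothetical caller: true means the collapse may proceed for this VMA. */
static bool vma_revalidate_sketch(const struct vma_sketch *vma,
				  unsigned long address)
{
	return arch_hugepage_vma_revalidate(vma, address);
}
```

Deleting the "arch" half of the sketch makes the #ifndef fallback take over, so every VMA passes the check, which is the behaviour other architectures would see.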
arch/arm64/include/asm/pgtable.h | 3 +++ arch/arm64/kernel/mte_tag_storage.c | 5 +++++ include/linux/khugepaged.h | 5 +++++ mm/khugepaged.c | 4 ++++ 4 files changed, 17 insertions(+) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 87ae59436162..d0473538c926 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1120,6 +1120,9 @@ static inline bool arch_alloc_cma(gfp_t gfp_mask) return true; } +bool arch_hugepage_vma_revalidate(struct vm_area_struct *vma, unsigned long address); +#define arch_hugepage_vma_revalidate arch_hugepage_vma_revalidate + #endif /* CONFIG_ARM64_MTE_TAG_STORAGE */ #endif /* CONFIG_ARM64_MTE */ diff --git a/arch/arm64/kernel/mte_tag_storage.c b/arch/arm64/kernel/mte_tag_storage.c index ac7b9c9c585c..a99959b70573 100644 --- a/arch/arm64/kernel/mte_tag_storage.c +++ b/arch/arm64/kernel/mte_tag_storage.c @@ -636,3 +636,8 @@ void arch_alloc_page(struct page *page, int order, gfp_t gfp) if (tag_storage_enabled() && alloc_requires_tag_storage(gfp)) reserve_tag_storage(page, order, gfp); } + +bool arch_hugepage_vma_revalidate(struct vm_area_struct *vma, unsigned long address) +{ + return !(vma->vm_flags & VM_MTE); +} diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h index f68865e19b0b..461e4322dff2 100644 --- a/include/linux/khugepaged.h +++ b/include/linux/khugepaged.h @@ -38,6 +38,11 @@ static inline void khugepaged_exit(struct mm_struct *mm) if (test_bit(MMF_VM_HUGEPAGE, &mm->flags)) __khugepaged_exit(mm); } + +#ifndef arch_hugepage_vma_revalidate +#define arch_hugepage_vma_revalidate(vma, address) 1 +#endif + #else /* CONFIG_TRANSPARENT_HUGEPAGE */ static inline void khugepaged_fork(struct mm_struct *mm, struct mm_struct *oldmm) { diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 2b219acb528e..cb9a9ddb4d86 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -935,6 +935,10 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long 
address, */ if (expect_anon && (!(*vmap)->anon_vma || !vma_is_anonymous(*vmap))) return SCAN_PAGE_ANON; + + if (!arch_hugepage_vma_revalidate(vma, address)) + return SCAN_VMA_CHECK; + return SCAN_SUCCEED; } From patchwork Thu Jan 25 16:42:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531414 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7691CC48260 for ; Thu, 25 Jan 2024 16:48:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=7a7Hz01MWNQ3o2YMRF9WTsuU9TGCoOKPlJMcC6+uv98=; b=p1VJFpa8Q99l5b 5fC2DSFLkit9p9Im12SVZeQjNWDWveHTF6TS2obV8GyWL2xrc7seY9okXiuXNif/H0X9jJ1rf7LbC BYHyyAflJsoggcsqhdOrCIbNhieBOaPj27VkCDzzd4N105zqymVpCeDsMaZcyA3TOz7+Muz9JZJzR cTHsRSrQmYpd8zlJxP3meFGUktNQDrjRFTx83q1mGdTi4RUrKz6k59xApL1lTXw6UFBXhoDqL3TPD T9XV0czrXDEfFqdcWpozP9bci8GU/borqPdRTv9A3oVthaPlDn9lbOJaS4WfrUs24GpPG9lQAlOl/ Ave8/vdYAla996RqtBbQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2tt-00000000vXm-0HBH; Thu, 25 Jan 2024 16:48:25 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2rR-00000000uBC-1Egm for 
linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:46:04 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4F38316F8; Thu, 25 Jan 2024 08:46:36 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 5D8F93F5A1; Thu, 25 Jan 2024 08:45:46 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 32/35] KVM: arm64: mte: Reserve tag storage for virtual machines with MTE Date: Thu, 25 Jan 2024 16:42:53 +0000 Message-Id: <20240125164256.4147-33-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084554_134590_E959316C X-CRM114-Status: GOOD ( 20.11 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: 
, Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org KVM allows MTE enabled VMs to be created when the backing VMA does not have MTE enabled. As a result, pages allocated for the virtual machine's memory won't have tag storage reserved. Try to reserve tag storage the first time the page is accessed by the guest. This is similar to how pages mapped without tag storage in an MTE VMA are handled. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * New patch. arch/arm64/include/asm/mte_tag_storage.h | 10 ++++++ arch/arm64/include/asm/pgtable.h | 7 +++- arch/arm64/kvm/mmu.c | 43 ++++++++++++++++++++++++ arch/arm64/mm/fault.c | 2 +- 4 files changed, 60 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/mte_tag_storage.h b/arch/arm64/include/asm/mte_tag_storage.h index 40590a8c3748..32940ef7bcdf 100644 --- a/arch/arm64/include/asm/mte_tag_storage.h +++ b/arch/arm64/include/asm/mte_tag_storage.h @@ -34,6 +34,8 @@ void free_tag_storage(struct page *page, int order); bool page_tag_storage_reserved(struct page *page); bool page_is_tag_storage(struct page *page); +int replace_folio_with_tagged(struct folio *folio); + vm_fault_t handle_folio_missing_tag_storage(struct folio *folio, struct vm_fault *vmf, bool *map_pte); vm_fault_t mte_try_transfer_swap_tags(swp_entry_t entry, struct page *page); @@ -67,6 +69,14 @@ static inline bool page_tag_storage_reserved(struct page *page) { return true; } +static inline bool page_is_tag_storage(struct page *page) +{ + return false; +} +static inline int replace_folio_with_tagged(struct folio *folio) +{ + return -EINVAL; +} #endif /* CONFIG_ARM64_MTE_TAG_STORAGE */ #endif /* !__ASSEMBLY__ */ diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index d0473538c926..7f89606ad617 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -1108,7 +1108,12 @@ static inline void 
arch_swap_restore(swp_entry_t entry, struct folio *folio) #define __HAVE_ARCH_FREE_PAGES_PREPARE static inline void arch_free_pages_prepare(struct page *page, int order) { - if (tag_storage_enabled() && page_mte_tagged(page)) + /* + * KVM can free a page after tag storage has been reserved and before it is + * marked as tagged, hence use page_tag_storage_reserved() instead of + * page_mte_tagged() to check for tag storage. + */ + if (tag_storage_enabled() && page_tag_storage_reserved(page)) free_tag_storage(page, order); } diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index b7517c4a19c4..986a9544228d 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1361,6 +1361,8 @@ static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn, if (!kvm_has_mte(kvm)) return; + WARN_ON_ONCE(tag_storage_enabled() && !page_tag_storage_reserved(pfn_to_page(pfn))); + for (i = 0; i < nr_pages; i++, page++) { if (try_page_mte_tagging(page)) { mte_clear_page_tags(page_address(page)); @@ -1374,6 +1376,39 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma) return vma->vm_flags & VM_MTE_ALLOWED; } +/* + * Called with an elevated reference on the pfn. If successful, the reference + * count is not changed. If it returns an error, the elevated reference is + * dropped. + */ +static int kvm_mte_reserve_tag_storage(kvm_pfn_t pfn) +{ + struct folio *folio; + int ret; + + folio = page_folio(pfn_to_page(pfn)); + + if (page_tag_storage_reserved(folio_page(folio, 0))) + return 0; + + if (page_is_tag_storage(folio_page(folio, 0))) + goto migrate; + + ret = reserve_tag_storage(folio_page(folio, 0), folio_order(folio), + GFP_HIGHUSER_MOVABLE); + if (!ret) + return 0; + +migrate: + replace_folio_with_tagged(folio); + /* + * If migration succeeds, the fault needs to be replayed because 'pfn' + * has been unmapped. If migration fails, KVM will try to reserve tag + * storage again by replaying the fault. 
+ */ + return -EAGAIN; +} + static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, struct kvm_memory_slot *memslot, unsigned long hva, bool fault_is_perm) @@ -1488,6 +1523,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL, write_fault, &writable, NULL); + if (pfn == KVM_PFN_ERR_HWPOISON) { kvm_send_hwpoison_signal(hva, vma_shift); return 0; @@ -1518,6 +1554,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (exec_fault && device) return -ENOEXEC; + if (tag_storage_enabled() && !fault_is_perm && !device && + kvm_has_mte(kvm) && mte_allowed) { + ret = kvm_mte_reserve_tag_storage(pfn); + if (ret) + return ret == -EAGAIN ? 0 : ret; + } + read_lock(&kvm->mmu_lock); pgt = vcpu->arch.hw_mmu->pgt; if (mmu_invalidate_retry(kvm, mmu_seq)) diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 01450ab91a87..5c12232bdf0b 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -974,7 +974,7 @@ void tag_clear_highpage(struct page *page) * Called with an elevated reference on the folio. * Returns with the elevated reference dropped. 
*/ -static int replace_folio_with_tagged(struct folio *folio) +int replace_folio_with_tagged(struct folio *folio) { struct migration_target_control mtc = { .nid = NUMA_NO_NODE, From patchwork Thu Jan 25 16:42:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531415 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E7E4AC47258 for ; Thu, 25 Jan 2024 16:48:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=B4+wvjxxwdVw8ivCwnkzPSZEn0rdAjs1/rj9BD6knyA=; b=IGn51n7N5JzojK Gck2dVB0ViFXxVZEKarPp5qdj/qZB/96qMApzk9YUDdF+6/6dostK4QzlNldNmY6jkMuQELDdXrAo cMUHibDf5L4X8d2sKz232ZVCT6Db9llfK+MA0JG4b9mqS2MX8Mn4PYN4CMVa+ONuhqS9IciTRU54b /kdzpLCSMmg0A4WW8/xMcbAk2oMw9PkVGThZeZLtnr2iFBrKRo2QP+ldIRRpHqEOp6H+M9qLEbpN4 Nqhd1Xr4encSg1cqMeFAk2UQ5nkjG5MgU1HEwwFw0U9p2l7sriFjjGrLr+BX2eJyz3Q59i5kU8q/7 zpY23JRS0I6UXiLRpR4A==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2tz-00000000vg4-3L8P; Thu, 25 Jan 2024 16:48:31 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2rW-00000000uHG-3QqA for linux-arm-kernel@lists.infradead.org; 
Thu, 25 Jan 2024 16:46:07 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1422C1758; Thu, 25 Jan 2024 08:46:42 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 2B5AC3F5A1; Thu, 25 Jan 2024 08:45:52 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 33/35] KVM: arm64: mte: Introduce VM_MTE_KVM VMA flag Date: Thu, 25 Jan 2024 16:42:54 +0000 Message-Id: <20240125164256.4147-34-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084559_490212_C9C34CCC X-CRM114-Status: GOOD ( 24.58 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: 
linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Tag storage pages mapped by the host in a VM with MTE enabled are migrated when they are first accessed by the guest. This introduces latency spikes for memory accesses made by the guest. Tag storage pages can be mapped in the guest memory when the VM_MTE VMA flag is not set. Introduce a new VMA flag, VM_MTE_KVM, to stop tag storage pages from being mapped in a VM with MTE enabled. The flag is different from VM_MTE, because the pages from the VMA won't be mapped as tagged in the host, and host's userspace can continue to access the guest memory as Untagged. The flag's only function is to instruct the page allocator to treat the allocation as tagged, so tag storage pages aren't used. The page allocator will also try to reserve tag storage for the new page, which can speed up stage 2 aborts further if the VMM has accessed the memory before the guest. For example, qemu and kvmtool will benefit from this change because the guest image is copied after the memslot is created. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * New patch. 
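The split between VM_MTE and VM_MTE_KVM described above can be sketched in plain C. This is an illustration under stated assumptions, not the kernel implementation: the bit positions, the gfp_sketch_t type, GFP_TAGGED_SKETCH, and both helper names are made up for the example. Only the mask test in calc_vma_gfp_sketch() follows the arch_calc_vma_gfp() change in the diff below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the real ones come from VM_HIGH_ARCH_*. */
#define VM_MTE_SKETCH		(1UL << 0)
#define VM_MTE_KVM_SKETCH	(1UL << 2)

/* Hypothetical stand-in for gfp_t and __GFP_TAGGED. */
typedef uint32_t gfp_sketch_t;
#define GFP_TAGGED_SKETCH	((gfp_sketch_t)1u << 0)

/* Either flag makes the page allocator treat the allocation as tagged. */
static gfp_sketch_t calc_vma_gfp_sketch(unsigned long vm_flags)
{
	if (vm_flags & (VM_MTE_SKETCH | VM_MTE_KVM_SKETCH))
		return GFP_TAGGED_SKETCH;
	return 0;
}

/* Only VM_MTE makes the host map the pages as tagged. */
static bool host_maps_tagged_sketch(unsigned long vm_flags)
{
	return (vm_flags & VM_MTE_SKETCH) != 0;
}
```

With only VM_MTE_KVM_SKETCH set, the allocation is treated as tagged while the host mapping stays untagged, which is the combination the commit message relies on.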
arch/arm64/kvm/mmu.c | 77 ++++++++++++++++++++++++++++++++++++++++++- arch/arm64/mm/fault.c | 2 +- include/linux/mm.h | 2 ++ 3 files changed, 79 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 986a9544228d..45c57c4b9fe2 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1420,7 +1420,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, unsigned long mmu_seq; struct kvm *kvm = vcpu->kvm; struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache; - struct vm_area_struct *vma; + struct vm_area_struct *vma, *old_vma; short vma_shift; gfn_t gfn; kvm_pfn_t pfn; @@ -1428,6 +1428,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, long vma_pagesize, fault_granule; enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R; struct kvm_pgtable *pgt; + bool vma_has_kvm_mte = false; if (fault_is_perm) fault_granule = kvm_vcpu_trap_get_perm_fault_granule(vcpu); @@ -1506,6 +1507,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, gfn = fault_ipa >> PAGE_SHIFT; mte_allowed = kvm_vma_mte_allowed(vma); + vma_has_kvm_mte = !!(vma->vm_flags & VM_MTE_KVM); + old_vma = vma; /* Don't use the VMA after the unlock -- it may have vanished */ vma = NULL; @@ -1521,6 +1524,27 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, mmu_seq = vcpu->kvm->mmu_invalidate_seq; mmap_read_unlock(current->mm); + /* + * If the VMA was created after the memslot, it doesn't have the + * VM_MTE_KVM flag set. + */ + if (unlikely(tag_storage_enabled() && !fault_is_perm && + kvm_has_mte(kvm) && mte_allowed && !vma_has_kvm_mte)) { + mmap_write_lock(current->mm); + vma = vma_lookup(current->mm, hva); + /* The VMA was changed, replay the fault. 
*/ + if (vma != old_vma) { + mmap_write_unlock(current->mm); + return 0; + } + if (!(vma->vm_flags & VM_MTE_KVM)) { + vma_start_write(vma); + vm_flags_reset(vma, vma->vm_flags | VM_MTE_KVM); + } + vma = NULL; + mmap_write_unlock(current->mm); + } + pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL, write_fault, &writable, NULL); @@ -1986,6 +2010,40 @@ int __init kvm_mmu_init(u32 *hyp_va_bits) return err; } +static int kvm_set_clear_kvm_mte_vma(const struct kvm_memory_slot *memslot, bool set) +{ + struct vm_area_struct *vma; + hva_t hva, memslot_end; + int ret = 0; + + hva = memslot->userspace_addr; + memslot_end = hva + (memslot->npages << PAGE_SHIFT); + + mmap_write_lock(current->mm); + + do { + vma = find_vma_intersection(current->mm, hva, memslot_end); + if (!vma) + break; + hva = min(memslot_end, vma->vm_end); + if (!kvm_vma_mte_allowed(vma)) + continue; + if (set) { + if (!(vma->vm_flags & VM_MTE_KVM)) { + vma_start_write(vma); + vm_flags_reset(vma, vma->vm_flags | VM_MTE_KVM); + } + } else if (vma->vm_flags & VM_MTE_KVM) { + vma_start_write(vma); + vm_flags_reset(vma, vma->vm_flags & ~VM_MTE_KVM); + } + } while (hva < memslot_end); + + mmap_write_unlock(current->mm); + + return ret; +} + void kvm_arch_commit_memory_region(struct kvm *kvm, struct kvm_memory_slot *old, const struct kvm_memory_slot *new, @@ -1993,6 +2051,23 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, { bool log_dirty_pages = new && new->flags & KVM_MEM_LOG_DIRTY_PAGES; + if (kvm_has_mte(kvm) && change != KVM_MR_FLAGS_ONLY) { + switch (change) { + case KVM_MR_CREATE: + kvm_set_clear_kvm_mte_vma(new, true); + break; + case KVM_MR_DELETE: + kvm_set_clear_kvm_mte_vma(old, false); + break; + case KVM_MR_MOVE: + kvm_set_clear_kvm_mte_vma(old, false); + kvm_set_clear_kvm_mte_vma(new, true); + break; + default: + WARN(true, "Unknown memslot change"); + } + } + /* * At this point memslot has been committed and there is an * allocated dirty_bitmap[], dirty pages will be tracked while 
the diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 5c12232bdf0b..f4ca3ba8dde7 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -947,7 +947,7 @@ NOKPROBE_SYMBOL(do_debug_exception); */ gfp_t arch_calc_vma_gfp(struct vm_area_struct *vma, gfp_t gfp) { - if (vma->vm_flags & VM_MTE) + if (vma->vm_flags & (VM_MTE | VM_MTE_KVM)) return __GFP_TAGGED; return 0; } diff --git a/include/linux/mm.h b/include/linux/mm.h index f5a97dec5169..924aa7c26ec9 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -375,9 +375,11 @@ extern unsigned int kobjsize(const void *objp); #if defined(CONFIG_ARM64_MTE) # define VM_MTE VM_HIGH_ARCH_0 /* Use Tagged memory for access control */ # define VM_MTE_ALLOWED VM_HIGH_ARCH_1 /* Tagged memory permitted */ +# define VM_MTE_KVM VM_HIGH_ARCH_2 /* VMA is mapped in a virtual machine with MTE */ #else # define VM_MTE VM_NONE # define VM_MTE_ALLOWED VM_NONE +# define VM_MTE_KVM VM_NONE #endif #ifndef VM_GROWSUP From patchwork Thu Jan 25 16:42:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531416 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C1950C47422 for ; Thu, 25 Jan 2024 16:48:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: 
Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=ywT+MeGUP8aZeuK+m0wLO3pWK3Fxe6vOBiYy6cDTnfE=; b=zyyiGNQzaiMUbC Q9FMtP1T4YMZehBYF4z38bvDmqsZ9voaLlkZutEollcSSamJ/jn034VQhqTnejtRlB6PBlDtouF3X Wvkj49aVZOzDVAoWG9OFhEeh1KNBtkc7kt1hWy5uMqI8SKQlKTp1gtQl/mm/1vhHQwdr4eicHPAqu okuRGrwZ0VCOr3F1s7zlAjt0lq5PFPacYwJqjo6ZcaASOvSt8za2n7tX0ms++/jr+ypnw8yqHdfqa MLa7qUURGtDBWp8m6D3VFv4syZ7lzeCuBPx7H2S6R24QnXyPw6gqWNT6dJrWVU9UoPdSWnazaq1Yj 12DfnA5/m/pT2ievGyXQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2u4-00000000vl7-3vo2; Thu, 25 Jan 2024 16:48:36 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2rc-00000000uN3-2Sgl for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:46:09 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id CBC93175A; Thu, 25 Jan 2024 08:46:47 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id E1F9D3F5A1; Thu, 25 Jan 2024 08:45:57 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, 
kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 34/35] arm64: mte: Enable dynamic tag storage management Date: Thu, 25 Jan 2024 16:42:55 +0000 Message-Id: <20240125164256.4147-35-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084604_937487_FC175CE8 X-CRM114-Status: UNSURE ( 8.14 ) X-CRM114-Notice: Please train this message. X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Everything is in place, enable tag storage management. 
Signed-off-by: Alexandru Elisei --- arch/arm64/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 088e30fc6d12..95c153705a2c 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -2084,7 +2084,7 @@ config ARM64_MTE if ARM64_MTE config ARM64_MTE_TAG_STORAGE - bool + bool "MTE tag storage management" select ARCH_HAS_FAULT_ON_ACCESS select CONFIG_CMA help From patchwork Thu Jan 25 16:42:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 13531417 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A0955C47258 for ; Thu, 25 Jan 2024 16:49:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=rfwc7u8GjTyhksetSCT5LSIvzBYOORbs9F6qr8CHtPQ=; b=dUGM37qedtG0b3 CdcPH+GLtwpbxRcv0HcNELbDNjorqGZH0W/nU7jbMVI98Nf9HfFEMnZ+Z5S2sOCzAGMakN52prqE6 c9K7z+YMIBeUH8ZK69xKliRUS6UjoV/ux5H73ZVde2mnRlSccE+PA10y6th+6jGHy8+inpt+SZkUr /WaDLc0wU+V4Q4d2zw+92dgPluPGzdPMXj+lfNnWFEKhhE1S3C3S5Ku1Bpste5rMGdaWzKQ02Gb6G MfYWyEYG6occXcULzgO6qgnetkNIzkd8Yt0cdWHGnXRTXjLmKnPtLHu8TCwUgh3vwHhvxGjE4dMIQ JKDU/P7MWmeqivB+kuLw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat 
Linux)) id 1rT2uC-00000000vqz-45xw; Thu, 25 Jan 2024 16:48:45 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rT2ri-00000000uSw-1Zmv for linux-arm-kernel@lists.infradead.org; Thu, 25 Jan 2024 16:46:12 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 908D41762; Thu, 25 Jan 2024 08:46:53 -0800 (PST) Received: from e121798.cable.virginm.net (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id A7DED3F5A1; Thu, 25 Jan 2024 08:46:03 -0800 (PST) From: Alexandru Elisei To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 35/35] HACK! 
arm64: dts: Add fake tag storage to fvp-base-revc.dts Date: Thu, 25 Jan 2024 16:42:56 +0000 Message-Id: <20240125164256.4147-36-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240125_084610_593304_289BFAF2 X-CRM114-Status: UNSURE ( 9.46 ) X-CRM114-Notice: Please train this message. X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Faking a tag storage region for FVP is useful for testing. Signed-off-by: Alexandru Elisei --- Changes since rfc v2: * New patch, not intended to be merged. arch/arm64/boot/dts/arm/fvp-base-revc.dts | 42 +++++++++++++++++++++-- 1 file changed, 39 insertions(+), 3 deletions(-) diff --git a/arch/arm64/boot/dts/arm/fvp-base-revc.dts b/arch/arm64/boot/dts/arm/fvp-base-revc.dts index 60472d65a355..e9f44420cb62 100644 --- a/arch/arm64/boot/dts/arm/fvp-base-revc.dts +++ b/arch/arm64/boot/dts/arm/fvp-base-revc.dts @@ -165,10 +165,30 @@ C1_L2: l2-cache1 { }; }; - memory@80000000 { + memory0: memory@80000000 { device_type = "memory"; - reg = <0x00000000 0x80000000 0 0x80000000>, - <0x00000008 0x80000000 0 0x80000000>; + reg = <0x00 0x80000000 0x00 0x80000000>; + numa-node-id = <0x00>; + }; + + /* tags0 */ + tags_memory0: memory@8f8000000 { + device_type = "memory"; + reg = <0x08 0xf8000000 0x00 0x4000000>; + numa-node-id = <0x00>; + }; + + memory1: memory@880000000 { + device_type = "memory"; + reg = <0x08 0x80000000 0x00 0x78000000>; + numa-node-id = <0x01>; + }; + + /* tags1 */ + tags_memory1: memory@8fc000000 { + device_type = "memory"; + reg = <0x08 
0xfc000000 0x00 0x3c00000>; + numa-node-id = <0x01>; }; reserved-memory { @@ -183,6 +203,22 @@ vram: vram@18000000 { reg = <0x00000000 0x18000000 0 0x00800000>; no-map; }; + + tags0: tag-storage@8f8000000 { + compatible = "arm,mte-tag-storage"; + reg = <0x08 0xf8000000 0x00 0x4000000>; + block-size = <0x1000>; + tagged-memory = <&memory0>; + reusable; + }; + + tags1: tag-storage@8fc000000 { + compatible = "arm,mte-tag-storage"; + reg = <0x08 0xfc000000 0x00 0x3c00000>; + block-size = <0x1000>; + tagged-memory = <&memory1>; + reusable; + }; }; gic: interrupt-controller@2f000000 {