From patchwork Wed Nov 24 19:20:22 2021
X-Patchwork-Submitter: Catalin Marinas
X-Patchwork-Id: 12637571
From: Catalin Marinas <catalin.marinas@arm.com>
To: Linus Torvalds, Josef Bacik, David Sterba
Cc: Andreas Gruenbacher, Al Viro, Andrew Morton, Will Deacon,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-btrfs@vger.kernel.org
Subject: [PATCH 1/3] mm: Introduce fault_in_exact_writeable() to probe for
 sub-page faults
Date: Wed, 24 Nov 2021 19:20:22 +0000
Message-Id: <20211124192024.2408218-2-catalin.marinas@arm.com>
In-Reply-To: <20211124192024.2408218-1-catalin.marinas@arm.com>
References: <20211124192024.2408218-1-catalin.marinas@arm.com>

On hardware with features like arm64 MTE or SPARC ADI, an access fault
can be triggered at sub-page granularity. Depending on how the
fault_in_*() functions are used, the caller can get into a live-lock by
continuously retrying the fault-in on an address different from the one
where the uaccess failed.

In the majority of cases progress is ensured by the following
conditions:

  1. copy_{to,from}_user() guarantees access to at least one byte if
     the user address is not faulting;

  2. fault_in_*() is attempted on the next address that could not be
     accessed by copy_*_user().

Where the above conditions are not met, or the fault-in/uaccess loop
has no mechanism to bail out, fault_in_exact_writeable() ensures that
the arch code probes the range in question at sub-page fault
granularity (e.g. 16 bytes for arm64 MTE). For large ranges, this is
significantly more expensive than the non-exact variants, which probe a
single byte per page or use GUP.

The architecture code has to select ARCH_HAS_SUBPAGE_FAULTS and
implement probe_user_writable().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
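Note: the sketch below is for illustration only and is not part of the
patch; it shows the kind of fault-in/uaccess retry loop this helper
targets. do_copy_out() stands in for a copy_to_user_nofault()-style
operation returning the number of bytes it could not write; both it and
the calling convention are hypothetical.

	static int copy_out_retry(char __user *ubuf, const char *kbuf,
				  size_t size)
	{
		while (1) {
			/*
			 * fault_in_writeable() probes one byte per page, so
			 * an MTE tag mismatch in a later 16-byte granule of
			 * an already-probed page would make the copy fault
			 * again and again: a live-lock.
			 * fault_in_exact_writeable() probes every sub-page
			 * granule, so a faulting buffer is reported as
			 * -EFAULT instead.
			 */
			if (fault_in_exact_writeable(ubuf, size))
				return -EFAULT;

			if (!do_copy_out(ubuf, kbuf, size))	/* hypothetical */
				return 0;
			/* the copy faulted part-way; retry from the start */
		}
	}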
 arch/Kconfig            |  7 +++++++
 include/linux/pagemap.h |  1 +
 include/linux/uaccess.h | 21 +++++++++++++++++++++
 mm/gup.c                | 19 +++++++++++++++++++
 4 files changed, 48 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 26b8ed11639d..02502b3362aa 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -27,6 +27,13 @@ config HAVE_IMA_KEXEC
 config SET_FS
 	bool
 
+config ARCH_HAS_SUBPAGE_FAULTS
+	bool
+	help
+	  Select if the architecture can check permissions at sub-page
+	  granularity (e.g. arm64 MTE). The probe_user_*() functions
+	  must be implemented.
+
 config HOTPLUG_SMT
 	bool
 
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 1a0c646eb6ff..4bae32d6b2e3 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -910,6 +910,7 @@ void folio_add_wait_queue(struct folio *folio, wait_queue_entry_t *waiter);
  * Fault in userspace address range.
  */
 size_t fault_in_writeable(char __user *uaddr, size_t size);
+size_t fault_in_exact_writeable(char __user *uaddr, size_t size);
 size_t fault_in_safe_writeable(const char __user *uaddr, size_t size);
 size_t fault_in_readable(const char __user *uaddr, size_t size);
 
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index ac0394087f7d..08169fb38905 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -271,6 +271,27 @@ static inline bool pagefault_disabled(void)
  */
 #define faulthandler_disabled() (pagefault_disabled() || in_atomic())
 
+#ifndef CONFIG_ARCH_HAS_SUBPAGE_FAULTS
+/**
+ * probe_user_writable - probe for sub-page faults in the user range
+ * @uaddr: start of address range
+ * @size: size of address range
+ *
+ * Returns the number of bytes not accessible (like copy_to_user() and
+ * copy_from_user()).
+ *
+ * Architectures that can generate sub-page faults (e.g. arm64 MTE) should
+ * implement this function. It is expected that the caller checked for the
+ * write permission of each page in the range either by put_user() or GUP.
+ * The architecture port can implement a more efficient get_user() probing of
+ * the range if sub-page faults are triggered by either a load or store.
+ */
+static inline size_t probe_user_writable(void __user *uaddr, size_t size)
+{
+	return 0;
+}
+#endif
+
 #ifndef ARCH_HAS_NOCACHE_UACCESS
 
 static inline __must_check unsigned long
diff --git a/mm/gup.c b/mm/gup.c
index 2c51e9748a6a..1c360d9fdc8e 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1693,6 +1693,25 @@ size_t fault_in_writeable(char __user *uaddr, size_t size)
 }
 EXPORT_SYMBOL(fault_in_writeable);
 
+/**
+ * fault_in_exact_writeable - fault in userspace address range for writing,
+ *			      potentially checking for sub-page faults
+ * @uaddr: start of address range
+ * @size: size of address range
+ *
+ * Returns the number of bytes not faulted in (like copy_to_user() and
+ * copy_from_user()).
+ */
+size_t fault_in_exact_writeable(char __user *uaddr, size_t size)
+{
+	size_t accessible = size - fault_in_writeable(uaddr, size);
+
+	if (accessible)
+		accessible -= probe_user_writable(uaddr, accessible);
+	return size - accessible;
+}
+EXPORT_SYMBOL(fault_in_exact_writeable);
+
 /*
  * fault_in_safe_writeable - fault in an address range for writing
  * @uaddr: start of address range

From patchwork Wed Nov 24 19:20:23 2021
X-Patchwork-Submitter: Catalin Marinas
X-Patchwork-Id: 12637573
From: Catalin Marinas <catalin.marinas@arm.com>
To: Linus Torvalds, Josef Bacik, David Sterba
Cc: Andreas Gruenbacher, Al Viro, Andrew Morton, Will Deacon,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-btrfs@vger.kernel.org
Subject: [PATCH 2/3] arm64: Add support for user sub-page fault probing
Date: Wed, 24 Nov 2021 19:20:23 +0000
Message-Id: <20211124192024.2408218-3-catalin.marinas@arm.com>
In-Reply-To: <20211124192024.2408218-1-catalin.marinas@arm.com>
References: <20211124192024.2408218-1-catalin.marinas@arm.com>

With MTE, even if the pte allows an access, a mismatched tag somewhere
within a page can still cause a fault. Select ARCH_HAS_SUBPAGE_FAULTS
if MTE is enabled and implement probe_user_writable().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
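Note: illustration only, not part of the patch. A userspace mock-up of
which 16-byte granules the probe below walks for an unaligned range,
using plain pointers instead of __user ones; the function and numbers
are made up.

	#include <stdint.h>
	#include <stdio.h>

	#define MTE_GRANULE_SIZE	16UL

	static void show_probed_granules(uintptr_t uaddr, size_t size)
	{
		uintptr_t end = uaddr + size;

		/* align down first, as the probe does with PTR_ALIGN_DOWN() */
		for (uaddr &= ~(MTE_GRANULE_SIZE - 1); uaddr < end;
		     uaddr += MTE_GRANULE_SIZE)
			printf("probe tag at 0x%lx\n", (unsigned long)uaddr);
	}

	int main(void)
	{
		/* 40 bytes at 0x1008 probe granules 0x1000, 0x1010, 0x1020 */
		show_probed_granules(0x1008, 40);
		return 0;
	}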
 arch/arm64/Kconfig               |  1 +
 arch/arm64/include/asm/uaccess.h | 33 ++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c4207cf9bb17..dff89fd0d817 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1777,6 +1777,7 @@ config ARM64_MTE
 	depends on AS_HAS_LSE_ATOMICS
 	# Required for tag checking in the uaccess routines
 	depends on ARM64_PAN
+	select ARCH_HAS_SUBPAGE_FAULTS
 	select ARCH_USES_HIGH_VMA_FLAGS
 	help
 	  Memory Tagging (part of the ARMv8.5 Extensions) provides
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 6e2e0b7031ab..4bf947e9f9bf 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -445,4 +445,37 @@ static inline int __copy_from_user_flushcache(void *dst, const void __user *src,
 }
 #endif
 
+#ifdef CONFIG_ARCH_HAS_SUBPAGE_FAULTS
+static inline size_t __mte_probe_user_range(const char __user *uaddr,
+					    size_t size)
+{
+	const char __user *end = uaddr + size;
+	int err = 0;
+	char val;
+
+	uaddr = PTR_ALIGN_DOWN(uaddr, MTE_GRANULE_SIZE);
+	while (uaddr < end) {
+		/*
+		 * A read is sufficient for MTE; the caller should have
+		 * probed for the pte write permission.
+		 */
+		__raw_get_user(val, uaddr, err);
+		if (err)
+			return end - uaddr;
+		uaddr += MTE_GRANULE_SIZE;
+	}
+	(void)val;
+
+	return 0;
+}
+
+static inline size_t probe_user_writable(const void __user *uaddr,
+					 size_t size)
+{
+	if (!system_supports_mte())
+		return 0;
+	return __mte_probe_user_range(uaddr, size);
+}
+#endif /* CONFIG_ARCH_HAS_SUBPAGE_FAULTS */
+
 #endif /* __ASM_UACCESS_H */

From patchwork Wed Nov 24 19:20:24 2021
X-Patchwork-Submitter: Catalin Marinas
X-Patchwork-Id: 12637577
From: Catalin Marinas <catalin.marinas@arm.com>
To: Linus Torvalds, Josef Bacik, David Sterba
Cc: Andreas Gruenbacher, Al Viro, Andrew Morton, Will Deacon,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-btrfs@vger.kernel.org
Subject: [PATCH 3/3] btrfs: Avoid live-lock in search_ioctl() on hardware
 with sub-page faults
Date: Wed, 24 Nov 2021 19:20:24 +0000
Message-Id: <20211124192024.2408218-4-catalin.marinas@arm.com>
In-Reply-To: <20211124192024.2408218-1-catalin.marinas@arm.com>
References: <20211124192024.2408218-1-catalin.marinas@arm.com>

Commit a48b73eca4ce ("btrfs: fix potential deadlock in the search
ioctl") addressed a lockdep warning by pre-faulting the user pages and
attempting the copy_to_user_nofault() in an infinite loop. On
architectures like arm64 with MTE, an access may fault within a page at
a location different from the one that fault_in_writeable() probed.
Since sk_offset is rewound to the previous struct
btrfs_ioctl_search_header boundary, there is no guaranteed forward
progress and search_ioctl() may live-lock.

Use fault_in_exact_writeable() instead, which probes the entire user
buffer for faults at sub-page granularity.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Al Viro
Acked-by: David Sterba
---
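Note: a walk-through of the failure mode, for illustration only; the
offset is made up.

	/*
	 * Suppose the user page at ubuf + sk_offset has a matching MTE
	 * tag at its first byte but a mismatched tag 48 bytes in:
	 *
	 *   fault_in_writeable(ubuf + sk_offset, ...) - probes one byte
	 *     per page, hits the matching tag, reports success;
	 *   copy_to_user_nofault(...)                 - faults on the
	 *     mismatched granule inside the same page;
	 *   sk_offset is rewound to the header start  - the next pass
	 *     probes and faults at exactly the same addresses.
	 *
	 * fault_in_exact_writeable() probes every 16-byte granule of the
	 * remaining buffer, detects the mismatched tag and breaks out of
	 * the loop with -EFAULT instead of spinning.
	 */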
 fs/btrfs/ioctl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 92138ac2a4e2..23167c72fa47 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2223,7 +2223,8 @@ static noinline int search_ioctl(struct inode *inode,
 	while (1) {
 		ret = -EFAULT;
-		if (fault_in_writeable(ubuf + sk_offset, *buf_size - sk_offset))
+		if (fault_in_exact_writeable(ubuf + sk_offset,
+					     *buf_size - sk_offset))
 			break;
 
 		ret = btrfs_search_forward(root, &key, path, sk->min_transid);