| Message ID | 20250308023314.3981455-1-pcc@google.com (mailing list archive) |
| --- | --- |
| State | New |
| Series | string: Disable read_word_at_a_time() optimizations if kernel MTE is enabled |
On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> The optimized strscpy() and dentry_string_cmp() routines will read 8
> unaligned bytes at a time via the function read_word_at_a_time(), but
> this is incompatible with MTE which will fault on a partially invalid
> read. The attributes on read_word_at_a_time() that disable KASAN are
> invisible to the CPU so they have no effect on MTE. Let's fix the
> bug for now by disabling the optimizations if the kernel is built
> with HW tag-based KASAN and consider improvements for followup changes.

Why is faulting on a partially invalid read a problem? It's still
invalid, so ... it should fault, yes? What am I missing?

> Signed-off-by: Peter Collingbourne <pcc@google.com>
> Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> Cc: stable@vger.kernel.org
> ---
>  fs/dcache.c  | 2 +-
>  lib/string.c | 3 ++-
>  2 files changed, 3 insertions(+), 2 deletions(-)

Why are DCACHE_WORD_ACCESS and HAVE_EFFICIENT_UNALIGNED_ACCESS separate
things? I can see at least one place where it's directly tied:

arch/arm/Kconfig:58:	select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS

Would it make sense to sort this out so that KASAN_HW_TAGS can be taken
into account at the Kconfig level instead?

> diff --git a/fs/dcache.c b/fs/dcache.c
> index e3634916ffb93..71f0830ac5e69 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -223,7 +223,7 @@ fs_initcall(init_fs_dcache_sysctls);
>   * Compare 2 name strings, return 0 if they match, otherwise non-zero.
>   * The strings are both count bytes long, and count is non-zero.
>   */
> -#ifdef CONFIG_DCACHE_WORD_ACCESS
> +#if defined(CONFIG_DCACHE_WORD_ACCESS) && !defined(CONFIG_KASAN_HW_TAGS)

Why not also the word_at_a_time use in fs/namei.c and lib/siphash.c?

For reference, here are the DCACHE_WORD_ACCESS places:

arch/arm/Kconfig:58:		select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/arm64/Kconfig:137:		select DCACHE_WORD_ACCESS
arch/powerpc/Kconfig:192:	select DCACHE_WORD_ACCESS if PPC64 && CPU_LITTLE_ENDIAN
arch/riscv/Kconfig:934:		select DCACHE_WORD_ACCESS if MMU
arch/s390/Kconfig:154:		select DCACHE_WORD_ACCESS if !KMSAN
arch/x86/Kconfig:160:		select DCACHE_WORD_ACCESS if !KMSAN
arch/x86/um/Kconfig:12:		select DCACHE_WORD_ACCESS

>
>  #include <asm/word-at-a-time.h>
>  /*
> diff --git a/lib/string.c b/lib/string.c
> index eb4486ed40d25..9a43a3824d0d7 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -119,7 +119,8 @@ ssize_t sized_strscpy(char *dest, const char *src, size_t count)
>  	if (count == 0 || WARN_ON_ONCE(count > INT_MAX))
>  		return -E2BIG;
>
> -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && \
> +	!defined(CONFIG_KASAN_HW_TAGS)

There are lots more places checking CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS...
Why only here? And here are the Kconfigs, since I was comparing these
against DCACHE_WORD_ACCESS:

arch/arc/Kconfig:352:		select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/arm/Kconfig:107:		select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
arch/arm64/Kconfig:222:		select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/loongarch/Kconfig:140:	select HAVE_EFFICIENT_UNALIGNED_ACCESS if !ARCH_STRICT_ALIGN
arch/m68k/Kconfig:33:		select HAVE_EFFICIENT_UNALIGNED_ACCESS if !CPU_HAS_NO_UNALIGNED
arch/powerpc/Kconfig:246:	select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/riscv/Kconfig:935:		select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/s390/Kconfig:197:		select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/x86/Kconfig:238:		select HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/x86/um/Kconfig:13:		select HAVE_EFFICIENT_UNALIGNED_ACCESS

>  	/*
>  	 * If src is unaligned, don't cross a page boundary,
>  	 * since we don't know if the next page is mapped.
> --
> 2.49.0.rc0.332.g42c0ae87b1-goog

-Kees
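(Archive note: the "partially invalid read" in the commit message comes from MTE tagging memory in 16-byte granules, so an 8-byte probe that starts inside a correctly-tagged allocation can still spill into the next, differently-tagged granule. The sketch below is a userspace model of that straddle condition, not actual CPU behavior; the function name and parameters are illustrative only.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MTE_GRANULE_SIZE 16	/* MTE tags cover 16-byte granules */

/*
 * Model: an 8-byte load tag-checks every granule it touches. off is the
 * byte offset of the load within a granule-aligned allocation whose tag
 * covers tagged_bytes bytes (the allocation size rounded up to a
 * granule). The load faults iff it touches a granule past that.
 */
static bool word_read_tag_faults(uint64_t off, uint64_t tagged_bytes)
{
	uint64_t last_tagged_granule = (tagged_bytes - 1) / MTE_GRANULE_SIZE;
	uint64_t last_touched_granule = (off + 7) / MTE_GRANULE_SIZE;

	return last_touched_granule > last_tagged_granule;
}
```

So a read that is "partially invalid" in this sense (some bytes inside, some outside the tagged region) faults even though it starts at valid bytes, which is exactly what the word-at-a-time string probes do.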
On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> The optimized strscpy() and dentry_string_cmp() routines will read 8
> unaligned bytes at a time via the function read_word_at_a_time(), but
> this is incompatible with MTE which will fault on a partially invalid
> read. The attributes on read_word_at_a_time() that disable KASAN are
> invisible to the CPU so they have no effect on MTE. Let's fix the
> bug for now by disabling the optimizations if the kernel is built
> with HW tag-based KASAN and consider improvements for followup changes.
>
> Signed-off-by: Peter Collingbourne <pcc@google.com>
> Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> Cc: stable@vger.kernel.org

Some time ago Vincenzo had an attempt at fixing this but neither of us
got around to posting it. It's on top of 6.2 and I'm not sure how
cleanly it would rebase:

git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux devel/mte-strscpy

Feel free to cherry-pick patches from the above, rewrite them, etc.

> diff --git a/lib/string.c b/lib/string.c
> index eb4486ed40d25..9a43a3824d0d7 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -119,7 +119,8 @@ ssize_t sized_strscpy(char *dest, const char *src, size_t count)
>  	if (count == 0 || WARN_ON_ONCE(count > INT_MAX))
>  		return -E2BIG;
>
> -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && \
> +	!defined(CONFIG_KASAN_HW_TAGS)

Assuming that no-one wants to ever use KASAN_HW_TAGS=y in production,
this patch would do. Otherwise I'd rather use TCO around the access as
per the last patch from Vincenzo above.

Yet another option - use load_unaligned_zeropad() instead of
read_word_at_a_time(); I'm not sure it changes the semantics of
strscpy() in any way. This can be done in the arch code.
On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > unaligned bytes at a time via the function read_word_at_a_time(), but
> > this is incompatible with MTE which will fault on a partially invalid
> > read. The attributes on read_word_at_a_time() that disable KASAN are
> > invisible to the CPU so they have no effect on MTE. Let's fix the
> > bug for now by disabling the optimizations if the kernel is built
> > with HW tag-based KASAN and consider improvements for followup changes.
>
> Why is faulting on a partially invalid read a problem? It's still
> invalid, so ... it should fault, yes? What am I missing?

read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
beyond the end of the string. The has_zero() function is then used to
check where the string ends. For this use, I think we can go with
load_unaligned_zeropad() which handles a potential fault and pads the
rest with zeroes.

> > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> > Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> > Cc: stable@vger.kernel.org
> > ---
> >  fs/dcache.c  | 2 +-
> >  lib/string.c | 3 ++-
> >  2 files changed, 3 insertions(+), 2 deletions(-)
>
> Why are DCACHE_WORD_ACCESS and HAVE_EFFICIENT_UNALIGNED_ACCESS separate
> things? I can see at least one place where it's directly tied:
>
> arch/arm/Kconfig:58:	select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS

DCACHE_WORD_ACCESS requires load_unaligned_zeropad() which handles the
faults. For some reason, read_word_at_a_time() doesn't expect to fault
and it is only used with HAVE_EFFICIENT_UNALIGNED_ACCESS. I guess arm32
only enabled load_unaligned_zeropad() on hardware that supports
efficient unaligned accesses (v6 onwards), hence the dependency.

> Would it make sense to sort this out so that KASAN_HW_TAGS can be taken
> into account at the Kconfig level instead?

I don't think we should play with config options but rather sort out the
fault path (load_unaligned_zeropad) or disable MTE temporarily. I'd go
with the former as long as read_word_at_a_time() is only used for
strings in conjunction with has_zero(). I haven't checked.
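(Archive note: the read_word_at_a_time()/has_zero() pattern Catalin describes can be sketched in plain C. The constants below match the generic has_zero() helpers in include/asm-generic/word-at-a-time.h; everything else is an illustrative standalone model that assumes a little-endian host and uses memcpy for the unaligned loads, so unlike the kernel code it must be given a buffer with 8 readable bytes per probe.)

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  0x0101010101010101UL	/* REPEAT_BYTE(0x01) */
#define HIGHS 0x8080808080808080UL	/* REPEAT_BYTE(0x80) */

/* Nonzero iff some byte of v is 0x00: the classic has_zero() bit trick. */
static uint64_t has_zero(uint64_t v)
{
	return (v - ONES) & ~v & HIGHS;
}

/* strlen() built on 8-byte probes; s must have 8 readable bytes per probe. */
static size_t wordwise_strlen(const char *s)
{
	size_t len = 0;

	for (;;) {
		uint64_t v;

		/* stand-in for read_word_at_a_time(): possibly unaligned */
		memcpy(&v, s + len, sizeof(v));
		if (has_zero(v)) {
			/* a NUL is somewhere in these 8 bytes; find it */
			while (s[len])
				len++;
			return len;
		}
		len += 8;
	}
}
```

In-kernel, the final step uses find_zero() on the mask rather than a byte loop; the byte loop just keeps the model short. The key point for this thread is the memcpy line: the real code issues one 8-byte load that may extend past the end of the string's allocation, which is what trips MTE.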
On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > this is incompatible with MTE which will fault on a partially invalid
> > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > bug for now by disabling the optimizations if the kernel is built
> > > with HW tag-based KASAN and consider improvements for followup changes.
> >
> > Why is faulting on a partially invalid read a problem? It's still
> > invalid, so ... it should fault, yes? What am I missing?
>
> read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> beyond the end of the string. The has_zero() function is then used to
> check where the string ends. For this use, I think we can go with
> load_unaligned_zeropad() which handles a potential fault and pads the
> rest with zeroes.

Agh, right, I keep forgetting that this can read past the end of the
actual allocation. I'd agree, load_unaligned_zeropad() makes sense
there.

> > > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > > Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> > > Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  fs/dcache.c  | 2 +-
> > >  lib/string.c | 3 ++-
> > >  2 files changed, 3 insertions(+), 2 deletions(-)
> >
> > Why are DCACHE_WORD_ACCESS and HAVE_EFFICIENT_UNALIGNED_ACCESS separate
> > things? I can see at least one place where it's directly tied:
> >
> > arch/arm/Kconfig:58:	select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS
>
> DCACHE_WORD_ACCESS requires load_unaligned_zeropad() which handles the
> faults. For some reason, read_word_at_a_time() doesn't expect to fault
> and it is only used with HAVE_EFFICIENT_UNALIGNED_ACCESS. I guess arm32
> only enabled load_unaligned_zeropad() on hardware that supports
> efficient unaligned accesses (v6 onwards), hence the dependency.
>
> > Would it make sense to sort this out so that KASAN_HW_TAGS can be taken
> > into account at the Kconfig level instead?
>
> I don't think we should play with config options but rather sort out the
> fault path (load_unaligned_zeropad) or disable MTE temporarily. I'd go
> with the former as long as read_word_at_a_time() is only used for
> strings in conjunction with has_zero(). I haven't checked.

Okay, sounds good. (And with a mild thread-merge: yes, folks want to use
KASAN_HW_TAGS=y in production.)
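(Archive note: the load_unaligned_zeropad() contract being agreed on here, that bytes up to an unreadable boundary are returned as-is and the remainder reads back as zeroes, can be modeled in userspace. The `page_end` parameter is an artifact of this model; the real arm64 version does a plain load and relies on an exception fixup handler instead. Little-endian byte order is assumed, as on arm64 Linux.)

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Model of load_unaligned_zeropad(addr): read 8 bytes from addr, but if
 * the read would cross page_end, return only the readable bytes with
 * the rest zero-padded (little-endian, so the padding lands in the
 * high-order bytes).
 */
static uint64_t zeropad_load(const char *addr, const char *page_end)
{
	uint64_t v = 0;
	size_t n = sizeof(v);

	if (addr + sizeof(v) > page_end)
		n = page_end - addr;	/* only this many bytes are readable */
	memcpy(&v, addr, n);		/* the remaining high bytes stay zero */
	return v;
}
```

The zero padding is what makes this safe for string scanning: a has_zero()-style check sees the padding as a NUL, so the scan terminates without ever depending on the unreadable bytes.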
On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > this is incompatible with MTE which will fault on a partially invalid
> > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > bug for now by disabling the optimizations if the kernel is built
> > > with HW tag-based KASAN and consider improvements for followup changes.
> >
> > Why is faulting on a partially invalid read a problem? It's still
> > invalid, so ... it should fault, yes? What am I missing?
>
> read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> beyond the end of the string. The has_zero() function is then used to
> check where the string ends. For this use, I think we can go with
> load_unaligned_zeropad() which handles a potential fault and pads the
> rest with zeroes.

If we only care about synchronous and asymmetric modes, that should be
possible, but that won't work in asynchronous mode. In asynchronous mode
the fault will accumulate into TFSR and will be detected later
asynchronously, where it cannot be related to its source and fixed up.

That means that both read_word_at_a_time() and load_unaligned_zeropad()
are dodgy in async mode.

Can we somehow hang this off ARCH_HAS_SUBPAGE_FAULTS?

... and is there anything else that deliberately makes accesses that
could straddle objects?

Mark.
On Mon, Mar 10, 2025 at 06:13:58PM +0000, Mark Rutland wrote:
> On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> > On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > > this is incompatible with MTE which will fault on a partially invalid
> > > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > > bug for now by disabling the optimizations if the kernel is built
> > > > with HW tag-based KASAN and consider improvements for followup changes.
> > >
> > > Why is faulting on a partially invalid read a problem? It's still
> > > invalid, so ... it should fault, yes? What am I missing?
> >
> > read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> > beyond the end of the string. The has_zero() function is then used to
> > check where the string ends. For this use, I think we can go with
> > load_unaligned_zeropad() which handles a potential fault and pads the
> > rest with zeroes.
>
> If we only care about synchronous and asymmetric modes, that should be
> possible, but that won't work in asynchronous mode. In asynchronous mode
> the fault will accumulate into TFSR and will be detected later
> asynchronously, where it cannot be related to its source and fixed up.
>
> That means that both read_word_at_a_time() and load_unaligned_zeropad()
> are dodgy in async mode.

load_unaligned_zeropad() has a __mte_enable_tco_async() call to set
PSTATE.TCO if in async mode, so that's covered. read_word_at_a_time() is
indeed busted and I've had Vincenzo's patches for a couple of years
already, they just never made it to the list.

> Can we somehow hang this off ARCH_HAS_SUBPAGE_FAULTS?

We could, though that was mostly for user-space faults, while in-kernel
we'd only need something similar if KASAN_HW_TAGS.

> ... and is there anything else that deliberately makes accesses that
> could straddle objects?

So far we only came across load_unaligned_zeropad() and
read_word_at_a_time(). I'm not aware of anything else.
On Mon, Mar 10, 2025 at 06:40:11PM +0000, Catalin Marinas wrote:
> On Mon, Mar 10, 2025 at 06:13:58PM +0000, Mark Rutland wrote:
> > On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> > > On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > > > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > > > this is incompatible with MTE which will fault on a partially invalid
> > > > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > > > bug for now by disabling the optimizations if the kernel is built
> > > > > with HW tag-based KASAN and consider improvements for followup changes.
> > > >
> > > > Why is faulting on a partially invalid read a problem? It's still
> > > > invalid, so ... it should fault, yes? What am I missing?
> > >
> > > read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> > > beyond the end of the string. The has_zero() function is then used to
> > > check where the string ends. For this use, I think we can go with
> > > load_unaligned_zeropad() which handles a potential fault and pads the
> > > rest with zeroes.
> >
> > If we only care about synchronous and asymmetric modes, that should be
> > possible, but that won't work in asynchronous mode. In asynchronous mode
> > the fault will accumulate into TFSR and will be detected later
> > asynchronously, where it cannot be related to its source and fixed up.
> >
> > That means that both read_word_at_a_time() and load_unaligned_zeropad()
> > are dodgy in async mode.
>
> load_unaligned_zeropad() has a __mte_enable_tco_async() call to set
> PSTATE.TCO if in async mode, so that's covered. read_word_at_a_time() is
> indeed busted and I've had Vincenzo's patches for a couple of years
> already, they just never made it to the list.

Sorry, I missed the __mte_{enable,disable}_tco_async() calls. So long as
we're happy to omit the check in that case, that's fine.

I was worried that ex_handler_load_unaligned_zeropad() might not do the
right thing in response to a tag check fault (e.g. access the wrong 8
bytes), but it looks as though that's ok due to the way it generates the
offset and the aligned pointer.

If load_unaligned_zeropad() is handed a string that starts with an
unexpected tag (and even if that starts off aligned),
ex_handler_load_unaligned_zeropad() will access that and cause another
tag check fault, which will be reported.

Mark.
On Mon, Mar 10, 2025 at 07:37:32PM +0000, Mark Rutland wrote:
> On Mon, Mar 10, 2025 at 06:40:11PM +0000, Catalin Marinas wrote:
> > On Mon, Mar 10, 2025 at 06:13:58PM +0000, Mark Rutland wrote:
> > > On Mon, Mar 10, 2025 at 05:37:50PM +0000, Catalin Marinas wrote:
> > > > On Fri, Mar 07, 2025 at 07:36:31PM -0800, Kees Cook wrote:
> > > > > On Fri, Mar 07, 2025 at 06:33:13PM -0800, Peter Collingbourne wrote:
> > > > > > The optimized strscpy() and dentry_string_cmp() routines will read 8
> > > > > > unaligned bytes at a time via the function read_word_at_a_time(), but
> > > > > > this is incompatible with MTE which will fault on a partially invalid
> > > > > > read. The attributes on read_word_at_a_time() that disable KASAN are
> > > > > > invisible to the CPU so they have no effect on MTE. Let's fix the
> > > > > > bug for now by disabling the optimizations if the kernel is built
> > > > > > with HW tag-based KASAN and consider improvements for followup changes.
> > > > >
> > > > > Why is faulting on a partially invalid read a problem? It's still
> > > > > invalid, so ... it should fault, yes? What am I missing?
> > > >
> > > > read_word_at_a_time() is used to read 8 bytes, potentially unaligned and
> > > > beyond the end of the string. The has_zero() function is then used to
> > > > check where the string ends. For this use, I think we can go with
> > > > load_unaligned_zeropad() which handles a potential fault and pads the
> > > > rest with zeroes.
> > >
> > > If we only care about synchronous and asymmetric modes, that should be
> > > possible, but that won't work in asynchronous mode. In asynchronous mode
> > > the fault will accumulate into TFSR and will be detected later
> > > asynchronously, where it cannot be related to its source and fixed up.
> > >
> > > That means that both read_word_at_a_time() and load_unaligned_zeropad()
> > > are dodgy in async mode.
> >
> > load_unaligned_zeropad() has a __mte_enable_tco_async() call to set
> > PSTATE.TCO if in async mode, so that's covered. read_word_at_a_time() is
> > indeed busted and I've had Vincenzo's patches for a couple of years
> > already, they just never made it to the list.
>
> Sorry, I missed the __mte_{enable,disable}_tco_async() calls. So long as
> we're happy to omit the check in that case, that's fine.

That was the easiest. Alternatively we can try to sync the TFSR before
and after the load but, with the ISBs, that's too expensive. We could
also do a dummy one-byte load before setting TCO.
read_word_at_a_time() does have an explicit kasan_check_read() but last
time I checked it had no effect on MTE.

> I was worried that ex_handler_load_unaligned_zeropad() might not do the
> right thing in response to a tag check fault (e.g. access the wrong 8
> bytes), but it looks as though that's ok due to the way it generates the
> offset and the aligned pointer.
>
> If load_unaligned_zeropad() is handed a string that starts with an
> unexpected tag (and even if that starts off aligned),
> ex_handler_load_unaligned_zeropad() will access that and cause another
> tag check fault, which will be reported.

Yes, it will report an async tag check fault on the
exit_to_kernel_mode() path _if_ load_unaligned_zeropad() triggered the
fault for other reasons (end of page). It's slightly inconsistent; we
could set TCO for the async case in ex_handler_load_unaligned_zeropad()
as well.

For sync checks, we'd get the first fault ending up in
ex_handler_load_unaligned_zeropad() and a second tag check fault while
processing the first. This ends up in do_tag_recovery and we disable
tag checking after the report. Not ideal but not that bad. We could
adjust ex_handler_load_unaligned_zeropad() to return false if the
pointer is already aligned, but we need to check the semantics of
load_unaligned_zeropad(): is it allowed to fault on the first byte?
On Tue, Mar 11, 2025 at 11:45:21AM +0000, Catalin Marinas wrote:
> On Mon, Mar 10, 2025 at 07:37:32PM +0000, Mark Rutland wrote:
> > I was worried that ex_handler_load_unaligned_zeropad() might not do the
> > right thing in response to a tag check fault (e.g. access the wrong 8
> > bytes), but it looks as though that's ok due to the way it generates the
> > offset and the aligned pointer.
> >
> > If load_unaligned_zeropad() is handed a string that starts with an
> > unexpected tag (and even if that starts off aligned),
> > ex_handler_load_unaligned_zeropad() will access that and cause another
> > tag check fault, which will be reported.
>
> Yes, it will report an async tag check fault on the
> exit_to_kernel_mode() path _if_ load_unaligned_zeropad() triggered the
> fault for other reasons (end of page).

Sorry, yes. The aligned case I mentioned shouldn't apply here.

> It's slightly inconsistent; we could set TCO for the async case in
> ex_handler_load_unaligned_zeropad() as well.

Yep, I think that'd be necessary for async mode.

> For sync checks, we'd get the first fault ending up in
> ex_handler_load_unaligned_zeropad() and a second tag check fault while
> processing the first. This ends up in do_tag_recovery and we disable
> tag checking after the report. Not ideal but not that bad.

Yep; that's what I was describing in the second paragraph above, though
I forgot to say that was assuming sync or asymm mode.

> We could adjust ex_handler_load_unaligned_zeropad() to return false if
> the pointer is already aligned, but we need to check the semantics of
> load_unaligned_zeropad(): is it allowed to fault on the first byte?

IIUC today it's only expected to fault due to misalignment, and the
general expectation is that for a sequence of load_unaligned_zeropad()
calls, we should get at least one byte without faulting (for the NUL
terminator).

I reckon it'd be better to figure this out based on the ESR if possible.
Kristina's patches for MOPS would give us that.

Mark.
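(Archive note: the fixup arithmetic referred to above, "the way it generates the offset and the aligned pointer", can be sketched as follows. This is a simplified userspace model of the recovery done by arm64's ex_handler_load_unaligned_zeropad(), assuming a little-endian host and that the aligned load succeeds; the real handler works on trapped register state, not a C pointer.)

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Model of the exception-fixup path: a faulting unaligned 8-byte load
 * at addr is replayed as an aligned load of the word containing addr,
 * then shifted so the bytes below addr are discarded and the bytes
 * that fell on the unreadable page read back as zeroes (little-endian).
 * The aligned load cannot cross a page boundary, which is why it is
 * safe when the original unaligned load was not.
 */
static uint64_t fixup_zeropad(const char *addr)
{
	uintptr_t p = (uintptr_t)addr;
	uintptr_t offset = p & 7;	/* how far addr is into its word */
	const char *aligned = (const char *)(p - offset);
	uint64_t data;

	memcpy(&data, aligned, sizeof(data));
	return data >> (offset * 8);	/* vacated upper bytes become zero */
}
```

This also shows why a fault on the very first byte cannot be fixed up this way: if addr itself is bad, the replayed aligned load touches the same bad granule and simply faults again, which is the second tag check fault discussed above.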
diff --git a/fs/dcache.c b/fs/dcache.c
index e3634916ffb93..71f0830ac5e69 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -223,7 +223,7 @@ fs_initcall(init_fs_dcache_sysctls);
  * Compare 2 name strings, return 0 if they match, otherwise non-zero.
  * The strings are both count bytes long, and count is non-zero.
  */
-#ifdef CONFIG_DCACHE_WORD_ACCESS
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && !defined(CONFIG_KASAN_HW_TAGS)
 
 #include <asm/word-at-a-time.h>
 /*
diff --git a/lib/string.c b/lib/string.c
index eb4486ed40d25..9a43a3824d0d7 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -119,7 +119,8 @@ ssize_t sized_strscpy(char *dest, const char *src, size_t count)
 	if (count == 0 || WARN_ON_ONCE(count > INT_MAX))
 		return -E2BIG;
 
-#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && \
+	!defined(CONFIG_KASAN_HW_TAGS)
 	/*
 	 * If src is unaligned, don't cross a page boundary,
 	 * since we don't know if the next page is mapped.
The optimized strscpy() and dentry_string_cmp() routines will read 8
unaligned bytes at a time via the function read_word_at_a_time(), but
this is incompatible with MTE which will fault on a partially invalid
read. The attributes on read_word_at_a_time() that disable KASAN are
invisible to the CPU so they have no effect on MTE. Let's fix the
bug for now by disabling the optimizations if the kernel is built
with HW tag-based KASAN and consider improvements for followup changes.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
Cc: stable@vger.kernel.org
---
 fs/dcache.c  | 2 +-
 lib/string.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)